Clustering methods with a focus on self-organising maps and an implementation on retail bank transactional data

Date
2018-12
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH SUMMARY : The aim of this study is to provide an overview of traditional clustering methods and to introduce and discuss self-organising maps (SOMs) in detail. The study seeks to demonstrate the usefulness of self-organising maps as a dimension reduction tool. The batch SOM algorithm, together with random initialisation of the prototypes, was found to be the most appropriate SOM variant to use in practice. Ward linkage hierarchical clustering performed best on simulated multivariate Gaussian data and was also found to be the most appropriate traditional clustering method to apply on top of the SOM. Banking transactional data were investigated for client behavioural clusters, and four clusters were found: lower socio-economic class clients, technologically sophisticated clients, older and more traditional clients, and low financial activity clients. These clusters emerged consistently across 9 different data samples.
AFRIKAANS SUMMARY (translated into English) : The aim of this study is to compile an overview of traditional clustering methods, as well as to discuss self-organising maps (SOMs). This study wants to convince the reader of the usefulness of SOMs as a dimension reduction technique. The batch SOM algorithm is the method recommended in practice, together with random initialisation of the prototypes. Ward linkage hierarchical clustering performed best on multivariate Gaussian simulated data. This study also found that Ward linkage was the most appropriate traditional clustering method to apply on top of the SOM. Data from the transactional banking environment were investigated to find client behavioural groups. These behavioural groups were identified as lower socio-economic class clients, technologically sophisticated clients, older and more traditional clients, and a group with low financial activity. The analysis identified these groups consistently across 9 different datasets.
Description
Thesis (MCom)--Stellenbosch University, 2020.
Keywords
Unsupervised learning, Self-organising maps, Cluster analysis, K-means clustering, K-medoids clustering, Hierarchical clustering, Big data -- Cluster analysis, UCTD