Classification in high dimensional data using sparse techniques

Date
2019-04
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH SUMMARY : Traditional classification techniques fail in the analysis of high-dimensional data. In response, new classification techniques and accompanying theory have recently emerged. These techniques are natural extensions of linear discriminant analysis (LDA). They aim to solve the statistical challenges that arise with high-dimensional data by utilising sparse coding (Johnstone and Titterington, 2009). In this project, our focus is on the following techniques: penalized LDA-FL, sparse discriminant analysis, sparse mixture discriminant analysis and sparse partial least squares. We evaluated the performance of these techniques in simulation studies and on two microarray gene expression datasets by comparing test error rates and the number of features selected. In the simulation studies, we found that performance varies depending on the simulation set-up and on the classification technique used. The two microarray gene expression datasets were used to assess these techniques in practice; on both, the techniques achieved satisfactory accuracy.
AFRIKAANS SUMMARY : No summary available.
Description
Thesis (MCom)--Stellenbosch University, 2019.
Keywords
High dimensional data, Mathematical statistics, Sparse classification, Sparse grids, Dimension reduction (Statistics), UCTD