Discriminant analysis using sparse graphical models

dc.contributor.advisor: Kamper, Francois [en_ZA]
dc.contributor.advisor: Bierman, Surette [en_ZA]
dc.contributor.author: Botha, Dylon [en_ZA]
dc.contributor.other: Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. [en_ZA]
dc.date.accessioned: 2020-02-18T17:58:52Z
dc.date.accessioned: 2020-04-28T12:12:32Z
dc.date.available: 2020-02-18T17:58:52Z
dc.date.available: 2020-04-28T12:12:32Z
dc.date.issued: 2020-03
dc.description: Thesis (MCom)--Stellenbosch University, 2020. [en_ZA]
dc.description.abstract: ENGLISH SUMMARY: This thesis proposes a new classification method that extends classical quadratic discriminant analysis (QDA). The extension focuses on relaxing the assumption of normality and on overcoming the adverse effect of the large number of parameters that need to be estimated when applying QDA. To relax the assumption of normality, we assign a different nonparanormal distribution to each class density. Based on these nonparanormal distributions, new discriminant functions can be derived. The assumption underlying a nonparanormal distribution is that the associated random vector can, through an appropriate transformation, be made to follow a Gaussian distribution. Such a transformation is based on the marginals of the distribution, which are estimated nonparametrically. The large number of parameters in QDA results from the estimation of the class precision matrices. To overcome this problem, penalised maximum likelihood estimation is performed by placing an L1 penalty on the size of the elements of the class precision matrices. This leads to sparse precision matrix estimates, and therefore to a reduction in the number of estimated parameters. Combining these two approaches, which address nonnormality and the large number of parameters respectively, leads to the following novel classification method. A separate transformation is applied for each class density. Thereafter, L1-penalised maximum likelihood estimation is performed in the transformed space. The resulting parameter estimates are then plugged into the nonparanormal discriminant functions, thereby facilitating classification. An empirical evaluation shows the novel proposal to be competitive with a wide array of existing classifiers.
We also establish a connection to probabilistic graphical models, which could aid in the interpretation of this new technique. [en_ZA]
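The per-class marginal transformation described in the summary can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names `gaussianize_marginal` and `gaussianize_by_class` are hypothetical, and the rescaled empirical CDF (count divided by n + 1, clamped away from 0 and 1) is an assumed stand-in for whichever nonparametric marginal estimator the thesis uses. The L1-penalised precision matrix estimation and the discriminant functions are not shown.

```python
from bisect import bisect_right
from statistics import NormalDist


def gaussianize_marginal(train_values, query_values):
    """Map query_values towards N(0, 1) using one training marginal.

    Assumed estimator: the empirical CDF of train_values, rescaled to lie
    strictly inside (0, 1), followed by the standard normal quantile function.
    """
    sorted_train = sorted(train_values)
    n = len(sorted_train)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    transformed = []
    for x in query_values:
        # Number of training points less than or equal to x.
        count = bisect_right(sorted_train, x)
        # Clamp so the quantile function never receives 0 or 1.
        u = min(max(count, 1), n) / (n + 1)
        transformed.append(std_normal.inv_cdf(u))
    return transformed


def gaussianize_by_class(rows, labels):
    """Apply a separate marginal transformation within each class.

    rows is a list of equal-length feature lists; labels gives the class
    of each row. Returns a dict mapping each label to its transformed rows.
    """
    result = {}
    for label in set(labels):
        class_rows = [r for r, y in zip(rows, labels) if y == label]
        columns = list(zip(*class_rows))  # one tuple per variable
        new_columns = [gaussianize_marginal(col, col) for col in columns]
        result[label] = [list(r) for r in zip(*new_columns)]
    return result


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    rows = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [5.0, 1.0], [7.0, 4.0]]
    labels = ["a", "a", "a", "b", "b"]
    for label, transformed in sorted(gaussianize_by_class(rows, labels).items()):
        print(label, transformed)
```

In the method summarised above, the transformed data of each class would then be passed to an L1-penalised precision matrix estimator before the discriminant functions are evaluated.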
dc.description.abstract: AFRIKAANSE OPSOMMING: Die doelwit van hierdie tesis is die voorstel van ’n nuwe klassifikasie-metode. Hierdie klassifikasie-metode is ’n uitbreiding van klassieke kwadratiese diskriminant-analise (KDA), waarin die normaliteits-aanname van KDA verslap word, en waarin die negatiewe effek van die groot aantal parameters wat beraam moet word in KDA toepassings, aangespreek word. Ter verslapping van die normaliteits-aanname beskou ons die toekenning van verskillende nie-paranormale verdelings aan elke klas. Op grond van hierdie nie-paranormale digtheidsfunksies kan nuwe diskriminantfunksies afgelei word. Wanneer ’n nie-paranormale verdeling veronderstel word, is die onderliggende aanname dat die geassosieerde vektor van stogastiese veranderlikes na ’n normaalverdeling transformeer kan word. Hierdie transformasie is gebaseer op die marginale verdelings, wat weer op ’n nie-parametriese wyse beraam word. Die groot aantal parameters in KDA is die gevolg van die beraming van presisiematrikse vir elke klas. Om hierdie probleem te oorkom, word gepenaliseerde maksimum aanneemlikheidsberaming toegepas, spesifiek deur L1-penalisering op die grootte van die elemente in die presisiematrikse. Dit lei tot ’n patroon van skaarsheid in die inverse kovariansiematrikse, en derhalwe ook tot ’n vermindering in die aantal beraamde parameters. Die samevoeging van die bogaande twee benaderings ten einde die probleme veroorsaak deur nie-normaliteit en die groot aantal parameters om te beraam, te oorkom, lei tot die volgende nuwe klassifikasie-metode. Vir elke klasdigtheid word ’n aparte transformasie toegepas. Daarna word L1-gepenaliseerde maksimum aanneemlikheidsberaming in die getransformeerde ruimte toegepas. Die beramings wat sodoende gevind word, word dan by die nie-paranormale diskriminantfunksies ingestel ten einde klassifikasie te doen. Empiriese evaluering van die nuwe tegniek wys dat dit goed vergelyk met bestaande klassifikasie-metodes.
Ons bevestig ook ’n verwantskap met grafiese modelle, wat moontlik kan bydra tot interpretasie van die nuwe tegniek. [af_ZA]
dc.description.version: Masters [en_ZA]
dc.format.extent: xii, 113 pages ; illustrations, includes annexure
dc.identifier.uri: http://hdl.handle.net/10019.1/107978
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Gaussian distribution -- Graphic methods [en_ZA]
dc.subject: Graphical modeling (Statistics) [en_ZA]
dc.subject: Sparse grids [en_ZA]
dc.subject: Inverse Gaussian distribution [en_ZA]
dc.subject: Multivariate analysis -- Graphic methods [en_ZA]
dc.subject: Discriminant analysis -- Graphic methods [en_ZA]
dc.subject: UCTD
dc.title: Discriminant analysis using sparse graphical models [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: botha_discriminant_2020.pdf
Size: 2.08 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text