Masters Degrees (Statistics and Actuarial Science)
Permanent URI for this collection
Browse
Browsing Masters Degrees (Statistics and Actuarial Science) by Author "Brand, Hilmarie"
Now showing 1 - 1 of 1
Results Per Page
Sort Options
- ItemPCA and CVA biplots : a study of their underlying theory and quality measures(Stellenbosch : Stellenbosch University, 2013-03) Brand, Hilmarie; Le Roux, N. J.; Lubbe, Sugnet; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.ENGLISH ABSTRACT: The main topics of study in this thesis are the Principal Component Analysis (PCA) and Canonical Variate Analysis (CVA) biplots, with the primary focus falling on the quality measures associated with these biplots. A detailed study of different routes along which PCA and CVA can be derived precedes the study of the PCA biplot and CVA biplot respectively. Different perspectives on PCA and CVA highlight different aspects of the theory that underlie PCA and CVA biplots respectively and so contribute to a more solid understanding of these biplots and their interpretation. PCA is studied via the routes followed by Pearson (1901) and Hotelling (1933). CVA is studied from the perspectives of Linear Discriminant Analysis, Canonical Correlation Analysis as well as a two-step approach introduced in Gower et al. (2011). The close relationship between CVA and Multivariate Analysis of Variance (MANOVA) also receives some attention. An explanation of the construction of the PCA biplot is provided subsequent to the study of PCA. Thereafter follows an in depth investigation of quality measures of the PCA biplot as well as the relationships between these quality measures. Specific attention is given to the effect of standardisation on the PCA biplot and its quality measures. Following the study of CVA is an explanation of the construction of the weighted CVA biplot as well as two different unweighted CVA biplots based on the two-step approach to CVA. Specific attention is given to the effect of accounting for group sizes in the construction of the CVA biplot on the representation of the group structure underlying a data set. It was found that larger groups tend to be better separated from other groups in the weighted CVA biplot than in the corresponding unweighted CVA biplots. Similarly it was found that smaller groups tend to be separated to a greater extent from other groups in the unweighted CVA biplots than in the corresponding weighted CVA biplot. A detailed investigation of previously defined quality measures of the CVA biplot follows the study of the CVA biplot. It was found that the accuracy with which the group centroids of larger groups are approximated in the weighted CVA biplot is usually higher than that in the corresponding unweighted CVA biplots. Three new quality measures that assess that accuracy of the Pythagorean distances in the CVA biplot are also defined. These quality measures assess the accuracy of the Pythagorean distances between the group centroids, the Pythagorean distances between the individual samples and the Pythagorean distances between the individual samples and group centroids in the CVA biplot respectively.