Masters Degrees (Statistics and Actuarial Science)
- Aspects of multi-class nearest hypersphere classification (Stellenbosch : Stellenbosch University, 2017-12) Coetzer, Frances; Lamont, Morné Michael Connell; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.

  ENGLISH SUMMARY: Using hyperspheres to analyse multivariate data is not common practice in Statistics. However, hyperspheres have properties that are useful for data analysis in three areas: domain description (finding a support region), outlier detection (novelty detection) and the classification of objects into known classes. This thesis demonstrates how a hypersphere is fitted around a single dataset to obtain a support region and an outlier detector. The all-enclosing and ν-soft hyperspheres are derived. The hyperspheres are then extended to multi-class classification, which is called nearest hypersphere classification (NHC). Different aspects of multi-class NHC are investigated. To study its classification performance, NHC is compared with three other classification techniques: support vector machine classification, random forests and penalised linear discriminant analysis. NHC requires a kernel function, and this thesis uses the Gaussian kernel. NHC also depends on selecting an appropriate kernel hyper-parameter γ and a tuning parameter C. The behaviour of the error rate and of the fraction of support vectors is investigated for different values of γ and C. Two methods of obtaining the optimal γ value for NHC are investigated. The first uses a differential evolution procedure, executed with the R function DEoptim(). The second uses the R function sigest(). The first method depends on the classification technique, whereas the second is executed independently of it.
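
The idea of classifying a point by its nearest class hypersphere can be illustrated with a minimal Python sketch. It is a deliberately simplified stand-in, not the method derived in the thesis: instead of solving the optimisation problem that yields the all-enclosing or soft hypersphere, each class sphere is centred on the equal-weight mean of the class in Gaussian-kernel feature space, and its radius is the largest training distance to that centre. The names `MeanHypersphere` and `classify`, the decision rule (smallest squared distance relative to the squared radius), the toy data and the hand-picked `gamma` are all assumptions made for illustration; no tuning with DEoptim() or sigest() is attempted.

```python
import math


def gaussian_kernel(x, y, gamma):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


class MeanHypersphere:
    """Hypothetical simplified class sphere (assumption, not the thesis's fit).

    The centre is the equal-weight mean of the training points in kernel
    feature space; the radius is the largest training distance to it.
    """

    def __init__(self, points, gamma):
        if len(points) < 2:
            raise ValueError("need at least two training points per class")
        self.points = list(points)
        self.gamma = gamma
        n = len(self.points)
        # Constant term (1/n^2) * sum_i sum_j k(x_i, x_j) of the distance.
        self.const = sum(
            gaussian_kernel(p, q, gamma)
            for p in self.points
            for q in self.points
        ) / n ** 2
        # Squared radius: the sphere encloses every training point.
        self.radius_sq = max(self.dist_sq(p) for p in self.points)
        if self.radius_sq <= 0.0:
            raise ValueError("degenerate sphere: training points coincide")

    def dist_sq(self, z):
        """Squared feature-space distance from z to the centre.

        Uses k(z, z) = 1, which holds for the Gaussian kernel.
        """
        n = len(self.points)
        cross = sum(gaussian_kernel(z, p, self.gamma) for p in self.points) / n
        return 1.0 - 2.0 * cross + self.const

    def is_outlier(self, z):
        """Flag z as novel when it falls outside the sphere."""
        return self.dist_sq(z) > self.radius_sq


def classify(z, spheres):
    """Assign z to the class whose sphere is nearest (assumed rule:
    smallest squared distance divided by the squared radius)."""
    return min(
        spheres,
        key=lambda label: spheres[label].dist_sq(z) / spheres[label].radius_sq,
    )


# Toy two-class data; gamma is chosen by hand, not tuned.
gamma = 0.5
spheres = {
    "a": MeanHypersphere([(0, 0), (0, 1), (1, 0), (1, 1)], gamma),
    "b": MeanHypersphere([(5, 5), (5, 6), (6, 5), (6, 6)], gamma),
}

label_near_a = classify((0.5, 0.5), spheres)
label_near_b = classify((5.5, 5.5), spheres)
far_point_is_outlier = spheres["b"].is_outlier((10, 10))
```

In this sketch the same object serves both purposes mentioned in the summary: `is_outlier` treats a single sphere as a support region, while `classify` compares the spheres of several classes. A faithful implementation would replace the equal-weight centre with the weights obtained from the hypersphere optimisation and would select `gamma` and the tuning parameter by a search procedure.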