Assessing the influence of observations on the generalization performance of the kernel Fisher discriminant classifier
Date
2008-12
Authors
Lamont, Morné Michael Connell
Publisher
Stellenbosch : Stellenbosch University
Abstract
Kernel Fisher discriminant analysis (KFDA) is a kernel-based technique that can be used
to classify observations of unknown origin into predefined groups. In essence, KFDA can
be viewed as a non-linear extension of Fisher’s linear discriminant analysis (FLDA). In
this thesis we give a detailed explanation of how FLDA is generalized to obtain KFDA, and
we also discuss two methods that are related to KFDA. Our focus is on binary classification.
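As background to this construction, the sketch below implements a basic binary KFD classifier in the spirit of Mika et al.’s formulation: the expansion coefficients are obtained from the kernel analogues of the class means and the within-class scatter, with a ridge-type regularizer, and a new observation is classified by the sign of its projection relative to the midpoint of the projected class means. The Gaussian kernel, the regularization parameter mu and all function names are illustrative assumptions, not the exact setup used in the thesis.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

def kfd_train(X, y, gamma=1.0, mu=1e-3):
    """Binary kernel Fisher discriminant for labels y in {-1, +1}.

    Returns expansion coefficients alpha and a threshold b, so that a new
    point x is classified by the sign of sum_i alpha_i k(x_i, x) + b.
    """
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    idx = [np.where(y == c)[0] for c in (-1, 1)]
    # Kernel analogues of the class means (M_j).
    M = [K[:, I].mean(axis=1) for I in idx]
    # Within-class scatter N = sum_j K_j (I - 1/n_j) K_j', regularized by mu * I.
    N = mu * np.eye(n)
    for I in idx:
        Kj = K[:, I]
        centering = np.eye(len(I)) - np.full((len(I), len(I)), 1.0 / len(I))
        N += Kj @ centering @ Kj.T
    alpha = np.linalg.solve(N, M[1] - M[0])
    b = -0.5 * alpha @ (M[0] + M[1])   # threshold midway between projected class means
    return alpha, b

def kfd_predict(X_train, alpha, b, X_new, gamma=1.0):
    """Classify new observations with the trained KFD classifier."""
    scores = rbf_kernel(X_new, X_train, gamma) @ alpha + b
    return np.where(scores >= 0, 1, -1)
```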
The influence of atypical cases in discriminant analysis has been investigated by many
researchers. In this thesis we investigate the influence of atypical cases on certain aspects
of KFDA. One aspect of particular interest is the generalization performance of the KFD
classifier. Several other aspects are also investigated with the aim of developing criteria
that can be used to identify cases that are detrimental to the KFD generalization
performance. The investigation is done via a Monte Carlo simulation study.
The output of KFDA can also be used to obtain the posterior probabilities of belonging to
the two classes. In this thesis we discuss two approaches to estimate posterior
probabilities in KFDA. Two new KFD classifiers are also derived which use these
probabilities to classify observations, and their performance is compared to that of the
original KFD classifier.
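To give context for these probability-based classifiers, one common way of converting discriminant scores into class posteriors (not necessarily one of the two approaches examined in the thesis) is to fit a logistic link to the KFD outputs, in the spirit of Platt scaling. The simple gradient-ascent fit below is an illustrative sketch only; the function names and learning-rate settings are assumptions.

```python
import numpy as np

def fit_sigmoid(scores, y, n_iter=500, lr=0.1):
    """Fit P(class = +1 | score) = 1 / (1 + exp(-(a * score + c))) by
    gradient ascent on the Bernoulli log-likelihood (labels y in {-1, +1})."""
    a, c = 1.0, 0.0
    t = (y + 1) / 2.0                          # targets in {0, 1}
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a * scores + c)))
        a += lr * np.sum((t - p) * scores) / len(y)
        c += lr * np.sum(t - p) / len(y)
    return a, c

def posterior(scores, a, c):
    """Estimated posterior probability of the positive class for KFD scores."""
    return 1.0 / (1.0 + np.exp(-(a * scores + c)))
```

An observation is then assigned to the class with the larger estimated posterior, i.e. to the positive class whenever its estimated posterior exceeds 0.5.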
The main objective of this thesis is to develop criteria which can be used to identify cases
that are detrimental to the KFD generalization performance. Nine such criteria are
proposed, and their merits are investigated in a Monte Carlo simulation study as well as on
real-world data sets.
Evaluating the criteria on a leave-one-out basis poses a computational challenge,
especially for large data sets. In this thesis we also propose using the smallest enclosing
hypersphere as a filter to reduce the computational burden. The effectiveness of the
filter is tested in a Monte Carlo simulation study as well as on real-world data sets.
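As an illustration of the filtering idea, the sketch below computes the smallest enclosing hypersphere of the training data in the kernel-induced feature space by solving the hard-margin SVDD dual with a general-purpose optimizer, and returns the observations lying furthest from the hypersphere’s centre as candidates for the more expensive leave-one-out evaluation. Treating the most peripheral points as the candidate set, the keep_fraction parameter and the use of SciPy’s SLSQP solver are assumptions made for illustration and need not match the filtering rule developed in the thesis.

```python
import numpy as np
from scipy.optimize import minimize

def hypersphere_filter(K, keep_fraction=0.2):
    """Rank observations by distance to the centre of the smallest enclosing
    hypersphere in feature space, given a precomputed kernel matrix K.

    The dual problem  max_beta  sum_i beta_i K_ii - beta' K beta,
    subject to beta_i >= 0 and sum_i beta_i = 1, is solved numerically;
    the indices of the most peripheral observations are returned.
    """
    n = K.shape[0]
    diag = np.diag(K)

    def neg_dual(beta):
        return -(beta @ diag - beta @ K @ beta)

    constraints = ({'type': 'eq', 'fun': lambda b: b.sum() - 1.0},)
    bounds = [(0.0, 1.0)] * n
    beta0 = np.full(n, 1.0 / n)
    beta = minimize(neg_dual, beta0, method='SLSQP',
                    bounds=bounds, constraints=constraints).x
    # Squared distance of each training point to the centre a = sum_i beta_i phi(x_i).
    dist2 = diag - 2.0 * K @ beta + beta @ K @ beta
    n_keep = max(1, int(keep_fraction * n))
    return np.argsort(dist2)[-n_keep:]   # indices of the most peripheral points
```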
Description
Thesis (PhD (Statistics and Actuarial Science))—Stellenbosch University, 2008.
Keywords
Kernel Fisher discriminant analysis, Atypical cases, Kernel function, Smallest enclosing hypersphere