Browsing by Author "Melonas, Michail C."
- Projected naive Bayes (Stellenbosch : Stellenbosch University, 2020-03) Melonas, Michail C.; Hofmeyr, David; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.
  ENGLISH SUMMARY : Naïve Bayes is a well-known statistical model that is recognised by the Institute of Electrical and Electronics Engineers (IEEE) as being among the top ten data mining algorithms. It performs classification by making the strong assumption of class conditional mutual statistical independence. Although this assumption is unlikely to be an accurate representation of the true statistical dependencies, naïve Bayes nevertheless delivers accurate classification in many domains. This success can be related to that of linear regression providing reliable estimation in problems where exact linearity is not realistic. There is a rich body of literature on the topic of improving naïve Bayes. This dissertation is concerned with doing so via a projection matrix that provides an alternative representation for the data of interest. We introduce Projected Gaussian naïve Bayes and Projected Kernel naïve Bayes as naïve-Bayes-type classifiers that rely, respectively, on Gaussianity and kernel density estimation. The proposed method extends the flexibility of standard naïve Bayes. The approach maintains the simplicity and efficiency of naïve Bayes while improving its accuracy. Our method is shown to be competitive with several popular classifiers on real-world data. In particular, our method's classification accuracy is compared to that of linear and quadratic discriminant analysis, the support vector machine and the random forest. There is a close connection between our proposal and the application of naïve Bayes to a class conditionally conducted independent component analysis. In addition to improving classification accuracy, the proposed method also provides a tool for visually representing data in low-dimensional space. This visualisation aspect of our method is discussed with respect to the connection to independent component analysis. Our method is shown to give a better visual representation than does linear discriminant analysis on a number of real-world data sets.
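The general idea of applying Gaussian naïve Bayes to a projected representation of the data can be illustrated with a minimal sketch. The summary does not describe how the dissertation chooses its projection matrix, so the matrix `W` below is only a stand-in assumption: it is taken to be the eigenvectors of the pooled within-class covariance, which decorrelates the features of this toy problem. The synthetic data, the function names `fit_gaussian_nb` and `predict_gaussian_nb`, and all parameter values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def fit_gaussian_nb(Z, y):
    """Estimate class priors and per-feature means and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([Z[y == c].mean(axis=0) for c in classes])
    # Small constant guards against zero variance.
    variances = np.array([Z[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, variances


def predict_gaussian_nb(model, Z):
    """Pick the class maximising log prior + sum of per-feature log densities."""
    classes, priors, means, variances = model
    diff = Z[:, None, :] - means[None, :, :]            # shape (n, classes, features)
    log_density = -0.5 * (np.log(2 * np.pi * variances)[None] + diff**2 / variances[None])
    scores = np.log(priors)[None] + log_density.sum(axis=2)
    return classes[np.argmax(scores, axis=1)]


# Toy data (assumption): two classes with strongly correlated features,
# so the independence assumption is badly violated in the original axes.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.95], [0.95, 1.0]])


def sample(n):
    X0 = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    X1 = rng.multivariate_normal([1.0, 0.0], cov, size=n)
    return np.vstack([X0, X1]), np.repeat([0, 1], n)


X_train, y_train = sample(2000)
X_test, y_test = sample(2000)

# Stand-in projection (assumption, not the dissertation's procedure):
# eigenvectors of the pooled within-class covariance of the training data.
centered = np.vstack(
    [X_train[y_train == c] - X_train[y_train == c].mean(axis=0) for c in (0, 1)]
)
_, W = np.linalg.eigh(np.cov(centered, rowvar=False))

plain_model = fit_gaussian_nb(X_train, y_train)
projected_model = fit_gaussian_nb(X_train @ W, y_train)

acc_plain = np.mean(predict_gaussian_nb(plain_model, X_test) == y_test)
acc_proj = np.mean(predict_gaussian_nb(projected_model, X_test @ W) == y_test)

print(f"naive Bayes on original features:  {acc_plain:.3f}")
print(f"naive Bayes on projected features: {acc_proj:.3f}")
```

In this toy setting the projected classifier should be noticeably more accurate, because the rotated features are close to class conditionally independent. This only illustrates why a change of representation can help; it says nothing about the results reported in the dissertation.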