Doctoral Degrees (Computer Science)
Permanent URI for this collection
Browse
Browsing Doctoral Degrees (Computer Science) by Subject "Algorithms"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- ItemLandscape aware algorithm selection for feature selection(Stellenbosch : Stellenbosch University, 2023-10) Mostert, Werner; Engelbrecht, Andries Petrus; Malan, Katherine Mary; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Computer Science Division.ENGLISH ABSTRACT: Feature selection is commonly applied as a pre-processing technique for machine learning to reduce the dimensionality of a problem by removing redundant and irrelevant features. Another desirable outcome of feature selection lies in the potential performance improvement of predictive models. The development of new feature selection algorithms are common within the field, however, relatively little research has historically been done to better understand the feature selection problem from a theoretical perspective. Researchers and practitioners in the field often rely on a trial-and-error strategy to decide on which feature selection algorithm to use for a specific instance of a machine learning problem. This thesis contributes towards a better understanding of the complex feature selection problem by investigating the link between feature selection problem characteristics and the performance of feature selection algorithms. A variety of fitness landscape analysis techniques are used to gain insights into the structure of the feature selection fitness landscape. Performance complementarity for feature selection algorithms is empirically shown, emphasising the potential value of automated algorithm selection for feature selection algorithms. Towards the realisation of a landscape aware algorithm selector for feature selection, a novel performance metric for feature selection algorithms is presented. The baseline fitness improvement (BFI) performance metric is unbiased and can be used for comparative analysis across feature selection problem instances. The insights obtained via landscape analysis are used with other meta-features of datasets and the BFI performance measure to develop a new landscape aware algorithm selector for feature selection. The landscape aware algorithm selector provides a human-interpretable predictive model of the best feature selection algorithm for a specific dataset and classification problem.
- ItemMixtures of heterogeneous experts(Stellenbosch : Stellenbosch University, 2022-10) Omomule, Taiwo Gabriel; Engelbrecht, Andries; Stellenbosch University. Faculty of Science. Dept. of Computer Science.ENGLISH ABSTRACT: This research considers the problem of the No-Free-Launch-Theorem, which states that no one machine learning algorithm performs best on all problems due to algorithms having different inductive biases. Another problem is that the combinations of experts of the same type, referred to as a mixture of homogeneous experts, do not capitalize on the different inductive biases of different machine learning algorithms. Research has shown that mixtures of homogeneous experts deliver improved accuracy compared to that of the base experts in the mixture. However, the predictive power of a homogeneous mixture of experts is still limited by the inductive bias of the algorithm that makes up the mixture of experts. Therefore, this research proposes the development of mixtures of heterogeneous experts through the combination of different machine learning algorithms to take advantage of the strengths of the machine learning algorithms and to reduce the adverse effects of the inductive biases of the different algorithms. A set of different machine learning algorithms are selected to develop four different types of mixtures of experts in the research. Empirical analyses are performed using nonparametric statistical tests to compare the generalization performance of the ensembles. The comparison is carried out to investigate the performance of the homogeneous and heterogeneous ensembles in a number of modelling studies examined on a set of classification and regression problems using selected performance measures. The problems represent varying levels of complexity and characteristics to determine the characteristics and complexities for which the heterogeneous ensembles outperform homogeneous ensembles. For classification problems, the empirical results across six modelling studies indicate that heterogeneous ensembles generate improved predictive performance compared to the developed homogeneous ensembles by taking advantage of the different inductive biases of the different base experts in the ensembles. Specifically, the heterogeneous ensembles developed using different machine learning algorithms, with the same and different configurations, showed superiority over other heterogeneous ensembles and the homogeneous ensembles developed in this research. The ensembles achieved the best and second-best overall accuracy rank across the classification datasets in each modelling study. For regression problems, the heterogeneous ensembles outperformed the homogeneous ensembles across five modelling studies. Although, a random forest algorithm achieved competitive generalization performance compared to that of the heterogeneous ensembles. Based on the average ranks, the heterogeneous ensembles developed using different machine learning algorithms where the base members consist of the same and different configurations still produced better predictive performance than a number of heterogeneous ensembles and homogeneous ensembles across the modelling studies. Therefore, the implementation of a mixture of heterogeneous experts removes the need for the computationally expensive process of finding the best performing homogeneous ensemble. The heterogeneous ensembles of different machine learning algorithms are consistently the most or one of the most accurate ensembles across all classification and regression problems. This is attributed to the advantage of capitalizing on the inductive biases of the different machine learning algorithms and the different configurations of the base members in the ensembles.