Browsing by Author "Meyer, Nicholas George"
- Item: Strategies for combining tree-based learners (Stellenbosch : Stellenbosch University, 2020-04). Meyer, Nicholas George; Uys, Daniel W.; Stellenbosch University, Faculty of Economic and Management Sciences, Dept. of Statistics and Actuarial Science.
  ENGLISH ABSTRACT: In supervised statistical learning, an ensemble is a predictive model that is the conglomeration of several other predictive models. Ensembles are applicable to both classification and regression problems and have demonstrated theoretical and practical appeal. Furthermore, due to recent advances in computing, the application of ensemble methods has become widespread. Structurally, ensembles can be characterised according to two distinct aspects. The first is the method employed to train the individual base learning models that constitute the conglomeration. The second is the technique used to combine the predictions of the individual base learners to obtain a single prediction for an observation. This thesis considers the second aspect. Specifically, the focus is on weighting strategies for combining tree models that are trained in parallel on bootstrap-resampled versions of the training sample. The contribution of this thesis is the development of a regularised weighted model. The purpose is twofold. First, the technique provides flexibility in controlling the bias-variance trade-off when fitting the model. Second, through the application of ℓ2 regularisation, the proposed strategy mitigates issues that plague similar weighting strategies: an ill-conditioned optimisation problem for finding the weights, and overfitting in low signal-to-noise scenarios. In this thesis a derivation is provided which outlines the mathematical details of solving for the weights of the individual models. Crucially, the solution relies on methods from convex optimisation, which are discussed.
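The abstract does not spell out the exact objective used in the thesis, but the unconstrained version of such an ℓ2-regularised weighting problem has a familiar ridge-style closed form. The sketch below is purely illustrative: the prediction matrix P (one column per base learner) is simulated rather than produced by actual bootstrapped trees, and the penalty value lam is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n observations, B base learners.
# Column b of P holds base learner b's predictions on the training sample;
# here the columns are simulated as noisy versions of the response y.
n, B = 200, 10
y = rng.normal(size=n)
P = y[:, None] + rng.normal(scale=0.5, size=(n, B))

lam = 1.0  # ridge penalty controlling the bias-variance trade-off (illustrative)

# Minimise ||y - P w||^2 + lam * ||w||^2 over w, which has the closed form
# w = (P'P + lam*I)^{-1} P'y.  The lam*I term also conditions P'P, which is
# near-singular when the base learners' predictions are highly correlated.
w = np.linalg.solve(P.T @ P + lam * np.eye(B), P.T @ y)

# Combined ensemble prediction: a weighted average of the base learners.
ensemble_pred = P @ w
```

The penalty serves double duty, exactly as the abstract suggests: it shrinks the weights (guarding against overfitting when the signal is weak) and it makes the normal-equations matrix invertible even when base learners are nearly collinear.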
  In addition, the technique is assessed against established ensemble techniques on both simulated and real-world data sets. The results show that the proposal performs well relative to established averaging techniques such as bagging and random forests. It is argued that the proposed approach offers a generalisation of the bagging regression ensemble. In this regard, the bagging regressor is a highly regularised weighted ensemble leveraging ℓ2 regularisation, not merely an equally weighted ensemble. This deduction relies on the imposition of two constraints on the weights, namely a positivity constraint and a normalisation constraint.
  Key words: Ensemble, Regression, Regularisation, Convex Optimisation, Bagging, Random Forest
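The deduction about bagging can be illustrated numerically. Under the two stated constraints (non-negative weights summing to one), the ℓ2 penalty is minimised at equal weights, so a very large penalty drives the solution toward the bagging average. The snippet below is a sketch, not the thesis's own solver: it uses a generic SLSQP routine on simulated base-learner predictions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Simulated stand-in for bootstrapped tree predictions (one column per learner).
n, B = 100, 5
y = rng.normal(size=n)
P = y[:, None] + rng.normal(scale=0.5, size=(n, B))


def solve_weights(lam):
    """Minimise ||y - Pw||^2 + lam*||w||^2 subject to w >= 0, sum(w) = 1."""
    obj = lambda w: np.sum((y - P @ w) ** 2) + lam * np.sum(w ** 2)
    res = minimize(
        obj,
        np.full(B, 1.0 / B),                 # start from the bagging weights
        method="SLSQP",
        bounds=[(0.0, None)] * B,            # positivity constraint
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # normalisation
    )
    return res.x


w_light = solve_weights(0.0)   # data-driven weights, no shrinkage
w_heavy = solve_weights(1e6)   # heavy shrinkage: approaches equal weights
```

With lam large, `w_heavy` is essentially the uniform vector 1/B, i.e. the bagging regressor; with lam small, the weights are free to favour the better base learners. In this sense bagging sits at one end of a regularisation path.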