Browsing by Author "Pretorius, Arnu"
- Advances in random forests with application to classification (Stellenbosch : Stellenbosch University, 2016-12). Pretorius, Arnu; Bierman, Surette; Steel, Sarel J.; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics & Actuarial Science. ENGLISH SUMMARY: Since their introduction, random forests have successfully been employed in a vast array of application areas. Fairly recently, a number of algorithms that adhere to Leo Breiman’s definition of a random forest have been proposed in the literature. Breiman’s popular random forest algorithm (Forest-RI), and related ensemble classification algorithms which followed, form the focus of this study. A review of random forest algorithms that were developed since the introduction of Forest-RI is given. This includes a novel taxonomy of random forest classification algorithms, which is based on their sources of randomization, and on deterministic modifications. Also, a visual conceptualization of contributions to random forest algorithms in the literature is provided by means of multidimensional scaling. Towards an analysis of advances in random forest algorithms, decomposition of the expected prediction error into bias and variance components is considered. In classification, such decompositions are not as straightforward as in the case of using squared-error loss for regression. Hence various definitions of bias and variance for classification can be found in the literature. Using a particular bias-variance decomposition, an empirical study of ensemble learners, including bagging, boosting and Forest-RI, is presented. From the empirical results and insights into the way in which certain mechanisms of random forests affect bias and variance, a novel random forest framework, viz. oblique random rotation forests, is proposed. 
Although not entirely satisfactory, the framework serves as an example of a heuristic approach towards novel proposals based on bias-variance analyses, instead of an ad hoc approach, as is often found in the literature. The analysis of comparative studies regarding advances in random forest algorithms is also considered. It is of interest to critically evaluate the conclusions that can be drawn from these studies, and to infer whether novel random forest algorithms are found to significantly outperform Forest-RI. For this purpose, a meta-analysis is conducted in which the state of research on random forests is evaluated based on all (34) papers that could be found in which a novel random forest algorithm was proposed and compared to already existing random forest algorithms. Using the reported performances in each paper, a novel two-step procedure is proposed, which allows for multiple algorithms to be compared over multiple data sets, and across different papers. The meta-analysis results indicate that weighted voting strategies and variable weighting in high-dimensional settings significantly outperform Breiman’s popular Forest-RI algorithm.
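As a rough illustration of what a bias-variance decomposition for classification can look like, the sketch below uses one of the several definitions found in the literature for 0-1 loss: the "main prediction" at a point is the majority vote over models trained on different resamples, bias is the error of the main prediction, and variance is the average disagreement with it. The summary does not say which definition the thesis adopts, so this choice, the function name `zero_one_bias_variance`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def zero_one_bias_variance(predictions, targets):
    """Estimate average bias and variance under 0-1 loss.

    predictions[m][i] is the label predicted for test point i by the
    model trained on resample m (hypothetical input layout).
    """
    n_models = len(predictions)
    n_points = len(targets)
    bias_total = 0.0
    variance_total = 0.0
    for i in range(n_points):
        votes = [predictions[m][i] for m in range(n_models)]
        # Main prediction: the most frequent label across models.
        main = Counter(votes).most_common(1)[0][0]
        # Bias: 1 if the main prediction is wrong, else 0.
        bias_total += float(main != targets[i])
        # Variance: fraction of models that disagree with the main prediction.
        variance_total += sum(v != main for v in votes) / n_models
    return bias_total / n_points, variance_total / n_points


# Toy example: 3 models, 4 test points, binary labels.
preds = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
targets = [0, 1, 1, 1]
bias, variance = zero_one_bias_variance(preds, targets)
print(bias, variance)
```

In this toy case only the last point has a wrong majority vote, so the bias term comes from one point in four, while the variance term comes from the two points where a single model dissents.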
- Learning dynamics of linear denoising autoencoders (PMLR, 2018). Pretorius, Arnu; Kroon, Steve; Kamper, Herman. Denoising autoencoders (DAEs) have proven useful for unsupervised representation learning, but a thorough theoretical understanding of how the input noise influences learning is still lacking. Here we develop theory for how noise influences learning in DAEs. By focusing on linear DAEs, we are able to derive analytic expressions that exactly describe their learning dynamics. We verify our theoretical predictions with simulations as well as experiments on MNIST and CIFAR-10. The theory illustrates how, when tuned correctly, noise allows DAEs to ignore low variance directions in the inputs while learning to reconstruct them. Furthermore, in a comparison of the learning dynamics of DAEs to standard regularised autoencoders, we show that noise has a similar regularisation effect to weight decay, but with faster training dynamics. We also show that our theoretical predictions approximate learning dynamics on real-world data and qualitatively match observed dynamics in nonlinear DAEs.
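The claim that noise lets a linear DAE ignore low-variance input directions can be made concrete with a deliberately simplified toy model. This is not the two-layer setting analysed in the paper: it assumes a single scalar gain `w` per eigen-direction of the input covariance, zero-mean inputs with variance `eigenvalue` in that direction, and independent additive noise with variance `noise_variance`. Under those assumptions the expected reconstruction loss is (1 - w)^2 * eigenvalue + w^2 * noise_variance, whose minimiser is the standard least-squares shrinkage eigenvalue / (eigenvalue + noise_variance). All names below are hypothetical.

```python
def optimal_gain(eigenvalue, noise_variance):
    """Closed-form minimiser of the expected denoising loss (toy model)."""
    return eigenvalue / (eigenvalue + noise_variance)


def train_gain(eigenvalue, noise_variance, lr=0.1, steps=2000, w0=0.0):
    """Gradient descent on (1 - w)^2 * eigenvalue + w^2 * noise_variance."""
    w = w0
    for _ in range(steps):
        grad = -2.0 * (1.0 - w) * eigenvalue + 2.0 * w * noise_variance
        w -= lr * grad
    return w


# Three directions: high, medium and very low input variance.
eigenvalues = [4.0, 1.0, 0.01]
noise_variance = 0.5

learned = [train_gain(lam, noise_variance) for lam in eigenvalues]
expected = [optimal_gain(lam, noise_variance) for lam in eigenvalues]

for lam, w in zip(eigenvalues, learned):
    print(f"variance {lam:5.2f} -> learned gain {w:.4f}")
```

The high-variance direction is reconstructed almost fully while the lowest-variance direction is shrunk close to zero. In this toy setting the shrinkage has the same form as ridge-style regularisation, which is in the spirit of the weight decay comparison made in the abstract.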
- On noise regularised neural networks: initialisation, learning and inference (Stellenbosch : Stellenbosch University, 2019-12). Pretorius, Arnu; Kroon, R. S. (Steve); Kamper, M. J.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. ENGLISH ABSTRACT: Innovation in regularisation techniques for deep neural networks has been a key factor in the rising success of deep learning. However, there is often limited guidance from theory in the development of these techniques and our understanding of the functioning of various successful regularisation techniques remains impoverished. In this work, we seek to contribute to an improved understanding of regularisation in deep learning. We specifically focus on a particular approach to regularisation that injects noise into a neural network. An example of such a technique which is often used is dropout (Srivastava et al., 2014). Our contributions in noise regularisation span three key areas of modeling: (1) learning, (2) initialisation and (3) inference. We first analyse the learning dynamics of a simple class of shallow noise regularised neural networks called denoising autoencoders (DAEs) (Vincent et al., 2008), to gain an improved understanding of how noise affects the learning process. In this first part, we observe a dependence of learning behaviour on initialisation, which leads us to study how noise interacts with the initialisation of a deep neural network in terms of signal propagation dynamics during the forward and backward pass. Finally, we consider how noise affects inference in a Bayesian context. We mainly focus on fully-connected feedforward neural networks with rectified linear unit (ReLU) activation functions throughout this study. 
To analyse the learning dynamics of DAEs, we derive closed-form solutions to a system of decoupled differential equations that describe the change in scalar weights during the course of training as they approach the eigenvalues of the input covariance matrix (under a convenient change of basis). In terms of initialisation, we use mean field theory to approximate the distribution of the pre-activations of individual neurons, and use this to derive recursive equations that characterise the signal propagation behaviour of the noise regularised network during the first forward and backward pass of training. Using these equations, we derive new initialisation schemes for noise regularised neural networks that ensure stable signal propagation. Since this analysis is only valid at initialisation, we next conduct a large-scale controlled experiment, training thousands of networks under a theoretically guided experimental design, for further testing the effects of initialisation on training speed and generalisation. To shed light on the influence of noise on inference, we develop a connection between randomly initialised deep noise regularised neural networks and Gaussian processes (GPs), non-parametric models that perform exact Bayesian inference, and establish new connections between a particular initialisation of such a network and the behaviour of its corresponding GP. Our work ends with an application of signal propagation theory to approximate Bayesian inference in deep learning where we develop a new technique that uses self-stabilising priors for training deep Bayesian neural networks (BNNs). Our core findings are as follows: noise regularisation helps a model to focus on the more prominent statistical regularities in the training data distribution during learning which should be useful for later generalisation. However, if the network is deep and not properly initialised, noise can push network signal propagation dynamics into regimes of poor stability. 
We correct this behaviour with proper “noise-aware” weight initialisation. Despite this, noise also limits the depth to which networks are able to train successfully, and networks that do not exceed this depth limit demonstrate a surprising insensitivity to initialisation with regard to training speed and generalisation. In terms of inference, noisy neural network GPs perform best when their kernel parameters correspond to the new initialisation derived for noise regularised networks, and increasing the amount of injected noise leads to more constrained (simple) models with larger uncertainty (away from the training data). Lastly, we find that our new technique that uses self-stabilising priors makes training deep BNNs more robust and leads to improved performance when compared to other state-of-the-art approaches.
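To give a feel for why noise can destabilise signal propagation at initialisation, the sketch below iterates a simple mean field variance recursion for a wide ReLU network with multiplicative noise applied to the activations. It rests on assumptions not stated in the abstract: Gaussian pre-activations with variance q, weights with variance `weight_var` divided by the layer width, and independent noise with second moment `noise_second_moment`, which together give q_next = weight_var * noise_second_moment * q / 2 + bias_var. The thesis's own recursions and initialisation schemes may differ in detail, and every name here is hypothetical.

```python
def variance_through_depth(weight_var, noise_second_moment, depth,
                           q0=1.0, bias_var=0.0):
    """Iterate the assumed pre-activation variance map layer by layer.

    For ReLU, E[relu(h)^2] = q / 2 when h ~ N(0, q); independent
    multiplicative noise scales this by its second moment.
    """
    q = q0
    history = [q]
    for _ in range(depth):
        q = weight_var * noise_second_moment * q / 2.0 + bias_var
        history.append(q)
    return history


# Dropout with inverted scaling: noise = Bernoulli(keep_prob) / keep_prob,
# so its second moment is 1 / keep_prob.
keep_prob = 0.8
mu2 = 1.0 / keep_prob

# Standard ReLU scaling (weight_var = 2) ignores the noise.
he_init = variance_through_depth(2.0, mu2, depth=20)
# Rescaling by the noise second moment keeps the variance fixed.
noise_aware = variance_through_depth(2.0 / mu2, mu2, depth=20)

print("weight_var = 2      :", he_init[-1])
print("weight_var = 2 / mu2:", noise_aware[-1])
```

Under these assumptions the variance grows geometrically with depth when the noise is ignored and stays constant once the weight variance is divided by the noise second moment, which is the kind of effect a noise-aware initialisation is meant to correct.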