A hyperheuristic approach towards the training of artificial neural networks (Stellenbosch : Stellenbosch University, 2021-03) Nel, Gerrit Stephanus; Van Vuuren, J. H.; Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering.

ENGLISH ABSTRACT: In 2015, approximately 2.5 × 10¹⁸ bytes of data were generated on a daily basis. The enormity and nature of these data have laid bare the inadequacies of standard data analytic approaches. Researchers and practitioners have long lacked the means to extract insight from the vast amounts of data at their disposal - until now, that is. Recent advances within the domain of artificial intelligence have ushered in a new era, providing the essential connective tissue between data and analysis. These advances can be attributed to instrumental research conducted within the field of machine learning, research that has endowed algorithms with the inherent ability to learn. A groundbreaking algorithm at the forefront of the current machine learning impetus is the artificial neural network. Artificial neural networks are computational models inspired by biological neural networks. This process of neurological emulation enables artificial neural networks to acquire an ability intrinsic to their muse - i.e. to learn from experience. A characteristic that distinguishes this algorithm from other machine learning algorithms is the efficiency and effectiveness with which it can recognise complex patterns and abstractions within data. The process according to which this algorithm learns patterns from data is called training and is arguably its most intriguing facet. Conventionally, the method of gradient descent (or steepest descent) is employed to find good network parameter values. Because such gradient-based methods require differentiability, however, a limitation is imposed on the level of abstraction at which optimisation can transpire. A gradient-free approach offers a good alternative.
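The conventional gradient-based training referred to above can be illustrated with a minimal sketch (an illustrative example, not the dissertation's implementation): a one-hidden-layer network learns the XOR mapping by repeatedly stepping its weights against the gradient of a squared-error loss. The layer sizes, learning rate, and iteration count are assumptions chosen purely for the example.

```python
import numpy as np

# Minimal sketch of gradient-descent training (illustrative only, NOT the
# dissertation's implementation): a one-hidden-layer sigmoid network learns
# XOR by stepping its weights against the squared-error loss gradient.
# All sizes and hyperparameters are assumptions chosen for the example.

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # input -> hidden
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: error signals for the squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent: move every parameter against its gradient.
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

loss = float(np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - y) ** 2))
```

Note that every update requires a gradient, which is precisely why only differentiable quantities (the weights) can be optimised this way; discrete choices such as the network structure or the activation functions fall outside its reach.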
More specifically, the research field of metaheuristics provides powerful optimisation techniques that are applicable in the context of training artificial neural networks. A metaheuristic optimisation approach allows for far greater freedom during artificial neural network training - a network's weights, structure, and activation functions can all be optimised concurrently. This versatility of metaheuristics, together with their proven capability in many optimisation contexts, justifies their central role in this dissertation. A challenge common to all optimisation approaches, however, is the decision of which algorithm to employ for a given problem. Fortunately, the relatively new and promising field of hyperheuristics provides the means to circumvent this challenge - a hyperheuristic is, in essence, a heuristic that chooses heuristics. The hyperheuristic considered in this dissertation is the AMALGAM method. AMALGAM is a powerful and robust optimisation approach that has delivered significant performance improvements (approaching a factor of ten) whilst enhancing general applicability over various benchmark problems. This hyperheuristic has not previously been applied in the literature to the optimisation problem of training artificial neural networks in respect of their network weights, network structure, and activation functions concurrently. An AMALGAM-based hyperheuristic training algorithm is therefore proposed in this dissertation. The novelty of the problem under investigation, however, necessitates a new mathematical learning model. In addition, novel modifications to AMALGAM are made so as to enable its use in neural network training. A bi-objective hyperheuristic training algorithm is designed, in which the main objective represents a novel network performance measure while a secondary so-called helper objective is incorporated to guide the search process.
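The notion of "a heuristic that chooses heuristics" can be sketched with a toy selection hyperheuristic (far simpler than AMALGAM, and not the dissertation's algorithm): trials on a toy objective are allocated among low-level perturbation heuristics in proportion to their recent success, loosely mirroring how AMALGAM divides offspring among its sub-algorithms according to their contribution. All operators, the objective, and the settings below are illustrative assumptions.

```python
import random

def sphere(x):
    """Toy objective to be minimised (an assumption for this example)."""
    return sum(v * v for v in x)

def gaussian_perturbation(scale):
    """A low-level heuristic: perturb every coordinate by Gaussian noise."""
    def heuristic(x):
        return [v + random.gauss(0.0, scale) for v in x]
    return heuristic

random.seed(1)
heuristics = [gaussian_perturbation(s) for s in (1.0, 0.1, 0.01)]
credit = [1.0] * len(heuristics)  # one unit of initial credit each

x = [random.uniform(-5.0, 5.0) for _ in range(5)]
fx = sphere(x)
for _ in range(5000):
    # Hyperheuristic step: pick a low-level heuristic with probability
    # proportional to the credit it has accumulated so far.
    i = random.choices(range(len(heuristics)), weights=credit)[0]
    candidate = heuristics[i](x)
    fc = sphere(candidate)
    if fc < fx:              # accept improving moves only
        x, fx = candidate, fc
        credit[i] += 1.0     # reward the heuristic that produced the move
```

As the search nears the optimum, credit migrates from the coarse operator towards the finer ones, so the mix of heuristics adapts over time - the essential hyperheuristic idea, which AMALGAM realises at the level of entire evolutionary sub-algorithms rather than single perturbation operators.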
A test suite, comprising several data sets, is created in order to evaluate the efficacy of the proposed training algorithm. Three extensive parameter evaluations are performed so as to gain insight into algorithmic performance under different conditions. An in-depth performance comparison is also carried out, in which the performance of the proposed hyperheuristic training algorithm is compared with those of its constituent sub-algorithms. The robustness of the proposed approach is furthermore validated by means of a meta-generalisation analysis. A comparison between the hyperheuristic training algorithm and powerful gradient-based training algorithms is performed, supplemented by an investigation into the potential consolidation of the hyperheuristic approach with the best gradient-based algorithm. An in-depth investigation is launched into the temporal dynamics of the hyperheuristic's sub-algorithms with a view to gaining new insight into this novel approach towards training artificial neural networks and to predicting algorithmic performance. A demonstration of how the working of the hyperheuristic can be improved by means of this prediction model is also provided. Finally, the structural attributes of favourable networks produced by the hyperheuristic are analysed with a view to gaining further insight into its working.