Bayesian machine learning : theory and applications

Date
2020-12
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH SUMMARY : Machine learning problems are, in general, concerned with the ability of different methods and algorithms to extract useful and interpretable information from large datasets, possibly ones corrupted by noisy measurements or errors in data capture. As the size and complexity of data increase, the demand for efficient and robust machine learning techniques is greater than ever. All statistical techniques can be classified as either frequentist or Bayesian, depending on how probability is interpreted and how the unknown parameter set is treated. Bayesian methods have existed for several centuries; however, it was the advent of improved computational power and memory storage that catalysed the use of Bayesian modelling approaches in a wider range of scientific fields. This is largely because many Bayesian methods require the computation of complex integrals, some of which are analytically intractable; these methods are now more accessible because approximation methods have become less time-consuming to execute. This thesis considers a Bayesian approach to statistical modelling and takes the form of a postgraduate course in Bayesian machine learning. Several machine learning topics are covered from a Bayesian perspective and, in many cases, compared with their frequentist counterparts to illustrate some of the benefits of Bayesian modelling. The topics covered focus on the more popular methods in the machine learning literature. Firstly, Bayesian approaches to classification techniques, as well as a fully Bayesian approach to linear regression, are discussed. Further, no discussion of machine learning methods would be complete without consideration of variable selection techniques; thus, a range of Bayesian variable selection and sparse Bayesian learning methods is considered and compared.
Finally, probabilistic graphical models are presented, since these methods form an integral part of Bayesian artificial intelligence. The discussion of each technique includes a practical implementation. These examples are all easily reproducible and demonstrate the performance of each method. Where applicable, a comparison of the Bayesian and frequentist methods is provided. The topics covered are by no means exhaustive of the Bayesian machine learning literature but rather provide a comprehensive overview of the most commonly encountered methods.
AFRIKAANS SUMMARY (English translation) : Machine learning problems are generally concerned with the ability of different methods and algorithms to extract useful and interpretable information from large and possibly unusable datasets. As the size and complexity of data increase, the demand for efficient and robust machine learning techniques is greater than ever before. All statistical techniques can be divided into a frequentist or a Bayesian approach, depending on how probability is interpreted and how the unknown parameter set is treated. Bayesian methods have existed for quite a few decades. It was, however, the arrival of improved computing power and memory storage that catalysed the use of Bayesian modelling approaches in a wider variety of scientific fields. This is largely because many Bayesian methods require the computation of complex integrals, which are sometimes analytically intractable in closed form, and which are now more accessible because approximation methods are less time-consuming to execute. This thesis discusses the Bayesian approach to statistical modelling and takes the form of a postgraduate course in Bayesian machine learning. A comprehensive overview of several machine learning topics is given from a Bayesian perspective. In many cases these are compared with their frequentist counterparts to illustrate the benefits of using Bayesian modelling. The topics covered focus on the more popular methods in the machine learning literature. Firstly, Bayesian approaches to classification techniques, as well as a fully Bayesian approach to linear regression, are discussed. Further, no discussion of machine learning methods would be complete without consideration of variable selection techniques. A range of Bayesian variable selection methods and some Bayesian learning methods are therefore considered and compared.
Finally, probabilistic graphical models are discussed, since these methods play an important role in Bayesian artificial intelligence. A practical example is included with the discussion of each technique. These examples are easy to reuse and show the benefits of each method. Where possible, a comparison of the Bayesian and frequentist methods is also given. The topics presented by no means cover the complete Bayesian machine learning literature, but offer a comprehensive overview of the methods that are most commonly encountered and used.
Description
Thesis (MCom)--Stellenbosch University, 2020.
Keywords
Machine learning -- Theory, methods, etc.; Bayesian statistical decision theory -- Theory, methods, etc.; UCTD