Initialisation of noise-regularised neural networks

dc.contributor.advisor: Kroon, R. S. (Steve) [en_ZA]
dc.contributor.author: Van Biljon, Elan [en_ZA]
dc.contributor.other: Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. [en_ZA]
dc.date.accessioned: 2021-09-09T14:02:23Z
dc.date.accessioned: 2021-12-22T14:14:36Z
dc.date.available: 2021-09-09T14:02:23Z
dc.date.available: 2021-12-22T14:14:36Z
dc.date.issued: 2021-12
dc.description: Thesis (MSc)--Stellenbosch University, 2021. [en_ZA]
dc.description.abstract: ENGLISH ABSTRACT: Recently, proper initialisation and stochastic regularisation techniques have greatly improved the performance and ease of training of neural networks. Some research has gone into how the magnitude of the initial weights impacts optimisation, while other work has focused on how initialisation affects signal propagation. In terms of noise regularisation, dropout has allowed networks to train relatively quickly and has reduced overfitting. Much research has gone towards understanding why dropout improves the generalisation of networks. Two major theories are (i) that it prevents neurons from becoming too dependent on the output of other neurons and (ii) that dropout leads a network to optimise a smoother loss landscape. Despite this, our theoretical understanding of the interaction between regularisation and initialisation is sparse. Thus, the aim of this work was to broaden our knowledge of how initialisation and stochastic regularisation interact and what impact this has on network training and performance. Because rectifier activation functions are widely used, we extended recent network signal propagation theory to rectifier networks that may use stochastic regularisation. Our theory predicted a critical initialisation that allows the variance of the pre-activations to propagate stably through the network. However, our theory also indicated that stochastic regularisation reduces the depth to which correlation information can propagate in ReLU networks. We validated this theory and showed that it accurately predicts a boundary across which networks do not train effectively. We then extended the investigation by conducting a large-scale randomised controlled trial to search for initialisations in a region around the critical initialisation that conserves the input signal, in the hope of finding initialisations that provide advantages to training or generalisation.
We compared the critical initialisation to 10 other initialisation schemes in a trial that consisted of over 12000 networks. We found that initialisations much larger than the critical initialisation provide extremely poor performance, while network initialisations close to the critical initialisation provide similar performance. No initialisations clearly outperformed the critical initialisation. Thus, we recommend it as a safe default for practitioners. [en_ZA]
dc.description.abstract: AFRIKAANS SUMMARY: No summary available. [af_ZA]
dc.description.sponsorship: The financial assistance of the Council for Scientific and Industrial Research (CSIR) towards this research is hereby acknowledged. Opinions expressed and conclusions arrived at are those of the author and are not necessarily to be attributed to the CSIR. [en_ZA]
dc.description.version: Masters [en_ZA]
dc.format.extent: xiii, 146 pages : illustrations [en_ZA]
dc.identifier.uri: http://hdl.handle.net/10019.1/123661
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Deep learning (Machine learning) [en_ZA]
dc.subject: Neural networks (Computer science) -- Noise [en_ZA]
dc.subject: Stochastic regularisation [en_ZA]
dc.subject: Critical initialisation [en_ZA]
dc.subject: Signal propagation [en_ZA]
dc.subject: Neural network initialisation [en_ZA]
dc.subject: Noise (Computer science) [en_ZA]
dc.subject: UCTD
dc.title: Initialisation of noise-regularised neural networks [en_ZA]
dc.type: Thesis [en_ZA]
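
The abstract describes a critical initialisation that keeps the pre-activation variance stable in ReLU networks with stochastic regularisation, but this record does not state the formula. The sketch below is only an illustration under stated assumptions: it assumes inverted dropout with keep probability `keep_prob`, zero biases, and Gaussian weights with variance `sigma_w_sq / width`. Under a standard mean-field variance recursion, those assumptions suggest `sigma_w_sq = 2 * keep_prob` as the variance-preserving choice. That value, the function name `preactivation_variances`, and all parameter values are assumptions made here, not details taken from the thesis.

```python
import numpy as np


def preactivation_variances(sigma_w_sq, keep_prob, depth=20, width=1000,
                            batch=64, seed=0):
    """Track the empirical pre-activation variance of a random ReLU network.

    Assumptions (not from the thesis record): fully connected layers of equal
    width, zero biases, weights ~ N(0, sigma_w_sq / width), and inverted
    dropout (Bernoulli mask divided by keep_prob) applied after each ReLU.
    """
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((batch, width))  # unit-variance random inputs
    variances = []
    for _ in range(depth):
        mask = rng.random((batch, width)) < keep_prob
        a = np.maximum(h, 0.0) * mask / keep_prob  # ReLU, then dropout noise
        w = rng.standard_normal((width, width)) * np.sqrt(sigma_w_sq / width)
        h = a @ w  # next layer's pre-activations
        variances.append(float(np.mean(h ** 2)))
    return variances


keep_prob = 0.6
# Assumed variance-preserving scale for inverted dropout: sigma_w^2 = 2 * p.
critical = preactivation_variances(2.0 * keep_prob, keep_prob)
# The noise-free ReLU scale (sigma_w^2 = 2) ignores the dropout noise.
noise_free = preactivation_variances(2.0, keep_prob)

print("assumed critical scale, last/first variance:", critical[-1] / critical[0])
print("noise-free scale, last/first variance:", noise_free[-1] / noise_free[0])
```

In this toy setting the variance should stay roughly constant with depth at the assumed critical scale, while the noise-free scale lets it grow by a factor of about `1 / keep_prob` per layer. This is in line with the abstract's finding that initialisations much larger than the critical one perform poorly, but it does not reproduce the thesis's experiments.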
Files
Original bundle
Name: vanbiljon_initialisation_2021.pdf
Size: 4.66 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text