A deep convolutional neural network architecture for image classification

Pretorius, Willem Lodewikus (2020-12)

Thesis (MCom)--Stellenbosch University, 2020.

Thesis

ENGLISH SUMMARY: Convolutional Neural Networks (CNNs), a specialised form of Neural Networks (NNs), are well known for the state-of-the-art results they have achieved in Computer Vision (CV) and Deep Learning (DL) tasks over the past few years. Application areas of CNNs include image classification, object detection, video processing, natural language processing, and speech recognition. The powerful learning ability of deep CNNs is primarily owed to the use of multiple feature extraction stages that can automatically learn representations from the data. The availability of large amounts of data and improvements in hardware technology have accelerated research on CNNs, and interesting deep CNN architectures have recently been reported. Several ideas for advancing CNNs have been explored, such as the use of different activation and loss functions, parameter optimisation, regularisation, and architectural innovations based on different layer structures. The objective of this study therefore centres on image classification and object detection tasks: creating custom-designed CNN architectures for deployment on real-world datasets, comparing these architectures to the state-of-the-art architectures found in the literature, and comparing different optimisation procedures and activation functions. All major developments of CNNs are discussed and critically considered, with a view to improving CNNs in terms of the number of parameters needed to obtain satisfactory results, and to gaining a better understanding of the 'black box' label usually attached to CNNs, which are complex models whose classifications are poorly understood. The most promising modern CNN architectures, with their associated hyperparameters, are further explored by means of empirical work.
The validity of findings reported in the literature is evaluated, and comments are made on the effectiveness of recent proposals, using five different real-world datasets. The empirical work is complemented by additional coded notebooks that can be used to implement state-of-the-art techniques, as well as for comparative and model assessment experiments.

AFRIKAANSE OPSOMMING: Konvolusionele Neurale Netwerke (KNNe), 'n gespesialiseerde vorm van neurale netwerke, is bekend vir hul nuutste resultate wat gedurende die afgelope paar jaar in rekenaarvisie en diepleer take behaal is. Sommige van die opwindende toepassingsgebiede van KNNe sluit beeldklassifikasie, voorwerpopsporing, videoverwerking, natuurlike taalverwerking en spraakherkenning in. Die kragtige leervermoë van diep KNNe is hoofsaaklik te danke aan die gebruik van verskeie funksie-ekstraksie-fases wat outomaties voorstellings uit die data kan leer. Die beskikbaarheid van 'n groot hoeveelheid data en verbetering in hardeware-tegnologie het die navorsing wat in KNNe gedoen is, versnel, en daar is onlangs berig oor interessante diep KNN-argitekture. Verskeie inspirerende idees om vooruitgang in KNNe te bring, is ondersoek, soos die gebruik van verskillende aktiverings- en verliesfunksies, parameter optimalisering, regulering en argitektoniese innovasies waardeur verskillende laagstrukture gebruik word. Daarom is die doelstelling van hierdie studie gebaseer op beeldklassifikasie en opsporingstake, dit wil sê die opstel van KNN-argitekture wat ontwerp is vir die implementering op werklike datastelle, terwyl hierdie ontwerpte argitekture vergelyk word met 'state-of-the-art' argitekture wat in die literatuur voorkom, terwyl verskillende optimaliseringsprosedures en aktiveringsfunksies vergelyk word. Alle belangrike ontwikkelings van KNNe word bespreek en krities oorweeg, met die oog op die verbetering van KNNe in die konteks van die aantal parameters wat gebruik word om bevredigende resultate te verkry, en om 'n beter begrip te kry van die term wat bekend staan as 'swart boks', wat gewoonlik geassosieer met KNNe is, waar dit komplekse modelle is met min begrip van die manier waarop hul klassifikasies gedoen word. Die mees moderne KNN-argitekture met gepaardgaande hiperparameters word verder ondersoek deur middel van praktiese werk.
Evaluering word gedoen oor die geldigheid van die bevindings wat in die literatuur gerapporteer word en kommentaar word gelewer op die doeltreffendheid van onlangse voorstelle deur die gebruik van vyf verskillende werklike datastelle. Die empiriese werk wat verrig word, sal aangevul word deur addisionele gekodeerde notaboeke wat gebruik kan word om moderne tegnieke te implementeer, asook vir vergelykende en model assesserings eksperimente.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/109135