The class imbalance problem in computer vision

Crous, Willem Hendrik

dc.contributor.advisor	Brink, Willie	en_ZA
dc.contributor.author	Crous, Willem Hendrik	en_ZA
dc.contributor.other	Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences (Applied Mathematics)	en_ZA
dc.date.accessioned	2022-02-23T08:03:54Z
dc.date.accessioned	2022-04-29T09:28:18Z
dc.date.available	2022-02-23T08:03:54Z
dc.date.available	2022-04-29T09:28:18Z
dc.date.issued	2022-04
dc.description	Thesis (MSc)--Stellenbosch University, 2022.	en_ZA
dc.description.abstract	ENGLISH ABSTRACT: Class imbalance is a naturally occurring phenomenon, typically characterised as a dataset consisting of classes with varying numbers of samples. When trained on class imbalanced data, networks tend to favour frequently occurring (majority) classes over the less frequent (minority) classes. This poses challenges for tasks reliant upon accurate recognition of the less frequent classes. The aim of this thesis is to investigate general methods towards addressing this problem. First we establish why a network may favour majority classes. We contend that as less frequent classes are likely to under-represent the required underlying distribution for a given task, training may produce a decision boundary that transgresses the feature space of minority classes. Additionally we find that the weight norms of the classification layer in a neural network may tend towards the distribution of the training data, thus affecting the decision boundary. We determine that this decision boundary shift impacts both the accuracy and confidence calibration of neural networks. We investigate several approaches to shift the decision boundary. The first approach acquires additional data and increases the representation of minority classes. This is achieved through either creating synthetic samples following a distribution-aware regularisation method, or utilising additional unlabelled data in a semi-supervised setting. The second approach aims to adjust the classifier weight norms by separately training the classifier and feature extractor. We find that implementing an effective regularisation method with a simple decoupled sampling scheme can provide considerable improvements over standard sampling methods. Furthermore we find that utilising additional unlabelled data may lead to additional gains given certain dataset characteristics are taken into consideration.	en_ZA
dc.description.abstract	AFRIKAANS ABSTRACT: No abstract available	af_ZA
dc.description.version	Masters	en_ZA
dc.format.extent	76 pages	en_ZA
dc.identifier.uri	http://hdl.handle.net/10019.1/124719
dc.language.iso	en_ZA	en_ZA
dc.publisher	Stellenbosch : Stellenbosch University	en_ZA
dc.rights.holder	Stellenbosch University	en_ZA
dc.subject	Computer vision	en_ZA
dc.subject	Image classification	en_ZA
dc.subject	Class imbalance	en_ZA
dc.subject	UCTD	en_ZA
dc.subject	Data sets -- Characteristic	en_ZA
dc.title	The class imbalance problem in computer vision	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: crous_class_2022.pdf
Size: 5.25 MB
Format: Adobe Portable Document Format

Collections

Masters Degrees (Applied Mathematics)