Extreme value-based novelty detection

Steyn, Matthys Lucas (2017-12)

Thesis (MCom)--Stellenbosch University, 2017.

Thesis

ENGLISH SUMMARY : This dissertation investigates extreme value-based novelty detection. An in-depth review of the theoretical proofs and an analytical investigation of current novelty detection methods are given. It is concluded that the use of extreme value theory for novelty detection leads to superior results. The first part of this dissertation provides an overview of novelty detection and the various methods available to construct a novelty detection algorithm. Four broad approaches are discussed, with this dissertation focusing on probabilistic novelty detection. A summary of the applications of novelty detection and the properties of an efficient novelty detection algorithm are also provided. The theory of extremes plays a vital role in this work. Therefore, a comprehensive description of the main theorems and modelling approaches of extreme value theory is given. These results are used to construct various novelty detection algorithms based on extreme value theory. The first extreme value-based novelty detection algorithm is termed the Winner-Takes-All method. The model’s strong theoretical underpinning as well as its disadvantages are discussed. The second method reformulates extreme value theory in terms of extreme probability density. This definition is utilised to derive a closed-form expression of the probability distribution of a Gaussian probability density. It is shown that this distribution is in the minimum domain of attraction of the extremal Weibull distribution. Two other methods to perform novelty detection with extreme value theory are explored, namely the numerical approach and the approach based on modern extreme value theory. Both these methods approximate the distribution of the extreme probability density values under the assumption of a Gaussian mixture model. In turn, novelty detection can be performed in complex settings using extreme value theory. 
To demonstrate an application of the discussed methods, a banknote authentication dataset is analysed. It is clearly shown that extreme value-based novelty detection methods are extremely efficient in detecting forged banknotes. This demonstrates the practicality of the different approaches. The concluding chapter compares the theoretical justification, predictive power and efficiency of the different approaches. Proposals for future research are also discussed.
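The idea of thresholding on the distribution of the most extreme (lowest) probability density value can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a univariate Gaussian model of normality, uses the exact finite-sample distribution of the minimum of n density values rather than the limiting Weibull form, and all function names, the block size n and the level alpha are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def fit_gaussian(sample):
    """Estimate the mean and standard deviation of the 'normal' class."""
    return statistics.fmean(sample), statistics.stdev(sample)


def density_cdf(x, mu, sigma):
    """G(f(x)) = P(f(X) <= f(x)) for X ~ N(mu, sigma^2).

    A point has lower density than x exactly when it lies further from the
    mean, so this equals the two-sided tail probability P(|Z| >= |z|).
    """
    z = abs(x - mu) / sigma
    return math.erfc(z / math.sqrt(2.0))


def min_density_cdf(x, mu, sigma, n):
    """P(minimum of n i.i.d. density values <= f(x)) = 1 - (1 - G)^n."""
    g = density_cdf(x, mu, sigma)
    return 1.0 - (1.0 - g) ** n


def is_novel(x, mu, sigma, n, alpha=0.05):
    """Flag x when even the least likely of n normal points would rarely
    have a density as low as f(x). alpha is an assumed level."""
    return min_density_cdf(x, mu, sigma, n) < alpha


# Simulated 'normal' data; n = 50 is an assumed block size.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
mu, sigma = fit_gaussian(train)
n = 50
for x in (0.5, 2.5, 5.0):
    print(x, min_density_cdf(x, mu, sigma, n), is_novel(x, mu, sigma, n))
```

For a standard Gaussian with n = 50, a point at 2.5 has a two-sided tail probability of roughly 0.01, so a naive per-point test at the 5% level would flag it. The extreme value view does not, because the probability that the least likely of 50 normal observations has a density at least that low is close to one half.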

AFRIKAANS SUMMARY : This dissertation investigates novelty detection based on extreme value theory. The theoretical proofs are described in detail and current methods are analysed. It is found that the use of extreme value theory for novelty detection leads to exceptional results. The first part of the dissertation provides an overview of novelty detection and the different methods that can be used to formulate a novelty detection algorithm. Four approaches to novelty detection are discussed. The dissertation emphasises one of them, namely probabilistic novelty detection. This part concludes with a summary of the practical applications of novelty detection and the properties of an efficient novelty detection algorithm. Extreme value theory plays an extremely important role in this work. Therefore, a comprehensive description of the main principles and modelling approaches of extreme value theory is given. These results are used to formulate various novelty detection algorithms based on extreme value theory. First, the extreme value-based novelty detection algorithm known as the Winner-Takes-All method is considered. It is proved that the model is theoretically sound. In the second method, extreme value theory is redefined in terms of extreme probability density. This definition is used to derive a closed-form expression of the probability distribution of a Gaussian probability density. Consequently, it is shown that this distribution falls in the minimum domain of attraction of the extremal Weibull distribution. This is followed by an overview of two other methods that can be used for novelty detection with extreme value theory, namely the numerical method and the method based on modern extreme value theory.
In both of these methods, the distribution of the extreme probability density values is based on the assumption of a Gaussian mixture model. Novelty detection can therefore be performed in complex settings by using extreme value theory. To demonstrate how these methods can be applied in practice, a banknote authentication dataset is analysed. It is clearly shown that novelty detection based on extreme value theory is highly effective at identifying forged banknotes. This emphasises the practical application of the different approaches. The final chapter compares the theoretical justification, predictive power and efficiency of the different approaches. Proposals for future research are also discussed.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/102955