dc.contributor.advisor | Brink, Willie | en_ZA |
dc.contributor.author | Magangane, Luyolo | en_ZA |
dc.contributor.other | Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Applied Mathematics. | en_ZA |
dc.date.accessioned | 2020-12-01T08:38:23Z | |
dc.date.accessioned | 2021-01-31T19:44:53Z | |
dc.date.available | 2020-12-01T08:38:23Z | |
dc.date.available | 2021-01-31T19:44:53Z | |
dc.date.issued | 2020-12 | |
dc.identifier.uri | http://hdl.handle.net/10019.1/109328 | |
dc.description | Thesis (MSc)--Stellenbosch University, 2020. | en_ZA |
dc.description.abstract | ENGLISH ABSTRACT: Reasoning over knowledge expressed in natural language is a problem at the forefront of artificial
intelligence. Question answering is one of the core tasks of this problem, and is concerned
with giving machines the capability of generating an answer given a question, by mimicking the
reasoning behaviour of humans. Relational learning, in combination with information retrieval,
has been explored as a framework for solving this problem. Knowledge graphs (KGs) are
used to represent facts about multiple domains as entities (nodes) and relations (edges), and
the Resource Description Framework (RDF) formalism, subject-predicate-object, is used to encode
these facts. Link prediction then powers knowledge discovery by scoring possible relationships
between entities.
This thesis explores latent feature modelling using tensor factorisation as an approach to link
prediction. Tensor decompositions are an attractive approach as relational domains are usually
high-dimensional and sparse, a setting where factorisation methods have shown very good
results. Previous approaches have focused on shallow models that can scale to large datasets.
More recently, deep models, specifically neural tensor factorisation models, have been applied, as
these models are more expressive and automatically learn the most useful latent features for
entities and relations. In this work we introduce training algorithm optimisations to the neural
tensor network (NTN) and HypER neural tensor factorisation models.
We make use of the TensorFlow reimplementation of NTNs and apply early stopping, adaptive
moment estimation and hyperparameter optimisation using random search. We see improvements
in both cost and accuracy over the baseline NTN reimplementation on the standard link
prediction benchmark datasets WordNet and Freebase. We then apply optimisations to the
HypER model training algorithm. We begin by using batch normalisation to compensate for the
covariate shift caused by hypernetworks, and propose HypER+. We see similar performance
to the HypER baseline on the WN18 dataset, and significant improvement on the FB15k
dataset. We extend our optimisation by initialising entity and relation embeddings with pretrained
word vectors from the GloVe language model. We see marginal improvements over
the baseline on the WN18RR and FB15k-237 datasets. Our results establish HypER+ as a
state-of-the-art model in link prediction based on latent feature modelling. | en_ZA |
dc.description.abstract | AFRIKAANSE OPSOMMING: Redenering oor kennis wat in natuurlike taal uitgedruk word, is ’n probleem aan die voorpunt
van kunsmatige intelligensie. Die beantwoording van vrae is een van die kerntake van hierdie
probleem, en poog om masjiene die vermoë te gee om ’n antwoord te skep vir ’n gegewe
vraag, deur die redenasiegedrag van mense na te boots. Verhoudingsleer, in kombinasie met
die inwin van inligting, is al ondersoek as ’n raamwerk vir die oplossing van hierdie probleem.
Kennisgrafieke (KG’s) word gebruik om feite oor veelvuldige domeine as entiteite (punte) en
verhoudings (lyne) voor te stel, en die bronbeskrywingsraamwerk-formalisme, nl. onderwerp-predikaat-voorwerp,
word gebruik om sulke feite te enkodeer. Skakelvoorspelling dryf dan
kennisontdekking deur moontlike verhoudings tussen entiteite te bepunt.
Hierdie tesis ondersoek latente kenmerkmodellering met behulp van tensorfaktorisering, as ’n
benadering tot skakelvoorspelling. Tensor-ontbindings is ’n aantreklike benadering, aangesien
verhoudingsdomeine gewoonlik hoogdimensioneel en yl is; omstandighede waar faktoriseringsmetodes
reeds baie goeie resultate getoon het. Vorige benaderings het op vlak modelle
gefokus, wat kan skaleer met groot datastelle. Meer onlangs is diep modelle toegepas, spesifiek
neurale tensorfaktoriseringsmodelle, aangesien hierdie modelle meer ekspressief is en
outomaties die nuttigste latente kenmerke vir entiteite en verhoudings kan aanleer. In hierdie
werk stel ons optimering van afrigalgoritmes voor vir die neurale tensornetwerk (NTN) en
HypER neurale tensorfaktoriseringsmodelle.
Ons maak gebruik van die TensorFlow-herimplementering van NTN’s, en pas vroeë-stop,
aanpasbare momentskatting, sowel as hiperparameteroptimering met ewekansige soeke, toe.
Ons sien verbeterings in koste sowel as akkuraatheid oor die basiese NTN-herimplementering,
in die standaard skakelvoorspellingsdatastelle WordNet en Freebase. Ons pas dan optimerings
toe op die HypER-model se afrigtingsalgoritme. Ons begin met die kompensering van
kovariantskuif wat deur hipernetwerke veroorsaak word, met behulp van bondelnormalisering,
en stel HypER+ voor. Ons sien prestasies soortgelyk aan die HypER-basismodel op die
WN18-datastel, en beduidende verbetering op die FB15k-datastel. Ons brei ons optimering uit
deur entiteit- en verhoudingsinbeddings te inisialiseer met vooraf-afgerigte woordvektore van
die GloVe-taalmodel. Ons sien marginale verbeterings oor die basismodel op die WN18RR
en FB15k-237 datastelle. Ons resultate vestig HypER+ as ’n mededingende model in latente
kenmerkmodelleringsgebaseerde skakelvoorspelling. | af_ZA |
dc.format.extent | viii, 84 pages : illustrations | en_ZA |
dc.language.iso | en_ZA | en_ZA |
dc.publisher | Stellenbosch : Stellenbosch University | en_ZA |
dc.subject | Link prediction | en_ZA |
dc.subject | Factorization (Mathematics) | en_ZA |
dc.subject | Machine learning | en_ZA |
dc.subject | Tensor Algebra | en_ZA |
dc.subject | Artificial intelligence | en_ZA |
dc.subject | UCTD | |
dc.title | Link prediction in knowledge graphs using latent feature modelling and neural tensor factorisation | en_ZA |
dc.type | Thesis | en_ZA |
dc.description.version | Masters | en_ZA |
dc.rights.holder | Stellenbosch University | en_ZA |