Training neural word embeddings for transfer learning and translation

dc.contributor.advisor: Van Rooyen, G-J. (en_ZA)
dc.contributor.advisor: Bengio, Yoshua (en_ZA)
dc.contributor.advisor: Hovy, Eduard (en_ZA)
dc.contributor.author: Gouws, Stephan (en_ZA)
dc.contributor.other: Stellenbosch University. Faculty of Engineering. Dept. of Electrical and Electronic Engineering. (en_ZA)
dc.description: Thesis (D. Phi)--Stellenbosch University, 2016. (en_ZA)
dc.description.abstract: ENGLISH ABSTRACT: In contrast to only a decade ago, it is now easy to collect large text corpora from the Web on any topic imaginable. However, for an information processing system to perform a useful task, such as answering a user’s queries about the content of the text, the raw text first needs to be parsed into the appropriate linguistic structures, such as parts of speech, named entities or semantic entities. Contemporary natural language processing systems rely predominantly on supervised machine learning techniques for this task. However, the supervision required to train these models is expensive to come by, since human annotators need to mark up relevant pieces of text with the required labels of interest. Furthermore, machine learning practitioners need to manually engineer a set of task-specific features for each new task, which is a wasteful duplication of effort. An alternative approach is to automatically learn representations from raw text that are useful for predicting a wide variety of linguistic structures. In this dissertation, we hypothesise that neural word embeddings, i.e. representations that use continuous values to represent words in a learned vector space of meaning, are a suitable and efficient approach for learning representations of natural languages that are useful for predicting various aspects of their meaning. We show experimental results that support this hypothesis, and present several contributions that make inducing word representations faster and applicable to monolingual and various cross-lingual prediction tasks. The first contribution to this end is SimTree, an efficient algorithm for jointly clustering words into semantic classes while training a neural network language model with the hierarchical softmax output layer. The second is an efficient subsampling training technique that speeds up learning while increasing the accuracy of word embeddings induced using the hierarchical softmax. The third is BilBOWA, a bilingual word embedding model that can efficiently learn to embed words across multiple languages using only a limited sample of parallel raw text and unlimited amounts of monolingual raw text. The fourth is Barista, a bilingual word embedding model that efficiently uses additional semantic information about how words map into equivalence classes, such as parts of speech or word translations, and includes this information during the embedding process. In addition, this dissertation provides an in-depth overview of the different neural language model architectures, and a detailed, tutorial-style overview of the popular techniques available for training these models. (en_ZA)
dc.description.abstract: AFRIKAANS ABSTRACT: No abstract available (af_ZA)
dc.format.extent: xii, 129 pages (en_ZA)
dc.publisher: Stellenbosch : Stellenbosch University (en_ZA)
dc.subject: Semantic computing (en_ZA)
dc.subject: Neural computers (en_ZA)
dc.subject: Learning systems (Automatic control) (en_ZA)
dc.subject: Translators (computer programs) (en_ZA)
dc.title: Training neural word embeddings for transfer learning and translation (en_ZA)
dc.rights.holder: Stellenbosch University (en_ZA)
