A grammatical framework for the computational parsing of written Afrikaans sentences

dc.contributor.advisor: Gouws, R. H. [en_ZA]
dc.contributor.advisor: Van Rooyen, G-J. [en_ZA]
dc.contributor.advisor: Oosthuizen, J. [en_ZA]
dc.contributor.author: Swarts, Johannes Jacobus [en_ZA]
dc.contributor.other: Stellenbosch University. Faculty of Arts and Social Sciences. Dept. of Afrikaans and Dutch. [en_ZA]
dc.date.accessioned: 2019-10-04T09:18:42Z
dc.date.accessioned: 2019-12-11T06:44:29Z
dc.date.available: 2019-10-04T09:18:42Z
dc.date.available: 2019-12-11T06:44:29Z
dc.date.issued: 2019-12
dc.description: Thesis (PhD)--Stellenbosch University, 2019. [en_ZA]
dc.description.abstract: ENGLISH ABSTRACT: This dissertation investigates which grammatical framework is best suited to computationally representing and parsing written Afrikaans sentences. This knowledge is necessary to build a large-scale Afrikaans treebank, a resource which does not yet exist but is a critical prerequisite for advanced work in Afrikaans natural language processing. To gain this knowledge, we formally describe the building blocks of written Afrikaans from the perspectives of two major grammatical frameworks: constituency grammar and dependency grammar. Using these formal descriptions, we construct the first linguistically motivated treebank for Afrikaans, annotated with both constituency and dependency graphs. We perform k-fold cross-validation on multiple variations of this treebank with four state-of-the-art sentence parsers, and examine the results in detail. Combining insights from the formal descriptions of written Afrikaans with the data obtained during parser evaluation, we conclude that dependency grammar outperforms constituency grammar at computationally representing the syntactic structure of written Afrikaans sentences under the conditions tested. [en_ZA]
dc.description.abstract: AFRIKAANS SUMMARY (English translation): This dissertation investigates which grammatical framework is better suited to the computational representation and parsing of written Afrikaans sentences. This knowledge is needed to build a large-scale Afrikaans treebank, a resource that is currently lacking but is a critical prerequisite for advanced Afrikaans natural language processing. To acquire this knowledge, we formally describe the building blocks of written Afrikaans from the perspectives of two dominant grammatical frameworks: constituency grammar and dependency grammar. These formal descriptions are used to build the first linguistically motivated Afrikaans treebank containing annotations from both grammatical frameworks. Using several variations of this treebank, we then perform k-fold cross-validation with four leading sentence parsers and examine their results closely. On the basis of these results, together with the theoretical insights gained from the formal descriptions of written Afrikaans, we conclude that dependency grammar outperforms constituency grammar for the computational representation of the syntactic structure of written Afrikaans sentences under the conditions tested. [af_ZA]
dc.description.version: Doctoral [en_ZA]
dc.format.extent: 223 pages : illustrations [en_ZA]
dc.identifier.uri: http://hdl.handle.net/10019.1/107037
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Afrikaans language -- Sentences [en_ZA]
dc.subject: Afrikaans language -- Grammar [en_ZA]
dc.subject: Computational linguistics [en_ZA]
dc.subject: Grammar, Comparative and general -- Sentences [en_ZA]
dc.subject: Sentence parsing [en_ZA]
dc.subject: UCTD [en_ZA]
dc.title: A grammatical framework for the computational parsing of written Afrikaans sentences [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: swarts_grammatical_2019.pdf
Size: 4.24 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text