Automatic oral proficiency assessment of second language speakers of South African English

Muller, Pieter F. de V. (2010-03)

Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010.

Thesis

ENGLISH ABSTRACT: The assessment of oral proficiency forms an important part of learning a second language. However, the manual assessment of oral proficiency is a labour-intensive task requiring specific expertise. An automatic assessment system can reduce the cost and workload associated with this task. Although such systems are available, they are typically aimed at assessing students of American or British English, making them poorly suited to speakers of South African English. Additionally, most research in this field is focussed on the assessment of foreign-language students, while we investigate the assessment of second-language students. These students can be expected to have more advanced skills in the target language than foreign-language speakers. This thesis presents a number of scoring algorithms for the automatic assessment of oral proficiency. Experiments were conducted on a corpus of responses recorded during an automated oral test. These responses were rated for proficiency by a panel of raters using five different rating scales. Automatic scoring algorithms were subsequently applied to the same utterances, and their correlations with the human ratings were determined. In contrast to the findings of other researchers, posterior likelihood scores were found to be ineffective as an indicator of proficiency for the corpus used in this study. Four different segmentation-based algorithms were shown to be moderately correlated with human ratings, while scores based on the accuracy of a repeated prompt were found to be well correlated with human assessments. Finally, multiple linear regression was used to combine different scoring algorithms to predict human assessments. The correlations between human ratings and these score combinations ranged between 0.52 and 0.90.

AFRIKAANS ABSTRACT (translated into English): The assessment of oral proficiency is an important component of learning a second language. Carrying out such assessment in practice is, however, a labour-intensive task that requires specific expertise. The use of an automatic system can drastically reduce the cost and workload associated with assessing a large number of students. Although such systems are available, they are typically aimed at assessing students who want to learn American or British English, and are therefore not suited to speakers of South African English. Furthermore, the majority of research in this field focuses on the assessment of foreign-language speakers, whereas this thesis investigates the assessment of second-language speakers. These speakers' oral proficiency can be expected to be more advanced than that of foreign-language speakers. This thesis discusses a number of scoring algorithms for the automatic assessment of oral proficiency. The experiments were conducted on a set of recordings of students' responses to an automated oral test. A panel of human raters assessed these recordings using five different rating scales. The same recordings were processed by the automatic scoring algorithms, and the correlations between the raters' marks and the automatic scores were determined. In contrast to existing research, posterior likelihood algorithms were found not to give a good indication of oral proficiency for our data set. Four algorithms that make use of segmentations were also investigated. The scores from these algorithms showed reasonable correlation with the marks assigned by the raters. In addition, three algorithms aimed at determining the accuracy of repeated sentences were investigated. The scores from these algorithms correlated well with the marks assigned by the raters. Finally, linear regression was used to combine different automatic scores and thereby predict the raters' marks. The correlations between these combinations and the marks assigned by the raters ranged between 0.52 and 0.90.
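
The final step described in the abstracts, combining several automatic scores by multiple linear regression and correlating the combination with human ratings, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`pearson`, `solve`, `fit_linear`, `predict`) and the toy numbers are hypothetical, and the fit is evaluated on the same data it was trained on, which a real study would avoid by using held-out data or cross-validation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def solve(a, b):
    """Solve the square system a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def fit_linear(features, targets):
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, features):
    """Apply fitted weights (intercept first) to each feature vector."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], f)) for f in features]

# Toy data: made-up numbers, NOT taken from the thesis.
# Each tuple holds two hypothetical automatic scores for one utterance.
auto_scores = [(0.2, 0.9), (0.4, 0.7), (0.5, 0.8), (0.7, 0.4), (0.9, 0.3)]
human_ratings = [2.0, 3.0, 3.0, 4.0, 5.0]

weights = fit_linear(auto_scores, human_ratings)
combined = predict(weights, auto_scores)
r_combined = pearson(combined, human_ratings)
print("weights:", weights)
print("correlation of combined score with human ratings:", r_combined)
```

For an in-sample least-squares fit with an intercept, the correlation of the combined score with the ratings is never lower in magnitude than that of any single input score, which is one reason such figures should be read alongside how the regression was validated.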

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/4165