Evaluating four readability formulas for Afrikaans

Jansen, Carel; Richards, Rose; Van Zyl, Liezl (2017)

CITATION: Jansen, C., Richards, R. & Van Zyl, L. 2017. Evaluating four readability formulas for Afrikaans. Stellenbosch Papers in Linguistics Plus, 53:149-166, doi:10.5842/53-0-739.

The original publication is available at http://spilplus.journals.ac.za/


Readability formulas have been used for almost a hundred years to measure how difficult a given text is to comprehend. To date, four readability formulas have been developed for Afrikaans: two published by Van Rooyen (1986), one by McDermid Heyns (2007) and one by McKellar (2008). Our quantitative study tested the validity of these four formulas. We selected 10 texts written in Afrikaans: five articles from a popular magazine and five documents used in government communications. All characteristics included in the four readability formulas were first measured for each text. We then developed five different cloze tests for each text to assess actual text comprehension. Thereafter, 149 Afrikaans-speaking participants with varying levels of education each completed a set of two of the resulting 50 cloze tests. Comparing the text characteristics with the participants' cloze test scores allowed us to determine how accurately the four existing formulas for Afrikaans predict comprehension. Both Van Rooyen formulas produced readability scores that were not significantly correlated with actual comprehension as measured by the cloze tests. For the McKellar formula, however, this correlation was significant, and for the McDermid Heyns formula the correlation with the cloze test scores almost reached significance. The outcomes of each of these last two formulas predicted about 40% of the variance in cloze scores. Readability predictions based only on the average number of characters per word, however, performed considerably better, predicting about 65% of the variance in the cloze scores.
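
The single-predictor comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the article: the function names are hypothetical, the tokenisation rule (a word is any unbroken run of letters, so "'n" counts as the one-letter word "n") is an assumption, and the abstract does not say which statistical routine the authors used. The sketch relies on the fact that, with one predictor, the share of variance explained by a linear fit equals the squared Pearson correlation.

```python
import re
from math import sqrt


def avg_chars_per_word(text):
    """Mean word length in characters (assumed tokenisation: runs of letters)."""
    # [^\W\d_]+ matches Unicode letters only, so Afrikaans diacritics are kept.
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        raise ValueError("text contains no words")
    return sum(len(w) for w in words) / len(words)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def variance_explained(xs, ys):
    """Share of variance in ys predicted by a one-predictor linear fit on xs."""
    return pearson_r(xs, ys) ** 2
```

To use it, one would compute `avg_chars_per_word` for each text, pair the results with the mean cloze score per text, and pass both lists to `variance_explained`. The cloze scores themselves would have to come from real test data; none are reproduced here.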

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/103015