Application of data mining techniques to identify significant patterns in the Grade 12 results of the Free State Department of Education

Date
2017-03
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH SUMMARY : The Free State Department of Education (FSDoE) has a mandate to ensure that examinations and assessment processes are conducted according to the prescribed legislation and that they produce the expected results. It has become common for interested parties within and outside the government to challenge the credibility of Grade 12 results. It is, therefore, the responsibility of the FSDoE to ensure that the input data, which represent the raw marks obtained by learners, give a true reflection of what individual learners achieved during a particular assessment period. This study explores the role that data mining (DM) can play in establishing the credibility of the Grade 12 data in the FSDoE. The study makes use of WEKA, an open-source data mining software package. The software is applied to the 2010-2013 Grade 12 results in the Free State. For this study, two algorithms, j48 and simpleKMeans, were selected for classification and clustering respectively. In line with the widely accepted Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, the selected data were modified and saved in WEKA-compatible csv format. The prepared data represent four selected subjects: English Home Language (EHL), English First Additional Language (EFAL), Mathematics and Mathematical Literacy. Four different models were iteratively generated and analysed, and valuable insights were drawn from them to highlight their possible influence on future decision making in the FSDoE. The analysis focuses on the performance of learners within the performance categories (levels 1 to 7) and compares it with the Free State’s average Grade 12 performance during the selected 2010 to 2013 period.
The English language (EHL and EFAL) models and the Mathematics (Mathematics and Mathematical Literacy) models are analysed and interpreted according to the patterns observed over the four-year period (2010-2013). In addition, the study makes sense of the models generated by WEKA by interpreting them using theories from Bloom’s Mastery Learning and Argyris’ Learning Organisations. Furthermore, the study draws on the 2011 census data to make sense of the results obtained from applying WEKA to the selected 2010-2013 Grade 12 results in the FSDoE. The study concludes with recommendations that the FSDoE may use in planning not only for future Grade 12 results but for all grades. Applying DM tools supports sense making, which helps to establish the credibility of data, as seen with the Grade 12 data in the FSDoE, and can assist decision making.
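As an illustration of the data preparation step described in the summary, the following is a minimal sketch of how raw percentage marks might be mapped to performance levels 1 to 7 and written to csv for loading into WEKA. It is not the study's actual procedure. The band boundaries are an assumption (the commonly used National Senior Certificate achievement scale), and the column names, the "L" prefix on the level, and the sample rows are hypothetical.

```python
import csv
import io

# Assumed achievement bands as (lower bound in %, level). These boundaries
# are not stated in the summary; adjust them if the thesis uses others.
LEVEL_BANDS = [(80, 7), (70, 6), (60, 5), (50, 4), (40, 3), (30, 2), (0, 1)]


def mark_to_level(mark):
    """Map a percentage mark (0-100) to a performance level from 1 to 7."""
    if not 0 <= mark <= 100:
        raise ValueError(f"mark out of range: {mark}")
    for lower_bound, level in LEVEL_BANDS:
        if mark >= lower_bound:
            return level


def to_csv_text(records):
    """Serialise (year, subject, mark) rows to csv text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical column names chosen for this sketch.
    writer.writerow(["year", "subject", "mark", "level"])
    for year, subject, mark in records:
        # The "L" prefix keeps the level non-numeric, so that a csv loader
        # is likely to read it as a category rather than a number.
        writer.writerow([year, subject, mark, f"L{mark_to_level(mark)}"])
    return buffer.getvalue()


# Hypothetical sample rows, not data from the study.
sample = [(2010, "EHL", 83), (2011, "EFAL", 47), (2013, "Mathematics", 29)]
csv_text = to_csv_text(sample)
print(csv_text)
```

A categorical level column of this kind could serve as the class attribute for a decision tree classifier such as j48, while the numeric mark column could feed a clustering algorithm such as simpleKMeans. How the study actually assigned attributes is not stated in the summary.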
AFRIKAANS SUMMARY : The Free State Department of Education (FSDoE) has a mandate to ensure that examinations and assessment processes are carried out according to the prescribed legislation and that they produce the expected results. It is now common for Grade 12 results to be challenged by interested parties within and outside the government with regard to their credibility. It is therefore the responsibility of the FSDoE to ensure that the raw marks obtained by learners are a true reflection of individuals' performance during an assessment period. This study aims to investigate the role of data mining in determining the credibility of Grade 12 data in the Free State. The study makes use of WEKA, a public data mining software package. The software is applied to the 2010-2013 Grade 12 results in the Free State. For this study, two algorithms, j48 and simpleKMeans, are used for classification and clustering respectively. The data were updated and stored in csv format according to the CRISP-DM methodology. The updated data represent four selected subjects: English Home Language (EHL), English First Additional Language (EFAL), Mathematics and Mathematical Literacy. Four models were iteratively generated and analysed, yielding interesting insights with an impact on future decision making in the FSDoE. The analysis focuses on the performance of learners within the performance categories (levels 1-7) and compares it with the Free State's average performance during the selected 2010-2013 period. The English language models (EHL and EFAL) as well as the Mathematics models (Mathematics and Mathematical Literacy) were analysed and interpreted according to the identified patterns, as observed over the four-year period (2010-2013). In addition, the study made sense of the WEKA-generated models and interpreted them with the help of Bloom's Mastery Learning theory and Learning Organisations as conceived by Argyris.
Furthermore, the study uses the 2011 census data to gain more insight into the generated models of the FSDoE's Grade 12 results. Finally, the study makes recommendations that the FSDoE can use in planning not only for future Grade 12 examinations but for all grades. With meaningful application of data mining software during decision making, credibility, as seen with the FSDoE's Grade 12 data, can be established.
Description
Thesis (MPhil)--Stellenbosch University, 2017.
Keywords
Database management, Data mining -- Educational, Department of Education -- Free State (South Africa), Machine learning, UCTD