A framework for identifying the most likely successful underprivileged tertiary bursary applicants

Steynberg, Renier (2016-12)

Thesis (MEng)--Stellenbosch University, 2016.

Thesis

ENGLISH ABSTRACT: A number of non-governmental organisations (NGOs) are mandated to assist in the removal of financial barriers preventing underprivileged, prospective students from enrolling for tertiary studies, by managing the provision of bursaries to promising individuals. These NGOs are, however, often overwhelmed by the number of bursary applications they receive. In order to select the best applicants, very basic and sometimes unjustifiable methods involving weighted criteria are used in industry. A scientifically justifiable decision support system (DSS) framework is instead proposed in this thesis for aiding NGOs in this selection process. This framework is capable of both predicting the tertiary study outcome (success or failure) of bursary applicants and ranking these applicants in terms of potential merit. The three main components of the framework are a predictive component (containing multiple statistical learning models, used as an ensemble, which learn from past data and then make outcome predictions in respect of new applicants), an integration component (which combines the predictions made by the aforementioned models into a single prediction for each applicant), and a ranking component (which produces a rank level for each applicant in addition to his or her combined prediction). Examples of models that may be included in the predictive component are logistic regression, classification and regression trees, random forests, the C4.5 algorithm, and support vector machines, while majority voting and weighted majority voting are examples of methodologies that may be included in the integration component. The integration component works by weighting the various model outputs according to their predictive accuracies in respect of a holdout set. Possible methodologies for the ranking component may be found within the realm of multi-criteria decision analysis techniques.
Examples of these techniques are the ELimination Et Choix Traduisant la REalité III (ELECTRE III) and the Preference Ranking Organisation METHod for Enrichment Evaluations II (PROMETHEE II). In order to demonstrate the practical use of the DSS framework, it is implemented in the context of sample data provided by two NGO industry partners. During an assessment of the performance of the DSS in this context, it is found that the accuracy of the combined success or failure predictions for applicants is superior to that of each individual model on a one-to-one comparison basis. It is also found that the average overall accuracy of the combined predictions surpasses that of the manual processes currently employed by the industry partners. The sample data are further analysed for trends of interest and to identify those variables that seem to be best suited for predicting the tertiary success of prospective students. Surprising and perhaps counter-intuitive results are obtained, indicating that high school averages and subject marks are, in fact, negatively correlated with the eventual tertiary success of past students. This observation is likely due to better performing high school students gravitating to the more challenging, and potentially more prestigious, tertiary institutions, study fields, and qualification types.
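
The integration step described in the abstract (weighting each model's output by its predictive accuracy on a holdout set) can be illustrated with a minimal Python sketch of weighted majority voting. This is not the thesis implementation: the function name `weighted_majority_vote`, the model labels, the accuracy values, and the rule that ties resolve to "failure" are all assumptions made purely for illustration.

```python
from typing import Dict, List


def weighted_majority_vote(
    predictions: Dict[str, List[int]],
    holdout_accuracy: Dict[str, float],
) -> List[int]:
    """Combine per-model 0/1 predictions into one prediction per applicant.

    Each model votes with a weight equal to its holdout accuracy.
    An applicant is predicted to succeed (1) only if the models voting
    "success" hold more than half of the total weight; ties resolve to
    failure (0), which is an assumption of this sketch.
    """
    model_names = list(predictions)
    n_applicants = len(predictions[model_names[0]])
    total_weight = sum(holdout_accuracy[name] for name in model_names)

    combined = []
    for i in range(n_applicants):
        # Sum the weights of all models that predict success for applicant i.
        success_weight = sum(
            holdout_accuracy[name]
            for name in model_names
            if predictions[name][i] == 1
        )
        combined.append(1 if success_weight > total_weight / 2 else 0)
    return combined


# Hypothetical predictions for three applicants (1 = success, 0 = failure).
example_predictions = {
    "logistic_regression": [1, 0, 1],
    "random_forest": [0, 1, 1],
    "svm": [1, 0, 0],
}
# Hypothetical holdout accuracies; these are not results from the thesis.
example_accuracy = {
    "logistic_regression": 0.60,
    "random_forest": 0.80,
    "svm": 0.55,
}

combined_prediction = weighted_majority_vote(example_predictions, example_accuracy)
print(combined_prediction)  # [1, 0, 1]
```

Setting every weight to the same value reduces this to plain majority voting, the other integration methodology named in the abstract.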

AFRIKAANS ABSTRACT (English translation): Various non-governmental organisations (NGOs) fulfil the mandate of removing financial obstacles that keep less privileged, prospective students from enrolling for tertiary studies, by managing the process of awarding bursaries to these individuals. These NGOs are, however, often overwhelmed by the many bursary applications they receive. In an attempt to identify suitable applicants for financial support, very simple and sometimes scientifically unjustifiable methods based on the weighting of criteria are used in industry. A scientifically justifiable decision support system (DSS) framework is instead put forward in this thesis to assist NGOs in this difficult selection decision-making process. This framework is able both to predict the tertiary study outcome (success or failure) of bursary applicants and to rank these applicants in order of potential merit. The three main components of this framework are a predictive component (which employs several statistical learning models as an ensemble to learn from historical data and then make predictions in respect of new bursary applicants), an integration component (which combines the predictions of the aforementioned models into a single prediction for each bursary applicant), and a ranking component (which, besides the prediction for each bursary applicant, also assigns a rank level to each bursary applicant). Examples of models included in the predictive component are logistic regression, classification and regression trees, random forests, the C4.5 algorithm, and support vector machines, while majority voting and weighted majority voting are examples of methodologies that may be included in the integration component. The working of the integration component rests on weighting the various model outputs according to the predictive accuracy of these models in the context of a holdout set. Possible methodologies that may be included in the ranking component stem from the field of multi-criteria decision analysis and include ELimination Et Choix Traduisant la REalité III (ELECTRE III) and Preference Ranking Organisation METHod for Enrichment Evaluations II (PROMETHEE II). The practical applicability of the DSS framework is demonstrated by applying the system to sample data provided by two NGO industry partners. During an assessment of the DSS in this context, it is found that the accuracy of the combined success or failure predictions for bursary applicants is better than that of the individual models on a one-to-one comparison basis. It is also found that the average overall accuracy of the combined predictions surpasses that of the current processes used by the industry partners. The sample data are further analysed to identify interesting trends in the data as well as variables that can be put to good use in predicting the tertiary success of prospective students. Surprising and possibly counter-intuitive results are obtained in this way, indicating that high school averages and subject marks are, in fact, negatively correlated with the eventual tertiary success of past students. This observation may possibly be attributed to better performing high school learners being drawn to more challenging, and potentially more sought-after, tertiary institutions, study fields, and qualification types.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/100336