A Comparison Between Existing Mortality Risk Algorithms and Machine Learning Techniques

Scholtz, Jenny (2022)


Abstract

This thesis assesses the feasibility and benefits of using the patient data of a large private South African hospital group to estimate a model of mortality risk using flexible machine learning techniques. Specifically, I investigate whether such a model would have been able to outperform a commonly used medical scoring system, SAPS 3, in predicting mortality during the second half of the Covid-19 pandemic. A LightGBM machine learning model is shown to be much more accurate in predicting mortality (76.15% accuracy, compared to 56.58% for SAPS 3) for the Covid-19 positive sample. Roughly half of this gain in predictive accuracy is obtained from using the most recent and relevant data to train the model, while the remaining lift is attributable to allowing the model to find patient symptoms and attributes that are measured but ignored by SAPS 3. Interestingly, the flexible functional form of the machine learning models, which allows the predictors to affect mortality through non-linearities and interactions, has a negligible effect on predictive accuracy. The same method is also found to produce more accurate forecasts for patients who tested negative for Covid-19, but this improvement is smaller than for the Covid-19 positive sample. The results of this thesis illustrate that machine learning methods are valuable tools for predicting patient outcomes, particularly when there are unexpected shifts in the relationship between patient features and patient outcomes. Large hospital groups can obtain more accurate forecasts from a dynamic scoring system that is frequently retrained on their own patient data.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/126394