A comparison of the classification performance of two approaches to polychotomous logistic regression
Logistic regression is frequently used to classify an entity of unknown origin into one of a number of available groups. Before this can be done, the unknown parameters in the logistic discriminant functions must be estimated from training data. In the case of more than two groups this requires iterative maximisation of the relevant likelihood function. Since suitable software for this purpose is not generally available, an alternative approach, called the individualised binary approach, was proposed by Begg and Gray (1984) and subsequently widely applied. In this paper the classification performance of the individualised binary approach is compared with that of the classification function obtained from maximisation of the likelihood function. The latter approach is recommended, since it is generally found to be superior. This superiority increases with a decrease in the prevalence of the reference group, with the introduction of correlation amongst the feature variables, and with an increase in the ratio of the number of feature variables to the total training sample size.
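For concreteness, the two estimation approaches being compared can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: groups are labelled $0, 1, \dots, g$ with group $0$ as the reference group, $x$ is the vector of feature variables, and $\alpha_j$, $\beta_j$ are the unknown parameters of the $j$-th logistic discriminant function.

```latex
% Polychotomous logistic model (assumed notation, group 0 as reference)
\begin{align}
P(G = j \mid x) &= \frac{\exp(\alpha_j + \beta_j^{\top} x)}
                        {1 + \sum_{k=1}^{g} \exp(\alpha_k + \beta_k^{\top} x)},
                   \qquad j = 1, \dots, g, \\
P(G = 0 \mid x) &= \frac{1}{1 + \sum_{k=1}^{g} \exp(\alpha_k + \beta_k^{\top} x)}.
\end{align}

% Full maximum likelihood: all parameters are estimated jointly by
% maximising the likelihood over the n training observations, where
% y_{ij} = 1 if observation i belongs to group j and 0 otherwise.
\begin{equation}
L(\alpha, \beta) = \prod_{i=1}^{n} \prod_{j=0}^{g} P(G = j \mid x_i)^{y_{ij}}
\end{equation}

% Individualised binary approach: for each j = 1, ..., g separately,
% an ordinary binary logistic regression is fitted using only the
% training observations from groups 0 and j. Under the model above,
\begin{equation}
P(G = j \mid x,\; G \in \{0, j\}) =
  \frac{\exp(\alpha_j + \beta_j^{\top} x)}{1 + \exp(\alpha_j + \beta_j^{\top} x)},
\end{equation}
% so each binary fit targets the same (\alpha_j, \beta_j) as the joint fit.
```

In either case an entity with feature vector $x$ would be allocated to the group with the largest estimated $P(G = j \mid x)$, with the estimates substituted into the first two equations. Every individualised fit uses the reference group's observations, which suggests why the prevalence of that group matters for the comparison.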