Influential data cases when the C-p criterion is used for variable selection in multiple linear regression

dc.contributor.advisor: Steel, S. J.
dc.contributor.advisor: Van Vuuren, J. O.
dc.contributor.author: Uys, Daniel Wilhelm
dc.contributor.other: Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistical and Actuarial Science. [en_ZA]
dc.date.accessioned: 2012-08-27T11:35:29Z
dc.date.available: 2012-08-27T11:35:29Z
dc.date.issued: 2003
dc.description: Dissertation (PhD)--Stellenbosch University, 2003. [en_ZA]
dc.description.abstract: ENGLISH ABSTRACT: In this dissertation we study the influence of data cases when the Cp criterion of Mallows (1973) is used for variable selection in multiple linear regression. The influence is investigated in terms of the predictive power of the resulting model and the predictor variables it includes when variable selection is applied. In particular, we focus on the importance of identifying and dealing with these so-called selection influential data cases before model selection and fitting are performed. For this purpose we develop two new selection influence measures, both based on the Cp criterion. The first measure is specifically developed to identify individual selection influential data cases, whereas the second identifies subsets of selection influential data cases. The success with which these influence measures identify selection influential data cases is evaluated in example data sets and in simulation. All results are derived in the coordinate-free context, with special application in multiple linear regression. [en_ZA]
dc.description.abstract: AFRIKAANS SUMMARY (translated from Afrikaans): Influential observations when the C-p criterion is used for variable selection in multiple linear regression: In this dissertation we investigate the influence of observations when the Cp criterion of Mallows (1973) is used for variable selection in multiple linear regression. The influence of observations on the predictive power and on the independent variables included in the final selected model is investigated. In particular, we focus on the importance of identifying and dealing with so-called selection influential observations before model selection and fitting are carried out. For this purpose two new influence measures, both based on the Cp criterion, are developed. The first measure is developed specifically to measure the influence of individual observations, while the second measures the influence of subsets of observations on the selection process. The success with which these influence measures identify selection influential observations is assessed in example data sets and in simulation. All results are derived within the coordinate-free context, with special application in multiple linear regression. [af_ZA]
dc.format.extent: 189 p.
dc.identifier.uri: http://hdl.handle.net/10019.1/53464
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Regression analysis [en_ZA]
dc.subject: Dissertations -- Statistics and actuarial science [en_ZA]
dc.subject: C-p criterion [en_ZA]
dc.subject: Variable selection [en_ZA]
dc.subject: Theses -- Statistics and actuarial science [en_ZA]
dc.title: Influential data cases when the C-p criterion is used for variable selection in multiple linear regression [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: uys_influential_2003.pdf
Size: 42.06 MB
Format: Adobe Portable Document Format