The implementation of noise addition partial least squares

Moller, Jurgen Johann

dc.contributor.advisor	Kidd, M.	en_ZA
dc.contributor.author	Moller, Jurgen Johann	en_ZA
dc.contributor.other	University of Stellenbosch. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.
dc.date.accessioned	2009-03-05T11:04:41Z	en_ZA
dc.date.accessioned	2010-07-09T11:08:35Z
dc.date.available	2009-03-05T11:04:41Z	en_ZA
dc.date.available	2010-07-09T11:08:35Z
dc.date.issued	2009-03	en_ZA
dc.description	Thesis (MComm (Statistics and Actuarial Science))--University of Stellenbosch, 2009.	en_ZA
dc.description.abstract	When determining the chemical composition of a specimen, traditional laboratory techniques are often both expensive and time-consuming. It is therefore preferable to employ more cost-effective spectroscopic techniques such as near infrared (NIR). Traditionally, the calibration problem has been solved by means of multiple linear regression to specify the model between the spectral data X and the chemical reference values Y. Traditional regression techniques, however, quickly fail when using spectroscopic data, as the number of wavelengths can easily be several hundred, often exceeding the number of chemical samples. This scenario, together with the high level of collinearity between wavelengths, will necessarily lead to singularity problems when calculating the regression coefficients. Ways of dealing with the collinearity problem include principal component regression (PCR), ridge regression (RR) and partial least squares (PLS) regression. Both PCR and RR require a significant amount of computation when the number of variables is large. PLS overcomes the collinearity problem in a similar way to PCR, by modelling both the chemical and spectral data as functions of common latent variables. The quality of the employed reference method greatly impacts the coefficients of the regression model and, therefore, the quality of its predictions. With both X and Y subject to random error, the quality of the predictions of Y will be reduced with an increase in the level of noise. Previously conducted research focussed mainly on the effects of noise in X. This paper focuses on a method proposed by Dardenne and Fernández Pierna, called Noise Addition Partial Least Squares (NAPLS), that attempts to deal with the problem of poor reference values. Some aspects of the theory behind PCR, PLS and model selection are discussed. This is then followed by a discussion of the NAPLS algorithm.
Both PLS and NAPLS are implemented on various datasets that arise in practice, in order to determine cases where NAPLS will be beneficial over conventional PLS. For each dataset, specific attention is given to the analysis of outliers, influential values and the linearity between X and Y, using graphical techniques. Lastly, the performance of the NAPLS algorithm is evaluated for various	en_ZA
dc.identifier.uri	http://hdl.handle.net/10019.1/3362
dc.language.iso	en	en_ZA
dc.publisher	Stellenbosch : University of Stellenbosch
dc.rights.holder	University of Stellenbosch
dc.subject	Dissertations -- Statistics and actuarial science	en
dc.subject	Theses -- Statistics and actuarial science	en
dc.subject	Assignments -- Statistics and actuarial science	en
dc.subject.lcsh	Chemistry, Analytic -- Statistical methods	en_ZA
dc.subject.lcsh	Principal components analysis	en_ZA
dc.subject.lcsh	Regression analysis	en_ZA
dc.subject.lcsh	Ridge regression (Statistics)	en_ZA
dc.title	The implementation of noise addition partial least squares	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: moller_implementation_2009.pdf
Size: 2.01 MB
Format: Adobe Portable Document Format

Collections

Masters Degrees (Statistics and Actuarial Science)