A comparative evaluation of non-linear time series analysis and singular spectrum analysis for the modelling of air pollution

Date
2000-12
Authors
Diab, Anthony Francis
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: Air pollution is a major concern in the Cape Metropole. A major contributor to the air pollution problem is road transport. For this reason, a national vehicle emissions study is in progress with the aim of developing a national policy regarding motor vehicle emissions and control. Such a policy could bring about vehicle emission control and regulatory measures, which may have far-reaching social and economic effects. Air pollution models are important tools in predicting the effectiveness and the possible secondary effects of such policies. It is therefore essential that these models are fundamentally sound to maintain a high level of prediction accuracy. Complex air pollution models are available, but they require spatial, time-resolved information of emission sources and a vast amount of processing power. It is unlikely that South African cities will have the necessary spatial, time-resolved emission information in the near future. An alternative air pollution model is one that is based on the Gaussian Plume Model. This model, however, relies on gross simplifying assumptions that affect model accuracy. It is proposed that statistical and mathematical analysis techniques will be the most viable approach to modelling air pollution in the Cape Metropole. These techniques make it possible to establish statistical relationships between pollutant emissions, meteorological conditions and pollutant concentrations without gross simplifying assumptions or excessive information requirements. This study investigates two analysis techniques that fall into the aforementioned category, namely, Non-linear Time Series Analysis (specifically, the method of delay co-ordinates) and Singular Spectrum Analysis (SSA). During the past two decades, important progress has been made in the field of Non-linear Time Series Analysis.
An entire "toolbox" of methods is available to assist in identifying non-linear determinism and to enable the construction of predictive models. It is argued that the dynamics that govern a pollution system are inherently non-linear due to the strong correlation with weather patterns and the complexity of the chemical reactions and physical transport of the pollutants. In addition to this, a statistical technique (the method of surrogate data) showed that a pollution data set, the oxides of Nitrogen (NOx), displayed a degree of non-linearity, albeit that there was a high degree of noise contamination. This suggested that a pollution data set will be amenable to non-linear analysis and, hence, Non-linear Time Series Analysis was applied to the data set. SSA, on the other hand, is a linear data analysis technique that decomposes the time series into statistically independent components. The basis functions, in terms of which the data is decomposed, are data-adaptive which makes it well suited to the analysis of non-linear systems exhibiting anharmonic oscillations. The statistically independent components, into which the data has been decomposed, have limited harmonic content. Consequently, these components are more amenable to prediction than the time series itself. The fact that SSA's ability has been proven in the analysis of short, noisy non-linear signals prompted the use of this technique. The aim of the study was to establish which of these two techniques is best suited to the modelling of air pollution data. To this end, a univariate model to predict NOx concentrations was constructed using each of the techniques. The prediction ability of the respective model was assumed indicative of the accuracy of the model. It was therefore used as the basis against which the two techniques were evaluated. 
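The SSA decomposition described above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function name `ssa_decompose`, the `window` parameter and the use of NumPy are assumptions made for illustration. The sketch builds the trajectory matrix of lagged windows, takes its singular value decomposition (which supplies the data-adaptive basis), and converts each rank-one term back into a series by averaging along anti-diagonals.

```python
import numpy as np

def ssa_decompose(series, window):
    """Split a 1-D series into additive components using basic SSA.

    Illustrative sketch only; `window` (the embedding window length) is a
    hypothetical parameter name, not one taken from the thesis.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1
    # Trajectory matrix: column j holds the lagged window x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # The left singular vectors act as the data-adaptive basis functions.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: the mean over each anti-diagonal of the
        # rank-one matrix gives one sample of the reconstructed component.
        flipped = elem[::-1]
        comp = np.array([flipped.diagonal(d).mean()
                         for d in range(-window + 1, k)])
        components.append(comp)
    return np.array(components), s
```

The components sum back to the original series, so a predictive model can be fitted to a few leading components and the remainder treated as noise. How many components to keep is a modelling choice that this sketch does not address.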
The procedure used to construct the model and to quantify the model accuracy, for both the Non-linear Time Series Analysis model and the SSA model, was consistent so as to allow for unbiased comparison. In both cases, no noise reduction schemes were applied to the data prior to the construction of the model. The accuracy of a 48-hour step-ahead prediction scheme and a 100-hour step-ahead prediction scheme was used to compare the two techniques. The accuracy of the SSA model was markedly superior to the Non-linear Time Series model. The paramount reason for the superior accuracy of the SSA model is its adept ability to analyse and cope with noisy data sets such as the NOx data set. This observation provides evidence to suggest that Singular Spectrum Analysis is better suited to the modelling of air pollution data. It should therefore be the analysis technique of choice when more advanced, multivariate modelling of air pollution data is carried out. It is recommended that noise reduction schemes, which decontaminate the data without destroying important higher order dynamics, should be researched. The application of an effective noise reduction scheme could lead to an improvement in model accuracy. In addition to this, the univariate SSA model should be extended to a more complex multivariate model that explicitly encompasses variables such as traffic flow and weather patterns. This will explicitly expose the inter-relationships between the variables and will enable sensitivity studies and the evaluation of a multitude of scenarios.
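For readers unfamiliar with the method of delay co-ordinates named in the abstract, the following is a minimal sketch of how a step-ahead prediction can be formed from a delay embedding. It is an illustration under stated assumptions, not the model built in the thesis: the function names, the parameters `dim`, `lag` and `horizon`, and the single nearest-neighbour predictor are all hypothetical choices made here for brevity.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Return delay vectors [x[t], x[t+lag], ..., x[t+(dim-1)*lag]] as rows."""
    x = np.asarray(series, dtype=float)
    n_vectors = x.size - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and lag")
    return np.column_stack([x[i * lag:i * lag + n_vectors]
                            for i in range(dim)])

def nearest_neighbour_forecast(series, dim, lag, horizon):
    """Predict the value `horizon` steps past the end of the series.

    Assumed scheme: find the past delay vector closest to the current one
    and return what followed it `horizon` steps later.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    x = np.asarray(series, dtype=float)
    emb = delay_embed(x, dim, lag)
    query = emb[-1]               # current state of the system
    candidates = emb[:-horizon]   # states whose future is already known
    if candidates.shape[0] == 0:
        raise ValueError("series too short for this horizon")
    dist = np.linalg.norm(candidates - query, axis=1)
    j = int(np.argmin(dist))
    # Row j ends at sample j + (dim - 1) * lag; look `horizon` steps ahead.
    return x[j + (dim - 1) * lag + horizon]
```

With hourly data, a call such as `nearest_neighbour_forecast(nox, dim=3, lag=6, horizon=48)` would correspond to a 48-hour step-ahead prediction, where `nox`, `dim=3` and `lag=6` are placeholder values rather than settings reported in the thesis. A single-neighbour predictor is sensitive to noise, which is consistent with the abstract's remark that noise contamination limited the non-linear model.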
AFRIKAANS SUMMARY: The high level of air pollution in the Cape Metropole is cause for concern. Vehicles are one of the main causes, and as a result a nationwide investigation into vehicle emissions is currently under way so that a national policy on vehicle emission control can be drawn up. Control measures of this kind can have far-reaching social and economic effects. Air pollution models are of the utmost importance in predicting the effectiveness of possible legislation. It is therefore essential that these models are accurate in order to maintain a high level of prediction accuracy. Complex models are available, but they require spatially and temporally resolved information on emission sources and a great deal of computing power. It is unlikely that South Africa will have this spatially and temporally resolved information on emission sources in the near future. An alternative air pollution model is one based on the Gaussian Plume. This model, however, rests on oversimplified assumptions that affect the accuracy of the model. It is proposed that statistical and mathematical analyses will be the most viable approach to modelling air pollution in the Cape Metropole. These techniques make it possible to determine a statistical relationship between pollution sources, meteorological conditions and pollutant concentrations without oversimplified assumptions or excessive information requirements. This study investigates two analysis techniques that fall into the above category, namely Non-linear Time Series Analysis and Singular Spectrum Analysis (SSA). Important progress has been made in the field of Non-linear Time Series Analysis over the past two decades. A complete set of methods is available to identify non-linearity and to construct predictive models.
It is argued that the dynamics governing a pollution system are non-linear because of the strong relationship they show with weather patterns, as well as the complexity of the chemical reactions and the physical transport of the pollutants. In addition, a statistical technique (the method of surrogate data) provides evidence that an air pollution data set, the oxides of Nitrogen (NOx), shows non-linear behaviour, although there is a high noise level. For this reason, the decision was taken to apply Non-linear Time Series Analysis to the data set. SSA, by contrast, is a linear data analysis technique. It decomposes the time series into statistically independent components. The basis functions, in terms of which the data is decomposed, are data-adaptive, and this makes the technique suitable for the analysis of non-linear systems. The statistically independent components have limited harmonic content, with the result that the components are considerably easier to predict than the time series itself. SSA's effectiveness has also already been proven in the analysis of short, noisy non-linear signals. For these reasons, SSA was applied to the air pollution data. The aim of the investigation was to establish which of the two techniques is better suited to analysing air pollution data. With this aim in view, a univariate model was constructed to predict NOx concentrations using each of the techniques. The prediction ability of the respective model was assumed to serve as a measure of the model's accuracy, and it was therefore used to compare the two models. A consistent procedure was followed to build both models so as to ensure an unbiased comparison. In both cases, no noise reduction techniques were applied to the data. The accuracy of a 48-hour prediction model and a 100-hour prediction model was used to compare the two techniques.
It was found that the accuracy of the SSA model is much better than that of the Non-linear Time Series Analysis model. The main reason for the higher accuracy of SSA is the model's ability to analyse data with high noise levels. This investigation provides convincing evidence that Singular Spectrum Analysis is better suited to analysing air pollution data, and consequently this technique should be used when more advanced, multivariate analyses are carried out. It is recommended that noise reduction techniques that can clean the data without wiping out important higher-order dynamics be investigated. Applying an effective noise reduction technique will lead to an improvement in model accuracy. In addition, the single SSA model should be extended to a more complex multivariate model that includes variables such as traffic flow and weather patterns. This will expose the relationships between the variables and will make possible sensitivity analyses and the evaluation of a multitude of scenarios.
Description
Thesis (MScEng)--University of Stellenbosch, 2000.
Keywords
Air -- Pollution, Air -- Pollution -- Measurement, Air quality management, Air -- Pollution -- South Africa -- Cape Town, Non-linear time series analysis, Dissertations -- Mechanical engineering, Theses -- Mechanical engineering
Citation