Encoding remotely sensed time series data as two-dimensional images for urban change detection using convolution neural networks

Date
2022-04
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: Urban expansion is the most pervasive form of land cover change in South Africa. A method that can effectively detect and indicate areas with a higher probability of urban change would therefore be a valuable asset to analysts, which makes it critical to derive a rapid framework that can accurately map urban change. An alternative remote sensing approach that uses multi-temporal time series data and deep learning techniques has been proposed as a potential method for urban change detection. The interdisciplinary scientific field of computer vision offers a framework for encoding time series data as two-dimensional (2D) images for input to a convolutional neural network (CNN). Traditional image classification techniques, as well as machine learning and deep learning classifiers deployed in more recent studies (namely support vector machine (SVM), random forest (RF), k-nearest neighbour (kNN), long short-term memory (LSTM) and CNN), have been used for urban land cover classification. In this study, a unique framework proposed within computer vision, which uses Gramian angular fields (GAF) and Markov transition fields (MTF) as the transformations for encoding time series data as 2D imagery prior to deep learning classification, is investigated for urban change detection. Two main experiments were carried out, both of which used the proposed framework for urban change detection. The first experiment used coarse resolution data for Pretoria, derived from MODIS 500m and 250m normalised difference vegetation index (NDVI) products. The proposed framework was then deployed, and Gramian angular summation field (GASF), Gramian angular difference field (GADF) and MTF transformations were used to encode the time series data. A concatenated encoded image containing the information from all three transformations was formed and was run alongside the three individual transformations. 
Multiple pre-trained CNN architectures (namely ResNet, DenseNet, InceptionV3, InceptionResNetV2, VGG and MobileNet) were used, from which an urban change detection was derived. It was established that the concatenated images yielded the highest accuracy, at 91% and 93% for the 500m and 250m resolution datasets, respectively. The proposed framework was compared to a current state-of-the-art time series classifier (LSTM) to illustrate the effectiveness of encoding the data and processing it with deep learning classifiers. The results also outperformed those of other urban change detection studies conducted in South Africa. The second experiment made use of higher resolution Sentinel-2 data derived from a resampled 30m resolution NDVI product of Pretoria. Several investigations were made into the elements that influence the performance of the urban change detection: the spatial and temporal resolutions, the training data size and different classification schemes. Using the proposed framework from the first experiment, the spatial and temporal resolutions were tested. The results showed that an increase in spatial or temporal resolution has a positive effect on performance. The 30m resolution dataset yielded a 4% increase over the 250m resolution data tested in the first experiment. Increasing the time series length (TSL) from 32 to 82 raised the accuracy from 96% to 98%. It was also illustrated that increasing the amount of training data could improve the performance of the change detection. Multiple classifications were performed, and the accuracy was assessed using a confusion matrix. It was established that a 70%+ minimum pixel probability and the majority ensemble classifier performed the best. The framework's generalisability was tested at three different locations (Durban, Gqeberha and Khayelitsha), and it was able to generalise using the Durban dataset. 
However, the models were unable to generalise using the Gqeberha and Khayelitsha datasets due to the diverse ecological and climatic properties. The experiments showed that deploying a computer vision framework that encodes multi-temporal time series data as two-dimensional images for urban change detection using CNN classifications is in fact possible, and it proved to be one of the most effective urban change detection methods performed in South Africa. However, it is recommended that further research deploy a signature extension approach for training the models in order to improve the generalisability. Additional research into using Landsat 8 and increased TSL datasets is also recommended.
AFRIKAANS ABSTRACT (English translation): Urban expansion is the predominant form of land cover change in South Africa. A method that can effectively detect and indicate areas with a greater probability of urban change will be a valuable asset to analysts. It is therefore critically important to set up a less time-consuming framework that can accurately map urban change. An alternative remote sensing approach that uses multi-temporal time series data and deep learning techniques is proposed as a possible method for the successful detection of urban change. The interdisciplinary scientific field of computer vision contains a framework for encoding time series data as two-dimensional images that serve as input to a convolutional neural network (CNN). Traditional image classification techniques and more recent studies using machine learning and deep learning classifiers (namely support vector machine (SVM), random forest (RF), k-nearest neighbour classifier (kNN), long short-term memory (LSTM) and CNN) are often used for the classification of urban land cover. In this study, a unique framework developed within computer vision, which uses Gramian angular fields (GAF) and Markov transition fields (MTF) as a transformation for encoding time series data as two-dimensional images before deep learning classification, is investigated for the detection of urban change. Two experiments were carried out, both of which used the proposed framework for the detection of urban change. The first experiment used coarse resolution data obtained for Pretoria, using MODIS 500m and 250m normalised difference vegetation index (NDVI) data. The proposed framework was then deployed, using Gramian angular summation field (GASF), Gramian angular difference field (GADF) and MTF transformations to encode the time series data. 
A concatenated encoded image containing all three transformations was created and analysed together with the three individual transformations. Multiple pre-trained CNN architectures (namely ResNet, DenseNet, InceptionV3, InceptionResNetV2, VGG and MobileNet) were used, from which the urban change was derived. It was established that the concatenated images delivered the highest accuracy, with 91% and 93% for the 500m and 250m datasets, respectively. The proposed framework was compared with a current state-of-the-art time series classifier (LSTM) to illustrate the effectiveness of encoding and processing with a deep learning classifier. The results were also better than those of other urban change studies in South Africa. The second experiment made use of higher resolution Sentinel-2 data, also derived from an NDVI product for Pretoria, processed to 30m. Several investigations were carried out to determine the factors that influence the accuracy of urban change detection, for example the spatial and temporal resolutions, the size of the training data and different classification schemes. Using the proposed framework from the first experiment, the effect of spatial and temporal resolutions was tested. The results showed that an increase in spatial or temporal resolution will have a positive effect on accuracy. The dataset with a resolution of 30m delivered an increase of 4% compared with the 250m resolution data tested in the first experiment. By changing the time series length (TSL) from 32 to 82, the accuracy increased from 96% to 98%. The study also indicated that the accuracy of change detection could be improved by increasing the amount of training data. Multiple classification schemes were carried out and the accuracy was tested using a confusion matrix. 
It was established that a 70%+ minimum pixel probability and the majority ensemble classifier fared the best. The generalisability of the frameworks was tested at three different locations (Durban, Gqeberha and Khayelitsha), but could only be generalised in Durban. The models could not detect urban change with the Gqeberha and Khayelitsha datasets due to the diverse ecological and climatic properties. The experiments showed that implementing a computer vision framework for encoding multi-temporal time series data as two-dimensional images for the detection of urban change using CNN classifications is indeed possible and can be one of the most effective detection techniques for urban change in South Africa. It is, however, recommended that further research use an extension approach for the training data for the models to improve generalisability. Additional research on the use of Landsat8 and increased TSL datasets is also recommended.
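The encoding step named in the abstract (GASF, GADF and MTF applied to a per-pixel NDVI time series, then combined into one image) can be illustrated with a minimal NumPy sketch. This is not the thesis code: the function names, the choice of four quantile bins for the MTF, and stacking the three fields as image channels are all assumptions made for illustration, since the abstract does not say how the concatenated image was formed.

```python
import numpy as np

def rescale(x):
    """Min-max rescale a series to [-1, 1] so that arccos is defined."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("constant series cannot be rescaled")
    return np.clip((2 * x - hi - lo) / (hi - lo), -1.0, 1.0)

def gramian_angular_fields(x):
    """Return (GASF, GADF) for a 1D series."""
    phi = np.arccos(rescale(x))                  # polar-angle encoding
    gasf = np.cos(phi[:, None] + phi[None, :])   # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])   # difference field
    return gasf, gadf

def markov_transition_field(x, n_bins=4):
    """Return the MTF of a 1D series using quantile bins (n_bins is an assumption)."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges; digitize then yields bin labels 0..n_bins-1.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    # Count transitions between the bins of consecutive time steps.
    w = np.zeros((n_bins, n_bins))
    np.add.at(w, (bins[:-1], bins[1:]), 1)
    # Row-normalise counts into transition probabilities (empty rows stay zero).
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    # Entry (i, j) is the transition probability from the bin of x[i] to the bin of x[j].
    return w[bins[:, None], bins[None, :]]

def encode_series(x, n_bins=4):
    """Stack GASF, GADF and MTF as channels (an assumed form of concatenation)."""
    gasf, gadf = gramian_angular_fields(x)
    mtf = markov_transition_field(x, n_bins)
    return np.stack([gasf, gadf, mtf], axis=-1)

# Synthetic NDVI-like series with 32 time steps (illustrative data only).
ndvi = 0.5 + 0.3 * np.sin(np.linspace(0, 4 * np.pi, 32))
image = encode_series(ndvi)
print(image.shape)  # (32, 32, 3)
```

A series of length T produces a T x T image per transformation, so the time series lengths of 32 and 82 mentioned in the abstract would correspond to 32 x 32 and 82 x 82 inputs before any resizing for a pre-trained CNN.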
Description
Thesis (MSc)--Stellenbosch University, 2021.
Keywords
Landsat satellites, Urban geography -- Remote sensing -- South Africa, Neural networks (Computer science), Three dimensional imaging, Machine learning, UCTD