Machine learning and high spatial resolution multi-temporal Sentinel-2 imagery for crop type classification

Stellenbosch : Stellenbosch University
ENGLISH SUMMARY : Spatially explicit crop type information is useful for estimating agricultural production areas. Such information is used for various monitoring and decision-making applications, including crop insurance, food supply-demand logistics, commodity market forecasting and environmental modelling. Traditional methods, such as ground surveys and agricultural censuses, are costly and often labour intensive, which limits their use for timely and accurate crop type data production. Remote sensing, however, offers a dependable, cost-effective and timely way of mapping crop types. Although remote sensing approaches, particularly those using multi-temporal techniques, have been successfully employed for producing crop type information, this information is mostly available only post-harvest. Researchers and decision-makers therefore have to wait several months after harvest for such information, which is usually too late for many applications. The availability and accessibility of imagery collected with optical sensors make such data preferable for mapping crop types. However, these sensors are subject to cloud interference, which has been recognised as a source of error in the retrieval of surface parameters. It is therefore important to assess the strengths and weaknesses of using multi-temporal optical imagery for differentiating crop types. This study uses Sentinel-2A and Sentinel-2B imagery to perform several experiments in selected parts of the Western Cape, South Africa, to undertake this assessment. The first three experiments assessed the significance of image selection for the accuracy of crop type classification. A recommended number of Sentinel-2 images was selected using two different methods. The first of the three experiments was conducted with uni-temporal images. Based on the performance rankings of the uni-temporal images, the five highest-ranked images were used to set up Experiment 2.
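The rank-based selection described above (classify each image on its own, then keep the five best performers) can be sketched as follows. This is a minimal illustration, not the thesis workflow: the function name `top_ranked_images`, the acquisition dates and the accuracy values are all hypothetical placeholders, and the summary does not state which accuracy measure was used for the ranking.

```python
from typing import Dict, List


def top_ranked_images(accuracy_by_date: Dict[str, float], n: int = 5) -> List[str]:
    """Return the n acquisition dates with the highest single-image accuracy.

    The selected dates are returned in chronological order so that they can
    be stacked as a short time series.
    """
    # Rank dates from highest to lowest single-image accuracy.
    ranked = sorted(accuracy_by_date, key=lambda d: accuracy_by_date[d], reverse=True)
    # ISO-formatted dates sort chronologically as plain strings.
    return sorted(ranked[:n])


# Placeholder scores for illustration only; these are NOT results from the thesis.
example_scores = {
    "2017-04-15": 0.41,
    "2017-05-15": 0.48,
    "2017-06-14": 0.55,
    "2017-07-14": 0.63,
    "2017-08-13": 0.71,
    "2017-09-12": 0.74,
    "2017-10-12": 0.69,
    "2017-11-11": 0.58,
}
selected = top_ranked_images(example_scores)
```

In practice the scores would come from training and validating a classifier on each image separately; the sketch only covers the ranking and selection step.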
The third experiment was undertaken with a handpicked set of five images, based on crop developmental stages. The two image selection methods were compared to each other and subsequently to the entire time-series, to determine the significance of selecting images for crop type mapping. These classifications were undertaken with several supervised machine learning classifiers and one parametric classifier. Results showed no significant difference in classification accuracies between the two image selection methods and the entire time-series. Overall, the support vector machine (SVM) and random forest (RF) algorithms outperformed all the other classifiers. The fourth experiment was undertaken by chronologically adding images to the classifiers. The progression of classification accuracies against time and against the increasing number of images was analysed to determine the earliest pre-harvest period at which crops can be classified with sufficient accuracy. The highest pre-harvest accuracy achieved was then compared to that obtained at the end of the season, including images acquired post-harvest, to assess the effectiveness of machine learning classifiers for classifying crop types when only pre-harvest images are used. The results of this experiment showed that machine learning classifiers can classify crops when only pre-harvest images are used, with accuracies similar to those obtained when the entire time-series is used. Satisfactory classification accuracies were attainable as early as Aug/Sept (eight weeks before harvest). The fifth to tenth experiments were undertaken to assess the impact of cloud cover and image compositing on crop type classification accuracies. The fifth and sixth experiments were performed with non-composited images. The fifth experiment was undertaken with cloud-free images only, while the sixth experiment used all available images, including cloud-contaminated observations.
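The chronological-addition procedure of the fourth experiment, described above, can be sketched as a loop over growing image stacks. This is an assumption-laden illustration: `accuracy_progression`, `earliest_sufficient_date`, the `evaluate` callback, the dates and the 0.75 threshold are hypothetical, and the summary does not say what accuracy level was treated as sufficient. The stand-in evaluator simply returns a number that grows with stack size; a real one would train and validate a classifier such as SVM or RF on the stacked images.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def accuracy_progression(
    dates: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Evaluate a classifier on growing, chronologically ordered image stacks."""
    ordered = sorted(dates)  # ISO dates sort chronologically as strings
    progression = []
    for k in range(1, len(ordered) + 1):
        stack = ordered[:k]  # every image acquired up to and including the k-th date
        progression.append((ordered[k - 1], evaluate(stack)))
    return progression


def earliest_sufficient_date(
    progression: Sequence[Tuple[str, float]],
    threshold: float,
) -> Optional[str]:
    """Return the first date at which accuracy reaches the threshold, else None."""
    for date, accuracy in progression:
        if accuracy >= threshold:
            return date
    return None


# Stand-in evaluator for demonstration only: the value grows with stack size
# and saturates. It does not train a model and is not a thesis result.
def toy_evaluate(stack: List[str]) -> float:
    return min(0.9, 0.5 + 0.1 * len(stack))


dates = ["2017-09-12", "2017-06-14", "2017-07-14", "2017-08-13", "2017-10-12"]
progression = accuracy_progression(dates, toy_evaluate)
first_date = earliest_sufficient_date(progression, threshold=0.75)
```

Separating the progression from the threshold check makes it possible to compare the best pre-harvest accuracy with the end-of-season value without re-running the classifier.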
The seventh to tenth experiments were undertaken with monthly image composites computed using four different image compositing approaches. All these experiments were undertaken using several machine learning classifiers. The results showed that machine learning classifiers performed best when all images, including cloud-contaminated images, were used as input to the classifiers. Image compositing had a detrimental effect on classification accuracies. Overall, multi-temporal Sentinel-2 data hold great potential for operational crop type map production early in the season. However, more work is needed to develop simple workflows for eliminating cloud cover, particularly for crop type mapping in areas characterised by frequent overcast conditions.
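As an illustration of what a monthly composite is, the sketch below reduces the cloud-free observations of one pixel in one band to a single value per calendar month. The summary does not name the four compositing approaches that were tested, so the per-month median used here is an assumption chosen for simplicity, and the function name, dates and reflectance values are made up.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

# (acquisition date "YYYY-MM-DD", reflectance value, cloud flag) for one pixel and band
Observation = Tuple[str, float, bool]


def monthly_median_composite(observations: Iterable[Observation]) -> Dict[str, float]:
    """Median of cloud-free values per calendar month for a single pixel and band.

    Months without any cloud-free observation are omitted from the result.
    """
    by_month: Dict[str, List[float]] = defaultdict(list)
    for date, value, is_cloudy in observations:
        if not is_cloudy:
            by_month[date[:7]].append(value)  # group by "YYYY-MM"
    return {month: median(values) for month, values in sorted(by_month.items())}


# Made-up values for illustration only; not data from the thesis.
pixel_series = [
    ("2017-08-03", 0.30, False),
    ("2017-08-13", 0.90, True),   # cloud-contaminated, excluded
    ("2017-08-23", 0.34, False),
    ("2017-09-02", 0.40, False),
    ("2017-09-12", 0.42, False),
    ("2017-09-22", 0.44, False),
    ("2017-10-02", 0.95, True),   # the only October observation is cloudy
]
composite = monthly_median_composite(pixel_series)
```

Here seven observations collapse to two monthly values, and October drops out entirely because its only observation is flagged as cloudy.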
AFRIKAANSE OPSOMMING (English translation) : ... to set up Experiment 2. The third experiment was done with a handpicked set of five images, based on stages of crop development. The two image selection methods were compared with each other and subsequently with the entire time-series, to determine the significance of selecting images for crop type mapping. These classifications were undertaken with several supervised machine learning classifiers and one parametric classifier. Results showed no significant difference in classification accuracies between the two image selection methods and the entire time-series. Overall, the support vector machine (SVM) and random forest (RF) algorithms performed better than all the other classifiers. The fourth experiment was undertaken by adding images to the classifiers chronologically. The progression of classification accuracies against time and the increase in the number of images were analysed to determine the earliest (pre-harvest) period at which crops can be classified with sufficient accuracy. The highest pre-harvest accuracy was then compared with that achieved at the end of the season, including images collected post-harvest, to determine the effectiveness of machine learning classifiers in classifying crop types when only pre-harvest images are used. The results of this experiment showed that machine learning classifiers can classify crops when only pre-harvest images are used, with accuracies similar to those achieved when the entire time-series is used. Satisfactory classification accuracies were achieved as early as Aug/Sep (eight weeks before harvest). The fifth to tenth experiments were undertaken to determine the impact of cloud cover and image compositing on crop type classification accuracies. The fifth and sixth experiments were carried out with non-composited images.
The fifth experiment was done with cloud-free images only, while the sixth experiment used all available images, including cloud-contaminated observations. The seventh to tenth experiments were undertaken with monthly image composites computed using four different image compositing approaches. All these experiments were carried out using several machine learning classifiers. The results showed that machine learning classifiers performed best when all images, including cloud-contaminated images, were used as input to the classifiers. Image compositing had a detrimental effect on classification accuracies. In general, multi-temporal Sentinel-2 data have good potential for operational crop type map production early in the season. More work is nevertheless needed to develop simple workflows for eliminating cloud cover, especially for crop type mapping in areas frequently characterised by overcast conditions.
Thesis (MPhil)--Stellenbosch University, 2019.
Crops -- Classification, Remote sensing, Machine learning, Crops -- Pre-harvest, Remote-sensing images -- Image quality, Sentinel-2, UCTD