Efficacy of machine learning and lidar data for crop type mapping

Prins, Adriaan (2019-12)

Thesis (MA)--Stellenbosch University, 2019.

ENGLISH ABSTRACT: Accurate crop type maps are important for obtaining agricultural statistics such as water use or harvest estimations. The traditional approach to obtaining maps of cultivated fields is to digitise the fields manually from satellite or aerial imagery. However, manual digitising is time-consuming, expensive and subject to human error. Automated remote sensing methods have been a popular alternative for crop type map creation, with machine learning classification algorithms gaining popularity for classifying crop types from satellite imagery. However, using light detection and ranging (LiDAR) data for crop type mapping has not been widely researched. This study assessed the use of LiDAR data for crop type classification, both on its own and in combination with Sentinel-2 and aerial imagery. The first experiment evaluated the use of LiDAR data and machine learning for classifying vineyards. The LiDAR data were obtained from a 2014 survey by the City of Cape Town. The normalised digital surface model (nDSM) and intensity raster derived from the LiDAR data were interpolated at four resolutions (1.5 m, 2 m, 2.5 m and 3 m) and then used for generating a range of texture measures. The texture measures were generated using two window sizes (3x3 and 5x5) per resolution scenario, which resulted in eight datasets. These datasets were then used as input for 11 machine learning classification algorithms, which performed a binary classification of vineyards and non-vineyards. The results showed that LiDAR data are able to discriminate between vineyards and non-vineyards, with the random forest (RF) classifier obtaining the highest overall accuracy (OA) of 80.9%. Furthermore, the results showed that standardising the input data significantly improved the accuracy of the neural network and distance-based classifiers.
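As a rough illustration of the scenario design described above, the sketch below enumerates the eight resolution/window combinations and computes one moving-window texture measure over a small height raster. The abstract does not name the texture measures used, so local variance is an assumption chosen for simplicity, and the `local_variance` helper and the 4 x 4 `ndsm` tile are hypothetical.

```python
from itertools import product
from statistics import pvariance

# Resolutions (metres) and window sizes as stated in the abstract.
RESOLUTIONS_M = [1.5, 2.0, 2.5, 3.0]
WINDOW_SIZES = [3, 5]


def local_variance(grid, window):
    """Population variance in a window x window neighbourhood of each cell.

    Assumed texture measure (not confirmed by the abstract). Windows are
    clipped at the raster edge rather than padded.
    """
    half = window // 2
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(pvariance(values))
        out.append(out_row)
    return out


# One dataset per resolution/window pairing: 4 x 2 = 8 scenarios.
scenarios = list(product(RESOLUTIONS_M, WINDOW_SIZES))

# Hypothetical nDSM tile (heights in metres): bare ground with a single
# raised row, loosely imitating a vine row. Values are made up.
ndsm = [
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.0, 1.5, 0.0],
]

texture_3x3 = local_variance(ndsm, 3)
```

In this toy tile the texture value is zero over flat ground and positive next to the raised row, which is the kind of contrast a classifier could use to separate row crops from smooth surfaces.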
The second experiment used the methods developed for the first experiment to perform a five-class classification. The five classes consisted of maize, cotton, groundnuts, orchards and non-agriculture. Sentinel-2 and aerial imagery data were added to the analysis and were compared with the LiDAR data. The LiDAR data were obtained from a 2016 survey of the Vaalharts irrigation scheme. Furthermore, the three datasets (Sentinel-2, aerial imagery and LiDAR data) were combined in order to evaluate which combination of datasets produces the highest OA. The results showed that the performance of LiDAR data was similar to that of Sentinel-2 imagery, with LiDAR data obtaining a mean OA of 84.3%, while Sentinel-2 obtained a mean OA of 83.6%. The difference between the OAs of LiDAR and Sentinel-2 was not statistically significant. The highest OA (94.6%) was obtained with RF when the LiDAR, Sentinel-2 and aerial datasets were combined. However, a combination of LiDAR data and Sentinel-2 imagery obtained results similar to those obtained when all three datasets were combined, with the difference in OA not being statistically significant. Generally, LiDAR data are suitable for classifying different crop types, with RF obtaining the highest OAs in both experiments. The combination of multispectral and LiDAR data produced the highest OA.
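The sketch below illustrates, under stated assumptions, why standardisation matters for distance-based classifiers once LiDAR-derived and spectral features are stacked into one feature vector. It is not the thesis workflow: the one-nearest-neighbour classifier, the helper names and every feature value are hypothetical, and the resulting accuracies are not results from the study.

```python
from math import dist
from statistics import mean, pstdev


def fit_scaler(rows):
    """Per-feature mean and standard deviation, taken from training data only."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [pstdev(c) or 1.0 for c in cols]


def apply_scaler(rows, mus, sds):
    """Z-score each feature: (value - mean) / standard deviation."""
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def nearest_neighbour_predict(train_x, train_y, test_x):
    """Give each test sample the label of its closest training sample."""
    preds = []
    for x in test_x:
        idx = min(range(len(train_x)), key=lambda i: dist(x, train_x[i]))
        preds.append(train_y[idx])
    return preds


def overall_accuracy(y_true, y_pred):
    """OA = correctly classified samples / all samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical stacked features per sample:
# [canopy height in metres (LiDAR-like), scaled reflectance (Sentinel-2-like)].
train_x = [[4.0, 3000], [4.2, 3400], [2.0, 3100], [2.2, 3500]]
train_y = ["orchard", "orchard", "maize", "maize"]
test_x = [[4.1, 3480], [2.1, 3020]]
test_y = ["orchard", "maize"]

# Without scaling, the large-valued reflectance column dominates the distance.
raw_oa = overall_accuracy(
    test_y, nearest_neighbour_predict(train_x, train_y, test_x)
)

# With z-scores, both features contribute on a comparable scale.
mus, sds = fit_scaler(train_x)
scaled_oa = overall_accuracy(
    test_y,
    nearest_neighbour_predict(
        apply_scaler(train_x, mus, sds), train_y, apply_scaler(test_x, mus, sds)
    ),
)
```

In this constructed example the unscaled classifier mislabels both test samples because the height difference is swamped by the reflectance values, while the standardised version labels both correctly. Tree-based classifiers such as RF split on one feature at a time and are generally unaffected by this kind of rescaling.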

AFRIKAANSE OPSOMMING: Akkurate digitale gewaskaarte is belangrik vir die verkryging van landboustatistieke soos watergebruiks- of gewasopbrengsberaming. Die tradisionele benadering tot die verkryging van digitale gewaskaarte is om dit met die hand van satelliet- of lugfoto’s te versyfer. Hand-versyfering is egter tydrowend, duur en vatbaar vir menslike foute. Outomatiese afstandswaarnemingsmetodes is ’n gewilde alternatief vir die skep van gewaskaarte, met masjienleeralgoritmes wat gewild raak vir die klassifisering van gewasse vanaf satellietbeelde. Die gebruik van slegs ligbespeuring-en-afstandsbepaling (LiBEA)-data vir gewasklassifikasie is egter nog nie wyd ondersoek nie. Hierdie studie het die gebruik van LiBEA-data vir gewasklassifikasie geassesseer deur hierdie data op sy eie, asook in kombinasie met Sentinel-2-beelde en lugfoto’s, te gebruik. Die eerste eksperiment het die gebruik van LiBEA-data en masjienleer vir die klassifikasie van wingerde geëvalueer. Die LiBEA-data is van ’n 2014-opname deur die Stad Kaapstad verkry. Die LiBEA-afgeleide genormaliseerde digitale oppervlakmodel (gDOM) en intensiteitsbeeld is by vier resolusies (1,5 m, 2 m, 2,5 m en 3 m) geïnterpoleer en toe vir tekstuurmetings gebruik. Twee venstergroottes (3x3 en 5x5) per resolusie is vir die generering van die tekstuurmetings gebruik, wat agt datastelle tot gevolg gehad het. Die resulterende datastelle is as toevoer vir 11 masjienleer-klassifikasie-algoritmes gebruik, wat ’n binêre klassifikasie van wingerde en nie-wingerde uitgevoer het. Die resultate het getoon dat LiBEA-data tussen wingerde en nie-wingerde kan diskrimineer, met die ewekansige woud (EW) klassifiseerder wat die hoogste algehele akkuraatheid (AA) van 80,9% behaal het. Verder het die resultate getoon dat die standaardisering van die toevoerdata ’n beduidende verbetering aan die resultate van die neurale netwerke en afstandsgebaseerde klassifiseerders teweeggebring het.
Die tweede eksperiment het die metodes wat vir die eerste eksperiment ontwikkel is gebruik om ’n vyfklas-klassifikasie uit te voer. Die vyf klasse het bestaan uit mielies, katoen, grondbone, boorde en nie-landbou. Sentinel-2 en lugfoto-data is ook by die analise gevoeg en is met LiBEA-data vergelyk. Die LiBEA-data is uit ’n 2016-opname van die Vaalharts-besproeiingskema verkry. Verder is die drie datastelle (Sentinel-2, lugfoto’s en LiBEA-data) gekombineer om te bepaal watter kombinasie van datastelle die hoogste AA tot gevolg het. Die resultate het getoon dat die werksverrigting van LiBEA-data soortgelyk aan dié van Sentinel-2-beelde was, met LiBEA-data wat ’n gemiddelde AA van 84,3% behaal het, terwyl Sentinel-2 ’n gemiddelde AA van 83,6% behaal het. Die verskil tussen die AAs van LiBEA en Sentinel-2 was statisties onbeduidend. Die hoogste behaalde AA (94,6%) is verkry deur die EW-klassifiseerders wat van die gekombineerde data van LiBEA, Sentinel-2 en lugfoto’s gebruik gemaak het. Met die kombinasie van LiBEA-data en Sentinel-2 is soortgelyke resultate egter verkry as wanneer al drie datastelle in kombinasie gebruik is, met onbeduidende verskille in AA. Oor die algemeen was LiBEA-data geskik om verskillende gewastipes te klassifiseer, met EW wat die hoogste AA in beide eksperimente behaal het. Die kombinasie van multispektrale data en LiBEA het die hoogste AA tot gevolg gehad.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/107067