Concept demonstrator for a decision support tool for agricultural applications

Date
2022-04
Publisher
Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering.
Abstract
ENGLISH SUMMARY: Farmers face daily challenges and must weigh numerous factors to produce crops profitably. Large amounts of data, for example, can be overwhelming and complex if not utilised correctly. Tools such as decision support systems can, however, be incorporated to support the decision-making process. Precision Agriculture (PA) presents several opportunities and challenges. An industry partner, Company A, was approached to identify and test a real-world PA problem: manually analysing several data layers is time-consuming, and a more user-friendly way to display the data is required. This research study developed and presented a concept demonstrator of a decision support tool to illustrate how several components can be used to improve the decision-making process. Soil and nutrient classification data were provided by the use case, Farm X, which produces winter wheat in a summer rainfall area in South Africa. Chlorophyll data from 2017 to 2020 were provided by the Airbus Verde service of Company A. The decision was made to also include historical and current meteorological data acquired from the South African Weather Service. QGIS was used to extract soil and nutrient classification data and chlorophyll data from 296 GPS-specific points on the crop circle. The data table consisted of 85 soil, nutrient and weather features. A major challenge arose because no GPS-specific yield data were available for Farm X. About a third (11 088) of the total chlorophyll data were missing, and only 24 849 data points were available for analysis. Nevertheless, Python was used to clean and analyse the available data to provide one chlorophyll value per month for each of the 296 points. After careful consideration, it was decided to use all features to identify agricultural trends and predict chlorophyll values on a crop circle. A sequential forward feature selector was used to determine which features influence chlorophyll values.
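The monthly reduction described above (one chlorophyll value per month for each GPS point, with missing readings excluded) can be sketched as follows. This is a minimal illustration using only the Python standard library. The record layout `(point_id, date, value)`, the function name `monthly_chlorophyll`, the choice of the mean as the monthly aggregate and the sample values are all assumptions made for illustration, not details taken from the thesis.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_chlorophyll(readings):
    """Average chlorophyll per (point_id, year, month), skipping missing values.

    `readings` is an iterable of (point_id, date, value) tuples; a value of
    None marks a missing observation (hypothetical layout).
    """
    buckets = defaultdict(list)
    for point_id, day, value in readings:
        if value is None:  # drop missing observations instead of imputing
            continue
        buckets[(point_id, day.year, day.month)].append(value)
    # Reduce every bucket to a single representative value (here: the mean).
    return {key: mean(values) for key, values in buckets.items()}


# Made-up sample readings for two GPS points.
sample = [
    (1, date(2019, 8, 3), 40.0),
    (1, date(2019, 8, 18), 44.0),
    (1, date(2019, 9, 2), None),  # missing reading
    (2, date(2019, 8, 3), 35.0),
]
monthly = monthly_chlorophyll(sample)
```

A point with no valid reading in a given month simply has no entry for that month, so downstream code would have to decide how to handle such gaps.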
A lazy regressor was used to determine the best-performing algorithms for feature selection and chlorophyll prediction. The algorithms included the (i) Random Forest regressor, (ii) HistGradientBoost regressor, (iii) XGB regressor and (iv) Extra Trees regressor. The latter outperformed the other algorithms and achieved an R2 value of 0.86 when predicting chlorophyll values for August and September. Operational validation was done using 80% of the data set for training and 20% for testing. The model was then presented with the data table of a previously unseen year to predict chlorophyll for August and September, and an R2 value of 0.273 was achieved. This was to be expected given the data quality issues and the absence of yield data: the model was provided with at most two chlorophyll values to train with and monthly (instead of daily) weather values to predict a time-series value. Even so, the model achieved a positive R2 value. The concept demonstrator was successfully developed and tested on a real-world use case. It illustrated how different data sets, machine learning algorithms, predictions and visualisation tools could be integrated and used in a decision support tool for agricultural purposes.
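The modelling steps named in the summary (forward feature selection, an Extra Trees regressor, an 80/20 split and R2 scoring) can be sketched with scikit-learn. This is an illustrative sketch only: the summary does not state which library implementations were used, and the synthetic data, the feature count, the hyperparameters and the random seeds below are assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the data table: 296 points, 8 made-up features.
X = rng.normal(size=(296, 8))
# Made-up target that depends on features 0 and 3 plus noise.
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.3, size=296)

# 80% of the rows for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Forward selection: start with no features and greedily add the one that
# most improves the cross-validated score until three are selected.
selector = SequentialFeatureSelector(
    ExtraTreesRegressor(n_estimators=30, random_state=0),
    n_features_to_select=3,
    direction="forward",
    cv=3,
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.get_support())

# Fit the final model on the selected features and score it on held-out data.
model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(selector.transform(X_train), y_train)
r2 = r2_score(y_test, model.predict(selector.transform(X_test)))
print("selected feature indices:", selected, "R2 on held-out 20%:", round(r2, 3))
```

An R2 measured on a random 20% hold-out drawn from the same seasons can be considerably higher than the R2 on a separate, unseen year, which is consistent with the gap between the two scores reported in the summary. An R2 above zero on unseen data means the model predicts better than simply using the mean of the observed values.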
AFRIKAANS SUMMARY (translated): Farmers are confronted by daily challenges, and numerous factors must be taken into account to produce crops profitably. Large amounts of data can be overwhelming and complex if not applied correctly. Tools such as decision support systems can, however, be incorporated to support the decision-making process. Precision agriculture (PA) offers several opportunities as well as challenges. An industry partner, Company A, was approached to identify and test a real-world PA problem. The manual element of analysing several data layers is time-consuming and requires a more user-friendly way of displaying data. This research study developed and presented a concept demonstrator of a decision support tool to illustrate how several components can be used to improve the decision-making process. Company A provided soil and nutrient classification data for Farm X, which produces winter wheat in a summer rainfall area in South Africa. Chlorophyll data from 2017 to 2020 were provided by the Airbus Verde service of Company A. The decision was made to also include historical and current meteorological data obtained from the South African Weather Service. QGIS software was used to extract soil and nutrient classification data as well as chlorophyll data from 296 GPS-specific points on the crop circle. The data table consisted of 85 soil, nutrient and weather features. A major challenge emerged when no GPS-specific yield data were available for Farm X. A third (11 088) of the total chlorophyll data were missing, and only 24 849 data points were available for analysis. Nevertheless, Python was used to clean the data and analyse the available data to provide one chlorophyll value per month for each of the 296 points. The decision was made to analyse the data patterns and to predict chlorophyll values for August and September on a crop circle.
A sequential forward feature selector method was used to determine which variables influence chlorophyll values. A lazy regressor was used to determine the best-performing algorithms to use for variable selection and chlorophyll prediction. The algorithms included the (i) Random Forest regressor, (ii) HistGradientBoost regressor, (iii) XGB regressor and (iv) Extra Trees regressor. The latter performed better than the other algorithms and achieved an R2 value of 0.86 in predicting chlorophyll values for August and September. Operational validation was done by using 80% of the data for training and 20% of the data set for testing. An unknown data table for a specific year, used for testing, was given to the model to predict chlorophyll for August and September. An R2 of 0.273 was achieved. This was to be expected owing to the data quality issues and the absence of yield data. The model was provided with at most two chlorophyll values to learn from and monthly weather data (instead of daily) to predict a time-series value. The model nevertheless achieved a positive R2.
Description
Thesis (MEng)--Stellenbosch University, 2022.
Keywords
Precision farming, Smart farming, Agricultural informatics, UCTD
Citation