Estimation methods for date palm yield : a feasibility study

Heyns, Karlien (2021-03)

Thesis (MEng)--Stellenbosch University, 2021.

Thesis

ENGLISH ABSTRACT: With a growing population and a need for food security, crop yield prediction is vital: it is used not only by exporters and importers, but also by farmers, who need to plan marketing strategies and determine prices. Methods for crop yield prediction are more abundant for annual plants than for perennials. Very few reliable crop yield prediction models have been developed for the date palm, which is grown in arid regions where plentiful water is available. Date fruit is a nutritious food that is produced in many countries and consumed widely around the world. Farming with date palms is a complex process, with a large variety of factors affecting the annual yield. This study investigated the feasibility of predicting date yield using data collected by a research partner that produces date fruit. Data on some farming practices as well as weather conditions were collected from 2010 onwards, at different levels of detail. Machine learning techniques were considered for prediction of yield; however, four applicable linear regression techniques were identified and could be used with the available data for feature selection. The dataset has many features, from which the dominant ones were extracted. The feature selection methods used included a correlation technique, stepwise regression and regularisation. The selected features were then used to develop regression models. It was found that some weather features were important, as well as features describing the date bunch mass. The latter were observed by sampling bunches from trees in different orchards. Linear regression models were developed at orchard level and at farm level, i.e., for the farm as a whole, and the best-performing linear regression models were selected (while avoiding overfitting).
The yield predictions from these models were compared with the actual annual yield recorded, as well as with the estimated yield determined by a pragmatic yield prediction method devised by the research partner. The selected models produced a 4% prediction error, whereas the farm method gave a 7% error. The proposed models thus reduce the prediction error and eliminate the need for the laborious sampling work done to support the farm prediction method. The study found that certain data currently collected are not needed by the proposed linear regression models. The study was done from an industrial engineering perspective, and a systematic process was followed to critically assess the data available. This was done to keep the complexity of the models at a level suitable for reasonable and accurate yield prediction, and to eliminate some unnecessary data collection labour on the farm.
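
The workflow summarised above (select dominant features by correlation, fit a linear regression, report a percentage prediction error) can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the function names (select_by_correlation, fit_ols, predict, percentage_error) are hypothetical, and the error is assumed to be the absolute difference between predicted and actual yield as a percentage of the actual yield. Stepwise regression and regularisation are not shown.

```python
import numpy as np


def select_by_correlation(X, y, k):
    """Return the (sorted) indices of the k features with the largest |Pearson r| to y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.sort(np.argsort(-np.abs(r))[:k])


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply a model returned by fit_ols to new feature rows."""
    return coef[0] + X @ coef[1:]


def percentage_error(actual, predicted):
    """Absolute prediction error as a percentage of the actual value (assumed definition)."""
    return 100.0 * abs(predicted - actual) / abs(actual)


# Synthetic stand-in data (NOT from the thesis): 200 observations, 6 candidate
# features, of which only columns 1 and 4 actually drive the "yield".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 50.0 + 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)

# Hold out the last 50 observations to check the model on unseen data.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

selected = select_by_correlation(X_train, y_train, k=2)
coef = fit_ols(X_train[:, selected], y_train)
y_pred = predict(coef, X_test[:, selected])

# Error on the total held-out yield, analogous to a farm-level comparison.
total_error = percentage_error(y_test.sum(), y_pred.sum())
print("selected features:", selected)
print("total yield error (%):", round(float(total_error), 3))
```

Evaluating on held-out observations, as in the last step, is one simple guard against the overfitting the abstract mentions; with only a few seasons of real data, a cross-validation scheme would probably be more appropriate than a single split.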

AFRIKAANS SUMMARY (translated): With a growing world population and a need for food security, crop yield prediction is essential, not only for importers and exporters, but also for the producer who plans marketing strategies and sets prices. Yield prediction is more advanced for annual crops than for perennials. For the date palm, which grows in arid regions where water is plentifully available, few reliable yield predictions have been developed. The date is a nutritious fruit that is cultivated in many countries and enjoyed around the world. Cultivating date palms is a complicated process in which a large variety of factors influence the yield. This study investigated the feasibility of predicting date yield with data collected by a research partner, a date producer. These data, documented since 2010, cover management practices on the farm as well as weather conditions, at varying levels of detail. Several machine learning techniques were considered for the yield prediction. Ultimately, four linear regression techniques were singled out, and these four could be used on the available data to select variables. The dataset consists of many variables, but dominant variables were identified with these selection methods, which include a correlation technique, stepwise regression and regularisation, and the variables were then used to develop regression models. Some variables that capture the weather conditions at certain times, and variables that describe the fruit bunch masses, are of greater importance. The bunch masses were obtained by sampling the bunches of a single tree in each orchard. Linear regression models were developed at orchard level and at farm level, i.e., for the farm as a whole, and the models with the best results were selected.
The yields predicted by these models were compared with the actual annual yield and with the estimated yield determined by a pragmatic yield prediction method of the research partner. The selected models improve the prediction error from the 7% of the current pragmatic method to 4%. The proposed models therefore reduce the prediction error and at the same time allow the omission of the time-consuming and labour-intensive sampling work that the farm's current estimation method requires. The study found that certain data that are collected are not needed for the proposed models. The study was approached from an industrial engineering perspective, and a systematic process was followed to assess the data critically, to keep the complexity of the models at a level appropriate for meaningful and accurate yield prediction, and to eliminate unnecessary data collection labour on the farm.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/110450