Data-driven river flow routing using deep learning: predicting flow along the lower Orange river, Southern Africa

Briers, C. J. (2019-04)

Thesis (MSc)--Stellenbosch University, 2019.

ENGLISH ABSTRACT : The Vanderkloof Dam, located on the Orange River, is responsible for the water supply to consumers along its 1 400 km reach up to where it flows into the Atlantic Ocean. The Vaal River, which joins the Orange River approximately 200 km downstream of the dam, contributes significant volumes of water to the flow in the Orange River. These contributions are, however, not taken into account when planning releases from the Vanderkloof Dam. In this thesis we aimed to develop an accurate and robust flow routing model of the Orange and Vaal River system to predict the effects of releases from the Vanderkloof Dam and anticipate inflows from the Vaal River. Since the factors that affect flow rate and volume along the river are hard to quantify over long distances, we follow a data-driven approach that uses machine learning to predict the flow rate at downstream flow gauging stations based on flow rates recorded at upstream gauging stations. We restrict the model input to data that would be readily available in an operational setting, making the model practically implementable. A variety of neural network architectures, including fully-connected networks, convolutional neural networks (CNNs) and recurrent neural networks (RNNs), were investigated. It was found that fully-connected networks produce results with accuracy comparable to a simple linear regression model, but display a superior ability to predict the timing of peaks and troughs in flow rate trends. CNNs and RNNs displayed the same ability and also showed improvements in accuracy. The best-performing CNN model had a mean absolute percentage error (MAPE) of 14.5 %, compared to 16.9 % for a linear regression model. To anticipate contributions from the Vaal River we investigated including inflows recorded at stations on the Vaal River and two of its tributaries, the Modder and Riet Rivers. Both approaches investigated, namely incorporating these inflows as part of multi-dimensional input into a CNN and using a parallel CNN model architecture, showed promise, with MAPEs of 21.6 % and 23.5 %, respectively. Although these models did not achieve a high level of accuracy, they did display the ability to anticipate contributions from the Vaal River system. It is believed that they could, with additional refinement or using appropriate safety factors, be practically applied in an operational setting. We further investigated including seasonal data as input into our models. Including the time of the year, and including evaporation data recorded at meteorological stations in the recent past, both resulted in improved MAPE accuracy (14.4 % and 14.8 %, respectively, compared to 18.4 % for a model including no seasonal data). Observations of errors staying relatively constant over time prompted us to include errors made in the recent past as input into subsequent predictions. A model trained with this additional data achieved a MAPE of 10.2 %, a significant improvement over the other methods applied.
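
As a reading aid, the two ideas the abstract relies on most, the MAPE metric and the prediction of downstream flow from a window of recent upstream flows, might be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function names `mape` and `make_windows`, the parameters `window` and `lead`, and the toy numbers are all assumptions introduced here.

```python
import numpy as np


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (MAPE is undefined in that case).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((observed - predicted) / observed)) * 100.0)


def make_windows(upstream, downstream, window, lead):
    """Pair each window of upstream flows with a later downstream flow.

    `window` is the number of past upstream readings per sample and
    `lead` is how many time steps after the end of the window the
    downstream target lies. Both are hypothetical parameters.
    """
    upstream = np.asarray(upstream, dtype=float)
    downstream = np.asarray(downstream, dtype=float)
    X, y = [], []
    for end in range(window, len(upstream) - lead + 1):
        X.append(upstream[end - window:end])   # readings end-window .. end-1
        y.append(downstream[end - 1 + lead])   # target `lead` steps later
    return np.array(X), np.array(y)


# Toy data only; these are not measurements from the Orange River.
upstream_flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
downstream_flow = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
X, y = make_windows(upstream_flow, downstream_flow, window=3, lead=1)
```

Arrays shaped like `X` and `y` could then be passed to a linear regression or a neural network, and `mape` applied to the held-out predictions. The thesis's actual window lengths, lead times and preprocessing are not stated in the abstract.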

AFRIKAANS ABSTRACT (translated to English) : The Vanderkloof Dam, situated on the Orange River, supplies water to consumers along the 1 400 km downstream reach to the Atlantic Ocean. The Vaal River, which joins the Orange River approximately 200 km downstream of the dam, makes a significant contribution to the flow in the Orange River. This contribution is, however, not taken into account when releases from the Vanderkloof Dam are planned. In this thesis we aim to develop an accurate and practically implementable flow routing model that predicts the downstream effects of releases from the Vanderkloof Dam and takes the contributions of the Vaal River into account. Since the factors that affect streamflow are difficult to quantify over long distances, data-driven machine learning techniques are applied, using flow rates measured at upstream stations to predict flow rates at downstream stations. To ensure that our models can be applied in practice, we restrict the input to data that is available in an operational setting. Implementations of fully-connected, convolutional and recurrent neural networks were investigated. The results of the fully-connected networks were comparable to those of a linear regression model, but they predicted the timing of rises and falls in flow rate better. Convolutional and recurrent networks predicted the timing well and also showed an improvement in accuracy. The best-performing model was a convolutional network with a mean absolute percentage error of 14.5 %, compared to 16.9 % for a linear regression model. To account for contributions to flow from the Vaal River, we investigated including flow measurements from gauging stations on the Vaal River and two of its tributaries, the Modder and Riet Rivers, in the model input. Two options were investigated: including these measurements as multi-dimensional input to a convolutional network, and using a parallel architecture. These options yielded mean absolute percentage errors of 21.6 % and 23.5 %, respectively. Although these results are not particularly accurate, the models did predict contributions from the Vaal River relatively well, and they would be practically implementable if appropriate safety factors were applied to the results. We attempted to predict seasonal influences on streamflow by providing additional data as part of the input. Including the time of the year and evaporation measurements from a nearby weather station as input reduced the mean absolute percentage error to 14.4 % and 14.8 %, respectively (from 18.4 % for a model with no seasonal input). We observed that the error between measured and predicted flow remains relatively constant over time, and therefore included the errors made in the recent past in subsequent predictions. This additional input reduced the model's mean absolute percentage error to 10.2 %, a significant improvement over the other methods.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/105754