Assessment of Conventional and Machine Learning Methods for Completing Precipitation Series

Abstract

This paper presents an assessment of eight machine learning-based methods and three conventional methods that have historically been used to fill in missing data in time series. The data analyzed are monthly precipitation totals collected by the Colombian Institute of Hydrology, Meteorology and Environmental Studies (Ideam) at four weather stations in the Baché river basin (Palermo municipality, Huila, Colombia). To evaluate the methods, existing records were treated as missing and re-estimated, and three error metrics were calculated from the differences between the observed and estimated values: Root Mean Square Error (RMSE), Nash-Sutcliffe Efficiency (NSE) and bias. The results show that machine learning methods for completing time series are reliable: they achieve similar (and in some cases better) results than the conventional methods even without a carefully tuned implementation, which suggests that greater attention to them could lead to less uncertain results.
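
As an illustration of the evaluation described above, the following minimal Python sketch computes the three error metrics for a set of withheld observations and their estimates. The function names and the sample values are hypothetical, and bias is assumed here to be the mean error (estimated minus observed); the abstract does not state which definition of bias the study uses.

```python
import math


def rmse(observed, estimated):
    """Root Mean Square Error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)


def nse(observed, estimated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def bias(observed, estimated):
    """Mean error (assumed definition): positive values indicate overestimation."""
    return sum(e - o for o, e in zip(observed, estimated)) / len(observed)


# Hypothetical monthly totals (mm): values withheld from a station record
# and the corresponding estimates produced by some completion method.
observed = [120.0, 80.0, 95.0, 150.0]
estimated = [110.0, 90.0, 95.0, 140.0]

print("RMSE:", rmse(observed, estimated))
print("NSE: ", nse(observed, estimated))
print("Bias:", bias(observed, estimated))
```

Lower RMSE, NSE closer to 1 and bias closer to 0 indicate a better reconstruction of the withheld values.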

Keywords

machine learning
missing data
interpolation
monthly precipitation
regression