Ensemble learning of daily river discharge modeling for two watersheds with different climates

To reduce uncertainties and improve the accuracy of river discharge modeling, several topography‐based hydrological models (TOPMODEL), generated from different combinations of parameters, were incorporated into an ensemble learning framework using the boosting method. The Baohe River Basin (BRB), with a humid climate, and the Linyi River Basin (LRB), with a semi‐arid climate, were chosen for model testing. Observed daily precipitation, pan evaporation and streamflow data were used for model development and testing. Several Nash‐Sutcliffe efficiency coefficients, the coefficient of determination and the root mean square error were adopted for a comprehensive assessment of model performance. Testing results indicated that the ensemble learning method improved the modeling accuracy compared with the best single TOPMODEL. During the validation periods, the boosting method increased the modeling accuracy by 9% and 16% for the BRB and LRB, respectively. The ensemble method significantly narrowed the gap in model performance between watersheds with different climatic conditions. Hence, using ensemble learning to enhance the feasibility of hydrological models for different climatic regions is promising.


1 | INTRODUCTION
Runoff not only affects climate and weather systems, but also plays an important role in the occurrence of natural hazards such as floods and debris flows. Simple physically-based models are still widely used for runoff forecasting (Adamovic et al., 2016; Hao et al., 2018). For example, TOPMODEL is a semi-distributed, conceptual model that provides computationally efficient prediction of distributed hydrological responses within a very simple framework (Beven, 2001). Many studies on improving streamflow modeling have been based on this model (Takeuchi et al., 2008; Bouilloud et al., 2010; Xu et al., 2012). Recently, artificial intelligence techniques have also been combined with such simple models (Srinivasulu and Jain, 2009; Liu et al., 2013a; Nikoo et al., 2016). However, a single model, even a sophisticated distributed one, cannot adequately depict runoff variation because of uncertainties in both model schemes and parameters (Roundy et al., 2019). Combining multiple forecasts is regarded as an effective way to reduce prediction errors (Adhikari and Agrawal, 2012), so ensemble learning could be a suitable way to improve runoff forecasting. Tehrany et al. (2014) used a novel ensemble of weights-of-evidence and support vector machine models to map flood susceptibility. Berkhahn et al. (2019) found that ensemble approaches for artificial neural networks help to overcome overfitting. Other models, such as random forest and neuro-fuzzy models, have also been combined with ensemble approaches (Razavi Termeh et al., 2018; Choubin et al., 2019).
The boosting method is one of the most robust ensemble learning methods (Wang, 2012). Although it was originally designed for classification problems, it can also handle regression problems (Li et al., 2016). For example, Liu et al. (2014) used a modified AdaBoost technique to improve the efficiency of a conceptual hydrological model. Later, a new statistically based model, called an online gradient-boosted regression tree, was proposed to simulate streamflow in a changing environment (Zhang et al., 2019). It is well known that some hydrological models, such as TOPMODEL, do not always give satisfactory results across different climatic regions (Liu et al., 2013b). Ensemble learning may offer a way to address this issue. However, the combination of the boosting approach with a traditional hydrological model, and its feasibility under different climatic conditions, have not been well explored.
Therefore, in this study, we have incorporated the TOPMODEL into an ensemble learning framework with the boosting algorithm to build a robust rainfall-runoff model (Boost-TOP). Both the Boost-TOP and TOPMODEL were applied in two river basins with different climate types to investigate the effects of ensemble learning and the scope of application.
2 | MATERIALS AND METHODS
2.1 | Introduction of TOPMODEL
TOPMODEL, first described in Beven and Kirkby (1979), is a semi-distributed, physically-based watershed model. In the topographic index ln(α/tanβ), "α" and "tanβ" describe the upslope contributing area and the hillslope gradient, respectively. All surface and subsurface runoff is routed to the catchment outlet by the isochrone method (Saghafian et al., 2002). The model rests on three assumptions: (a) the hydraulic gradient of the saturated zone is approximately equal to tanβ, (b) the conductivity decays exponentially with soil moisture deficit, and (c) runoff is spatially uniform.
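The topographic index described above can be illustrated with a minimal sketch. It assumes that α is expressed per unit contour length (in m), which is the usual TOPMODEL convention but is not stated explicitly here; the function name and the three example cells are hypothetical.

```python
import math

def topographic_index(alpha, tan_beta):
    """Return ln(alpha / tan_beta) for one grid cell.

    alpha    : upslope contributing area per unit contour length (m), assumed convention
    tan_beta : local slope gradient (dimensionless), must be positive
    """
    if alpha <= 0 or tan_beta <= 0:
        raise ValueError("alpha and tan_beta must be positive")
    return math.log(alpha / tan_beta)

# Hypothetical cells: (alpha in m, tan_beta). Flat cells with a large
# contributing area get a high index and tend to saturate first.
cells = [(50.0, 0.20), (500.0, 0.05), (5000.0, 0.01)]
indices = [topographic_index(a, s) for a, s in cells]
```

In this sketch the index grows from the steep, small-area cell to the flat, large-area cell, which is the ordering TOPMODEL uses to group hydrologically similar parts of a catchment.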
The model contains five main parameters: SZM (maximum storage capacity of the unsaturated zone), SRmax (maximum storage capacity of the root zone), TD (delay time of gravity drainage), lnT0 (natural logarithm of soil moisture conductivity at saturation) and Chv (effective routing velocity of the stream channel). These parameters were determined by a Monte Carlo optimization algorithm.

2.2 | Ensemble learning algorithm
Boosting is a widely used ensemble learning method that constructs a robust model as a linear combination of weaker models. It uses weighted samples to focus learning on the most difficult examples and combines the weak models with weighted votes. The boosting scheme in this study was derived from the algorithm of Zemel and Pitassi (2001). When the model parameters are varied, the Nash-Sutcliffe efficiency coefficient (NSE) cannot always reach a satisfactory level because of the physical schemes of the model. We therefore adopted the absolute error to find the best weak model in each iteration. The weak models, each with its own weight, were then merged into a strong model, and the NSE was used as the evaluation function for the final strong model.
The calculation procedure is as follows:
1. Input:
   a. Training dataset: (P_1, R_1), (P_2, R_2), …, (P_n, R_n), where P is the precipitation, R is the river runoff and n is the sample size.
   b. Base learner: every iterative step produces a hypothesis f(x) whose accuracy is judged by the error rate ε_t. In this study, f(x) represents the TOPMODEL.
2. Initialize the distribution of weights.
3. Inner t-step iteration to find weak models:
   a. Call the weak learners and find the best weak learner, that is, the one that minimizes ε_t. "τ" is a threshold used to separate correct from incorrect results; it is initially set from experience.
   b. Normalize the absolute error for computation. c_t denotes the weight of the best weak learner. If ε_t < 1, seek c_t ∈ (0, 1) that minimizes J_t, which controls the upper bound of the total error.
   c. Update the weight distribution of each sample.
4. In the outer loop, construct a strong rainfall-runoff model and use the NSE criterion to obtain a satisfactory ensemble model.
5. If the target is not reached, repeat the inner loop.
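The steps above can be sketched in code. The following is a simplified, threshold-based boosting loop for regression in the style of AdaBoost.RT, used here as a stand-in: it is not the exact Zemel and Pitassi (2001) update, and it does not minimize J_t. The uniform initial weights, the relative-error normalization, the rule c_t = ln(1/ε_t), and the runoff-coefficient "weak models" are all assumptions made for illustration.

```python
import math

def boost_regression(x, y, candidates, tau=0.2, rounds=5):
    """Threshold-based boosting for regression (AdaBoost.RT-style sketch).

    candidates : list of callables f(x_i) -> prediction (the weak models)
    tau        : relative-error threshold separating correct from incorrect
    Returns a list of (weight, model) pairs.
    """
    n = len(x)
    p = [1.0 / n] * n                          # assumed uniform initial weights
    ensemble = []
    for _ in range(rounds):
        best = None
        for f in candidates:
            # normalized absolute error of each sample
            err = [abs(f(xi) - yi) / max(abs(yi), 1e-9) for xi, yi in zip(x, y)]
            wrong = [e > tau for e in err]
            eps = sum(pi for pi, w in zip(p, wrong) if w)   # weighted error rate
            if best is None or eps < best[0]:
                best = (eps, f, wrong)
        eps, f, wrong = best
        beta = min(max(eps, 1e-6), 1.0 - 1e-6)  # clip so the log stays finite
        ensemble.append((math.log(1.0 / beta), f))
        # shrink the weights of well-simulated samples, then renormalize
        p = [pi * (1.0 if w else beta) for pi, w in zip(p, wrong)]
        total = sum(p)
        p = [pi / total for pi in p]
    return ensemble

def predict(ensemble, xi):
    """Weighted average of the selected weak models."""
    total = sum(c for c, _ in ensemble)
    return sum(c * f(xi) for c, f in ensemble) / total

# Toy data: the runoff coefficient is 0.4 for small storms and 0.6 for large ones,
# so neither constant-coefficient weak model fits all samples.
precip = [2.0, 10.0, 25.0, 40.0]
runoff = [0.8, 4.0, 15.0, 24.0]
candidates = [lambda pr, k=k: k * pr for k in (0.4, 0.6)]
ensemble = boost_regression(precip, runoff, candidates)
estimate = predict(ensemble, 20.0)
```

In this toy case the reweighting makes the loop alternate between the two weak models, because each round increases the relative weight of the samples that the previously selected model simulated poorly.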

2.3 | Incorporating TOPMODEL into an ensemble learning framework
In the ensemble learning framework, each TOPMODEL with a different parameter set was regarded as a base learner (weak model). To find the best learner for each iteration "t", Monte Carlo random sampling was used to set the five parameters mentioned above, with the number of Monte Carlo iterations set to 20,000. A robust rainfall-runoff model, called Boost-TOP, was then constructed using Equation (6). The whole roadmap is shown in Figure 1. The threshold "τ" was determined by trial and error. The number of single weak models was empirically set to five in this study. All the models were implemented in C.
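The Monte Carlo search for the best weak model in one boosting iteration can be sketched as a weighted random search. Everything specific here is an assumption: the parameter ranges in `PARAM_RANGES` are hypothetical (the text does not list them), `toy_model` is a placeholder and not TOPMODEL, and the demonstration uses 200 trials instead of 20,000 to keep it fast.

```python
import random

# Hypothetical sampling ranges for the five parameters (not from the paper).
PARAM_RANGES = {
    "SZM": (0.005, 0.1),
    "SRmax": (0.005, 0.1),
    "TD": (0.1, 50.0),
    "lnT0": (-5.0, 10.0),
    "Chv": (500.0, 5000.0),
}

def sample_parameters(rng):
    """Draw one parameter set uniformly from the assumed ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def best_weak_model(run_model, precip, runoff, weights, n_trials, rng):
    """Keep the sampled parameter set with the lowest weighted absolute error."""
    best_params, best_err = None, float("inf")
    for _ in range(n_trials):
        params = sample_parameters(rng)
        sim = run_model(params, precip)
        err = sum(w * abs(s - o) for w, s, o in zip(weights, sim, runoff))
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err

def toy_model(params, precip):
    """Placeholder for TOPMODEL: runoff as a fixed fraction of rainfall."""
    coeff = min(1.0, params["SZM"] * 10.0)
    return [coeff * p for p in precip]

precip = [2.0, 10.0, 25.0, 40.0]
runoff = [1.0, 5.0, 12.5, 20.0]
weights = [0.25] * 4                       # sample weights from the boosting loop
best_params, best_err = best_weak_model(
    toy_model, precip, runoff, weights, n_trials=200, rng=random.Random(0)
)
```

Passing the current sample weights into the error sum is what ties the random search to the boosting loop: a parameter set is preferred if it fits the samples that currently carry the most weight.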

2.4 | Study areas and data
The Baohe River Basin (BRB, the lower plot in Figure 2), with an area of 3,415 km², is in the upper reaches of the Hanjiang River (the largest branch of the Yangtze River). Its outlet is at the Madao hydrological station (33°26′N, 107°E). The basin lies in a humid region. The annual precipitation is about 930 mm, and the terrain elevation varies from 470 to 3,408 m. The basin is little influenced by anthropogenic activities and is well covered with vegetation. Daily precipitation and pan evaporation from local meteorological stations and daily streamflow observed at the Madao gauging station were used as input data. The 3-year daily series from 1981 to 1983 was used for calibration, and the 2-year daily series from 1984 to 1985 was used for model testing.
The Linyi River Basin (LRB, the upper plot in Figure 2), with an area of 10,040 km² and annual precipitation of about 800 mm, lies within 32°-37°N, 114°-121°E, in the upstream part of the Yishusi Catchment in Shandong Province, China.

2.5 | Evaluation criterion
Model performance was evaluated by the NSE (Nash-Sutcliffe efficiency coefficient) (Nash and Sutcliffe, 1970). NSE ranges from −∞ to 1 and increases as the fit between observed and simulated streamflow improves. A value of NSE = 0.65 was proposed as the lower threshold for acceptable model efficiency (NSE ≥ 0.65), and two further classes of better performance ("good" and "very good") were defined by thresholds of approximately NSE = 0.80 and NSE = 0.90 (Ritter and Muñoz-Carpena, 2013). To assess the simulation of both high flows and low flows, two adapted versions of NSE described by Hoffmann et al. (2004), referred to in this paper as NSE_LOW for low flows and NSE_HIGH for high flows, were also calculated. In addition, the coefficient of determination (R²) and the root mean square error (RMSE) were employed for a comprehensive evaluation. To focus on the ability of the models to simulate runoff yield, we converted river discharge (m³/s) to runoff depth (mm/day) for comparison.

[Table 1: evaluation indices of TOPMODEL and Boost-TOP; columns: Index, Calibration (1981-1983), Validation (1984-1985)]

Table 1 indicates that during model training in the BRB, both the NSE and R² of TOPMODEL were about 0.88, while those of Boost-TOP exceeded 0.90. During the validation period, the NSE and R² of TOPMODEL were about 0.66, while those of Boost-TOP were about 0.72, an improvement of about 9%. The RMSE of Boost-TOP was always smaller than that of TOPMODEL. We found that the boosting method improved the modeling more effectively for high flows than for low flows. In the results for the rainy season of 1984 (Figure 3), Boost-TOP captured at least three peaks. Although both models underestimated the peak flows on 5 and 15 July, the runoff simulated by Boost-TOP was much closer to the observations than that of TOPMODEL. It should also be noted that performance improved much less in the low-runoff period. Ensemble learning with long-term observations may therefore be well suited to flood modeling in the BRB.
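The evaluation indices used in this section can be written compactly. The sketch below assumes the standard NSE and RMSE definitions and takes R² as the squared Pearson correlation between observed and simulated series (an assumption, since the definition is not given here); NSE_LOW and NSE_HIGH are omitted because their exact forms are not reproduced in the text. Function names and the sample series are illustrative only.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def rmse(obs, sim):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2)."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)

def discharge_to_depth(q_m3s, area_km2):
    """Convert discharge (m3/s) to runoff depth (mm/day) over a basin area (km2)."""
    # 86,400 s/day * 1,000 mm/m / 1e6 m2/km2 = 86.4
    return q_m3s * 86.4 / area_km2

def relative_improvement(base, improved):
    """Percentage gain of an index relative to the baseline model."""
    return (improved - base) / base * 100.0

obs = [1.0, 3.0, 2.0, 6.0]                  # illustrative runoff depths, mm/day
sim = [1.5, 2.5, 2.5, 5.0]
score = nse(obs, sim)
gain = relative_improvement(0.66, 0.72)     # about 9.1 %
depth = discharge_to_depth(100.0, 3415.0)   # 100 m3/s over the BRB area
```

The relative gain computed from the validation NSE values of 0.66 and 0.72 is about 9.1%, which appears consistent with the reported 9% improvement for the BRB.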

| Discussion
Whether the study area was humid or semi-arid, Boost-TOP consistently outperformed the single model, which is broadly in accord with the conclusions of other works (Brochero et al., 2012; Liu et al., 2014; Coltin et al., 2016). Reweighting samples with large errors helps the model adaptively emphasize the corresponding time ranges, which may belong to drought periods when the runoff coefficient is very low and the runoff is therefore difficult to simulate with a single TOPMODEL. For classification problems, the training error tends exponentially to zero as the number of iterations increases (Freund et al., 1999). For the hydrological regression problem, however, the range of model parameters and the choice of the threshold "τ" have a large effect on model calibration. Ensemble learning with TOPMODEL would be conducive to flood forecasting in humid watersheds. Overall, the results show larger improvements in the LRB than in the BRB. Because rainfall is more abundant in the humid area, rainfall and runoff are already well correlated there, and the traditional TOPMODEL performs well, so the room for improvement is limited. In contrast, the boosting algorithm can capture the pattern of complex rainfall-runoff relations in the semi-arid area and thus enhances the modeling more significantly. In addition, because rainfall-induced runoff is a crucial parameter for debris-flow early warning, ensemble learning for runoff modeling may further promote related research.

4 | CONCLUSIONS
In this study, several TOPMODELs, generated from different combinations of parameters, were incorporated into an ensemble learning framework. The Baohe River Basin, with a humid climate, and the Linyi River Basin, with a semi-arid climate, were chosen for model building and testing. Based on a series of evaluation indices, the results indicated that the ensemble learning method improved the modeling accuracy compared with the best single TOPMODEL. During the validation periods, the boosting method increased the modeling accuracy by 9% and 16% for the Baohe River Basin and the Linyi River Basin, respectively. The method narrowed the gap in model performance between humid and semi-arid regions. The results also suggested that Boost-TOP may help to enhance flood forecasting in humid watersheds and drought monitoring in semi-arid areas. Hence, using ensemble learning to enhance the feasibility of hydrological models for different climatic regions is promising, and the method can also be easily extended to other models.