Uncertainty reduction in quantitative precipitation prediction by tuning of Kain–Fritch scheme input parameters in the WRF model using the simulated annealing optimization method

In the study, using the simulated annealing (SA) optimization method, tried to reduce the uncertainty of the prediction of the Weather Research and Forecasting (WRF) numerical model in convective rainfall forecasts. To this end, three parameters Pd, Pe and Ph in Kain–Fritch convective scheme that's related to downdraft mass flux, entrainment mass flux and starting height of downdraft above updraft source layer, respectively, are optimized using SA algorithm to achieve better values. Two nested domain were used in the study with 30 and 10 km resolution which inner domain cover southern coasts of the Caspian Sea for the study area. Runtime of the model was 36 hr with the first 12 hr spin‐up time. Study period selected October 12, 2012 for training algorithm and October 8, 2015 for test run. After 100 iteration of the algorithm, 1, 1 and 50 was obtained for Pd, Pe and Ph respectively, while default values of these parameters was 0, 0, and 150. Results show that in both cases, model with default values underestimates rainfall amount and after optimization, model performance improves. Also, spatial distribution of the model precipitation forecast was less than observations and after optimization, spatial distribution improves. Statistical analysis of results indicated Mean Bias (MB) and root mean square error (RMSE) in training case were −4.6 and 17.1 in model with default values and became −3.7 and 14.3 in optimized one, respectively. Also, MB and RMSE for test case increased to −8.4 and 23.8, respectively, in model result with optimized parameters from −10.6 to 29.4 in model with default parameters.


| INTRODUCTION
Convective systems are among the most important factors causing severe rainfall and flood disaster, leading to economic, environmental and health damage to agriculture, urban, military and other sectors. In Iran, convective rainfall mostly occurs in spring and autumn (Azizi and Rabbani, 2013;Hejazizadeh et al., 2015;Molavi-Arabshahi et al., 2016) and it constitutes an important part of the country's water resources. The rainfall sometimes causes a lot of damage and injury to both crops and people (Daliran Firouz et al., 2016).
The convective process is irregularly related to the amount and severity of precipitation, and the diabatic heating induced by the convective process is an important driver of global and regional circulations. Hence, for a better understanding and to restrict the convective parameters, numerical weather prediction (NWP) models were used (Warner and Hsu, 2000;Liu et al., 2001). Over the past decade many different convection parameterization schemes (CPSs) have been developed and incorporated into NWP models in order to improve the predictability of the convective systems.
In general, the accurate quantitative prediction of precipitation is a challenging problem and forecast skill is always low. In addition, the skill in rainfall prediction varies with different seasons and is significantly lower in summer than in winter (Fritsch and Carbone, 2004). The NWP models have low skill and low relative progress in order to improve rainfall forecasting in the warm seasons because of their weakness at predicting heavy convective precipitation (Sun and Zhang, 2008).
The NWP models focus more on identifying, analysing and evaluating the procedure for the development and evolution of convective phenomena and their strengthening. The results of studies on convection phenomena, which include mostly shallow convection, wind stratification, moisture variations at low levels of the troposphere, thunderstorms, fogs, low-level clouds and squall lines, indicate that because of a small spatial and temporal resolution, the NWP models have a moderate ability to simulate convective events (Das, 2009;Basnayake, 2010;Das et al., 2010;Litta et al., 2012;Akter and Ishikawa, 2014); and their investigation using the results of forecasting model simulations has greater complications. Hence, reducing uncertainty (uncertainty quantification-UQ) in weather forecasting results has particular importance in increasing the accuracy and optimization of the performance of the models.
The UQ is a science of quantitative features and reducing uncertainty in applications. It determines that if certain aspects of the system are not exactly known, then there is a probability of certain results. Various methods have been used to reduce the uncertainty of the output variables of the NWP models. These approaches include changing the initial conditions, changing the input parameters of the model, simulations with different parameterization schemes, the ensemble simulation with several models, and so on (Allen et al., 2000;Giorgi and Mearns, 2002;Stainforth et al., 2005;Lopez et al., 2006). One method recently considered is optimization of the parameters of the parameterization schemes of the model and the selection of their optimal values Yan et al., 2014;Zhang et al., 2015). The specification and reduction of uncertainty in the tuneable input parameters of the NWP models can improve the understanding of the physical properties in atmospheric processes as well as the accuracy of weather forecasts.
Physical parameterization schemes incorporated in the NWP models usually contain many input parameters which are determined by physical processes or by tuning their values based on a validation of the result to obtain a qualitative agreement between the simulations and the observations from limited local measurements or global observations.
Various optimization methods have been used to estimate the optimal value of parameters in the physical parameterization, such as particle swarm optimization (PSO) (Mohammadi et al., 2016), genetic algorithm (GA) (Lee et al., 2006), simulated annealing (SA) (Dolan et al., 1989;Johnson et al., 1989) and multiple very fast simulated annealing (MVFSA) , to obtain optimum values by minimizing the possible errors.
The SA is a stochastic importance heuristic algorithm that finds approximate solutions in small amounts of computation time (Aarts and Korst, 1988). Its most important advantage of over other methods is the ability to avoid being trapped in local extreme areas. There may be many local minimums and maximums during the global extreme process found. Repeating the SA process can help prevent such local trapping and identify the global minimum Villagran et al., 2008).
Also, other common methods such as Monte Carlo, metropolis/Gibbs, quasi-Monte Carlo and Latin hypercube selection are grid-search methods. A grid search is a simple way to test the sensitivity of parameters by separating each parameter space at intervals equal to the distance and assessing the uncertainty caused by those combinations. However, this method may require huge computational resources. For example, if five parameters are needed, about 10 5 simulations have to be examined with 10 intervals for each parameter. On the other hand, the SA algorithm significantly reduces computation time . Jackson et al. (2008), optimized six parameters related to the cloud process in the global weather model (GCM) using the MVFSA to determine the role of cloud processes in the hydrological cycle and future climate change modelling uncertainty. Yang et al. (2012) used the MVFSA method to estimate the optimal Kain-Fritch (KF) scheme parameters. Their results showed significant improvement after adjusting the parameters. Lee et al. (2006) estimated the parameters of the KF convection scheme by using a GA. The study was conducted to estimate the parameters within the KF scheme in order to predict heavy rainfall in the Korean Peninsula for a warm period. The results showed that the effect of using optimized parameters in the convection scheme was significant for predicting rainfall. Mazarakis et al. (2011) changed several parameters in the KF convection scheme. These changes included maximizing the convective rainfall efficiency, shifting the time step of the convection, forcing the convective scheme time step to generate less or more cloud masses, trigger function changing and the vertical profile of the updraft mass flux variation. They applied these changes to predict the warmseason rainfall in Greece, which resulted in a significant improvement after adjusting the parameters. Hacker et al. (2011) assessed the effects of the initial conditions and model parameterization uncertainty on the Weather Research and Forecasting (WRF) forecasting system and found that different configurations of the parameterization schemes along with changing parameters values could create a better ensemble forecasting system.
The purpose of the present study is to adjust the parameters of the KF convective scheme to local optimal values for short-term forecasting with the highest possible accuracy of the occurrence of convective precipitation in Iran by using the SA algorithm. In a first step, the KF scheme is modified for more accurate predictions of an atmospheric system associated with a squall line, which means that the optimal parameters of the adjustable parameters within the KF convection scheme are obtained by using an optimization algorithm. Note that the reason for the selection of the KF scheme is because of its success in the numerical prediction of mesoscale systems by many previous studies (Kain et al., 2008;Mazarakis et al., 2011;Di et al., 2015;Davitashvili et al., 2016;Zeyaeyan et al., 2017).
The paper is structured as follows. Section 2 defines the parameters in the KF scheme, introduces the SA optimization algorithm and the configuration of the WRF model and observation data, describes the study area, and introduces the cases. Section 3 presents the optimization results of the model uncertainty on precipitation forecast as well as the dependence of optimization on the model settings. Section 4 concludes.

| Kain-Fritch (KF) scheme and five most critical parameters
The KF scheme was designed for the parameterization of convective processes in mesoscale systems. It solves the mass flux of cloud with a one-dimensional model (Kain and Fritsch, 1990;Kain, 2004), and has two basic features: one in which the cloud model is formulated as a detrainment-entrainment model in which the parcel buoyancy in the form of a function of mixed-up parcels is computed implicitly between the air and the updraft currents; and the other in which the conservation of mass, thermal energy, total humidity and momentum are included.
The scheme includes a more complete description of the physical processes inside the cloud compared with other current existing parameterization schemes. Additionally, the parameterization of the detrainments in the model allows for the better simulation of the mesoscale reaction compared with other schemes (Kain, 2004). One constraint of the scheme is that the convective available potential energy (CAPE) is not that suitable for tropical regions and can result in very intense convection (Yang et al., 2015).
The trigger function for convection begins when the adjacent air of 60 hPa gains positive buoyancy and ascends. The redistribution of the air mass occurs because of detrainment, entrainment, updraft and downdraft until 10% of the CAPE remains (Yan et al., 2014). The activation of deep convection occurs when the updraft flux rises to a given cloud depth and the turbulent kinetic energy (TKE) is responsible for shallow convection. Through a layer of 150-200 hPa above the cloud base, a downdraft occurs and the mass of air around and inside the cloud (updraft) is exchanged through the entertainment-detrainment process across each layer (Yan et al., 2014). Yang et al. (2012) describe five key parameters in the KF scheme that affect the model's precipitation forecast: a co-efficient related to the downdraft flux rate (P d ); the entrainment mass flux rate (P e ); the starting height (P h ) of the downdraft above the updraft surface layer (USL); the maximum TKE in the sub-cloud layer (P t ); and the average consumption time of the CAPE (P c ). Descriptions of these parameters, their formulation and the range of acceptable variations are given below and also summarized in Table 1  .
The P d and P e are scale factors for the modulation of the downdraft and entrainment flux intensities in KF scheme, respectively, which allow the rate of both fluxes to change between 0.5 and 2 times their default values .
In the KF scheme, the P d and P e relate to the downdraft and entrainment fluxes of the updraft mass flux at the top of the USL. The default P d and P e values are considered as zero in the KF scheme in the respective equations (Kain, 2004), and the general forms of the terms are described by Yang et al. (2012): where m usl u is the updraft mass flux at the top of the USL (kg s −1 ); m USL d is the downdraft mass flux at the top of the USL (kg s −1 ); R is the cloud radius (m); δp is the pressure depth of a model layer (Pa); and δm is the mixing rate (maximum possible entrainment rate).
The default P t , P c and P h values are 5 m 2 ·s −2 , 2,700 s and 150 hPa, respectively. In order to increase the variation range of the downdraft starting height, the consumption time of the CAPE and more TKE in the sub-cloud layer, the P t , P c and P h values have the following ranges: [3,12], [900,7,200] and [50,350], respectively. Since the study of Yang et al. (2012), it has been shown that meteorological variables, such as the outgoing long wave radiation (OLR), latent heat (LH), sensible heat (SH), airspecific humidity for 900-800 hPa (Q(L)) and so on, are more sensitive to the parameters P d , P e and P h than to the other two parameters, and the influence of these three parameters on output variables is greater. Therefore, the focus is on the adjustment of the three convective process input parameters P e , P d and P h .

| WRF model configuration and study domain
The Weather Research and Forecasting (WRF) model v.3.7.1 (Skamarock et al., 2008), with a KF convection scheme, is used to simulate convective precipitations. The WRF is a non-hydrostatic mesoscale numerical whether prediction model with various schemas to simulate the atmospheric boundary layer, convection, short wave and long wave radiations, cloud microphysics, and other atmospheric processes. This system is used for both atmospheric research and operational forecasting (Giordano et al., 2014).
As shown in Figure 1, two fixed nested domains with horizontal resolutions of 30 and 10 km for the outer and inner domains, respectively, were used. The outer domain prepared the boundary conditions for the inner domain. Also, the Global Forecast System (GFS) forecasts from the National Center for Environmental Prediction Five parameters related to convection in the Kain-Fritch (KF)

| Simulated annealing algorithm
In fact, the simulated annealing method is a descent algorithm derived from thermodynamics in which a liquid or metal is cooled slowly to obtain a pure crystal or highquality metal (Wilkinson and Reinsch, 2012;Stoer and Bulirsch, 2013). Analogous to the SA method in an optimization algorithm, problem solutions correspond to the state of the material and its temperature when compared with a control parameter. The SA basically consists of two random processes: the generation of solutions and their acceptance. This random ascent repeats until a solution moves to global minima (Du and Swamy, 2016).

| Methodology description 2.4.1 | Objective function
The objective function of the problem being optimized was the equitable threat score (ETS), which is also known as the Gilbert skill score (GSS). The ETS was formulated in order to calculate those hits that can only happen randomly. The main equation for the ETS score is as follows (Levizzani et al., 2007): The ETS penalizes false and missed alarms and is therefore commonly used for the verification of rainfall prediction using the NWP models. For a perfect forecast, ETS = 1 (Levizzani et al., 2007). The quantities needed to calculate the ETS score are presented in Table 3. In order to calculate the quantities required for calculation of the ETS, the precipitation thresholds need to be defined. In the study, four thresholds of > 0.1, > 5, > 10 and > 15 mm were considered.
2.4.2 | SA algorithm procedure to finding global solution As stated above, the goal of the study was to find the optimal values for the three parameters P d , P e and P h for the KF scheme. For this purpose, a random value for each parameter, considering its relevant threshold values (Table 1), was initially generated to form an initial guess of the vector parameter m 0 . The WRF model was then run and the ETS score calculated for the first guess values.
The new values for the parameters (m new i ) were then calculated according to the following equation : where y i (−1, 1) is the Cauchy distribution, which is dependent on the thermal co-efficient T, and was obtained as : Configuration of the Weather Research and Forecasting (WRF) model used in the study  Model Options  where in the above equations the i and k indices are, respectively, the number of the parameter and the iteration number; "sgn" is the sign function; and RND is a random number of a uniform distribution between 0 and 1. For the acceptance of m new i , a control parameter T k was set with an initial value of T 0 (where a T k decrease corresponds to the increase of iteration k): where P is a random number between 0 and 1. The probability of acceptance follows from Equation 8, and with a decreasing control parameter, the probability of acceptance also decreases. With decreases of T, the SA algorithm moves progressively to a region of the allowed range of parameters that minimizes the model errors. The amount of T k in each step becomes smaller and approaches zero. In the study, the initial T 0 value was considered to be 100. A flowchart of the SA algorithm used is presented in Figure S1 in the additional supporting information. Also, the method of using the simulation annealing optimization algorithm was in offline mode. That means that, unlike online modelling, where optimization is performed during the time steps of the model implementation, in order to optimize the desired input co-efficients with the SA optimization algorithm, the WRF model was run each time for the entire simulation hours. The new co-efficients were then extracted, and when placed with the previous values, the model was again executed and the optimization process was performed during the execution.
For a statistical evaluation of the results, five statistical indices mean bias (MB), root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (CC) and multiplicative bias were used. Table 4 depicts the formulation for each statistical index and their best values.

| Description of the cases
Two cases of heavy convective rainfall associated with two typical squall lines were selected for simulation using the WRF model. One was used to find the optimized KF convective co-efficient parameters using the SA method; the other was used to test the obtained optimized KF convective co-efficient values in the first case.
One of the most important factors causing heavy convective rainfall during both cases of training (October 12, 2012) and testing (October 8, 2015) along the southern coast of the Caspian Sea is the merging of the subtropical and polar jets in the region. This process increases the meridional component of the horizontal wind and, as a result, strengthens the cold and humidity advections, which in turn enhance the ascending currents and associated precipitation (Motevalli-Taher et al., 2015;Salahi et al., 2016). Also, as seen in Figure 2, the southern coast of the Caspian Sea is located in front of the 500 hPa trough axis, which may lead to the amplification of the ground-level large-scale synoptic system located over the area.
Two maximum and minimum specific humidities can be seen over the southern and northern shores of the Caspian Sea, respectively. The reason for the existence of a minimum specific humidity south of the Caspian Sea could be because of the lower air temperature over the land compared with those over the sea during October. Moreover, because of the warmer nature of the sea compared with the coasts, the wind direction over the Caspian Sea is northerly and thus the moving air has a higher specific humidity. The reason for the existence of the southward flow could be attributed to the existence of a high-pressure system located over the Caspian Sea. The above-mentioned high-pressure system with its anticyclonic circulation causes a northerly flow over the Caspian Sea, which in turn intensifies the temperature and humidity advections, and as a result ascension and T A B L E 4

Statistical index
Abbreviation Formulation Best value

| Model performance in response to the KF scheme's convection co-efficients
In order to evaluate the efficiency of the WRF model with the new values for the parameters incorporated into the convection scheme, the model's performance which was quantified based on an ETS of 100 simulations for the SA method is shown in the top panel of Figure 3. As described in Table 1, the default values for P d , P e and P h are 0, 0 and 150 hPa, respectively. Also, the ETS = 0.53 in the simulation with the KF scheme's default parameter values. As shown in Figure 5, the ETS varies between 0.28 and 0.73, and in 57 simulations the ETS is larger than or equal to the default ETS value. The optimal values for P d , P e and P h were 1, 1 and 50 hPa, respectively. With these optimal parameters, an ETS = 0.7 was obtained in the 83rd iteration of the simulations. Although larger quantities were obtained for the ETS, they were not approved according to the constraints of the SA process introduced in Section 2. As noted above, the ETS in 57 simulations has had a value greater than or equal to the default value of the ETS. These results are called "acceptable simulations." The frequency distribution of the acceptable simulations as a function of each convective parameter is shown in the bottom panel of Figure 3.
The results revealed that more than 60% of the acceptable simulations are obtained with P d values between 0.8 and 1.0, and more than 80% of acceptable simulations are produced by P e values between 0.4 and 1.0, when specifying that the ratio of the downdraft mass flux rate and the rate of mass flux in the entertainment layer in the standard KF scheme (shown in Equations 1 and 2) are lower than the proper values for the simulation of convection systems in the desired area. Likewise, the optimal value for P h is 50 hPa, which shows that the starting height of the downdraft above the USL in a standard KF scheme is above that of the real one in the considered area, and it should be reduced to obtain better results for squall line predictions.
The spatial distribution of the observed and simulated precipitations for both optimal and default parameter values for October 12, 2012, are presented in Figure 4. Overall, the simulated precipitation with default parameters predicts the occurrence and situation of heavy precipitation on the southeastern shores of the Caspian Sea, but the amounts of rainfall are underestimated. The maximum precipitation predicted with the default mode of the model is about 35 mm, while its observational value is nearly 87 mm. Also, the model with default parameters could not predict a precipitation maximum located over the southwestern Caspian Sea. Using the calculated optimum values of the input parameters, the model prediction showed a significant increase in maximum rainfall, and its amount reached about 47 mm. Of course, this amount is still less than that for the observation, and the model still underestimates the precipitation values. In optimum mode, the model predicts the precipitation and spatial distribution of its kernels on the southeast coast. Figure 5 shows the criteria for the optimization convergence diagram according to the number of model runs. As previously stated, the highest acceptable ETS score was 0.7, and despite the fact that higher amounts for this quantity were obtained, these values were not considered to be a failure to meet the other optimization method constraints. No more model execution was considered since the annealing co-efficient T(k) was less than random, which was one of the main constraints of the SA algorithm.
As summarized in Table 5, the results show that the MB in default mode is −4.6. After optimization, it decreased to −3.7. Both simulations represent an underestimation of rainfall ( Figure 4). The RMSE in the optimal state is 14.3, which shows an improvement compared with its default value of 17.1. The results for the MAE, CC and multiplicative bias statistics show that the MAE decreases from 7.67 to 6.50 in the optimum compared with the default simulations. The CC between predictions and observations rises from 0.3 in the default to 0.6 in the optimum simulations. Also, multiplicative bias is 0.4 in the default simulations and increases to 0.52 in the optimal state.

| Validation of optimization results
After obtaining the optimal values for the KF convective parameters P d , P e and P h in the first case (October 12, 2012), which was called a training case, another case with similar conditions (the formation of a squall line with heavy convective precipitation on the southern shores of the Caspian Sea), on October 8, 2015, was selected, and the WRF model was run with the new parameter values. The results were compared with those of the default values condition.
In this case, as shown at the top panel of Figure 6, the model's underestimation in the default mode is much higher than in the first one, and after applying the new co-efficients to the KF scheme in the WRF model, despite being negligible, the estimated precipitation is improved (Figure 6, middle). However, the model could not predict the convective rainfall of the southern and southeastern shores of the Caspian Sea in both the default and optimal modes ( Figure 6). The maximum measured rainfall is 131 mm, while the maximum predicted rainfall in the default state is 61 mm, and in optimal mode reached 80 mm, indicating improved predictions after applying the optimal input parameters to the model.
The case test (2015) also saw an improvement in the results of the abovementioned statistical indices. The calculated statistical scores specify that the MB decreased from −10.6 in default mode to −8.4 in optimal mode. The RMSE is also reduced from 29.4 in optimal mode to 23.8 in the default condition. The MAE is 12.66 the in default simulations and its optimum is 11.1. The CC index ranged from 0.31 in the default state to 0.50 at optimum. The multiplicative bias statistic increased from 0.17 in the optimum state to 0.27 in the default state (Table 5).

| SUMMARY AND CONCLUSIONS
The study uses the simulated annealing (SA) optimization method to improve the uncertainty in the results of short-term forecasts of the Weather Research and Forecasting (WRF) prediction model for two cases of a squall line that occurred on October 12, 2012, and October 8, 2015. To this end, the three input parameters identified in the Kain-Fritch (KF) convection scheme, which are representative of downdraft mass flux (P d ), entrainment mass flux (P e ) and starting height of downdraft above updraft surface layer (USL) (P h ), were tuned. The southern coasts of the Caspian Sea were selected as the study area because of the prevalence of convective rainfall in the area. Two 36-hr periods with convective rainfall and the observation of a squall line were identified. The first on October 12, 2012, was selected for training and finding the optimal parameters, and the other case, on October 8, 2015, was used for testing the obtained co-efficients.
The default values of P d , P e and P h were 0, 0 and 150, respectively. After completing the optimization process with 100 iterations, the values 1, 1 and 50 were obtained for the specified three parameters.
The results also showed that the model in the default mode estimates the amount of precipitation to a lesser extent than that of the observed value, and optimization improves the amount of rainfall prediction significantly. In spite of the improvement in post-optimization results, the spatial distribution of precipitation in both the default and optimized modes is less than that of the areas covered by precipitation. Mean bias (MB) and the root mean square error (RMSE) of default mode were −4.6 and 17.1, respectively, and these values decreased to −3.7 and 14.3 in the optimized mode in first precipitation case (October 12, 2012).
In the test case, the results were similar to those of the first case, and again the model underestimated the amount of precipitation. After optimization, the results were somewhat better, but the rainfall was still less than the observation. The spatial distribution of the precipitation in both F I G U R E 6 Spatial distribution of convective precipitation on October 8, 2015, for the (a) default and (b) optimized modes of the model and (c) observations the default and optimized modes was much lower than that recorded. The MB and RMSE in comparison with observations showed 21% and 19% improvements in optimal simulations than the default simulations, respectively.
Other statistics such as mean absolute error (MAE), correlation co-efficient (CC) and multiplicative bias had results that also showed 12%, 38% and 59% improvements in optimal simulations than default simulations compared with observations, respectively, after optimizing the input parameters of the KF scheme.
The study showed that along with the post-processing of output data, one of the suitable error-reduction methods for predicting convective precipitation is to adjust the input parameters of the convection scheme for each area using the SA optimization method. It also showed that these co-efficients could be used for future weather predictions if the results are repeatable in other heavy rainfall-occurrence cases.
It is suggested that in future studies other optimization methods be used for this purpose and their results evaluated with the results of the present paper for the diagnosis of the best optimization method to use in order to reduce uncertainty in the results of weather forecasting.