Improvement of the 24 hr forecast of surface UV radiation using an ensemble approach

A methodology is proposed to improve the 24 hr forecast of the ultraviolet (UV) index and the duration of exposure to obtain the minimal erythemal dose (MED). A forecast ensemble consisting of 10 members (differing in initial and boundary conditions) is examined to search for the best performed ensemble member. Routine UV measurements are used for the forecast validation. These are carried out at Belsk (20.8 ° E, 51.8 ° N) and in Racibórz (18.2 ° E, 50.1 ° N) representing a rural and an urban site in Poland, respectively. Each ensemble member is built using the clear‐sky simulations by a radiative transfer model. The clear‐sky irradiance is attenuated using the cloud modification factor (CMF) depending on the cloud cover by low‐ and mid‐level clouds. The 24 hr forecast of cloudiness is obtained by the Weather Research and Forecasting (WRF) model. Every day, for each ensemble member, the optimal CMF values are built by the offline bootstrapping of the original CMF matrix. The performance of all ensemble members is evaluated for the day preceding the forecast. The best one is subsequently used for the next‐day forecast. This procedure provides a more accurate forecast than that based on a single member of the ensemble. For both sites, the root mean square percentage error for the duration of the MED exposure changes from about 30% to about 15%, and mean absolute percentage error from about 20–25% to about 10%.


| INTRODUCTION
Overexposure to ultraviolet (UV) radiation is related to various biological risks affecting humans, animals, plants and materials (Diffey, 1991;Lucas et al., 2006;Young, 2006). Skin redness (sunburn) is a visible sign of the overexposure of the human skin and it increases the risk of skin cancers especially the most dangerous melanoma (Chang et al., 2014). The effect of UV radiation on the human skin is given by weighting the solar spectrum by an erythemal action spectrum (EAS). Several analytical formulas for EAS were discussed (for an overview, see Webb et al., 2011). The analytical formula proposed by McKinlay and Diffey (1987) was updated and approved as the standard by the Commission Internationale de l'Eclairage (CIE) (CIE, 1998). The integral (over wavelengths) of the spectral solar irradiance weighted by this CIE action spectrum, i.e. the so-called erythemal irradiance (EI), is a commonly used metric to determine the strength of the solar radiation in producing sunburn. For an easier understanding of the level of UV radiation, the UV index (UVI) was defined by multiplying EI (expressed in WÁm -2 ) by 40 to obtain 40 WÁm -2 (Vanicek et al., 2000). The UVI is a dimensionless parameter that could maximally reach ≥ 20 in high-altitude sites within the Tropics (Herman et al., 2018). The World Health Organization (WHO) developed several scenarios for outdoor behaviour to avoid a strong UV radiation. If UVI > 2, people should start protection themselves against the UV (WHO, 2002).
In 1994, the UVI forecast programme was initiated in the United States by the National Weather Service. The UVI has been served in many countries as a basic metric for alerting the public of the surface UV intensity. The UVI operational forecast is now a standard programme carried out by many national weather services (Long, 2003;He et al., 2013;Schmalwieser et al., 2017), and also the UVI forecast for the whole world is available on various websites (https:// kunden.dwd.de/uvi/dosis.jsp; https://sunburnmap.com).
The standard UVI forecast comprises three steps: the prediction of ozone in the vertical column (i.e. the so-called total ozone); the calculation of the clear-sky part of the UVI based on a radiative transfer model using the forecasted total ozone and fixed (climatological) aerosols optical depth; and a modification of the clear-sky UVI by clouds (Vanicek et al., 2000). The last step is mostly responsible for the UVI forecast uncertainty as cloud properties exhibit strong spatial and temporal variability that requires high-resolution calculations and proper parameterization of the cloud microphysics (Koepke et al., 1998;Staiger and Koepke, 2005).
Originally, the UVI has been introduced to limit human solar exposure. However, UV radiation is a basic source of vitamin D for humans, and a restrictive protection against solar UV cannot be recommended as it could lead to vitamin D 3 deficiency in the human body (Holick and Chen, 2008). Skin redness appears after an exposure to a certain erythemal dose. The minimal erythemal dose (MED) causing skin redness depends on skin colour. Fitzpatrick (1988) identified six skin phototypes with pertaining MEDs. Exposure to a fraction of the MED is necessary to synthetize an adequate amount of vitamin D 3 (Dowdy et al., 2010;Krzy scin et al., 2016). Thus, it seems that a forecast of the maximum safe (without erythema risks) duration of exposure yielding exactly 1 MED needs to be added to the UV forecasts for the public. However, what is more beneficial to human health, avoiding solar UV or a limited tanning, has not yet been solved (Baggerly et al., 2015;. The presently used operational UVI forecast models were developed in the 1990s and early 2000s. The option of avoiding UV dominated at that time. The novelty of the present paper stems from the following points. First, it focuses on the performance of the 24 hr forecast for both the UVI and the maximum duration of safe exposure (DSE) intraday variability. Second, a new algorithm is proposed to minimize the forecast error by using the ensemble approach and the offline bootstrap resampling of co-efficients describing the cloud attenuation of solar UV. The best performing ensemble member for the next-day forecast, which is the closest to the measured UV values, is selected from a set of forecasts. Thus, the optimal ensemble member is not fixed over the whole forecast period (June-August 2018) and it may vary from forecast to forecast by taking into account the previous performance of all ensemble members.

| UV measurements
The EIs measured in a rural region of central Poland (Belsk, 20 8 0 E, 51 8 0 N) and in the industrial zone in the southwestern Poland (Racibórz, 18 2 0 E, 50 2 0 N) are used to evaluate the performance of the 24 hr forecast of the UV index and the maximum DSE, i.e. the period of sunbathing to receive exactly 1 MED. Here, an MED = 250 JÁm -2 was selected as the standard for a person with skin phototype II, which is typical for the Polish population.
The UV measurements are routinely carried out in these sites by Kipp and Zonen UV-EA radiometers within the Institute of Geophysics, Polish Academy of Science, UV network. The quality of the UV observations at Belsk is checked every month by comparisons with UV indices taken simultaneously at the site by the Brewer spectrophotometer for clear-sky days. The radiometer in Racibórz was put into operation in June 2018 after calibration with the Brewer spectrophotometer. The quality of the Belsk's Brewer is assured by regular comparisons with the secondary world standard. For forecast/observation intercomparison, the UVI and the maximum DSE were calculated every 15 min from 0600 to 1600 hours, i.e. about 5 hr before and after solar noon, between June and August 2018.

| Forecast model
The UV forecast model used comprises of the standard three-steps set-up, as mentioned in Section 1. The 24 hr forecast of total ozone (TO 3 ) over Poland starting at 0000 GMT with a time step of 6 hr and spatial resolution 0.5 × 0.5 is taken from the Global Forecast System (GFS) run by the US National Centers for Environmental Prediction (NCEP). Linear interpolation is used to obtain TO 3 data every 15 min, and the bilinear interpolation from TO 3 gridded values is applied to find ozone over the UV measuring sites.
Clear-sky UV irradiance is taken from a look-up table, which is built using the Tropospheric Ultraviolet and Visible model (TUV) (Madronich and Flocke, 1997) for various combinations of model inputs including total ozone (in the range 200-500 DU with steps of 10 DU), aerosol optical thickness (AOT) at 340 nm (in the range 0-1.5 with steps of 0.1) and elevation (in the range 0-2.5 km with steps of 100 m). Surface albedo is assumed to be 0.03 (typical for snowless ground covered by plants) as the clear-sky calculations are only for the warm part of the year. Aerosol characteristics (single scattering albedo and the asymmetry factor) are equal to the continental aerosol values. The GFS TO 3 forecast and climatological values of the AOT from satellite observations over the territory of Poland are used as the TUV input values to build the look-up table.
Various analytical formulas have been proposed for cloud modification factor (CMF) dependence on the cloud fraction (CF) by the low-, mid-and high-level clouds (Vanicek et al., 2000;Calbó et al., 2005). Guzikowski et al. (2017) found an empirical formula after comparison of the 24 hr ensemble forecasts with the UV measurements in 2015 at Belsk. Ensemble members differed in the parameterization of cloud microphysics, atmospheric long and short wave radiation, and turbulence in the surface and boundary layer. Finally, for operational use, a CMF matrix was built by taking into account the ensemble member with the best agreement with the UV observations (member 1 in Table 2A of Guzikowski et al., 2017). The 5 × 5 CMF matrix comprises an array co-efficients for five classes of CF by the mid-and low-level clouds, CF < 0.25, 0.25 ≤ CF < 0.50, 0.50 ≤ CF < 0.75, 0.75 ≤ CF < 0.95 and CF ≥ 0.95. For example, CMF = 1 and 0.37 for almost clear-sky (the matrix coefficient {1,1}) and overcast conditions (the matrix coefficient {5,5}), respectively.
CF values < 800 hPa (low-level clouds), between 800 and 450 hPa (mid-level clouds), and > 450 hPa (highlevel clouds) are calculated with 15 min resolution by the Weather Research and Forecasting (WRF) model (Skamarock et al., (2008)) for a nested grid (0.10 × 0.10 ) over the whole of Poland. Here, the next-day forecasts are performed using the CMF values recommended by Guzikowski et al. (2017) by taking into account various cloud parameterizations (see member 1 in Table 1A of Guzikowski et al., 2017). Different assumptions of initial and boundary conditions (IBC) to run the WRF model are also possible for the next-day forecasts. The Global Ensemble Forecast System (GEFS) allows one to apply up to 21 separate IBC types. It is used operationally by the National Oceanic and Atmospheric Administration (NOAA). The GEFS is produced four times per day with a 6 hr interval starting at 0000 GMT. The resolution of the IBC data is approximately 34 km horizontally and 64 hybrid vertical levels (Zhou et al., 2017). The ensemble system has 20 perturbation members and one control member. Daily forecasts are available via a public ftp server (https:// nomads.ncep.noaa.gov/pub/data/nccf/com/gens/prod/).  For the needs of the present paper, 10 IBC members from the GEFS ensemble were arbitrarily selected. It cannot be excluded that using more IBC types would lead to better results, but here not all possible IBC were examined owing to limits in the computing/processing speed of the computer used. For each selected IBC member, the next-day UV irradiances with 15 min resolution were calculated and stored. Linear interpolation is used to have UV irradiance with a 1 min resolution, and these values are integrated in time until the dose reaches 250 JÁm -2 , i.e. the value equivalent to 1 MED for skin phototype II. Thus, every 15 min forecast of the UVI and duration of exposure to obtain the MED are available for each IBC member for further optimization.
In the next step, the performance of all ensemble members is evaluated by comparing the prognostics UV values obtained in the period 0600-1600 UTC with the corresponding measured UV values. For the IBC members' ranking, the standard statistics for the goodness-of-a-model fit to the observations are calculated: root mean square percentage error (RMSPE) and mean absolute percentage error (MAPE) (see the next section for definitions). In previous studies, the UVI forecasts were evaluated using the whole forecasting period, but here the statistics are calculated every day for a shorter period before the next-day forecast. The number of days used for the ranking of the IBC ensemble members varied from one to 20, but it appeared that the best forecast was for the ranking based on the one day comparison, i.e. for the day preceding the forecast. This is done using the IBC ensemble member with minimal values for the statistics (RMSPE or MAPE). The next-day UV forecast started on June 2 using the ensemble performance on June 1 and it was repeated every day until August 31, 2018. Thus, the last forecast was by the best performed IBC ensemble member on August 30.
The initial CMF matrix coefficients used in the forecast are equal to those for member 1 of the ensemble discussed by Guzikowski et al. (2017), i.e. they were built by taking into account the data collected in 2015 at Belsk. It is possible that the CMF matrix is not optimal in 2018 calculations especially for other sites (e.g. Racibórz). Thus, the forecasts will be improved by adding small values, which are randomly drawn from the range {−0.1, 0.1}, to all the CMF coefficients. A new CMF matrix is used for the next-day forecast. This offline procedure is repeated 300 times, which is an arbitrarily selected number, for each IBC ensemble member. Finally, based on the one day performance on the whole forecast sample consisting of 300 (total number of random CMF matrices) × 10 (total number of the IBC ensemble members) members, the best one is selected to be used in the next-day forecast.
It seems that the performance of the best rank member could be further improved by removing a bias found in the forecast/observation comparison when evaluating the best performed ensemble member. First, the regression (Equation (1)) of the observed UV values (UVI or the maximum DSE), UV O (t), on the forecast values by each ensemble member i, UV i (t), is calculated to obtain coefficients const 1,i (the line intercept) and const 2,i (the line slope): where Noise i (t) is the residual part of the linear regression for the i-th ensemble member. Next, these coefficients are used to build unbiased forecast, UV i *(t), for day t: T A B L E 2 As for Finally, the next-day (t + 1) forecast is done by the best performed ensemble member in the preceding day, UV B *(t), selected from the ranking of the forecast performance of all bias-corrected ensemble members.
The results of the next-day forecasts by the best biascorrected ensemble member are marked by the letter "B", while "A" denotes the results by the best member without such correction. Moreover, index "0" (e.g. B0 and A0) means that the criterion for the best forecast model was the minimum RMSPE. Otherwise, index "1" denotes that the criterion was the minimum MAPE.

| RESULTS
The following statistical characteristics of the goodness of fit of a model to the observed data are calculated: the correlation co-efficient (CC), root mean square error (RMSE), RMSPE, bias (BIAS), MAPE and linear regression co-efficients (A = intercept, B = slope) of the modelled value Xmi on the observed value Xoi, i = {1, …, N}: where: The forecast of TO 3 is essential for the UV modelling (Sections 1 and 3). Here TO 3 forecasts are imported from external sources (output of the GFS model). The TO 3 from the 24 hr GFS forecast is close to that observed by TO 3 by the Dobson spectrometer operating at Belsk since March 1963 (Figure 1). The comparison of the observed and 24 hr forecast total ozone yields the following statistical characteristics derived for the period June-August 2018: CC = 0.96, RMSE = 6.27 DU, RMSPE = 1.86%, BIAS = 0.19 DU, MAPE = 1.40% and linear regressioncoefficients (modelled TO 3 versus observed TO 3 ), A = 26.2 DU and B = 0.92. T A B L E 4 As for  The performance of each IBC ensemble member based on the whole analysed period (June-August 2018) is shown both for UVI and the maximum DSE derived from next-day forecasts every 15 min between 0600 and 1600 UTC (Table 1 for Belsk, Table 2 for Racibórz, and  Table 3 for Belsk and Table 4 for Racibórz), respectively. A large scatter is seen in the statistics by the considered ensemble members. For example, the sixth member of the ensemble provides the best agreement with the Belsk's UVI observations with RMSPE = 36.6% (UVI), and for the maximum DSE the best one (eighth member) yields 27.9%. The corresponding values for Racibórz are 37.8% (sixth member) and 32.0% (fourth member). The poorest performances are 56.1% (UVI, fourth member) and 32.0% (DSE, fourth member) for Belsk, the corresponding values for Racibórz are 58.0% (UVI, fourth member) and 43.9% (maximum DSE, sixth member). Figure 2a,b illustrates forecast/observation scatter at Belsk for the best member of the ensemble, i.e. the UVI forecasts by the sixth member and the maximum DSE forecast by the eighth member. The maximum UVI is around 7 for both the observations and the forecast values (i.e. typical cloudless value around noon in late spring and at the beginning of summer in Poland). In this case, the maximum DSE (for skin phototype II) is around 30 min. Sometimes, the cloudiness forecast is either too low or too high, e.g. the modelled UVI is 7, but the observed UVI is 3.5, or the observed UVI is 6, but the modelled UVI is 2.5. Similarly, the maximum DSE forecast is 180 min, but the corresponding observed value is only 60 min, or the observed value is 135 min, but the forecast is 45 min.
Note that the ensemble performance ranking based on the whole period of the forecasts does not allow one to make an optimal forecast for each day within this period. The ensemble member with the best performance could not be revealed until the end of the comparative period (August 31, 2018). Thus, it is better to search for the optimal 24 hr forecast by analysing output of all ensemble members over overlapping shorter periods (few days or few weeks preceding the forecast) and make the next-day forecast using the best performed ensemble member over this shorter period. Moreover, the whole season ranking seems to be useless for next year simulations as it cannot be excluded that another optimal forecast will be disclosed after the forecast period in 2019. Figure 3 shows the ranking of the ensemble members based on the performance of the whole ensemble derived from one day (Figure 3a,c) and seven days (Figure 3b,d) forecast/observation UVI comparisons preceding the forecast day. Here, the criterion for the best performing ensemble member was the maximum correlation coefficient found in the comparison period. It is seen that each ensemble member appeared at least once in the whole analysed period. The correlation coefficient for the best ensemble member vary between 0.5 and about 1. The ranking based on the one day comparison period is different than that inferred from the seven day comparison period, but the correlation coefficients are usually higher, i.e. 0.89 ± 0.11(1σ) and 0.78 ± 0.10(1σ), respectively. Various comparison periods (from 1 to 20 days) were examined for the purpose of the model ranking before the next-day forecast. Finally, the one day ranking is used because it provided the best next-day UV forecasts. Table 5 (for Belsk) and Table 6 (for Racibórz) show values characterizing the goodness of the UV forecasts over the whole analysed period June-August 2018. Here the criterion for the best performing IBC member of the ensemble was the minimum of RMSPE (index 0) or the minimum MAPE (index 1) found in the one day forecast/ observation comparison preceding the next-day forecast. Model B uses the unbiased approach as defined by Equations (1) and (2). The initial CMF matrix was the same as that proposed by Guzikowski et al. (2017) for member 1 of the 24 hr UVI forecast ensemble based on various WRF parameterization schemes of the atmospheric processes for the next-day UVI forecast. Using the approach, i.e. everyday selection of the optimal next-day forecast model, improves the overall goodness of UV forecasts. For example, the best performance by a single IBC member of the ensemble fixed for the whole analysed period (Tables 1 and 2) yields at best an RMSPE of about 35% (both sites), but finally the RMSPE is about 32% for the best ensemble member. The similar improvement is seen for the maximum DSE from about 28% (Belsk) and about 32% (Racibórz) by the fixed IBC member to about 20% for the best performed ensemble member. Better performance is also found when the minimal MAPE criterion is considered, i.e. a change from about 20% (Belsk) and about 24% (Racibórz) for the maximum DSE to about 15% finally for both stations. The same improvement is found for UVI forecast, i.e. the change of MAPE from about 28% (both stations) to 24%.
Using the random CMF coefficients by the offline bootstrap resampling of the initial CMF matrix improves further the model/observation agreement for the whole period of the UV forecast (Tables 7 and 8) The ranking is based on the one day forecast/observation comparison in the day preceding the next-day forecast. The criterion for the best rank ensemble member is the RMSPE (index "0") or MAPE (index "1"). Models A and B denote next-day forecasts with and without the bias correction procedure described by Equations (1) and (2). RMSE and bias are in UVI unit for UVI forecast, but in minutes for the maximum DSE forecast, respectively. RMSPE and MAPE are in per cent.
T A B L E 6 As for

| DISCUSSION AND CONCLUSIONS
The mains source of the ultraviolet (UV) forecast error is the parameterization of the cloud attenuation of UV radiation. Output of the forecast model (Weather Research and Forecasting (WRF)) for limited area (the territory of Poland) could be also sensitive to selected initial and boundary conditions (IBC) values provided by a global weather forecast model. Here Global Ensemble Forecast System (GEFS) is chosen, which provides 21 IBC types. Moreover, various schemes of the cloud parameterization by the WRF model are possible. These were examined by the authors previously (Guzikowski et al., 2017) and the best one was identified based on the forecast/observation comparison at Belsk in 2015. Thus, many IBC/cloud parameterization combinations are possible, and because of computing speed limit, not all of them could be simultaneously tested to find the optimal one, i.e. with the best agreement to the UV observations. Here a methodology is proposed to search for the best 24 hr UV forecast model. The WRF model was run to produce a cloud cover course for the next day using 10 arbitrarily selected IBC members together with offline bootstrapping of the initial cloud modification factor (CMF) co-efficients for the best cloud parameterization found by Guzikowski et al. (2017). It means that only 10 forecast WRF models are run simultaneously. The best forecast is identified by analysing the performance of all members of the ensemble for the day preceding the forecast. Ensemble forecasting is a standard practice in weather prediction schemes (e.g. Palmer, 2018). It allows for the estimation of uncertainty for the forecast in an effective way by using the parallel architecture of model codes. It seems that the issued UV forecast should be presented in the old format, i.e. one possibly most accurate value giving people information on how to balance between the harmful and beneficial solar UV effects. Here, a methodology to improve a single forecast is presented based on the ranking of ensemble member performance in the days preceding the forecast. Thus, an optimal forecast model is not fixed and is subject to change between the respective forecasts.
The basis for the assumption of using such an approach is a persistence of atmospheric processes allowing for the use of the best performed model for the next-day forecast. Here, it is found that UV forecasts for Poland will be improved when the ensemble model ranking is based on the one day performance of the ensemble members. However, the ranking requires observing data in days preceding the forecast. It means that the forecast could be valid only for the UV observation sites and their vicinity. Previous studies showed that for low land stations in Europe, the measured UV index (UVI) is representative for sites within about 10 km radius (Weihs et al., 2008;Krzy scin et al., 2018). Usually the density of the national UVI network, providing high-quality data, is too low. This limits the possibility of the UV forecast for large areas of the country. The UVI could be effectively reconstructed using the total solar irradiation (TSI) as a proxy for cloud effects on UV radiation (Alados et al., 2007;Bilbao and Miguel, 2010;Tereszchuk et al., 2018). The TSI observations are carried out at many national weather stations, and the measuring instruments are easier to handle than those for UVI measurements. There is T A B L E 7 As for Table 5 (Belsk), but for each initial and boundary conditions (IBC) ensemble member offline bootstrap resampling was used to modify the initial cloud modification factor (CMF) co-efficients by Guzikowski et al. (2017)  also a possibility to use satellite observations of cloudiness to retrieve the daily course of the UVI for the day preceding UV forecast. Thus, the presented methodology could be used to produce UV forecast maps not only for the selected sites close to the UV observing network but also for the remainder of the country. Note that the UV forecasts are adapted to local atmospheric conditions (e.g. cloudiness pattern, aerosol properties), i.e. the IBC ensemble member and the CMF matrix could reproduce specific local properties of the UV attenuation over different sites. Here two sites are considered: a clean rural site (Belsk) and a contaminated industrial site (Racibórz). The initial (before the bootstrapping) CMF matrix is the same for both sites. Originally it was found from the model/observation comparisons carried out at Belsk in 2015 (Guzikowski et al., 2017). It is not necessarily optimal for other sites or in the following years (e.g. see the larger RMSPE and MAPE in the maximum duration of safe exposure (DSE) forecasts for Racibórz in Tables 3 and 4). The bootstrapping of the CMF matrix together with the biasreduction procedure allows for improved next-day forecasts adapted to local atmospheric properties (see Tables 7  and 8 for similar RMSPE and MAPE provided by the B models for the UVI and the maximum DSE for these stations).
The presented maximum DSE is derived from the UV exposure on a horizontal surface. The real personal erythemal doses received during outdoor activities are usually smaller (Siani et al., 2009;Weihs et al., 2013). Thus, the real-time spend outdoors could be longer than the forecasts, i.e. the forecast is much safer. It cannot be excluded that some parts of the body (inclined surfaces such as the forehead and shoulders) could attain higher doses than the estimated ambient dose. This could happen in the case of a large surface albedo (i.e. during winter in the mountains) and when skin areas are perpendicular to the solar rays, which usually occurs for a short period of time (Siani et al., 2008).
The conclusions as follows: • Effective 24 hr forecast of the maximum DSE is possible and its performance is even better than that for UVI forecast. • A methodology is proposed to improve the performance of the 24 hr forecast of the UVI and the maximum DSE. • Forecasts are better than those obtained by a fixed (for the whole period of analysis) member of the UV forecast ensemble.
Here, in the preliminary studies of the UV forecast ensemble modelling, a limited number of the ensemble members (N = 10) is examined. There are suggestions that around 20-50 ensemble members should be implemented in the mesoscale national weather prediction models (Eckel and Delle Monache, 2016). Thus, it seems that a potential for improving the next-day UV forecast still exists.