Changing temperature extremes based on CMIP5 output via semi‐parametric quantile regression approach

The Expert Team on Climate Change Detection and Indices (ETCCDI) proposed a set of 27 core indices, describing extreme weather and climate events based on daily temperature and precipitation data. There are six percentile‐based temperature extreme indices: four exceedance rates (ERs) (TN10p, TN90p, TX10p and TX90p) and two durations (cold spell duration [CSDI] and warm spell duration [WSDI]), derived by using percentiles for calendar days during the base period 1961–1990 as thresholds. The widely used empirical quantile (percentile) estimator or its bootstrap resampling adjustment may result in inhomogeneity in the derived annual ER series or in their linear trends, as well as seasonally varying biases in the monthly ER series. We present a new data set of the six indices for the historical and three representative concentration pathways (RCP2.6, RCP4.5 and RCP8.5) scenarios simulated by 19 global climate models participating the Coupled Model Intercomparison Project Phase 5, generated via the semi‐parametric quantile regression approach. Percentiles estimated upon the independence condition, inhomogeneity and seasonal biases are removed from the new data set, so that the changes in temperature extremes can be authentically revealed from the climatological distribution point of view. Compared with its counterpart produced by the CLIMDEX project, the new data set shows similar spatial and temporal change patterns under the RCP scenarios, but with much smaller magnitudes. By the end of 21st century, TN90p increases to 17.3, 24.5 and 40.0%, and TX90p increases to 16.9, 22.1 and 35.3%, under RCP2.6, RCP4.5 and RCP8.5, respectively. WSDI increases to 35 days under RCP8.5. For the three increasing indices, the most significant changes occur in tropical and extratropical regions. The greatest increases in TN90p and TX90p occur in southeast Asia and Amazon Basin, up to 85 and 68% in the summer under RCP8.5; that in WSDI occurs in Amazon Basin up to 120 days under RCP8.5.


Funding information
The Expert Team on Climate Change Detection and Indices (ETCCDI) proposed a set of 27 core indices, describing extreme weather and climate events based on daily temperature and precipitation data. There are six percentile-based temperature extreme indices: four exceedance rates (ERs) (TN10p, TN90p, TX10p and TX90p) and two durations (cold spell duration [CSDI] and warm spell duration [WSDI]), derived by using percentiles for calendar days during the base period 1961-1990 as thresholds. The widely used empirical quantile (percentile) estimator or its bootstrap resampling adjustment may result in inhomogeneity in the derived annual ER series or in their linear trends, as well as seasonally varying biases in the monthly ER series. We present a new data set of the six indices for the historical and three representative concentration pathways (RCP2.6, RCP4.5 and RCP8.5) scenarios simulated by 19 global climate models participating the Coupled Model Intercomparison Project Phase 5, generated via the semiparametric quantile regression approach. Percentiles estimated upon the independence condition, inhomogeneity and seasonal biases are removed from the new data set, so that the changes in temperature extremes can be authentically revealed from the climatological distribution point of view. Compared with its counterpart produced by the CLIMDEX project, the new data set shows similar spatial and temporal change patterns under the RCP scenarios, but with much smaller magnitudes. By the end of 21st century, TN90p increases to 17.3, 24.5 and 40.0%, and TX90p increases to 16.9,22.1 and 35.3%,under RCP2.6,RCP4.5 and RCP8.5,respectively. WSDI increases to 35 days under RCP8.5. For the three increasing indices, the most significant changes occur in tropical and extratropical regions. The greatest increases in TN90p and TX90p occur in southeast Asia and Amazon Basin, up to 85 and 68% in the summer under RCP8.5; that in WSDI occurs in Amazon Basin up to 120 days under RCP8.5.

| INTRODUCTION
Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual reoccurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) (http://wcrp-climate.org/etccdi) worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis (Peterson, 2005). The CLIMDEX project (http://www.climdex.org) produced a public domain data set of such indices for data sets from a variety of sources, including global land-based observations, reanalysis data and output from global climate models (GCMs) participating in the Coupled Model Intercomparison Project Phase 3 (CMIP3) and Phase 5 (CMIP5). Based on the ETCCDI indices and the CLIMDEX data set, a wide range of works concerning observed and projected changes in climate extremes and GCM performance assessment were carried out (e.g., Bürger et al., 2012;Donat et al., 2013;Li et al., 2013;Sillmann et al., 2013a;2013b;Zhou et al., 2014;Chen and Sun, 2015;Dong et al., 2015;Yu and Li, 2015;Zhou et al., 2016). The European Climate Assessment & Dataset (ECAD) project (http://www.ecad.eu) also adopted the ETCCDI indices as part of their own indices. The Intergovernmental Panel on Climate Change (IPCC) cited extensively the ETCCDI indices-based analysis results in their Fifth Assessment Report (Intergovernmental Panel on Climate Change, 2013).
The ETCCDI indices may roughly fall into four groups: (a) extreme value indices, (b) absolute threshold-based exceedance indices, (c) duration indices and (d) percentilebased indices. The first three groups of indices are derived from data straightforwardly according to their definitions. The last group of indices describes the exceedance rates (ERs), durations or cumulants using percentiles for calendar days during the base period 1961-1990 as thresholds. There are six percentile-based temperature extremes indices: cold nights and days (TN10p and TX10p), warm nights and days (TN90p and TX90p) and cold and warm spell duration indices (CSDI and WSDI) ( Table 1). Percentiles must be estimated prior to the calculation of these indices and could more or less be biased by the adopted algorithm. Such biases will in turn be propagated to the final results of indices. Originally recommended quantile (percentile) estimator is an empirical method (Hyndman and Fan, 1996;Folland and Anderson, 2002). Zhang et al. (2005) (referred to as ZHZK hereafter) showed that for the four derived ER indices TNnp and TXnp (n = 10, 90), there are jumps at the boundaries between the in-base and out-of-base periods, resulting in inhomogeneity in their annual series and thus biased linear trend estimates. ZHZK carried out a set of Monte Carlo experiments and concluded that the problem arises from the biased percentile estimates using data from the base period, due to the empirical quantile estimator (EQE), sampling variability and autocorrelation in data time series. ZHZK finally suggested a bootstrap resampling procedure to adjust the percentile estimates for the base period so that the inhomogeneity in the annual ER series can be removed. The CLIMDEX project adopted ZHZK's method in the calculation of the four ER indices.
However, there are still some issues unaddressed (or brought about) by ZHZK's method (Yang and Xu, 2017, referred to as Y&X hereafter). ZHZK's method does not actually take the autocorrelation in data into account, but may introduce extra biases for a warming climate. In addition, by definition CSDI/WSDI should be derived using the same 10th/90th percentile estimates as used by TN10p/ TX90p. However, the CLIMDEX project applied ZHZK's method to adjust the in-base percentiles only for the four ER indices, and used the unadjusted EQE estimates for deriving CSDI and WSDI. As a result, CSDI/WSDI is inconsistent with TN10p/TX90p, at least for the base period, in the CLIMDEX data set.
As an attempt to address the above issues, Y&X proposed a semi-parametric regression approach (Green and Silverman, 1994) to the climatological quantile estimation, particularly for deriving ETCCDI percentile-based temperature extremes indices. Percentiles for daily temperature are expressed by a quantile regression model (QRM) in the form where q τ (t) is the τth percentile for the tth day of the base period; s(d t ) is the nonparametric term for seasonality as a nonlinear function of calendar date d t 2 {1, 2, …, 365} for the tth day; x(t − i) (i = 1, 2, 3) are the previous ith day's observations; β i (i = 0, 1, 2, 3) are coefficients. Model 1 is fitted through the vector generalized additive model (VGAM) approach (Yee, 2015), by which the response is assumed to follow the asymmetric Laplace distribution (Kotz et al., 2001). Once fitted, model 1 can be used straightforwardly to estimate (predict) percentiles for either the base or the out-of-base period. Monte Carlo experiments show that through QRMs the derived ERs present negligible biases not only on an annual basis, but also on a monthly basis as well. Later on, Yang (2017) further applied this approach to daily maximum and minimum temperature outputs from 19 CMIP5 GCMs for the historical and three representative concentration pathways (RCP2.6, RCP4.5 and RCP8.5) emission scenarios (Taylor et al., 2012), yielding an improved data set of the six percentile-based indices for these GCMs. This paper provides an overview of the new data set, focusing on the differences from its counterpart in the CLIMDEX data set. To facilitate comparison, we largely follow the conventions adopted by Sillmann et al. (2013a) and Sillmann et al. (2013b), particularly the graphs of Sillmann et al. (2013b) (referred to as SKZZB hereafter). In the following, we first discuss some algorithm issues particularly related to the warming climate as simulated by the CMIP5 models in section 2, as complementary to Y&X. We briefly describe the new data set in section 3. In section 4, we analyse our results from a multiple points of view and compare them with those from the CLIMDEX data set. In section 5, we summarize our results to conclude the paper.

| ALGORITHM ISSUES
ZHZK carried out a set of Monte Carlo experiments simulating a zero mean and unit variance autoregressive process [AR(1)] X t with lag 1-day autocorrelation α and white noise innovations Z t to show the jump at the boundary between the in-base and out-of-base periods in the annual ER series. ZHZK then suggested a bootstrap resampling procedure to adjust the percentile estimates for the base period so that the jump can be removed. However, this method works only in the stationary situation such as model 2, which is not the case for the warming climate as simulated by CMIP5 models. For non-stationary cases with trends, it may introduce extra biases. This can be illustrated by simply adding an increasing trend to model 2 such as where Y t = 0.05 × (t\365) is a step function of t ("\" is the integer division operator), but also a linear function of the 365-day calendar year (cyan dashed lines in Figure 1). We carried 5,000 simulations of model 3 for 60 years. To illustrate more clearly the effect of ZHZK's method on an increasing trend, here we used the first 50 years as base period. Figure 1 shows the ensemble-mean ERs for the 10th and 90th percentiles (solid lines) and their averages over the base period (horizontal dashed lines), derived by EQE, ZHZK's method and QRM, respectively. Note that the EQE's dashed line is so close to the QRM's one that it is almost covered by the latter. In Figure 1a, the ZHZK's solid line is higher than the EQE's one at the beginning of the base period and then goes down closer and closer to the latter; in Figure 1b it is quite the contrary. In both cases, the largest difference between ZHZK's and EQE's solid lines is about 1.73%, and the ZHZK's dashed line is about 0.52% higher than the EQE's one, which are all significant by t test with p-values <2.2 × 10 −16 . It can be seen in Figure 1b that the EQE-derived ER transits smoothly from the in-base period to the out-of-base period; its first-order derivative (linear trend) is continuous at the boundary. However, the linear trend is greater in the ZHZK-adjusted ER than in the EQE-derived ER at the boundary, so that the transition of the ZHZK-adjusted in-base ER to the EQEderived out-of-base ER is not smooth. This time, it is ZHZK's adjustment that introduces a "drop" in the linear trend in the annual 90th percentile ER at the boundary between the in-base and out-of-base periods. The QRM approach still works well, and the EQE is also fine in terms of homogeneity. Nevertheless, neither the jump nor the drop is a serious problem. The important issue is that ZHZK's method, or basically the EQE, does not actually take account of the autocorrelation in data. The effect of autocorrelation can also be seen through the simplest AR(1) case as well. In model 2, if Var(Z t ) = σ 2 , then Var(X t ) = σ 2 /(1 − α 2 ). Assume that X t is the climatology of daily temperature, and Z t is the weather processes that is random conditional on X t . The existence of autocorrelation (nonzero α) makes the variability of X t conditional on the seasonal cycle visually greater than its real random variability (σ 2 ) caused by Z t . As a result, the variability of EQE-derived ER is also greater than in the case where α = 0. If the quantile estimator can decorrelate the autocorrelation, as performed in model 1, then the percentiles of X t and related ERs can be estimated upon the independence condition. If the actual number of exceeding days is of interest, no matter whether the temperature records on these days are autocorrelated or not, then the absolute threshold-based exceedance indices, such as frost days (FD) and tropical nights (TR) counting the days when daily minimum temperature is below 0 C or above 20 C, can serve the purpose. The percentile-based ER indices, however, quantify the behaviours of temperature extremes from the climatological distribution point of view, hence should be derived upon the independence condition so that the results are realistic.
In addition, if the ER indices are calculated on a monthly basis using EQE or ZHZK's method, there could be systematic, seasonally varying biases that can hardly be seen on an annual basis (refer to fig. 3b of Y&X for an illustration). The reasons for such biases are probably associated with the seasonally varying variability and the autocorrelation of daily temperature data. Because it is often desirable to analyse changes in climate extremes on a monthly or seasonally basis, season-related biases should not be ignored either.

| DATA SET
The new data set of the six percentile-based indices is available to download following Yang (2017). The 19 CMIP5 GCMs (Table 2) were chosen because they still had relatively complete data sets accessible through the CMIP5 data portal (https://esgf-node.llnl.gov/projects/cmip5/) by the end of 2016. They are largely but not exactly the same as those listed in table 1 of SKZZB. The RCP2.6 output is not available for CMCC-CM, CMCC-CMS and HadGEM2-CC. All the indices start from individual starting years to 2005 for the historical scenario, and from 2006 to 2100 for all the RCP scenarios. Indices are calculated only for the ensemble member r1i1p1 of each GCM run.

| RESULTS AND ANALYSIS
In order to compare our results with those from SKZZB, we still focus on the temporal evolution and spatial patterns of the six indices over land only, and follow the 21 subregions ( Figure 2) defined by Giorgi and Francisco (2000, table 2) to quantify the regional variations.

| ER indices 4.1.1 | Temporal evolution
To show the temporal evolution of the ER indices globally, their annual series are spatially averaged over land first and then are smoothed by using a 20-year moving window. The CMIP5 multi-model median and inter-quantile range (the range between the 25th and 75th percentiles) of the smoothed series are estimated to quantify the average and spread of the multi-model temporal evolution of the projected ER indices. Figure 3 shows the multi-model temporal evolution of the four ER indices under the historical and three RCP scenarios. By construction, their averages over the base period 1961-1990 are about 10%. TN10p and TX10p decrease from the late 20th to the 21st century in all the RCP scenarios. The median of TN10p decreases to 5.9, 3.7 and 1.8% by the end of 21st century under RCP2.6, RCP4.5 and RCP8.5, respectively, while that of TX10p decreases to 6.6, 4.9 and 3.1%. TN90p and TX90p increase consistently in the meantime, the former is more pronounced than the latter. TN90p increases to 17.3, 24.5 and 40.0%, respectively, while TX90p increases with smaller magnitude to 16.9, 22.1 and 35.3%. As a comparison, in the CLIMDEX data set (c.f., fig. 8 of SKZZB), TN10p decreases to 3, 1.5 and 0.3%, respectively, and TX10p decreases to 4, 2 and 0.7%.
TN90p increases to 31, 44 and 69%, respectively, and TX90p increases to 26, 39 and 62%. All the changes estimated from the CLIMDEX data set are greater in magnitude (the increases are much more drastic in particular) than those from the new data set.

| Spatial and seasonal patterns of changes
For the spatial patterns of change in annual ER indices, multi-model medians of their on-land temporal averages over the period 2081-2100 are calculated, with reference to those over the base period 1961-1990 that are about 10% by construction. Changes in medians between the two periods are tested using the Wilcoxon rank sum test (Bauer, 1972). All the changes are significant at the 5% significance level. Figure 4 shows the multi-model medians of the annual TNnp (n = 10, 90) temporally averaged over the period 2081-2100 as absolute values of ER (in %). Their spatial patterns are similar to those resulted from the CLIM-DEX data set in general; that is, the largest decreases in TN10p and largest increases in TN90p are projected in tropical regions (c.f., fig. 9 of SKZZB). However, magnitudes of changes are generally smaller than those resulted from the CLIMDEX data set. Figure 5 is the same as respectively. This is also consistent with the analysis of Figure 3.
As mentioned in section 2, EQE or ZHZK's method can result in seasonally varying biases when applied to the estimation of monthly ER indices. Some threshold indices may also vary with seasons if calculated on a monthly basis. The percentile-based ER indices, however, count days using percentiles rather than absolute values as thresholds; they should not vary with seasons if calculated on a monthly basis, at least for the base period. If projected in the future emission scenarios, there could be seasonal variations due to the mechanisms in GCMs. In order to check if such biases exist, differences between multimodel medians of winter (December, January and February [DJF]) and summer (June, July and August [JJA]) indices averaged over the base period and then over individual subregions are calculated for the CLIMDEX and the new data set, respectively (Table S1, Supporting Information). For the CLIMDEX data set, of all the 84 results (four indices for 21 subregions) up to two decimal places, there are 51 positive ones, 26 negative ones and 7 zeros; the largest difference in magnitude is 0.22% for TN10p in GRL. Considering that most of the subregions are in the Northern Hemisphere so that the dominant greater DJF values are systematic, we have reason to worry that there are undesirable seasonal biases in the CLIMDEX data set. However for the new data set, there are 30 positive results, 22 negative ones and 32 zeros, all of which are less than 0.04% in magnitude. Therefore, we can safely use the new data set to investigate the regional and seasonal variations in the projected ER indices. Figures 6 and 7 show multi-model medians and interquantile ranges of annual, JJA and DJF indices averaged over the period 2081-2100 and then over individual subregions for TNnp and TXnp (n = 10, 90), respectively, using the box-and-whiskers plot. Their CLIMDEX counterparts are figs. 10 and S3 of SKZZB. TN90p and TX90p have pronounced seasonal variation in the future: increases are greater in the summer than in the winter. The most substantial increases in TN90p and TX90p occur in tropical and

| Duration indices
By definition, CSDI and WSDI also count days using the 10th and 90th percentiles as thresholds, respectively, similar to TN10p and TX90p, but with a restriction: consecutive for at least 6 days. This restriction makes CSDI and WSDI  much smaller than their corresponding TN10p and TX90p; most often, they are simply zeros. When globally averaged over the base period, CSDI and WSDI are very close to zero. Under the RCP scenarios, while CSDI reduces to zero, WSDI increases quickly. Following SKZZB, we use 1981-2000 as the reference period to look at changes in CSDI and WSDI as differences relative to the reference period. Figure 8a,b shows the multi-model medians and inter-quantile ranges of temporal changes in CSDI and WSDI, respectively. The median of CSDI decreases from above 0 days before 1980 to below 0 days after 2000, but within a very narrow range (approximately ±0.06 days), throughout all the historical and RCP scenarios. The median of WSDI increases from merely   120 days (AMZ). Once again, the most substantial increases in WSDI under RCP8.5 occur in AMZ, CAM, WAF, EAF and SEA, which is consistent with the increases in TX90p. In the CLIMDEX results (c.f., fig. S1 of SKZZB), patterns of changes are similar but with greater magnitudes; particularly, the global and regional WSDI under RCP8.5 increase up to 170 and 250 days (AMZ), respectively. Figure 9 shows the multi-model medians of temporally averaged changes in CSDI and WSDI over the period 2081-2100. Considerably different from the CLIMDEX results (c.f., fig. 7 of SKZZB) showing that the changes in CSDI and WSDI are significant everywhere over land under all RCP scenarios, only WSDI under RCP8.5 in our results shows substantial changes in the Tropics, especially in AMZ.

| CONCLUSIONS
We present a new data set of ETCCDI percentile-based temperature extremes indices for the historical and three RCP scenarios simulated by 19 CMIP5 GCMs, generated through the semi-parametric quantile regression approach, as an improved alternative to its CLIMDEX counterpart. The daily temperature data are decorrelated via the QRM (1) so that their percentiles are estimated upon the independence condition. Inhomogeneity is avoided not only in the annual ER series, but also in their linear trends. Particularly, the new ER results do not show significant seasonal variations during the base period if derived on a monthly basis. This feature is consistent with the construction of ER indices. All the six indices are derived by using the same set of percentile estimates for the base period so that they are consistent with each other. The new data set can authentically reveal the changes in temperature extremes from the climatological distribution point of view. Compared with its CLIMDEX counterpart, the new data set shows broadly the same change patterns in both the temporal evolutions and the regional variations, but with much smaller magnitudes. In the new (CLIMDEX) data set, TN90p increases to 17.3% (31%), 24.5% (44%) and 40.0% (69%), and TX90p increases to 16.9% (26%), 22.1% (39%) and 35.3% (62%), under RCP2.6, RCP4.5 and RCP8.5, respectively, by the end of 21st century. WSDI increases to 35 (170) days under RCP8.5. For the three increasing indices, the most significant changes occur in tropical and extratropical regions such as AMZ, CAM, WAF, EAF and SEA. The greatest increases in TN90p and TX90p occur in SEA and AMZ, up to 85% (99%) and 68% (89%) in the summer under RCP8.5; that in WSDI occurs in AMZ up to 120 (250) days under RCP8.5. In addition, unlike other types of indices that are directly based on observations, percentile-based indices change in a relatively mild way under the global warming, revealing that the climatological distributions are less susceptible to the global warming than observations. For a better understanding of the anthropogenic climate change, we need to study further the relationship between climatic observations and their underlying distributions.