Assessing changes in risk of amplified planetary waves in a warming world

Summer weather extremes are often associated with high‐amplitude atmospheric planetary waves (Petoukhov et al., 2013). Such conditions lead to stationary weather patterns, triggering heat waves and sometimes prolonged intense rainfall. These wave events, referred to as periods of Quasi‐Resonant Amplification (QRA), are relatively rare though and hence provide only a few data points in the meteorological record to analyse. Here, we use atmospheric models coupled to boundary conditions that have evolved slowly (i.e., climate), to supplement measurements. Specifically we assess altered probabilities of resonant episodes by employing a unique massive ensemble of atmosphere‐only climate simulations to populate statistical distributions of event occurrence. We focus on amplified waves during the two most extreme European heat waves on record, in years 2003 and 2015 (Russo et al., 2015). These years are compared with other modelled recent years (1987–2011), and critically against a modelled world without climate change. We find that there are differences in the statistical characteristics of wave event likelihood between years, suggesting a strong dependence on the known and prescribed Sea Surface Temperature (SST) patterns. The differences are larger than those projected to have occurred under climate change since the pre‐industrial period. However, this feature of small differences since pre‐industrial is based on single large ensembles, with members consisting of a range of estimates of SST adjustment from pre‐industrial to present. Such SST changes are from projections by a set of coupled atmosphere–ocean (AOGCM) climate models. When instead an ensemble for pre‐industrial estimates is subdivided into simulations according to which AOGCM the SST changes are based on, we find differences in QRA occurrence. These differences suggest that to reliably estimate changes to extremes associated with altered amplification of planetary waves, and under future raised greenhouse gas concentrations, likely requires reductions in any spread of future modelled SST patterns.

for pre-industrial estimates is subdivided into simulations according to which AOGCM the SST changes are based on, we find differences in QRA occurrence. These differences suggest that to reliably estimate changes to extremes associated with altered amplification of planetary waves, and under future raised greenhouse gas concentrations, likely requires reductions in any spread of future modelled SST patterns.

K E Y W O R D S
atmospheric models, climate change, heat waves, large ensembles, mid-latitude extremes, quasi-resonant amplification, Rossby waves

| INTRODUCTION
The implications of raised atmospheric greenhouse gas (GHG) concentrations for the climate system are the subject of intense scientific research (IPCC, 2013). Such research is strongly motivated by the need to assess changes to extreme weather event frequencies, which will have a disproportionate effect on society (e.g., Easterling et al., 2000). Observations and climate simulations indicate general warming since pre-industrial times, yet the statistical distribution of temperatures may also change shape, even asymmetrically (Kodra and Ganguly, 2014), and potentially in situations where the global average temperature is relatively unchanged. This is despite, when averaging globally, that overall variability might be relatively invariant (Huntingford et al., 2013). Any extension of the upper tails of the temperature distribution, alongside background warming, is of particular concern for heat stress risk. Higher GHG levels also cause an attributable intensification of the hydrological cycle (Min et al., 2011). Climate models project average atmospheric water vapour to increase by up to 3% per degree of global warming, although data-based constraints suggest this is an overestimation (DeAngelis et al., 2015). Changes to the intensity of extreme rainfall events is rising much faster, at nearer 7% per degree of warming (Westra et al., 2013), again indicating extremes exhibit different behaviour to mean variation.
For mid-latitude extremes, it is necessary to explore what may happen in a warmer climate, and if changes are distinct from the interannual variability of Sea-Surface Temperature (SST) and other boundary conditions (Hoskins and Woollings, 2015). Many report geographically varying changes to extreme temperature events (Herring et al., 2015;Russo et al., 2015;Herring et al., 2016;Seneviratne et al., 2016), suggesting processes beyond direct radiative warming, and again indicative of additional alteration to dynamics and circulation patterns (Coumou and Rahmstorf, 2012;Palmer, 2013). Particular to mid-latitudes, persistent weather patterns are often associated with near stationary and highly amplified Rossby waves (Francis and Vavrus, 2012). Stationary and potentially stationary waves, with high amplitude ridges and troughs, have caused dangerous heat waves or flooding (Woollings, 2010;Coumou et al., 2014;Huntingford et al., 2014;Petoukhov et al., 2016;Stadtherr et al., 2016), and more specifically in the Central United States, Western United States and Central Eurasia (Kornhuber et al., 2017a). One concern is whether, within a GHG-enriched atmosphere, such amplification will occur more often. Understanding this possibility has led to the concept of Quasi-Resonant Amplification (QRA) of zonally elongated Rossby waves at mid-latitudes, which can become persistent when trapped by waveguides and amplified by orographic and thermal gradients (Petoukhov et al., 2013). However, this may not alone explain the increase in summer extremes, as there is no significant increase observed in wave numbers six (Screen et al., 2013;Screen and Simmonds, 2013a) although evidence exists of increasing wavenumber seven events (Kornhuber et al., 2017b).
Clusters of QRA events have occurred in recent decades (Petoukhov et al., 2013;Coumou et al., 2014;Petoukhov et al., 2016;Kornhuber et al., 2017b). Atmospheric dynamics and thermal structure conditions for QRA are summarised by Petoukhov et al. (2013) and in our Supporting Information. The robust identification of changes in the prominence of particular wavelength Rossby waves is limited because the resonant behaviour is relatively rare. Others have questioned if we are entering a period of more resonant-type events (Screen and Simmonds, 2013b), but recommend the QRA hypothesis (Petoukhov et al., 2013) deserving of more analysis. One assessment (Mann et al., 2017) has been to investigate hemispheric features of surface temperature associated with long duration wavenumber seven QRA events for GCMs in the CMIP5 database (Taylor et al., 2012). In that analysis, the occurrence of QRA events is linked to a resolved surface temperature "fingerprint," determined from historical reanalysis data. Mann et al. (2017) suggest that zonal mean surface temperature conditions, based on CMIP5 projections, are increasingly favourable for QRA as atmospheric GHG concentrations increase.

| MODELLING FRAMEWORK
We take a novel approach to understanding QRA events, by scanning a massive ensemble of atmospheric climate simulations with the HadAM3P model (Massey et al., 2015). By using a large atmospheric model ensemble, it is possible to understand better the evolution of statistics describing extremes. For instance, Schaller et al. (2016) use this ensemble to demonstrate that changed atmospheric circulations since pre-industrial times likely played a role in the UK Winter 2013/2014 severe flood events, adding to direct thermodynamic influences. The HadAM3P atmospheric model is known to have a good representation of mid-latitude dynamical regimes (Mitchell et al., 2017) while the relatively high model resolution allows more confidence in the calculation of QRA statistics directly from the modelled atmospheric wind-fields. This combined atmospheric modelling plus QRA detection framework enables the importance of SSTs, meridional temperature gradients and anthropogenic global warming contribution to be tested for their impact on conditions for resonance occurrence. Extremes are by definition rare, and as measurement time series are short, there can only be low confidence in determining whether return times are changing. This problem is also relevant for most climate models, due to their limited years of simulations. One exception is our "climateprediction-dotnet" (CPDN) HadAM3P-based system, which has available massive ensembles of climate simulations. CPU cycles donated as a citizen-science project build these extensive sets of simulations. These large ensembles provide us with a vastly improved sampling of extreme events and enable indepth analysis of resonant mechanisms, their drivers and their likelihood of occurrence.
Wind and temperature fields from the CPDN ensemble are used to calculate QRA statistics based on the equations in Petoukhov et al. (2013). The particular form of the QRA detection scheme we utilise is described in Kornhuber et al. (2017b) and outlined in our Supporting Information. Table S1 provides the conditions for resonance. We use seven large ensembles, analysing the modelled summer months (JJA) in the Northern Hemisphere. Six ensembles are for the particular years of 2003, 2012 and 2015, for which many thousands of simulations are available. These are years of concern, as 2003 and 2015 correspond to two severe summer European heat waves, and year 2012 similarly for the United States. Each of these modelled years has two ensembles describing present day ("all forcings") and pre-industrial state ("natural"), where the latter represents an experiment designed to remove the anthropogenic influence on the climate system. The all forcings simulations use 5-day OSTIA sea-ice extent and SSTs (Donlon et al., 2012) for the given years. The natural calculations are forced with lower SSTs and more sea-ice, as estimates for a world in a pre-industrial state (along with pre-industrial estimates of atmospheric radiatively-active GHGs). The anthropogenic SST increase is estimated using the protocol described in Schaller et al. (2016), giving the monthly mean projected changes from pre-industrial to present day. Calculation is for 11 GCMs. For each month, the changes are derived from the mean SST values in model years [1996][1997][1998][1999][2000][2001][2002][2003][2004][2005], and as the difference between "Historical" and "HistoricalNat" GCM simulations in the CMIP5 database (Taylor et al., 2012). The "HistoricalNat" simulations contain no modelled anthropogenic influence on climate. Different estimates based on multiple GCMs are considered in order to represent current uncertainty in how SSTs have evolved since the preindustrial period. These SST changes are then subtracted from the yearly values of the all forcings simulations, to generate boundary conditions for the natural experiments. The seventh simulation is for individual yearly simulations representative of different years in period 1987 to 2011 inclusive. These simulations have smaller ensembles of order hundreds for each year and no parallel pre-industrial equivalent years. We call this the many-year all forcings ensemble.

| NUMERICAL RESULTS
In Figure 1 we provide visual evidence of performance capability by our atmospheric model HadAM3P, and by extending the existing analysis of Mitchell et al. (2017). QRA is strongly dependent on background wind fields, which in turn link tightly to atmospheric pressure distribution. Figure 1 presents modelled 300mb geopotential height for period JJA, and comparison is made against ERA-Interim data (Dee et al., 2011) for recent years. This figure provides evidence that CPDN performs well regarding estimating the gross features of the upper-tropospheric pressure distribution, at least on average. It has relatively low bias in predicting spatial variation in both Northern latitude summer mean 300mb height and in its daily standard deviation. The difference between CPDN and ERA-Interim compares well to the overall spread of uncertainty by the CMIP5 models at estimating 300mb height (Ao et al., 2015). Additionally, with QRA dependent on the strength and shape of the zonal mean of the zonal wind (see eq. 9 of Kornhuber et al., 2017a) and related meridional gradients in that quantity, we show the mean and spread of daily values of this wind diagnostic in Supporting Information Figure S1. Figure S1 is also for the JJA period, at 300 mb height, and for both CPDN and ERA-Interim. Additionally, in Figure S2, we compare the spread of wave amplitudes for wavenumbers 3-8, JJA period. In the most general sense, CPDN performs well in comparison to the ERA-Interim data of zonal mean zonal wind and of wave amplitudes. Figure 2 presents the CPDN (i.e., HadAM3P-based) estimates of QRA for different durations, and for all seven ensembles, as well as diagnosed from the ERA-Interim reanalysis product (Dee et al., 2011). If resonance is discovered in any wavenumber six to eight for a particular day, it is recorded as a single event. Hence if a long-duration wavenumber six event overlaps one of wavenumber eight, days in common are counted only once in the probability  . Also shown are the frequencies for estimated boundary conditions and forcings for pre-industrial (natural, orange). That is, representing those years but as if climate change has not occurred. Right-hand panels (b), (d), (f) and (h) repeat those of the left-hand panels, but only when QRA conditions are met, and the extreme resonance events are higher than the wave amplitude threshold identified in Petoukhov et al. (2013). All plots show resonance occurrence based on binning into durations of width 5 days. In all panels, the vertical axes are logarithmic to provide more focus on the very rare long duration QRA events statistics. Top panels show the probability of resonance events of different lengths for the ensemble representing multiple recent years, and for comparison, the ERA-Interim data. Uncertainty bounds (Supporting Information) are bigger for the re-analysis product due to fewer available data points. Although some differences exist in the respective distributions, there are broad similarities. CPDN uncertainty bounds are generally within those of ERA-Interim, indicating model reproducibility. This feature of the uncertainty bounds is valid for general QRA conditions (left panels) and QRA conditions with occurrence of amplified waves (right panels); definitions identical to that in Petoukhov et al. (2013), and see also Supporting Information. We then analyse our three specific years of 2003, 2012 and 2015, for which very large ensembles are available. Shown are frequencies of QRA occurrence by simulations with known SST and sea-ice extents (and prescribed atmospheric GHG concentrations) for those particular years, that is, from the all forcings ensemble. QRA frequencies for the ensembles representing pre-industrial conditions, that is, the natural simulations, are also presented. Figure 2 shows estimates for the complete natural ensemble, and hence each of the ensemble members uses one of a range of different ocean-atmosphere coupled GCMs (i.e., AOGCMs; Taylor et al., 2012) to estimate pre-industrial SST levels. In Figure 2c-h, there is almost no change in the probability of resonance between the all forcings ensemble versus the natural ensemble. This invariance is noted for different durations, year, and whether considering only conditions for QRA, or QRA conditions combined with high wave amplitudes above the thresholds utilised in Petoukhov et al. (2013). There is, however, some evidence that the atmospheric model underestimates the probabilities of the longest duration events, which could lead to projections of very persistent heat wave events that are too conservative. We investigate if there are factors that may cause different probabilities of QRA conditions and wave amplification. Figure 3a shows inter-annual differences, and for the manyyear all forcings ensemble that has order hundreds of members for each year in period 1987-2011. This figure illustrates that the probability of QRA conditions plus wave amplification has differences between modelled years, and  these are statistically significant as uncertainty bounds often do not overlap. The main modelled variation is inter-year differences in boundary conditions given to HadAM3P, which are the known SST and sea-ice conditions that correspond to each year. The implication is that variation in these boundary conditions can influence the chances of resonance. SST and sea-ice state influence near-surface conditions, and guided by this we first look for linkages between probability of QRA and features of zonal-mean temperature at 1.5 m, including over both ocean and land.
For the many-year all forcings ensemble covering period 1987-2011, we derive for each year the within-ensemble mean meridional gradient of 1.5 m zonal-mean temperature over land and ocean (K deg −1 ). The gradient is calculated based on its change over latitudes 25 N to 55 N, and the magnitude of these values relates to the horizontal axis of Figure 3b. The vertical axis is the probability of QRA conditions with wave amplification. Blue circles show each year, except for that of 2003, which is red. The dashed line is the result from a regression analysis performed between these two quantities. Two-sided confidence level bounds are marked as annotated text, and as these do not include zero, the gradient (value −0.808 [K(deg. lat) −1 ] −1 ) is significant at the 95% level. This regression provides additional evidence that features of near-surface temperature patterns affect QRA probability. Also plotted, for comparison, are the very large ensemble focus years of 2003, 2012 and 2015. The relative nearness for those years between modelled contemporary and pre-industrial times (marked by joined lines) again highlights the small changes in QRA probability between the all forcings and natural ensemble simulations.
The large ensembles that focus on individual years (i.e., 2003, 2012 and 2015) employ a range (11) of AOGCMbased calculations of pre-industrial SSTs to enable estimates of their equivalent natural state, as outlined above. As Figure 3b suggests QRA occurrence may have a dependence on near-surface temperature features, studying statistics for the entire natural ensembles could, therefore, suppress the impact of individual estimates of SST change from the different AOGCMs. That is, different estimates of SST changes may cause compensations when generating statistics that aggregate attributes from the individual ensemble members. This aggregation may create the small differences between estimated QRA occurrence for the all forcings numerical experiments and when compared to the full natural ensemble experiments. To test for this, the year 2015 natural ensemble is split into subsets and according to the particular AOGCM used to estimate SST changes since the pre-industrial period. Such subdivision allows characterisation of the effect of different estimates of the SST changes subtracted from their known year-2015 values to represent pre-industrial conditions (method of Schaller et al., 2016).
The two left-hand bars in Figure 3c,d are an aggregation of Figure 2g,h, showing the overall daily probability of QRA conditions (panel c) or QRA conditions and amplified wave occurrence (panel d). As expected, the two left-hand bars in Figure 3c are similar to each other, and with calculations using the full ensemble for the natural estimate. Likewise for Figure 3d. All the further right-hand orange bars are natural estimates of daily probability of QRA conditions or QRA conditions plus amplified waves, but for different AOGCM-based SST changes, with models as labelled. The nonoverlapping uncertainty bounds for many pairs confirms AOGCM-specific differences. This also provides evidence that there could be changes in QRA likelihood between preindustrial times and present (compare AOGCM-specific values to the red all forcings histogram bar). By implication, the coupled AOGCMs most capable of estimating SST changes over that period are more likely to capture any adjustments to the occurrence of QRA. Uncertainty in the evolution of SSTs is already known to affect the ability to predict future changes to other attributes of the atmosphere, for example, Ma and Xie (2013). The SST projections across the CMIP5 model ensemble form clusters of similar responses to altered GHG concentrations and atmospheric features reflect this classification (Mizuta et al., 2014). Additionally the fifth IPCC report notes that contemporary estimates of SSTs with AOGCMs often contain significant biases, dependent on location (Flato et al., 2013).
In Figure 4 we expand on Figure 3b, to identify how meridional gradients in zonal-mean 1.5 m temperature influence the probability of QRA conditions with wave amplification, depending on the latitudinal region considered. Presented are regression values based on QRA probabilities and mean temperature gradients calculated for individual years in period 1987-2011, that is, with the many-year all forcings ensemble. Hence shown in Figure 4a are regressions derived identically to the calculation of the gradient of the dashed line in Figure 3b, except now across latitudinal bands with various lower and upper values as marked. The selection of 1.5 m temperature allows disaggregation of zonal mean calculations, to over ocean only (Figure 4b) and over land only (Figure 4c). Figure 4 illustrates that the gradient can take a different sign, based on the region considered. Coumou et al. (2014) find that a decrease in meridional gradients favours QRA, which are positive values in our Figure 4. This diagram also demonstrates a particularly strong, and often statistically significant, dependence on 1.5 m meridional gradient when considering only land points.
In Figure 5 we further analyse the direct SST influence (rather than 1.5 m temperature) on likelihood of QRA conditions. Figure 5 shows the latitudinal distribution of the ensemble-mean meridional gradient in zonal-mean SST, plotted as the difference between when QRA conditions exist minus for when the conditions are not met. There is consistency, between the two simulations based on year 2015 (i.e., all forcing and natural estimates), and of the influence of SST gradient on QRA conditions. Similar levels of consistency are between the two year 2015-based ensembles and the many-year all forcings ensemble across multiple years of 1987-2011. However, there is very little correspondence between these curves and the identical calculations based on the ERA-Interim historical record (orange curve), noting the magnitude of variability is much lower in the latter. The general point to be made is that all curves have latitudinal locations where their values are away from the zero line. Hence during periods of meeting QRA conditions, on average meridional SST gradients will be slightly different to those when such conditions are not met. Representative uncertainty bounds are placed on the many-year all forcings curve, and these are extremely small due to massive sampling enabled by the large ensemble size (Methods).

| DISCUSSION AND CONCLUSIONS
Mid-latitude atmospheric QRA occurrences with associated high-amplitude waves (Petoukhov et al., 2013;Petoukhov et al., 2016) link to extreme warming and flood events, and therefore have strong societal impacts. They are relatively rare, preventing analysis from measurement records alone of any altered likelihood under climate change. Climate simulations can supplement measurements to construct statistical distributions of extremes, and estimate meteorological conditions for when very little data is available such as the pre- industrial period. Despite this potential offered by climate models, in general computational demands limit the number of GCM model years performed, even when in atmosphereonly mode. This prevents the population of statistical distributions characterising rare event frequency. We circumnavigate the lack of ability in research centres to undertake large numbers of simulations by instead using a unique massive ensemble of atmospheric simulations (Massey et al., 2015) donated from spare computational time on personal computers. For years 2003, 2012 and 2015, particularly large all forcings ensembles are available, as well as parallel natural ensembles describing pre-industrial conditions. The latter use modulated SST drivers, based on decadal mean climate change projections of alterations in such temperatures since pre-industrial times. These SST changes are derived from multiple coupled ocean-atmosphere AOGCMs in the CMIP5 ensemble (Taylor et al., 2012). We then apply the QRA-detection scheme (Kornhuber et al., 2017b) to diagnostics from the large ensembles, to determine when conditions for quasi-resonant amplification are present.
Comparing the full natural ensemble to that of all forcings shows that for June, July and August, the probability of QRA is similar in both ensembles. Such similarity would initially suggest that QRA occurrence is not affected by any climate change caused by historical increases in GHG concentrations. However, when analysing a further ensemble describing different years across the last three decades, then interannual variations are present in modelled resonance probability. Atmospheric model configuration differences for HadAM3P between years are predominantly altered SST and sea-ice fields, suggesting a sizeable dependence on them. Based on these noted variations by different boundary conditions, we return to the natural ensemble relevant to year 2015. This leads us to split the full natural ensemble into subsets, indexed by the GCM used to estimate SST changes under climate change and since the pre-industrial period. We find that for some GCM-specific modelled SST change estimates, QRA conditions with amplified wave occurrence increases (Figure 3c,d). For others, we find the opposite. We conclude from these differences, and through our use of large ensembles of atmospheric simulations, that QRA occurrence can change in response to raised GHG concentrations altering features of SST and sea-ice patterns.
The dependence of QRA on near-surface thermal conditions (further illustrated in Figures 4 and 5) has an analogy to links identified between long-duration resonance likelihood and both land and ocean temperature gradients (Mann et al., 2017). In particular, Mann et al. (2017) argue that the emerging effects of strong land-ocean temperature contrasts may also impact on QRA occurrence, and this has relevance to our land-based Figure 4c. Coupled climate models project particularly large and latitude-dependent land warming in high Northern latitudes compared to background global warming amounts (Huntingford and Mercado, 2016). Hence any future analysis of the SST-driven atmospheric-only CPDN simulations could include a rigorous assessment of their performance at projecting temperature changes over high latitude land points, and including their representation of terrestrial surface energy partitioning.
Our analysis highlights the importance of reducing uncertainty in projecting the response of SSTs to increasing greenhouse gas concentrations by coupled atmosphere-ocean models. Improving understanding of SST changes will, therefore, help towards generating more robust estimates of future resonance occurrence, as well as providing a better interpretation of historical changes. Careful consideration of the method of emergent constraints (Hall et al., 2019) may lead to a contemporary fluctuating and measurable quantity that can differentiate between reliable and unreliable estimates of future SST warming patterns. Similarly, any new techniques to aid the more accurate reconstruction of historical sea-ice extent will enhance understanding through providing refined driving conditions to atmospheric climate models. Better estimates of sea-ice extent will also reduce the risk of numerical inconsistencies at their bounds with SST fields. The dependence of probability of QRA on features of 1.5 m temperature over land demonstrate the importance of accurate depiction of any teleconnections linking terrestrial warming to projected SST changes in climate models.
P002099/1). The thoughtful advice of the anonymous reviewers led to significant improvements to the manuscript.

CONFLICT OF INTEREST
All authors confirm they have no conflict of interest.

DATA AND CODE AVAILABILITY
The large ensembles used in this study are, in total, of order 10TB. Upon request, these will be made available by copying on to supplied external hard drives, for returning by mail. Computer code used throughout the analysis is available from C.H., except for the QRA analysis code, which is available from K.K.