Does increased atmospheric resolution improve seasonal climate predictions?

We assess the impact of atmospheric horizontal resolution on the prediction skill and fidelity of seasonal forecasts. We show the response to an increase of atmospheric resolution from 0.8 to 0.3° horizontal grid spacing in parallel ensembles of forecasts. Changes in the prediction skill of major modes of tropical El Niño Southern Oscillation (ENSO) and extratropical North Atlantic Oscillation (NAO) variability are small and not detected and there is no discernible impact on the weak signal-to-noise ratio in seasonal predictions of the winter NAO at this range of resolutions. Although studies have shown improvements in the simulation of tropical cyclones as model resolution is increased, we find little impact on seasonal prediction skill of either their numbers or intensity. Over this range of resolutions it appears that the benefit of increasing atmospheric resolution to seasonal climate predictions is minimal. However, at yet finer scales there appears to be increased eddy feedback which could strengthen weak signals in predictions of the NAO. Until prediction systems can be run operationally at these scales, it may be better to use additional computing resources for other enhancements such as increased ensemble size, for which there is a clear benefit in extratropical seasonal prediction skill.

Here we compare results from two parallel ensembles of seasonal hindcasts which are identical in all respects except atmospheric resolution, which is more than doubled in our tests. We compare the prediction skill of the main modes of tropical and extratropical variability and also tropical cyclones, all of which are known to exhibit seasonal predictability. We also examine eddy feedback on the Atlantic jet stream at different resolutions as this is one of the likely mechanisms by which increased resolution could improve simulated climate.

| STANDARD AND HIGHER-RESOLUTION PREDICTION SYSTEMS
Our hindcasts are taken from the GloSea5 prediction system which is used to make real-time seasonal forecasts at the UK Met Office. We examine hindcasts for the boreal summers from 1993 to 2015 and the neighbouring boreal winters from 1994 to 2016. Lagged ensembles are formed from three start dates centred on May 1 for JJA predictions and November 1 for DJF predictions. For DJF we use hindcast start dates of October 25, November 1 and November 9, while for summer we use April 25, May 1 and May 9. Further ensemble member perturbations are generated with a stochastic physics scheme as described in MacLachlan et al. (2014). We use a total of 21 ensemble members for each season. The system and model components are as described in MacLachlan et al. (2014) with initialisation of the ocean, atmosphere and sea ice. Ocean resolution is 0.25° in latitude and longitude throughout the globe with 75 quasi-horizontal ocean levels. The atmospheric component contains 85 quasi-horizontal levels and is run at two horizontal resolutions for this study: standard horizontal resolution is 0.83° while higher horizontal resolution is 0.35° in longitude.
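The lagged ensemble construction described above can be sketched as follows. This is only an illustration: the text gives three start dates and 21 members in total, so an even split of seven stochastic-physics members per start date is an assumption, and the function names are hypothetical rather than part of GloSea5.

```python
from datetime import date
from itertools import product

def djf_start_dates(start_year):
    """Start dates quoted in the text for winter (DJF) hindcasts."""
    return [date(start_year, 10, 25), date(start_year, 11, 1), date(start_year, 11, 9)]

def jja_start_dates(start_year):
    """Start dates quoted in the text for summer (JJA) hindcasts."""
    return [date(start_year, 4, 25), date(start_year, 5, 1), date(start_year, 5, 9)]

def lagged_ensemble(start_dates, members_per_start):
    """Pair every start date with every perturbed member index."""
    return list(product(start_dates, range(members_per_start)))

# ASSUMPTION: 21 members split evenly as 7 per start date.
winter_members = lagged_ensemble(djf_start_dates(1993), members_per_start=7)
summer_members = lagged_ensemble(jja_start_dates(1993), members_per_start=7)
```

Each entry is a (start date, member index) pair, so a hindcast for one season is identified by 21 such pairs.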

| PREDICTION SKILL OF ENSO
Seasonal predictability of El Niño Southern Oscillation (ENSO) has been well established for many years (Cane et al., 1986) and the high predictability of ENSO is now a cornerstone of current seasonal prediction capability (Smith et al., 2012). Nevertheless, outstanding questions remain (Tang et al., 2018), for example regarding the predictability of different ENSO types (Imada et al., 2015; Ren et al., 2019), or increases in skill as models improve (e.g., Luo et al., 2008). Although comprehensive models now appear to have the edge over simpler prediction models, there is also variation in prediction skill over time (Barnston et al., 2012). This variation may simply be due to low-frequency fluctuations in the amount of ENSO activity itself (Chen et al., 2004) so here we examine seasonal prediction skill for ENSO in our two resolutions for the same set of boreal winter and summer seasons. Figure 1 shows ensemble ENSO predictions for the standard and higher-resolution models. Both models produce slightly overactive predictions, with ENSO anomalies in both warm El Niño and cold La Niña cases exceeding the observed amplitude. The predictions are also overconfident in the sense that the ensemble members cluster around the ensemble mean but do not span the observed values in some cases, as is well known for tropical seasonal predictions (Weisheimer and Palmer, 2014). Comparison of the standard and high-resolution cases gives a very clear message: the skill scores are almost identical in the two systems, albeit with slightly lower skill in summer than winter, consistent with the lower amplitude anomalies in summer. Similarly, the overprediction of the strength of anomalies and the overconfidence of the ensembles are very similar across the two resolutions. Seasonal prediction skill for ENSO therefore appears to be insensitive to the doubling of atmospheric model resolution tested here.
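The two diagnostics discussed here, correlation skill of the ensemble mean and overconfidence of the ensemble, can be illustrated with a minimal sketch. It assumes a hindcast array of shape (years, members) and an observed index of shape (years,). The function names and the synthetic data are hypothetical and are not taken from the hindcasts analysed in this study.

```python
import numpy as np

def ensemble_mean_correlation(hindcast, observed):
    """Pearson correlation between the ensemble mean and the observations.

    hindcast: array (n_years, n_members) of index anomalies
    observed: array (n_years,) of observed anomalies
    """
    ens_mean = hindcast.mean(axis=1)
    return float(np.corrcoef(ens_mean, observed)[0, 1])

def fraction_outside_envelope(hindcast, observed):
    """Fraction of years in which the observation falls outside the range
    spanned by the ensemble members (a crude overconfidence check)."""
    below = observed < hindcast.min(axis=1)
    above = observed > hindcast.max(axis=1)
    return float(np.mean(below | above))

# Synthetic illustration only (NOT real Nino3.4 data): an overactive,
# tightly clustered ensemble around a shared predictable signal.
rng = np.random.default_rng(0)
n_years, n_members = 23, 21
signal = rng.normal(0.0, 1.0, n_years)
observed = signal + rng.normal(0.0, 0.3, n_years)
hindcast = 1.2 * signal[:, None] + rng.normal(0.0, 0.2, (n_years, n_members))

skill = ensemble_mean_correlation(hindcast, observed)
outside = fraction_outside_envelope(hindcast, observed)
```

In this toy setup a high correlation can coexist with many observations lying outside the ensemble envelope, which is the combination of skill and overconfidence described in the text.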

| PREDICTION SKILL OF TROPICAL CYCLONES
Seasonal forecasting of tropical cyclone numbers and activity has been carried out using empirical-statistical methods based on ENSO and other factors for many years (Klotzbach et al., 2017). Over the last two decades, coupled ocean-atmosphere general circulation models have been shown to skilfully predict tropical cyclones on seasonal timescales (Vitart et al., Vitart and Stockdale, 2001) and these are now improving to show skill on specific regional scales (Vecchi et al., 2014), with some skill in landfall predictions (Camp et al., 2015). As the record of events grows, there is also a growing number of successes in real-time prediction of extreme tropical cyclone seasons (e.g., Camp et al., 2018).
Given their horizontal scale and structure, there is sensitivity of tropical cyclone simulations and short range forecasts to horizontal model resolution (e.g., Davis et al., 2010; Gopalakrishnan et al., 2012) and so we examine here whether there is also a sensitivity of seasonal prediction skill to resolution for the commonly used metrics of storm numbers and accumulated cyclone energy.
FIGURE 1 Seasonal prediction skill of ENSO at different atmospheric resolutions. Ensemble predictions are shown for standard resolution (left) and higher resolution (right) for summer (JJA, upper) and winter (DJF, lower). Anomalies in Niño3.4 are plotted for the observations (black), the ensemble mean (red line) and ensemble members (red dots). Correlation skill scores are shown for each case and observed indices are from HadISST1.1 (Rayner et al., 2003).
FIGURE 2 Bias in tropical cyclone track frequency. The climatological bias in tropical cyclone track frequency is plotted for standard resolution (upper) and higher resolution (lower) hindcasts. Differences between the forecast and observations are plotted for the June-November period 1993-2015. Observed tropical cyclone data for the North Atlantic and East Pacific are taken from HURDAT2 (Landsea and Franklin, 2013); for all remaining basins, data are from the Joint Typhoon Warning Centre best-track database (Chu et al., 2002).
We first examine the climatology of storm numbers in hindcasts. Figure 2 shows the difference between simulated and observed storm track frequency. The standard resolution model produces a reasonable climatology of storm numbers in the Atlantic basin but overestimates the number in the Pacific. Although this bias may depend to some extent on the choice of tracking method, we used the same method on the higher-resolution storms to give a fair comparison (see Camp et al., 2015). In this case increased resolution gives a small improvement in storm numbers over the Atlantic, but the excess of storms in the Pacific actually increases and there is no improvement to the overestimate of storm numbers in the North Indian Ocean, so there is little overall benefit to climatological storm track frequency over the range of resolutions considered here. Table 1 shows the skill of standard resolution hindcasts compared to the skill from higher-resolution hindcasts for each tropical ocean basin and the Northern Hemisphere as a whole. Positive skill in storm numbers is found in all basins (except for the North Indian Ocean) and at both resolutions, with highly significant and potentially useful levels of skill in the Atlantic and East Pacific. However, we find no statistically significant change in skill from the doubling of resolution between standard and higher resolution. Similar results follow for accumulated cyclone energy (Table 2) where again, good skill levels are found in the Atlantic, the East Pacific and in this case, also the West Pacific. However, as with cyclone numbers, there is no significant change in skill with doubled atmospheric resolution, suggesting that any benefits to seasonal prediction of tropical storms from increased resolution, at least in the range considered here, are likely to be small.
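For readers unfamiliar with the second metric, accumulated cyclone energy (ACE) is conventionally defined as 10^-4 times the sum of squared 6-hourly maximum sustained wind speeds, in knots, over the times when a system is at least of tropical storm strength. The sketch below follows that convention. The 35 kt threshold (some datasets use 34 kt), the function names and the wind values are assumptions for illustration and do not reproduce the tracking used in this study.

```python
def storm_ace(vmax_knots, threshold=35.0):
    """ACE (units of 1e4 kt^2) for one storm from 6-hourly maximum winds.

    Only records at or above tropical storm strength contribute.
    """
    return 1e-4 * sum(v * v for v in vmax_knots if v >= threshold)

def seasonal_ace(storms, threshold=35.0):
    """Seasonal ACE as the sum of ACE over all storms in a basin."""
    return sum(storm_ace(track, threshold) for track in storms)

# Hypothetical 6-hourly wind records (knots) for two storms.
storm_a = [25, 30, 35, 45, 60, 50, 30]
storm_b = [30, 40, 40, 30]

ace_a = storm_ace(storm_a)                 # uses 35, 45, 60 and 50 kt only
season = seasonal_ace([storm_a, storm_b])
```

Because the winds are squared, a few intense records dominate the total, which is why ACE is a measure of activity rather than of storm count.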

| PREDICTION SKILL OF THE NAO
The GloSea5 system produces skilful forecasts of the winter North Atlantic Oscillation (NAO) from initial conditions in early November and in recent years, multiple seasonal forecast systems have been shown to produce skilful predictions of the winter NAO at this lead time (Athanasiadis et al., 2014; Butler et al., 2016; Baker et al., 2018). Figure 3 shows ensemble hindcasts of the winter NAO at the standard and higher resolutions. The correlation scores exceed 0.5 for the standard resolution and are similar to the skill reported by Scaife et al. (2014). There is no improvement in these scores at the higher resolution and correlation is nominally lower, although the value is within the uncertainty and therefore not significantly different (cf., Siegert et al., 2016).
While seasonal predictions of the NAO have been demonstrated to contain significant skill, there is a prominent outstanding problem with the amplitude of the forecast NAO signals (Scaife et al., 2014). This results in ensemble predictions that contain inherently low levels of predictability in the sense that they are unable to skilfully predict single ensemble members and yet, they are still able to predict the observed NAO. This so-called signal-to-noise paradox is present in different forecast systems (Baker et al., 2018) and could in principle be due to the limited resolution of these systems. To test whether the signal-to-noise ratio of the standard and higher-resolution systems is the same, we compare the predictable (ensemble mean) and total (ensemble member) standard deviation of the NAO in our two sets of hindcasts. For the standard resolution case, the ensemble mean standard deviation is 2.5 hPa and the ensemble member standard deviation is 8.3 hPa, giving a ratio of around 0.3. For the higher-resolution case, the ensemble mean has a standard deviation of 2.3 hPa and the ensemble members have a standard deviation of 7.8 hPa. In this case the ratio remains 0.3 and so we detect no increase in the signal-to-noise ratio as the horizontal resolution of the forecast system is doubled. At least over the range considered here, the under-prediction of the strength of ensemble mean NAO signals noted in other studies of ensemble seasonal hindcasts is therefore insensitive to the horizontal atmospheric resolution of our prediction system.
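The ratio used here can be written down in a few lines. The sketch below first checks the arithmetic on the standard deviations quoted in the text and then shows one way the ratio might be computed from a hindcast array of shape (years, members). Taking the total standard deviation over all members and years pooled together is an assumption about the exact definition, and the function name is hypothetical.

```python
import numpy as np

def signal_to_noise_ratio(hindcast):
    """Ratio of ensemble-mean to ensemble-member standard deviation.

    hindcast: array (n_years, n_members) of NAO index values (e.g. hPa).
    ASSUMPTION: the 'total' standard deviation pools all members and years.
    """
    anomalies = hindcast - hindcast.mean()
    signal_sd = anomalies.mean(axis=1).std()   # predictable (ensemble mean) part
    total_sd = anomalies.std()                 # all individual members
    return float(signal_sd / total_sd)

# Arithmetic on the values quoted in the text (hPa).
ratio_standard = 2.5 / 8.3   # standard resolution
ratio_higher = 2.3 / 7.8     # higher resolution
```

Both quoted ratios round to 0.3, which is the basis for the statement that no increase in signal-to-noise ratio is detected.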

| EFFECT OF HORIZONTAL RESOLUTION ON SMALL SCALE EDDY FEEDBACK
As resolution increases, we can expect to resolve more of the mesoscale eddy spectrum of the atmosphere which falls off relatively slowly with wavenumber (~k^{-5/3}) below scales of 100 km or so (Nastrom et al., 1984). Eddies on this scale might be better resolved in our higher-resolution hindcasts. These high-frequency mesoscale eddies have also been shown to feed back positively onto larger scale anomalous flows in the atmosphere (Lau, 1988; Robinson, 1991; Feldstein and Lee, 1998; Kug and Jin, 2009; Kang et al., 2010) and so it might be possible to increase the strength of the NAO signals discussed above if resolution were increased to the point where mesoscale eddy feedback was better resolved. We first examine eddy feedback on the Atlantic jet stream in our standard and higher-resolution hindcasts. Figure 4 shows the mean zonal winds and their biases relative to observational reanalysis winds in the standard and higher-resolution hindcasts for DJF.
FIGURE 3 Seasonal prediction skill of the NAO at different atmospheric resolutions. Ensemble predictions are shown for standard resolution (left) and higher resolution (right) for winter (DJF) NAO predictions. Anomalies in the NAO sea level pressure index between Iceland and Azores are plotted for the observations (black), the ensemble mean (red line) and ensemble members (red dots). Observed indices are taken from HadSLP2 (Allan and Ansell, 2006).
FIGURE 4 Bias in the North Atlantic winter jet and eddy momentum forcing. Climatological bias of the North Atlantic zonal wind at 850 hPa (m/s, upper panels) and eddy momentum flux convergence at 200 hPa (m/s/day, lower panels). Standard resolution is shown on the left and higher resolution on the right. Black contours are mean climatology from observational reanalysis (Dee et al., 2011) and shading shows the difference between predictions and observational analysis. The right column displays zonal averages across the Atlantic basin.
Both resolutions show a poleward bias in the jet location compared to observational reanalyses (Figure 4c). We also compare the high-frequency eddy momentum flux convergence in both sets of hindcasts: $-\partial(\overline{u'v'})/\partial y$, where u' and v' are the transient components of the wind using 6 hr data and the overbar indicates the time mean. This quantity feeds positively into the momentum budget of the mid latitude jets as shown in Figure 4d,e but consistent with the bias in mean winds there is also a corresponding lack of eddy momentum flux convergence, which is too weak in both our standard and higher-resolution model ensembles (Figure 4e). Figure 4e also shows that there is only a small improvement as resolution is increased over the range considered here. This result is consistent with weak eddy forcing in some other models (e.g., Willison et al., 2013; Lu et al., 2015). So if the eddy forcing is too weak in our models and if it is also insensitive to the resolution range tested in our hindcasts, then perhaps we need to increase resolution further to better represent the mesoscale eddy activity? Indeed, recent evidence suggests that there may be some sensitivity of the eddy driven jet to resolution (Lu et al., 2015) and sensitivity of simulated extratropical cyclones to resolution at the ~10 km scale (Sheldon et al., 2017). Unfortunately, although global models are now being developed at this latter scale and beyond, we do not have seasonal hindcasts at this higher resolution due to their computational cost. However, we do have a sample of global atmosphere-only simulations of 15 winters at our highest (0.14°) atmospheric resolution, from which we can calculate high-frequency eddy feedback. For consistency we compare these with parallel sets of atmosphere-only simulations at our standard and higher resolutions and a set of lower-resolution simulations, each of which has 192 winters.
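A minimal sketch of how an eddy momentum flux convergence of this kind might be computed from 6-hourly winds is given below. It makes several assumptions that are not stated in the text: transients are taken as deviations from the time mean with no further filtering, a Cartesian approximation y = a·φ is used in place of the full spherical form, and the winds are given on a single pressure level as arrays of shape (time, latitude). The function name is hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6
SECONDS_PER_DAY = 86400.0

def eddy_momentum_flux_convergence(u, v, lat_deg):
    """Return -d(mean(u'v'))/dy in m/s/day as a function of latitude.

    u, v: arrays (n_time, n_lat) of zonal and meridional wind in m/s
    lat_deg: array (n_lat,) of latitudes in degrees
    ASSUMPTION: Cartesian approximation, transients = deviations from time mean.
    """
    u_prime = u - u.mean(axis=0)
    v_prime = v - v.mean(axis=0)
    flux = (u_prime * v_prime).mean(axis=0)        # time-mean u'v' (m^2/s^2)
    y = EARTH_RADIUS_M * np.deg2rad(lat_deg)       # meridional distance (m)
    return -np.gradient(flux, y) * SECONDS_PER_DAY

# Toy input: an eddy momentum flux that increases linearly with latitude,
# which should give a uniform negative convergence (i.e. a divergence).
lat = np.array([30.0, 40.0, 50.0, 60.0])
amplitude = 1e-6 * EARTH_RADIUS_M * np.deg2rad(lat)
u_toy = np.array([amplitude, -amplitude])
v_toy = np.array([np.ones(4), -np.ones(4)])
conv_toy = eddy_momentum_flux_convergence(u_toy, v_toy, lat)
```

A positive value at the latitude of the jet indicates that the transient eddies are accelerating the mean westerlies there, which is the sense in which the text describes the forcing as too weak.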
In order to relate results back to the strength of NAO anomalies, which are known to be at least partly driven by synoptic eddy feedback (Limpasuvan and Hartmann, 2000; Ren et al., 2009), we calculate the synoptic eddy vorticity forcing following Yu and Lin (2016): $F = \frac{f}{g_0}\nabla^{-2}\left[-\nabla\cdot\overline{\mathbf{V}'\zeta'}\right]$, where f is the Coriolis parameter, g_0 is the standard acceleration due to gravity, V is the horizontal wind vector and ζ is relative vorticity. Primes indicate that a 2-8-day band-pass filter was applied to the 6-hourly data and the overbar indicates the time average. We then regress this feedback onto the centres of action of the NAO to give the anomalous eddy forcing per standard deviation of the NAO in our winter simulations. Figure 5 shows the amount of eddy vorticity forcing and also the regressed amount of upper tropospheric geopotential height per unit change of the NAO for simulations at the lower (1.88°), standard (0.83°), higher (0.35°) and highest (0.14°) resolutions. The same quantities are also plotted for observational reanalyses, which we can consider to have very high resolution.
FIGURE 5 North Atlantic eddy feedback as a function of resolution. Modelled eddy vorticity forcing (left) and geopotential height (right) per unit standard deviation of the NAO. Values are calculated by regression onto the NAO index. Observational reanalysis shows higher values than found in models, especially for low-resolution models and the values only approach convergence at N1280 (0.14°) resolution or finer. The horizontal scale is proportional to grid box size.
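The regression step, which expresses a quantity per standard deviation of the NAO, can be sketched as below. The sketch assumes one value of the field per winter at each grid point and a winter-mean NAO index, and it uses a simple least-squares slope against the standardised index. The function name and the toy numbers are hypothetical, and the computation of the eddy forcing itself is not reproduced.

```python
import numpy as np

def regress_per_nao_sd(field, nao_index):
    """Least-squares slope of `field` per standard deviation of the NAO.

    field: array (n_winters, ...) e.g. eddy forcing or geopotential height
    nao_index: array (n_winters,) winter-mean NAO index
    """
    nao_std = (nao_index - nao_index.mean()) / nao_index.std()
    field_anom = field - field.mean(axis=0)
    # The standardised index has unit variance, so the slope is the covariance.
    return np.tensordot(nao_std, field_anom, axes=(0, 0)) / nao_index.size

# Toy example: a field that responds linearly to the NAO at two grid points.
nao = np.array([1.0, 2.0, 3.0, 4.0])
toy_field = 3.0 * nao[:, None] * np.ones((1, 2)) + 5.0
slope = regress_per_nao_sd(toy_field, nao)
```

Applying the same function to the eddy forcing and to upper tropospheric geopotential height gives values that can be compared across resolutions on a common "per unit NAO" basis.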
The comparison of standard (0.83°) and lower (1.88°) resolution simulations yields a similar result to that in Figure 4, with little change in eddy feedback between resolutions. This feedback is substantially weaker than in observational reanalyses in both cases and is weakest at the lower resolution. Interestingly, an increase of resolution to 0.14° shows an increase in eddy forcing (Figure 5a) and a corresponding significant increase in geopotential height signatures in the NAO (Figure 5b). Although it is still lower than the observed value, the strength of the eddy feedback at this resolution and the relationship between upper tropospheric geopotential height and the surface NAO approach the values seen in observational analyses. It seems that this underrepresentation of eddy feedbacks may therefore be important for large scale features in the atmosphere and the strength of predicted NAO signals. Significantly higher resolution than is currently used for seasonal predictions may therefore be needed to correct this error.

| CONCLUSIONS
We have examined the effects of more than doubling atmospheric resolution, over a range of 0.83° to 0.35°, on seasonal prediction skill in parallel ensemble hindcasts. Although no study can carry out a fully comprehensive analysis of seasonal hindcast skill, we examined the skill of the main tropical (ENSO) and extratropical (NAO) modes of variability and what are arguably the most devastating hydrometeorological extreme events (tropical cyclones), all of which have previously been shown to exhibit predictability on seasonal timescales.
There are still questions about whether current general circulation models converge to realistic solutions as resolution is increased (e.g., Gustafson Jr. et al., 2014) but benefits of increasing atmospheric resolution beyond the 1° scale have previously been shown for seasonal prediction of surface climate, tropical cyclones (Chen and Lin, 2013) and extratropical storm tracks. Here we further increase atmospheric resolution and find that doubling from around 0.8° to around 0.3° resolution is not enough to make a large difference to seasonal prediction skill, as indeed has been shown for simulation of atmospheric features such as blocking and ocean-atmosphere interaction (Schiemann et al., 2017; Sheldon et al., 2017; Wan et al., 2018). However, we did find evidence for increased eddy feedback at yet higher resolution, which appears to feed back positively onto the NAO. This suggests that significantly higher resolution than is currently available for operational predictions may be required to increase the strength of predicted signals in seasonal forecasts of the NAO and resolve the signal-to-noise paradox in current climate simulations.
It is important to note that other aspects of resolution such as ocean resolution or vertical atmospheric resolution were not tested in this study. Nevertheless, there is evidence that improved ocean resolution can improve the fidelity of simulations relevant to seasonal prediction (Scaife et al., 2011; Hewitt et al., 2017). In particular, higher ocean resolution may increase the strength of ocean-atmosphere coupling (Minobe et al., 2008; Kirtman et al., 2017), although there is as yet limited evidence for increased prediction skill from increased ocean resolution. Similarly, existing literature shows that improved vertical resolution and domain can improve long range forecasts (Marshall and Scaife, 2010; Fereday et al., 2012; Sigmond et al., 2013), not least due to the effects of the stratosphere on extratropical prediction skill of the NAO, but this was also not tested here.
In summary, we find little impact of a doubling of atmospheric horizontal resolution from 0.83° to 0.35° on seasonal predictions of ENSO, the NAO or tropical cyclones. However, we did find evidence for increased feedback from small scale eddies at much higher resolution that could strengthen predicted signals in seasonal forecasts of the NAO. Until this resolution can be implemented in ensemble prediction systems, it may be better to increase the ensemble size, vertical resolution or perhaps ocean resolution to improve the skill of global operational seasonal prediction systems.