Seasonal forecast skill for extratropical cyclones and windstorms

Extratropical cyclones and their associated extreme wind speeds are a major cause of vast damage and large insured losses in several European countries. Reliable seasonal predictions of severe extratropical winter cyclones and associated windstorms would thus have great social and economic benefits, especially in the insurance sector. We analyse the climatological representation and assess the seasonal prediction skill of wintertime extratropical cyclones and windstorms in three multi‐member seasonal prediction systems: ECMWF‐System3, ECMWF‐System4 and Met Office‐GloSea5, based on hindcasts over a 20‐year period (1992–2011).


INTRODUCTION
Extratropical cyclones can produce high wind speeds near the surface, damaging physical structures and causing fatalities and enormous financial losses. Most insured losses over Europe are related to winter storm events (Munich Re, 2017). Strong midlatitude cyclones affecting Europe form when baroclinic disturbances over the North Atlantic undergo rapid intensification, leading to a fall in surface pressure and steep pressure gradients. In conjunction with their related frontal structures, these intense cyclones can produce extremely high surface wind speeds over a large footprint region. Across Europe, the average number of damaging cyclones and windstorms varies by region, for example, from 7 to 10 per season for the north of Scotland, to 2 to 3 over central Europe. Nevertheless, their frequency varies greatly from year to year, and many studies have used climatological information to try to understand the causes of this variability. The important role of large-scale atmospheric variability modes has been known for many years, although it is not fully understood, e.g. the steering influence of the North Atlantic Oscillation (NAO: e.g. Hurrell and Deser, 2009; Donat et al., 2010; Renggli et al., 2011). Recent studies also highlight the influence of other important modes like the East Atlantic pattern (EA) or the Scandinavian pattern on observed windstorm activity over different regions of Europe (Walz et al., 2018a). The skill in forecasting seasonal windstorm frequency is therefore determined by the models' ability to simulate both the large-scale modes themselves and the link between the large-scale modes and windstorm activity. A recent study investigated the role of large-scale drivers in steering windstorm activity over Europe using the European Centre for Medium-Range Weather Forecasts (ECMWF) System 4 seasonal forecasts (Walz et al., 2018b).
A skilful forecast of the severity of the coming season, in terms of windstorm occurrence, would be useful for many applications, including disaster management preparation, mid- to short-term planning of business operations, and military planning.
So far, only a few studies have directly investigated the skill of extreme event forecasts with lead times beyond a couple of weeks. The earliest one known to the authors is a study of the ENSEMBLES and DEMETER seasonal hindcast experiments. In these early seasonal forecast systems, that study found small but significant skill for extratropical cyclone-related windstorm frequency over parts of central western Europe. One core finding was that, for forecasts initialised in November, winter seasons with enhanced storm frequency are better predicted than normal storm seasons. More recent studies investigating the Met Office GloSea5 hindcast dataset, based on the HadGEM3-GA3 model, have revealed a step-change in midlatitude seasonal forecasting skill, demonstrating that this forecast system now shows significant, usable skill for the major climate variability mode over Europe, the NAO (Scaife et al., 2014; Palin et al., 2016; Clark et al., 2017). Stationary Rossby waves, triggered by tropical convection, have been proposed as a potential mechanism for this NAO skill (Trenberth and Fasullo, 2012; Trenberth et al., 2014; Scaife et al., 2017).
This study investigates the extent to which the modern seasonal forecast systems, from the ECMWF and the Met Office Hadley Centre, are capable of forecasting the frequency of extratropical cyclones and windstorms over the Northern Hemisphere. We analyse the skill of forecasts starting around the beginning of November for the upcoming winter season (December-February). In addition, the skill of forecasting windstorm occurrence in the North Atlantic/European region using a prediction of the NAO and the relationship between the NAO and windstorm frequency is explored.
We describe the data used in Section 2. Cyclone and windstorm tracking schemes are explained in the methods in Section 3. Results are presented in Section 4, with subsections assessing the models' ability to simulate the climatological distribution and interannual variability of cyclones and windstorms in the Northern Hemisphere, and whether an NAO-based forecast is beneficial. We conclude with a summary and discussion in Section 5.

DATA
We use four datasets in this study: a pseudo-observational dataset (ERA-Interim reanalysis: Dee et al., 2011) with a horizontal resolution of TL255 (≈0.7°) and 60 vertical levels, and three seasonal hindcast datasets: ECMWF System 3 (hereafter ECMWF-S3: Anderson et al., 2007) with a resolution of TL159 (1.125°) and 62 vertical levels, ECMWF System 4 (hereafter ECMWF-S4: Molteni et al., 2011) with a resolution of TL255 (≈0.7°) and 91 vertical levels, and GloSea5 (MacLachlan et al., 2015) using an N216 grid (≈0.8° in longitude; ≈0.5° in latitude) with 85 vertical levels. The time period investigated, common to all datasets, comprises 20 full winters from 1992/1993 to 2011/2012. When we refer to, for example, winter 1992, we consider the three months from December 1992 to February 1993. We use 6-hourly mean-sea-level pressure to identify cyclones, and 12-hourly wind speeds at 925 hPa to identify windstorms (see next section). The ECMWF hindcasts are initialized on 1 November, whereas the GloSea5 hindcasts are started on 25 October, 1 November and 9 November, with eight realizations for each start date. All seasonal prediction systems are ensemble systems, each with a different number of individual realizations: 41 members in ECMWF-S3, 51 members in ECMWF-S4 and 24 members in GloSea5. Cyclone and windstorm identification and tracking are performed on each ensemble member. The skill assessment is undertaken on the ensemble mean, where each ensemble member has equal weight. The results obtained from the seasonal forecast models therefore appear smoother than the single realization of the reanalysis.

IDENTIFICATION AND TRACKING OF CYCLONES AND WINDSTORMS
In this study we identify and track cyclones using an algorithm first introduced by Murray and Simmonds (1991) with the modifications specified in  and . The analysis is based on 6-hourly mean-sea-level pressure (MSLP) fields. Each field is interpolated onto a T159 grid, to decrease the dependency of the algorithm on grid resolution. Cyclone centres are detected by identifying maxima of the Laplacian of the MSLP field and thus a maximum of the quasi-geostrophic relative vorticity. For the cyclone tracking procedure, a subsequent position of each cyclone centre is predicted and compared to cyclone centres identified in the following time step. Cyclone tracks with lifetimes shorter than 24 h are filtered out. Furthermore, only cyclone events which have been strong and closed at least once during their lifetime are considered, thus excluding open depressions. This tracking methodology has previously been used in numerous studies (e.g. Grieger et al., 2014;Kruschke et al., 2014;Befort et al., 2016) and is included in the Intercomparison of Mid-Latitude Storm Diagnostics Initiative (IMILAST: Neu et al., 2013;Ulbrich et al., 2013).
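The detection step described above can be illustrated with a minimal sketch. It is not the Murray and Simmonds (1991) scheme itself: it assumes a flat grid with unit spacing, uses a five-point finite-difference Laplacian, and treats any strict local maximum of the Laplacian above a hypothetical `min_laplacian` threshold as a candidate centre. The function names and the threshold are illustrative assumptions, and the spherical geometry, interpolation and tracking steps of the real algorithm are omitted.

```python
def laplacian(field):
    """Five-point finite-difference Laplacian on a regular grid (unit spacing).

    Border cells are left as None because they lack a full neighbourhood.
    """
    ny, nx = len(field), len(field[0])
    lap = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            lap[j][i] = (field[j - 1][i] + field[j + 1][i]
                         + field[j][i - 1] + field[j][i + 1]
                         - 4.0 * field[j][i])
    return lap


def cyclone_centres(mslp, min_laplacian=0.0):
    """Return (row, column, laplacian) for strict local maxima of the Laplacian.

    A pressure minimum has a positive Laplacian, so candidate centres are
    cells whose Laplacian exceeds min_laplacian (an assumed threshold) and
    that of all eight neighbours.
    """
    lap = laplacian(mslp)
    ny, nx = len(mslp), len(mslp[0])
    centres = []
    # Start two cells in from the edge so every neighbour has a Laplacian value.
    for j in range(2, ny - 2):
        for i in range(2, nx - 2):
            value = lap[j][i]
            if value <= min_laplacian:
                continue
            neighbours = [lap[j + dj][i + di]
                          for dj in (-1, 0, 1) for di in (-1, 0, 1)
                          if (dj, di) != (0, 0)]
            if all(value > n for n in neighbours):
                centres.append((j, i, value))
    return centres
```

For a single idealised depression on a small grid, the sketch returns one centre at the pressure minimum; on real data, further criteria such as the lifetime and "strong and closed" filters described above would still have to be applied.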
We define extreme cyclones as those exceeding the 95th percentile of the Laplacian of the MSLP at least once in their lifetime, following Leckebusch and Ulbrich (2004). This 95th percentile is calculated based on all cyclone events over the Northern Hemisphere. The absolute number of identified extreme cyclone tracks consequently accounts for 5% of all cyclone tracks on the integrated, hemispheric scale, but shows significant spatial variations on regional scales.
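As a rough illustration of this selection, the sketch below keeps tracks whose peak Laplacian exceeds the 95th percentile. It assumes that the percentile is taken over the per-track peak values (one value per cyclone event), which is one reading consistent with the statement that extreme tracks make up 5% of all tracks; the data layout (a dictionary of track ids to Laplacian values) and the linear-interpolation percentile are likewise assumptions made here, not details taken from the study.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def extreme_tracks(tracks, q=95.0):
    """Return ids of tracks whose peak Laplacian exceeds the q-th percentile.

    tracks maps a track id to the Laplacian values along that track
    (a hypothetical data layout chosen for this sketch).
    """
    peaks = {track_id: max(values) for track_id, values in tracks.items()}
    threshold = percentile(list(peaks.values()), q)
    return {track_id for track_id, peak in peaks.items() if peak > threshold}
```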
For windstorm events we follow the identification and tracking scheme developed by Leckebusch et al. (2008). A more recent and extensive description of this algorithm can be found in Kruschke (2015). Originally established for wind speeds at a height of 10 m and 6-hourly data, we here apply the algorithm to wind speeds at 925 hPa and 12-hourly data due to data availability from the GloSea5/ECMWF hindcasts. The scheme identifies regions where the wind speed exceeds the local (grid-point level) 98th percentile of climatology. The region must exceed 150,000 km² to be considered a windstorm. The 98th percentile is calculated from the wind speed distribution for the winter months December to February from 1992 to 2012 as this period is available for all datasets. The windstorm tracking procedure follows a nearest neighbour approach. Windstorm tracks with a lifetime shorter than 24 h, i.e. two time steps, are filtered out. This algorithm has also been applied in previous studies (e.g. Renggli et al., 2011; Nissen et al., 2014a; 2014b; Befort et al., 2015; Kruschke, 2015; Wild et al., 2015; Befort et al., 2016). Note that, due to using 12-hourly 925 hPa wind speeds rather than 6-hourly 10 m wind speeds, a reduction in the number of windstorm events is seen.
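The identification step for a single time step can be sketched as follows. This is a simplified illustration, not the Leckebusch et al. (2008) implementation: it assumes a rectangular grid without longitude wrap-around, four-point connectivity between neighbouring cells, and a precomputed grid of cell areas (`cell_area_km2`); all of these, and the function name, are assumptions made here. Tracking between time steps is not shown.

```python
def exceedance_clusters(wind, threshold, cell_area_km2, min_area_km2=150_000.0):
    """Find connected regions where wind exceeds the local threshold.

    wind, threshold and cell_area_km2 are 2-D lists of equal shape.
    threshold would hold the local 98th percentile of the wind climatology.
    Only regions whose total area exceeds min_area_km2 are returned, each as
    a sorted list of (row, column) cells.
    """
    ny, nx = len(wind), len(wind[0])
    seen = [[False] * nx for _ in range(ny)]
    clusters = []
    for j in range(ny):
        for i in range(nx):
            if seen[j][i] or wind[j][i] <= threshold[j][i]:
                continue
            # Flood fill over 4-connected exceedance cells.
            stack = [(j, i)]
            seen[j][i] = True
            cells = []
            while stack:
                cj, ci = stack.pop()
                cells.append((cj, ci))
                for nj, ni in ((cj - 1, ci), (cj + 1, ci), (cj, ci - 1), (cj, ci + 1)):
                    if (0 <= nj < ny and 0 <= ni < nx and not seen[nj][ni]
                            and wind[nj][ni] > threshold[nj][ni]):
                        seen[nj][ni] = True
                        stack.append((nj, ni))
            area = sum(cell_area_km2[cj][ci] for cj, ci in cells)
            if area > min_area_km2:
                clusters.append(sorted(cells))
    return clusters
```

With 50,000 km² cells, for example, a block of four exceedance cells passes the size criterion while an isolated exceedance cell is discarded.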
We apply both algorithms to the core winter months. Spatial track densities are calculated following Befort et al. (2016) with a search radius of 700 km. In GloSea5, wind speeds are set to zero if the 925 hPa pressure level is below the surface, making it impossible to track wind fields over these areas. To minimize this effect, we exclude all grid cells where this occurs in more than 5% of all time steps. This mask is then applied to all the datasets we use. Temporal correlations are based on Kendall's tau-b rank correlation coefficient. This coefficient takes into account ties in the ranks of the time series, which is necessary in our case of absolute cyclone/windstorm counts. Statistical significance is obtained under the null hypothesis of no association, and as our data contain ties, a normal approximation with continuity correction is applied when the correlation coefficient is calculated (Kendall and Gibbons, 1990). Comparisons to other more common correlation methods (i.e. Pearson and Spearman) and other statistical significance tests (i.e. Student's t-test and r-test) show only minimal differences.
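For readers unfamiliar with the tie-corrected rank correlation, a minimal pure-Python sketch of the tau-b coefficient is given below. It counts concordant and discordant pairs directly, which is adequate for 20-winter time series; the significance test with continuity correction is not reproduced here, and the function name is an illustrative choice.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, which corrects for ties.

    Pairs tied in both series contribute to neither numerator nor
    denominator; pairs tied in only one series enter the denominator.
    """
    n = len(x)
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")
```

Applied to seasonal event counts, `x` would be the ensemble-mean count per winter at one grid point and `y` the corresponding reanalysis count.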

Climatological representation of cyclones and windstorms in the Northern Hemisphere
Prior to analysing the seasonal forecast skill of windstorms and cyclones, the ability of the models to reproduce the observed climatological spatial patterns of both phenomena is assessed. This will highlight any limitations of the models in simulating these events. The observed climatological spatial track density of all extratropical cyclones shows the two well-known centres of activity over the North Pacific and the North Atlantic (Figure 1). The highest number of cyclones occurs at the respective jet exit regions with around 60 cyclones per winter. There is also a secondary maximum with up to 30 cyclones per winter over the Mediterranean region. All seasonal forecast systems generally capture this spatial distribution but show slight discrepancies with the reanalysis regarding the absolute number of cyclones in some regions. ECMWF-S3 underestimates the number of cyclones over the whole Northern Hemisphere and has no secondary maximum over the Mediterranean. The ECMWF-S4 extratropical cyclone track density compares very well to the reanalysis. GloSea5 generally overestimates the number of cyclone events over the Northern Hemisphere with around 20% more cyclones in the North Atlantic storm track region. These climatological results change little if we consider only the 5% strongest cyclones for the same time period (Figure 2).
The observed spatial distribution of the windstorm event track density also shows two climatological centres of activity over the North Pacific and the North Atlantic (Figure 3). Similar to cyclone track densities, both centres of activity are orientated southwest-northeast. The two main differences to the spatial distribution of cyclones are fewer windstorm events and an equatorward shift of around 1,000 km of the regions with most windstorms. The smaller number of windstorms compared to cyclones arises because, for windstorms, only events exceeding a specific intensity and size are taken into account (clustered extreme wind speeds above the 98th percentile). The southward shift of the maximum reflects that the highest wind speeds associated with northern hemispheric cyclones are typically located to the south of the pressure centre, along frontal zones. The spatial windstorm distribution is well-captured by all seasonal forecast models, with smaller biases (absolute and relative) for all models compared to cyclones and extreme cyclones.

Forecasted interannual variability of cyclones and windstorms in the Northern Hemisphere (direct method)
As the models are able to reproduce the observed climatological spatial patterns of (extreme) cyclones and windstorms, the forecast skill in predicting the interannual variability of cyclones and windstorms is now considered.

Temporal variability
We assess the skill of forecasting the interannual cyclone and windstorm variability by correlating the number of events per winter with the reanalysis, at each grid point and for each model. We call this approach the direct method. The picture that emerges for all cyclones reveals several regions with high correlations across all models: in the eastern and western Pacific; in most of the Atlantic storm-track region, including parts of northwestern Europe (especially for ECMWF-S4); and in northern parts of the Atlantic (especially in ECMWF-S4 and GloSea5) (Figure 4).
When comparing the skill of the models in forecasting the winter frequency of windstorms across the Northern Hemisphere, some common features are seen (Figure 6). ECMWF-S4 and GloSea5 show high and mostly significant correlations over the eastern North Atlantic and central Europe, which is particularly useful given the high damage potential in these areas. In addition, there are high correlations over parts of the northern Pacific and northern America, but slightly negative correlations over the western North Atlantic for all forecast systems.

Spatial variability
To assess the models' ability to capture the interannual variability of the spatial track density patterns, the centred anomaly correlation coefficient (ACC: Wilks, 1995) for the Atlantic sector (90°W to 10°E, 20-70°N) is calculated. For all cyclones it is found that the agreement of the models' ensemble mean spatial distribution of events per winter with ERA-Interim is strongly dependent on the year under consideration (Figure 7a). Overall, ECMWF-S4 shows the highest mean ACC over the time period analysed. In some years the value of the ACC exceeds 0.4 in ECMWF-S4 and GloSea5, while in other years the models fail to capture the reanalysis' cyclone spatial distribution. ACC time series for extreme cyclones show higher year-to-year fluctuations than for all cyclones, indicating that the forecast skill in predicting the spatial pattern is more variable (compare Figure 7a and Figure 7b).
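A minimal sketch of the centred ACC for one winter is given below, assuming the track densities over the Atlantic sector have been flattened into lists of equal length. Anomalies are taken relative to each dataset's own climatology, and the (optionally weighted) area mean of the anomalies is removed before correlating, which is what "centred" refers to. The optional `weights` argument (e.g. cosine of latitude) and the function name are assumptions for illustration; whether the study applied area weighting is not stated.

```python
import math


def centred_acc(forecast, observed, fc_clim, ob_clim, weights=None):
    """Centred anomaly correlation coefficient between two spatial fields.

    forecast/observed: values for one winter at each grid point.
    fc_clim/ob_clim: the corresponding climatological means per grid point.
    weights: optional area weights per grid point (assumed, defaults to 1).
    """
    w = list(weights) if weights is not None else [1.0] * len(forecast)
    wsum = sum(w)
    fc_anom = [f - c for f, c in zip(forecast, fc_clim)]
    ob_anom = [o - c for o, c in zip(observed, ob_clim)]
    # Remove the area-mean anomaly from each field (the "centring" step).
    fc_mean = sum(wi * a for wi, a in zip(w, fc_anom)) / wsum
    ob_mean = sum(wi * a for wi, a in zip(w, ob_anom)) / wsum
    cov = sum(wi * (a - fc_mean) * (b - ob_mean)
              for wi, a, b in zip(w, fc_anom, ob_anom))
    var_fc = sum(wi * (a - fc_mean) ** 2 for wi, a in zip(w, fc_anom))
    var_ob = sum(wi * (b - ob_mean) ** 2 for wi, b in zip(w, ob_anom))
    return cov / math.sqrt(var_fc * var_ob)
```

The function raises a `ZeroDivisionError` if either anomaly field is spatially constant, which a production version would need to handle.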
Regarding windstorm events, we find years with both high and low agreement when comparing to reanalysis (Figure 7c). The overall temporal mean of ACC values is similar to the one found for all cyclones and higher than for extreme cyclones, with GloSea5 showing a slightly higher mean ACC than the other two models.
To compare the models more directly we used one of the models as reference instead of ERA-Interim as done previously. Here, we made the arbitrary choice of the ECMWF-S4 ensemble mean (GloSea5 or ECMWF-S3 could equally have been used). The analysis reveals that for all cyclones there is a slightly higher agreement between the models than the models show with the reanalysis (compare Figure 7a and Figure 7d). Interestingly, ACC values are notably higher in the last analysed decade compared to the first decade for GloSea5. For extreme cyclones, we also find temporal means of the ACC values being increased if ECMWF-S4 replaces ERA-Interim, again suggesting higher agreement amongst the models (Figure 7d,e).
The impact of using ECMWF-S4 as reference in contrast to ERA-Interim is largest for windstorm events. The temporal average of the ACC value is about 0.2 if using ERA-Interim as reference, and this increases to about 0.5 if using ECMWF-S4 (compare Figure 7c and Figure 7f).

Forecasted interannual variability of windstorms derived using the NAO for the North Atlantic/European region (indirect method)
The North Atlantic Oscillation (NAO) is the most prominent variability pattern in the North Atlantic/European region, with a substantial influence on winter windstorms on various time-scales (e.g. Hurrell and Deser, 2009; Pinto et al., 2009; Donat et al., 2010; Renggli et al., 2011). UK Met Office seasonal hindcasts of the winter mean NAO have recently shown promising results (Scaife et al., 2014). Consequently, in this section we forecast the variation in winter windstorm frequency using the models' forecast of the NAO and the observed relationship between the NAO and windstorm frequency (derived from ERA-Interim). We call this statistical approach, which uses the NAO as the predictor, the indirect method.
The NAO index is calculated according to the station-based definition used by Hurrell (1995): the time series at each of the two points is normalized by its local mean and standard deviation, prior to calculating the difference of both. The highest correlation between forecasted and observed NAO (derived from ERA-Interim) is found for GloSea5 (correlation coefficient of 0.62), followed by much smaller correlations for ECMWF-S3 (0.27) and ECMWF-S4 (0.24).

FIGURE 8 Regression slope of windstorm track density regressed onto station-based winter mean NAO derived from ERA-Interim
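The normalization and differencing can be sketched as follows. The sketch assumes two winter-mean pressure series, one at a southern and one at a northern point, and takes the index as south minus north, which is the usual sign convention for a station-based NAO; the argument names and the use of the population standard deviation are assumptions made here rather than details given in the text.

```python
import statistics


def standardise(series):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(series)
    std = statistics.pstdev(series)
    return [(value - mean) / std for value in series]


def nao_index(south_series, north_series):
    """Station-based NAO index: normalized southern minus normalized northern series.

    Both inputs hold one winter-mean value per year, in the same year order.
    """
    south = standardise(south_series)
    north = standardise(north_series)
    return [s - n for s, n in zip(south, north)]
```

A winter with anomalously high pressure at the southern point and anomalously low pressure at the northern point then yields a positive index.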
The regression map of windstorm track densities onto the interannual NAO calculated from ERA-Interim reveals a dipole structure, with a nodal line around 45°N (Figure 8). The positive windstorm response shows a maximum west of Ireland of around two more storms per winter for each NAO unit, and affects most of northern and parts of central Europe. The negative response is primarily located over the southern parts of the North Atlantic (south of 40°N) and affects to a lesser extent the Mediterranean region, predominantly its western part. The NAO has little effect on interannual windstorm variability in between these two nodes.
The correlation maps of ERA-Interim track densities and NAO-predicted track densities from the three seasonal forecasts (indirect approach) show principally similar patterns, with positive skill over large parts of the North Atlantic and western Europe in ECMWF-S4 and GloSea5 (Figure 9e,f). The skill in ECMWF-S3 is lower in comparison to both these models in the entire North Atlantic region (Figure 9d).
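At a single grid point, the indirect approach amounts to a linear regression fitted on reanalysis data and then driven by the forecast NAO. The sketch below shows this with an ordinary least-squares fit on all available winters; the function names are illustrative, and whether the study used any cross-validation when fitting the regression is not stated, so none is assumed here.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def indirect_forecast(obs_nao, obs_storms, forecast_nao):
    """Predict windstorm counts from forecast NAO values.

    obs_nao/obs_storms: reanalysis NAO index and windstorm track density
    per winter at one grid point (used to fit the regression).
    forecast_nao: ensemble-mean NAO forecast per winter from a model.
    """
    slope, intercept = fit_linear(obs_nao, obs_storms)
    return [intercept + slope * value for value in forecast_nao]
```

The resulting NAO-predicted series would then be correlated with the reanalysis track density at the same grid point to obtain the indirect skill.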
Regarding the spatial skill pattern, we find overall good agreement between the direct and indirect approach in predicting windstorms in the North Atlantic sector (compare Figure 9a-c and Figure 9d-f). There are, however, regional quantitative differences of skill, as shown by the correlation coefficient differences (Figure 9g-i). The direct method performs better along a latitudinal band from around 40° to 50°N over the eastern North Atlantic and central western Europe. This can be mainly explained by the small impact of NAO variability on windstorm variability over this region (Figure 8), which makes the NAO-based prediction less skilful than directly detecting windstorms (direct approach).
In the cyclogenesis region east of the North American continent, the indirect approach shows higher (but mostly not significant) skill for all hindcasts. Furthermore, the indirect method also adds a small amount of skill over the North Sea in ECMWF-S4 and GloSea5, and west of the British Isles in GloSea5. In summary, in all forecast systems, the direct method typically shows higher (significant) skill in forecasting winter storm frequency in central-western Europe, whilst the NAO approach slightly improves skill over northwestern Europe.
Interestingly, the indirect approach performs only slightly better in GloSea5 compared to ECMWF-S4 even though the forecast skill of the NAO index is much higher in GloSea5 than it is in ECMWF-S4. This could reflect the importance of other large-scale variability modes in driving windstorm activity over Europe (see discussion) but could also be related to the small NAO signals in GloSea5 predictions (low signal-to-noise ratio: Scaife et al., 2014).

SUMMARY AND DISCUSSION
In this study, we have analysed the climatological representation and seasonal prediction skill of wintertime extratropical cyclones and windstorms in three ensemble-based seasonal prediction systems. In addition to investigating the skill of predicting the winter frequency of extratropical cyclones, this is the first study to explicitly investigate the capability of such models to forecast windstorm impact, by assessing near-surface, damage-relevant winds.
The main features of the long-term mean spatial distribution of both extratropical cyclones and windstorms are well represented in each of the models, across the Northern Hemisphere, although some regional biases are seen. Windstorms often result from strong and deep cyclones. Our results show differences in the regions where cyclone and windstorm frequency are skilfully predicted, which highlights the benefit of analysing both measures independently; otherwise, skilful winter predictions might be underestimated or vice versa. In the direct approach (by detecting cyclones and windstorms), the seasonal prediction systems all generally show small to moderate positive (and often significant) skill for all analysed quantities in many parts of the Northern Hemisphere. Skill values are generally higher for all cyclones and windstorm events than for extreme cyclones, supporting the approach of detecting windstorms directly to derive information on extreme events and damage. For windstorms the regions of positive and negative skill are at similar locations between the three model suites. In addition to these spatial similarities in skill between the models, we also find temporal similarity across the models in the pattern correlation of the storm frequency with the observations over the Atlantic basin. These temporal similarities are especially pronounced for windstorms. The temporal similarities between the models may indicate that the predictability of windstorm activity is higher for some seasons compared to others and that this season-dependent predictability is model independent. Higher predictability could further reflect physical processes linked to windstorm frequency, which are especially pronounced in specific seasons. In this case, our results suggest that these relevant physical processes are represented similarly in the three different model suites.
Scaife et al. (2014) have recently shown that the GloSea5 seasonal prediction system can generate skilful NAO forecasts with a correlation skill higher than 0.6. Consequently, we also examine whether an NAO-based regression model can be used for the seasonal forecast of windstorms in the North Atlantic region, making use of the observed relationship between the NAO and windstorm frequency in reanalysis data. For most regions with significant skill in either method, the direct approach outperforms the indirect one. The indirect statistical approach can improve upon the direct method in some regions, e.g. around the North Sea. However, as expected, the results show lower skill along the nodal line of the NAO including most central western European countries when compared to the direct approach (where actual windstorm events are directly identified from wind speeds). We thus conclude that the NAO is a useful predictor of interannual variability of European windstorms north of its nodal line in current seasonal forecast models. As the indirect and direct method reveal different areas with positive significant skill, a combination of both approaches would optimise the forecast skill for windstorms over Europe. However, if the NAO is used as the sole predictor for European windstorms, forecast skill would be lost in regions along its nodal line (from around 40° to 50°N) including regions with large damage potential due to storminess. Further investigations into which processes other than the NAO affect windstorm frequency on seasonal time-scales in western Europe are necessary. In particular, the East Atlantic pattern (EA) may play an important role: Woollings et al. (2010) showed that the EA has a large effect on the latitude of the North Atlantic jet stream position, which can steer cyclones and windstorms. Mailier et al. (2006) also found that the EA was more important than the NAO in explaining monthly variability in winter cyclone counts on the southern flank of the North Atlantic storm track, over an area coincident with the nodal line of the NAO. This is also supported by findings from Walz et al. (2018a), who showed that the EA pattern strongly influences windstorm activity between nodes of the NAO. Further analysis is currently being carried out to clarify how far the EA pattern is responsible for the skill in central western Europe found in ECMWF-S4 and GloSea5. It should further be mentioned that the skill measure is based on correlation coefficients only; however, the amplitude of the prediction is also important in judging its usefulness.
Our results appear promising overall and corroborate the emerging evidence of predictability on seasonal time-scales regarding extratropical cyclones and windstorms (Riddle et al., 2013; Scaife et al., 2014; Smith et al., 2016). Our analyses extend these recent studies by considering different seasonal prediction systems. Using identification and tracking algorithms, we are further able to assess the frequency of extreme extratropical cyclones and windstorms directly rather than deriving these quantities from distribution quantiles. We thus show, for the first time, significant skill in predicting the winter frequency of extreme extratropical windstorm events.
Our results suggest that ECMWF-S4 has a more realistic representation of the climatological frequency of extratropical cyclones and windstorms compared to the other models. However, with respect to forecast performance over the eastern Atlantic/western Europe, both GloSea5 and ECMWF-S4 reveal significant skill in forecasting windstorm events on seasonal time-scales, with higher values than found for extreme extratropical cyclones. One reason for the higher skill of ECMWF-S4 and GloSea5 compared to ECMWF-S3 in forecasting windstorms might be the configuration of the multi-member ensembles, e.g. the number of members, or how they are initialized. As we do not explicitly address the influence of the ensemble size on the results in this study, we cannot definitively quantify the benefit of more members. The higher skill for ECMWF-S4 compared to ECMWF-S3 might also reflect improvements caused by model development (S4 uses a newer model version than S3) and the higher horizontal resolution of ECMWF-S4 compared to ECMWF-S3. Better representation of relevant processes linked to extratropical cyclone and windstorm frequency may be a further reason for quantitative differences in skill.
It is important to note that the time period used in this study is limited to 20 years (1992/1993 to 2011/2012) due to the fact that GloSea5 hindcasts were produced for these years only. Both ECMWF-S4 and ECMWF-S3 are available for longer time periods, e.g. from 1982 onwards in the case of ECMWF-S4. Our results show that the skill for windstorms in ECMWF-S4 decreases if using the years 1982-2011 compared to using 1992-2011, whereas the general skill pattern remains similar. Further analyses are needed to explain the reason for this decrease in skill for ECMWF-S4 when using the longer hindcast period.
The striking similarities in the anomaly correlation coefficients for North Atlantic/European windstorms between the different model suites once more raise the question of which processes are responsible for seasonal variability and the extent to which these are captured by the models. Potential drivers of winter storm variability include a horseshoe-like sea-surface temperature anomaly pattern in the North Atlantic a few months ahead of the winter season (Renggli, 2011) and an increased meridional surface temperature gradient in the west Atlantic cyclogenesis region (Wild et al., 2015). Besides these oceanic drivers, further analysis is needed to quantify the role of atmospheric large-scale variability modes in steering windstorm variability in current seasonal forecast suites (as done by Walz et al. (2018b) for the European continent). Further research to improve our understanding of such drivers of seasonal variability will be essential and will ultimately pave the way to successfully predicting whether a calm or severe storm season lies ahead.