German Bight storm activity, 1897–2018

This study investigates the evolution of German Bight (southeastern North Sea) storminess from 1897 to 2018 through analysing upper quantiles of geostrophic wind speeds, which act as a proxy for past storm activity. Here, geostrophic wind speeds are calculated from triplets of mean sea level pressure observations that form triangles over the German Bight. The data used in the manuscript are provided by the International Surface Pressure Databank and the national meteorological services of Denmark, Germany, and the Netherlands. The derivation of storm activity is achieved by enhancing the established triangle proxy method via combining and merging storminess time series from numerous partially overlapping triangles in an ensemble‐like manner. The utilized approach allows for the construction of robust, long‐term and subdaily German Bight storminess time series. Further, the method provides insights into the underlying uncertainty of the time series. The results show that storm activity over the German Bight is subject to multidecadal variability. The latest decades are characterized by an increase in activity from the 1960s to the 1990s, followed by a decline lasting into the 2000s and below‐average activity up until present. The results are backed through a comparison with reanalysis products from four datasets, which provide high‐resolution wind and pressure data starting in 1979 and offshore wind speed measurements taken from the FINO‐WIND project. This study also finds that German Bight storminess positively correlates with storminess in the NE Atlantic in general. In certain years, however, notably different levels of storm activity in the two regions can be found, which likely result from shifted large‐scale circulation patterns.


| INTRODUCTION
Storms are one of the major extreme events affecting the population in Germany and Europe due to high wind speeds and triggering of other extreme events such as storm surges, high waves, flooding, and extreme precipitation events. Due to the high impact potential of storms on coastal inhabitants, the assessment of long-term storm activity provides valuable information. However, acquiring information about long-term trends and temporal variability of storm activity requires the availability of homogeneous wind speed records, which are usually not available as technological advances or improvements in the measurement techniques can lead to inhomogeneities in the data records over time. Also, changes in the surroundings of an observation site, such as station relocations, vegetation changes, or the construction of buildings can disturb the wind measurements and cause inhomogeneities (Vose et al., 1992;Heino, 1999;Lindenberg et al., 2012). Similarly, long reanalysis datasets do not pose a suitable alternative approach as they are affected by an increasing station density and better data quality over time, also from the introduction of satellite data at the end of the 1970s resulting in artificial trends (Bengtsson et al., 2004;Krueger et al., 2013).
On the contrary, air pressure as a large-scale variable is usually less affected by the aforementioned factors. For that reason, less inhomogeneities are found in time series of air pressure measurements. Additionally, instruments and techniques used for air pressure observations have undergone only small changes in the past century, providing us with more consistent data. Consequently, one approach to avoid inhomogeneities and uncertainties from station data is to make use of the geostrophic wind speed proxy based on triplets of air pressure measurements. Schmidt and von Storch (1993) used the proxy to describe the long-term behaviour of upper quantiles of geostrophic wind speeds under the assumption that the variation of the statistics of this proxy describes the variation of the statistics of wind speed-an assumption that Krueger and von Storch (2011) later showed to be valid, in particular over sea surfaces and flat terrain, where ageostrophic disturbances on atmospheric movement are minor. Other studies adapted the proxy to infer about the long-term storm climate, mostly over the northeast Atlantic Ocean (e.g., Alexandersson et al., 1998;2000;Matulla et al., 2008;Wang et al., 2009;Krueger et al., 2019). Most studies found an increase of storminess from the 1960s to the 1990s, but when viewed over longer time scales, storm activity is mostly stationary with super-imposed pronounced interdecadal variability (Feser et al., 2015). The proxy of geostrophic wind speed statistics is also used by national meteorological services. For instance, the Swedish national service SMHI makes use of the method to provide analyses of the wind climate all over Sweden (Wern and Bärring, 2009). For the North Sea region, which is one of the most densely observed regions in the world, there is also the possibility of making use of other records for different variables, such as records of storm surges along the coasts, to make inferences about storm activity (Dangendorf et al., 2014). The surge-based storminess displays interannual to multidecadal variability without any significant trend. However, as the occurrence of surges strongly depends on wind direction, the layout of the coastlines (Ganske et al., 2018), and the nearshore bathymetry, storm surges can only provide complimentary information with regard to the description of long-term storm activity.
As the synoptic atmospheric activity over Europe is influenced by the large-scale circulation, atmospheric modes can contribute to variability in the storm climate. The dominant mode in the northeast Atlantic is the North Atlantic Oscillation (NAO) described by the NAO index. The NAO index is based on the standardized differences in pressure anomalies between Rejkyavik (Iceland) and either Lisbon (Portugal), Gibraltar, or Ponta Delgada (The Azores) (Hurrell, 1995;Jones et al., 1997). Positive NAO phases coincide with an increase in zonal weather patterns, favouring cyclogenesis and storm tracks over Central Europe (Trigo et al., 2002;Feser et al., 2015). Negative NAO phases are often associated with below-average numbers of storm days over Central Europe (Donat et al., 2010). Allan et al. (2009) examined the relationship between NAO phases and winter storm activity over the British Isles and found that although a significant positive correlation exists, the connection fluctuates over time. Matulla et al. (2008) concluded that variations in the NAO can be used to explain northwestern European storminess variability to some extent, but the link fails for other regions in Europe. Matulla et al. (2008) also emphasized that the connection between NAO and storminess strongly depends on the time period and region examined, which was later confirmed by Pinto and Raible (2012) and Raible et al. (2014).
Though Schmidt and von Storch (1993) provide a first analysis of the long-term storm climate over the German Bight, and subsequently the German national meteorological service Deutscher Wetterdienst (DWD) incorporates and updates the time series annually in their reports about the current state of the German storm climate (DWD, 2017b), we see a need to investigate German Bight storminess in more detail. The analysis of Schmidt and von Storch (1993) is based on three stations forming a triangle covering the German Bight with missing periods of observations being replaced with data from neighbouring stations. Even though Krueger and von Storch (2011) show that one triangle provides sufficient information for the description of storminess, our analysis utilizes data from eight stations to form 18 triangles exploiting the dense observational network along the North Sea coast to estimate German Bight storminess robustly for the period 1897-2018, and allowing us to also provide new uncertainty estimates.
The manuscript is structured as follows. Section 2 describes the data used in this study and provides information about the derivation of German Bight storminess. In Section 3, we present and discuss our findings with regard to the derived annual and seasonal time series of German Bight storm activity, also in connection with the large-scale circulation and uncertainties. Section 4 validates the reconstructions through comparison with insitu measurements and reanalysis products. Concluding remarks are given in Section 5.
The ISPD is a global database of historical pressure observation data, spanning the period 1722-2016. The ISPD consists of a blend of international pressure datasets with added metadata, such as station altitude and coordinates. The database also provides quality control flags, which stem from the production of the 20th Century Reanalysis (appendix B in Compo et al., 2011Compo et al., ) up until 2013. Data that is distributed by the DMI contains pressure observations at sea level, including station metadata. While historical data is available for 1874-2018, the temporal resolution is confined to three data points per day (07Z, 13Z, 20Z) until July 1987, as well as 3-hourly and later hourly observations thereafter. Pressure records from the DWD Climate Data Center comprise data both at sea and station levels. The data availability differs among the stations with Hamburg being the longest continuous time series . The German datasets contain quality flags that stem from the DWD quality control, and from additional station and instrumental metadata. All data were converted into UTC using provided time zone metadata. Historical pressure observations from the Netherlands used are a blend of data obtained through the ISPD, personal communication with the KNMI, and the KNMI data centre (KNMI, 2019). Available Dutch pressure data at sea level are recorded hourly, respectively every 10 min for recent periods. Furthermore, we compare our results with measurement data from the FINO1 measurement site (Leiding et al., 2016, marked in Figure 1) and with data from the four reanalysis datasets ERA5 , ERA-Interim (Dee et al., 2011), MERRA2 (Gelaro et al., 2017 and a combination of CFSR version 1 (Saha et al., 2010) andversion 2 (Saha et al., 2014). Note that version 1 of CFSR stops at the end of 2010 and version 2 directly continues afterwards, such that we are able to concatenate both datasets.

| Data preparation, quality control and geostrophic storminess
The calculation of geostrophic wind speeds requires quality controlled pressure data at sea level. As pressure data from both the ISPD and DWD have not entirely been reduced to the sea level, we use the barometric equation (Equation 1) for an atmosphere with a constant vertical temperature gradient and with modified standard atmospheric values to correct for small station altitudes as we lack knowledge about the state of atmospheric conditions at the time of the measurements. We assume an air temperature of 10 C at station height z, which is close to the annual mean temperature in the area being investigated (van der Schrier et al., 2011;DWD, 2017b;Cappelen et al., 2019). The height correction of the pressure p at height z station to the mean sea level (z = 0) reads F I G U R E 1 Map of eight stations from which pressure observations are used to calculate geostrophic winds and 18 triangles that are constructed based on geometric criteria. The location of the FINO1 research platform is marked in red for which −0.0065 K m is the assumed lapse rate ∂T ∂z and 5.255 the ratio κ κ − 1 for κ = 1.235 being the assumed isentropic coefficient.
The quality control consists of multiple steps. As a first level of quality control, we make use of the quality control flags provided by the ISPD by removing flagged data (where available). We then perform manual quality checks based on the idea that erroneous values become detectable, when they are either unrealistically high or low, or when absolute pressure tendencies or differences of tendencies between consecutive time steps exceed certain thresholds. First, unrealistically high or low pressure values are removed. The removal thresholds for low and high values of pressure readings are set to 940 and 1,065 hPa based on observed extremes of atmospheric pressure records in Germany, the Netherlands and Denmark (de Haij, 2009;Würtz, 2017;Cappelen et al., 2019). Second, we examine pressure changes between measurements to filter out pressure shifts and spikes that would be physically implausible, following the First and Second Derivative Extrema (FSDE) approach by Paraskevopoulou et al. (2013). We introduce thresholds for discrete first and second order differences of pressure in time and omit observations exceeding those limits. The thresholds are empirically set to 7 hPa Á hr −1 and 15 hPa Á hr −1 for first and second order differences, respectively. Finally, we compare the pressure observations to data from adjacent stations to identify spatial outliers (Dey and Morone, 1985). For every station, we calculate the median pressure of all stations located within a radius of 250 km if there are at least two other stations available. We build an index based on the difference Δp between the pressure of the tested station p and the surrounding reference median pressure p median at the times t n , t n + 1 and t n − 1 . Valid observations conform to the conditions given in Equation (2), which reads 1013hPa−p median 10 Term A describes the absolute difference Δp between the station pressure p and the median of the surrounding stations p median at the time t n . Term B takes into account the difference between the rate of change of term A before and after t n , similar but not equal to a second derivative in time. Data points are removed as soon as this index exceeds a certain threshold, which is given in term C. As pressure fluctuations and gradients are usually larger near areas of lower pressure such as cyclones, the threshold is modified by the reference median pressure p median .
In order to calculate geostrophic wind speeds from pressure gradients, pressure data needs to be available at simultaneous time steps (as in Krueger et al., 2019). Due to different time zones, measurement techniques and guidelines, observations have not always been recorded at the same time during the day. To overcome these inconsistencies, we obtain synchronized pressure data every 3 hr starting at 00Z. In order to do so, we linearly interpolate the pressure data to 3-hourly values. We discard periods in our dataset in which gaps exceed 13 hr because the linear interpolation would remove too much of the natural synoptic scale pressure variability.
To derive geostrophic winds from pressure values, we adapt the approach by Schmidt and von Storch (1993).
First, we examine the 8 3 =56 possible distinct triangles that can be formed by combining eight stations given in Table 1 into triplets for geometric features. Our investigations have shown that strongly obtuse triangles exhibit a larger error propagation of pressure uncertainties resulting in a shift of the average wind direction towards the orientation axis of the triangle itself (not shown). For that reason, we discard triangles whose smallest interior angle is smaller than 15 . We also manually remove triangles that only cover coastlines or land areas. These geometric and physical criteria are chosen to ensure that the selected triangles cover a sufficiently large part of the German Bight.
For each of the remaining 18 triangles ( Figure 1 and Table 2), we follow the approach of Alexandersson et al. (1998) and assume that the pressure p at the coordinates (x,y) within one triangle can be described by the equation with x and y being defined in a local Cartesian coordinate system: y =R e ϕ: Here, R e denotes the Earth radius, λ the longitude, ϕ the latitude. The coefficients a, b, and c in Equation (3) can be derived through solving the set of equations given in Equation 6 for which the pressure p i at station location (x i ,y i ) (i = 1, 2, 3) is made use of: p 1 = ax 1 +by 1 +c p 2 = ax 2 +by 2 +c p 3 = ax 3 +by 3 +c: The geostrophic wind speed is then calculated as with T A B L E 1 List of stations with coordinates and data availabilities  1971, 1972, 1991, 1992. c Does not include 1939,1940,[1942][1943][1944][1945][1946].
where ρ is the density of air (set at 1.25 kg Á m −3 ) and f the Coriolis parameter, which is the average of the Coriolis parameter from the three measurement sites. The coefficients a and b denote the zonal and meridional pressure gradients. After having derived U geo at each time step, we calculate 95th and 99th seasonal and annual percentiles of geostrophic wind speeds for each triangle. To reduce errors resulting from years in which geostrophic wind data is too sparse to be considered representative for the entire year, we remove years with a data availability of less than 80.0% (79.8% in a leap year). The requirements for seasonal values follow similar criteria.
In a last step, we treat the resulting time series of the 18 triangles as an ensemble of realizations of storminess time series covering the German Bight, which we standardize individually and average (as in Alexandersson et al. (1998) and subsequent studies). The standardization is achieved by the subtraction of the long-term average and by division of the SD, both calculated for the period 1961-2010. We use 1961-2010 as our reference time frame because it encompasses at least one full period of the dominant multidecadal variability of large-scale storm activity in the Northeast Atlantic. Also, data availability in the reference time frame should be as high as possible. The exact starting and ending years, however, are chosen arbitrarily.
Resulting values of the time series are dimensionless, however can be understood as multiples of one SD. The standardization ensures that the resulting time series are in the same range, even though the spread should not be large as the triangles are in the same vicinity and overlap. We consider the averages of the resulting 18 time series for the 95th and the 99th percentiles of geostrophic wind speeds as proxies for German Bight storminess. Note that for the actual comparison with measurement data and data from reanalyses we use non-standardized geostrophic wind speeds at shorter and annual time scales, where we see fit (see Figures 13 and 14, and Table 3).

| Uncertainty estimation
Even though we assume that inferring storminess from pressure observations provides us with a more robust and homogeneous time series compared to using wind speed measurements (Weisse and von Storch, 2009) directly, we still have to account for unavoidable uncertainties and errors induced by limited instrument accuracy, measurement routines and our own sampling. As there are no extensive metadata available in the data that would consistently describe such errors, but would likely be reflected through our previously described quality control, we use a bootstrapping approach (Efron and Tibshirani, 1986;DiCiccio and Efron, 1996) to mimic and quantify the uncertainty of our time series. Bootstrapping is a nonparametric and established method of estimating sample frequency distributions of a variable or statistic through the repeated use of subsampling with replacement. In our approach, we aim at estimating the range of possible realizations of German Bight storminess, which acts as an approximation of the uncertainty. In each month within the period of 1897-2018, we first select random values with replacement from the 3-hourly geostrophic wind database until the size of the selected subset reaches the number of 3-hourly time steps in 1 month. Missing observations that stem from low data availability or result out of the quality control procedures are treated like valid observations in the sampling process to make sure that they can also be included in the new samples. We repeat this process for every month within 1 year, whereby the monthly subsets are assigned to their respective seasons. Afterward, we determine 95th and 99th percentiles for the four seasons and the entire year, on the condition that the data availability of the T A B L E 3 Spatial and temporal resolutions of reanalysis products, as well as correlations between observation-based geostrophic storminess and storminess derived from wind speed reanalysis data

Reanalyis dataset
Spatial resolution Note: 1980-2018 was chosen as the reference period for both storminess calculations and correlation analysis due to limited data availability. Note that, as reconstructed wind speeds are available at 3-hr intervals, hourly wind speed data from ERA5 and MERRA2 have been downsampled to 3-hr intervals to ensure data synchronization.
original dataset for the respective season or year and triangle is at least 80%. Note that, for the calculation of winter storminess, we have to use data from the December of the previous year. The entire sampling procedure is then iterated over every year in the 122-year period and subsequently repeated for each of the 18 triangles. Eventually, the 18 resulting time series are standardized and averaged, following the methods described in Section 2.1. This process is repeated 10,000 times, so that we obtain an empirical distribution of 10,000 possible realizations of German Bight storminess, which we can use to quantify the uncertainty. Note that each of the 10,000 new samples is the same size as the original geostrophic wind dataset, but may contain a different composition of geostrophic wind values due to sampling with replacement. To illustrate the uncertainty, we define the 95% confidence interval of German Bight storminess as the difference between the 0.025-and 0.975-quantiles of the empirical distribution of bootstrapped values. A flowchart visualization of the uncertainty estimations is given in Figure 2.

| RESULTS AND DISCUSSION
3.1 | Long-term storminess As described in Section 2.1, we derive 3-hourly geostrophic wind speeds over 18 triangles shown in Figure 1, which we then use to build annual and seasonal frequency distributions to derive the 95th and 99th percentiles of geostrophic wind speeds (standardized and averaged). Figure 3 depicts the time series of the standardized annual 95th and 99th percentiles of geostrophic winds over the German Bight from 1897 to 2018, averaged over all 18 triangles. A low-pass filter (Gaussian with σ = 3) has been applied to highlight interdecadal variability. The German Bight storminess time series show multidecadal variability for both the 95th and 99th percentiles. Storm activity is characterized by a period of near-average activity around 1900, followed by an increase in the years thereafter. Afterwards, the 1930s and 1940s show below-average storm activity in the German Bight. Storminess subsequently increases again to shortly reach above-average levels around 1950. This phase is characterized by a peak in 1949, when annual 95th and 99th percentiles increase to 3.32 and 2.42, coinciding with historical records of observed storm events (Lamb, 1991). Throughout the 1960s and 1970s, storminess shows slightly below average activity levels followed by an increase during the 1980s and early 1990s to aboveaverage activity levels. In the late 1990s, German Bight storminess decreases again to below average levels of storminess. Until the end of the examined period, storm activity remains below the long-term average of activity levels. During this period, the lowest values of storminess are found in 2003, with a minimum value of −2.09 (−2.14) for the 95th (99th) percentiles of geostrophic wind speed over the German Bight. The corresponding absolute geostrophic wind speeds for annual 95th and 99th percentiles, as well as the medians, are shown in Figure 4.
On the seasonal time scale, the behaviour is similar as shown in Figure 5. Our seasonal statistics of geostrophic winds are given with respect to the annual mean and SD of 1961-2010 in order to explain the seasonal contribution of storm activity to the annual time series. The winter (DJF) time series shows the closest resemblance to the annual time series (Figure 3) with pronounced periods of high activity around 1910, 1950 and 1990. As the magnitude of DJF storminess is highest compared with the other seasons, DJF storminess contributes most to German Bight storm activity. Note that storm events in the winter 1961/62 (Lamb, 1991) contribute to the maximum of storm activity levels in early 1962. In the opposite manner, JJA shows the weakest activity levels and remains mostly unchanged between the 1890s and 1960, but decreases below its long-term average during the 1970s. Spring and autumn can be described as transitioning periods between high-activity winters and low-activity summers. The MAM maximum in 1906 coincides with strong storm events in March 1906 that led to storm surges at the North Sea coast (Lamb, 1991). Contrary to the increase in winter storminess over the past couple of years, spring storminess is on a downward trend starting in 1990. Long-term trends of autumn storm activity are strongly superimposed by year-to-year fluctuations. Depending on the temporal extent of the winter storm season, the number of windstorm events and eventually the storminess level during this transitioning period is subject to fluctuations. Therefore, the contribution of MAM and SON storminess varies from year to year, but can be described as generally higher than JJA and lower than DJF.
Our results generally resemble the reconstruction of Dutch storminess from homogenized windstorm insurance losses by Cusack (2013). Here, the time series follow a multidecadal oscillation with maxima in the 1910s, 1940s and 1980s, close to the maxima in German Bight storminess. Periods of lower storm activity levels are found during the 1930s, 1960s and since the late 1990s, largely coinciding with phases of reduced storminess over the nearby German Bight.
Note that geostrophic wind speeds and their statistics well approximate the variability of near-surface marine atmospheric wind speeds as shown in Figures 13 and  14 that illustrate the agreement for short (intraday) and long (interannual) timescales by assessing wind speeds from observations and from modern reanalysis datasets. We found correlations of up to 0.80 (0.81) for the annual (3-hourly) time scale. Further details can be found in Section 4.

| Large-scale circulation
As the German Bight is located in the eastern sector of the northeast Atlantic storm track (Feser et al., 2015), its storm climate is influenced by the large scale circulation. The NAO represents the dominant mode of large-scale atmospheric variability in the Atlantic sector that drives

F I G U R E 2
Flowchart visualization of the uncertainty estimation process the westerlies in the Atlantic midlatitudes (Stendel et al., 2016).
A comparison between German Bight and northeast Atlantic storminess (Krueger et al., 2019) is given in Figure 6. Both the annual and low-pass filtered time series of German Bight storminess are significantly positively correlated with the corresponding time series of NE Atlantic storminess at the 0.05-significance level (0.58 and 0.65 for annual and low-pass filtered 95th percentiles, 0.52 and 0.48 for annual and low-pass filtered 99th percentiles). We determined the significance of the correlations by performing a Fisher z-transformation (Fisher, 1915) on the coefficients and testing whether the transformed coefficients significantly differ from 0. As the German Bight is essentially a part of the NE Atlantic, the underlying multidecadal variability in storm activity can be found in both low-pass filtered time series. In some decades however, storminess in the two regions proves to be remarkably different from one another. In the 1910s, German Bight storminess is above average, whereas NE Atlantic storminess recedes to near-average values, leaving a gap of nearly one SD between the two curves. While both regions show a reduction in storminess during the late 1920s, the low activity during the 1930s is much more pronounced over the German Bight. In the 1940s, both low-pass filtered curves show an increase in activity with a return to sub-average levels in the 1960s. The subsequent increase in German Bight storm activity leads to a storminess maximum in the 1980s, when NE Atlantic storminess is still near the longterm average. The NE Atlantic time series reaches its maximum in the early 1990s, nearly a decade after the German Bight. In recent years, the two curves show an increase in disagreement, both for the 95th and 99th percentiles. Especially during the 2010s, a period with recurring anticyclonic circulation patterns over the German Bight during the storm seasons (DWD, 2015;2017a), German Bight storminess exhibits a downward trend to below-average levels, while NE Atlantic storminess is rising to above-average levels.
The significant positive correlation with NE Atlantic storminess implies that storm activity over the German Bight covaries with the larger-scale storm activity over the NE Atlantic. Still, there are significant differences between both the annual and low-pass filtered storminess curves for the German Bight and the NE Atlantic. One explanation for the differences can be given by the extent of the examined domain. Variability on a smaller spatial scale is more likely to be disregarded when the triangle method is applied to a larger area, as the geostrophic wind acts as a proxy for the area mean of the true wind. Because the German Bight comprises a smaller area than the entire NE Atlantic, the storminess proxy allows for consideration of smaller-scale features, thus increasing the annual variability compared to the NE Atlantic.
The annual correlation of 0.36 (0.35) between the 95th (99th) percentile of geostrophic wind speed over the German Bight and the NAO index is weak, but significant at the 0.05-significance level. The correlation indicates that the variability of German Bight storminess is only weakly connected to that of the NAO, visible as strong disagreeing periods between the time series in Figure 7. Moreover, these correlations coefficients are lower than values found for the correlation between the NAO and the NE Atlantic (Krueger et al., 2019). Additionally, the pronounced multidecadal variability in the low-pass filtered curves for both the German Bight and the NE Atlantic does not agree with the underlying variability of the NAO index indicated by low correlations of about 0.32 (0.31). However, the correlation between annual German Bight storminess and the NAO index increases to 0.52 (0.47) for 95th (99th) percentiles if only the winter season (DJF) is considered. This increase suggests that, for winter months only, the NAO is a larger contributor to the variability of German Bight storminess than in other seasons.
The connection between the NAO and storminess over the NE Atlantic has been shown to vary with time (Hanna et al., 2008;Matulla et al., 2008;Pinto and Raible, 2012). As our previously calculated correlation coefficients only take the entire 122-year time series into account, we compute moving correlations between annual German Bight storminess and the NAO index, as well as NE Atlantic storminess, over centred 31-year windows ( Figure 8). The moving correlation between German Bight and NE Atlantic storminess is high during the 1910s and 1920s with a maximum of 0.6 (0.62) for the 95th (99th) percentiles and remains between 0.4 and 0.6 until the mid-1960s. From around 1970 onward, the correlation increases to about 0.7 near the end of the period. The running correlation with the NAO is weaker and finds its minimum in the 1960s at about 0.1. Afterwards, the correlation with the NAO index increases to levels above 0.4, with maxima between 0.5 and 0.6. The changes of the correlation between the NAO index and German Bight storminess indicates that the connection between the NAO and German Bight storm climate strongly varies with time, agreeing with the findings of To further discuss multidecadal variability in German Bight storminess, we need to compare storm activity with the occurrence of dominant weather regimes in Europe. Werner and Gerstengarbe (2010) showed that the relative frequency of synoptic circulation patterns over Europe on seasonal and annual scales is subject to interannual variability. Breaking down the circulation into three categories-zonal, mixed and meridional-Werner and Gerstengarbe (2010) found a peak in the rate of zonal weather regimes around 1990 and a steep subsequent decline, similar to the behaviour of German Bight storm activity. The late 1980s also feature absolute minima of meridional weather patterns in the annual, autumn and winter time series, as well as an absolute maximum of zonal patterns in winter. Predominantly meridional periods can be found in the late 1960s and 1970s, which correspond to the phase of low activity in the German Bight. The evolution of the annual prevalence of synoptic weather setups shares many attributes with our time series of German Bight storminess. However, the method used by Werner and Gerstengarbe (2010), which quantifies frequencies by counting the days per year with a specific prevalent pattern, causes an overrepresentation of the summer months in the annual data compared to our method. Thus, periods with meridionally dominated summers and autumns and zonally dominated winters like the 1910s appear as overly meridional in the statistics, while annual German Bight storm activity, which is closely coupled to the winter months, reaches aboveaverage values in the same time frame. It is therefore important to focus on the evolution of winter weather regimes. Using winter pattern statistics only, the high storm activity periods around 1910 and 1950 can be explained by an increasing number of zonal circulation patterns, while the decrease in storminess with its negative peak during the early 1930s coincides with a shift towards a meridional regime lasting from the 1920s well into the 1940s. For recent years, disparities between German Bight storminess and indices for the large-scale circulation can be attributed to the repeated occurrence of anticyclonic patterns over Central Europe, which dampen storminess over the German Bight. A dominant example for such a case can be found in fall of 2016, when the Azores high extended across Central Europe into Siberia, steering storms away from the German Bight, while still contributing to a positive NAO index and increased storminess over the entire NE Atlantic (DWD, 2017a).

| Uncertainties
We use an ensemble of 18 time series of geostrophic wind speeds to derive a robust time series of German Bight storminess. The ensemble would bring its own estimate of uncertainty-the ensemble spread or any other measure of ensemble variability. However, as the triangles used to derive the proxy time series cover almost identical areas, the differences among the time series are relatively small, and hence would make such an estimate of uncertainty rather small. a undetected errors in the pressure readings, the data selection process, sampling and assumptions made to calculate geostrophic winds affect our estimate of geostrophic wind speed statistics and thus cause uncertainties, we utilize a bootstrapping approach to estimate and quantify these uncertainties (see Section 2.2). Figure 9 displays the time series and 95% confidence intervals of both annual and low-pass filtered German Bight storminess. Figure 10 shows the time series of the range of the aforementioned confidence intervals for annual storminess values. Figure 11 then gives an overview over the range of the 95% confidence intervals for every season. We see that uncertainty is generally smaller for the 95th than for the 99th percentiles. The uncertainty for the 95th percentile-based annual time series ranges from 0.20 to 1.14, whereas the 99th percentile counterpart lies between 0.22 and 2.21. The same effect is apparent in the seasonal time series. Higher quantiles naturally possess a greater variability, which increases the uncertainty. Both the annual and seasonal time series of uncertainties follow a similar pattern. Uncertainties are at a maximum during the early years of our dataset, peaking around and shortly after 1900. They steadily decline until the early 1950s, but show a very erratic and highly variable behaviour. Autumn uncertainties are characterized by a secondary peak during the late 1920s, which is not visible in other uncertainty time series. From the 1950s onward, uncertainties remain steady at relatively low levels, indicating that the highest confidence can be found after 1950. The general trend of annual and seasonal uncertainties can be explained by data availability (Figure 12). As all uncertainties are simulated through the application of a bootstrapping scheme, lower availability rates (derived through the quality control procedures) correspond to higher levels of uncertainty and vice versa. Pressure observations are incomplete or too sparse for various stations before the early 1950s. The unavailability of pressure data results in a reduced number of possible triangles and thus a smaller usable ensemble for storminess calculations. The lowest percentage of data availability is found during the early decades of the 20th century, explaining the highest uncertainties during that time frame. Noteworthy increases in the number of usable triangles in 1935/36 and the late 1940s lead to a stepwise decrease in uncertainty. From the 1950s onwards, with the exception of 1971, 1972, 1991 and 1992, sufficient data are available to ensure that all 18 triangles are represented in the annual storminess calculations. The remaining uncertainty that cannot be attributed to data availability results from the variability caused by the triangles not fully overlapping (see Figures 1 and 4). We can use these uncertainty estimates to explain some of the disagreements between our time series of German Bight storminess and other reconstructions mentioned in Section 3.1, especially for periods before 1950. Still, our uncertainty estimates are generally smaller than comparable uncertainties for the entire NE Atlantic by Krueger et al. (2019), which were estimated through a similar bootstrapping approach. While low-F I G U R E 9 Standardized annual 95th (top) and 99th (bottom) percentiles of geostrophic wind speeds over the German Bight, 1897-2018. Dots indicate annual values, thick dashed lines show lowpass filtered data. Error bars indicate annual 95% confidence intervals, whereas thin dashed lines show the low-pass 95% confidence interval envelope F I G U R E 1 0 Size of the 95% confidence interval for annual 95th (blue) and 99th (red) percentiles of geostrophic wind speeds over the German Bight (i.e., the size of the confidence envelopes in Figure 9), 1897-2018. Thin solid lines indicate annual values, thick dashed lines show low-pass filtered data pass filtered uncertainties after 1950 are around 0.3 (0.4) for the 95th (99th) percentile-based German Bight storminess time series, Krueger et al. (2019) found values of 0.35 (0.5) for the larger NE Atlantic. The discrepancy is also present in the seasonal time series. Here, our approach results in uncertainties in the range of 0.3-0.5 (0.6-0.8) after 1950, while the uncertainties for the NE Atlantic are higher at 0.5-0.7 (0.7-1.0) for 95th (99th) percentiles. As the NE Atlantic covers a larger area, and the respective storminess index is based on non-overlapping triangles, high spatial variability is more likely to be incorporated into the index than with our method, thus increasing the uncertainty. Similar to our uncertainty time series, limited data availability during the beginning of the period also results in increased uncertainties in the estimates by Krueger et al. (2019), followed by a subsequent decrease as soon as more data become available.

| COMPARISON WITH OBSERVATIONS AND REANALYSIS DATASETS
Despite that the time series exhibit uncertainty, the method of calculating geostrophic wind speed and their statistics is a valuable tool to make inferences about storm activity. To demonstrate the validity of the geostrophic storminess time series, we first compare 3-hourly geostrophic wind speeds with measurements of wind speeds taken at the FINO1 site in the North Sea. Afterwards, we compare our annual statistics with statistics from reanalysis datasets. Figure 13 displays wind speed measurements taken from the 61 m cup anemometer at the FINO1 research platform sampled at 3 hr intervals, as well as the average geostrophic wind over the German Bight for the period F I G U R E 1 1 Size of the 95% confidence interval for seasonal 95th (blue) and 99th (red) percentiles of geostrophic wind speeds over the German Bight, 1897-2018. Thin solid lines indicate annual values, thick dashed lines show low-pass filtered data F I G U R E 1 2 Annual data availability rates (red line) and resulting number of useable triangles (blue bars),  2004-2018. Additionally, for illustrative purposes, a 30-day moving average has been added for both time series. Both time series show a very dominant annual cycle, with the highest wind speeds typically occurring in winter. The geostrophic wind speeds are notably higher, reaching peaks above 40 m Á s −1 in multiple years, whereas the highest wind speed measured at FINO1 is 32.75 m Á s −1 . Note that sampling at 3-hourly intervals can hide the extent of wind speed variability at smaller time scales. The measured wind speed at FINO1 is generally lower than the calculated geostrophic wind over the German Bight during the same period, which is understandable as the geostrophic wind is more or less an approximation of upper atmospheric movements devoid of ageostrophic effects (e.g., Hasse, 1974). However, both time series show good correlation skills as the correlation between the two time series is 0.81 for 3-hourly data. For longer time scales, when high frequency noise is smoothed out, such as on the monthly time scale, the correlation increases to 0.93 for 30-day low-pass filtered data indicating the agreement in the variabilities of the seasonal and annual cycles. Unfortunately, we could not examine longer time scales as the measurements were too short to reliably compare both time series on interannual scales.
To demonstrate such a good agreement on longer time scales, we make use of current atmospheric reanalysis datasets, namely ERA5, ERA-Interim, MERRA2, and a combination of CFSR version 1 and version 2 for the period 1980-2018. These datasets contain records of analyses of atmospheric fields, obtained through the assimilation of observational data for meteorological variables (e.g., Kalnay et al., 1996). For such a comparison, we select all gridpoints encompassed by the respective triangles, and sample at the same time steps, that is, 3-hourly for MERRA and ERA5, 6-hourly for ERA-Interim and CFSR. We calculate upper quantiles of area-weighted means of 10m near-surface wind in the reanalyses and compare them with the results obtained. The results are shown in Figure 14. The large bias in F I G U R E 1 3 Three-hourly geostrophic wind speeds averaged over 18 triangles in the German Bight (blue) and wind speed observations from the 61 m cup anemometer at the FINO1 research platform (red), 2004-2018. Solid lines denote 30-day moving averages F I G U R E 1 4 Annual upper percentiles of reconstructed geostrophic winds (red, solid) and 10 m wind speeds from reanalysis datasets (dashed and dotted) over the German Bight, 1980Bight, -2018 absolute wind speeds shows the natural discrepancy between geostrophic winds and near-surface winds (e.g., Hasse, 1974). As ageostrophic effects are not considered by the geostrophic assumptions in our reconstruction, the reconstructed winds are higher than the nearsurface winds in the reanalysis datasets. However, we aim at comparing the underlying variabilities of both time series via a correlation analysis, which the bias does not affect. The best correlation (Table 3) between observation-based storminess and wind speed reanalysis data can be found in ERA-Interim (0.74 and 0.84 for 95th and 99th percentiles), followed by MERRA2 (0.74 and 0.80), ERA5 (0.70 and 0.79) and CFSR (0.66 and 0.79). Even though the comparison stretches among different datasets with different temporal and spatial resolutions, the comparison between observation-based statistics and modelled statistics indicates good agreement between the variability of near-surface wind and geostrophic wind on longer time scales showing that on longer time scales statistics of geostrophic wind speeds are valuable to represent storminess conditions over the German Bight, furthermore confirming Krueger and von Storch (2011).

| CONCLUSION
We reconstructed the evolution of storminess over the German Bight from 1897 to 2018 from surface pressure observations. We presented a novel way of building robust storminess time series by enhancing the established proxy method of utilizing triangles of pressure observations. Here, our approach takes advantage of an ensemble of 18 partially overlapping triangles to derive upper annual and seasonal quantiles of geostrophic wind speeds. Our results provide an update of existing indices of storm activity for more recent times, so that assessments of the variability of storm activity are more comprehensive and improved. We make evident that German Bight storminess is characterized by a dominant multidecadal variability with periods of high activity around 1910, 1950 and 1990, and low activity in the 1930s, 1970s, and the 2000s. A comparison with NE Atlantic storminess and the NAO index shows that storm activity over the German Bight is connected to the large-scale circulation over the Northeast Atlantic. Nonetheless, intermittent periods of disagreement between the indices and our time series also indicate that there are other driving factors behind German Bight storminess. By giving an estimation of uncertainty, we find that annual uncertainty is inversely related to data availability and generally reduced compared to previous studies for the NE Atlantic, owing to the smaller area and the more robust method. Our approach made use of the dense observation network around the German Bight and demonstrates that reliable and homogeneous reconstructions of the wind climate require a good coverage and availability of historical records. Other nearby areas, for instance the Baltic Sea, are less covered and hence not yet suitable for comparable investigations.
In the future however, more historical records will become available through digitizing handwritten historical records of pressure observations. These ongoing efforts of the national meteorological services are coordinated by the WMO or by international collaborations, such as the Atmospheric Circulation Reconstructions over the Earth , the International Comprehensive Ocean-Atmosphere Data Set (Freeman et al., 2017), and will provide complimentary information for reliable reconstructions of the past storm climate. For German coasts, first analyses of such pressure reconstructions from handwritten records have been shown to provide useful information for the assessment of past storm activity (Wagner, 2016) and with additional records becoming available in the future, more detailed reconstructions of storminess in other areas become possible.