Long-Term Foehn Reconstruction Combining Unsupervised and Supervised Learning
ABSTRACT
Foehn winds, characterised by abrupt temperature increases and wind speed changes, significantly impact regions on the leeward side of mountain ranges, e.g., by spreading wildfires. Understanding how foehn occurrences change under climate change is crucial. As foehn is a meteorological phenomenon, its prevalence has to be inferred from meteorological measurements employing suitable classification schemes. Hence, this approach is typically limited to specific periods for which the necessary data are available. We present a novel approach for reconstructing historical foehn occurrences using a combination of unsupervised and supervised probabilistic statistical learning methods. We utilise in situ measurements (available for recent decades) to train an unsupervised learner (finite mixture model) for automatic foehn classification. These labelled data are then linked to reanalysis data (covering longer periods) using a supervised learner (lasso or boosting). This allows us to reconstruct past foehn probabilities based solely on reanalysis data. Applying this method to ERA5 reanalysis data for six stations across Switzerland and Austria achieves accurate hourly reconstructions of north and south foehn occurrence, respectively, dating back to 1940. This paves the way for investigating how seasonal foehn patterns have evolved over the past 83 years, providing valuable insights into climate change impacts on these critical wind events.
1 Introduction
Foehn winds are downslope winds on the leeward side of mountains and can be found all around the world in areas with pronounced topographical features that impede the airflow, such as the European Alps, the Southern Alps in New Zealand, mountain ranges along the Mediterranean Sea or the Rocky Mountains. Depending on the region, these winds are given specific names such as Santa Ana winds (Southern California; Sergius, Ellis, and Ogden 1962; Rolinski, Capps, and Zhuang 2019), Chinook (Rocky Mountains; Armi and Mayr 2015), Bora (Croatia; Grisogono and Belušić 2009), Zonda (Andes, Argentina; Norte 2015), Raco (Chile; Muñoz and Armi 2024), Jintsu-Oroshi and Inami-Kaze (Japan; Kusaka et al. 2021; Koyanagi and Kusaka 2020), Halny (Tatra Mountains, Poland; Śliwińska and Ciaranek 2015; Grajek and Bednorz 2024) or Foehn (Central Europe and New Zealand; McGowan and Sturman 1996; Richner and Hächler 2013; McClung and Mass 2020). For historical reasons, ‘foehn’ has become a synonym for this type of terrain-induced wind phenomenon.
Often, foehn is characterised by a sharp increase in wind speed and sudden changes in temperature and relative humidity, which can have a strong influence on the local climate and the people living in the affected areas. While foehn is often associated with a mild (and typically dry) climate, strong foehn events can also cause extensive damage to vegetation and man-made structures. In some areas, it is not uncommon for strong foehn gusts to overturn trucks or vans, or for airports and harbours to be closed due to unsafe conditions. In addition, the strong and dry winds can kindle and spread domestic fires and wildfires (Schoennagel, Veblen, and Romme 2004; Reinhard, Rebetez, and Schlaepfer 2005; Zumbrunnen et al. 2009) or affect the development of Antarctic ice shelves (e.g., Cape et al. 2015; Elvidge et al. 2020).
While the conceptual model of foehn is well established (Armi and Mayr 2007, 2011; Mayr and Armi 2008; Richner and Hächler 2013), its prevalence must be derived from physical quantities such as wind, temperature and relative humidity. During the last decades, several (semi-)automatic methods have been developed that allow the differentiation of ‘foehn’ and ‘no-foehn’ events based on in situ measurements from automated weather stations (AWSs). Among the frequently used algorithms are Widmer's föhn index (Widmer 1966; Courvoisier and Gutermann 1971), which is based on Fisher's linear discriminant analysis to distinguish between two or more distinct classes, and enhanced versions of it (e.g., Jansing et al. 2022). Other studies use decision-based or tree-based methods to classify foehn events (e.g., Dürr 2008; Speirs et al. 2013; Cape et al. 2015; Turton et al. 2018; Datta et al. 2019; Elvidge et al. 2020; Laffin et al. 2021; Francis et al. 2023), all of which are deterministic methods, where the thresholds have often been selected manually. To overcome these limitations, Plavcan, Mayr, and Zeileis (2014) proposed a method based on finite mixture models for automatic and fully parametric probabilistic foehn classification. To perform the classification, all methods require AWS measurements with high temporal resolution (ideally sub-hourly), which are typically only available for recent decades. While this allows for the classification and analysis of foehn when the AWS provides sufficient data, it does not offer information on foehn prior to the installation of the AWS, during outages, or after the AWS has been decommissioned.
Additional information on the atmospheric conditions from numerical (re-)analysis models is an excellent source to complement the in situ measurements. Reanalysis data sets are typically produced by physically based numerical weather prediction (NWP) models and sophisticated data assimilation schemes that use all available observations to estimate the ‘best known’ atmospheric state. An example is the global reanalysis data set ERA5 (Hersbach et al. 2023a, 2023b) from the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 provides global hourly four-dimensional atmospheric conditions with a horizontal resolution of 0.25°×0.25° (approximately 28 km at the equator) back to 1940. However, this comes with its own challenges. Due to technical and computational limitations, ERA5 and other NWP models with a similar (or coarser) resolution can only approximate the real world and cannot resolve small-scale atmospheric processes and topographic features, which are important for small-scale phenomena such as foehn.
One way to overcome these limitations is to combine AWS measurements and reanalysis data using statistical or machine learning techniques. A classification of foehn based on AWS data serves as the response (target/outcome/labels) for a supervised model that uses reanalysis data as explanatory variables (inputs/covariates). Once the relationship between the two sets of data has been learned, the models can be used to predict the expected state (‘foehn’ or ‘no foehn’) for periods for which reanalysis data are available, but AWS measurements are not. This is also known as (statistical) downscaling or post-processing and has been used for foehn modelling in some variations. For nowcasting foehn at a station in Switzerland (Altdorf), Sprenger et al. (2017) use data from a local 7 km analysis data set (COSMO-7) for their adaptive boosting algorithm (AdaBoost). Laffin et al. (2021) use both ERA5 and the Regional Atmospheric Climate Model 2 (RACMO2; Wessen and Laffin 2022) combined with tree-based gradient boosting (XGBoost; Chen and Guestrin 2016) to predict foehn on the Antarctic Peninsula. XGBoost is also used by Mony, Jansing, and Sprenger (2021) to investigate future changes to foehn frequency in Switzerland.
This study proposes a novel probabilistic approach that combines unsupervised and supervised machine learning methods to bridge the gap between in situ AWS measurements and ERA5 reanalysis data to diagnose foehn. This combined approach allows us to reconstruct long-term, high-resolution foehn conditions over several decades. ERA5 data enable us to reconstruct the probability of foehn occurrence with hourly temporal resolution dating back to 1940, long before the AWSs were installed.
Our approach also allows for the identification of potential long-term changes in foehn occurrence in the European Alps from this high-resolution reconstruction. The validity of this combined approach is demonstrated by applying it to six stations located both north and south of the main Alpine ridge to show the method's effectiveness for both north and south foehn wind situations.
Figure 1 shows a schematic representation of the proposed approach. For all six locations, data are available from an AWS at the target location as well as from a nearby mountain station for the last 14–22 years (Section 2.1) at a 10-min temporal resolution. A Gaussian mixture model (unsupervised learning; Section 3.1) is used for foehn classification. The result is then aggregated into binary time series (‘foehn’/‘no foehn’) with an hourly temporal resolution to match the resolution of the ERA5 data used in the next step. After combining the different data sources, supervised learning (Section 3.2) is used to find the relationship between a variety of interpolated and derived variables from ERA5 (Section 2.2) and the classified events. Once these statistical models have been estimated, foehn can be reconstructed (Section 3.3) for the whole period from 1940 to 2022, allowing the investigation of possible trends and/or seasonal changes over the past decades (Section 3.4).
2 Data
Section 2.1 describes the measurement data utilised for foehn classification along with the study area and the target stations. Section 2.2 explains the reanalysis data set and its pre-processing for the supervised learning method, along with the reconstruction process.
2.1 In Situ Measurements
This study utilises data from six AWSs situated across Switzerland and the western part of Austria, all positioned at the bottom of valleys known to be affected by foehn winds. Four of these stations are located north of the main Alpine ridge, while two are located in the canton of Ticino (Switzerland) south of the main Alpine ridge. Whilst the stations north of the Alps are prone to south foehn, the stations south of the Alps are known for the presence of north foehn.
An additional AWS upstream near the crest of the main Alpine range improves the accuracy of the foehn classification (Plavcan, Mayr, and Zeileis 2014). The stations Innsbruck and Ellbögen utilise data from Sattelberg, and the remaining four stations in Switzerland use observations from station Gütsch.
All stations provide data on mean wind speed, wind direction, air temperature and relative humidity at a 10-min temporal resolution. Table 1 displays the locations and data availability of these stations, while Figure 2 depicts a map illustrating their geographical position and the surrounding topography.
Station | Type | Location | Data availability |
---|---|---|---|
Gütsch (Andermatt)^a | Crest | 46.653° N / 8.616° E, 2286 m | 2005-01-01–2023-12-31 (95.3%) |
Altdorf^a | South | 46.890° N / 8.620° E, 438 m | 2005-01-01–2022-12-31 (78.1%) |
Montana^a | South | 46.290° N / 7.460° E, 1423 m | 2005-01-01–2022-12-31 (77.1%) |
Comprovasco^a | North | 46.460° N / 8.935° E, 576 m | 2005-01-01–2022-12-31 (85.8%) |
Lugano^a | North | 46.004° N / 8.960° E, 205 m | 2005-01-01–2022-12-31 (90.2%) |
Sattelberg^b | Crest | 47.011° N / 11.479° E, 2107 m | 2006-01-01–2022-12-31 (75.0%) |
Ellbögen^b | South | 47.200° N / 11.430° E, 1080 m | 2006-01-01–2022-12-31 (92.0%) |
(Universität) Innsbruck^c | South | 47.260° N / 11.385° E, 578 m | 2009-06-21–2022-12-30 (99.1%) |
- Note: Observations provided by the Swiss national weather service MeteoSwiss (a), the University of Innsbruck (b) and the Austrian national weather service GeoSphere Austria (c). Four stations are used to model south foehn, two to model north foehn, and two provide information at the mountain crest (cf. Type).
2.2 ERA5 Reanalysis
This study makes use of ERA5 reanalysis data, which are publicly accessible via the Copernicus climate data store (Hersbach et al. 2023a, 2023b). ERA5 offers four-dimensional gridded data with an hourly temporal resolution (starting from 1940) on a spatial grid of 0.25°×0.25° (roughly 28 km in latitude by 19 km in longitude for Central Europe).
The data from 90 different fields (30 single-level fields and 60 pressure-level fields; see Tables S1 and S2 in Appendix S1) are bilinearly interpolated to the geographical location of the six stations of interest. Based on the 90 interpolated values, a series of derived variables is calculated, including vertical temperature gradients and level thickness, resulting in a total of 155 variables. These 155 variables are referred to as the ‘direct’ variable set as they solely rely on information retrieved directly from the geographical location of the corresponding target station.
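The interpolation step can be illustrated with a minimal sketch of bilinear interpolation within a single grid cell. The function name, the argument layout and the temperature values are hypothetical and only serve as an illustration; the actual pre-processing may rely on dedicated tooling.

```python
def bilinear_interpolate(lon, lat, lon0, lon1, lat0, lat1, f00, f10, f01, f11):
    """Interpolate a gridded field to (lon, lat) inside one grid cell.

    f00 is the field value at (lon0, lat0), f10 at (lon1, lat0),
    f01 at (lon0, lat1) and f11 at (lon1, lat1).
    """
    # Fractional position of the target point inside the cell (0..1).
    wx = (lon - lon0) / (lon1 - lon0)
    wy = (lat - lat0) / (lat1 - lat0)
    # Interpolate along longitude on both latitude rows, then along latitude.
    lower = (1.0 - wx) * f00 + wx * f10
    upper = (1.0 - wx) * f01 + wx * f11
    return (1.0 - wy) * lower + wy * upper


# Hypothetical 2 m temperatures (K) at the corners of the 0.25° cell
# enclosing Altdorf (46.890° N, 8.620° E; Table 1).
t2m_altdorf = bilinear_interpolate(
    lon=8.620, lat=46.890,
    lon0=8.50, lon1=8.75, lat0=46.75, lat1=47.00,
    f00=281.0, f10=282.0, f01=279.0, f11=280.0,
)
```

With these made-up corner values, the interpolated value lies between the four corners, weighted by the distance of the station to each grid point.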
Since foehn results from atmospheric conditions on a scale larger than the station scale (e.g., McGowan and Sturman 1996; Mayr and Armi 2010; Richner and Hächler 2013; Armi and Mayr 2015; Kusaka et al. 2021; Stoev, Post, and Guerova 2022), relying solely on data at the target location might be insufficient. Therefore, additional information from the surrounding area is incorporated by extracting data from ERA5 at a series of neighbouring points arranged in a ‘star’ formation around the target location. Figure 2 shows the target locations (C; centre) and the neighbouring points used. These neighbouring points are positioned geographically relative to the target station upstream (U) and downstream (D) of the main foehn wind direction as well as to the right (R) and left (L) of it.
While the interpolated information from the target location itself (C) is always used as a possible covariate for the statistical models, the values interpolated at the neighbouring points are not directly employed but are instead used for the calculation of the derived/augmented variables. In combination with the ‘direct’ variables, a list of additional derived variables is calculated, such as spatio-temporal temperature and pressure differences. For example, spatial differences in surface pressure are calculated between neighbouring points (e.g., C-D, U-C, UU-DD, UL-DR, UR-DL; see Figure 2) as well as temporal changes over 3–6 h at these neighbouring points. Taking spatial differences of these temporal changes yields the spatio-temporal information. Finally, the first- and second-order harmonics of the day of the year are included to capture seasonal variation. In total, this yields 497 variables: four harmonics, 155 direct variables, 136 spatial variables, 120 temporal variables and 82 spatio-temporal variables. This expanded set is referred to as the ‘full’ variable set. Further details regarding the construction of the neighbouring ‘star’ can be found in Appendix S1; for a comprehensive list of all variables, see the materials at https://doi.org/10.48323/gdkr5-7tt45.
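The construction of the derived variables can be sketched as follows. This is an illustrative example only: the function names, the period of 365.25 days for the harmonics and the pressure values are assumptions, not taken from the study.

```python
import math


def seasonal_harmonics(day_of_year, period=365.25):
    """First- and second-order harmonics of the day of the year (four values).

    The period of 365.25 days is an assumption of this sketch.
    """
    phase = 2.0 * math.pi * day_of_year / period
    return {
        "sin1": math.sin(phase), "cos1": math.cos(phase),
        "sin2": math.sin(2.0 * phase), "cos2": math.cos(2.0 * phase),
    }


def spatial_difference(values, a, b):
    """Difference of one field between two star points, e.g. 'U' minus 'C'."""
    return values[a] - values[b]


def temporal_change(series, t, lag):
    """Change of a value at one point over the previous `lag` hours."""
    return series[t] - series[t - lag]


# Hypothetical hourly surface pressure (hPa) at the upstream (U) and centre (C) points.
sp = {"U": [1012.0, 1011.5, 1011.0, 1010.0], "C": [1008.0, 1008.0, 1007.5, 1007.5]}
now = 3

# Spatial variable: U-C pressure difference at the current hour.
u_minus_c = spatial_difference({k: v[now] for k, v in sp.items()}, "U", "C")
# Temporal variables: 3 h pressure change at each point.
change_3h = {k: temporal_change(v, now, 3) for k, v in sp.items()}
# Spatio-temporal variable: spatial difference of the temporal changes.
spatio_temporal = spatial_difference(change_3h, "U", "C")

# The group sizes reported in the text add up to the 'full' variable set.
n_full = 4 + 155 + 136 + 120 + 82
```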
3 Methodology
Section 3.1 introduces the unsupervised learning model used for foehn classification, followed by the supervised learning models in Section 3.2. The results from Section 3.2 are then used to reconstruct hourly foehn occurrence over the past decades (Section 3.3), which is analysed employing season-trend decomposition in Section 3.4.
3.1 Unsupervised Learning: Mixture Model for Foehn Classification
The posterior probability for a ‘foehn’ observation at a specific time and station is modelled as a function of the 10-min in situ measurements (Section 2.1). We employ a two-component Gaussian mixture model (Grün and Leisch 2008) with concomitants, closely following the method proposed by Plavcan, Mayr, and Zeileis (2014) and implemented in the R package foehnix (Stauffer 2023).
The prerequisite for an observation to be used for classification is that the wind direction at the location of interest falls within the prevailing foehn direction at the target location, and that wind from a specific direction is also prevalent at the corresponding crest station at the same time (details in Table S3). Only the periods matching this precondition are used for estimating the Gaussian mixture model, while the foehn probability is set to zero for all remaining observations.
The underlying assumption is that two unobservable Gaussian components (or clusters) exist, one describing ‘foehn’ conditions and the other ‘no foehn’ conditions. To distinguish between these two components, a main covariate is required. In this study, the potential temperature difference ($\Delta\theta$) between the valley station (temperature $T_{\mathrm{valley}}$) and the crest station (temperature $T_{\mathrm{crest}}$) is computed with the dry-adiabatic lapse rate ($\Gamma$), yielding $\Delta\theta = T_{\mathrm{crest}} + \Gamma \, \Delta z - T_{\mathrm{valley}}$, where $\Delta z$ is the height difference between the two stations (cf. Tables 1 and S3). This simplification eliminates the need for air pressure measurements at both sites, which is otherwise required when using the definition of potential temperature. During foehn events, the air descends on the leeward side of the mountains, with a potential temperature difference close to zero in the absence of significant diabatic processes, such as turbulent mixing or air entrainment from tributary valleys.
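A minimal sketch of this simplified potential temperature difference is given below. The lapse rate value of 0.0098 K per metre, the sign convention (crest minus valley) and the example temperatures are assumptions of this illustration; only the station heights are taken from Table 1.

```python
# Approximate dry-adiabatic lapse rate in K per metre (assumed value).
DRY_ADIABATIC_LAPSE_RATE = 0.0098


def potential_temperature_difference(t_valley, t_crest, dz,
                                     lapse_rate=DRY_ADIABATIC_LAPSE_RATE):
    """Crest temperature brought dry-adiabatically down to valley level,
    minus the valley temperature (sign convention assumed in this sketch).

    Temperatures in degrees Celsius (or K), dz in metres.
    """
    return t_crest + lapse_rate * dz - t_valley


# Height difference between Gütsch (2286 m) and Altdorf (438 m), see Table 1.
dz = 2286 - 438

# Hypothetical well-mixed (foehn-like) situation: difference close to zero.
dtheta_foehn = potential_temperature_difference(t_valley=18.0, t_crest=0.5, dz=dz)
# Hypothetical stably stratified situation: clearly positive difference.
dtheta_stable = potential_temperature_difference(t_valley=5.0, t_crest=0.0, dz=dz)
```

In this toy example, the foehn-like case yields a difference well below 1 K, whereas the stable case yields a difference of about 13 K, which is the kind of separation the mixture model exploits.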
However, the potential temperature difference alone might not be sufficient to adequately separate the two states. Therefore, an additional concomitant model is employed to weigh the two components conditional on additional covariates. In this study, binary logistic regression is used for the concomitant model, employing relative humidity and mean wind speed as additional covariates. Models with this specification have been shown to work well for stations in the Alpine region (e.g., Plavcan, Mayr, and Zeileis 2014; Plavcan and Mayr 2015). Figure 3 provides an illustration of this model, depicting the use of the potential temperature difference to separate the two components (‘foehn’/‘no foehn’) and the effect of the concomitant model on the joint density. More details can be found in Appendix S2.
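To make the role of the concomitant model concrete, the following sketch evaluates the posterior ‘foehn’ probability of a two-component Gaussian mixture with a logistic concomitant model for given parameters. All parameter values and names are hypothetical, and the estimation of the parameters (typically via an EM algorithm) is not shown.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of the normal distribution."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def foehn_posterior(delta_theta, rh, ff, comp_nofoehn, comp_foehn, coef):
    """Posterior probability of the 'foehn' component for one observation.

    comp_* are (mean, sd) tuples of the two Gaussian components for the
    potential temperature difference; coef holds the coefficients of the
    logistic concomitant model (relative humidity rh, wind speed ff).
    """
    # Concomitant model: prior weight of the 'foehn' component.
    eta = coef["intercept"] + coef["rh"] * rh + coef["ff"] * ff
    prior_foehn = 1.0 / (1.0 + math.exp(-eta))
    # Weighted densities of the two components.
    d_foehn = prior_foehn * normal_pdf(delta_theta, *comp_foehn)
    d_nofoehn = (1.0 - prior_foehn) * normal_pdf(delta_theta, *comp_nofoehn)
    return d_foehn / (d_foehn + d_nofoehn)


# Hypothetical parameters, for illustration only.
params = dict(
    comp_foehn=(0.5, 1.0),       # mean and sd of delta theta (K) under 'foehn'
    comp_nofoehn=(6.0, 2.5),     # mean and sd of delta theta (K) under 'no foehn'
    coef={"intercept": -1.0, "rh": -0.05, "ff": 0.5},
)

# Dry, windy, well-mixed observation versus humid, calm, stable observation.
p_dry_windy = foehn_posterior(delta_theta=0.6, rh=35.0, ff=9.0, **params)
p_humid_calm = foehn_posterior(delta_theta=7.0, rh=85.0, ff=1.0, **params)
```

With these made-up parameters, the first observation receives a posterior probability close to one and the second a probability close to zero.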
3.2 Supervised Learning: Modelling Foehn Probability
lasso: Logistic regression with lasso (L1) regularisation (Friedman, Hastie, and Tibshirani 2010; Tay, Narasimhan, and Hastie 2023).
stabsel: Logistic regression with lasso-based stability selection (Meinshausen and Bühlmann 2010).
xgboost: Extreme gradient boosting (Chen and Guestrin 2016).
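For orientation, lasso-regularised logistic regression can be written as a penalised maximum likelihood problem. The notation below is generic textbook notation and an assumption of this sketch rather than the notation of the study: $y_i \in \{0, 1\}$ is the hourly ‘foehn’/‘no foehn’ label, $x_i$ the vector of (standardised) ERA5 covariates, and $\lambda \ge 0$ the regularisation parameter.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0, \beta} \;
  -\frac{1}{n} \sum_{i=1}^{n} \Big[ y_i \eta_i - \log\big(1 + e^{\eta_i}\big) \Big]
  + \lambda \sum_{j=1}^{p} |\beta_j|,
\qquad \eta_i = \beta_0 + x_i^\top \beta
```

The L1 penalty shrinks many coefficients exactly to zero, so that only a subset of the candidate covariates enters the fitted model.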
In order to investigate the possible benefits of incorporating large-scale information from the stations' neighbourhood, two variations of each learner are considered: One utilising the ‘full’ set of 497 variables and one only using the ‘direct’ set of 155 variables (Section 2.2).
To account for location and time of day, separate models are estimated for each of the six stations for each hour of the day (0000 UTC, 0100 UTC, …, 2300 UTC), resulting in a total of 864 models (6 stations × 24 hours × 3 learners × 2 variable sets). Depending on the station, the training data for these models include 10–18 years of data (see Table 1). In addition, a six-fold cross-validation (CV) is performed using a fixed period of 12 years (2011–2022) where, in each fold, two consecutive years are left out as test data. Ellbögen and Innsbruck are missing one fold (with test data 2013–2014) where no measurements from the crest station are available, and thus the classification is not possible.
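The blocked cross-validation design can be sketched as follows. The function name and the returned structure are hypothetical; the sketch only assumes that the six folds are formed from consecutive, non-overlapping pairs of years.

```python
def cv_folds(first_year=2011, last_year=2022, years_per_fold=2):
    """Blocked CV: each fold leaves out consecutive years as test data."""
    years = list(range(first_year, last_year + 1))
    folds = []
    for start in range(0, len(years), years_per_fold):
        test = years[start:start + years_per_fold]
        train = [year for year in years if year not in test]
        folds.append({"test": test, "train": train})
    return folds


folds = cv_folds()

# Model count: stations x hours of the day x learners x variable sets.
n_models = 6 * 24 * 3 * 2
```

Each fold then trains on ten years and tests on the two years left out; the second fold (test years 2013–2014) is the one that is unavailable for Ellbögen and Innsbruck.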
3.3 Long-Term Foehn Reconstruction
Once the models from the previous section are estimated, they can be applied to the entire ERA5 period available. Although this is a prediction from a statistical perspective, it is termed ‘reconstruction’ in this article, as these predictions are applied backwards in time. The result is an hourly time series of foehn probabilities from January 1, 1940 to December 31, 2022 (83 years).
This high-resolution reconstruction can serve as input for a variety of applications and analyses. To demonstrate the potential, we analyse the foehn occurrence from a climatological perspective: Did the occurrence of foehn increase/decrease along with the changing climate over the decades? Are there changes in the seasonal or diurnal patterns? These questions are investigated in more detail in the next section.
3.4 Season-Trend Decomposition
The comprehensive reconstructed data set allows for the study of foehn occurrence in a climatological context. For this analysis, the hourly probability (Equation 3) is aggregated by (i) taking the highest probability per day (0000 UTC–0000 UTC), before (ii) calculating monthly averages. The resulting time series contains “monthly means of the daily maxima”, which are then modelled using a season-trend decomposition.
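The two aggregation steps can be illustrated with a short sketch. Function and variable names as well as the probabilities are hypothetical, and calendar days in UTC are assumed as the daily aggregation window.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_mean_of_daily_max(times, probs):
    """Aggregate hourly probabilities to monthly means of the daily maxima."""
    # Step (i): highest probability per calendar day (UTC).
    daily_max = {}
    for t, p in zip(times, probs):
        day = t.date()
        daily_max[day] = max(p, daily_max.get(day, 0.0))
    # Step (ii): average of the daily maxima per (year, month).
    by_month = defaultdict(list)
    for day, p in daily_max.items():
        by_month[(day.year, day.month)].append(p)
    return {month: sum(v) / len(v) for month, v in sorted(by_month.items())}


# Three days of hypothetical hourly probabilities across a month boundary.
start = datetime(1940, 1, 30, 0)
times = [start + timedelta(hours=h) for h in range(72)]
probs = [0.05] * 24 + [0.1] * 24 + [0.2] * 24
probs[14] = 0.9  # a short foehn episode on 30 January

monthly = monthly_mean_of_daily_max(times, probs)
```

In this toy example, January is represented by the daily maxima 0.9 and 0.1 (mean 0.5) and February by a single daily maximum of 0.2.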
Due to the nature of the data, there is a large year-to-year but also within-year variability depending on the prevailing weather situation. To decompose the signal, a season-trend decomposition is employed, separating the signal into a long-term trend, a seasonal component and a remainder component containing the residual variability.
4 Results
First, this section investigates which insights can be gained from the reconstruction about the foehn occurrence at the six different target stations. Different temporal scales are considered for this, namely: Inter-annual changes in the foehn probabilities in Section 4.1, long-term trends and seasonal patterns in Section 4.2, and changes in the diurnal patterns across decades in Section 4.3. All of these results are based on the reconstruction using the ‘lasso’ learner with the ‘full’ covariate set for the full time period (without CV).
Second, the performance of the supervised learning model is assessed under different model specifications, namely: Using the ‘full’ set of all 497 variables versus the 155 ‘direct’ variables only in Section 4.4 and comparing the performance of the three supervised learners (lasso, stability selection, extreme gradient boosting) in Section 4.5. All of these results are based on out-of-sample Brier scores (BSs) obtained in a six-fold CV.
4.1 Average Annual Foehn Probabilities
The primary outcome of this study is the complete reconstruction of hourly foehn probabilities over 83 years (see Sections 3.1, 3.3), yielding time series of individual hourly probabilities (Equation 3). To carve out inter-annual variations in the foehn probabilities, we aggregate the reconstructed data by taking the daily maximum of the hourly probabilities before calculating annual means. This can be interpreted as the average probability of observing a foehn event on any given day within that year.
Figure 4 contains the result for all six stations, with Ellbögen exhibiting the highest mean annual probabilities, while Altdorf and Innsbruck show the lowest. Additionally, the annual mean of daily maxima from the classification is shown for years with at least 80% of measurements available at the AWSs. The results show an overall good agreement between the two signals from the reconstruction and the classification, with some larger gaps due to data availability as well as some noticeable differences for specific stations in particular years.
The reconstruction reveals a pronounced inter-annual variability, with certain years exhibiting a much higher annual mean than the long-term average, whilst others fall distinctly below it. This variability is not random, as one can see similar patterns among the four south-foehn stations. For instance, all four stations show unusually high mean probabilities for 1951 and 1972. Similarly, the two north-foehn stations exhibit similar temporal behaviour.
Moreover, the figure suggests a possible increase in the annual mean foehn probability for south-foehn stations between 1940 and 1980. Hence, this question is investigated in more detail in the next section.
4.2 Climatological Trends and Seasonal Patterns
In this section, the analysis from the previous section is taken a step further. Rather than focusing on the inter-annual variation, the goal is to bring out the long-term climatological trends and changes in the seasonal patterns. Hence, the reconstructed hourly time series are again aggregated, but to monthly (rather than annual) means of the daily maxima of the hourly probabilities. Based on the season-trend decomposition outlined in Section 3.4, Figure 5a illustrates the resulting smooth trends along with the corresponding confidence intervals and the long-term average. Figure 5b depicts the corresponding smoothly varying seasonal signals averaged over decades for visual purposes (where the decade 1940 corresponds to the years 1940–1949 etc.).
The estimated trends (Figure 5a) show a slight increase across all six stations. At the two stations in Western Austria (Innsbruck, Ellbögen), an increase can be seen between 1940 and 1980, followed by a plateau. The other four stations exhibit a linear change over time. Although these changes are small in absolute terms, they are statistically significant for four of the six stations (all except Montana and Comprovasco). Here, significance indicates that the trend differs from a constant because the long-term average falls outside of the corresponding 95% confidence interval.
Figure 5b shows the analysis of the seasonal changes, which reveals the different characteristics of north-foehn stations and south-foehn stations. The two stations located south of the main Alpine ridge, Comprovasco and Lugano, show one distinct maximum in spring and a minimum during autumn. This pattern is stable over the entire study period, and no changes in the seasonal pattern are found. The picture looks different when focusing on the four south-foehn stations, which all show two maxima in spring and autumn with a lower probability of foehn occurrence during summer and winter. Although not significant, the season-trend decomposition indicates an increase in the probability of foehn in spring (April, May) as well as in autumn (October, November) with a slight decrease in late summer (August, September).
4.3 Diurnal Variability
For certain applications, information about diurnal patterns and their changes over time can be of great interest. With the hourly temporal resolution of the reconstruction, such insights are now possible across several decades.
Figure 6 shows Hovmøller diagrams for Ellbögen, depicting the decadal mean probability per time of day and month. Despite pronounced variability between the decades, the plot supports the previous findings showing an overall increase over time with the strongest increase in spring (April, May) and in autumn (October, November). In addition, this visualisation gives insights into the diurnal pattern. Generally, foehn occurrence in Ellbögen is more likely during the day (around 1000 UTC–2200 UTC) in spring and autumn, the period where the indication for a certain increase was found (Section 4.2). The minimum average foehn probability is in the early morning and tied to the length of the night. The time of the minimum varies somewhat interdecadally.
The Hovmøller diagrams for the remaining stations are provided in Appendix S5 (Figures S2–S6). Similar to Ellbögen, these diagrams support the findings from the previous sections while offering additional insights into the changes in diurnal patterns over the decades. For Comprovasco and Lugano, the diagrams show the same general patterns as the corresponding figures 20 and 37 from Cetti, Buzzi, and Sprenger (2015) for the time periods 1993–2003 and 2004–2014.
4.4 Benefit of Full Covariate Set
As described in Section 3.2, two variants of the supervised learning methods are estimated: One using only the 155 ‘direct’ variables and one using the 497 ‘full’ variables, including large-scale atmospheric conditions (Section 2.2). For both variants, a six-fold CV is performed (Section 3.2).
To investigate the benefit of the ‘full’ covariate set, Figure 7 shows the BSs for the ‘lasso’ model for all six stations. For each station and each variant, BSs are shown for the test data set (out-of-sample) as well as for the training data set (in-sample). This shows that the models based on the ‘full’ variable set clearly outperform the models based on the ‘direct’ variables only. Although the overall performance of the less complex models based on the ‘direct’ variable set is still decent, including the additional large-scale spatio-temporal information substantially improves the overall model performance.
In addition to the predictive skill, Figure 7 also shows the stability of the model with the largest variance in the BSs visible on the test data sets in Ellbögen. On the training data sets, the scores barely vary due to the large sample size (10 years, hourly data). A comprehensive comparison of all models and variants can be found in Appendix S6 (Figure S7).
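For reference, the Brier score is the mean squared difference between predicted probabilities and binary outcomes, with lower values indicating better predictions. A minimal sketch with hypothetical values:

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical hourly predictions and classified labels (1 = 'foehn').
observed = [1, 0, 0, 1]
bs_model = brier_score([0.9, 0.1, 0.2, 0.7], observed)
# An uninformative prediction of 0.5 everywhere serves as a simple reference.
bs_reference = brier_score([0.5, 0.5, 0.5, 0.5], observed)
```

Here the sharper predictions obtain a much lower score than the uninformative reference (0.0375 versus 0.25).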
4.5 Comparison of Supervised Learners
While the previous section focuses on the benefits of using more input data, this section compares the three different supervised learners described in Section 3.2. For simplicity, only the results for models based on the ‘full’ variable set are shown as they have been shown to outperform those only using the ‘direct’ variable set (Section 4.4).
Figure 8 illustrates that the BSs from the six-fold CV on the test set (out-of-sample) are comparable for all three supervised learners with only minor differences. The average BS is lowest for ‘lasso’, followed by ‘stabsel’ and ‘xgboost’.
On the training data (in-sample), the picture is similar for ‘lasso’ and ‘stabsel’, but ‘xgboost’ has much lower BSs. This indicates that ‘xgboost’ might be subject to some overfitting, despite careful tuning of the hyperparameters (Table S4 in Appendix S3.3).
5 Discussion and Outlook
Using a novel combination of unsupervised and supervised learning, we are able to accurately reconstruct long-term foehn time series (starting from 1940) at hourly resolution. More specifically, foehn classification is first accomplished by (unsupervised) Gaussian finite mixture models based on AWS measurements and then linked to ERA5 data using binary supervised learners such as lasso, stability selection or extreme gradient boosting. The resulting foehn reconstruction enables novel analyses, exemplified here by investigating long-term changes in trends, seasonal patterns and diurnal cycles of foehn occurrence.
The season-trend decomposition based on the period 1940–2022 reveals that all six stations considered have either experienced a linear increase in foehn occurrence (probability) over the entire study period or an increase between 1940 and the early 1980s that levelled off afterwards. Although these changes over time are statistically significant at four of the six stations (Section 4.2), they are small in absolute terms. The seasonality did not show any significant changes over time. However, the results for all south-foehn stations (Altdorf, Montana, Ellbögen and Innsbruck) indicate a slight increase in the occurrence of foehn in spring and autumn, with a slight decrease in late summer.
The high quality of the foehn reconstruction is partially due to using a large set of predictor covariates that not only contains information at the target location but also includes additional large-scale atmospheric information from the stations' surroundings. The benefits of this full set of covariates are similar for all three supervised learners considered: Logistic regression with lasso regularisation (‘lasso’), logistic-regression-based stability selection (‘stabsel’) and extreme gradient boosting (‘xgboost’). Lasso performs best in our application, closely followed by the other two learners. Some further improvements might be gained for xgboost if overfitting on the training data can be further reduced, e.g., by a different hyperparameter tuning strategy.
For a comparison to existing publications, we complement the BSs shown in the main paper by other popular scores for binary outcomes, such as the false negative rate (FNR, also known as miss rate), false positive rate (FPR, also known as false alarm rate) and percent correct (PC, also known as accuracy). Based on the best model (lasso with full covariate set), we obtain the following performances for Altdorf: 15.7%/0.4%/98.8% (FNR/FPR/PC). These align well with the existing literature and stand out for an exceptionally low FPR. Sprenger et al. (2017) report 11.8%/33.8%/96.5% and Mony, Jansing, and Sprenger (2021) report 21.4%/21.4%/97.7%. Similarly, for Lugano, we obtain 15.3%/0.7%/98.2%, while Mony, Jansing, and Sprenger (2021) report 22.1%/22.1%/97.1%. More details are included in the Appendix S6 (Table S5).
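These scores follow directly from the 2×2 contingency table of predicted versus observed classes. The sketch below uses hypothetical data and assumes that probabilities are converted to classes with a threshold of 0.5; the threshold and all names are assumptions of this illustration.

```python
def binary_scores(probabilities, observed, threshold=0.5):
    """False negative rate, false positive rate and percent correct."""
    predicted = [p >= threshold for p in probabilities]
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # hits
    fn = sum(1 for p, o in pairs if not p and o)      # misses
    fp = sum(1 for p, o in pairs if p and not o)      # false alarms
    tn = sum(1 for p, o in pairs if not p and not o)  # correct negatives
    return {
        "FNR": fn / (fn + tp),
        "FPR": fp / (fp + tn),
        "PC": (tp + tn) / len(pairs),
    }


# Hypothetical predictions for ten hours, four of which are observed 'foehn'.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.3, 0.6, 0.1, 0.2, 0.0, 0.1, 0.4]
scores = binary_scores(probabilities, observed)
```

Because ‘no foehn’ is by far the more frequent class in practice, PC can be high even when a considerable share of foehn hours is missed, which is why FNR and FPR are reported alongside it.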
The same holds for the foehn classification when compared to existing literature and the Swiss foehn index (SFI) operationally used at MeteoSwiss in terms of ‘average foehn hours per year’. The results from the Gaussian mixture model (Section 3.1) for Altdorf show an average of 482.4 h/year, which aligns well with the SFI (458.4 h/year), as well as the results reported by Dürr (2008) (478 h/year), Jansing et al. (2022) (465.8 h/year) and MeteoSwiss (2024) (477 h/year). Montana exhibits 1007.4 h/year (SFI 904.7 h/year), while Lugano and Comprovasco show 644.7 h/year (SFI 563.0 h/year, MeteoSwiss (2024) 551 h/year) and 1077.4 h/year (SFI 953.4 h/year), respectively.
Lastly, the reconstruction for Altdorf is compared to the results from Gutermann et al. (2012) and Richner et al. (2014), who provide an hourly binary ‘foehn’/‘no foehn’ time series for the period 1955–2008 at https://www.agfoehn.org. These data are aggregated to foehn hours per year and depicted in Figure 9 (grey) along with the annual number of foehn hours from our reconstruction (purple). The results from both methods closely agree for the latter half of the time period. Only in the first half does the Gutermann et al. series show systematically higher values than our reconstruction, and the timing of the change essentially coincides with the availability of AWS measurements at Altdorf, starting in June 1981. Before that date, the foehn indicators reported by Gutermann and co-workers were based on manual foehn classifications using traditional weather station recordings. Note that our reconstruction does not utilise any measurements from the station at Altdorf prior to the start of the training period in 2005.
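The aggregation from an hourly binary series to foehn hours per year can be sketched as follows. This is a minimal Python illustration under stated assumptions: each record is one hour long, the flag is binary (1 = ‘foehn’), and the function names `foehn_hours_per_year` and `mean_annual_hours` as well as the toy data are hypothetical. For a probabilistic reconstruction, a threshold or a sum of probabilities would have to be chosen first, which is not covered here.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def foehn_hours_per_year(records):
    """Sum hourly 0/1 foehn flags by calendar year.

    records: iterable of (datetime, flag) pairs, one pair per hour.
    Returns a dict {year: number of foehn hours}, sorted by year.
    """
    totals = defaultdict(int)
    for timestamp, is_foehn in records:
        totals[timestamp.year] += int(is_foehn)
    return dict(sorted(totals.items()))


def mean_annual_hours(hours_by_year):
    """Average number of foehn hours per year over all years present."""
    return sum(hours_by_year.values()) / len(hours_by_year)


# Toy data (hypothetical): six consecutive hours spanning a year boundary.
start = datetime(2005, 12, 31, 22)
flags = [1, 1, 0, 1, 1, 1]
records = [(start + timedelta(hours=i), f) for i, f in enumerate(flags)]

hours = foehn_hours_per_year(records)
print(hours)                     # {2005: 2, 2006: 3}
print(mean_annual_hours(hours))  # 2.5
```

Years with incomplete coverage would bias such averages, so in practice only years with (nearly) complete hourly records should enter the comparison.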
It is worth mentioning that the quality of the reconstruction strongly relies on the quality of the automatic foehn classification. While the two-component Gaussian mixture model works well in this study, there are stations where this approach is insufficient. For example, we have found this to be the case for Aigle, Switzerland, where a three-component mixture model (or another classifier) appears to be necessary to separate light down-valley winds, strong humid (katabatic) outflows and actual foehn situations (details not shown).
However, if the separation of the AWS measurements into ‘foehn’ or ‘no foehn’ components works well (as for the six stations presented), the different binary supervised learners are able to link this reliably to the ERA5 data, yielding excellent results. In this article, we demonstrate the value of the long-term reconstruction by identifying long-term trends over the past decades as one possible application (Section 4.2). Additional work on this aspect might be needed in the future using different methods to get more detailed insights and assess the stability of our results.
Additionally, this high-resolution reconstruction offers promising opportunities for other applications. For instance, it can be used to fill gaps in historical records or to extend ‘foehn observations’ for studies that, for example, identify synoptic circulation patterns associated with foehn in specific regions (Kusaka et al. 2021; Stoev, Post, and Guerova 2022). Furthermore, these extended datasets can provide valuable insights into the effects of foehn on different areas, such as ecology, where the warming and drying effects of frequent foehn events could significantly impact flora and fauna or increase fire hazards.
It would also be interesting to see how the approach performs in other regions around the globe or when applied to the future (forecasts) rather than the past (reconstructions), similar to the work of Zweifel (2016), Sprenger et al. (2017) or Mony, Jansing, and Sprenger (2021). Although some adjustments will be needed, particularly for longer forecast horizons where the temporal resolution of NWP outputs typically decreases, this combination of supervised and unsupervised approaches has great potential for further research.
Author Contributions
Reto Stauffer: investigation, methodology, validation, visualization, writing – original draft, writing – review and editing, formal analysis, software, data curation, conceptualization. Achim Zeileis: writing – review and editing, data curation, conceptualization. Georg J. Mayr: writing – review and editing, data curation, conceptualization.
Acknowledgements
The study is partly based on the preliminary work of Morgenstern (2020). The computational results presented here have been achieved (in part) using the LEO HPC infrastructure of the University of Innsbruck.
Conflicts of Interest
The authors declare no conflicts of interest.
Computational Details
The results in this paper were obtained using R 4.2+. The majority of data preparation and handling is done using the R packages stars 0.6.4, sf 1.0.15 and zoo 1.8.12. foehnix 0.1.6 is used for foehn classification, and the supervised learning is based on glmnet 4.1.7 and xgboost 1.7.5.1. The season-trend decomposition is based on the R package stR 0.6.
Endnotes
Open Research
Data Availability Statement
In situ observations were retrieved from the Swiss national weather service (MeteoSwiss; not publicly available) as well as Universität Innsbruck and GeoSphere Austria. The latter two are publicly available via https://acinn-data.uibk.ac.at/ and https://data.hub.geosphere.at/. ERA5 data are generated using the Copernicus Climate Change Service and are publicly available via https://cds.climate.copernicus.eu/. Results for Ellbögen and Innsbruck are available at https://doi.org/10.48323/gdkr5-7tt45.