Extreme precipitation events over northern Italy. Part I: A systematic classification with machine-learning techniques

Extreme precipitation events (EPEs) are meteorological phenomena of major concern for society. They can have different characteristics depending on the physical mechanisms responsible for their generation, which in turn depend on the large and mesoscale conditions. This work provides a systematic classification of EPEs over northern–central Italy, one of the regions in Europe with the highest frequency of these events. The EPE statistics have been deduced using the new high-resolution precipitation dataset ArCIS (Climatological Archive for Central–Northern Italy), that gathers together a very high number of daily, quality-controlled and homogenized observations from different networks of 11 Italian regions. Gridded precipitation is aggregated over Italian operational warning-area units (WA). EPEs are defined as events in which daily average precipitation in at least one of the 94 WAs exceeds the 99th percentile with respect to the climate reference 1979–2015. A list of 887 events is compiled, significantly enlarging the database compared to any previous study of EPEs. EPEs are separated into three different dynamical classes: Cat1, events mainly attributable to frontal/orographic uplift; Cat2, events due to frontal uplift with (equilibrium) deep convection embedded; Cat3, events mainly generated by non-equilibrium deep convection. A preliminary version


INTRODUCTION
Prediction of extreme precipitation events (EPEs) is a fundamental scientific challenge and of key importance to society, not only for civil protection purposes but also for water management optimization. EPEs result from interactions of different physical processes on a wide range of spatial and temporal scales, and this complexity poses challenges for their skilful forecast. Large-scale, slowly evolving flow can be predictable over many days, but convective situations, dominated by fast processes, can be characterized by upscale error growth that can severely reduce predictability (Hohenegger et al., 2006).
A deeper understanding of how the large-scale atmospheric flow interacts (especially in terms of error propagation) with local dynamical and precipitation processes is fundamental to make significant progress in extreme precipitation and flood forecasting. This interaction has been shown to change on a case-to-case basis (Craig and Selz, 2018).
Several atmospheric and geographical factors can contribute to the development of EPEs. A key element is moisture availability and its transport, a necessary condition to achieve extreme daily accumulations (Lavers and Villarini, 2015). Other factors include the presence and organization of convection, thermal and moisture stratification, precipitation efficiency, air-stream ascent mechanism and interaction with orography, proximity to the sea and vertical wind shear. The Mediterranean area is located at the end of the Atlantic storm track and, with the combination of a warm sea (especially in autumn) surrounded by high orography, presents a perfect laboratory to study the relative contribution of the different factors (Khodayar et al., 2018). A number of studies have already identified large-scale precursors of Mediterranean EPEs. Several authors highlighted the presence of an upper-level trough (Rossby wave) that enables, on its eastward movement, a warm-moist southerly airflow over the western Mediterranean basin (Massacand et al., 1998; Grazzini, 2007; Martius et al., 2008; Nuissier et al., 2011; Pinto et al., 2013). In addition, Pfahl et al. (2014) and Raveh-Rubin and Wernli (2015) have shown that more than 50% of these moist airflows are classifiable as a Warm Conveyor Belt (WCB), pointing to the importance of baroclinic instability and large-scale lifting for extreme precipitation in this region. The analysis of moisture supply for EPEs confirms a prominent role of large-scale transport with important contributions, especially in convective cases, from local sources. For example, Winschall (2013) and Winschall et al. (2014) have shown a high event-to-event variability in moisture supply. They identify water vapour coming from remote origins such as the North and subtropical Atlantic as a major contributor for stratiform precipitation, while a greater contribution comes from local moisture sources, like evaporation from the Mediterranean Sea, when Mesoscale Convective Systems (MCS) produce heavy precipitation. Within the WCB of extratropical cyclones, strong moisture advection usually occurs in narrow filaments of high integrated water vapour, called atmospheric rivers. Studies have indicated that atmospheric rivers can be a precursor of heavy precipitation in mountainous areas, also in Europe as shown by Lavers and Villarini (2013).
Given this large body of previous studies highlighting both large-scale components and significant contributions of local convective processes leading to EPEs (Ducrocq et al., 2014), it is desirable to condense this knowledge by developing a systematic classification of EPEs. Inevitably, such a classification will introduce simplifications with respect to physical processes acting in nature, but it may prove useful to gain a deeper understanding. In an operational context, this may help forecasters to build conceptual models for different kinds of EPEs, while in research it will allow us to study predictability for each specific category separately. Some authors have already dealt with precipitation classification methods, first looking only at precipitation data (Llasat, 2001; Pinto et al., 2013), or combining two-dimensional (2D) radar data and neural network classification algorithms to discriminate between frontal and convective precipitation (Walther and Bennartz, 2006). Molini et al. (2011) classified severe rainfall events based on hydro-meteorological and dynamical criteria over a period of 3 years.
Expanding the Molini et al. (2011) approach, we propose a categorization method which considers dynamic upper-air variables and the thermodynamic state, in addition to precipitation data. Our goal is to discriminate between three categories of EPE: those of frontal origin, those generated by deep convection, and an intermediate category. In this respect, a machine-learning approach provides an innovative framework to achieve this classification. Among its advantages are easy-to-generalize methods, efficient handling of a large number of predictors, integration of physical understanding into statistical models and exploration of additional information from the data, as shown in a series of applications related to high-impact weather recognition by McGovern et al. (2017). K-means clustering has been widely used for clustering weather patterns (see e.g. a similar approach applied to precipitation over Greece by Houssos et al. (2008)). However, the combination of K-means plus random forest refinement used here (see sections 2.3 and 4.2 for a brief description of the two algorithms) is novel. This combination produces a better separation of EPEs into three different categories, outperforming the subjective classification.
We restrict our analysis to northern-central Italy, an area very prone to these phenomena with numerous cases documented and described in the literature. Isotta et al. (2014) show that this region is one of the areas in Europe with the highest fraction of high-intensity precipitation days compared to the total number of wet days. Our EPE database contains 887 events spanning a period of 37 years, thus significantly increasing the number of cases compared to previous studies. For instance, this is a 10-fold increase compared to Molini et al. (2011).
After describing in detail the datasets used and the choice of the predictors in section 2, we present the seasonal distribution of the EPEs and comment on its connection with the seasonality of the large-scale forcing in section 3. In section 4 we show the clustering criteria. In section 5 we discuss the results, illustrate the characteristics of the different EPE categories, and focus in particular on events classified in category 2, for which we show two example cases. We conclude in section 6.

DATA AND METHODS
This study is based upon three complementary data sources:
1. ECMWF ERA-Interim reanalyses for atmospheric fields (Dee et al., 2011)
2. Northern-central Italy daily precipitation dataset ArCIS
3. Italian warning-area shape data (provided by Italian Department of Civil Protection) used to compute precipitation area averages
ArCIS (Archivio Climatologico per l'Italia Centro-Settentrionale, Climatological Archive for Central-Northern Italy) is a gridded precipitation dataset (5 km × 5 km) derived from 1,762 rain-gauges that belong to different networks of 11 Italian regions plus a number of stations of adjacent Alpine nations. The area covered is north-central Italy, at daily temporal resolution for the period 1961-2015. Input data are checked for quality, time consistency, synchronicity and statistical homogeneity. Data are interpolated using a modified Shepard scheme. A full description of the dataset can be found in Pavan et al. (2019). The 24 h accumulation period follows the best practice of the Italian Hydrological Service, running from 0800 UTC of one day to 0800 UTC of the next. That means that the nominal time of precipitation records is shifted by plus one day with respect to most of the hours in which rain has potentially been accumulated. This is taken into account by subtracting one day when comparing with daily mean reanalysis data.
Precipitation is aggregated over warning-area units (WA) provided by the Italian Department of Civil Protection, where they are used operationally for the national warning system. WAs are defined by a suitable aggregation of subregional hydrological basins (the WA definition can be found, in Italian, at http://www.protezionecivile.gov.it/attivita-rischi/schede-tecniche/dettaglio/-/asset_publisher/default/ content/zone-di-aller-3). The goal is to obtain homogeneous areas with respect to the type and intensity of meteo-hydrological phenomena within a given territory. North-central Italy is subdivided into 94 WAs (displayed in Figure 1) with the naming convention being an abbreviation of the administrative region followed by an alphanumeric code. Their area extension ranges from the smallest domain in Tuscany of 192 km² (Tosc-S3) to the largest domain in the Trentino Alto-Adige Alpine region of 7,398 km² (Tren-A). The mean area extension is 1,750 km².
First, we compute the daily spatially averaged precipitation and spatial standard deviation for each WA for the period 1979-2015. Secondly, we compute precipitation percentiles considering wet days only (daily accumulation greater than or equal to 1 mm). EPEs are subsequently defined as days with daily precipitation greater than or equal to the 99th percentile in one or more WAs. A description of each area, including its precipitation percentile values, is provided in Table S1 in File S1. Note that with this upscaling approach we are implicitly disregarding localized events smaller than roughly 300 km².
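The EPE definition above can be sketched in a few lines. This is a minimal illustration, assuming the area-averaged precipitation is already available as a days × WAs array; the function name `detect_epe_days` and the use of NumPy's default percentile interpolation are assumptions, not details taken from the study.

```python
import numpy as np


def detect_epe_days(precip, wet_threshold=1.0, percentile=99.0):
    """Flag EPE days from daily WA-averaged precipitation (mm).

    precip has shape (n_days, n_areas). Returns a boolean vector of
    EPE days and the per-area percentile thresholds.
    """
    precip = np.asarray(precip, dtype=float)
    # Percentiles are computed on wet days only (>= 1 mm by default)
    wet_only = np.where(precip >= wet_threshold, precip, np.nan)
    thresholds = np.nanpercentile(wet_only, percentile, axis=0)
    # A day is an EPE if at least one area reaches its own threshold
    exceed = precip >= thresholds
    return exceed.any(axis=1), thresholds
```

Each WA gets its own threshold, so a day counts as an EPE as soon as a single area reaches its local 99th percentile, regardless of the other areas.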
Fields from European Centre for Medium-range Weather Forecasts (ECMWF) ERA-Interim reanalyses are retrieved at 6 h intervals, temporally accumulated to daily resolution and spatially averaged over a box covering north-central Italy (indicated by the blue rectangle in Figure 1 and hereafter called target domain). Upper-air fields are averaged over the target domain, rather than on single warning areas, since our final goal is to study how a given upper-level flow forcing produces different precipitation characteristics at the surface conditional on the dynamics and the thermodynamic stratification. ArCIS and ECMWF ERA-Interim datasets are used for the common period 1979-2015.

Choice of atmospheric predictors
The predictors were chosen by combining established variables described in the literature or previous case-studies with predictors typically used by forecasters in their operational experience. We select eight possible predictors which describe the EPE environment, including variables sensitive to flow conditions and variables representative of thermodynamic conditions. Their names and abbreviations are listed and fully described in Table 1. In particular, the use of CAPE, the convective adjustment time-scale Tau (see section 2.2) and IVT accounting for water vapour fluxes (Lavers and Villarini, 2015) are well documented. In addition, θe850 and TCWV are used for describing air-mass types. Δθe and BS 500-925 (Bulk Shear) are also included, providing further information on the convective environment. For each day in the 37-year period, spatial averages across the target domain are computed for these variables. Initial tests showed that maximum/minimum values for fields describing the convective environment have better discriminatory power than their mean daily values. Thus, maximum values of spatial averages of Tau, CAPE, BS 500-925 and minimum values of Δθe500-850, all available at 6-hourly temporal resolution, are used instead of daily means.
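As an illustration of the last step, the daily maximum (or minimum) of a domain-averaged field could be obtained as follows. This is only a sketch: the function name, the unweighted spatial mean and the assumption of four complete time steps per day are ours.

```python
import numpy as np


def daily_extreme_of_domain_mean(field, steps_per_day=4, how="max"):
    """Daily max or min of the spatial average of a gridded field.

    field has shape (n_times, ny, nx), with steps_per_day time steps
    per day (4 for 6-hourly data) and only complete days.
    """
    field = np.asarray(field, dtype=float)
    # Spatial average over the target domain (unweighted, an assumption)
    domain_mean = field.reshape(field.shape[0], -1).mean(axis=1)
    if domain_mean.size % steps_per_day != 0:
        raise ValueError("field must contain complete days only")
    per_day = domain_mean.reshape(-1, steps_per_day)
    return per_day.max(axis=1) if how == "max" else per_day.min(axis=1)
```

The order matters: the field is averaged in space first and the extreme is then taken in time, matching "maximum values of spatial averages" in the text.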

Convective adjustment time-scale computation (Tau)
The convective adjustment time-scale is used to discriminate between atmospheric states that differ by the rate of removal of conditional instability: equilibrium and non-equilibrium regimes (Done et al., 2006). In the equilibrium regime the generation of CAPE is balanced by widespread convective heating associated with synoptic forcing, while in the non-equilibrium regime CAPE can rise to larger values since convection is limited by high convective inhibition (CIN) and its initiation is associated with local circulations in the boundary layer (weak large-scale forcing). Values between 3 and 12 h can be used as a threshold to discriminate between these regimes, with a value of 6 h most commonly used (Molini et al., 2011; Keil et al., 2014; Kober et al., 2014). Following Zimmer et al. (2011), Tau is computed from CAPE and the precipitation rate P at 3 h intervals and averaged over the target domain. CAPE and precipitation P are extracted from short-term forecasts of ERA-Interim at 3 h intervals since these are not analysed fields. P is divided accordingly to obtain the hourly precipitation rates needed for the computation. We omit grid-points with hourly rain rates lower than 0.2 mm/h. This empirically determined threshold allows a good balance between avoiding very low intensities that would produce spurious high values of Tau, and providing enough data points for a robust estimate. The domain-averaged Tau is set to zero if fewer than 10% of the grid points are precipitating.
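A minimal sketch of this computation is given below. It assumes the form of the time-scale commonly used in the literature following Done et al. (2006), Tau = 0.5 · (cp ρ0 T0) / (Lv g) · CAPE / P. The exact expression and the constants used in this study are not given in the text, so the reference values and function names in the code are assumptions.

```python
import numpy as np

# Reference constants (typical textbook values, assumed here)
CP = 1005.0    # specific heat of air at constant pressure, J kg-1 K-1
RHO0 = 1.25    # reference air density, kg m-3
T0 = 273.15    # reference temperature, K
LV = 2.5e6     # latent heat of vaporization, J kg-1
G = 9.81       # gravitational acceleration, m s-2


def tau_hours(cape, rain_rate_mm_h):
    """Convective adjustment time-scale (h) from CAPE (J/kg) and P (mm/h)."""
    cape = np.asarray(cape, dtype=float)
    # 1 mm/h of rain equals 1 kg m-2 per 3600 s
    rate = np.asarray(rain_rate_mm_h, dtype=float) / 3600.0
    tau_seconds = 0.5 * (CP * RHO0 * T0) / (LV * G) * cape / rate
    return tau_seconds / 3600.0


def domain_mean_tau(cape, rain_rate_mm_h, min_rate=0.2, min_fraction=0.10):
    """Domain-averaged Tau, using only grid points that rain enough."""
    cape = np.asarray(cape, dtype=float)
    rate = np.asarray(rain_rate_mm_h, dtype=float)
    raining = rate >= min_rate            # drop points below 0.2 mm/h
    if raining.mean() < min_fraction:     # fewer than 10% precipitating points
        return 0.0
    return float(tau_hours(cape[raining], rate[raining]).mean())
```

With these constants, CAPE of 500 J/kg and a rain rate of 1 mm/h give a time-scale of roughly 3.5 h, which falls on the equilibrium side of the 6 h threshold.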

Machine-learning algorithm description and Silhouette score
The machine-learning classification is performed using modules of the Scikit-Learn library written in Python (Pedregosa et al., 2011). In particular, for clustering we use the KMeans class of the sklearn.cluster module, and for removing the unnecessary predictors (or reducing impurity, in machine-learning language) we use the RandomForestClassifier class and its feature_importances_ attribute, which are part of the sklearn.ensemble module. As an objective metric to judge the cluster separation into three categories we use the Silhouette score (Rousseeuw, 1987), implemented in the silhouette_score function of the sklearn.metrics module. This score measures, along each dimension (i.e. each predictor in a normalized space), how tightly the events are grouped inside each cluster (cohesion) compared to the remaining clusters (separation). It ranges from −1 (wrong clustering) to 1 (fully separated clusters), with values equal to 0 indicating that a given element has the same distance from the other cluster centroids (overlapping). The Silhouette score is computed for all classification methods and averaged over all elements falling in each category.
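To make the metric concrete, the following sketch computes the per-category mean Silhouette score directly from its definition (Rousseeuw, 1987), s = (b − a) / max(a, b), where a is the mean distance of an element to the other members of its own cluster and b is the mean distance to the nearest other cluster. The Euclidean metric and the function name are assumptions; in practice the Scikit-Learn implementation would be used.

```python
import numpy as np


def silhouette_per_cluster(X, labels):
    """Mean Silhouette score of the elements falling in each cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all elements
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = int(own.sum())
        if n_own == 1:
            continue  # by convention a singleton cluster scores 0
        # Cohesion: mean distance to the other members (dist[i, i] is 0)
        a = dist[i, own].sum() / (n_own - 1)
        # Separation: mean distance to the closest other cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return {c.item(): float(scores[labels == c].mean()) for c in clusters}
```

Well-separated clusters give values close to 1, while elements assigned to the wrong cluster give negative values.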

EPE SEASONAL DISTRIBUTION
The seasonal distribution of all 887 EPE days is displayed in weekly bins in Figure 2. Each bin contains 7 days, counted from the first day of the year. Grouping in weeks instead of months, as done in previous studies, provides a more detailed temporal evolution and facilitates deeper insights into the large-scale triggering of the events. All EPE days are attributable to 633 independent events (separated by at least one day) with a mean duration of 1.4 (± 0.7) days. A marked seasonal cycle is visible in Figure 2 with a main peak in the autumn season. From the beginning of September to the beginning of December the relative frequency of EPEs is very high, reaching a maximum in weeks 45 and 46, where values are larger than one. Relative frequencies greater than one imply more than one EPE day per week. This is caused by the higher frequency of events persisting over consecutive days in this period of the year (the mean duration in weeks 45 and 46 increases to 1.8 days). This autumn peak of heavy precipitation events over the Mediterranean is well documented (Khodayar et al., 2018; Pavan et al., 2019) and is explained by the large thermal gradient between the warm sea and the atmosphere, favouring strong moisture and heat exchange. Winter and mid-summer are periods with a low EPE frequency, while from April to mid-June a secondary peak emerges that is less discussed in the literature. The observed frequency in spring is almost half of that observed in autumn and the interannual variability is much higher, as indicated by the wider confidence interval. The entire seasonal cycle of EPEs shows remarkable correlation with mean IVTn fluxes (Figure 2). This has important implications since it indicates that EPEs are statistically associated with large-scale precursors which are ultimately responsible for triggering strong meridional water flux transport towards the target area. On EPE days, the mean IVTn anomaly over the target domain is in fact +1.3 standard deviations over its climatological (weekly) value.
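The weekly binning and the grouping of EPE days into independent events can be sketched as follows. The function names are hypothetical, and the sketch assumes that consecutive EPE days belong to the same event while any gap of at least one day starts a new one.

```python
from datetime import date, timedelta


def week_of_year(day):
    """1-based index of the 7-day bin, counted from 1 January."""
    return (day.timetuple().tm_yday - 1) // 7 + 1


def group_events(epe_days):
    """Merge consecutive EPE days into independent events."""
    events = []
    for day in sorted(set(epe_days)):
        if events and day - events[-1][-1] == timedelta(days=1):
            events[-1].append(day)   # continues the current event
        else:
            events.append([day])     # a gap of at least one day: new event
    return events


def mean_duration(events):
    """Mean event duration in days."""
    return sum(len(event) for event in events) / len(events)
```

Under this binning the last bin of the year is shorter than 7 days (one day in ordinary years, two in leap years).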

EPE CLUSTERING AND CLASSIFICATION
In the previous section we have shown that periods with high EPE frequency are associated with anomalously high IVTn. However, the resulting precipitation pattern can vary substantially depending on details of the mesoscale and thermodynamic state. Given a similar large-scale setting, an EPE can be generated by different processes, with or without convection for example. In winter, when colder air masses hold less water vapour, EPEs can be achieved only by a strong moisture transport from remote areas (e.g. in the form of atmospheric rivers) in association with additional uplift forced by steep topography. Lavers and Villarini (2013) have in fact shown that this association is stronger in winter months. At the other extreme, in summer, characterized by high moisture availability and high thermodynamic instability, a weaker thermal circulation can be sufficient to trigger convection, even on modest relief (Khodayar et al., 2018). For details of the different precipitation mechanisms of moist flow impinging on orography, see Ducrocq et al. (2014). Davolio et al. (2016) have shown, for example, two case-studies with similar large-scale flows that result in two very different precipitation patterns. The difference was attributable to the type of interaction of the impinging flow with orography: in one case convection was produced upstream due to persistent blocked-flow conditions, while in the other case heavy rain was limited to the main Alpine crest as the flow went over the orography. This characterization is based on a detailed analysis of how the flow interacts in space and time with the orographic barrier, and would be difficult to repeat for our large EPE dataset. For this reason, we propose a more practical approach based on a categorization of EPEs according to mean values of typical predictors averaged over the target domain. Based on these arguments, we subdivide EPEs into three categories differentiated by the main processes involved:
• Category 1 (main process: frontal/orographic uplift) EPEs in this category originate from intense and persistent frontal structures, including slantwise ascent in warm sectors, often classifiable as Warm Conveyor Belt (WCB), initiated by an upper-level Rossby wave in the western Mediterranean. Mechanical orographic uplift of low-level marine, moist air is the key factor to attain extreme precipitation over steep topography. Remotely transported moisture via atmospheric rivers may also play a role. Convection is rare, mostly associated with cold-front passages, and accounts for only a small fraction of the total precipitation of the event.
• Category 2 (main process: frontal uplift with equilibrium deep convection embedded) This category shares with the first a prominent large-scale signature, with an amplified upper-level precursor (Rossby wave) in the western Mediterranean but a stronger southerly flow component. However, reduced moist static stability might lead to the occurrence of deep convection, often in the form of back-building MCS (Lee et al., 2016) embedded in WCB ascent or more generally in the warm sector of the frontal system associated with Rossby waves. Persistent convergence lines over sea or close to orography, as in the case of barrier flow close to the orography, are the main factors triggering convection.
• Category 3 (main process: non-equilibrium convection) Even in this category a synoptic-scale wave can often be recognized, but of smaller amplitude. EPEs are generated mostly by convective events in a highly conditionally unstable thermodynamic environment (very high CAPE). Triggering is controlled by local factors in a complicated interplay with orography: thermal boundaries induced by direct circulations (including sea and mountain breezes), soil wetness gradients, or outflow of previous mature systems. Triggering is typically limited by persistent capping inversions. Precipitating structures tend to assume the form of single cells or MCS of different kinds depending on the steering wind, local thermodynamic characteristics and environmental wind shear.

Subjective threshold-based classification
We investigate several ways to populate these three predefined categories. As a first approach we make a selection, based on experience and previous literature, using the list of predictors to obtain a reduced set for which we establish characteristic thresholds. We call this method the subjective threshold approach (STA). The convective time-scale Tau represents our first choice due to its ability to discriminate between equilibrium and non-equilibrium convective cases, as described in section 2.2. For Mediterranean cases spanning a 3-year period, Molini et al. (2011) apply a threshold of 6 h for Tau to classify heavy precipitation events over Italy and propose two categories: Tau <6 h type I events (equilibrium convection events, larger than 2,500 km²), and Tau >6 h type II events (non-equilibrium, smaller than 2,500 km²). However, this predictor alone is not able to discriminate between frontal precipitation with no convection embedded and cases of frontal precipitation with embedded convection. Both cases are characterized by very small values of Tau. Kober et al. (2014) introduced CAPE as an additional predictor to account for stratiform cases over Germany. Similarly, we introduce CAPE to identify events falling in category 1 (from now on indicated as Cat1), while for events above a certain CAPE threshold Tau is used to distinguish between category 2 (Cat2) and category 3 (Cat3).
Figure 3 shows a scatter plot of Tau dmax against day of the year with colour coding according to the value of CAPE dmax. In addition, a smaller panel displays the mean orographic fraction for 6 bins of Tau dmax. The orographic fraction is the ratio between the number of mountainous WAs (underlined WAs in Figure 1) and flat WAs affected by the EPE. Winter events (up to beginning of March) are characterized by low values of Tau dmax and high orographic fraction, meaning that winter events mostly occur in regions with high orography. Values of CAPE dmax are small, although not exactly zero since there is always some residual CAPE over sea, even in winter.
A selection of 15 benchmark cases (5 for each category) is used to determine a characteristic threshold value of CAPE dmax. Since the event type is known a priori for these events (see Table S2 in File S1) we can assign CAPE dmax to specific weather regimes. The EPE benchmark cases that represent orographic precipitation events (Cat1) suggest a threshold value of 150 J/kg for CAPE dmax. Together with the discrimination between Cat2 and Cat3 based on Tau dmax, this yields the STA classification: events with CAPE dmax below the threshold are assigned to Cat1, and the remaining events are assigned to Cat2 or Cat3 according to Tau dmax.
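The decision rule described in this section can be sketched as a small function. The 150 J/kg CAPE threshold comes from the benchmark cases above; the Tau threshold separating Cat2 from Cat3 is assumed here to be the 6 h value of Molini et al. (2011), and the treatment of values exactly at a threshold is also an assumption.

```python
def classify_sta(cape_dmax, tau_dmax, cape_threshold=150.0, tau_threshold=6.0):
    """Assign an EPE to Cat1, Cat2 or Cat3 with the subjective thresholds.

    cape_dmax is in J/kg and tau_dmax in hours. The default
    tau_threshold of 6 h is an assumed value, not one stated for STA.
    """
    if cape_dmax < cape_threshold:
        return "Cat1"   # little CAPE: frontal/orographic uplift
    if tau_dmax < tau_threshold:
        return "Cat2"   # CAPE present, fast adjustment: equilibrium convection
    return "Cat3"       # CAPE present, slow adjustment: non-equilibrium convection
```

For example, `classify_sta(400.0, 2.0)` falls in Cat2, while the same CAPE with a Tau of 14 h falls in Cat3.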

Objective K-means classification
Although the classification proposed above provides a sufficient separation between the three categories, it is inherently subjective and requires a priori knowledge for a proper definition of the thresholds. In addition, only a small part of the information available in the complete list of predictors is used. We therefore apply an objective clustering method to exploit the full potential of the entire set of eight available predictors (see Table 1). We use a K-means method, one of the simplest and most-used unsupervised learning tools for unlabelled data. The algorithm assigns every data point to one of the K predefined groups (3 in our case) following a minimization of the inertia function or, in other words, the sum of squared distances between the points of each cluster and the cluster centroid. Through a series of iterations, the algorithm creates groups of data points that have similar variance and that minimize the distances within the groups, in a multidimensional space defined by the number of predictors. Before applying any machine-learning algorithm (see section 2.3 for a description of the software modules used), all features (predictors) are normalized to the same scale (subtracting the mean and dividing by the standard deviation) to avoid distortion of the norm. Initially, we start clustering with all eight variables, being aware that some information is redundant due to cross-correlations between variables. The K-means method is applied in the default configuration. To check whether it is possible to reduce the number of predictors, we use a random forest method (RandomForestClassifier) to reproduce the classification obtained by K-means. This ensemble learning method fits a number of decision trees (in our case 100 estimators or trees) to various subsamples of the dataset and uses averaging to improve the accuracy of the classification and control over-fitting (Breiman, 2001). In this way, we estimate the sensitivity of the K-means classification with respect to each predictor through the feature_importances_ attribute of RandomForestClassifier. In Figure 4, the ranking of the eight predictors is displayed according to their importance in assigning a given EPE to one of the three categories. θe850 and TCWV show the greatest importance, probably acting as air mass tracers, followed by Tau dmax, CAPE dmax and Δθe, all important for describing the potential and type of the convective environment. The surprisingly low ranking of IVTn can be explained by the fact that the IVTn component plays an important role in all three categories, so its ability to discriminate is low, although not negligible. Finally, IVTe and BS 500_925_dmax are well below 0.05. We therefore considered the latter two predictors unimportant and dropped them. The final configuration of K-means clustering is based on the six remaining predictors.
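The two-step procedure can be sketched with Scikit-Learn on synthetic data. This is an illustration under stated assumptions rather than the configuration used in the study: the helper name `cluster_and_rank`, the fixed `random_state` and `n_init` values (set only for reproducibility), the 0.05 cut-off expressed as a parameter, and the synthetic predictors are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler


def cluster_and_rank(X, names, n_clusters=3, min_importance=0.05, seed=0):
    """Cluster standardized predictors, then rank them with a random forest."""
    # Normalize every predictor: subtract the mean, divide by the standard deviation
    Xs = StandardScaler().fit_transform(np.asarray(X, dtype=float))
    # First pass: K-means on all predictors
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(Xs)
    # Train a 100-tree forest to reproduce the K-means labels
    forest = RandomForestClassifier(n_estimators=100, random_state=seed).fit(Xs, labels)
    importance = dict(zip(names, forest.feature_importances_.tolist()))
    # Keep predictors at or above the cut-off and cluster again
    kept = [name for name in names if importance[name] >= min_importance]
    columns = [names.index(name) for name in kept]
    final_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(Xs[:, columns])
    return importance, kept, final_labels


# Synthetic demonstration: two informative predictors and two pure-noise ones
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [6.0, 6.0], [12.0, 0.0]])
informative = np.vstack([c + rng.normal(size=(100, 2)) for c in centres])
noise = rng.normal(size=(300, 2))
names = ["pred_a", "pred_b", "noise_a", "noise_b"]
importance, kept, final_labels = cluster_and_rank(np.hstack([informative, noise]), names)
```

Because the forest is trained on the K-means labels themselves, the importances measure how much each predictor contributes to the clustering, not how well it predicts any independent truth.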

Comparison between K-means and subjective method
Different approaches are employed to comparatively evaluate both methods. First, we focus on key properties such as a visual separation of the clusters in pairs of two selected dimensions. In Figure 5 a scatter plot comparing Tau dmax and TCWV is presented. While the STA approach guarantees a sharp separation in terms of the selected variables (Tau dmax and CAPE dmax), it does not guarantee a sufficient separation in the remaining variables, as can be seen along the TCWV axis with Cat1 and Cat2 almost completely overlapping and with less separated centroids compared with K-means classification. An interesting property emerging from K-means clustering is that the value of Tau dmax that separates Cat2 from Cat3 decreases as the value of TCWV increases, indicating that a transition towards non-equilibrium convection is becoming more likely even with low Tau values as total water vapour increases in the column. This can be seen in the right panel of Figure 5 where the separation between orange dots (Cat2) and green dots (Cat3) follows a diagonal line. To the authors' knowledge, this dependence has not been highlighted in previous literature. Another important metric is the seasonal distribution of the three different categories. According to their definitions, we expect Cat1 events to be more frequent during the cold season, while Cat3 should peak in summer months. Cat2, being an intermediate category, is expected to be most frequent during transition seasons.

FIGURE 4 Feature importance ranking computed with the feature_importances_ attribute of the RandomForestClassifier algorithm. The algorithm was run using the output of the K-means prediction with all eight predictors as target data (see text for further explanation). The last two predictors, IVTe and BS_500_925_dmax (abbreviated in the figure label to BS), are dropped since they rank well below 0.05

Indeed, comparison of the two methods shows a clear advantage of the K-means clustering method in producing more separated categories over the seasons (Figure 6). K-means produces, as expected, a prominence of Cat1 events in winter. In contrast, the STA approach gives a more mixed situation in winter, with a frequent overlapping between Cat1 and Cat2, indicated by the brown colour. Moreover, Cat2 is more prominent in transition seasons using the K-means clustering.
A third classification method simply based on the week of the year (seasonal classification) is used as an additional independent dataset to be compared against the other two.As can be seen in Table 2, the Silhouette score (an objective measure of cluster separation) is highest for K-means clustering indicating a better separation than the other methods (STA and seasonal).Thus, the classification based on the K-means method is used in the remainder of the study.

CLASSIFICATION RESULTS
A discussion of the characteristics of the three categories resulting from the K-means classification is now presented. The characteristics of each category are highlighted, starting with Cat1 events, followed by Cat3 and finally Cat2 events. The order reflects the fact that Cat1 and Cat3 events represent opposing extrema of the categorization, while Cat2 shows intermediate characteristics. Cat2 includes many of the most important EPE cases. The discussion is mainly based on three sets of figures. Figure 7 displays the size distributions of the EPEs and the mean area of EPEs in each category (in the inset). Figure 8 shows a summary panel of nine key variables that can be thematically grouped: (a,b,c) event characteristics (area extension, spatial variability and orographic fraction), (d,e,f) thermodynamic instability indices, and (g,h,i) total column water vapour TCWV and vertically integrated transport IVT. Figures 9 and 10 depict, respectively, composite maps of geopotential height at 500 hPa (Z500), mean sea level pressure (MSLP) and θe850, and of daily precipitation, averaged over the 100 events with the highest Silhouette score for each category.

Category 1
On average, EPEs in Cat1 have the smallest area extension with a mean value close to 5,000 km², corresponding to 3.5 WAs involved (Figure 7 and Figure 8a). They are more frequent close to orography (orographic fraction 0.6, Figure 8c) and they have the smallest spatial variability inside the WA (Figure 8b). They are predominant in winter up to mid-May, when their frequency decays, and they start to appear again from December (Figure 6). They are characterized by strong moist static stability (Figure 8d,e,f) and show a comparable transport of water vapour from the zonal and the meridional component (Figure 8h,i). The mean flow pattern (Figure 9, upper panel) shows a broad upper-level wave in Z500 centred over central Europe. A surface cyclone is present over the Tyrrhenian Sea, embedded in a weak θe850 gradient, aligned with the main trough axis. Peak values of precipitation are lower and more confined than in Cat2 and Cat3. The highest values are found along the Apennine crest, and to a lesser extent also over the Adriatic area in response to low-level easterly (bora) winds (Figure 10, left panel). This highlights the fact that Cat1 EPEs are associated with stable low-level flow, blocked by the upstream orography, and circulating around the surface cyclone. This flow configuration is a distinctive feature of cyclogenesis in the lee of the Alps (or Genoa Low), mainly occurring in winter/spring (Trigo et al., 2002). This confirms the expectation that Cat1 EPEs are mainly attributable to winter-type events, where in addition to the direct uplift on the windward side of the orographic barriers, baroclinic instability is locally increased by differential flow deformation at different levels (Buzzi and Tibaldi, 1978).

Category 3
Events in Cat3 are comparable in size with Cat1 events, especially in terms of area extension. The size distributions of both peak strongly at 1,700 km² (Figure 7). Cat3 events occur from mid-May to the end of October, with the highest frequency from mid-August to mid-September (Figure 6). This seasonal distribution is similar to the climatology of mesoscale convective systems (MCSs) over Europe (Morel and Senesi, 2002). Cat3 events show the lowest orographic fraction: a value of 0.5 indicates that EPEs in Cat3 occur with the same frequency whether orography is present or absent in the WA, especially over central Italy (Figures 8c and 10, right panel). They also show the largest relative spatial standard deviation inside the WA (Figure 8b), indicating the greater variability in the precipitation field typical of spotty convective events. Thermodynamic indices are significantly higher than for the other categories, as indicated by the highest values of CAPE dmax, Tau dmax, and conditional instability (negative value of Δθe), shown respectively in Figure 8d,e,f. Finally, Cat3 shows the highest TCWV, reflecting the warmer and moister air masses present in summer. Interestingly, the highest moisture transport towards the target domain is attributable to the zonal component of IVT (Figure 8g,h,i). The flow composite still shows an upper-level wave, but of smaller amplitude, with a shallow and broad surface cyclone over the central Mediterranean, implying a weaker surface circulation. θe850 values are also the highest (Figure 9). The precipitation composite shows a reduced locking of the precipitation along the orography of central Italy, while a maximum emerges over the western-central Alps linked with summer convection, which tends to be localized more on the Alpine range (Figure 10, right panel).
Based on the characteristics discussed above, we attribute Cat3 events to a predominance of non-equilibrium convection, clearly highlighted by mean values of Tau dmax larger than 12 h. A non-equilibrium convective environment is characterized by weak large-scale forcing, with the most relevant phenomenon being thermally forced convection, which is notoriously difficult to predict as it responds to details in the spatial distribution of CAPE and convective inhibition (CIN) (Done et al., 2006). Strong CIN constitutes a limiting factor that prevents the development of widespread convective activity but allows outbreaks of violent convection leading to extreme precipitation over limited areas. We hypothesize that the main features responsible for EPEs in this category are MCSs affecting one or more WAs during their lifetime.
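To make the role of the convective adjustment time-scale more concrete, the following minimal sketch computes a time-scale from CAPE and a precipitation rate. It assumes the formulation commonly used in the literature following Done et al. (2006), including a factor of 0.5 adopted in later studies; neither the formula nor the constants are stated in this section, and the function name is hypothetical. The 12 h value in the usage lines is only the Cat3 mean quoted above, not a classification threshold.

```python
# Sketch of a convective adjustment time-scale (assumed formulation):
#   tau_c = 0.5 * (cp * rho0 * T0) / (Lv * g) * CAPE / P
# Constants below are typical reference values, chosen for illustration.

CP = 1005.0    # specific heat of air at constant pressure, J kg^-1 K^-1
RHO0 = 1.292   # reference air density, kg m^-3
T0 = 273.15    # reference temperature, K
LV = 2.5e6     # latent heat of vaporization, J kg^-1
G = 9.81       # gravitational acceleration, m s^-2


def convective_timescale_hours(cape, precip_mm_per_h):
    """Return the time-scale in hours from CAPE (J/kg) and rain rate (mm/h)."""
    if precip_mm_per_h <= 0.0:
        # Without precipitation no CAPE is being removed by convection.
        return float("inf")
    precip_si = precip_mm_per_h / 3600.0  # mm/h equals kg m^-2 h^-1 -> per second
    tau_seconds = 0.5 * (CP * RHO0 * T0) / (LV * G) * cape / precip_si
    return tau_seconds / 3600.0


# Hypothetical values: large CAPE with weak rain gives a long time-scale.
tau_weak_forcing = convective_timescale_hours(cape=2000.0, precip_mm_per_h=1.0)
tau_strong_forcing = convective_timescale_hours(cape=300.0, precip_mm_per_h=5.0)
print(tau_weak_forcing > 12.0, tau_strong_forcing > 12.0)
```

A long time-scale indicates that CAPE is consumed slowly relative to its availability, which is the situation the text associates with non-equilibrium convection.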

Category 2
Cat2 events exhibit by far the largest spatial scale, both in terms of number of WAs and affected area. The mean area for Cat2 is about 10⁴ km², with the size distribution peaking (mode) at 3,000 km² (Figure 7). The different peaks in EPE area size of Cat1, Cat3 and Cat2 are consistent with Molini et al. (2011), who found a separation in scale between equilibrium (here Cat2) and non-equilibrium convection (here Cat3) at 2,500 km². Events in Cat2 are even more likely to affect WAs with orography. Interestingly, the seasonal distribution of the events in this category shows two peaks: one in spring around week 20 (mid-May), and a larger one in autumn, between weeks 40 and 45 (October to mid-November) (Figure 6). EPEs in this category are less thermodynamically stable than in Cat1, exhibiting a nearly neutral stratification with Δθe500-850 close to zero (Figure 8f). The Cat2 upper-level flow is characterized by a sharper trough compared with Cat1 (Figure 9). In addition, the trough axis is centred 5° in longitude further west, close to the Greenwich meridian, and is meridionally aligned, while in Cat1 it is more cyclonically tilted. The surface circulation and thermal gradients are stronger, with a deeper surface cyclone positioned over the western Mediterranean, in a forward position with respect to the upper-level main trough axis. All these characteristics indicate a more active baroclinic structure compared to both other categories, producing stronger meridional flow. Such a favourable positioning produces the highest moisture fluxes in the meridional direction (IVTn, Figure 8i). Many favourable ingredients for generating strong EPEs are present for Cat2. In particular, there is a clear synergy between strong large-scale forcing, denoted by high values of IVTn, which in turn imply large-scale upward vertical motion induced by horizontal advection of moist/warm air masses, and boundary-layer conditions still supporting deep convection. Synoptically driven low-level jets over the warm waters of the Mediterranean Sea further destabilize (in potential terms) the onshore flow, increasing low-level θe. This creates the ideal ingredient for the development of deep convection bursts embedded in the synoptic flow, typically localized at the interface between sea and coast or on the windward side of the orography close to the sea (Buzzi et al., 1998; Kirshbaum et al., 2018). The particular combination of stratiform precipitation and embedded deep convection explains why this category of EPE exhibits the highest precipitation intensity and the largest spatial extent, as clearly evident in Figure 10, central panel.
To this category belongs the largest EPE in the period 1979-2015, which occurred on 1 November 2010, with an area extension of 70,000 km². If we extend the statistics of EPEs back to 1961 (the first available year of the ArCIS dataset), the November 2010 EPE is surpassed only by what is known as the "century" flood in Italy. This event, which occurred between 3 and 5 November 1966, badly impacted Florence, where 101 people died and millions of rare books and art masterpieces were inundated. Beyond Florence, 54 WAs (out of 94) were affected, with a total area extension that reached 98,760 km² on 4 November, by far the largest size in our dataset. Although it is not included in our list, since the ERA-Interim data are not available for this date, the K-means algorithm correctly classifies this EPE as Cat2 (based on ECMWF ERA40 reanalysis data). A detailed meteorological description of that episode, including a modelling study, indicates that the record precipitation was indeed produced by slowly moving stratiform rain preceding the cold front, combined with an extensive line of deep convection, particularly active over the Apennines (Malguzzi et al., 2006). Finally, it is also worth mentioning a recent event occurring on 27-30 October 2018, called storm "Vaia," which affected north and central Italy with an amplitude similar to both cases above. Using ECMWF operational analysis as input, the objective classification also assigns this EPE to Cat2. A preliminary analysis shows that this EPE is likely to become one of the strongest on record in terms of rain accumulation and integrated water vapour transport over the target domain. Further analyses are planned to study this event in detail.

FIGURE 9 (a-c) Composite maps of the 100 events attaining the highest Silhouette score for each category. The average value of the Silhouette score for the three subsamples is reported at the top of each map. The fields shown are geopotential height at 500 hPa (contours every 6 dam in thick dark blue), MSLP (contours every 3 hPa in white) and θe850 shaded according to the legend
The seasonal distribution of Cat2 shows a consistent correlation with the climatological monthly precipitation distribution, in particular concerning the monthly distribution of extreme daily rainfall on the southern side of the Alps (Isotta et al., 2014). Consequently, the Cat2 distribution also fits well with the seasonality of the discharge of major rivers, like the Po river, showing two peaks, one in mid-May (due to melting snow plus peaks of rain) and the second in mid-November (due to wide and extreme rainfall only: Montanari, 2012).
We hypothesize that Cat2 events are closely linked with pulses of particularly long-lived Rossby wave packets (RWPs), coherently maintained by a strong wave-guiding effect. This long chain of downstream cyclone development is likely to open ideal pathways for long-range moisture transport towards the target domain (as documented by Piaget et al., 2014). This hypothesis has some important implications for predictability. Grazzini and Vitart (2015) have shown that if long and coherent RWPs (lasting more than 8 days) are present in the initial conditions, the resulting forecast shows higher skill over Europe than under average conditions. An analysis of such an event is documented in the next section.

Genesis of Cat2 events: An example
In this section, we show an example of a typical large-scale evolution leading to Cat2 EPEs. We focus on a period embracing two Cat2 events, both included in the list of benchmark cases reported in Table S2 in File S1. Both occurred within a 10-day period in autumn 2011: on 25 October 2011 (Cinque Terre flood, Figure 11b) and 4 November 2011 (Genova (Genoa) flood, Figure 11c). In both cases, localized convection stayed quasi-stationary within slow-moving large-scale patterns, and precipitation accumulated in an area already affected by widespread heavy rain, causing devastating floods at different spatial scales.
The main panel of Figure 11 illustrates the RWP propagation (and IVT transport) that ultimately led to the positioning of the upper-level waves associated with those EPEs. In the Hovmüller diagram we can see that the flow was characterized by an almost stationary wave pattern until 15 October, with the main waves located over the eastern USA and the Atlantic. A small-scale EPE (less than 1,000 km²) occurred on the 19th, associated with weak wave activity. On the 15th a large-amplitude RWP started off the west Pacific coast, reaching Europe on the 23rd. A second RWP pulse, apparently less coherent and split into two branches, started in the west Pacific on the 26th and reached Europe on 3 November. In both cases RWP propagation ended when reaching Europe, leading to a deep trough positioned slightly west of 0° longitude. These upper-level waves channelled very warm moist air from the Atlantic towards the central Mediterranean and the target domain. IVT values higher than 250 kg m⁻¹ s⁻¹, the threshold defining an atmospheric river (AMS meteorology glossary), are evident in both cases. The second RWP produced an even greater and more persistent water vapour transport from the central Atlantic, setting up favourable conditions not only for the EPE on 4 November, but also for three subsequent days (sequence of triangles in Figure 11). This extremely high IVT appears to be related to the convergence of anomalously high water-vapour amounts associated with the remnants of Atlantic tropical storm Rina (23-28 October), as discussed by Rebora et al. (2013).
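As an illustration of how the IVT values discussed above can be obtained, the sketch below integrates the moisture flux over pressure levels for a single column and compares the magnitude with the 250 kg m⁻¹ s⁻¹ atmospheric-river threshold quoted in the text. It assumes the standard definition IVT = (1/g) ∫ q V dp and a simple trapezoidal rule; the function names, the pressure levels and the profile values are hypothetical and not taken from the study.

```python
import math

G = 9.81                # gravitational acceleration, m s^-2
AR_THRESHOLD = 250.0    # kg m^-1 s^-1, atmospheric-river threshold from the text


def integrate_over_pressure(values, pressures_pa):
    """Trapezoidal integral of `values` over pressure (Pa), order-independent."""
    total = 0.0
    for i in range(len(values) - 1):
        dp = abs(pressures_pa[i + 1] - pressures_pa[i])
        total += 0.5 * (values[i] + values[i + 1]) * dp
    return total


def ivt_components(q, u, v, pressures_pa):
    """Return (zonal, meridional) IVT in kg m^-1 s^-1 for one column.

    q: specific humidity (kg/kg); u, v: wind components (m/s).
    """
    qu = [qi * ui for qi, ui in zip(q, u)]
    qv = [qi * vi for qi, vi in zip(q, v)]
    ivt_e = integrate_over_pressure(qu, pressures_pa) / G
    ivt_n = integrate_over_pressure(qv, pressures_pa) / G
    return ivt_e, ivt_n


# Hypothetical moist southerly flow profile from 1000 hPa to 300 hPa.
levels = [100000.0, 85000.0, 70000.0, 50000.0, 30000.0]
q = [0.010, 0.008, 0.005, 0.002, 0.0003]
u = [5.0, 10.0, 15.0, 20.0, 30.0]
v = [10.0, 15.0, 20.0, 25.0, 30.0]

ivt_e, ivt_n = ivt_components(q, u, v, levels)
ivt = math.hypot(ivt_e, ivt_n)
print(f"IVTe={ivt_e:.0f}, IVTn={ivt_n:.0f}, |IVT|={ivt:.0f}, AR={ivt > AR_THRESHOLD}")
```

In this notation, `ivt_n` corresponds to the meridional component (IVTn) and `ivt_e` to the zonal component used in the discussion of the categories.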

CONCLUSIONS
In this article, we describe a methodology for identification and systematic classification of extreme precipitation events (EPEs) over northern-central Italy. EPEs are defined as days when, in at least one of the Italian Civil Protection warning-area units, the spatially averaged daily precipitation is greater than the 99th percentile of the daily climatological distribution. The computation is based on the ArCIS gridded database, which is built from more than 1,700 quality-controlled stations. This database, in combination with ERA-Interim reanalysis data for upper-level atmospheric fields, allows a 10-fold increase in the number of EPEs compared to previous studies. A set of 887 EPEs is found and a subdivision into three predefined categories is proposed. First a subjective classification based on CAPE dmax and Tau dmax is developed, then a combination of machine-learning methods (K-means and Random Forest) is applied to group EPEs into the three categories. The Random Forest classifier and feature-importance methods turn out to be decisive in finding an optimal classification and in discarding non-useful predictors. The resulting upper-level composites agree with the subjectively chosen categories onto which we wanted to map our events.
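The EPE definition summarized above can be sketched in a few lines. The example below is a minimal illustration, assuming that the 99th percentile is computed over all days of each warning area's series (the text does not say whether dry days are excluded) and that the EPE area is the sum of the areas of the warning areas above threshold. The function names and the data layout are hypothetical.

```python
def percentile(values, q):
    """Percentile (q in [0, 100]) of a non-empty list, linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def find_epes(daily_precip, wa_area_km2, q=99.0):
    """Identify EPE days from warning-area (WA) averaged daily precipitation.

    daily_precip: {wa_name: [mm/day, one value per day]}, equal-length series.
    wa_area_km2:  {wa_name: area in km^2}.
    Returns {day_index: (list of WAs above threshold, total affected area)}.
    """
    # One climatological threshold per warning area.
    thresholds = {wa: percentile(series, q) for wa, series in daily_precip.items()}
    n_days = len(next(iter(daily_precip.values())))
    events = {}
    for day in range(n_days):
        affected = [wa for wa, series in daily_precip.items()
                    if series[day] > thresholds[wa]]
        if affected:  # a day is an EPE if at least one WA exceeds its threshold
            events[day] = (affected, sum(wa_area_km2[wa] for wa in affected))
    return events
```

Calling `find_epes` on a dictionary of daily series returns, for each EPE day, both the list of affected warning areas and the area extension, which are the two size measures used throughout the discussion of the categories.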
From the analysis of the upper-level composites, different processes generating EPEs are recognized: frontal or mechanical orographic uplift of moist statically stable flow for Cat1, stronger frontal and mechanical uplift of a moister/warmer, neutrally stable flow for Cat2, and finally thermally forced deep convective ascent for Cat3.
A common characteristic for all three categories is that IVT is anomalously high. EPEs are largely controlled by the intensity of the meridional component of integrated vapour transport (IVTn), which in turn depends not only on moisture availability but also on a favourable phasing of the upper-level wave with respect to the target area. This confirms IVT as an important large-scale predictor, especially for Cat2 events, shown to be the most relevant category in terms of effects and EPE area extension. The importance of IVT as a predictor has been shown by Lavers et al. (2014; 2016), who demonstrated that it is possible to extend the range of predictability of extreme hydro-geological events if the integrated water vapour transport is directly employed instead of considering the precipitation from direct model output.
The proposed classification, based on widely used machine-learning methods, has the advantage that it can be easily applied elsewhere, since no subjective choice of fixed thresholds is necessary. The categorization of precipitation may introduce some simplifications compared to nature, but it is very useful for gaining a clearer picture of the basic processes. This approach can raise forecaster awareness of the origins of high-impact weather phenomena and of different kinds of EPEs, fostering a more critical interpretation of numerical model output. In addition, moving to research aspects, the study sets the stage to investigate the relation between EPEs and Rossby wave packets. This analysis will be conducted in Part II of this work with the intention of gaining insight into flow-dependent predictability for these three different categories. The value of the forecast is measured by its ability to predict critical situations, and the skill of modern numerical weather prediction is highly flow-dependent, especially when convection is involved (Keil et al., 2014; Nuissier et al., 2016; Rodwell et al., 2018). It is therefore important to provide the meteorological operational community with a more process-based assessment of predictability as a foundation for a new forecasting methodology specifically designed for extreme precipitation events.

The figure shows the 94 warning areas of north-central Italy (as defined by Italian Civil Protection) used for precipitation averaging. Labels indicate the name of each warning area, which is composed of an abbreviation of the administrative region followed by an alphanumeric code. Underlined names indicate areas characterized by significant orography (see the elevation legend). The blue rectangular box represents the target domain used for averaging atmospheric variables. Latitudes and longitudes for reference are included along the inner border of the figure

TABLE 1 ERA-Interim variables chosen as predictors to represent the large-scale flow associated with EPEs. For each EPE day, variables are spatially averaged over the Target Domain and aggregated daily as reported in the

FIGURE 2 Seasonal distribution of EPEs in the period 1979-2015. Bars show the mean frequency of EPEs in bins of 7 days. The thin solid blue curve and corresponding shaded area depict moving averages over 21 days and 95% confidence intervals, respectively, estimated with the adjusted Wald method assuming a binomial distribution inside each bin. The thick red curve shows the climatological frequency of days with IVTn averaged over the target domain greater than 150 kg m⁻¹ s⁻¹, a threshold corresponding to the 95th percentile of the area-averaged IVTn distribution for all days in the period 1979-2015. The IVTn curve is also based on a 21-day moving average

have been extensively investigated in the Hydrological cycle in Mediterranean eXperiment (HyMeX) project, and in particular during the special observing period SOP1 dedicated to studying heavy precipitation across the Mediterranean

Distribution of EPEs in terms of Tau dmax (on the y-axes), day of the year (x-axis on the larger plot), CAPE dmax (coloured) and orographic fraction (small plot on the left). The median orographic fraction has been computed using six equally populated bins. Dots on the left graph mark the centroid of the bins

FIGURE 5 TCWV and Tau dmax scatter plots for the two different types of classification. Subjective classification (left) and K-means based clustering (right) with six predictors. Black squares represent the centroids of the three different clusters. The respective population of each category is reported in the legend

FIGURE 6 Distribution of EPE counts (y-axis) in the three different categories according to the week of the year (x-axis), obtained with K-means clustering with six predictors (Kmeans6, panel b), and subjective method (STA, panel a). In addition to the colours in the legend, overlapping colours are: brown (Cat1 + Cat2), olive green (Cat2 + Cat3)

TABLE 2 Silhouette score computed on the 6-dimensional predictor space used for K-means clustering. For comparison, the subjective classification (STA) and an alternative classification based on the week of the year are also scored. The score provides a measure of the efficiency of the algorithm in producing well-separated clusters. It ranges from −1 (wrong clusters) to 1 (fully separated clusters) with 0 meaning overlapping clusters. K-means with six predictors proved to be superior to other tested configurations
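The Silhouette score described in the Table 2 caption can be illustrated with a small, dependency-free sketch. It assumes the standard definition of the silhouette coefficient with Euclidean distance, applied to predictors that have already been standardized; the function names are hypothetical, and the study itself may well have relied on a library implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0.0 else 0.0)
    return sum(scores) / len(scores)


# Two compact, well-separated groups score close to 1; mixing them scores below 0.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(pts, [0, 0, 1, 1]), silhouette_score(pts, [0, 1, 0, 1]))
```

Ranking events by their individual silhouette values is also one plausible way to select the best-separated members of each category for composites such as those in Figures 9 and 10.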

FIGURE 7 Density distribution of EPE area extension for the three different categories: Cat1 (blue dash-dotted line), Cat2 (orange dashed line) and Cat3 (green solid line). The inset shows the corresponding mean area and the 95% confidence interval

FIGURE 8 Nine key mean characteristics of the EPEs for the three categories. The first column shows statistics derived from observations aggregated over warning areas, respectively: (a) the mean number (n) of WAs per event, (b) the relative spatial standard deviation (RSD, areal standard deviation of precipitation divided by the precipitation mean for each WA), (c) the orographic fraction (1 if all areas with EPE have orography; 0 if EPEs occur only on flat warning areas). Second column: (d) CAPE dmax, (e) Tau dmax, (f) Δθe500-850_dmin. Third column: (g) TCWV, (h) IVTe/zonal component of IVT, (i) IVTn/meridional component. Confidence intervals are computed with the bootstrapping method provided by the Seaborn Python library
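The observation-based statistics in panels (b) and (c), and the bootstrapped confidence intervals, can be sketched as follows. This is a minimal illustration under stated assumptions: the RSD uses the population standard deviation of the grid values inside one warning area, the orographic fraction is the share of affected warning areas flagged as orographic, and the confidence interval is a percentile bootstrap of the mean, similar in spirit to (but not a reproduction of) what Seaborn computes. All function names and inputs are hypothetical.

```python
import random
import statistics


def relative_spatial_std(grid_values):
    """RSD: areal standard deviation of precipitation divided by its mean."""
    mean = statistics.fmean(grid_values)
    if mean == 0.0:
        return 0.0  # no precipitation in the area, so no relative spread
    return statistics.pstdev(grid_values) / mean


def orographic_fraction(affected_was, orographic_was):
    """Share of affected warning areas that are flagged as orographic."""
    hits = sum(1 for wa in affected_was if wa in orographic_was)
    return hits / len(affected_was)


def bootstrap_ci_of_mean(sample, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = min(int((1.0 + level) / 2.0 * n_resamples), n_resamples - 1)
    return means[lo_idx], means[hi_idx]


# Hypothetical inputs: grid values in one WA, and one event hitting four WAs.
print(relative_spatial_std([40.0, 55.0, 70.0, 35.0]))
print(orographic_fraction(["A", "B", "C", "D"], {"A", "B"}))
print(bootstrap_ci_of_mean([2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 8.0]))
```

An event-level orographic fraction of 0.5, as in the second usage line, corresponds to the Cat3 situation described in the text, where affected areas are equally likely to be orographic or flat.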

FIGURE 10 As in Figure 9 but displaying the EPE precipitation composites for each category, overlaid on WA. (a) Cat1, (b) Cat2, (c) Cat3. Units are mm/day. EPEs in Cat2 are clearly the strongest

FIGURE 11 (a) Hovmüller diagram of RWP propagation during the period 9 October-10 November 2011, characterized by a significant EPE sequence. Red (continuous)/blue (dashed) lines show the meridional wind speed at 250 hPa (every 6 m/s, starting from 16 m/s). The green shaded areas represent the magnitude of IVT, starting from a threshold value of 250 kg m⁻¹ s⁻¹, which marks the atmospheric river lower limit. Fields are averaged over a band of latitude between 30°N and 60°N. EPEs in the target domain are marked by green triangles. The larger ones, filled in orange, indicate Cat2 events. The two smaller triangles, on 20 October and 8 November respectively, represent two smaller events of Cat3 and Cat1. Black dashed arrows mark RWPs associated with EPEs. The brown shading, just above the longitude axis, provides a graphical impression of the distribution of the orography (white/sea, cream to dark brown/altitude) along the longitude. (b,c) Instantaneous Z500 and IVT for the two benchmark EPEs, 25 October and 4 November 2011, at 1200 UTC, respectively