Digital long‐term topoclimate surfaces of the Pyrenees mountain range for the period 1950–2012

Long‐term climate data accounting for high‐spatial variability and detailed topographic effects are crucial for research and management in complex terrains such as mountain ranges. Here, we introduce the Pyrenean Digital Climate Atlas based on data from around 400 weather stations located across the range during the 1950–2012 period. Average monthly, seasonal, and annual temporal resolutions are provided for 30‐m spatial resolution surfaces. Local heterogeneity was considered, integrating meteorological station data, high‐quality terrain information (altitude, latitude, distance to the sea, and solar potential radiation) and multivariate regression modelling in a Geographical Information System. Climate surfaces of air temperature (minimum, maximum, and mean) and precipitation were obtained and used to derive maps of bioclimatic interest such as potential evapotranspiration, water availability, and growing degree‐days. Metadata are provided in XML standard format (ISO 19139) with all the usual fields and quality indicators for each map, with an RMSE ranging from 0.7 to 1.2°C and 11 to 15 mm for air temperature and precipitation maps respectively. The Atlas is available in GeoTIFF format at the ZENODO repository.


| INTRODUCTION
Mountain areas are particularly suitable observatories for climate change research since they provide a variety of topoclimate environments and a remarkable altitudinal gradient. Moreover, high-mountain ecosystems are one of the most vulnerable environments to the effects of the ongoing climate warming, such as melting of glaciers (Kohler & Maselli, 2009) and biodiversity loss (Engler et al., 2011). Increased natural hazards, changes in water and snow resources, and restricted migration possibilities for plant species are also some of the main consequences of climate change in mountain regions (Kovats et al., 2014).
Understanding the spatiotemporal patterns of the mountain climate is essential for managing the many environmental issues of these areas. Indeed, mountain ranges and their surrounding lowlands provide essential ecosystem services, such as water supply and biodiversity hotspots (Körner and Ohsawa, 2005), carbon and climate regulation, and the supply of timber, fuelwood, and hydropower energy (European Environment Agency, 2010). Tourism and cultural values also attract many visitors, and the local agencies must have tools to deal with environmental hazards such as storms or snow landslides (Gret-Regamey et al., 2012).
High-resolution climate surfaces, which account for toposcale effects, have increasingly become essential tools for understanding landscapes and managing the territory. Many studies have shown the relevance of accounting for local climate variation in mountainous areas by combining mesoclimatic and topoclimatic information, from pioneer studies (Daly et al., 1994;Goovaerts, 2000) to recent approaches (Gutiérrez Illan et al., 2010;Ashcroft and Gollan, 2012;Case et al., 2016;Aalto et al., 2017;Meineri and Hylander, 2017). Several studies have developed climate surfaces for mountain regions; for instance, for The Alps, Efthymiadis et al. (2006), Isotta et al. (2014), Frei and Schar (1998), and Tobin et al. (2011) developed precipitation climate surfaces and Chimani et al. (2013) developed them for temperature. Antolovic et al. (2013) have developed gridded maps for the Carpathians. Following these previous initiatives, we aimed to produce appropriate continuous topoclimate surfaces for a complex terrain area such as the Pyrenees.
The Pyrenees is a European mountain range with many ecological services and it also has a strong influence on nearby lowlands. The current landscape is the result of natural and cultural forces interacting since ancient times (Catalan et al. 2017). Global change, as well as social and economic prosperity in this area as in other mountains of the world, is challenging the conservation of the natural environment. The administration of the Pyrenees involves three countries (Andorra, France, and Spain) and several regions with different degrees of political autonomy. Consequently, several climate maps that already exist use different spatial resolutions and interpolation schemes and consider different parts of the mountain range. Andorra has a climate model adapted to the mountainous area based on the country's own meteorological stations as well as ones from neighbouring regions with high resolution and accurate procedures (Batalla et al. 2016) (for details and information check, the CENMA-IEA Andorran Climate Digital Atlas at the website, www.acda.ad). However, there are no digital long-term climate maps from the entire Pyrenean area with spatiotemporal completeness, and it is therefore necessary to develop them. Although previous climate surfaces were developed in Walker et al. (2010), here we markedly improve the Geographical Information System (GIS)-based dataset of Essential Climate Variables (ECV) and the spatial resolution.
From the mapping perspective, the Pyrenees mountain range presents the usual difficulties when the regionalization process has to deal with complex topography that results in large climate variability at short distances (Opedal et al. 2015). The interpolation process becomes a complicated surface generation procedure since the gradients are abrupt. The relationships between predictors and predicted variables are more difficult to establish than in flat and homogeneous lowland areas (Vicente-Serrano et al. 2003;Ninyerola et al. 2005). Indeed, it does not help that meteorological stations are usually not placed on the slopes but rather in the valleys, and there are only a few stations at high altitudes.
The Pyrenees mountain range has two additional challenges that may also be common to other relatively large ranges. One is the multiple sources of the meteorological data, particularly the administrative divisions between countries and their meteorology agencies. This variety increases the difficulty in accessing and harmonizing the data from the original ground stations and integrating them into a single database. The second challenge concerns the geographic position of the Pyrenees at the limits of the Mediterranean and Eurosiberian regions. The east-west alignment (between the 42°and 43°north hemisphere parallels) results in a high contrast between the sides of the range. The northern side has a larger Atlantic influence than the southern side, which is more Mediterranean. Although the Mediterranean climate is characterized by rainy winters and dry, warm to hot summers, the Atlantic climate is characterized by mild and rainy winters but cooler and relatively wet summers.
Following we describe the Pyrenean Digital Climate Atlas (PDCA) developed at a spatial resolution of 30 meters and based on meteorological data for the 1950-2012 period in temporal completeness (monthly, seasonal, and annual).

| STUDY AREA
The Pyrenees is a mountain range located in the northeast of the Iberian Peninsula and south of France, in southwest Europe, centred at 0°27' of east longitude and 42°34' of north latitude (Figure 1). It sits astride the border of three different states: Spain, France, and Andorra. This mountain range covers an approximate area of 60,000 km 2 and is characterized by a heterogeneous orography rising from sea level to a maximum altitude of nearly 3,400 m (mean altitude = 891 m; SD = 595 m; minimum = 6 m; maximum = 3,358 m).
Three bioclimatic regions can be distinguished along the Pyrenees. The western Pyrenees has an Atlantic influence (with a humid climate), the central Pyrenees is more continental (with contrasted temperatures and it is drier than the western region), and the eastern Pyrenees has a Mediterranean influence (which means a dry, warm summer season). While in the Atlantic region, the difference between the northern and southern slopes is very low in terms of amount of rainfall, in the rest of the mountain range, there are remarkable differences between the two slopes. Beyond these general features, there is also high-climatic variability related to topographic features such as abrupt changes in hillslope and aspect ratio and the presence of waterbodies (Chen et al., 1999;Frey et al., 2016).
The geographic domain of the PDCA (Figure 1) was determined according to criteria that select those bioclimatic belts representative of the alpine, subalpine, and montane zones of the Pyrenees (Rivas-Martinez, 1987). Using the parameters defined by thermal indexes and plant communities for each bioclimatic belt, we selected the appropriate area from the WorldClim v.1 maps (Hijmans et al., 2005) and added a 15-km buffer.

| Meteorological data sources
The climatic information was derived from the long-term observations of the weather stations in the defined area ( Figure 2). Meteorological data were obtained from the Agencia Estatal de Meteorología (AEMET), Météo-France (MF) and the Servei Meteorològic de Catalunya (SMC). It was necessary to harmonize the different database structure provided for the meteorological agencies. From the harmonized data, we calculated mean maximum air temperature, mean air temperature, mean minimum air temperature, and total precipitation for two timescales, monthly and annual.
Once we had processed the original data, we selected the data series that had the highest quality based on objective criteria (series length, temporal stability, and cross validation test) and expert knowledge (series quality and correct location coordinates). Finally, we used the meteorological stations with temperature series with a minimum of 15 years and precipitation series with a minimum of 20 years, obtaining a compromise between temporal stability and spatial coverage for the 1950-2012 period. The final dataset used for building the digital climate surfaces included approximately 370 and 450 temperature and precipitation stations respectively ( Figure 2).

| Geographic data
The spatial interpolation involved a regression-based step in which geographic predictors were included to explain as much climate variability as possible. Following Ninyerola et al. (2000), and for both air temperature and precipitation, we selected altitude, latitude, distance to the sea, and potential solar radiation. Other authors also used similar predictors (Hijmans et al., 2005;Attorre et al., 2007;Vicente-Serrano et al., 2016).
Due to terrain complexity, we have tested different Digital Elevation Model (DEM)-based variables (Topographic Position Index, Topographic Wetness Index, Terrain Ruggedness Index, Terrain Surface Convexity, Terrain Surface Texture) without success, probably due to the homogeneity in the meteorological stations placement, the weakness of some indexes to capture cold air drainage, and to the fact that we dealt with long-term mean series.
The altitude was extracted from a DEM of ASTER GDEM V2 (NASA LP DAAC, Japan Ministry of Economy, Trade and Industry (METI), 2011) with 30-m spatial resolution. We have kept this resolution in the final maps to capture the high-topographic variability of the Pyrenees. Latitude and distance to the sea were computed using GIS-based Euclidean distance operators. In the case of the latitude, the target distance was the equator and, in the case of the distance to the sea, the target was the coastline, considering both the Mediterranean Sea and the Atlantic Ocean. Distance to the two coasts jointly performed better for temperature maps and separately for precipitation maps. For the sea influence, we tested different approaches: linear, logarithmic, and quadratic distances; finally, we used the logarithmic distance to the sea.
The potential solar radiation was derived from the DEM using astronomical equations that describe the position of the Sun-Earth system. The atmospheric conditions (i.e., cloudiness) were considered to be constant across the study area; therefore, the variation in potential solar radiation mostly reflects the local influence of topographic structures (Aalto et al., 2017). Maps were generated with the Mira-Mon module Insoldia, following the procedure described in Pons and Ninyerola (2008). For the precipitation maps, a low-pass filter of 9x9 cells was applied to potential solar radiation grids to obtain a more realistic effect of the relief in the final maps. Note that the Potential Solar Radiation plays a double role in our study, as it is one of the geographical variables used as a predictor in the regression models and it is one of the climatic variables that conforms the Atlas.

| DIGITAL CLIMATE SURFACES
Finally, we constructed an extensive climatic database of 30-m spatial resolution ( Table 1) that contains a complete set of digital toposcale climate surfaces covering the whole area, the PDCA. The methodological details and related information are described below.

| Methodology
To build the digital climate surfaces (Figure 3), we implemented a two-step spatial interpolation procedure using the GIS software MiraMon 8.1 (Pons, 2006). The first step is a deterministic multiple regression analysis (global interpolation) and the second step is a spatial interpolation of the regression residuals (local interpolation). After several tests, we selected the inverse distance-weighted interpolation method (with exponent = 2) to produce the temperature maps and the regularized spline method (with tension=300 and smooth=0) for the precipitation maps. This selection is in accordance with previous studies (Ninyerola et al., 2007a;2007b). Geostatistical-based interpolators (see Bhowmik and Cabral, 2011;Bhowmik, 2012;Arowolo et al., 2017 for comparison studies) could also provide satisfactory results in complex terrain areas, but splines performed nicely  Potential solar radiation monthly, seasonal and annual Primary 10 kJ·m −2 ·day −1 enough and they have the advantage that variogram fitting step is not required, with an associated lower computational effort.
The multiple regression provides a predictive model of the climatic variable (information from the meteorological stations in monthly or annual resolution) based on the geographic variables that influence the climate of the area (continuous surface maps); thus, the complexity of the terrain is considered.
Once the regression had been applied, we interpolated the difference between observed and predicted values (residuals) to determine the nonexplained part of the regression. In the Inverse Distance Weighted method used for temperature maps, the value assigned to each cell is calculated using a weighted average of the known values considered. In the splines method used for the precipitation maps, the value assigned to each cell is determined by a regular, continuous, and derivable function that best suits the sampling points. The limited altitudinal distribution of the meteorological stations meant that there was an area where the models had to be extrapolated without sides for validation (Figure 4).
Seasonal maps were based on the algebraic mean of the monthly maps, according to the meteorological season approach. Summer maps correspond to the months of June, July, and August; autumn maps to September, October, and November; winter maps to December, January, and February; and spring maps to March, April, and May.
Altitude is the most explanatory variable of the thermometric variables (Figure 5), as it is highly significant during every month of the year. In the case of precipitation,

maximum temperatures (c). Growing degree-days (d). Precipitation (e) Potential Evapotranspiration (f). Water availability (g). Potential solar radiation (h)
however, altitude is only the most significant variable during the summer months, and during the rest of the year, latitude is more relevant (Figure 5). The potential solar radiation mainly influences the mean and maximum temperature and always has positive regression coefficients. The distance to the sea varies according to the period and variable, but it is worth noting that it has a strong negative weight on precipitation, with a higher Atlantic influence, indicating a clear decrease in precipitation in areas far from the seacoast. The adjusted R 2 in Table 2 states the explained variation for the dependent variables of the regression.

| Validation
Although in the past it was common to use a single index to explain the accuracy of the model, it is increasingly recommended to use a combination of approaches to assess the quality of the map (Daly, 2006;Bhowmik and Costa, 2015;Vicente-Serrano et al., 2016). Following this advice, we used various types of indexes (RMSE, MAE, MBE) to evaluate different aspects of the error distribution and provide a more accurate estimation of the global map uncertainty. The root mean square error (RMSE) quantifies the differences between values predicted by the model and the values observed (error). It provides information about the global average error and is measured in the map's units. We estimated a global RMSE for each map, using a leave-one-out cross validation, in which the accuracy measure is obtained by repeating the training of the model n-times with n−1 samples of data and predicting the sample not included. The fitting step uses all the selected stations.
where pred is the predicted value, obs is the observed value, and i refers to each of the n data samples (meteorological stations).
As our study area has a very complex relief, we classified the meteorological stations into three types of terrain complexity. We used the SD of the altitude computed from all the cells inside an area delimitated by a buffer of 10-km radius around each meteorological station. Finally, we distinguished between flat lands (SD < 150 m), middle lands (SD between 150 and 250 m), and rugged lands (SD > 250 m). In this way, we obtained a partial RMSE based on the stations' surrounding structures that best reflect the spatial context of the errors and how they increase with the terrain complexity ( Table 2).
The Mean Absolute Error (MAE) measures the magnitude of the error without considering its direction (sub-or overprediction), as the RMSE does; however, it is less sensitive to extreme values. The MAE makes it possible to determine how much the RMSE reflects the central tendency of the average error. Finally, the Mean Bias Error (MBE) provides information about existing bias. When MBE is positive, predictions generally tend to be lower than observations (subprediction), otherwise (overprediction), the MBE is negative. Table 2 shows the accuracy statistics for the monthly and annual temperature and precipitation models. Regarding the thermometric variables, global RMSE values range between 0.7 and 1.2°C. The mean minimum air temperature was the most difficult variable to model, probably due to the thermal inversions in the valleys that modify the usual lapse rate tendency. With respect to partial RMSE, the rugged land class shows the lowest accuracies because the high-mountain climate is more difficult to interpolate due to terrain complexity.
If we look at the other indexes, the MBE in the case of mean minimum temperature was always positive, and for the mean maximum temperature, it was always negative, F I G U R E 4 Altitudinal areas where no meteorological stations are available. In these areas, the values in the climatic maps are extrapolated without any possibility of validation although the values were very low. Therefore, the model slightly subpredicts the minimum temperatures and overpredicts the maximum ones. In addition, the RMSE is a few tenths of a degree higher than the MAE, indicating that there is low variability within the error distribution.
For the precipitation models, the monthly RMSE ranges between 11 and 15 mm. There is no clear pattern for the terrain complexity types; however, the middle and rugged classes have higher RMSE. The MBE index is negative, showing a moderate overprediction of the precipitation values in the model.

| Derived variables
From the primary variables maps (temperature and precipitation), we derived a set of bioclimatic variables (Sykes et al., 1996) that can be useful in applications related to vegetation, ecosystem, and socioeconomic applications. Evapotranspiration is the combined evaporation from the soil surface and transpiration from plants, which represents the transport of water from the earth back to the atmosphere (Thornthwaite, 1948). The Thornthwaite equation is a simple temperature-based method for estimating Potential Evapotranspiration (PET), used a lot when solar radiation and wind speed data, among other variables, are not available. The equation consists in: where PET is the estimated potential evaporation (mm), T a is the average monthly temperature (°C), N is the number of days of each month, L is the average sunshine hours of the month, and α is calculated according to Equation (5).

| Water availability
Water availability expresses whether the supply of atmospheric water (precipitation) exceeds the demand (evapotranspiration), WA = PR -PET (7) where WA is water availability (mm), PR refers to precipitation (mm), and PET to potential evapotranspiration (mm).

| Growing degree-days, a monthly approximation
Growing degree-days (GDD) is a heuristic measure of plant phenology and animal behaviour and development in relation to temperature that considers how accumulated heat influences organisms' activity. GDD is usually defined as the temperature degrees accumulated above a certain threshold base evaluated on a daily basis. Here, we applied a monthly approach (GDD m ), in accordance with the previous temperature interpolation.
where T base is the temperature threshold (0°C in this case), T max is the mean maximum temperature, T min is the mean Growing degree-days (°C above 0°C·day) RMSE Taking as reference the average value of GDD in the Pyrenees map, this value represents an overall error of 6.5%. minimum temperature, and N represents the number of days of the month. If the mean temperature is lower than the base temperature, then GDD m = 0.

| Validation
An estimation of the uncertainty in derived variables (i.e., PET, WA, and GDD) was also performed, comparing the modelled values and the measured ones at the weather stations. A cross validation splitting the data on training (80%) and test (20%) sets was applied in the temperature and precipitation process. The corresponding RMSE are provided in Table 2.

| Metadata
Special emphasis was placed on documenting metadata to provide end users with fit for purpose usage. Information such as quality measures, lineage layers, and an explanation of the methodology applied is made available. Metadata are presented in XML format using standard ISO 19139.

| DATASET LOCATION AND FORMAT
The dataset is located in the ZENODO repository (http:// www.zenodo.org/) and is stored in GeoTIFF format. This dataset can be downloaded freely for research and educational purposes; however, if the maps or information derived from them are used in a publication, this article should be cited. Cartographic specifications are detailed in Table 3.

FOR THE DATASET
The maps provide basic long-term climatic information, at the most detailed resolution made so far, for the whole territory of the Pyrenees that can be used in other cartographic tasks and spatial analyses. Analysis of the spatial climatic patterns across the geographic extent can provide information of interest to managers and researchers. For example, there is a marked contrast between the southern and northern slopes of the Pyrenees in the summer mean maximum temperatures (Figure 6a). In addition, there is a large topographic effect in the summer precipitation map (Figure 6b) due to the many summer orographic storms, in contrast to the winter precipitation map (Figure 6c), which has a greater incidence of Atlantic disturbances. These are known general features that are now available at high resolution and averaged over a long period of time.
There is a need for high-resolution climate data for modelling alpine species. Climate data that reflect toposcale conditions yield more accurate results in ecological niche calibration (Gutiérrez Illan et al., 2010;Bobrowski and Schickhoff, 2017;Meineri and Hylander, 2017). Therefore, these topoclimatic maps provide an opportunity for establishing the relationships between climate and vegetation and generating suitability maps or species distribution models (Elith and Leathwick, 2009) for the Pyrenean mountain region, with direct applications in forest management and the conservation of species.
The dataset presented here can also be used as the baseline of the current climate for future spatial projections or palaeoclimate reconstructions (Anderson et al., 2006;Anandhi et al., 2011;Kriticos et al., 2012;Harris et al., 2014) as well as estimations of possible changes in vegetation, carbon exports, and the nitrogen cycle, among others.

OPEN PRACTICES
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at DOI 10.5281/zenodo.1186639. Learn more about the Open Practices badges from the Center for Open Science: https:// osf.io/tvyxz/wiki.