Volume 6, Issue 2 p. 214-221
DATA PAPER
Open Access
Open Data

Historical global gridded degree-days: A high-spatial resolution database of CDD and HDD

Malcolm N. Mistry

Department of Economics, Ca’ Foscari University of Venice, Venice, Italy

Centro Euro-Mediterraneo sui Cambiamenti Climatici (CMCC), Venice, Italy

Correspondence

Malcolm N. Mistry, Department of Economics, Ca’ Foscari University of Venice, Italy.

Email: [email protected]

First published: 21 October 2019
Dataset
Creator: Malcolm N. Mistry, Department of Economics, Ca’ Foscari University of Venice, Italy
Title: A High-Resolution (0.25 degree) Historical Global Gridded Dataset of monthly and annual Cooling and Heating degree-days (1970-2018) based on GLDAS data.
Publisher: PANGAEA
Publication year: 2019
Resource type: Digitised data, metadata
Version: 1.0
Readers are recommended to consult the author for an updated version of the dataset discussed in this article, which includes more recent years.

Funding information

This article was funded by a grant from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, under grant agreement No. 756194 (ENERGYA).

Abstract

Cooling and heating degree-days (CDD/HDD) are important metrics used in energy studies as a proxy for determining demand and consumption patterns of residential/commercial buildings and work spaces. Driven by the requirements of energy impact modellers, policymakers and building design experts, a new historical high-spatial resolution, global gridded dataset of degree-days constructed using various base (threshold) temperatures (Tb) is presented in this study. Derived using sub-daily temperature from a quality-controlled reanalysis data product (Global Land Data Assimilation System, GLDAS), the dataset called ‘DegDays_0p25_1970_2018’ includes monthly and annual (i) CDD; (ii) HDD; and (iii) CDD computed using wet-bulb temperature (CDDwb) at 0.25° × 0.25° gridded resolution, covering 49 years over the period 1970–2018. The Tb used for assembling DegDays_0p25_1970_2018 include 18, 18.3, 22, 23, 24 and 25°C for CDD and CDDwb, and 10, 15, 15.5, 16, 17 and 18°C for HDD. The data of individual indices are made publicly available in the commonly used scientific Network Common Data Form 4 (NetCDF4) and Georeferenced Tagged Image File (GeoTIFF) formats. DegDays_0p25_1970_2018 fills gaps in existing energy indicators’ datasets by being the only high-resolution historical global gridded time series based on multiple threshold temperatures, thus offering applications in wide-ranging climate zones and thermal comfort environments. The richness of DegDays_0p25_1970_2018 lies in its flexibility: users can aggregate the degree-days not only at varying spatial scales (such as administrative levels, national boundaries or economic organizations, e.g. OECD; with or without population weights), but also at varying temporal scales (such as seasons), thereby offering climatologists the potential to examine global teleconnection patterns more discretely.

1 INTRODUCTION

Cooling (CDD) and heating (HDD) degree-days are important climatic indicators, commonly used to estimate the climate-dependent cooling and heating demands in buildings, respectively (CIBSE, 2006). Degree-days are defined as the monthly or annual sum of the difference between a base temperature (Tb) and the daily mean outdoor air temperature (Td), whenever Td is greater (CDD) or lower (HDD) than Tb (ASHRAE, 2009).¹ The Tb is also referred to as the ‘threshold’ temperature or ‘set-point’ temperature, and it signifies the Td at which the indoor cooling or heating systems do not need to run in order to maintain human comfort levels (CIBSE, 2006; ASHRAE, 2009).

Degree-days have been routinely used by building designers and engineers to estimate indoor cooling/heating-related energy consumption, and by policymakers and researchers for forecasting energy demand, consumption patterns and associated carbon emissions (Lee et al., 2005; Mourshed, 2012). This is partly rooted in their simplicity, combined with a powerful capability to represent the relationship with cooling or heating energy consumption (Atalla et al., 2018). In addition, degree-days are widely used as climatic indicators for assessing the impact of climate change and variability, such as CDD and HDD in the energy sector (Moustris et al., 2015), and growing degree-days (GDD) in the agriculture sector (Schlenker and Roberts, 2009; Schauberger et al., 2017). Readers are referred to Spinoni et al. (2018) for a more detailed account of the application of degree-days in various sectoral impact studies.

This study presents a unique (first-ever) high-spatial resolution, global gridded database of three types of degree-days, namely CDD, HDD and a variant of CDD accounting for humidity (CDDwb). Computed using multiple wide-ranging Tb and meteorological variables from a quality-controlled reanalysis data product, the degree-days dataset referred to as ‘DegDays_0p25_1970_2018’ includes monthly and annual degree-days, spanning the most recent 49 years (1970–2018). The dataset is aimed at multiple end users, such as the research community assessing impacts of climate change on the energy sector (as well as the usage of energy for adapting to climate change), and policymakers examining the historical climate-energy nexus as a proxy for understanding future trends and patterns in energy demands for human comfort.

The rest of the paper is organized as follows. Indices, materials and methods, and the underlying reanalysis data product used in assembling the dataset are discussed in detail in Sections 2 and 3. Details on data file formats and ways to access the dataset are outlined in Section 4. Finally, Section 5 discusses potential applications and limitations of the dataset, with recommendations for additional work.

2 DATA PRODUCTION METHODS

CDD and HDD are calculated using the commonly used American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) method (ASHRAE, 2009), and are defined as follows:
$$\mathrm{CDD} = \sum_{i=1}^{n} \left(T_d - T_b\right)^{+} \tag{1}$$
$$\mathrm{HDD} = \sum_{i=1}^{n} \left(T_b - T_d\right)^{+} \tag{2}$$
where ‘+’ signifies that only positive values accumulate over the n days in the chosen time period (e.g. months, seasons, year). Td and Tb in Equations 1-2 represent the daily mean outdoor air and base (threshold) temperatures, respectively. Degree-days are commonly expressed as °C days or °F days, depending on the underlying units of Td and Tb used in the formulation. Conversion from °C days to °F days (and vice versa) follows a rule similar to that for converting between the temperature scales. For example, CDD computed using °C units can be converted to °F days by using the following relationship:
$$\mathrm{CDD}_{^{\circ}\mathrm{F\ days}} = \frac{9}{5} \times \mathrm{CDD}_{^{\circ}\mathrm{C\ days}} \tag{3}$$
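As an illustration of Equations 1-2 and the unit conversion, the following minimal Python sketch accumulates degree-days from a short series of daily mean temperatures. The function names and the sample temperatures are hypothetical and not part of the dataset workflow. Because degree-days accumulate temperature differences, the sketch applies only the 9/5 scale factor, without the 32° offset used for absolute temperatures.

```python
from typing import Iterable


def cooling_degree_days(daily_mean_temps: Iterable[float], base_temp: float) -> float:
    """Equation 1: sum of the positive parts of (Td - Tb) over the supplied days."""
    return sum(max(td - base_temp, 0.0) for td in daily_mean_temps)


def heating_degree_days(daily_mean_temps: Iterable[float], base_temp: float) -> float:
    """Equation 2: sum of the positive parts of (Tb - Td) over the supplied days."""
    return sum(max(base_temp - td, 0.0) for td in daily_mean_temps)


def celsius_days_to_fahrenheit_days(degree_days_c: float) -> float:
    """Scale degree-days from °C days to °F days (differences scale by 9/5)."""
    return degree_days_c * 9.0 / 5.0


# Hypothetical daily mean temperatures (°C) for five days, with Tb = 18 °C.
sample_td = [16.0, 19.5, 24.0, 27.0, 17.0]
cdd = cooling_degree_days(sample_td, 18.0)   # 1.5 + 6.0 + 9.0
hdd = heating_degree_days(sample_td, 18.0)   # 2.0 + 1.0
cdd_f = celsius_days_to_fahrenheit_days(cdd)
```

A day contributes to either CDD or HDD, never to both, which is why the two indices can later be summed into a combined indicator.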

CDD computed using Td only considers the effect of dry-bulb temperature.² In regions with high relative humidity (rh) such as the coastal regions in New South Wales (Australia), coastal regions in India (e.g. Kerala) and South-Eastern regions of China and Brazil, CDD can have limited applications in determining energy requirements for space cooling (Guan, 2009). For such regions, CDDwb is recommended as a more suitable indicator than the conventional dry-bulb-derived CDD (Guan, 2009; Krese, 2012).

The methodology to compute CDDwb on monthly and annual timescales varies only in the use of wet-bulb temperature (Twb) instead of dry-bulb temperature (or simply Td as discussed in Equation 1). Moreover, the base temperatures and the units of CDDwb also remain unchanged, thus making CDDwb easily comparable to CDD. Twb is the minimum temperature to which air can be cooled by evaporative cooling, and as such, contains information about air temperature as well as moisture content. For further details, readers are referred to Stull (2000, 2011).

Following Stull (2011), average daily Twb is computed utilizing Td and average daily rh as follows:
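$$T_{wb} = T_d\,\mathrm{atan}\!\left[0.151977\,(rh + 8.313659)^{1/2}\right] + \mathrm{atan}(T_d + rh) - \mathrm{atan}(rh - 1.676331) + 0.00391838\,rh^{3/2}\,\mathrm{atan}(0.023101\,rh) - 4.686035 \tag{4}$$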
where the arctangent (atan) function returns values in radians, Td is in °C and rh is in %. Twb is expressed in the same units (°C) as Td.
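The Stull (2011) relationship in Equation 4 can be sketched in a few lines of Python. The function name and the sample inputs are illustrative only. The empirical fit is intended for typical near-surface conditions and is not expected to hold at very low relative humidity.

```python
import math


def wet_bulb_stull(td_c: float, rh_pct: float) -> float:
    """Wet-bulb temperature (°C) from air temperature (°C) and relative
    humidity (%), following the empirical fit of Stull (2011)."""
    return (
        td_c * math.atan(0.151977 * math.sqrt(rh_pct + 8.313659))
        + math.atan(td_c + rh_pct)
        - math.atan(rh_pct - 1.676331)
        + 0.00391838 * rh_pct ** 1.5 * math.atan(0.023101 * rh_pct)
        - 4.686035
    )


# Illustrative inputs: Td = 20 °C and rh = 50 % give a Twb of roughly 13.7 °C.
twb_example = wet_bulb_stull(20.0, 50.0)
```

Substituting the resulting Twb for Td in Equation 1 yields CDDwb with the same base temperatures and units as CDD.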

2.1 Dataset description

The degree-days included in this study are derived using meteorological variables from the Global Land Data Assimilation System (GLDAS) (Rodell et al., 2004). GLDAS is a new-generation, global high-resolution reanalysis data product developed jointly by the National Aeronautics and Space Administration (NASA), Goddard Space Flight Center (GSFC) and National Centers for Environmental Prediction (NCEP) (Ji et al., 2015).

GLDAS incorporates satellite and ground-based observations, producing optimal fields of land surface states and fluxes in near real time, thus facilitating regular updates of the DegDays_0p25_1970_2018 dataset presented in this study (Section 5.1). Furthermore, GLDAS makes available meteorological and land surface variables that are not commonly available in other reanalysis data products either as consistent long time series, or at a high-spatial resolution. Other reanalysis data products available have either (i) a coarser spatial resolution (e.g. ECMWF-ERA40 and JRA-55, both available from the mid-1950s but at 1.125°) or (ii) a shorter time series (e.g. newly released ECMWF-ERA5 at 0.281° from 1979–present day and NCEP-CFSv2 at 0.205° from 2011–present day).

GLDAS provides a consistent quality-controlled long global gridded time series of a number of key meteorological variables at fine-scale spatio-temporal (0.25° gridded,³ 3-hourly) resolution. It has been comprehensively evaluated using different regional/global reference datasets in earlier studies, such as Ji et al. (2015) who compare the GLDAS daily surface air temperature at 0.25° gridded resolution with two reference datasets: (a) Daymet data (2002 and 2010) for the conterminous United States at 1-km gridded resolution and (b) global meteorological observations (2000) from the Global Historical Climatology Network (GHCN).

Examples of previous studies that have incorporated GLDAS data include (a) De Cian and Sue Wing (2019), De Cian et al. (2019) for impact assessment studies in energy sector; and (b) Gao et al. (2014), Zhong et al. (2011) for the analysis of regional environmental conditions and changes. A recent dataset (Mistry, 2019b, 2019c) has also incorporated temperature and precipitation data from GLDAS to assemble a comprehensive set of 71 climate extreme indices. Further details on studies implementing GLDAS are available on https://ldas.gsfc.nasa.gov/gldas/GLDASpublications.php. Some known caveats of GLDAS are discussed in Section 5.2.

3 MATERIALS AND METHODS

The GLDAS variables used in the present study for computing CDD and HDD include daily (a) near-surface maximum (TX) and minimum (TN) temperatures in °C, and in addition (b) surface relative humidity (rh) in % for computing CDDwb. rh is not directly available from GLDAS, but is derived from surface pressure (P) in hectopascals (hPa) or millibars (mb), and specific humidity (Q) in kg kg⁻¹, both made available by GLDAS (Equations 6-8).

The variables (TX, TN, P and Q) covering the years 1970–2018 were obtained at their native 3-hourly time steps in the Network Common Data Form 4 (NetCDF4) format⁴ from GLDAS version 2⁵ (Rodell et al., 2004; Kumar et al., 2006; Peters-Lidard et al., 2007). The daily fields of these variables were assembled using a suite of command line operators from NetCDF Command Operators (NCO ver 4.3.4)⁶ and Climate Data Operators (CDO ver 1.9.0).⁷ A summary of the data variables used, along with the methodology, is provided in Table 1.

Table 1. Summary of monthly and annual degree-days, with the corresponding base temperatures and methodology used in this study
Indicator (°C days) | Tb (°C) | Variable used | Equations
CDD | 18, 18.3, 22, 23, 24, 25 | Td | 1, 5
CDDwb | 18, 18.3, 22, 23, 24, 25 | Twb | 1ᵃ, 4ᵇ
HDD | 10, 15, 15.5, 16, 17, 18 | Td | 2, 5
  • a Equation 1 utilizes Twb in lieu of Td.
  • b Equation 4 derived using Equations 5-8.

Td used in the computation of degree-days (CDD, CDDwb and HDD) was calculated as an arithmetic average of TX and TN at 3-hourly intervals (Equation 5), before computing the daily average.
$$T_d = \frac{TX + TN}{2} \tag{5}$$
rh (%) used in the computation of Twb (Equation 4) was computed utilizing P and Q as follows:
$$rh = 100 \times \frac{VP}{SVP} \tag{6}$$
$$SVP = a \exp\!\left(\frac{b\,T_d}{T_d + c}\right) \tag{7}$$
$$VP = \frac{Q\,P}{0.622 + 0.378\,Q} \tag{8}$$
where VP is the vapour pressure (in hPa), SVP is the saturation vapour pressure (in hPa), and a, b and c are the empirical coefficients of the Magnus equation.

Equation 7 is referred to as the Magnus equation or the Magnus–Tetens equation, or the August–Roche–Magnus equation (Tetens, 1930; Webb, 1994), and is defined for temperatures above 0°C. Equations 6-8 are discussed in detail in Stull (2000).
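The derivation of rh from P and Q can be sketched as follows. This is a minimal illustration, not the processing code used for the dataset. The Magnus coefficients (6.112 hPa, 17.67 and 243.5°C, a commonly cited set from Bolton, 1980) are an assumption, as are the function names and the clamping of rh to the 0–100% range.

```python
import math

EPSILON = 0.622  # ratio of the molar masses of water vapour and dry air


def vapour_pressure_hpa(q_kg_per_kg: float, p_hpa: float) -> float:
    """Vapour pressure (hPa) from specific humidity (kg/kg) and pressure (hPa)."""
    return q_kg_per_kg * p_hpa / (EPSILON + (1.0 - EPSILON) * q_kg_per_kg)


def saturation_vapour_pressure_hpa(
    td_c: float, a: float = 6.112, b: float = 17.67, c: float = 243.5
) -> float:
    """Magnus-type saturation vapour pressure (hPa) over water.

    The default coefficients are an assumed, commonly used set; they are not
    confirmed to be the ones applied in the dataset.
    """
    return a * math.exp(b * td_c / (td_c + c))


def relative_humidity_pct(q_kg_per_kg: float, p_hpa: float, td_c: float) -> float:
    """Relative humidity (%) as 100 * VP / SVP, clamped to [0, 100] (assumption)."""
    rh = 100.0 * vapour_pressure_hpa(q_kg_per_kg, p_hpa) / saturation_vapour_pressure_hpa(td_c)
    return min(max(rh, 0.0), 100.0)


# Illustrative values: Q = 0.0073 kg/kg at P = 1000 hPa and Td = 20 °C.
rh_example = relative_humidity_pct(0.0073, 1000.0, 20.0)
```

The resulting rh, together with Td, is the input required by Equation 4.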

3.1 Spatial and Temporal coverage of DegDays_0p25_1970_2018

The spatial extent of GLDAS covers all land north of 60°S latitude. Consequently, the degree-days in DegDays_0p25_1970_2018 are also computed over the corresponding 1,440 (longitude) × 600 (latitude) grid cells spanning 90°N–60°S, at the same 0.25° gridded resolution. Because GLDAS does not record data at or near water bodies, the grid cells in the proximity of water bodies do not report degree-days. Figure 1 (a–c) shows the mean 1970–2018 annual degree-days using Tb = 18°C at the native 0.25° gridded resolution.
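For users extracting values at point locations, the grid geometry described above can be expressed as a simple index calculation. The sketch below assumes that rows run south to north starting at 60°S and that columns run west to east starting at 180°W. This ordering, and the function names, are assumptions and should be checked against the coordinate variables in the files.

```python
import math

RES = 0.25          # grid resolution in degrees
LAT_MIN = -60.0     # southern edge of the grid
LAT_MAX = 90.0      # northern edge of the grid
LON_MIN = -180.0    # western edge of the grid (assumed)
N_LAT = 600         # number of latitude cells
N_LON = 1440        # number of longitude cells


def grid_index(lat: float, lon: float) -> tuple:
    """Return the (row, col) of the 0.25° cell containing a point."""
    if not (LAT_MIN <= lat < LAT_MAX) or not (LON_MIN <= lon < 180.0):
        raise ValueError("point lies outside the 60°S-90°N grid")
    row = int(math.floor((lat - LAT_MIN) / RES))
    col = int(math.floor((lon - LON_MIN) / RES))
    return row, col


def cell_centre(row: int, col: int) -> tuple:
    """Return the (lat, lon) of the centre of a grid cell."""
    return LAT_MIN + (row + 0.5) * RES, LON_MIN + (col + 0.5) * RES


# Example: the cell containing a point near Venice, Italy (45.44°N, 12.33°E).
venice_cell = grid_index(45.44, 12.33)
```

The 600 rows follow directly from the 150° latitude span divided by the 0.25° resolution.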

Figure 1. Global maps of mean 1970–2018 annual (a) CDD, (b) CDDwb and (c) HDD, as °C days, computed using Tb = 18°C, at 0.25° grid-cell level. Country boundaries overlaid to show spatial distribution of degree-days. At a given Td and rh < 100%, Twb will be lower than Td. The CDDwb computed at the same Tb (as in CDD) therefore show a lower range of °C days compared to CDD.

4 DATASET LOCATION AND FORMAT

The degree-days in DegDays_0p25_1970_2018 on monthly and annual timescales spanning years 1970–2018, computed using different base (threshold) temperatures (Table 1), are freely available in two widely used data formats: NetCDF-4 (.nc4) and Georeferenced Tagged Image File (GeoTIFF) (.tif). While the former is a scientific data format commonly used by the climate research and modelling community, the latter is popular among users applying geospatial analysis. Both data formats are compatible with a number of software packages and desktop GIS tools, such as R, Python, MATLAB and QGIS. Additionally, command line tools such as CDO and NCO are recommended for reading, manipulating and analysing the NetCDF-4 data format.

Data can be accessed as a compressed .tar.bz2 archive containing the individual .nc4 and .tif files from https://doi.pangaea.de/10.1594/PANGAEA.903123. The files follow the naming convention ‘gldas_0p25_deg_DD_base_T_degC_1970_2018_timescale.nc4’, wherein ‘DD’ is the abbreviation of the index (CDD, CDDwb or HDD), ‘degC’ is the threshold temperature (Tb, in °C) used in the computation, and ‘timescale’ is either ‘ann’ or ‘mon’, relating to the annual or monthly timescales over which the corresponding degree-days are computed.

Grid cells with missing values are identified by ‘1.e+20f’. Further details of the variables/dimensions in the individual NetCDF4 files can be examined using either the netCDF ncdump utility or CDO, with commands such as ‘ncdump -h netcdf_file_name’ or ‘cdo sinfo netcdf_file_name’, respectively. For creating quick plots and exploratory data analysis of individual NetCDF files, open-access data tools such as Panoply (https://www.giss.nasa.gov/tools/panoply/) or NCview (http://meteora.ucsd.edu/~pierce/ncview_home_page.html) are recommended.
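When the data are read as raw numbers rather than through a library that masks fill values automatically, the missing-value flag needs to be handled before any statistics are computed. The following sketch, with hypothetical function names and sample values, shows one way to do this. A tolerance is used because a single-precision 1.e+20 does not compare exactly equal to the double-precision literal 1e20.

```python
import math

FILL_VALUE = 1.0e20  # missing-value flag reported for the files


def mask_missing(values, fill_value=FILL_VALUE):
    """Replace fill values with NaN; a relative tolerance absorbs float32 rounding."""
    return [
        math.nan if math.isclose(v, fill_value, rel_tol=1e-6) else v
        for v in values
    ]


def mean_valid(values):
    """Mean of the non-missing values, or NaN if every value is missing."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Hypothetical annual CDD values; the second entry is 1e20 as stored in float32.
raw = [1200.0, 100000002004087734272.0, 1400.0]
masked = mask_missing(raw)
regional_mean = mean_valid(masked)
```

Without this step, a single unmasked fill value would dominate any spatial average.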

5 DATASET USE, LIMITATIONS AND SCOPE FOR FURTHER WORK

5.1 Scope of application

Potential scope and applications of DegDays_0p25_1970_2018 include empirical assessment of energy demands at regional and global scales; implications for the efficiency of building heating/cooling systems (such as Heating, Ventilation and Air Conditioning (HVAC) systems); and cluster analysis of grid cells for identification of regions with similar historical spatio-temporal patterns of degree-days.

DegDays_0p25_1970_2018 enables users to apply degree-days using various (a) spatial scales, by aggregating grid cells to regional, national or user-defined boundaries; (b) temporal scales, by aggregating monthly degree-days to seasonal (e.g. winter months) or user-defined periods; and (c) weighting options,⁸ for example population or other socio-economic indicator weighted degree-days, again at varying spatio-temporal scales.
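The aggregation options listed above reduce to weighted averages over grid cells and sums over months. A minimal sketch follows, with hypothetical function names and made-up numbers. It assumes that missing cells have already been converted to NaN, and the seasonal helper works within a single calendar year of monthly values.

```python
import math


def weighted_mean(degree_days, weights):
    """Weighted spatial mean of grid-cell degree-days (e.g. population weights).

    Cells with missing degree-days (NaN) or non-positive weight are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for dd, w in zip(degree_days, weights):
        if math.isnan(dd) or w <= 0:
            continue
        numerator += dd * w
        denominator += w
    return numerator / denominator if denominator > 0 else math.nan


def seasonal_total(monthly_degree_days, months):
    """Sum monthly degree-days over the given calendar months (1 = January)."""
    return sum(monthly_degree_days[m - 1] for m in months)


# Hypothetical example: three cells in a region, one of them missing.
cell_cdd = [100.0, 200.0, math.nan]
cell_population = [1.0, 3.0, 5.0]
pop_weighted_cdd = weighted_mean(cell_cdd, cell_population)

# Hypothetical monthly HDD for one cell, aggregated to December-February.
monthly_hdd = [300.0, 250.0, 200.0, 100.0, 40.0, 5.0, 0.0, 0.0, 20.0, 90.0, 180.0, 280.0]
winter_hdd = seasonal_total(monthly_hdd, (12, 1, 2))
```

For a winter season spanning two calendar years, the December value would instead be taken from the preceding year's monthly series.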

For instance, linear trends in annual CDD (Tb = 24°C) for Mexico (Figure 2) are examined using the Mann–Kendall⁹ test, as implemented in the spatialEco package (Evans, 2018) for R (R Core Team, 2018).

Figure 2. CDD using Tb = 24°C at 0.25° grid-cell level for Mexico illustrating (a) trends (°C days/year) and (b) mean 1970–2018 (°C days). White regions in trends indicate that the Mann–Kendall test is not significant at p < 0.05. Regional boundaries overlaid to show spatial patterns of climatological mean and trends.

Trend analysis, as well as other statistical and machine learning approaches (e.g. cluster analysis), can facilitate identification of potential cooling/heating demand patterns in recent decades.¹⁰ As evident from Figure 2a, the north-west states of Sonora and Sinaloa along the Gulf of California show a significant positive trend (8–12°C days year⁻¹, at p < 0.05) in CDD. Together with information on population distribution and air conditioning in households, the fine-scale degree-days available in DegDays_0p25_1970_2018 can assist policy planners to identify potential hot-spots in regional-scale energy demands.
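The per-grid-cell trend test can be illustrated with a plain Python version of the Mann–Kendall test, paired with a Theil–Sen slope as the trend magnitude. This is a generic textbook sketch with hypothetical function names, not the spatialEco implementation used for Figure 2. It uses the normal approximation with a tie correction and assumes no serial correlation in the annual series.

```python
import math
from collections import Counter
from statistics import median


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    # Variance of S under the null hypothesis, corrected for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0 or s == 0:
        return s, 0.0, 1.0
    z = (s - math.copysign(1, s)) / math.sqrt(var_s)  # continuity correction
    p_value = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal p-value
    return s, z, p_value


def sen_slope(series):
    """Theil-Sen slope: median of all pairwise slopes (units per time step)."""
    n = len(series)
    return median(
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    )


# Hypothetical annual CDD series rising by 2 °C days per year over ten years.
annual_cdd = [2.0 * year for year in range(10)]
s_stat, z_stat, p_val = mann_kendall(annual_cdd)
trend = sen_slope(annual_cdd)
```

Applying the two functions to the 49-year series of each grid cell, and masking cells with a p-value of 0.05 or more, would give maps of the kind shown in Figure 2a.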

By employing different Tb in compiling DegDays_0p25_1970_2018, users also have flexibility in applying degree-days across broader climatic regions (Indraganti and Boussaa, 2017). Recent studies such as Krese et al. (2012) and Lee et al. (2014) have highlighted the sensitivity to the choice of Tb, both in the assessment of energy demands and in shaping policy measures for consumption of residential/commercial cooling and heating devices.

5.2 Limitations

While the ASHRAE (2009) methodology employed for computing degree-days in this study is one of the commonly adopted approaches in literature, the Td used in the formulation may make the degree-days less applicable for certain applications. For instance, fluctuations of Td around the Tb, as well as the asymmetry between Td and diurnal temperature variations are important (Spinoni et al., 2018); both of which are not accounted for fully by the degree-days in DegDays_0p25_1970_2018.

The different methodologies to compute Td using daily and sub-daily TX and TN, and the subsequent potential bias in the derived metric (such as the degree-days in this study), have been well investigated in the literature (e.g. Weiss and Hays, 2005; Ma and Guttorp, 2013; Villarini et al., 2017). The computation of Td as the arithmetic mean of TX and TN (Equation 5) was driven by the choice of methodology (ASHRAE, 2009) for computing degree-days (Equations 1-2) in this study. Any potential bias in the monthly and annual degree-days arising from the use of the arithmetic mean for Td is likely to be negligible, as highlighted by Villarini et al. (2017). Moreover, as emphasized by Weiss and Hays (2005), the choice of methodology in computing Td becomes more relevant when the outcome metric is based on a nonlinear algorithm, which is not the case in this study.

While the underlying reasons for utilizing GLDAS in this study have been discussed in detail in Section 2.1, applications of indices (especially in impacts assessment) should, whenever possible, incorporate input variables from different underlying data products to account for parameter and model uncertainty. For instance, certain known limitations of GLDAS data, such as larger uncertainty in the surface air temperature estimates over high mountainous areas, are well documented in the literature (Ji et al., 2015). Users of GLDAS-derived data products, such as those of De Cian et al. (2019) and Mistry (2019b) as well as DegDays_0p25_1970_2018 in this study, are advised to pay attention to these data caveats.

Moreover, as highlighted in Section 3.1, the grid cells in the proximity of water bodies do not report degree-days because of missing data in GLDAS. This can introduce some limitations for users focusing on point locations or regions smaller than ~27 × 27 km² close to water bodies (including lakes and rivers), especially in densely populated areas in coastal regions. Such instances in DegDays_0p25_1970_2018 are likely to be minimal because the criteria to assign a grid cell as land or water in GLDAS ver-2 data are based on a very high-resolution land-water mask.¹¹ Nevertheless, one workaround to fill these gaps in the degree-days data would be to use an appropriate interpolation technique (e.g. bilinear, nearest neighbour, inverse-distance weighting) using software routines commonly available in R, CDO, etc.
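As one illustration of such gap filling, the sketch below applies a brute-force nearest-neighbour fill to a small two-dimensional grid in which missing cells are stored as NaN. The function name and the sample grid are hypothetical. Distances are measured in grid-index space rather than as great-circle distances, which is a simplification, and the exhaustive search is only practical for small regional subsets.

```python
import math


def fill_nearest(grid):
    """Fill NaN cells with the value of the nearest non-missing cell.

    Distance is the squared Euclidean distance in (row, col) index space;
    ties are resolved in favour of the first valid cell in row-major order.
    """
    valid = [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if not math.isnan(value)
    ]
    filled = [list(row) for row in grid]
    if not valid:
        return filled  # nothing to interpolate from
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if math.isnan(value):
                nr, nc = min(valid, key=lambda rc: (rc[0] - r) ** 2 + (rc[1] - c) ** 2)
                filled[r][c] = grid[nr][nc]
    return filled


# Hypothetical strip of four cells with two missing values near a coastline.
coastal_strip = [[10.0, math.nan, math.nan, 40.0]]
filled_strip = fill_nearest(coastal_strip)
```

Filled values of this kind are estimates only and carry none of the quality control applied to the underlying GLDAS land cells.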

Lastly, it is important to emphasize that while CDD and HDD have been widely adopted in the literature as indicators of cooling and heating demands, respectively, they should not be construed either as ‘perfect’ indicators of energy demands for heating and cooling, or as being representative of outdoor thermal comfort (Petri and Caldeira, 2015). Nevertheless, degree-days can be applied as proxy indicators to understand both independent and combined cooling and heating energy requirements (see Petri and Caldeira, 2015, for an example of an aggregated CDD + HDD indicator of the total amount of cooling and heating needs).

5.3 Ongoing work and recommendations for work in future

A key motivation of this study is to provide an open-source, high spatio-temporal resolution dataset of degree-days, computed using multiple Tb and updated for the most recent years. Consequently, subject to the availability of the required GLDAS input meteorological variables in the coming years, DegDays_0p25_1970_2018 will be kept updated and made available to the research and end-user communities.

Additionally, another dataset of indices largely relevant for the health sector but also for the energy sector (called ‘HEI_0p25_1970_2018’) is currently under preparation (Mistry, 2019a). One feature of HEI_0p25_1970_2018 will be the inclusion of indices accounting for wind as a feel factor, in addition to the Td, Twb and rh used in this study. For instance, two of the indices, ‘Wind Chill’ and ‘Apparent Temperature’, are aimed at addressing human discomfort factors in cold and warm thermal environments. Together, DegDays_0p25_1970_2018 and HEI_0p25_1970_2018, as well as the recently published dataset on climate extreme indices ‘CEI_0p25_1970_2016’ (Mistry, 2019b, 2019c), are aimed at addressing the growing needs of the climate impact community, by overcoming the current data scarcity of high-resolution global gridded CEIs in climate science.

DegDays_0p25_1970_2018 is currently the only comprehensive set of degree-days computed at a global high-spatial resolution using multiple Tb (see Table S1 for a summary of other existing publicly available degree-days’ datasets covering selective regions). Nevertheless, it is based on a single global reanalysis dataset (GLDAS), employs one of the known methods in formulating degree-days (ASHRAE, 2009), and may be restrictive in applications due to the selective (although broad range) choice of Tb. Datasets of similar energy indicators based on additional observed/reanalysis datasets should be considered for a robust assessment of energy impacts. The compilation of such datasets is recommended for work in future.

ACKNOWLEDGEMENTS

The research presented in this paper was funded by a grant from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme, under grant agreement No. 756194 (ENERGYA). The author is grateful to Enrica De Cian (ENERGYA) and two anonymous reviewers for suggestions to improve the manuscript; Alexander Ruane, Sujay Kumar and Hiroko Kato Beaudoing from NASA GSFC for clarifying doubts relating to GLDAS data; and NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) for making GLDAS data publicly available. Developers of R raster/rgdal/sp/spatialEco packages, CDO and NCO are also acknowledged for providing open-access tools that were used for data preparation in this study.

    CONFLICT OF INTEREST

    The author declares no conflict of interest.

    OPEN PRACTICES

    Open Data

    This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.pangaea.de/10.1594/PANGAEA.903123. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.

    • 1 Definitions of degree-days applying Tb differently in calculations also exist (see CIBSE, 2006). This study uses the definition adopted by ASHRAE (2009).
    • 2 For clarity, the daily mean outdoor air temperature (Td) referred to in Equations 1-2 is measured using a dry-bulb thermometer. Hence, Td is also referred to as dry-bulb temperature in the remainder of the text.
    • 3 ~27 km × 27 km at the equator.
    • 4 NetCDF is a set of scientific software libraries, with self-describing and machine-independent data format. https://www.unidata.ucar.edu/software/netcdf/docs/.
    • 5 Data accessed from https://disc.gsfc.nasa.gov/ on 12 April 2019.
    • 6 NCO (Zender, 2008) accessed on 14 July 2018 from http://nco.sourceforge.net/.
    • 7 CDO (Schulzweida, 2018) accessed on 14 July 2018 from http://www.mpimet.mpg.de/cdo.
    • 8 Readers are referred to Hanigan et al. (2006) for a detailed discussion on methods for calculating population exposure estimates of derived meteorological parameters.
    • 9 The Mann-Kendall test developed by Mann (1945) and Kendall (1975), and expanded by Dietz and Killeen (1981), is a commonly-used nonparametric test for time trend analysis.
    • 10 Additional animations of global-gridded annual CDD, CDDwb and HDD (using Tb = 18°C) are provided in the online Supporting Information.
    • 11 Further details on the land-water mask used in GLDAS ver-2 data are provided in the online Supporting Information.