Reconstruction of a long‐term historical daily maximum and minimum air temperature network dataset for Ireland (1831‐1968)

The extension of spatial and temporal coverage of digital daily maximum and minimum air temperature observations is indispensable for a greater understanding of past climate variability. Long‐term series are fundamental for the assessment of frequency, duration, intensity and geographical distribution of past extreme air temperature events at local and regional scales in Ireland. Raw daily observations from 12 long‐term and 21 short‐term maximum and minimum air temperature series in Ireland, extending from 1831 to 1968, were rescued from multiple archives. Detailed station metadata on instrumentation, site location, observation practices and observer's notes are included in the dataset. Over 970,000 daily maximum and minimum air temperature observations were transcribed from handwritten meteorological registers, publications, newspapers and the Daily Weather Report. The data rescue strategies, sources for data and metadata rescue, and methodologies for double keying are discussed. The Ireland Long‐term Maximum and Minimum Air Temperature dataset (ILMMT) format for daily air temperature and metadata and organization is reviewed. The ILMMT dataset comprises raw observations and detailed station metadata, so data users can apply their selected quality control and homogenization approaches.


| INTRODUCTION
Long-term instrumental daily air temperature datasets are crucial for better assessing past climate variability and trends, for evaluating extreme air temperature events and for supporting the validation of palaeoclimate reconstructions from proxies or documentary sources (World Meteorological Organization, 2016). Instrumental series are important for the generation of climate products, such as long-term gridded datasets. In addition, long-term series are fundamental for climate monitoring, climate change detection and attribution, climate modelling and to assist climate action and adaptation policies. Thus, long-term instrumental series are required to fill key gaps in climate research at a global and national scale. A dataset of geographically well-distributed long-term daily air temperature series dating back to the early 19th century is paramount for understanding Irish climate variability and extreme air temperature events at local and regional scales, as previous research has focused on the period dating back to the 1940s (McElwain and Sweeney, 2007).
Climate data and metadata rescue are essential to preserve historical instrumental observations that are in danger of being lost due to the vulnerability of original paper datasources (World Meteorological Organization, 2016). Rescue of instrumental records allows more complete climate datasets and improves data availability for researchers, meteorological institutes, stakeholders and policy-makers. Climate data rescue is necessary in Ireland as instrumental meteorological observations date back to the 17th century (Shields, 1983) and continuous readings of daily maximum and minimum air temperature started in the early to mid-19th century.
Multiple climate data rescue initiatives have been undertaken, such as the International Surface Temperature Initiative (Thorne et al., 2011), I-DARE (International Data Rescue Portal, https://www.idare-portal.org/content/dare) or the International Atmospheric Circulation Reconstructions over the Earth (ACRE). On the island of Ireland, monthly (Noone et al., 2016; Murphy et al., 2018) and daily (Ryan et al., 2018) rainfall series were rescued. Data rescue of climate elements observed at Armagh Observatory was carried out, including daily maximum and minimum air temperature observations (Butler et al., 2005). Historical maximum and minimum air temperature series registered at Markree were previously rescued, although not distributed as open access (McKeown et al., 2012). Prior to this project, most of the daily air temperature records in Ireland preceding the 1960s had not been digitized and largely existed as fragile manuscripts and scattered publications stored in various archives across Ireland and abroad. The lack of open access to meteorological observations in a digital format constitutes an obstacle to climate data analysis and research. This research accomplishes the rescue of data and comprehensive metadata from 12 long-term instrumental daily maximum and minimum air temperature series running from the early and mid-19th century to 1968 in Ireland. To facilitate future quality control and homogenization procedures, 21 short-term series recorded in the mid-19th century were also rescued. Early instrumental short-term series are crucial to assess rare weather events (Brönnimann et al., 2019). Although the examined datasources contain other historical observed climate elements, only the daily maximum and minimum air temperature observations were rescued, because the work was carried out under a funded research project for the assessment of past extreme air temperature events in Ireland.
A key aim of this research is to make the digital raw series of daily maximum and minimum air temperature observations and related detailed metadata available as open access through the Ireland Long-term Maximum and Minimum Air Temperature dataset (ILMMT) to the wider scientific community, stakeholders and the public. The application of quality control and homogenization procedures on the rescued data is out of the scope of this article.

| Long and short-term daily maximum and minimum air temperature series
Early meteorological observations were geographically well dispersed throughout Ireland (Figure 1, Tables 1 and 2). Observations were initially undertaken by a variety of observers, such as physicians interested in the relationship between weather and mortality (e.g. John William Moore at Fitzwilliam Square Dublin), scientific societies like the Royal Dublin Society or the Royal Irish Academy, Royal Engineers at the Ordnance Survey Office in Phoenix Park Dublin, professors at Trinity College Dublin and National University of Ireland Galway (NUI Galway), astronomical observatories, for instance Markree and Birr, and other amateurs. In Ireland and internationally, well-educated amateur observers were responsible for the generation of early instrumental meteorological observations prior to the creation of the National Meteorological Services (e.g. Ashcroft et al., 2014; Brönnimann et al., 2019). The 12 long-term series included in the ILMMT dataset comprises a network of telegraphic … The majority of short-term series in the mid-19th century (Figure 1, Table 2) were under the authority of the Royal Irish Academy (Lloyd, 1853), while the remaining observations were conducted by volunteers (Kilkenny, Blackrock and Glendooen), instrument makers (Grafton Street Dublin), physicians (Portarlington), professors (Royal College of Surgeons Dublin) or clerks (Dublin Commercial Buildings).
Detailed station histories and metadata on location, instrumentation and observing practices are available in each station file which accompanies the dataset.

| Sources of data rescue
Long-term observations are often spread across diverse datasources preserved by multiple data-holders. Data included in the ILMMT dataset were rescued from the Met Éireann archives, which hold catalogued handwritten meteorological registers containing long-term daily maximum and minimum air temperature series recorded across Ireland dating back to 1855 (Keane et al., 2017). Thermometer observations were also rescued from archives at NUI Galway, National Library of Ireland, Royal Irish Academy, National Botanic Gardens, Met Office, Royal Dublin Society, Trinity College Dublin and online resources (British Newspaper Archive, JSTOR, Hathi Trust Digital Library, Google Books and Digital Library and Archive of the Met Office) to ensure completeness of the long-term series (Tables 1 and 2). The majority of instrumental records were rescued from the archive at Met Éireann (75.9%). The remaining observations were rescued from archives at NUI Galway (5.9%), National Library of Ireland (5.2%), online sources (which comprise Daily Weather Reports, newspapers, proceedings and transactions, 5.1%), National Botanic Gardens Dublin (3.4%), National Meteorological Archive at the Met Office (2.0%), Royal Irish Academy (1.6%), Trinity College Dublin (0.75%) and Armagh Observatory (0.2%).
Although it is desirable to rescue data from a single datasource such as the original handwritten meteorological logs, these sources are sometimes not traceable. Multiple datasources were utilized to preserve the entire long-term series where possible. The handwritten meteorological registers were the primary datasource selected (Figure 2). When the original manuscripts were missing, the data were rescued from the Daily Weather Report (available online at the Digital Library and Archive of the Met Office, https://digital.nmla.metoffice.gov.uk/SO_86058de1-8d55-4bc5-8305-5698d0bd7e13/), newspapers in the British Newspaper Archive (https://www.britishnewspaperarchive.co.uk/), and scientific literature, which comprises monographs, transactions and proceedings. Two datasources may be available due to duplicated instruments but with different times of thermometer readings
TABLE 1 Station data rescue sources for long-term series. Station numbers correspond to the map in Figure 1. Acronyms for data-holders: National Library of Ireland (NLI), Met Éireann (ME), Meteorological Office (MO), National Botanic Gardens of Ireland (NBG), NUI Galway special collections (NUI Galway) and Royal Irish Academy (RIA)

| Sources of metadata rescue
Detailed and complete station metadata are crucial to achieve high-quality daily instrumental time-series through quality control and homogenization procedures (Aguilar et al., 2003; Venema et al., 2018) and are thus necessary to data users. Important metadata for air temperature observations include the following: location and relocation of the meteorological station, station surroundings and type of land use and cover, type of thermometer screen, height of thermometer screen above ground, thermometer exposure and position, types of thermometers, time of thermometer setting and observation, number of thermometer observations per day, meteorological observer and instrument maintenance and replacement. Additional metadata comprise observer's comments, explanations of missing observations, thermometer calibration errors, standard time of observation (e.g. Greenwich Mean Time or Local Mean Time) and type of station (e.g. private register, telegraphic or second order). Diverse sources were exploited to rescue the detailed station metadata included in the ILMMT dataset. Notes in the original handwritten meteorological registers, publications, Daily Weather Report and newspapers were rescued. In addition, station inspection reports available as original manuscripts or as appendices in meteorological publications such as the annual Report of the Meteorological Council to the Royal Society (e.g. Meteorological Council, 1882) were utilized to rescue metadata. Station photographs taken over time, which can show possible changes in the location, type of thermometer screen, and land use at the station enclosure and surroundings (Figure 3), are furnished in the ILMMT dataset. Where no station photographs are available for the early 19th century, drawings are furnished, for example that by Cameron (1856) of the thermometer screen at Phoenix Park Dublin.
Additionally, the ILMMT dataset comprises rescued metadata from publications by institutions or societies responsible for the supervision of the meteorological observations. For example, Cameron (1856) furnishes information on location, instrumentation and observing practices by the Royal Engineers at Phoenix Park Dublin. Metadata in observers' publications on their meteorological observations, such as Wynne (1886) at Killarney, were consulted to appraise early observation practices and the instrumentation used. Observers' correspondence was rescued, including query forms from the Meteorological Office to the observers about the meteorological records, which are important for assessing any instrumentation or observation errors. Publications such as Morley (1964), which contains a map of the location of the thermometer screen at Trinity College Dublin, are also cited in the metadata files. Metadata printed in newspapers were also rescued, for instance the Dublin Evening Post of 29 December 1849 on the Rutherford's self-registering minimum thermometer in use at the Commercial Buildings Dublin.

| Data rescue strategy
The data rescue strategy followed the best climate data rescue practices outlined by the World Meteorological Organization (2016). The first step consisted of checking digital climate datasets such as the European Climate Assessment & Dataset (ECA&D) (https://www.ecad.eu/) and the Met Éireann digital database (https://www.met.ie/climate/available-data/historical-data) to determine the availability and completeness of the long-term daily maximum and minimum air temperature series. Scanned publications and newspapers containing historical daily maximum and minimum air temperature observations and available in online resources (British Newspaper Archive, JSTOR, Hathi Trust Digital Library, Google Books and Digital Library and Archive of the Met Office) were examined. Contacts were established with diverse archive holders of catalogued and non-catalogued climate data and metadata. The archives at Met Éireann, National Library of Ireland, National Botanic Gardens Dublin, Royal Irish Academy, Special Collections at NUI Galway, Royal Dublin Society, Trinity College Dublin and the National Archive of the Met Office were consulted for imaging the existing meteorological registers. Images of each page of the handwritten meteorological registers were carefully taken with a digital camera so as to ensure the legibility of the daily observations during manual keying. Station metadata sources were identified for rescue. Station folders comprising meteorological register images for each calendar year and for each station (with instructions and MS Excel templates) were organized for manual data keying. Finally, contacts were established to engage secondary school students (as part of a service-learning programme), university students and citizen scientists in climate data keying.
Brönnimann et al. (2006) tested diverse methodologies for digitizing manuscript climate data (optical character recognition, speech recognition and manual entry) and concluded that manual key entry was the quickest digitization methodology and led to the fewest transcribing errors. Ashcroft et al. (2018) also identify manual keying as the best climate digitization methodology after testing speech recognition and optical character recognition technologies. Issues with the legibility of the Irish handwritten meteorological returns included penmanship, faint ink or blurred registers, omission of points as decimal separators, shifts in the registers when the maximum air temperature was recorded in the morning but not entered to the previous day, maximum air temperature readings registered by mistake in the minimum air temperature column or vice versa, reversed values and many corrections written over previous records. Manual key entry into MS Excel templates was the methodology chosen for fast and low-cost digitization of the historical maximum and minimum thermometer observations.

| Keying procedures
Observations were keyed as they are written in the datasource, following the standard data rescue practice advised by the World Meteorological Organization (2016). Consequently, obvious errors in the source, such as a daily maximum air temperature lower than the minimum air temperature, or outliers, were keyed unchanged and must be examined during quality control procedures.
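Because such errors are retained in the raw series, a data user may want to list them before any analysis. The following is a minimal sketch of one such check, not the authors' procedure; the record layout, the function name `flag_inverted` and the sample values are illustrative assumptions.

```python
from typing import List, NamedTuple, Optional


class DailyRecord(NamedTuple):
    """One keyed day (hypothetical layout, not the ILMMT file format)."""
    date: str                 # ISO date, e.g. "1885-01-15"
    tmax_f: Optional[float]   # daily maximum in °F, None if missing
    tmin_f: Optional[float]   # daily minimum in °F, None if missing


def flag_inverted(records: List[DailyRecord]) -> List[DailyRecord]:
    """Return records whose maximum is lower than their minimum.

    Days with a missing maximum or minimum are skipped, not flagged.
    """
    return [
        r for r in records
        if r.tmax_f is not None
        and r.tmin_f is not None
        and r.tmax_f < r.tmin_f
    ]


# Illustrative values only, not taken from the dataset.
records = [
    DailyRecord("1885-01-14", 47.0, 38.0),
    DailyRecord("1885-01-15", 40.0, 47.0),   # maximum below minimum
    DailyRecord("1885-01-16", None, 39.0),   # missing maximum
]
flagged = flag_inverted(records)
```

A flagged day is only a candidate for inspection: the inversion may be a genuine error in the register, which is exactly what the raw dataset is meant to preserve.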
The daily minimum air temperature readings were keyed to the days on which they were read and registered in the datasource. The daily maximum air temperature observations were keyed to the same calendar day as written in the datasource in three cases: (i) the manuscript indicates that the observation taken in the morning had already been entered to the day preceding that on which the reading was made; (ii) there is no mention of the observing time; or (iii) the observations were taken in the morning but the manuscript gives no indication that they had been thrown back to the previous day. However, the daily maximum air temperature values were transcribed and attributed to the previous day if the observations were recorded in the morning and the meteorological return indicates that the values were not thrown back. This procedure refers to early instrumental data recorded at the telegraphic reporting stations and is specified in each station metadata file. Handwritten corrections in the original manuscripts were accepted and digitized. These corrections include red ink or pencil marks (0.6% of rescued data), which were made to address thermometer index adjustments, observer's errors, comparison of observed air temperature values with neighbouring stations, comparison of the air temperature and dry bulb values, interpolation of non-recorded or probable values, or reversed minimum and maximum air temperature values.
To reduce keying errors, once the daily air temperature values for a single month had been keyed, the monthly average and sum generated in the MS Excel template were compared with the monthly average and sum supplied in the majority of the datasources. In addition, a visual cross-check was made between the keyed data and the original datasource to ensure that there were no reversed values, repetitions or values of other observed climate elements (e.g. minimum air temperature on grass or dry bulb thermometer). In cases of poor legibility of the handwritten meteorological registers, publications of the readings taken at stations of second order (e.g. Meteorological Office, 1880) and at the telegraphic reporting stations in the Daily Weather Report were consulted.
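The date allocation rule for daily maximum values can be written as a small decision function. This is a sketch of the rule as we read it from the paragraph above, not code used in the project; the enum `ThrowBackNote`, the function `keyed_date_for_maximum` and its parameters are hypothetical names.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class ThrowBackNote(Enum):
    """What the register says about morning readings (hypothetical labels)."""
    ALREADY_THROWN_BACK = "morning reading already entered to the preceding day"
    NOT_THROWN_BACK = "morning reading explicitly not entered to the preceding day"
    NO_INDICATION = "register gives no indication"


def keyed_date_for_maximum(
    written: date,
    read_in_morning: Optional[bool],   # None when the observing time is unknown
    note: ThrowBackNote,
) -> date:
    """Return the calendar day to which a daily maximum is attributed.

    Only a morning reading that the register says was NOT thrown back
    is moved to the previous day; every other case keeps the written date.
    """
    if read_in_morning and note is ThrowBackNote.NOT_THROWN_BACK:
        return written - timedelta(days=1)
    return written


# A morning reading entered under 15 January and not thrown back in the
# register is attributed to 14 January.
shifted = keyed_date_for_maximum(
    date(1885, 1, 15), True, ThrowBackNote.NOT_THROWN_BACK
)
```

The station metadata files, not this sketch, remain the authority on which rule applies to which station and period.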
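The monthly sum and average comparison was done with formulas in the MS Excel templates; the same idea can be sketched in Python as follows. The function name `monthly_check`, the tolerance of 0.05 and the sample values are assumptions made for illustration.

```python
from statistics import fmean
from typing import Dict, Sequence


def monthly_check(
    keyed: Sequence[float],
    register_sum: float,
    register_mean: float,
    tol: float = 0.05,   # assumed tolerance for rounding in the register
) -> Dict[str, object]:
    """Compare the sum and mean of keyed daily values with the
    monthly sum and mean printed in the datasource."""
    keyed_sum = sum(keyed)
    keyed_mean = fmean(keyed)
    return {
        "keyed_sum": keyed_sum,
        "keyed_mean": keyed_mean,
        "sum_ok": abs(keyed_sum - register_sum) <= tol,
        "mean_ok": abs(keyed_mean - register_mean) <= tol,
    }


# Illustrative five-day "month"; the register states sum 238 and mean 47.6.
good = monthly_check([47, 47, 49, 47, 48], register_sum=238, register_mean=47.6)
# The same days with one transposed value (94 instead of 49).
bad = monthly_check([47, 47, 94, 47, 48], register_sum=238, register_mean=47.6)
```

A mismatch does not say which day is wrong, and a match does not rule out two compensating errors, which is why the visual cross-check and the second keying were still needed.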
Metadata were keyed once into MS Excel station files by the first author. In situations of poor legibility, the second author was responsible for metadata transcription verification. Each rescued station metadata file was converted to a MS Excel file and included in the ILMMT dataset.

| Data transcription verification
Double keying is a necessary procedure to minimize transcribing errors and to ensure data accuracy (World Meteorological Organization, 2016). Thus each daily maximum and minimum air temperature record was keyed by two different persons. The first keying of all daily maximum and minimum air temperature series was accomplished by the first author of this research. The second keying was completed through a variety of methods, whose description and results are available in Mateus et al. (2020). For the first time, over 140 secondary school students (15-16 years old) from 8 schools carried out climate data rescue under service-learning: 127 students were hosted as research collaborators at NUI Galway and 18 students cooperated at school through the Green School module as part of a student-scientist partnership. More than 190 NUI Galway BA Joint Honours (Geography) and BSc Applied Social Science undergraduate students completed data rescue as part of an assignment on climate data rescue and statistical data analysis in the module Geography in Practice, analogous to that investigated by Ryan et al. (2018). In addition, NUI Galway students through the volunteering programme ALIVE (A Learning Initiative and the Volunteering Experience, https://www.studentvolunteer.ie/nuigalway) and volunteers at Met Éireann and the Irish Meteorological Society contributed to the second data keying. Students and volunteers enrolled in the second data keying received theoretical and practical training on climate data rescue. Instructions were given on how to identify the maximum and minimum air temperature columns in the image of the original datasource and how to perform data input into the MS Excel templates.
In order to avoid typing errors, the participants were required to perform a visual cross-check between the keyed data and the datasource. MS Excel templates were basic (Figure 4) and similar to the original datasource (Figure 5) to minimize typing errors and to allow faster keying. Annual MS Excel templates contained a tab for each month, the number of days in each month, columns titled 'maximum temperature' and 'minimum temperature', and formulas to automatically generate the monthly average and sum after keying of daily values, allowing comparison with the values in the datasource.
MS Excel macros were created to compare the consistency of the first and the second keying. In cases of input differences, the second author of this project was furnished with the information on the date and image of the original station datasource in order to confirm the correct air temperature record. Following cross-checking of keyed data, the first data keying comprised 0.036% errors.
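The consistency check between the two keyings was implemented with MS Excel macros; the sketch below shows the same comparison in Python instead. The dictionary layout keyed by `(date, element)` and the function name `compare_keyings` are assumptions, and the sample values are illustrative.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]                      # (ISO date, "tmax" or "tmin")
Keying = Dict[Key, Optional[float]]        # None represents a missing value


def compare_keyings(
    first: Keying, second: Keying
) -> List[Tuple[Key, Optional[float], Optional[float]]]:
    """List every (date, element) where the two keyings disagree,
    including entries present in only one of them."""
    differences = []
    for key in sorted(set(first) | set(second)):
        a, b = first.get(key), second.get(key)
        if a != b:
            differences.append((key, a, b))
    return differences


# Illustrative values only.
first_keying = {
    ("1885-01-14", "tmax"): 47.0, ("1885-01-14", "tmin"): 38.0,
    ("1885-01-15", "tmax"): 47.0, ("1885-01-15", "tmin"): 40.0,
}
second_keying = {
    ("1885-01-14", "tmax"): 47.0, ("1885-01-14", "tmin"): 39.0,  # disagrees
    ("1885-01-15", "tmax"): 47.0, ("1885-01-15", "tmin"): 40.0,
}
diffs = compare_keyings(first_keying, second_keying)
```

Each reported difference carries the date needed to locate the register image, so that a third person can decide which value the manuscript actually shows.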

| RESULTS AND DATA ACCESS
The ILMMT dataset, which comprises 12 long-term and 21 short-term raw daily maximum and minimum air temperature series and related station metadata extending from 1831 to 1968, was rescued and is available through edepositIreland (http://hdl.handle.net/2262/92442) and the Met Éireann website (https://www.met.ie/climate/available-data/longterm-data-sets). The ILMMT dataset provides each station file in CSV format, containing the original raw Fahrenheit and the converted Celsius daily maximum and minimum air temperature series (°C = (°F − 32)/1.8). Supplemental observations are provided, such as values not corrected for thermometer index error, observations in different thermometer exposures, which are crucial to determine instrumental bias, or readings at different setting and observing times, which are necessary to check observing time bias. Missing data are represented as NA (not available).
Detailed and traceable station metadata rescued from multiple sources include references to the data and metadata sources and summaries of early station inspections, which are presented chronologically in each station folder as MS Excel files. The metadata tables are organized in columns (date, metadata and description). Keywords such as 'thermometers', 'inspection of station' or 'location' characterize the type of metadata to aid the users' navigation. Station photographs and drawings are provided in the metadata files. Diverse thermometer exposures were in use prior to the introduction of the Stevenson thermometer screen in the late 1870s and early 1880s, such as: indoors in a window recess, a large closed shed, thermometer stands, attached to an external wall or window typically facing northwards, or non-standard screens which comprised double screens or a pyramidal roof screen painted green. Different types of thermometer exposures led to distinct heights of thermometers above the ground. Thermometers were initially protected against radiation by a wood or metal shade. Particular observing practices were followed for indoor thermometer observations: it was routine to open a window for a few minutes prior to the thermometer readings when the thermometers were placed in a window recess. The rooms adjacent to these indoor thermometers could have had fires in the autumn and winter (Cameron, 1856).
FIGURE 6 Long-term rescued raw annual (a), spring (b), summer (c), autumn (d) and winter (e) minimum air temperature series merged with the modern digital Met Éireann and Armagh Observatory series (Butler et al., 2005; http://www.climate.armagh.ac.uk/)
Early historical self-registering thermometers include, for instance, Rutherford, Six, Negretti and Casella. According to the metadata, observational gaps in the early instrumental series were due to the entanglement of mercury on the maximum thermometer index, the distillation of alcohol into the top of the minimum thermometer tube, thermometer breakages, absence of observers, no observations being made on Sundays or historical political events such as the Easter Rising in April 1916.
Merging the rescued raw series with modern Met Éireann digital observations and with the calibrated series recorded at Armagh Observatory (Butler et al., 2005; http://www.climate.armagh.ac.uk/) allows the spatial and temporal expansion of the long-term maximum and minimum air temperature records in Ireland back to 1831 (Figures 6 and 7). Twice-daily readings of the self-registering thermometers were made at 07 and 18 hr at Malin Head, Blacksod Point, Birr Castle, Valentia Observatory and Roches Point since 1921. Later, the observing times changed to 09 and 21 hr. Thus, for these stations the extreme air temperatures were calculated for the 24 hr periods 07-07 hr and 09-09 hr for Figures 6 and 7. Met Éireann observations from the 1960s onwards refer to the 09-09 hr period. Since raw data are presented, deviations are visible, such as in the minimum air temperature at NUI Galway in the early 1930s (Figure 6). The application of quality control and homogenization techniques is essential to generate high-quality data prior to any climate data analysis. However, the main goal of this article is the provision of raw observations and detailed station metadata so users can apply their selected procedure according to their aims.
FIGURE 7 Long-term rescued raw annual (a), spring (b), summer (c), autumn (d) and winter (e) maximum air temperature series merged with the modern digital Met Éireann and Armagh Observatory series (Butler et al., 2005)
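The conversion stated above, together with the NA convention for missing data, can be reproduced with a few lines of Python. This is a sketch only: the function name `fahrenheit_to_celsius` and the rounding to one decimal place are assumptions, since the text does not state the precision used in the station files.

```python
from typing import Optional


def fahrenheit_to_celsius(value: str, ndigits: int = 1) -> Optional[float]:
    """Convert a keyed Fahrenheit value to Celsius using
    °C = (°F - 32) / 1.8.

    The string "NA" (missing value) is returned as None.
    Rounding to `ndigits` decimals is an assumption of this sketch.
    """
    text = value.strip()
    if text == "NA":
        return None
    return round((float(text) - 32.0) / 1.8, ndigits)


# A few raw values as they might appear in a CSV column.
raw_column = ["32", "50", "47", "NA"]
celsius_column = [fahrenheit_to_celsius(v) for v in raw_column]
```

Keeping the raw Fahrenheit column next to the converted one, as the dataset does, lets users redo the conversion at whatever precision their analysis requires.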

WORK
The examination of diverse data-holders and datasources of historical meteorological observations allowed the rescue of detailed station metadata and the generation of a long-term air temperature dataset. Open access to unexplored and geographically well-distributed daily maximum and minimum air temperature series and related comprehensive metadata dating back to the early and mid-19th century through the ILMMT dataset will fill key gaps in climate research for Ireland, Europe and worldwide. The authors have applied quality control and homogenization procedures to the rescued data, the results of which will be available to users in a forthcoming publication. These series will contribute to the generation of climate products, assist climate change detection and attribution studies and support climate modelling research. The long-term daily maximum and minimum air temperature series will offer a better understanding of past climate variability and trends, and will support the assessment of the frequency, duration, intensity and distribution of extreme air temperature events and the calculation of return periods of rare events. The authors are assessing past extreme air temperature events in Ireland using quality controlled and homogenized daily air temperature data, and findings will be available in a forthcoming publication. Due to the rich heritage and importance of early instrumental observations in Ireland, the authors are undertaking further data rescue and continuing the search for missing manuscripts.
… double keying practices. Thus, these series were double-keyed. The comparison of the double-keyed series with the available series at ECA&D revealed a few keying errors in those series, namely in the NUI Galway series (not shown).