Early meteorological records from Corrientes and Bahía Blanca, Argentina: Initial ACRE-Argentina data rescue and related activities
Dataset provided in PANGAEA Data Archiving & Publication, PDI-32376.
The international Atmospheric Circulation Reconstructions over the Earth (ACRE) initiative works to recover global climate history and build an accessible databank of terrestrial and marine surface data covering the past 250 years. Argentina is part of that effort through ACRE Argentina, which recovers data from various sources recorded throughout the country since the 19th century. In this paper, daily weather observations taken during 1860–1879 at Bahía Blanca and 1873–1886 at Corrientes, transcribed from the original records into digital form, are presented. The digitization was carried out through the project Meteororum ad Extremum Terrae (MET), launched on the Zooniverse platform, which currently works with nearly 900 citizen scientist volunteers per day. The data were retrieved from the collection ‘Anales de la Oficina Meteorológica Argentina’, which contains information on atmospheric pressure, air temperature, relative humidity, cloud cover, cloud types, wind direction, wind speed, rainfall and weather remarks. For the present analysis, only temperature and pressure values, measured in °C and mm of Hg, respectively, were considered, after a quality control of the digitization process was applied. The data values were tested and used to rebuild the time series at both locations, and the correlation between the SOI and the monthly pressure values at both locations was tested using Spearman correlation. Results show that the influence of the El Niño episode of 1877–1878 can be found in the pressure values at Corrientes.
Meteorological and climate data are essential tools to understand climate variability and to assess climate change on regional and global scales (IPCC, 2021). The availability of data, especially those extended over time, can contribute to understand where we are coming from and estimate where we may be going to. The study of the climate system requires the identification of the climate variability, trend and changes. It is thus necessary to have as much data as possible, spanning decades or even centuries. Various data sources have been used for this purpose, spanning many centuries (Bojinski et al., 2014). In recent times, direct and indirect observations are used to increase the climate dataset, but prior to modernity, records of different sources were used. For example, in paleoclimatology the so-called proxy data are often used (Sarachik, 1995). These proxy data are preserved physical characteristics of the environment that can stand in for direct measurements such as ice cores, pollen, tree rings or ocean and lake sediments. In the best of cases, these data represent annual average behaviours. As we go back in time, paleoclimatic information from ice and especially sediments and fossil species becomes more diffuse, with temporal resolutions of decades and centuries. Since proxy data only provide a partial perspective of the climate, it is necessary to retrieve as many meteorological observations back from the mid-20th century to the late Renaissance. Historic instrumental observations can furthermore be compared with proxy data, validating proxy records and extending our understanding of the climate system evolution (e.g., Zitto et al., 2015). This effort can contribute not only to knowing the past of the climate, global and regional, but it will also allow us to know the present and possible future trends in a more accurate way as well as provide data to improve existing reanalysis and models (Brunet & Jones, 2011). 
According to World Meteorological Organization (2016), long-term instrumental data and metadata from historical instrument observation are essential to improve our understanding of past climate variability and trends and to support validation of paleoclimate reconstructions from different sources (Mateus, 2021; Mateus et al., 2020). Hence, such records must be restored and preserved.
The first instrumental weather observations in the world can be found at the beginning of the 17th century in Italy (Camuffo & Bertolin, 2012; Camuffo et al., 2017), with Galileo's thermometer and Torricelli's barometer. Daily observations made by Torricelli at Florence and 10 other stations over several years have recently been recovered in the ‘first world meteorological network’, known as the Medici meteorological network between 1654 and 1670 and in 1694 at Modena, Paris and Essex (Camuffo & Bertolin, 2012). However, it was not until the 19th century that observations with long records increased and became systematic, mostly in the Northern Hemisphere, although some valuable observations during this period were made in the Southern Hemisphere. Domínguez-Castro et al. (2017) pointed out that some of the earliest pressure observations associated with cyclone events were made between the 17th and 19th centuries in Caribbean and central South American countries.
A wide range of sources have preserved historical information over the years, from public institutions, religious organizations, and private and national libraries to academic, scientific and military institutions and national weather services. Argentina has a lengthy, valuable history in the field of meteorology. Instrumental weather observations in Argentina date back at least to the beginning of the 19th century. These were carried out by Argentine citizens as well as resident foreigners. Among those responsible for such early observations, we can mention Manuel Moreno, one of the country's forefathers. Historic institutions such as the Colegio San Carlos (today Colegio Nacional Buenos Aires) carried out regular meteorological observations in Buenos Aires from the early 1850s. Natural science enthusiasts also started regular weather observations, such as Caronti in Bahía Blanca in 1860.
The Oficina Meteorológica Argentina (OMA), established in Córdoba in 1872, was the third such public institution in the world, shortly after those established in Hungary (1870) and the United States (1871). It was created by law by the National Congress after a request made by President Domingo F. Sarmiento, and Dr. Benjamin A. Gould, from the United States, was its first director (‘Cronología Institucional’, Servicio Meteorológico Nacional). Inspired by the recommendations made by Captain Fitz Roy, commander of the H.M.S. Beagle during Charles Darwin's famous trip around the world, the authorities of the Armada de la República Argentina (A.R.A., Argentine National Navy) ordered A.R.A. vessels to record weather and oceanic data in a regular, standard format, starting in the mid-1890s. Argentina maintains the oldest and lengthiest Antarctic observational weather record at Base Antártica Orcadas, Laurie Is., kept since 1903 through cooperation with the Scottish National Antarctic Expedition (SNAE), led by William Speirs Bruce.
Robert Mossman, a famed Scottish explorer, member of SNAE and meteorologist trained the first ever Antarctic weather observers, Argentine citizens, thus beginning the permanent year-round human presence in the south polar region (Davies, 1905; Swinney, 2007). He then settled in Argentina, a prominent member of the OMA, later Servicio Meteorológico Nacional (SMN), leading a distinguished career both in the advancement of Argentina's Meteorological Service as well as in meteorological research, providing a generalized view of the three major atmospheric circulation cells and the positions of the subtropical high and low pressure systems around Antarctica. He was interested in teleconnections, correlations of meteorological and oceanic conditions in different areas, but was especially focused on links between weather and ice in the Antarctic and weather in South America. This coupling, particularly the one which establishes the link between rain in the middle latitudes of South America and the ice in the Weddell Sea, turned out to be of particular relevance for the very recently found Endurance ship, whose wreckage was found at the bottom (3,008 m b.s.l.) of the Weddell Sea, Antarctica (March 9, 2022), after being crushed by the pack ice in 1915 (Burton & King, 2016). Indeed, according to Burton and King (2016), in 1914, the Weddell Sea was known to be a dangerous area due to the pack-ice. It was Mossman who, a week after Shackleton's announcement of his plan to cross the continent of Antarctica from the Weddell Sea to the Ross Sea (The Times, 30 December, 1913; Mill, 1923), drew attention to the particular weather and ice conditions there in an article published in The Manchester Guardian (5 January, 1914; Burton & King, 2016). Train stations across central and northern Argentina also kept weather records (mostly pluviometric records) up till 1990. Many citizens, estancias and private companies also recorded weather observations, throughout the national territories. 
These observations, the way in which they were made and recorded, and who recorded them, are of great value for Argentina and for the world given the particular and extensive geography of the country, which extends between approximately 21.5°S and 55°S and also includes the Antarctic territories, with many Antarctic stations in the Antarctic Peninsula and Weddell Sea. Furthermore, the Argentine Navy has provided logistic support to Base Orcadas since 1904, with at least annual cruises, and, since the late 1940s, to the subsequent Argentine Antarctic Stations, providing valuable meteorological and oceanic observations of Argentina's continental shelf (Mar Argentino), the South Atlantic Ocean and the Southern or Antarctic Ocean.
As in the rest of the world, there are different sources of historic weather records in Argentina. Some are formal publications such as the Anales de la Oficina Meteorológica Argentina (OMA) published between 1878 and 1912 (Davies, 1905). Other sources are unexpected, such as the oldest known weather records, found in a social, political and literary magazine, ‘La Abeja Argentina’, published between 1821 and 1823 in Buenos Aires (see for example ‘Observaciones meteorológicas.1° de Otoño en Buenos Aires, La Abeja Argentina’, vol. 1, p.34; La Abeja Argentina, 1822–1823) and the Census of the Province of Buenos Aires, 1890 (see for example ‘Clima en la República Argentina’, Gualterio G. Davis, 259–379, Censo General de la Provincia de Buenos Aires, 1890). These can be found in a variety of places, such as the Biblioteca Nacional (Argentina's National Library), the Biblioteca Nacional del Maestro (National Ministry of Education) and the Biblioteca Nacional de Meteorología (Servicio Meteorológico Nacional). Many relevant documents have been located in university libraries in Europe and the USA; some of them were donated, for example, to the Oficina Meteorológica Argentina. Hydrological records from harbours on the Paraná and Uruguay rivers were managed at different times by provincial or municipal authorities. Many provincial and municipal governments throughout Argentina also kept weather stations and records. Many such hand-written documents vanished from municipal archives in Argentina, reappearing years later in university collections abroad, but without access for Argentine researchers. It is hoped that such institutions will contribute to current ACRE Argentina Data Rescue (DR) activities by facilitating access to such one-of-a-kind public documents, which were originally part of municipal and/or provincial patrimony.
Historic ship logbooks belonging to the Argentine Navy are stored and curated in the Archivo General de la Armada (https://www.argentina.gob.ar/defensa/archivos-abiertos/instituciones-de-archivo/archivo-general-de-la-armada). Many estancias (rural establishments) and mining companies kept weather records since the late 19th century, which need to be recovered and curated. The Claris EC 6th Framework Programme international project has recovered precipitation, and/or maximum air temperature and minimum air temperature records held by estancias within the Rio de la Plata Basin (Paraná and Uruguay rivers and its affluents), spanning eastern central and northeastern Argentina (NEA), parts of Uruguay and Bolivia, Paraguay and southern Brazil. However, similar records from the rest of Argentina need to be located and curated.
Transforming historical records into usable data is a challenging task. In general, the observations made in different places were not homogeneous and the instruments used varied over the years, often without calibration, or at least without the calibration information being made available. This can be an issue particularly prior to the creation of the International Meteorological Organization (IMO), the WMO precursor, in Vienna in 1873, which standardized procedures and metadata records. Since that time, many instruments deployed in Argentina during the 19th century, particularly all those deployed by the OMA, were calibrated at Kew Gardens, UK, as the metadata contained in the Anales OMA attest.
The actual DR process is lengthy and complex. The sources of records must first be located and access to them guaranteed, both for public and private hands. The records must be photographed/imaged in order to have clear images that can be ordered, prepared for digitization and the data and metadata contents transcribed. Many of these records are handwritten, so the transcription can be really complex. Handwriting has changed over time, even for the 19th century and early 20th century documents compared with documents from more recent periods. To carry out this task, and due to the large amount of information starting to become available, a large number of people are needed not only to travel to the places where these records are held, but also to image and transcribe them to digital format. In recent years, several local, national and international initiatives and data rescue programmes with different scopes have been launched to recover and preserve this enormous amount of information at risk of being lost all over the world. With an international dimension, the Latin America Climate Assessment and Dataset (LACA&D), International Data Rescue (I-DARE; https://www.idare-portal.org/) and the Atmospheric Circulation Reconstructions over the Earth (ACRE; http://www.met-acre.org/) can be mentioned. The ACRE initiative is one of the most ambitious initiatives to date dedicated to rescuing early weather records to extend back in time reanalyses to include the early 19th century observations or at least since 1850. 
ACRE works closely with the international surface weather and climate observations community (International Surface Pressure Databank, the international RECLAIM, the International Environmental Data Rescue Organization, NOAA's NCDC Climate Database Modernization Programme), together with academics and archives around the world, to expand the recovery, imaging, and digitization of historical instrumental weather observations (Allan et al., 2011; Slivinski et al., 2021). Under the ACRE umbrella, a huge amount of data has been digitized, in addition to several personal initiatives. A few of the DR efforts under way are: Alcoforado et al. (2012), who digitized Portugal's records; Brugnara et al. (2020), who digitized meteorological observations from 40 locations in Switzerland (1708–1873); Camuffo and Bertolin (2012) and Camuffo et al. (2017) for Italy and the Western Mediterranean; Domínguez-Castro et al. (2015) for Spain; Domínguez-Castro et al. (2017), who retrieved more than 300,000 meteorological data summarized in 137 series from Latin America and the Caribbean during the 18th and 19th centuries; Ashcroft et al. (2018) for the Mediterranean North Africa and the Middle East; and Hawkins et al. (2019) for Scotland. There is still a huge amount of weather records worldwide left to rescue and process. However, ACRE efforts in the Southern Hemisphere are few; among them we can mention ACRE Australia, ACRE Chile, ACRE Argentina (ACRE AR), ACRE South Africa and ACRE Antarctica, restricted west of the Drake Passage (Allan et al., 2011). ACRE AR is managed through the Universidad Tecnológica Nacional, Facultad Regional Buenos Aires, Unidad de Investigación y Desarrollo de las Ingenierías (UIDI) and is dedicated to rescuing, processing and digitizing climate data from all over the country as well as from Argentina's Antarctic stations and naval records.
Prior to the Covid-19 pandemic, the ACRE AR team had all 17 volumes of the Anales OMA ready in pdf format, spanning monthly, daily and sub-daily records for observations made in Argentina between 1801 and 1910, together with all the volumes, also in pdf format, of the ‘Abeja Argentina’ published between 1821 and 1823. Furthermore, the logbook collections corresponding to the Argentine Navy ships A.R.A. 1ro de Mayo (an oceanographic and logistic support steamer which operated between 1894 and 1941) and the corvette A.R.A. Uruguay (which operated from 1877 and is now a floating museum in Puerto Madero, Ciudad Autónoma de Buenos Aires) were fully imaged.
In order to start the digitization stage of the ACRE AR DR effort, the printed records found in Anales OMA vol. II and III corresponding to Corrientes and Bahía Blanca were pre-processed by means of open source software specifically developed for this purpose. This software was developed by the ACRE AR team for pre-processing images and post-processing digitized outputs. It works together with the Zooniverse platform and ACRE AR's Meteororum ad Extremum Terrae (MET) Zooniverse project (https://www.zooniverse.org/projects/acre-ar/meteororum-ad-extremum-terrae). The aim of this paper is to introduce the work under way through ACRE Argentina and to present the preliminary temperature and pressure analyses of the rescued time series from Bahía Blanca and Corrientes. Nineteenth century observations are compared with more recent ones at these two locations. A first approach to the correlation with major El Niño Southern Oscillation (ENSO) events during the same decades of the 19th century is also presented here.
2 ACRE AR DR, DATABASE AND CITIZEN SCIENCE CONTRIBUTION WITH RETINA IN ZOONIVERSE
Once the basic equipment such as cameras, portable scanners, tripods and lighting was acquired, following ACRE recommendations, ACRE AR started the imaging and subsequent digitization processes. Depending on the archival status of the data sources, this process may involve different steps. The process can be summarized in five main steps: (a) search of documentary sources, both in public and private hands, potentially containing historic weather data; (b) a first visual inspection of the original records to detect potential sources of error due to poorly typed or illegible letters or numbers, or ink smudges on the original records, commonly found on paper support; (c) imaging of the documentation by digital scanning or digital photography; (d) digitization of the meteorological data content, as well as relevant metadata; and (e) the essential quality control of the digitized records and production of a data inventory, which ends the cycle for a given document. Regarding step (b), as Capozzi et al. (2020) pointed out, ‘visual inspection is a key part of the quality control, highlighting some impairments in data quality that would otherwise be very difficult to flag through automatic statistical methods’. Climatological analysis of the digitized records provides an additional quality control (WMO, 2016). Temperature, expressed in degrees Celsius, and pressure, in mm of Hg, were chosen to be analysed as ECVs (Essential Climate Variables, Thorne et al., 2017), while the remaining wind, precipitation and other relevant observations will be analysed during the next stage. The following paragraphs introduce the data and the corresponding analyses resulting from such processes for two sets of 19th century weather records contained in the Anales OMA, specifically Corrientes and Bahía Blanca. Details of the procedures and software developed by ACRE AR are also introduced.
In 1872 the OMA coordinated the first national meteorological network, with more than 15 stations distributed all around the country, and by 1895 the number of meteorological stations was more than 30 (Table 1). The ACRE initiative in Argentina started working with data from Corrientes (station identification Meteosat: 87166, WMO: SARC, International Airport of Corrientes), in Corrientes province, fairly close to the tropical boundary of the southern subtropics (Tropic of Capricorn), and Bahía Blanca (station identification Meteosat: 87750, WMO: SAB, Bahía Blanca Aero), located in the southern region of Buenos Aires Province, on the extratropical boundary of the southern subtropics. These two stations, named after the cities where they are located, belong to the original regular network of the national meteorological monitoring system set up by the OMA. Recovered daily data from the OMA, Volume II, whose front page is shown in Figure 1, span April 1873 through January 1885 for Corrientes and January 1860 through September 1879 for Bahía Blanca, both on a daily basis. According to the station metadata available in the Anales OMA, observations in Bahía Blanca were started by Mr. Felipe Caronti on 27 January 1859, with two aneroid barometers. The same recalibrated instruments allowed the observations to continue until 19 March 1873, when they were replaced by a mercury barometer built by Troughton and Simms. On 1 August 1875, the Negretti and Zambra barometer no. 1013, property of the OMA, arrived at the Meteorological Office after calibration at Kew Gardens, London, UK, and was then shipped to Bahía Blanca. The consequences of such instrument changes are discussed together with the data.
| Main weather stations | Year | Lat. | Lon. |
|---|---|---|---|
| Santiago del Estero* | 1873 | 30.02°S | 64.18°W |
| San Antonio de Areco* | 1879 | 34.25°S | 59.47°W |
| Paramillo de Uspallata | 1886 | 32.47°S | 69.13°W |
| San Juan de Salvamento, Is. de los Estados | 1886 | 54.71°S | 63.85°W |
| San Antonio Oeste* | 1888 | 40.44°S | 64.57°W |
| Paso de los Libres | 1900 | 29.43°S | 57.06°W |
| Base Orcadas* | 1903 | 60.73°S | 44.77°W |
Mr. Eduardo Fitz-Simon, in cooperation with Mr. Santiago Fitz-Simon, Rector of the Colegio Nacional de Corrientes (Figure 4), carried out the earliest observations in Corrientes. The instruments used for the observations were also made by the firm Negretti and Zambra (thermometers no. 15857 and 15838, barometer no. 994), built in London, and were also calibrated at Kew Gardens. Figure 2 shows an example of a page from the OMA corresponding to Bahía Blanca in 1860, with the mean daily values of temperature and pressure, although it is also possible to find the average values every 10 days or ‘decades’, given for different times, i.e., 7 a.m., 2 p.m. and 9 p.m. The whereabouts of the documents containing sub-daily observations at specific times are currently being investigated by ACRE Argentina. In the earlier volumes, a standard printed page of these books is usually divided into two sections, each half, right and left, corresponding to 1 month. The first column, common to both months, shows the day of the records, while the rest of the available information refers to the state of the sky, pressure (mm) and temperature (°C) values, relative humidity, wind direction, and one last column with information about rain, storms or other significant weather events. In most cases (see Figure 2) the averages were also calculated every 10 days, and the monthly average is also provided.
There is also station metadata with the observation methodology used at the time, as well as origin and type of instruments used. Preliminary analysis of the data and suggested corrections to be applied to the recorded values due to the change of instrumentation used, can also be found. Figure 3 shows the corrections for the pressure values of Bahía Blanca, as a result of the instrument changes.
All the observations found for Corrientes and Bahía Blanca are still on paper. The quality of the records can be considered acceptable, with the majority of the printed numbers and letters visible and easy to transcribe, as can be observed in the examples shown in the figures. There are a few cases with blurred and/or distorted numbers and letters that can be potential sources of errors during digitization. These values were flagged for analysis after the digitization process. After imaging the observation contents, the pictures were stored in jpg format and transformed into pdf to be digitized. In order to digitize the information included in this work, like all the rest that is in progress for other locations in Argentina, the Meteororum ad Extremum Terrae (MET) project (https://www.zooniverse.org/projects/acre-ar/meteororum-ad-extremum-terrae) was launched on the Zooniverse platform in December 2021, with very good media coverage, including national press, radio and TV news outlets (https://www.agenciacyta.org.ar/2021/11/convocan-a-ciudadanos-para-la-reconstruccion-historica-del-clima-de-argentina-a-partir-de-1850/; https://www.infobae.com/tag/proyecto-acre/). This national and international coverage allowed us to obtain more than 37,000 digitizations on the first day. This platform (zooniverse.org), also used by other meteorological data rescue projects (Weather Rescue, Old Weather, Southern Weather Discovery, and Climate History Australia, to name a few), gave us the opportunity to create a customized bilingual website (English and Spanish) to encourage volunteers to participate as citizen scientists and digitize the data from the scanned images. A “beta” version of the website was first tested and reviewed by a small group of experienced Zooniverse volunteers who provided valuable feedback and suggestions to improve the process (see, for example, Craig et al., 2020 and Slonosky et al., 2019).
However, due to the huge amount of information (each of the Anales de la OMA, for example, has on average 550 pages, of which approximately two thirds correspond to weather data tables), and in order to simplify the process for the volunteers, an open source software package (RETINA), based on artificial intelligence, was specially developed (Figure 4).
The task of rescuing data from paper media with hand-written or typed text using RETINA can be viewed from the perspective of pre-, main, and post-processing. Pre-processing involves identifying text that has value. Images can be viewed as pages that inherently contain data. These images can be sorted so that the order of pages corresponds to how they appear in a physical document. In this way, all pages keep their relationship in time, following the order in which they were written. There are areas of an image that have no value, such as margins and lines. The pre-processing stage allows the operator to choose which areas of the image have text with value, e.g., temperature or pressure observations, and focuses on overlaying RETINA rectangles and grids on them, which will provide and maintain a reference to the digitized text in the future. Note that similar page layouts can be handled with the grid made by the operator on the first such page, reducing the time required for image pre-processing. Main processing involves asking a volunteer to transcribe text from a sub-image corresponding to a segment or ‘pixel’ of the grid overlaid on the image during pre-processing. Most frequently this ‘pixel’ contains only the smallest piece of complete information to be digitized, such as an observation, column/line identifiers, or short phrases describing a particular observation or the conditions at the time of observation. The processing of a sub-image can be completed in seconds, and the same sub-image is shown randomly and systematically to different volunteers accessing MET a predetermined number of times, and the resulting input text is collected.
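The grid overlay used in pre-processing can be illustrated with a minimal sketch. It is not taken from RETINA: the `Cell` type, the `grid_cells` function and the pixel coordinates are hypothetical names and values chosen for illustration, assuming that an operator-drawn rectangle is split into equally sized rows and columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One grid 'pixel': its table position and its pixel box in the page image."""
    row: int
    col: int
    left: int
    top: int
    right: int
    bottom: int


def grid_cells(left, top, width, height, rows, cols):
    """Split a rectangle drawn on a page image into rows x cols sub-image boxes.

    Hypothetical helper: integer division keeps the boxes contiguous, so
    adjacent cells share an edge and no pixel column or row is skipped.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    cells = []
    for r in range(rows):
        y0 = top + (height * r) // rows
        y1 = top + (height * (r + 1)) // rows
        for c in range(cols):
            x0 = left + (width * c) // cols
            x1 = left + (width * (c + 1)) // cols
            cells.append(Cell(r, c, x0, y0, x1, y1))
    return cells


# Example: a 300 x 200 pixel table region with 4 rows and 3 columns.
cells = grid_cells(left=100, top=50, width=300, height=200, rows=4, cols=3)
```

Because each cell keeps its row and column, the same grid definition can be reused on pages that share a layout, and the transcribed values can later be placed back into a table.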
Making collections of identical, similar, or different text values for the same sub-image is the purpose of main processing, and it is essential to quality control. Sub-images are subsets of a much larger image, and an IDS is assigned to each sub-image to track it inside and outside of the system. Sub-images are uploaded to Zooniverse with their IDS, the digitized text is collected, and the export format can be imported by a solution system. Post-processing uses the text or digitization collected for each sub-image together with the IDS stored in the Zooniverse image files. The collected digitization must be validated as correct. This can be done by collecting multiple transcriptions of an image, as mentioned above, and organizing them into sets. The set with the most frequent value can be regarded as correct. At the moment MET is set up to get at least eight transcriptions for each sub-image, and the threshold to consider the transcription correct is set at five identical transcriptions. If there is complete disagreement between the volunteers, the uncertain value is automatically flagged for manual checking. Once the data are validated, the remaining task is to organize them in such a way that tables and graphs can be produced. There are a number of manual ways to do this, although the goal is always to automate as much as possible. RETINA uses a custom algorithm to rebuild the structure of a table from only the rectangles and grids drawn, providing a spreadsheet with all the data transcribed. The output is thus a digitized table or chart of the original image, commonly in Excel format, which is finally compared visually with the original page contents. Thus, RETINA provides a semi-automated approach to image digitization which significantly simplifies and speeds up digitization and quality control.
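The validation rule described above (at least eight transcriptions per sub-image, five identical ones to accept) can be sketched as follows. This is an illustrative sketch, not the RETINA or Zooniverse code: the function name `consensus` and the status labels are hypothetical, and it assumes that any sub-image whose most frequent value falls short of the threshold is flagged for manual checking.

```python
from collections import Counter


def consensus(transcriptions, min_count=8, threshold=5):
    """Return (value, status) for the transcriptions of one sub-image.

    Assumed statuses (hypothetical labels):
      'pending'  - fewer than min_count transcriptions collected so far
      'accepted' - the most frequent value reached the threshold
      'flagged'  - not enough agreement, send to manual checking
    """
    # Ignore empty answers and surrounding whitespace typed by volunteers.
    cleaned = [t.strip() for t in transcriptions if t and t.strip()]
    if len(cleaned) < min_count:
        return None, "pending"
    value, votes = Counter(cleaned).most_common(1)[0]
    if votes >= threshold:
        return value, "accepted"
    return None, "flagged"


# Six of eight volunteers agree once whitespace is stripped.
agreed = consensus(["23.4"] * 5 + ["2.34", "23.1", "23.4 "])
# Eight different answers: complete disagreement.
disputed = consensus(["1", "2", "3", "4", "5", "6", "7", "8"])
```

Comparing the stripped strings rather than parsed numbers keeps the check applicable to text fields such as cloud types or weather remarks.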
The documentation of the software is available at https://github.com/meritoki/retina-desktop-application/releases (users are required to acknowledge the source of the software). Figure 5 shows a couple of examples of those rectangles displaying sub-images with cloud descriptions and temperature values on the Zooniverse page. In this way, each volunteer only has to type the single piece of data or information shown on the MET project screen. This simplifies the volunteer's task and reduces the error probability compared with digitizing multiple pieces of information from a single image. Results so far show a general consensus among volunteers that this approach works best. At the end of the process, in this case for Bahía Blanca and Corrientes, temperature and pressure values in Excel form are obtained. Two types of errors are usually found in digitized spreadsheets. The first is associated with blurred letters and numbers or outlier values in the original document, and can result in outlier digitized values or more than one transcribed value for a single sub-image (bearing in mind that each sub-image is digitized in MET by eight volunteers). The second type is linked to the actual transcription of the sub-images, where the volunteer may occasionally change the position of the decimal point during typing, or even invert the number/letter order, and instead of 23.4°C, for example, transcribes 2.34°C, i.e. a human error. This type of error tends to occur more in pressure than in temperature values, probably because there are more digits in each sub-image to be transcribed. In all cases, the first step in the quality assessment is to look at the rescued data and check for very large values using appropriate tools such as Excel filters.
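As an illustration of how a misplaced decimal point could be screened for, the sketch below rescales an out-of-range value by powers of ten and reports the rescalings that fall inside a plausible range. It is a hypothetical helper, not part of the workflow described in the source, and the range limits used in the example (700 to 800 mm of Hg) are assumptions chosen for illustration only.

```python
def decimal_shift_candidates(value, low, high, max_shift=3):
    """Suggest decimal-point corrections for a value outside [low, high].

    Returns the rescalings value * 10**k (k = -max_shift..max_shift, k != 0)
    that fall inside the plausible range. An empty list means either that
    the value is already plausible or that no simple shift explains it.
    """
    if low <= value <= high:
        return []
    candidates = []
    for k in range(-max_shift, max_shift + 1):
        if k == 0:
            continue
        # Rounding removes floating-point noise such as 760.3000000000001.
        shifted = round(value * 10.0 ** k, 6)
        if low <= shifted <= high:
            candidates.append(shifted)
    return candidates


# A pressure typed as 7603 instead of 760.3 mm of Hg (assumed plausible range).
pressure_fix = decimal_shift_candidates(7603.0, low=700.0, high=800.0)
```

A shifted value that still lies inside the plausible range cannot be detected this way, so any suggestion would still need to be checked against the original page.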
Since each sub-image is transcribed by eight volunteers, changes in the position of the decimal point, number inversions or very large values remaining at the end of quality control are more frequently linked to anomalous original data than to transcription mistakes. If the additional information in the observed data does not describe a particular weather context that justifies these values, then, following the same criteria adopted by Domínguez-Castro et al. (2017), the outlier values are detected by calculating the mean plus/minus three standard deviations of each variable. For the present analysis more than 16,000 daily data were checked and only 30 pressure values (an error rate close to 0.19%) were found to be incorrect, mainly due to the position of the decimal point. These values were corrected. After correcting for the mistyped or missing transcribed data, the digitized time series of daily and monthly mean values (the latter calculated from the observed daily data) were subjected to homogeneity testing by applying the Standard Normal Homogeneity Test (SNHT; Alexandersson & Moberg, 1997). As Brázdil et al. (2010) pointed out, all rescued observational data require quality control in order to reflect the real climate variations rather than the influence of non-climatic factors, and this is of particular relevance for historical observations that were not taken using modern standards and techniques. Figure 6 shows the daily time series of temperature values rescued for Corrientes and Bahía Blanca.
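The mean plus/minus three standard deviations criterion can be sketched as below. The function name `three_sigma_outliers` and the synthetic pressure values are hypothetical; the sketch assumes the sample standard deviation and a series without missing values.

```python
from statistics import mean, stdev


def three_sigma_outliers(values, n_sigma=3.0):
    """Return (index, value) pairs lying outside mean +/- n_sigma * std.

    Assumes a list of floats with no missing values; uses the sample
    standard deviation (n - 1 in the denominator).
    """
    mu = mean(values)
    sd = stdev(values)
    low, high = mu - n_sigma * sd, mu + n_sigma * sd
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Synthetic daily pressures near 760 mm of Hg plus one value with a
# misplaced decimal point (76.0 instead of 760.0).
pressures = [760.0 + 0.1 * (i % 10) for i in range(100)] + [76.0]
outliers = three_sigma_outliers(pressures)
```

The mean and standard deviation are themselves inflated by the outliers, so with short series a gross error may not exceed three standard deviations; the criterion is better suited to long daily series such as those analysed here.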
Results show that between 7 and 16 November 1874, a break point with a shift of +1.7°C can be detected in the daily temperature values for Corrientes, while the only discontinuity for Bahía Blanca is found between 3 and 4 April 1871, with a shift of around −0.7°C. Although the discontinuity observed in Corrientes may be related to the lack of data for almost a year and a half, between December 1873 and May 1874 (Figure 6a), the break points at both Corrientes and Bahía Blanca are discussed in the following paragraphs. Regarding monthly mean temperature values, the SNHT showed that the time series for both Corrientes and Bahía Blanca are homogeneous and, therefore, the historical observations are capable of providing information on relative climate variability.
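For readers unfamiliar with the SNHT, the single-shift form of the statistic can be sketched as follows. This is an illustrative sketch under stated assumptions: it applies the test directly to one complete series, whereas the text does not specify whether a reference series was used, and the function name `snht_single_shift` and the synthetic temperature series are hypothetical. The series is standardized and, for every candidate break position k, the statistic T(k) = k * mean(z[:k])**2 + (n - k) * mean(z[k:])**2 is computed; the position with the largest T(k) is the most probable break.

```python
from statistics import mean, stdev


def snht_single_shift(series):
    """Return (k, t_max): the most probable break position and its statistic.

    k is the number of values before the break, so the shift occurs between
    series[k - 1] and series[k]. Assumes no missing values.
    """
    n = len(series)
    centre, spread = mean(series), stdev(series)
    if spread == 0:
        raise ValueError("constant series: the statistic is undefined")
    z = [(x - centre) / spread for x in series]

    total = sum(z)
    running = 0.0
    best_k, best_t = None, float("-inf")
    for k in range(1, n):
        running += z[k - 1]
        mean_before = running / k
        mean_after = (total - running) / (n - k)
        t = k * mean_before ** 2 + (n - k) * mean_after ** 2
        if t > best_t:
            best_k, best_t = k, t
    return best_k, best_t


# Synthetic daily temperatures with an artificial +1.7 degree shift after 40 values.
noise = [-0.2, 0.0, 0.2, -0.1, 0.1]
before = [20.0 + noise[i % 5] for i in range(40)]
after = [21.7 + noise[i % 5] for i in range(40)]
print(snht_single_shift(before + after))
```

The maximum statistic is then compared with a critical value that depends on the series length and the chosen significance level, taken from published tables; no critical values are hard-coded in this sketch.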
To gain additional insight into the behaviour of the dataset, Figure 7 shows the monthly mean temperature values for Corrientes and Bahía Blanca, together with monthly mean temperature values from KNMI Explorer for both stations (Corrientes for 1930–2017, Bahía Blanca for 1860–2019) and monthly mean values from the Instituto Correntino del Agua (ICA; 2012–2017) for Corrientes. For Bahía Blanca, the maximum monthly values of the two series are 26.4°C for the observed data and 24.7°C for KNMI, both in February 1874, while the minimum values are 4.6°C for KNMI and 6.3°C for the observed data, both in July 1873. It can also be noted that, in general, the values of the rescued data are slightly higher than those of KNMI, with a difference of 2°C between them, which decreases to 1.2°C if the mean values of the two datasets are considered. At least three points in the rescued data were detected as outliers, with values of 0.1, 0.9 and −0.2°C.
Unfortunately, no dataset for Corrientes is currently available that overlaps, even partially, with the rescued data so as to allow past and present behaviours to be compared. Thus, the gap in data during the years with no information may lead to inaccurate conclusions. Differences between the maximum and minimum values of both series could be due not only to the difference between the two databases, but also to a change in the behaviour of the variable. However, as can be seen in the plot, the maximum value for the rescued data is close to 33.9°C in February 1883, considerably higher than the 29.3°C of the KNMI dataset, while the minimum value of the observed data is close to 7°C in June 1884, lower than the minimum of 11.1°C found in the KNMI records. Something similar happens with the recent ICA data, whose maximum and minimum values are closer to the KNMI dataset than to those observed during the 19th century. Figure 8 compares the monthly mean temperature cycle calculated from the daily observed data with the mean monthly values from KNMI Explorer, for both Corrientes and Bahía Blanca. In both cases, a very clear correspondence between the two series is noted, meaning that the digitized data are in the expected temperature range, although at both stations the rescued data series seems to overestimate temperature values. Figure 9 shows the daily pressure time series at both locations. It should be noted that the first two plots of the figure (Figure 9a,b) correspond to the uncorrected raw pressure values, while the last two (Figure 9c,d) show the adjusted values after the corrections proposed in the Anales (Figure 3) were applied. As can be seen in the plots, there are some erroneous or outlier values in the raw data; see, for example, the records at Corrientes on June 6, 1873, December 24, 1880 and January 30, 1883.
After applying the corrections, the gap at Bahía Blanca on January 21, 1873, although smaller, still remains, and outliers can still be observed at Corrientes. As in the case of the temperature time series, both daily pressure series (corrected and uncorrected) were also subjected to the SNHT. At both stations, the results show that discontinuities can be detected. At the beginning of the observations, between January 30, 1875 and February 2, 1875, a shift of around −2.6 can be observed in Corrientes, while the discontinuity in Bahía Blanca was found between April 20 and 22, 1873, with a shift of −2. Given the overestimation of the rescued monthly temperature series compared to KNMI data, as well as the discontinuities in the daily temperature and pressure values and the outliers, the original data and the metadata were rechecked. According to the Anales, the aneroids were replaced in 1873 by new barometers built by Troughton and Simms, in the same year in which there is a gap in Corrientes. In 1875, the pressure measuring instruments were replaced again. It was precisely because of these changes that corrections to the recorded pressure values were proposed (Figure 3). Reference is also made to the calibration of the instruments, which may have been affected by the time they arrived at the stations after travelling long distances from Buenos Aires, where they were usually received. There is no information about their exposure (indoors or in sunlight), their relocation, or the turnover of the personnel responsible for the meteorological observations, although it is mentioned that observations were not always made regularly at the same times. Some analyses related to early instrumental warming in rescued series can be found in Frank et al. (2007) and Böhm et al. (2010), and could potentially be linked to the overestimated temperature values found at both stations; however, they are focused on the Northern Hemisphere.
Instrument calibration, as well as the regularity and methodology of measurement practices, may be the cause of the overestimation of temperature. However, the outliers in the Corrientes pressure series do not seem to correspond to any of these causes, as they are isolated data. A plausible hypothesis for these outliers is simple human error in the records. Finally, the monthly pressure series were also tested and turned out to be homogeneous.
The above discussion highlights the importance of having or locating good quality metadata regarding instruments and operations at a given location. A careful assessment of the quality and reliability of the data must be performed on the basis of historical metadata. It is important to understand potential problems or causes of behavioural changes in the observations and to correct them, if possible, for future use. It is also remarkable that such a complicated correction, with multiple breaks in the observations requiring complex calculations, was successfully carried out long before the advent of modern computing, which gives credit to the professionalism of the operators and scientists working at or for OMA.
3 POTENTIAL RESCUED DATA APPLICATIONS: AN EXAMPLE
As mentioned before, in situ data are essential for the analysis of past and current climate change. Rescued data are valuable not only for the improvement and/or validation of 20th century reanalyses: relevant climate studies can also be carried out, exploring historic weather events or analysing whether recurrent events such as El Niño yield similar climate responses and impacts in different regions.
There is considerable evidence that during 1876–1878 a major El Niño episode resulted in several harmful impacts around the world; however, the strength and statistical significance of this El Niño event have not been fully addressed, largely due to the lack of data (Huang et al., 2020). El Niño Southern Oscillation (ENSO) is one of the dominant modes of Earth's climate system and plays an important role in the analysis of seasonal-to-interannual temperature and precipitation behaviour. ENSO's effect is linked with sea-surface temperature values in the equatorial Pacific that are either warmer (El Niño) or colder (La Niña) than average (McPhaden et al., 2006). The influence of ENSO extends far beyond the coastal regions of Ecuador and Perú, and has a substantial impact in South America (SA; Cai et al., 2020). In fact, during the 1876–1878 episode the precipitation pattern was strongly affected, with rainfall deficit and drought in northeastern Brazil and in the central Andean highlands, and abundant rainfall and floods along the coastal areas of southern Ecuador and northern Perú, central Chile and the Paraná basin in the southeastern part of the continent (Kiladis & Diaz, 1986; Ortlieb, 2000). According to Aceituno et al. (2009), floods also affected areas in southeastern South America (SESA: southern Brazil, Uruguay, Paraguay and northeastern Argentina) during the first months of 1878, and meteorological stations such as Corrientes, Goya, Córdoba and Buenos Aires in Argentina registered rainfall around twice the climatological mean. In this context, it is important to note that Corrientes is located within the southern region of the La Plata Basin that has registered the largest El Niño impacts during the last 40 to 50 years (Boulanger et al., 2005), after the strengthening of the ENSO signal in the 1960s, as shown in Barrucand et al. (2018).
On the other hand, current ENSO events do not have such a strong signature in the southern portion of the Pampas region, where Bahía Blanca is located. It is interesting to note that Barrucand et al. (2018) show similarly strong ENSO signals from at least the 1880s up to 1920. After that, ENSO was very weak for more than 40 years. Hence, a response to ENSO variability similar to that of the end of the 20th century and early 21st century could be expected during the study period.
To address the effect of ENSO described above, the possible links between the Southern Oscillation Index (SOI) and the pressure time series reconstructed from the rescued data for Corrientes and Bahía Blanca were tested. Monthly pressure averages at both places (constructed from the daily rescued data) were used, and Spearman's correlation coefficient (ρ) was calculated and tested with the pressure series lagged forward by up to 12 months in the case of the annual series and by up to 3 months in the case of the seasonal series. The SOI monthly values were retrieved from https://crudata.uea.ac.uk/cru/data//soi (Ropelewski & Jones, 1987).
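The lagged correlation analysis can be sketched as below. This is a minimal pure-Python illustration and not the code used for the results reported here: the function names, the toy series and the reading of the forward lag as pairing the SOI in month t with the pressure in month t + lag are assumptions. Both series are assumed to be complete and of equal length, and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(soi, pressure, max_lag):
    """Return {lag: rho} pairing soi[t] with pressure[t + lag], lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = soi[:len(soi) - lag]
        y = pressure[lag:]
        result[lag] = spearman(x, y)
    return result


# Toy monthly series (hypothetical values, not the rescued data).
soi = [0.5, -1.2, -2.0, -1.5, 0.3, 1.1, 0.8, -0.4, -1.8, -2.5, -0.9, 0.6]
pressure = [759.8, 760.4, 759.1, 758.2, 758.9, 760.1, 760.9, 760.5, 759.7, 758.4, 757.9, 759.3]
for lag, rho in lagged_spearman(soi, pressure, 3).items():
    print(lag, round(rho, 2))
```

Each additional lag shortens the overlapping period by one month, which matters for short records such as these; the significance of each coefficient would have to be assessed separately.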
Figure 10 shows the pressure time series for Corrientes and Bahía Blanca, together with that of the SOI. A quick inspection of the figure shows that between 1877 and 1878 the pressure values are lower than in the rest of the period, suggesting a potential linkage with the strong El Niño event during those years. In order to evaluate this possible linkage, Spearman correlations between the SOI and the monthly pressure averages for Bahía Blanca and Corrientes were computed (Table 2). For Bahía Blanca, the correlations show that the 7-month lag has a slight negative yet significant correlation with the SOI. On the other hand, the pressure time series for Corrientes is significantly correlated with the SOI for lag 0 in autumn and for lag 2 in winter (Table 2). A positive value denotes a direct relationship: a negative phase of the SOI (El Niño conditions) seems to affect the pressure at Corrientes by decreasing it. During autumn, this is in agreement with a positive rainfall anomaly in the region (Cai et al., 2020).
Such an analysis acts as a further validation of the rescued data. In principle, as previously argued, one would expect that, given a similar forcing such as ENSO, the records would exhibit a response similar to that observed during more recent events. Admittedly, as discussed in Barrucand et al. (2018), ENSO has shown considerable variability in strength and frequency throughout the 20th century, and a varied response in the climate system. However, strong ENSO events in all cases elicited similar, albeit not identical, strong systemic responses in variables such as pressure, temperature and rainfall. The ENSO event considered here, as given by the SOI, has been documented as a particularly strong event, and the current analysis does show a pressure response in good agreement with far more recent strong ENSO events.
Thanks to the digitization work of thousands of volunteers around the world, a huge trove of weather observations relevant to climate studies is being recovered. These data are important to deepen the knowledge of the past, present and future climate, but they also represent a great contribution to reconstructing the history of each place and region, providing data, methodology and analysis, as well as information on the evolution of meteorological observations and the history of atmospheric sciences, which otherwise could be lost. ACRE AR has been set up to fill a significant gap in available historic observations for the Southern Hemisphere, in particular Southern South America, adjacent oceans, especially the South Atlantic and Southern Ocean, and Antarctica, i.e. from the edge of the Tropic of Capricorn all the way to at least subpolar and polar latitudes since the beginning of the 20th century. This represents an extensive latitude range on a comparatively narrow longitudinal swath, which is highly relevant for comparison with similar ACRE projects in Chile, South Africa, Australia and New Zealand. An example of the ongoing activities is the set of pressure and temperature time series shown here, corresponding to 19th century observations made at Bahía Blanca and Corrientes, recovered for science by means of data digitized through the MET project on the Zooniverse platform. Thanks to the contribution of the volunteers, more than 16,000 daily data from both stations were digitized and checked, and only around 0.19% were found to be incorrect, mainly due to the position of the decimal point. The digitized time series, daily and monthly mean values, were subjected to homogeneity testing, and the results show that at both stations discontinuities can be detected in the daily mean values, while the monthly time series are homogeneous. Detailed metadata were used in conjunction with the data to understand several inhomogeneities.
ENSO impacts in SESA have been the focus of much research given their socio-economic consequences. Preliminary results begin to highlight the occurrence of such strong events during the years of significant development and consolidation in NEA. This also reflects the relevance of the rescued data for the analysis of past extreme events.
The development of the RETINA software package for pre- and post-processing during image digitization represents a very important contribution to DR activities. These are essentially tedious, time-consuming tasks. RETINA significantly reduces the processing time and tediousness and is a helpful tool for digitization quality control, which is essential to ensure good records for the international meteorological databases and for the studies and reanalysis products to be produced with them.
There are still many more sources of historical climate data for Argentina, adjacent oceans and Antarctica to be discovered, and we are working on this through ACRE Argentina. Our next steps in this task will focus on the digitization and analysis of the remaining volumes of Anales OMA, with monthly, daily and sub-daily records of temperature and pressure values, and later we will focus on the logbook collections corresponding to the Argentine Navy ships A.R.A 1ro de Mayo.
Susan Gabriela LAKKIS: Conceptualization (equal); data curation (equal); formal analysis (equal); investigation (equal); methodology (equal); visualization (equal); writing – original draft (equal). Pablo Osvaldo Canziani: Conceptualization (equal); formal analysis (equal); funding acquisition (lead); investigation (equal); resources (equal); supervision (equal); writing – review and editing (equal). Joaquín Rodriguez: Data curation (equal); project administration (equal); software (equal). Adrián Enrique Yuchechen: Formal analysis (equal); methodology (equal); writing – original draft (equal).
This project would not have been possible without the great contribution of all the volunteers who are working in MET and giving us their time and dedication to bring the history of our climate to the present. We would also like to thank Zooniverse.org, which allowed us to create our project. We are also indebted to the Archivo de la Armada Argentina, which provided the original logbooks, and to the good disposition of ‘its guardians’ at all times to help us in this task. We also wish to thank Rob Allan, Clive Wilkinson, Kevin Wood, sadly recently deceased, and M. McBenoy for their support and advice during the implementation of ACRE Argentina, as well as Capitán de Corbeta Lic. Alvaro Scardilli and Dra. Sandra Barreira from Servicio Meteorológico de la Armada, Armada de la República Argentina. Our gratitude must also go to Clarisa Rodriguez, Carola Toro and Alejandro Intriago, who have worked to make the digitization process possible. The Universidad Tecnológica Nacional FRBA must also be mentioned for its institutional support and funding through PID ACRE 2MNSINNBA0006543.
CONFLICTS OF INTEREST
The authors declare no conflict of interest.
OPEN RESEARCH BADGES
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.pangaea.de/10.1594/PANGAEA.946733. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.
- 2009) The 1877–1878 El Niño episode: associated impacts in South America. Climatic Change, 92, 389–416. https://doi.org/10.1007/s10584-008-9470-5
- 2012) Early Portuguese meteorological measurements (18th century). Climate of the Past, 8, 353–371. https://doi.org/10.5194/cp-8-353-2012
- 1997) Homogenization of Swedish temperature data. Part I: homogeneity test for linear trends. International Journal of Climatology, 17, 25–34.
- 2011) The international atmospheric circulation reconstructions over the earth (ACRE) initiative. Bulletin of the American Meteorological Society, 92, 1421–1425. https://doi.org/10.1175/2011BAMS3218.1
- Anales de la Oficina Meteorológica Nacional. (1881). Tomo II, Climas de Bahía Blanca y Corrientes. Benjamin A Gould, Imprenta de Pablo E. Coni, Especial para Obras.
- 2018) A rescued dataset of sub-daily meteorological observations for Europe and the southern Mediterranean region 1877–2012. Earth System Science Data, 10, 1613–1635. https://doi.org/10.5194/essd-10-1613-2018
- 2018) Historical SAM index time series: linear and nonlinear analysis. International Journal of Climatology, 38, e1091–e1106. https://doi.org/10.1002/joc.5435
- 2010) The early instrumental warm-bias: a solution for long central European temperature series 1760–2007. Climatic Change, 101(1), 41–67.
- 2014) The concept of essential climate variables in support of climate research, applications, and policy. Bulletin of the American Meteorological Society, 95(9), 1431–1443.
- 2005) Observed precipitation in the Paraná-Plata hydrological basin: long-term trends, extreme conditions and ENSO teleconnections. Climate Dynamics, 24, 393–413. https://doi.org/10.1007/s00382-004-0514-x
- 2010) European climate of the past 500 years: new challenges for historical climatology. Climatic Change, 101, 7–40. https://doi.org/10.1007/s10584-009-9783-z
- 2020) Early instrumental meteorological observations in Switzerland: 1708–1873. Earth System Science Data, 12, 1179–1190. https://doi.org/10.5194/essd-12-1179-2020
- 2011) Data rescue initiatives: bringing historical climate data into the 21st century. Climate Research, 47, 29–40.
- 2016) Robert Mossman, endurance and the Weddell Sea ice. Polar Record, 52(1), 92–97. https://doi.org/10.1017/S0032247415000285
- 2020) Climate impacts of the El Niño–Southern oscillation on South America. Nature Reviews Earth & Environment, 1, 215–231. https://doi.org/10.1038/s43017-020-0040-3
- 2012) The earliest temperature observations in the world: the Medici network (1654–1670). Climatic Change, 111, 335–363. https://doi.org/10.1007/s10584-011-0142-5
- 2017) Temperature observations in Bologna, Italy, from 1715 to 1815: a comparison with other contemporary series and an overview of three centuries of changing climate. Climatic Change, 142, 7–22. https://doi.org/10.1007/s10584-017-1931-2
- 2020) Rescue and quality control of sub-daily meteorological data collected at Montevergine observatory (southern Apennines), 1884-1963. Earth System Science Data, 12(2), 1467–1487.
- 2020) Digitizing observations from the Met Office Daily Weather Reports for 1900–1910 using citizen scientist volunteers. Geoscience Data Journal, 7(2), 116–134.
- . (1890) Censo General de la Provincia de Buenos Aires 31 de enero de 1890, Colección: Censo General de la Provincia de Buenos Aires 1890, Datos de edición. La Plata: La Dirección.
- 1905), Anales de la Oficina Meteorológica Argentina, Vol XVI, Talleres de Publicaciones de la Oficina Meteorológica Argentina, p. 214.
- 2015) Iberian extreme precipitation 1855/1856: an analysis from early instrumental observations and documentary sources. International Journal of Climatology, 35(1), 142–153. https://doi.org/10.1002/joc.3973
- 2017) Early meteorological records from Latin-America and the Caribbean during the 18th and 19th centuries. Scientific Data, 14(4), 170169. https://doi.org/10.1038/sdata.2017.169
- 2007) Warmer early instrumental measurements versus colder reconstructed temperatures: shooting at a moving target. Quaternary Science Reviews, 26(25–28), 3298–3310.
- 2019) Hourly weather observations from the Scottish Highlands (1883–1904) rescued by volunteer citizen scientists. Geoscience Data Journal, 6(2), 160–173. https://doi.org/10.1002/gdj3.79
- 2020) How Significant Was the 1877/78 El Niño? Journal of Climate, 33, 4853–4869. https://doi.org/10.1175/JCLI-D-19-0650.1
- IPCC. (2021) Summary for Policymakers. In: V. Masson-Delmotte, P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger et al. (Eds.) Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. United Kingdom and New York: Cambridge University Press, pp. 3–32.
- 1986) An analysis of the 1877–78 ENSO episode and comparison with 1982–83. Monthly Weather Review, 114, 1035–1047.
- La Abeja Argentina. (1822–1823). 15 nros. En Biblioteca de Mayo, Colección de Obras y Documentos para la Historia Argentina, tomo VI: Literatura, Buenos Aires: Senado de la Nación, 1960, 5245–5700.
- Manchester Guardian (1914). Prospects of Sir E. Shackleton's Expedition 5, January.
- 2021) Searching for historical meteorological observations on the Island of Ireland. Weather, 76, 160–165. https://doi.org/10.1002/wea.3887
- 2020) Reconstruction of a long-term historical daily maximum and minimum air temperature network dataset for Ireland (1831-1968). Geoscience Data Journal, 7(2), 102–115.
- 2006) ENSO as an integrating concept in earth science. Science, 314, 1740–1745.
- 1923) The life of Sir Ernest Shackleton C.V.O., O.B.E. (Mil.), LL. D.. London: W. Heinemann.
- 2000) The documented historical record of El Niño events in Perú: an update of the Quinn record (sixteenth through nineteenth centuries). In: H.F. Diaz & V. Markgraf (Eds.) El Niño and the southern oscillation: multiscale variability and global and regional impacts. Cambridge: Cambridge University Press.
- 1987) An extension of the Tahiti-Darwin southern oscillation index. Monthly Weather Review, 115, 2161–2165.
- 1995) In: National Research Council (Ed.) Natural climate variability on decade-to-century time Scales. D.G. Martinson, K. Bryan, M. Ghil, M.M. Hall, T.R. Karl, E.S. Sarachik, S. Sorooshian, and L.D. Talley (Eds.). Washington, DC: National Academy Press, p. 630.
- 2021) An evaluation of the performance of the 20th century reanalysis version 3. Journal of Climate, 34, 1417–1438. https://doi.org/10.1175/JCLI-D-20-0505.1
- 2019) From books to bytes: A new data rescue tool. Geoscience Data Journal, 6(1), 58–73.
- 2007) The Scottish national Antarctic expedition (1902–04) and the founding of base Orcadas. Scottish Geographical Journal, 123(1), 48–67. https://doi.org/10.1080/00369220718737283
- 2017) Toward an integrated set of surface meteorological observations for climate science and applications. Bulletin of the American Meteorological Society, 98, 2689–2702. doi:10.1175/BAMS-D-16-0165.1
- World Meteorological Organization. (2016) Guidelines on best practices for climate data rescue. WMO-No. 1182. Geneva: World Meteorological Organization.
- 2015). Variability at low frequencies with wavelet transform and empirical mode decomposition: application to a climatological timeseries. https://doi.org/10.1109/RPIC.2015.7497098