Improving the estimation of the true mean monthly and true mean annual air temperatures in Greece

Estimates of the true mean monthly and annual air temperature are usually based on the monthly averages of meteorological observations performed at standard hours. The performance of such a combination depends on the local climate of the particular meteorological station, so the estimate may deviate slightly or substantially from the exact value. This study examines the possible deviations in Greece and suggests ways of improving the derived estimates. Two cases are considered: (1) an area has one thermograph and a number of thermometric stations; (2) an area has several thermographs and many thermometric stations. Solutions for minimizing the estimation error in both cases are provided.


Introduction
At meteorological stations equipped with thermographs, continuous air-temperature recordings are available. From such recordings, 24 hourly air-temperature values can be deduced for each day. Their arithmetic mean is defined as the true mean temperature or the true daily mean (Conrad and Pollak, 1950; Weiss and Hays, 2005; Dall'Amico and Hornsteiner, 2006), while the monthly average of the true daily means yields the true monthly mean.
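The two definitions above can be illustrated with a minimal Python sketch. The function names and the two-day sample data are hypothetical and serve only to show the order of averaging (hourly values to daily mean, daily means to monthly mean).

```python
from statistics import fmean

def true_daily_mean(hourly_temps):
    """Arithmetic mean of the 24 hourly values read off a thermograph trace."""
    if len(hourly_temps) != 24:
        raise ValueError("expected 24 hourly values")
    return fmean(hourly_temps)

def true_monthly_mean(days):
    """Average of the true daily means over all days supplied for the month."""
    return fmean(true_daily_mean(day) for day in days)

# Made-up values in degrees Celsius for a hypothetical two-day "month".
day1 = [10.0] * 12 + [14.0] * 12
day2 = [11.0] * 12 + [17.0] * 12
month_mean = true_monthly_mean([day1, day2])
```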
When only a few (usually three) daily air-temperature observations are available, a linear combination of the monthly average temperatures at the standard hours of the observations is used in order to obtain the closest possible approximation to the true monthly mean. These combinations depend on the climate of the station; Conrad and Pollak (1950) have listed the formulas that obtain best results for the greater part of Europe as well as for tropical and subtropical climates.
All the formulas can give an accurate answer, provided that there is no daily temperature variation. This implies that the formulas tend to express the average daily air-temperature variation of the stations that corresponds to the climate; they consequently fail when the daily air-temperature variation at a particular station strongly deviates from its assumed pattern. This failure may yield miscalculations concerning the heating demands of buildings (e.g. Kaufmann et al., 2013) as well as predictions of crop production (e.g. Rosenzweig and Parry, 1994; Olesen and Bindi, 2002).
The formula that the aforementioned authors suggested as very advantageous for the greater part of Europe is:

$$T = \frac{T_7 + T_{14} + 2T_{21}}{4} \qquad (1)$$

where $T$ is the approximate true average monthly air temperature and $T_7$, $T_{14}$, and $T_{21}$ are the monthly mean air temperatures at the 0700, 1400, and 2100 hours of observation, respectively. The hours refer to mean local time, UTC+2.
A modified version of this formula, with $T_8$ replacing $T_7$, was used by Aeginitis (1907) and Mariolopoulos (1938) for the estimation of the true mean monthly temperature in Greece. The validity of the modified formula was tested against the recordings taken by a Richard-type thermograph at the meteorological station of the National Observatory of Athens (hereafter NOA) from 1894 to 1903 (Aeginitis, 1907). The error of the approximation was found to be small at NOA (Aeginitis, 1907) and of the order of 0.1 °C in most cases (Mariolopoulos, 1938). The formula:

$$T = \frac{T_8 + T_{14} + 2T_{21}}{4} \qquad (2)$$

was used all over Greece for the estimation of the true monthly air temperature under the implied assumption that it performs equally well all over the country. This is a rather optimistic point of view, considering the climatic diversity of Greece. The aim of this study is, therefore, to investigate the possible errors that may arise from the indiscriminate application of this formula and to suggest ways of improving the true mean monthly temperature estimates.
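As a small illustration, the three-observation estimate can be written as a one-line function. The sketch below assumes the classical weighting in which the evening observation counts twice, (T8 + T14 + 2·T21)/4; the function name and the sample temperatures are hypothetical.

```python
def three_obs_estimate(t_morning, t_14, t_evening):
    """Approximate true monthly mean from monthly averages at three standard hours.

    Assumed weights: morning 1/4, 1400 1/4, evening 1/2.
    """
    return (t_morning + t_14 + 2.0 * t_evening) / 4.0

# With no daily temperature variation the estimate is exact by construction.
flat = three_obs_estimate(15.0, 15.0, 15.0)

# Made-up summer month: 0800, 1400 and 2100 averages in degrees Celsius.
estimate = three_obs_estimate(20.0, 28.0, 22.0)
```

Because the weights sum to one, the estimate reproduces a constant temperature exactly; any error comes from how far the real daily course departs from the pattern the weights implicitly assume.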
It should be noted that Equation (2) is still in use by the Hellenic National Meteorological Service (HNMS) in a slightly different form: the observations taken at 2100 local time have been replaced by those at 2000 from 1930 onwards.
Two different cases of this problem are considered here. In the first, only one thermograph operates in a region but thermometric observations are available at a number of locations. This case resembles the situation in Greece from 1894 to 1930. In the second, thermographic observations are available at a limited number of sites and thermometric ones exist over a denser network of meteorological stations. This resembles the situation in Greece from 1950 onwards. Solutions to the problem are suggested in this study for both cases and the results are inter-compared. The objective of this study is to provide corrections to the pre-1930 temperature measurements, a period in which the network consisted mainly of thermometric stations.

Case 1. True mean air-temperature estimations with one thermograph
It should be mentioned here that a typical difference between the temperature readings of a thermometer and a thermograph may be around 0.5-1 °C (Srivastava, 2008).
At NOA's meteorological station, a 2-year analysis of the temperature difference between the dry-bulb thermometer and the bimetallic thermograph has shown it to be 0.5 °C on average, with the thermograph giving lower temperature values. In order to improve the air-temperature estimates deduced by Equation (2), the following are required:
1. The average hourly and daily values for every month and for a number of years should be available from the operating thermograph; its location is called the principal location hereafter.
2. The average hourly values for each month should be available at a number of meteorological stations, where air-temperature observations are taken at standard hours every day; these sites are called secondary locations hereafter.
3. A procedure that takes into account the aforementioned thermographic and thermometric data and produces improved true mean monthly estimates at the secondary locations should be developed.
4. Supplementary thermographic data at the secondary locations should be available, in order to validate the new estimates.
NOA has been chosen as the principal location; its thermographic observations from the period 1916-1930 have been used in this study. The secondary locations belong to the HNMS network and were selected so as to be distributed evenly across Greece; their data are also used in this study.
[Table 1: stations and years of observations; recoverable row: Trikala, 1992, 1993. The location of the stations is shown in Figure 1.]
Another requirement of the study is the availability of thermographic recordings for validation purposes, in parallel with thermometric observations at the same stations, even for a small number of years. Table 1 and Figure 1 give a full list of the meteorological stations and the corresponding years of measurements used in this study. These stations have thermographic recordings. The small number of years of HNMS data used in this study reflects the requirement that their reliability be verified. The proposed algorithm takes into account the true mean monthly air-temperature value ($T$) and the monthly averages at the standard times of observation ($T_8$, $T_{14}$, $T_{21}$) at the principal location. These standard times were in use in Greece during the period 1894-1930.
At a secondary location, X, the corresponding true mean monthly air temperature, $T_X$, is assumed equal to $T$ when the monthly averages at the standard hours of observation are equal to those at the principal location. If one of those averages differs and the remaining two are equal, it is assumed that the difference decreases linearly with time and becomes zero at the other two observational hours. If, for example, $T_{X14}$ is greater than $T_{14}$, their difference is:

$$D_{14} = T_{X14} - T_{14}$$

which decreases linearly to zero at 0800 and 2100. The difference then forms a triangle of height $D_{14}$ over the 13 h between 0800 and 2100, whose average over 24 h is $13 D_{14}/48$; $T_X$ therefore becomes greater than $T$:

$$T_X = T + \frac{13}{48} D_{14} \qquad (3.1)$$

Similarly, with $D_8 = T_{X8} - T_8$ vanishing at 2100 and 1400 (a 17-h base):

$$T_X = T + \frac{17}{48} D_8 \qquad (3.2)$$

and, with $D_{21} = T_{X21} - T_{21}$ vanishing at 1400 and 0800 (an 18-h base):

$$T_X = T + \frac{18}{48} D_{21} \qquad (3.3)$$

If all monthly averages at the standard hours of observation differ, then, by combining Equations (3.1)-(3.3), we deduce:

$$T_X = T + \frac{17 D_8 + 13 D_{14} + 18 D_{21}}{48} \qquad (4)$$

Equation (4) is the mathematical expression of the proposed algorithm; it implies that the maximum differences of the averaged hourly air temperatures between station X and NOA occur at the standard hours of observation. This is rather optimistic, but as the actual daily course of the air temperature at station X is unknown, Equation (4) is expected to yield better estimates of the true monthly means at station X than Equation (2). In practice, Equation (2) can be regarded as the first-order approximation to the true monthly mean at station X, and Equation (4) can serve as the second-order approximation.
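The linear-decay assumption described above can be turned into a short Python sketch. The weights below are not quoted from a table; they are derived here from the stated geometry, treating each difference as a triangle that peaks at its own observation hour and falls to zero at the two neighbouring observation hours, and then averaging that triangle over 24 h. All function and variable names are hypothetical.

```python
def triangle_weight(peak_hour, prev_hour, next_hour):
    """24-h average of a unit difference at peak_hour that decays linearly
    to zero at the neighbouring observation hours (wrapping past midnight)."""
    rise = (peak_hour - prev_hour) % 24   # hours from previous observation to peak
    fall = (next_hour - peak_hour) % 24   # hours from peak to next observation
    return (rise + fall) / 2.0 / 24.0     # triangle area divided by the day length

HOURS = (8, 14, 21)
WEIGHTS = {
    8: triangle_weight(8, 21, 14),    # 17/48
    14: triangle_weight(14, 8, 21),   # 13/48
    21: triangle_weight(21, 14, 8),   # 18/48
}

def corrected_true_mean(t_principal, obs_principal, obs_secondary):
    """True monthly mean at a secondary station from the principal station's
    true mean plus the weighted differences at the standard hours."""
    return t_principal + sum(
        WEIGHTS[h] * (obs_secondary[h] - obs_principal[h]) for h in HOURS
    )

# Made-up monthly averages (degrees Celsius); only the 1400 value differs.
principal = {8: 24.0, 14: 30.0, 21: 25.0}
secondary = {8: 24.0, 14: 32.0, 21: 25.0}
t_x = corrected_true_mean(25.0, principal, secondary)
```

The three weights sum to one, so a secondary station that is uniformly warmer than the principal one by some amount at all three hours is assigned a true mean warmer by exactly that amount.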

Validation of the algorithm
Equations (2) and (4) were applied to the data from all stations and years listed in Table 1 and Figure 1. For every month, year, and station, the actual true mean air temperatures deduced from thermographic recordings were compared with the estimates derived from Equations (2) and (4). The resulting differences, averaged over all stations and years of observations, are listed in Table 2. From late spring until autumn, Equation (2) overestimates the true mean monthly air temperatures far more than previously anticipated. The averaged differences exceed 0.8 °C in June; occasionally, the difference can exceed 1.0 °C at inland stations.
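The averaging step of this validation can be sketched as follows. The sign convention (estimate minus true mean, so that positive values denote overestimation) is an assumption, as are the record layout, the function name, and the sample numbers.

```python
from collections import defaultdict
from statistics import fmean

def mean_bias_by_month(records):
    """Average (estimate - true mean) per calendar month.

    records: iterable of (month, true_mean, estimate) tuples, one per
    station and year, pooled over all stations and years.
    """
    by_month = defaultdict(list)
    for month, true_mean, estimate in records:
        by_month[month].append(estimate - true_mean)
    return {month: fmean(diffs) for month, diffs in sorted(by_month.items())}

# Made-up records in degrees Celsius.
records = [
    (6, 25.0, 25.9),
    (6, 24.0, 24.7),
    (1, 8.0, 8.1),
]
bias = mean_bias_by_month(records)
```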
Application of Equation (4) alleviates this problem; it corrects more than 70% of the error occurring from March until September. During these months (except for March and September), the estimates of the true mean monthly temperatures from Equation (4) are better than those from Equation (2) at the significance level of 99%. During the remaining months, Equation (4) performs worse than Equation (2), except in February and December. Equation (2) moderately overestimates the true mean monthly temperatures, while Equation (4) tends toward considerable underestimation. This behavior of Equation (4) can partly be attributed to the breakdown of the hypothesis of a linear decrease of the observed differences $D_8$, $D_{14}$, and $D_{21}$ with time. The shorter length of the day may modify and complicate the daily course of the temperature during the months of October through February.

Table 2. Difference from Equation (2) (°C) and difference from Equation (4) (°C).
True mean annual temperatures can be deduced from true mean monthly values. As Equation (4) overestimates in some months and underestimates in others, the errors largely cancel and it approaches the true annual mean very closely on average (0.011 °C), while Equation (2) gives an average overestimation of almost 0.385 °C. This overestimation varies from year to year; thus, when Equation (2) is used to deduce the mean annual temperature values, considerable noise can be introduced into time-series analyses that try to estimate temperature-related climate change.

Case 2. True mean air-temperature estimations with a number of thermographs
In this scenario, a limited number of thermograph-equipped stations exist (more than one principal location). The problem to be solved is the estimation of the true mean monthly air temperatures at a number of secondary locations, where only daily temperature observations at standard hours are available. As more thermographs are now available, it is possible to express their true mean monthly $T$ as a function of the monthly temperature averages $T_8$, $T_{14}$, and $T_{21}$, taken at the corresponding standard observational hours, with the help of multiple linear regression analysis. Consequently, 12 equations, one for each month, are derived. Such equations, deduced from the observations made at the stations and years listed in Table 1, are presented in Table 3. Comparing the coefficients of these equations with the coefficients of Equation (2) (not shown here) reveals that only the coefficient at noon is similar in all equations. This is another indication of the inadequacy of Equation (2). Observed versus predicted values based on the aforementioned equations for the months of January and July are presented in Figures 2 and 3, respectively, in order to depict the accuracy of the fit. From these equations, and using the temperature observations at the same observational times at the secondary locations, the true monthly means at these locations can be estimated. It is implied that the daily temperature course at the secondary locations is more or less similar to the average daily temperature variation of the stations at the principal locations. Otherwise, the estimated values are of poor quality.
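A minimal sketch of how one such monthly equation could be fitted is given below, using ordinary least squares from NumPy. Whether the fitted equations include an intercept is not stated here, so the intercept term is an assumption; the function names are hypothetical, and the synthetic data and coefficients are invented solely to check that the fit recovers them.

```python
import numpy as np

def fit_month(t8, t14, t21, t_true):
    """Least-squares fit of T = b0 + b8*T8 + b14*T14 + b21*T21 for one month.

    Inputs are 1-D sequences with one entry per station-year.
    Returns the array [b0, b8, b14, b21].
    """
    design = np.column_stack([np.ones(len(t8)), t8, t14, t21])
    coef, *_ = np.linalg.lstsq(design, np.asarray(t_true, dtype=float), rcond=None)
    return coef

def predict(coef, t8, t14, t21):
    """Apply a fitted monthly equation at a secondary location."""
    return coef[0] + coef[1] * t8 + coef[2] * t14 + coef[3] * t21

# Synthetic, noise-free station-years generated from invented coefficients.
rng = np.random.default_rng(0)
t8 = rng.uniform(15.0, 25.0, 20)
t14 = t8 + rng.uniform(4.0, 10.0, 20)
t21 = t8 + rng.uniform(0.0, 4.0, 20)
expected = [0.5, 0.30, 0.25, 0.40]
t_true = expected[0] + expected[1] * t8 + expected[2] * t14 + expected[3] * t21

coef = fit_month(t8, t14, t21, t_true)
```

Repeating the fit for each calendar month yields the 12 equations; `predict` is then applied to the standard-hour averages of each secondary location.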
[...] the others. Consequently, the question arises as to which procedure is the most promising.
To resolve this, another data set made available by HNMS stations with simultaneous thermographic recordings and temperature observations was employed. The stations and years of observations used in the subsequent analysis are listed in Table 4 and Figure 4.
True monthly means were estimated for every month, year, and station employed, using Equations (2) and (4) as well as the equations listed in Table 3. The differences between observed and estimated true monthly means for every month and estimation procedure, averaged over all stations and years of observations, are listed in Table 5, along with their seasonal and annual values.
A comparison of Tables 2 and 5 shows that Equation (4) yields better estimates than Equation (2) from April to September. During the same period, the regression analysis equations yield results of more or less equal quality to those obtained through Equation (4). The results obtained from the regression analysis equations tend to be rather symmetrically distributed around the true monthly averages, while those deduced from Equations (2) and (4) tend toward overestimated values. It is, therefore, expected that the seasonal and annual averages obtained through the regression analysis equations should yield better estimates than those obtained through Equations (2) and (4). The annual averages listed in Table 5 demonstrate this. Equation (2) can be retained for winter and late autumn months, as it performs more or less equally well with the other two methods.

Possible effect on climatic temperatures of Greece
The first tabulation of the mean monthly temperatures, based on observations taken over a long-term period at many meteorological stations operating all over Greece, appears in the 'Climate of Greece' by Mariolopoulos (1938). Mean temperatures were derived with the help of Equation (2) and it was assumed that they represent the true means with great accuracy. The climatic period was 1900-1929, but as many of the stations started their operation later than 1900 (about 1915), the means of the period 1915-1929 had to be adjusted to yield the 1900-1929 values. The adjustment was based on the already derived mean values from the stations with full records.
To assess the possible error introduced in the true mean climatic estimations, observations taken at the stations listed in Table 6, for periods of observation varying between 1910-1929 and 1915-1929, were used in order to obtain averages at the standard hours of observation for every month. The data were taken from the NOA archives. It should be mentioned that before 1930, NOA was responsible for all meteorological observations in Greece.
From these averages, the monthly means were estimated with the help of Equation (2) as well as with the help of the regression analysis equations. The latter are considered to represent the true means; the differences between their estimates and the corresponding estimates from Equation (2) are considered to represent the error in the climatic true mean air-temperature estimates.
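The error definition used here can be sketched in a few lines. The sketch assumes the classical (T8 + T14 + 2·T21)/4 weighting for the three-observation formula and a regression equation with an intercept; the coefficients and temperatures are invented for illustration and are not values from Table 3 or Table 6.

```python
def eq2_estimate(t8, t14, t21):
    """Three-observation estimate with assumed weights 1/4, 1/4, 1/2."""
    return (t8 + t14 + 2.0 * t21) / 4.0

def regression_estimate(coef, t8, t14, t21):
    """Monthly regression equation T = b0 + b8*T8 + b14*T14 + b21*T21."""
    b0, b8, b14, b21 = coef
    return b0 + b8 * t8 + b14 * t14 + b21 * t21

def climatic_error(coef, t8, t14, t21):
    """Three-observation estimate minus the regression estimate taken as truth."""
    return eq2_estimate(t8, t14, t21) - regression_estimate(coef, t8, t14, t21)

# Hypothetical July coefficients and long-term averages (degrees Celsius).
JULY_COEF = (0.2, 0.30, 0.22, 0.46)
err = climatic_error(JULY_COEF, 25.0, 31.0, 26.0)
```

A positive value indicates that the three-observation formula overestimates the climatic mean relative to the regression estimate.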
A comparison of Tables 5 and 6 shows that the general characteristics of the overestimated values obtained from Equation (2) are exhibited in both tables, implying that the overestimations of Equation (2) concern not only individual years but also long observational periods. Maximum errors are found at stations located in northern Greece and away from the sea, while minimum ones occur at stations located in the southern part of the country and close to the sea, as can be seen in Figure 5, where the locations of the stations are shown. The annual error equals approximately 50% of the error of the summer season in both tables. As mentioned in Section 1, the meteorological observations at 2100 local time were replaced by those at 2000 local time from 1930 onwards. It is, therefore, expected that the overestimation error since then has been larger than the values estimated so far.

Conclusions
From the present study, it is evident that Equation (2) largely overestimates the true mean air temperatures, with the exception of winter and late autumn months. Overestimations occur for individual months, years, or climatic periods. They reach maximum values in the summer months and for stations located in northern Greece and away from the sea. The opposite holds for stations located in the southern part of the country and close to the sea. Two solutions for the problem have been suggested.
The first is applied when only one thermograph is operating in an area and its recordings have to be used, along with the air-temperature observations taken at other locations, in order to obtain the true mean air temperature at these locations. Equation (4) is employed for the solution to the problem and corrects more than 50% of the overestimation error in the spring, summer, and annual true mean air temperatures.
The second solution is applied when more than one thermograph is operating in an area. Multiple regression analysis is employed and the true mean air temperatures for these stations are expressed in the form of equations having the true mean value as the dependent variable and the monthly averages at the standard observation times as the independent ones (see equations in Table 3). These equations yield the true mean values at locations where only air-temperature observations at the standard observation hours are available.
The two solutions yield comparable results for individual spring and summer months, while the second solution yields better seasonal and annual estimates. The annual averages obtained through the second solution are more accurate than those obtained from Equation (2). Therefore, the time series of the mean annual air-temperature values derived from Equation (2) may contain noise, thus obscuring the detection of any climate change signal.