Leaf phenology amplitude derived from MODIS NDVI and EVI: Maps of leaf phenology synchrony for Meso‐ and South America

The leaf phenology (i.e. the seasonality of leaf amount and leaf demography) of ecosystems can be characterized through the use of Earth observation data using a variety of different approaches. The most common approach is to derive time series of vegetation indices (VIs) which are related to the temporal evolution of FPAR, LAI and GPP or alternatively used to derive phenology metrics that quantify the growing season. The product presented here shows a map of average ‘amplitude’ (i.e. maximum minus minimum) of annual cycles observed in MODIS‐derived NDVI and EVI from 2000 to 2013 for Meso‐ and South America. It is a robust determination of the amplitude of annual cycles of vegetation greenness derived from a Lomb–Scargle spectral analysis of unevenly spaced data. VI time series pre‐processing was used to eliminate measurement outliers, and the outputs of the spectral analysis were screened for statistically significant annual signals. Amplitude maps provide an indication of net ecosystem phenology since the satellite observations integrate the greenness variations across the plant individuals within each pixel. The average amplitude values can be interpreted as indicating the degree to which the leaf life cycles of individual plants and species are synchronized. Areas without statistically significant annual variations in greenness may still consist of individuals that show a well‐defined annual leaf phenology. In such cases, the timing of the phenology events will vary strongly within the year between individuals. Alternatively, such areas may consist mainly of plants with leaf turnover strategies that maintain a constant canopy of leaves of different ages. Comparison with in situ observations confirms our interpretation of the average amplitude measure. VI amplitude interpreted as leaf life cycle synchrony can support model evaluation by informing on the likely leaf turn over rates and seasonal variation in ecosystem leaf age distribution.


| INTRODUCTION
The characterization of the leaf phenology of ecosystems and the plant functional types (PFTs) within them is important for the identification of spatial phenological patterns and their drivers. This information is essential to improve dynamic global vegetation models through better representation and parameterization of PFTs. This is particularly relevant in the tropical and subtropical regions where observed leaf phenological behaviours range from (a) rapid shedding of short-lived, leaves early in the dry season in deciduous species; (b) irregular exchange in brevi-deciduous species to (c) constant exchange of long-lived, leaves in evergreen species, and where both intraspecific synchrony and asynchrony have been reported (van Schaik et al., 1993;Wright and Schaik, 1994;Reich, 1995;Reich et al., 2004).
A variety of different Earth observation (EO) approaches have been used to characterize the leaf phenology of ecosystems and their PFTs. A common approach is to derive time series of vegetation indices (normalized difference vegetation index, NDVI and enhanced vegetation index, EVI) which can be related to the temporal evolution of the fraction of absorbed photosynthetically active radiation (fAPAR), leaf area index (LAI) and gross primary productivity (GPP) (Justice et al., 1998;Fensholt et al., 2004;Vina and Gitelson, 2005;Yan et al., 2008;Zhu et al., 2013). EO-derived land cover maps often include 'evergreen' and 'deciduous' classes (GlobCover, Arino et al., 2012, e.g. MODIS land cover, Friedl et al., 2010. However, these classes are limited to a qualitative description of generic phenological behaviours. Another approach is to characterize leaf phenology through combining EO-derived phenology metrics, such as start (leaf out), end (leaf fall) and length of growing season, and the magnitude of greening (i.e. the range or maximum minus minimum, commonly referred to in the phenology literature as 'amplitude') from time series of vegetation indices (Ivits et al., 2013;Eastman et al., 2013;Buitenwerf et al., 2015). Such metrics will often show high levels of uncertainty (White et al., 2009), particularly in the tropics, if derived from EO data where the incidence of cloud is high and/or the growing season is less well-defined.
Here we have implemented an approach that deals with the noisy time series and missing observations in vegetation indices, prevalent in these regions, and that robustly tests for the presence of statistically significant annual leaf phenology signals. We present maps of average ranges (described as average 'amplitudes') of vegetation indices for Meso-and South America, representing the synchrony of vegetation greening within 500 m 2 , 1 km 2 or 0.5 degree 2 pixels or grid boxes. Where the uncertainty is too high, no data are provided. These maps can be used as reference phenology maps by the DGVM modelling community, which is currently concerned with the development of improved PFTs (e.g. Arora and Boer, 2005;De Weirdt et al., 2012;Zhao et al., 2013;Sakschewski et al., 2016) in the tropical and subtropical regions, and to inform on the likely seasonal variation in leaf age distribution to help improve model GPP estimates (Wu et al., 2016;Albert et al., 2018).

| METHODS
The approach is based on a method we developed as part of (Bradley et al., 2011) with improvements introduced at different stages of the data processing. Important differences in the method, which is outlined below, are (a) the use of 14 years of vegetation index (VI) data compared with 6 years in (Bradley et al., 2011) which resulted in an much better estimate of the amplitude; (b) the application of a stringent quality control procedure to select the most reliable VI observations; and (c) the use of the Lomb-Scargle transform (Press et al., 1992), thus avoiding the need for time series gap filling that can bias the detection of annual cycles.
Quality-assured EVI and NDVI time series were derived from the 14-year-long (2000-2013) 16-day time series of MODIS vegetation indices (500 and 1,000 m EVI; and 500 and 1,000 m NDVI). These input time series were acquired from LPDAAC and are referred to as MOD13A1 (500 m 2 ) and MOD13A2 (1,000 m 2 ).
Both NDVI and EVI are an indirect measure of vegetation growth (i.e. increase in LAI) and activity (i.e. photosynthesis). NDVI is a ratio of the reflectance from the red and near infrared (NIR) wavebands and so reduces noise caused by sources present in both bands, such as illumination differences and atmospheric attenuation. However, NDVI is nonlinear and saturates at high vegetation biomass levels while at low vegetation levels it is sensitive to variations in canopy of leaves of different ages. Comparison with in situ observations confirms our interpretation of the average amplitude measure. VI amplitude interpreted as leaf life cycle synchrony can support model evaluation by informing on the likely leaf turn over rates and seasonal variation in ecosystem leaf age distribution.

K E Y W O R D S
leaf phenology, Lomb-Scargle spectral analysis, MODIS time series, synchrony, vegetation indices | 15 GERARD Et Al. background reflectance (e.g. soil, litter) (Huete, 1988). The MODIS EVI product was designed to reduce these problems and also correct for atmospheric effects (Huete et al., 2002). NDVI is more sensitive to leaf chlorophyll concentrations through the red band (620-670 nm) while EVI is more sensitive to canopy structure, including LAI, through the NIR band (840-875 nm) (Gao et al., 2000).
The NDVI equation is: with values ranging between −1 and 1. The EVI equation is: where ρ are atmospherically corrected reflectance values; G is the gain factor; L is the canopy background adjustment; and C 1 , C 2 are the coefficients of the aerosol resistance term, which uses the blue band reflectance to correct for aerosol influences on the reflectance in the red band. The coefficients adopted in EVI MODIS are L = 1, C 1 = 6, C 2 = 7.5 and G = 2.5 (Huete et al., 1994) resulting in values that range between −2.5 and 1.25. Both vegetation indices are normally rescaled to a practical range of 0-1. Data quality control of NDVI and EVI time series was performed in a series of steps. Firstly, the vegetation index data were screened to remove outliers as follows. Values below 0.0 and data not flagged as 'good quality' according to the MODIS VI Quality dataset were eliminated. Any values of more than three standard deviations above or below the mean of the remaining values were considered as outliers and also removed. Next, to ensure relatively homogeneous observations, pixels with time series that did not consist of 100 observations or more (out of a possible maximum of 320 over the years 2000-2013 inclusive) or that did not have at least one observation in every calendar month were also removed. Finally, pixels with mean NDVI or EVI values of <0.01 were removed from the spectral analysis as these cannot present significant variations in vegetation greenness (e.g. pixels in the Atacama Desert). Despite these stringent restrictions, sufficient pixels were retained for analysis even in cloud-prone areas at the 500 and 1,000 m 2 resolutions ( Figure 1).
Pre-processing of the quality-assured time series involved standard linear detrending by least squares regression (Weedon, 2003). The first and last five per cent of the detrended time series were then cosine tapered to supress 'periodogram leakage' (Priestley, 1981;Weedon, 2003).
The previous time series analysis of MODIS NDVI and EVI covering March 2000 to December 2006, that was reported in Bradley et al. (2011), used limited 'gap filling' as this would not have significantly influenced the relationship (coherency and phase) between paired variables (the focus of that work). However, gap filling could potentially bias the detection of annual cycles in the vegetation indices. Hence, the data used in this study were transformed to the frequency domain using the Lomb-Scargle algorithm of Press et al. (1992) instead of via a standard discrete Fourier Transform (including FFT). The transformed data were used to generate periodogram estimates of power -that is values of the average variance or, equivalently, the squared average deviationor squared average amplitude sensu stricto -across a range of frequencies. Three applications of a discrete Hanning spectral window were used to reduce the variability of the periodograms and provide power spectral estimates with eight degrees of freedom (Priestley, 1981;Weedon et al., 2015).
For regularly spaced data, the Lomb-Scargle algorithm yields identical results to a standard discrete Fourier Transform. However, for data that are irregularly spaced in time, such as the time series data used here, the specific distribution of values by time can lead to biased (over-estimated) Lomb-Scargle spectral estimates at high frequencies -necessitating a bias correction procedure (Schulz and Mudelsee, 2002). Typically, the Lomb-Scargle algorithm is used with the spacing of spectral estimates (the Rayleigh Frequency, RF) determined by the length of the dataset (i.e. RF = 1/T, where T is the total duration of the data). However, since outlier removal can lead to differences in the total length of the data, here the RF was adjusted in the spectrum for each pixel so that there was estimation of the power at the frequency of precisely 1/(1 year). This adjustment of RF prevented underestimation of the power at the annual scale that would otherwise occur due to 'scalloping loss' or 'picket fencing' (Ifeachor and Jervis, 1993).
The spectral background was modelled as resulting from a first-order autoregressive process and located using the robust median smoothing method of Mann and Lees (1996). One-sided confidence levels were obtained using standard chi-squared distribution according to the degrees of freedom of the spectral estimates (Priestley, 1981;Weedon, 2003). In order for a pixel to be judged to have statistically significant annual variations in greenness, the spectral peak at the annual scale must exceed the 95% confidence level above the spectral background. No corrections were made to the confidence levels for the effects of multiple testing (i.e. for signal detection at multiple frequencies) because this study is concerned specifically with testing peak significance at the annual scale alone. Given the degrees of freedom of our spectral estimates and the 95% confidence level threshold for significance, the uncertainty in resulting amplitudes ranges between (larger than) −41.6% and (less than) +39.2%. Figure 2 shows example time series and their power spectra which were assigned a low, medium or high average range 'amplitude' value.
In time series analysis, the term 'amplitude' relates to the deviation from the mean (or linear regression) level. Spectral estimates derived from the periodogram directly indicate the power (i.e. variance) at each frequency. The square root of the periodogram estimates gives the average amplitude at each frequency. Note that other methods of spectral estimation (e.g. the Blackman-Tukey method) yield the spectral density so that the variance is proportional to the area under the spectrum.
In the phenological literature, the maximum minus the minimum of a sinusoidal time series is often described as the 'amplitude' rather than as wave height or range. In order to produce maps of annual-scale variability that can be readily interpreted, the spectra were rescaled so the map values of 'average amplitude' show the average annual range (maximum minus minimum) of vegetation index values where the spectra have statistically significant annual cycles. The re-scaling of the power spectra accounts for the amplitude squaring, effects of data tapering, spectral smoothing and conversion of average deviation (average amplitude sensu stricto) to average range (peak to trough or maximum minus minimum) yielding the final mapped 'average amplitude'.
The reliability of the spectral results is affected by the number of data points in a time series. So in addition to the significance test, the number of good quality observations used in the analysis for each pixel is provided to allow for a further quality screening by the user. The steps described in Figure 3 delivered maps showing the average range ('amplitude') of the annual cycle observed for each pixel. Pixels with too few high quality data ('no data' flag) are distinguished from pixels with no significant annual cycles in greenness ('no significant annual cycle' flag). The values are presented as (1) absolute amplitude values of NDVI or EVI and (2) normalized amplitude values of NDVI or EVI. The normalized amplitude is the absolute amplitude value divided by the average of the 14 year VI time series. The other values found in the data are: 0 = mean VI < 0.01 −1 = 'no significant annual cycle'; −2 = 'no data' or pixels with insufficient observations; F I G U R E 1 Top: Example of outliers (inside red circles) that were removed from a 16 day MODIS NDVI time series; Bottom: a MODIS tile (500 m pixels -left, 1,000 m pixels -right), including parts of Peru and Bolivia, illustrating the impact of screening for time series with insufficient good quality observations; the black areas represent pixels with insufficient data, which mainly include lakes (e.g. Titicaca) and the Andes mountain range −3 = Sea or Open water; A quality layer with the % number of good quality observations for each pixel used in the spectral analysis is provided for each leaf phenology synchrony map. For each vegetation index (NDVI, EVI)m maps with the following spatial resolutions are available: 500m, 1km and a version re-gridded to 0.5 degrees. Figure 4 shows the amplitude maps derived from NDVI and EVI for the whole of Meso-and South America and highlights the level of spatial detail achieved for two areas shown close-up ( Figure S1 shows the normalized amplitude maps).

| INTERPRETING THE DATASET
The surface area contributing to a MODIS VI value is always larger than the nominal pixel size of the VI value (Tan et al., 2006). Also, the NDVI and EVI time series present a temporal signal that results from the spatial and temporal integration of the leaf phenology of vegetation contained within the area contributing to a pixel. Hence, any phenology measures derived from such a signal should be considered at the plant community level.
The EO community has previously introduced vegetation phenology metrics (e.g. start, end and length of growing season) and productivity metrics (e.g. Maximum VI, seasonal amplitude) (e.g. Jonsson and Eklundh, 2004) that are derived from time series of VIs for each growing season. However, the dataset presented here is more aligned with the plant leaf phenological patterns characterized by ecologists through measures that capture the timing of life cycle events, such as frequency, regularity, amplitude, synchrony and duration (Newstrom et al., 1994). The significance testing of the spectral peaks at the annual scale and confirms the existence or absence of a regular annual cycle in vegetation greenness. This aligns with the combined concepts of frequency and regularity described by Newstrom et al. (1994). The average amplitude values denote the degree of synchrony of changing net leaf greenness of the plant population (Newstrom et al., 1994) across the whole pixel. Changes in land cover during the study period, which led to a sudden change in leaf phenology patterns, will influence the average amplitude values. However, testing for the potential influences of such land cover changes was beyond the scope of this investigation.  Figure 5 shows the relationship between the EVI-and NDVIderived amplitude values for the main IGBP cover types found in Meso-and South America. As expected, the EVI-based amplitude values were larger than their corresponding NDVIbased amplitudes in areas of high biomass, because the EVI is designed to be more sensitive in these areas. Recent work from Morton et al (2014) has highlighted that in the tropics, the MODIS VIs are affected by bi-directional reflectance effects. In other words, the seasonal changes in solar zenith angle introduce an underlying artificial increase/decrease in EVI/NDVI values between June and October. The magnitude of this effect is greater for EVI than for NDVI. However, the average amplitude (i.e. average range) values found for the IGBP 'evergreen broadleaf forest' are generally larger than the magnitude of the BRDF effect shown in Morton et al. (2014) (i.e. a median of 0.04 and 0.07 versus the uncorrected June-October differences of ~0.02 and ~0.065 reported by Morton for NVDI and EVI, respectively) which is in line with Jian et al. (2015). Moreover, large proportions of the tropical region, where the impact of the BRDF effect is expected to be greatest, are mapped as having no significant annual cycle. This gives us confidence that the VI products are correctly identifying areas which exhibit real phenological cycles. It is important to note that the NDVI product is more conservative than the EVI product, showing a larger proportion of 500 m or 1 km pixels with no significant cycle (see Figure 4a and e).

| The absolute and normalized amplitude values
The absolute amplitude value ranges between 0 and 1, and can be interpreted as the combined effect of synchrony and vegetation cover. In other words, the two extremes could be interpreted as: • Areas with a high vegetation cover and annual variations in greenness that is well synchronized across individuals and vegetation species, yielding a high value. • Areas with small annual greenness variations that could indicate a high vegetation cover that is poorly synchronized or with a very low vegetation cover but that is well synchronized. Both cases would yield a low value. It is critical that the user understands that the absolute amplitude maps relate to annual-scale VI variability. The variability might or might not be linked to the average greenness. For example, areas with low (e.g. arid grassland with low leaf area index, LAI) or high vegetation cover (subtropical grassland with high LAI), both showing a regularly timed growing season, could display a low and high amplitude value, respectively. In order to judge the average greenness, the maps include the average NDVI and EVI.
The normalized amplitude value could in theory range between 0 and 100; however, in practice the range is between 0 and ~2. It can be interpreted as the level of net synchrony between the leaf life cycles of the vegetation observed within pixels. In other words, given a within-pixel variation in environmental characteristics and climate, and the diversity of plant species in terms of their different leaf life cycle strategies (leaf phenological behaviours), a particular pixel will be between two extremes: • Well synchronized, where the leaf life cycle of whole plant communities are driven by the same environmental and climatological factors, yielding a high amplitude value. • Poorly synchronized where the life cycles of individuals within a community are likely to be driven by different environmental and climatological factors, yielding a low amplitude value.
Note that areas with low (e.g. arid grassland) and high vegetation cover (tropical forest) showing a regularly timed and synchronized growing season could potentially show similar normalized amplitude values.
In both cases (absolute and normalized), combining the amplitude with land cover information will help make the distinction between possible options.

| EVALUATING THE AMPLITUDE VALUE
We used two approaches to evaluate the amplitude values. The first involved comparing the full extent of the average amplitude data grids with the independent IGBP land cover map (Loveland and Belward, 1997;Friedl et al., 2010). This was done with histograms showing the distribution of the amplitude values found within each of the IGBP cover classes. The second approach compared published in situ site observations that describe canopy leaf life cycles through leaf flushing and falling, LAI and litter fall or other observations, with the corresponding amplitude values of these sites.

| Comparison with independent IGBP
land cover map Figure 6 shows for NDVI the distribution (i.e. histograms) of the 500 m 2 average amplitude values found within the land cover classes of the 500 m IGBP global land cover product -MCD12Q1 ( Figure S2 in Supplement shows similar histogram plots for EVI).
The histogram plots for the land cover classes categorized explicitly as 'evergreen' and 'deciduous' confirm our interpretation of the amplitude values (i.e. a measure of synchronized behaviour of the vegetation found within a pixel with respect to their leaf phenology). We find a large proportion of pixels with F I G U R E 5 Scatterplots between the amplitudes of NDVI and EVI for the main IGBP land cover classes found in Meso-and South America, where shrubland includes IGBP classes 'closed Shrubland', 'Open Shrubland' and 'Woody Savanna'; and grassland includes IGBP classes ' Savanna' and ' Grassland'. The amplitudes shown are for a random subsample of points taken to represent each IGBP class. The total number of points sampled varies per cover class and is given in the respective plot. The black to white point density gradient is relative no annual cycle (Table 1) or with annual cycles showing very low average amplitudes within the IGBP cover class defined as 'evergreen broadleaf forest'. By contrast, we find a large proportion of pixels with significant annual NDVI cycles (Table  1)  a spread-out histogram, but with a higher proportion of pixels with low average amplitudes, confirming that this class indeed represents a mixture of deciduous and evergreen phenological behaviours. The histograms of the shrubland, savanna, grassland and cropland classes indicate a mixture of phenological behaviour for all these classes.

F I G U R E 7 Examples of 16 day MODIS NDVI time series for 4 of the sites identified in the literature (left) and their corresponding Lamb-
Scargle transform power spectra (right). The spectrum for site 9 (deciduous subtropical dry forest) shows a half-annual spectral peak associated with half-annual cycles visible in the time series, while the spectrum for site 10 (tropical dry forest) shows a half-annual spectral peak due to the second harmonic of the annual cycle in the time series, showing non-sinusoidal annual cycles with no half-annual oscillations

Data description Pixel resolution
Absolute amplitude based on EVI (2nd layer: % number of good quality observations used) 500 m  Table 2 lists and describes 15 in situ sites that were found in the literature and used to evaluate the amplitude values. Figure S3 in supplement shows the location of these sites, Table 3 lists their NDVI and EVI amplitudes (absolute and normalized) and Figure 7 the NDVI time series and resulting Lomb-Scargle power spectra for four of these sites. The sites vary in location (from Mexico to Argentina) and vegetation type (i.e. evergreen forests, deciduous forests, savanna, wetlands and sugar cane). The oldest cited work is from 1986, the most recent from 2008. In most cases, the data were collected from nature reserves, so we can assume that the vegetation will not have changed much over time in terms of its overall composition and, thus, phenological behaviour. Google Earth was also used to ensure that the sites were not located in heterogeneous areas or areas heavily impacted by humans.

| Comparisons with in situ observations
Nine of the 15 sites are sites with suitable field data most of which are presented as a time series plot. The published plots are shown in the supplement ( Figures S4-S12). The Sugar cane site -an active Fluxnet site but with no site based phenology observations -was included as an example cropland site with a large amplitude. Although the type and quality of the reported field data varies substantially, the amplitudes derived for the sites, from both NDVI and EVI, generally match the level of synchronicity described. Sites (5, 6 and 8) with species or trees exhibiting unsynchronized leaf flushing and leaf fall have the lowest amplitudes (NDVI amplitude: 0.05, 0.04 and 0.04). Sites (1, 3, 7, 9 and 10) which show a mixture of synchronized behaviour with unsynchronized behaviour or those with an 'evergreen' layer show amplitudes in the middle range (NDVI amplitude: 0.21, 0.16, 0.12 and 0.26). Finally, sites which are fully synchronized (2 and 4), show the highest amplitudes (NDVI: 0.43 and 0.34). The one exception is site 1 which has a low amplitude (NDVI: 0.06) but which has field observations showing a synchrony in leaf flushing (deciduous and evergreen species) and partial synchrony in leaf fall (deciduous species only). Site 1 contains a mixture of deciduous and evergreen species, the latter of which are retaining leaves throughout the year. This could explain why the field data suggest there is a highly significant annual cycle, but the map shows a small amplitude. Figure 7 illustrates four of these published sites plus their NDVI power spectra -showing the highly significant annual-scale variations even for sites with low variability (e.g. site 5).
The remaining five sites are the major wetland areas of South America. Their location and inundation patterns are described by Hamilton et al. (2002). All five wetland sites show a medium amplitude. This is in contrast with the larger amplitudes of their non-inundated surroundings which are much more affected by the local wet and dry seasons.

| DATASET USE AND REUSE
We have shown that the magnitude of the amplitude value could be interpreted as the degree at which the leaf flushing and leaf senescence of the vegetation in a particular area is synchronized. There is an increasing impetus to introduce plant diversity and plasticity into global dynamic vegetation models and allowing for variations in leaf life cycle strategies is an important aspect of this. The introduction of ecosystem leaf phenology synchronicity represented in the form of a normalized amplitude map with tested significance can be used to evaluate the spatial distribution of net ecosystem phenology patterns resulting from the range of leaf life cycle strategies or leaf turn over rates introduced into the DGVMs. To facilitate this, the map could be combined with a land cover (e.g. IGBP) and woody cover map (e.g. Hansen et al., 2013) to locate aseasonal (~ evergreen) and seasonal (~ deciduous) forests in the tropical and subtropical regions, or help better understand the leaf life cycle dynamics present in the woody-grassland matrix of the savannah. The product can also be used to inform on the likely seasonal variation in leaf age distribution and so help explain current observed mismatches between flux tower and model GPP estimates (Wu et al., 2016;Albert et al., 2018). The high spatial resolution grid layers could also provide valuable continental scale information for biodiversity and ecology studies that require a baseline map of community level leaf phenology behaviour.