Uncertainties in simulated El Niño–Southern Oscillation arising from internal climate variability

Significant uncertainties exist in El Niño–Southern Oscillation (ENSO) simulations. To investigate the source of these uncertainties, previous studies have primarily focused on the model itself; however, internal climate variability (ICV) as a source of uncertainty has not been sufficiently explored to date. Using the Community Earth System Model–Last Millennium Ensemble (CESM–LME) modeling project and the Coupled Model Intercomparison Project (CMIP), an investigation into uncertainties in simulated ENSO arising from ICV is performed. Results show that external forcing can significantly increase the uncertainties arising from ICV when the simulation length is greater than ~40 years. In addition, the spread in ENSO amplitude arising from ICV accounts for 50% of the total spread within the CMIP5 historical simulations. Finally, the impact of ICV on ENSO varies considerably with simulation length and stabilizes at the threshold of 300–400 years.

KEYWORDS
CMIP, ENSO simulation, external forcing, internal climate variability

1 | INTRODUCTION
The El Niño-Southern Oscillation (ENSO) phenomenon is one of the dominant modes of climate variability in the tropical Pacific, with significant impacts on weather, ecosystems, and societies in most parts of the world. Understanding and predicting ENSO behavior is crucial to both the scientific community and the public (Bellenger, Guilyardi, Leloup, Lengaigne, & Vialard, 2013; Guilyardi, Braconnot et al., 2009). Because of the complex physical interactions among various oceanic and atmospheric processes (Wang & Picaut, 2004), ENSO studies often rely heavily on numerical simulations by Coupled General Circulation Models (CGCMs), which can reproduce many of the detailed processes and interactions related to ENSO behavior. However, a fundamental question is whether climate simulations can reproduce real-world ENSO behavior well.
ICV is defined as the natural fluctuation of the climate system that arises in the absence of external forcing and includes nonlinear dynamical processes intrinsic to the atmosphere, ocean, and coupled ocean-atmosphere system (Deser, Knutti, et al., 2012). A long-term control simulation without changes to external forcing is essential to assess the influence of ICV (Kay et al., 2015; Wittenberg, 2009). Wittenberg (2009) suggested that a record of more than 500 years is needed to distinguish the influence of ICV on ENSO metrics in a 2000-year preindustrial control simulation. However, most studies have suggested that a large number of simulations with the same model under the same external forcing, especially time-evolving forcing, can also provide an estimate of the influence of ICV (Deser, Knutti, et al., 2012; Kang, Deser, & Polvani, 2013; Otto-Bliesner et al., 2015; Zheng, Hui, & Yeh, 2017). Using this method, Zheng et al. (2017) studied the effect of ICV on ENSO amplitude change in a 40-member ensemble of future climate projections. Whether these two methods yield the same assessment of ICV (i.e., whether external forcing significantly influences ICV), and how ICV affects other ENSO properties such as period, asymmetry, and precipitation, have not yet been explored. Therefore, this study examines the influence of external forcing on ICV and the effect of ICV on multiple ENSO characteristics.
The remainder of this article is organized as follows. Section 2 describes the data and methods. In Section 3, the influence of external forcing on ICV is assessed, and the uncertainties of simulated ENSO arising from ICV and its dependence on the simulation length are investigated. Finally, a discussion and conclusions are given in Section 4.

2 | DATA AND METHODS
As discussed above, both a long-term control simulation and an ensemble experiment with a large number of simulations using an individual model under the same external forcing are required. The Community Earth System Model-Last Millennium Ensemble (CESM-LME) modeling project provides the research community with such a resource in version 1.1 of CESM with the Community Atmosphere Model version 5 (CESM1 [CAM5]; Hurrell et al., 2013). Based on an 1850 control simulation, the CESM-LME spins up the model for 650 years and then starts an 850 control simulation for an additional 1356 years (650-2005).
The 1156-year control simulation from 850 to 2005 is publicly available. Using CMIP5 climate forcing reconstructions (Schmidt et al., 2011) including orbital, solar, volcanic, changes in land use/land cover, and greenhouse gas levels, the CESM-LME runs 10-member "full forcing" simulations for the period from 850 to 2005. The only difference among ensemble members is the perturbation in the initial air temperature field of each ensemble member through adding small random round-off (of order 10^−14 °C) differences. More details can be found in Otto-Bliesner et al. (2015).
In this study, the full-forcing ensemble with the maximum available number of members (10) for the period from 850 to 2005 and the 1000-year 850 control simulation from 850 to 1850 are used. For comparison, 100-year results (from 1900 to 1999) of historical experiments of the 22 CMIP3 models and 40 CMIP5 models are used to evaluate the multimodel spread of simulated ENSO properties. These models are selected on the basis of data availability (Table 1).
The uncertainty in simulated ENSO properties arising from ICV is estimated using the spread of modeled ENSO properties across the ensemble (or segments by dividing the 850 control simulation) relative to the ensemble (segments) mean response. The spread among the ensemble members (segments) is assessed using the standard deviation, and the statistical significance of the standard deviations among different ensembles is assessed using an F-test.
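The spread estimate and the significance test described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input values are synthetic stand-ins rather than CESM-LME output, and the two-sided form of the F-test is an assumption, since the text does not specify the sidedness.

```python
import numpy as np
from scipy import stats


def ensemble_spread(metric_values):
    """Sample standard deviation of one ENSO metric across members or segments."""
    return np.asarray(metric_values, dtype=float).std(ddof=1)


def f_test_variances(sample_a, sample_b):
    """Two-sided F-test for equal variances; returns (F statistic, p-value)."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    f_stat = a.var(ddof=1) / b.var(ddof=1)
    dfn, dfd = a.size - 1, b.size - 1
    # Take the smaller tail probability and double it for a two-sided test.
    tail = min(stats.f.sf(f_stat, dfn, dfd), stats.f.cdf(f_stat, dfn, dfd))
    return f_stat, min(1.0, 2.0 * tail)


# Synthetic stand-ins (assumption) for the Niño3 amplitude of 10 members/segments.
rng = np.random.default_rng(0)
control_segments = 1.0 + 0.05 * rng.standard_normal(10)
forced_members = 1.0 + 0.15 * rng.standard_normal(10)

f_stat, p_value = f_test_variances(forced_members, control_segments)
print(f"control spread = {ensemble_spread(control_segments):.3f}")
print(f"forced spread  = {ensemble_spread(forced_members):.3f}")
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```

With 10 values per group, both degrees of freedom equal 9, so a p-value below 0.05 corresponds to a difference in spread that is significant at the 95% level.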

3.1 | Influence of external forcing on ICV
The standard deviation of SST anomalies (SSTA Stddev) over the Niño3 region (150°-90°W, 5°S-5°N) is used as the ENSO amplitude. The spread of the modeled ENSO amplitude across the ensemble members and nonoverlapping segments, assessed by the standard deviation, is compared among the 850 control simulation and two periods of the full-forcing simulation with CESM-LME as a function of simulation length (Figure 1). In the long-term control simulation (850 cntl), the spread is calculated by dividing the 1000-year simulation into 10 nonoverlapping segments of 100 years, and the evolution of normalized standard deviation as a function of simulation length in 10-year increments is then calculated based on the 10 nonoverlapping segments. To investigate the impact of external forcing, the spreads across the 10 ensemble members from the preindustrial period of 850 to 950 (Forcing_850) and the postindustrial period of 1850 to 1950 (Forcing_1850) in the full-forcing simulation with CESM-LME are calculated. The statistical significance of the normalized standard deviation between 850 cntl and Forcing_1850 as a function of simulation length is assessed using an F-test.
When the simulation length is less than 40 years, the two methods (a long-term control simulation and a large number of simulations with time-varying external forcing) give a similar spread in ENSO amplitude, especially for 850 cntl and Forcing_850; the relative spread in Forcing_1850 is slightly larger than that in 850 cntl, but the difference does not exceed the 95% confidence level. These comparisons suggest that, at these short simulation lengths, the uncertainty arising from ICV is independent of external forcing; that is, the influence of external forcing on the uncertainty is not significant compared with the influence of ICV itself. However, when the simulation length is longer than 40 years, the uncertainty arising from ICV assessed by the second method with time-varying external forcing can be significantly larger than that assessed by the first method with a long-term control simulation, and it increases with increasing external forcing, as seen by comparison among Forcing_850, Forcing_1850, and 850 cntl. The differences in spread between 850 cntl and Forcing_1850 exceed the 95% significance level according to an F-test, except at 50 years, where the difference approaches 95% significance. This indicates that external forcing can have a significant influence on ICV, and the impact tends to increase with strengthening external forcing. The above results also indicate that the assessment of the uncertainty arising from ICV may differ with the choice of assessment method. For other ENSO characteristics, such as spectral shape (see Section 3.2), similar conclusions can be drawn, the specific details of which are not presented here.
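The segment-based calculation for the control simulation can be sketched as below. This is a minimal illustration, not the authors' code: the function name is hypothetical, the input is synthetic white noise, and dividing the spread by the mean amplitude across segments is an assumption about what "normalized standard deviation" means here.

```python
import numpy as np


def spread_vs_length(nino3_ssta, n_segments=10, step_years=10, months_per_year=12):
    """Normalized spread of ENSO amplitude across nonoverlapping segments.

    nino3_ssta: 1-D monthly Niño3 SST anomaly series from a control run.
    Returns (lengths in years, normalized spread for each length).
    """
    series = np.asarray(nino3_ssta, dtype=float)
    seg_months = series.size // n_segments
    # Each row is one nonoverlapping segment of the control simulation.
    segments = series[: seg_months * n_segments].reshape(n_segments, seg_months)
    seg_years = seg_months // months_per_year
    lengths = np.arange(step_years, seg_years + 1, step_years)
    spread = []
    for years in lengths:
        n_months = years * months_per_year
        # ENSO amplitude = SSTA standard deviation over the first `years` years.
        amplitude = segments[:, :n_months].std(axis=1, ddof=1)
        # Assumption: normalize the spread by the mean amplitude across segments.
        spread.append(amplitude.std(ddof=1) / amplitude.mean())
    return lengths, np.array(spread)


# Synthetic 1000-year monthly series (assumption: white noise, not model output).
rng = np.random.default_rng(1)
lengths, spread = spread_vs_length(rng.standard_normal(1000 * 12))
for years, value in zip(lengths, spread):
    print(f"{years:3d} years: normalized spread = {value:.3f}")
```

For the forced ensemble, the same loop would run over the rows of a members-by-months array instead of over segments of one long series.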

3.2 | Uncertainties in ENSO characteristics arising from ICV
Since the second method using a large number of simulations by the same model under the same external forcing can more comprehensively reflect the influence of ICV and has been widely used in previous studies (Deser, Knutti, et al., 2012; Kang et al., 2013; Otto-Bliesner et al., 2015; Zheng et al., 2017), the 10 ensemble members of the full-forcing simulation with CESM-LME for the period 1900-1999 in the historical experiment are used to evaluate the uncertainty in simulated ENSO properties arising from ICV (including interactions between ICV and external forcing) in this subsection. For comparison, historical experiments of CMIP3 and CMIP5 for the same period are also adopted. Multiple ENSO-related metrics, including the spectral shape, amplitude, seasonality, and precipitation (Bellenger et al., 2013; Guilyardi, Braconnot et al., 2009; Guilyardi, Wittenberg et al., 2009), are chosen for a comprehensive evaluation. The spectral shape, defined as the ratio of SST anomaly magnitude over the Niño3 region between the 3-8-year band and the 1-3-year band, measures the amplitude of the ENSO biennial component, which is a well-known and stable ENSO spectral characteristic (Bellenger et al., 2013). The modeled ENSO amplitude is evaluated here using the SSTA Stddev over the Niño3 and Niño4 (160°E-150°W, 5°S-5°N) regions. The seasonality, defined as the ratio of the average standard deviations of the Niño3 SSTA between the November-January and March-May periods, is a measure of the ENSO seasonal phase-locking character. The standard deviation of precipitation anomalies over the Niño4 region is used to evaluate the impact of ENSO on precipitation and large-scale circulation. Table 2 summarizes the standard deviations in the above ENSO-related metrics for the period 1900-1999 in the historical experiment among 22 CMIP3 models, 40 CMIP5 models, and the 10 ensemble members of the full-forcing simulation with CESM-LME.
In agreement with previous studies (Bellenger et al., 2013; Flato et al., 2013; Guilyardi, Braconnot et al., 2009; Guilyardi, Wittenberg et al., 2009; Lloyd et al., 2012), there is an evident decrease in multimodel spread for most ENSO-related metrics from CMIP3 to CMIP5, indicating the improvement of ENSO representation in current climate models. The multimodel spread results from both model uncertainty and ICV (Kay et al., 2015). As a measure of the uncertainty due to ICV, the standard deviation among the 10 ensemble members of the full-forcing simulation with CESM-LME indicates a large contribution from ICV, compared with the total spread in the CMIP ensembles. Specifically, the spread arising from ICV within the CESM-LME ensemble generally accounts for 30% of the total spread within the CMIP3 ensemble for ENSO amplitude (Niño3 and Niño4 SSTA Stddev) and ~50% of the total spread within the CMIP5 ensemble. The spreads of the SSTA Stddev among multiple CMIP5 models in the Niño3 region are not significantly different from those among the 10 ensemble members of the full-forcing simulation with CESM-LME at the 95% confidence level, according to an F-test (Figure S1, Supporting Information), which also indicates that the influence of ICV is comparable to model error in the Niño3 region. For other metrics, ICV has varying effects; for example, ICV accounts for 21% of the total spread within the CMIP5 ensemble for Niño3 seasonality and 41% for Niño3 spectral shape, but it can explain at least 15% of the total spread within the CMIP ensemble for every metric listed in Table 2. More importantly, with the decrease in multimodel spread of ENSO properties from CMIP3 to CMIP5, the contribution from ICV becomes much more significant, indicating that ICV should receive more attention when investigating the sources of uncertainty in ENSO simulations.
Note: The first member of historical runs is used for the CMIP models.
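Two of the metrics defined above, the amplitude and the seasonality, can be sketched directly from a monthly anomaly series. The function names are hypothetical, the series is assumed to start in January, and the test series is constructed by hand; the spectral shape is omitted because the text does not specify the band-filtering method.

```python
import numpy as np


def enso_amplitude(ssta_monthly):
    """ENSO amplitude: standard deviation of the monthly SST anomalies."""
    return np.asarray(ssta_monthly, dtype=float).std(ddof=1)


def enso_seasonality(ssta_monthly):
    """Ratio of the mean Nov-Jan to mean Mar-May standard deviation of SSTA.

    Assumption: the series starts in January and contains whole years;
    any trailing partial year is dropped.
    """
    series = np.asarray(ssta_monthly, dtype=float)
    n_years = series.size // 12
    by_month = series[: n_years * 12].reshape(n_years, 12)  # columns: Jan..Dec
    monthly_std = by_month.std(axis=0, ddof=1)
    ndj = monthly_std[[10, 11, 0]].mean()  # November, December, January
    mam = monthly_std[[2, 3, 4]].mean()    # March, April, May
    return ndj / mam


# Hand-built example: anomalies twice as large in Nov-Jan as in other months.
signs = np.tile([1.0, -1.0], 50)            # 100 years, alternating sign
scale = np.ones(12)
scale[[10, 11, 0]] = 2.0
example = (signs[:, None] * scale[None, :]).ravel()
print(f"amplitude   = {enso_amplitude(example):.3f}")
print(f"seasonality = {enso_seasonality(example):.3f}")
```

A seasonality value above 1 indicates that variability peaks in boreal winter, which is the phase-locking behavior the metric is meant to capture.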
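The percentages quoted above can be read as a ratio of two spreads. The sketch below assumes that the share is the ratio of standard deviations (not variances), which the text does not state explicitly; the function name and all numbers are hypothetical and are not taken from Table 2.

```python
import numpy as np


def icv_share_of_spread(single_model_members, multimodel_values):
    """Ratio of the single-model ensemble spread to the multimodel spread."""
    icv_spread = np.std(np.asarray(single_model_members, dtype=float), ddof=1)
    total_spread = np.std(np.asarray(multimodel_values, dtype=float), ddof=1)
    return icv_spread / total_spread


# Made-up Niño3 amplitudes (°C), for illustration only.
members = [1.0, 1.1, 0.9]   # members of one model (spread from ICV)
models = [1.0, 1.2, 0.8]    # one value per model (total spread)
print(f"ICV share of total spread = {icv_share_of_spread(members, models):.0%}")
```

Because the multimodel spread shrinks from CMIP3 to CMIP5 while the single-model spread stays fixed, the same numerator yields a larger share for CMIP5.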

3.3 | Dependence of ICV on simulation length
Previous studies have shown that ICV has important effects on climate change projections, especially at regional scales and subdecadal time scales (e.g., Deser, Knutti, et al., 2012; Hawkins & Sutton, 2009). The results in Section 3.1 also suggest that ICV may be influenced by simulation length. Therefore, the dependence of ICV on simulation length in ENSO simulations is assessed in this subsection. Figure 2 shows the ensemble-mean Niño3 power spectra and the associated standard deviation range among the 10 ensemble members of the full-forcing simulation with CESM-LME for different simulation lengths. The standard deviation range shows a clear reduction with increasing simulation length from 50 to 300 years, while it stabilizes at lengths longer than 300 years, indicating a threshold in simulation length at ~300 years for the influence of ICV on ENSO spectral characteristics. Figure 3 shows the standard deviations of multiple ENSO-related metrics discussed in Section 3.2 among the 10 ensemble members of the full-forcing simulation with CESM-LME, with respect to simulation length. Although the various ENSO-related metrics have different standard deviations, the shape of their variation profile with simulation length is similar. Each ENSO metric profile suggests a threshold within the range 300-400 years, above which the standard deviation stabilizes with increasing simulation length, and below which it varies considerably with simulation length. These results indicate that ICV depends on simulation length; that is, the shorter the simulation length, the larger the uncertainty arising from ICV. Considering the large uncertainties in short climate simulations, such as the century-scale historical and Representative Concentration Pathways (RCP) experiments in CMIP5, future modeling efforts should consider the influence of ICV on the simulated ENSO properties in these experiments, especially for shorter simulation lengths.
FIGURE 2 Ensemble-mean Niño3 power spectra (°C^2 [cycles month^−1]^−1) and the associated standard deviation range among the 10 ensemble members of the full-forcing runs with CESM-LME as a function of simulation length (from 50 to 1000 years). The horizontal axis of each panel is frequency (cycles month^−1). The black line represents the ensemble-mean power spectrum, and the blue range represents the standard deviation. The simulation length corresponding to all panels starts from the same time (Year 850).

Using data from the CESM-LME project, uncertainties in ENSO simulations arising from ICV were investigated. Results show that strengthening external forcing could increase the influence of ICV on simulated ENSO amplitude through interactions between ICV and external forcing when the simulation length is longer than 40 years. This might explain the results of Figure 2 in Zheng et al. (2017), where the ENSO amplitude range during 2046-2095 is larger than that during 1950-1999. ICV is an important source of uncertainty in centennial ENSO simulations, and becomes increasingly significant with the decrease in multimodel spread of simulated ENSO properties from CMIP3 to CMIP5. Moreover, the influence of ICV on ENSO characteristics depends on the simulation length: uncertainty from ICV increases with decreasing simulation length. Although an exploration of all ICV characteristics is beyond the scope of this work, the results presented here can be further discussed and provide a basis for future work in this area. For example, the ensemble mean can become relatively stable within a few decades and very close to the 1000-year mean.
For the same computational cost (e.g., a 500-year integration), the comparison of the simulated ENSO properties under different integration lengths and ensemble sizes (e.g., Figure S2) suggests that the traditional serial-in-time long-term climate integrations might be replaced by representative ensembles of shorter simulations. The same idea, applied on a shorter time scale (daily to yearly) to discern climate-relevant sensitivities in atmospheric general circulation models, has proven to be very effective (Wan et al., 2014). This idea is well worth exploring for the more costly long-term climate simulations, especially with the development of high-resolution and complex earth system models.
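The equal-cost comparison can be illustrated with a short sketch that, for a fixed budget of simulated years, forms every combination of ensemble size and run length that fits the budget and reports the ensemble-mean amplitude. The function name is hypothetical, the input is synthetic white noise rather than model output, and this is not a reconstruction of how Figure S2 was produced.

```python
import numpy as np


def equal_cost_ensemble_means(members, budget_years, months_per_year=12):
    """Ensemble-mean ENSO amplitude for each (size, length) pair within a budget.

    members: 2-D array (n_members, n_months) of monthly Niño3 SST anomalies.
    Returns a dict mapping (n_members_used, years_per_member) to the mean amplitude.
    """
    data = np.asarray(members, dtype=float)
    n_members, n_months = data.shape
    results = {}
    for k in range(1, n_members + 1):
        if budget_years % k:
            continue  # keep only splits that use the budget exactly
        years = budget_years // k
        n = years * months_per_year
        if n > n_months:
            continue  # members are too short for this split
        amplitude = data[:k, :n].std(axis=1, ddof=1)
        results[(k, years)] = amplitude.mean()
    return results


# Synthetic ensemble (assumption): 10 members of 500 years of unit-variance noise.
rng = np.random.default_rng(3)
synthetic = rng.standard_normal((10, 500 * 12))
for (k, years), mean_amp in equal_cost_ensemble_means(synthetic, 500).items():
    print(f"{k:2d} member(s) x {years:3d} years: mean amplitude = {mean_amp:.3f}")
```

If the ensemble means from the short-run configurations stay close to the single long-run value, that supports replacing one long integration with several shorter ones at equal cost.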
Because this study uses long-term initial condition (IC) perturbed ensemble members from a single coupled model, CESM1 (CAM5), the quantified results might be model-dependent. For example, the overestimated ENSO magnitude in CESM1 (CAM5) (Zheng et al., 2017) might affect the influence of ICV. However, similar results are also found in the multi-IC ensemble members of the historical experiment of the CMIP5 models (Figure S3), indicating that the conclusions of this study are generally applicable. In addition, because all available members of the full-forcing ensemble have already been used in this study, it is difficult to directly address whether 10 ensemble members are enough to make the key points reliable. Instead, the reliability of the key points can be investigated by using smaller numbers of ensemble members. As shown in Figure S4, similar conclusions (the existence of an obvious initial spread, the reduction of this spread with increasing simulation length, and the existence of a threshold in simulation length) are obtained when fewer than 10 ensemble members (e.g., 8) are used. Meanwhile, using the 1000-year 850 cntl simulation, we construct ensembles of 10, 15, and 20 overlapping segments by sliding different time windows. The results from these different ensemble sizes (Figure S5) lead to similar conclusions, indicating that the results from 10 ensemble members are reliable to some extent. Beyond that, the only difference between ensemble members is the perturbation in the atmospheric temperature fields, which means that ocean drift, which has been shown to influence ICV within the ocean temperature (Hogan & Sriver, 2017) and ENSO trends (Zheng et al., 2017), might have potential effects on the results of this study and needs to be further studied.
Although experiments such as the historical experiment and the RCP experiment in CMIP5 are ~150 and 100 years long, respectively, the simulation length is still short compared with the 300-400-year threshold of the impact of ICV on ENSO, which suggests these experiments are affected by ICV. Moreover, with the increase of external forcing in the historical and RCP experiments, the uncertainty arising from ICV may increase further. In this work, a noticeable spread in ENSO metrics, as well as a reduction in spread with increasing simulation length, is evident in results from ensemble members of the historical experiment of the CMIP5 models (Figure S3). The influence of ICV is typically difficult to evaluate due to the small number of simulations available from modeling centers. For example, as summarized in Table 1, only ~20% (7) of the 40 CMIP5 models with historical simulations provide more than five integrations, and almost 40% (15) of models provide just one integration. Similar results can be found in CMIP3. Moreover, most recent studies using multiple models from CMIP3 or CMIP5 only consider one ensemble member from each model (usually the first member) (Kitoh & Uchiyama, 2006; Sheffield, Barrett et al., 2013; Sheffield, Camargo et al., 2013). As a result, ICV is often ignored and its impact lumped in with model uncertainty. This work indicates that many of the results obtained from a single ensemble member may be affected by ICV. Therefore, future work using ENSO simulations should address the influence of ICV, especially in historical and RCP experiments. To better evaluate the contribution of ICV, more ensemble members should be provided in future modeling experiments. Finally, how to further reduce the impact of ICV is worthy of further study.

resources provided by NSF/CISL/Yellowstone for data availability.