Surface ocean CO2 in 1990–2011 modelled using a feed‐forward neural network

This dataset includes the monthly distributions of CO2 fugacity in the world's surface oceans, reconstructed using a feed‐forward neural network model and the CO2 measurements of the Surface Ocean CO2 Atlas version 2.0. It has a spatial resolution of 1° × 1° and spans a period of 22 years, from January 1990 to December 2011. The dataset also includes the parameters needed for the reconstruction and an estimate of the CO2 fluxes between the ocean and the atmosphere. The aim of this work is to provide a dataset for estimating the oceans' contribution to the global carbon budget.


Introduction
Understanding the global distribution of surface ocean CO2 (SOC) fugacity is important for accurately estimating the oceans' contribution to the global carbon budget, as indicated in Le Quéré et al. (2013) and Wanninkhof et al. (2013). However, available measurements are insufficient in most parts of the oceans for direct estimates of that contribution. Although the composite map of SOC measurements from 1990 to 2011 shown in Zeng et al. (2014) indicates that about 60% of the oceanic area was sampled, the sampled area ratio is only about 7-25% when the data are grouped by calendar month across all years (Figure 1(a)), and is even smaller, between 0% and 4%, when calculated for individual months (Figure 1(b)). With such sparse measurements, models are needed to estimate the global SOC over multiple years. However, the relatively large temporal and spatial variations in SOC make SOC modelling difficult. Recent studies showed that the amplitude of seasonal SOC changes can be 100 µatm or more (Wanninkhof et al., 2013), which is about 10 times that observed for atmospheric CO2 (e.g. Bacastow et al., 1985), and that the spatial decorrelation length scales are on the order of 100 km (Li et al., 2005) to 400 km (Jones et al., 2012), which is smaller than those of the marine atmosphere. As a result, many studies on modelling SOC have focused on the mesoscale (e.g. Zeng et al., 2002; Lefèvre et al., 2005; Sarma et al., 2006; Jamet et al., 2007; Friedrich and Oschlies, 2009; Telszewski et al., 2009; Takamura et al., 2010; Landschützer et al., 2013; Nakaoka et al., 2013; Schuster et al., 2013); on the global scale, mapping SOC has been confined to the climatology in a given year (e.g. Takahashi et al., 2009; Zeng et al., 2014).
This work extends the method of Zeng et al. (2014) to reconstruct the global monthly SOC in 1990-2011. The resulting dataset not only has a finer spatial resolution (1° × 1°) than the climatology of Takahashi et al. (2009) (4° × 5°), which is currently the most frequently used product, but also provides global SOC maps for multiple years. While the SOC climatology of Zeng et al. (2014) does not cover areas where chlorophyll data were not available, this dataset fills those gaps.
We recognize that, in areas where large spatial gaps exist, the model may over-interpolate because of its nonlinear characteristics. To minimize this problem, we implemented a three-stage modelling approach. In the first stage, we excluded the LAT and LON variables from the model, which gives model equation (2). We used equation (2) for spatial gap filling, i.e. filling grid cells with modelled fCO2 if there is no measurement within 10° of those grid cells. In the second stage, we used the measurements and the gap-filled data with equation (1) for areas where CHL data are available. To estimate fCO2 in CHL-missing areas, we excluded CHL from equation (1), which gives model equation (3). We therefore have two sets of model results in this stage: one includes the dependency of fCO2 on CHL and the other excludes that dependency. Finally, in the third stage, we merged the two results to obtain the final product. The merging process first fills grid cells with the model results of equation (1), and then with the results of equation (3). It should be noted that the criteria for identifying the open ocean grid cells in this work are slightly different from those of Zeng et al. (2014). The criteria are elevation smaller than −500 m, SST larger than −10°C, SSS larger than 25, and ice cover smaller than 50%.
[Figure 1: area ratio (%) versus year]
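The open-ocean criteria, the stage-one proximity test and the stage-three merge can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the code used to produce the dataset: the function names are hypothetical, missing values are represented as NaN, and "within 10°" is read here as a latitude/longitude box with longitude wrapping at the date line (the text does not say whether a box or a radius was used).

```python
import math

NAN = float("nan")  # assumption: missing model output is stored as NaN


def is_open_ocean(elevation_m, sst_c, sss, ice_cover_pct):
    """Apply the four open-ocean criteria quoted in the text."""
    return (elevation_m < -500.0 and sst_c > -10.0
            and sss > 25.0 and ice_cover_pct < 50.0)


def has_nearby_measurement(lat, lon, measured_cells, window_deg=10.0):
    """Stage 1 test: is any measured cell within +/- window_deg?

    Assumption: "within 10 degrees" is a latitude/longitude box,
    with longitude differences wrapped across the date line.
    """
    for m_lat, m_lon in measured_cells:
        dlat = abs(lat - m_lat)
        dlon = abs(lon - m_lon) % 360.0
        dlon = min(dlon, 360.0 - dlon)
        if dlat <= window_deg and dlon <= window_deg:
            return True
    return False


def merge_results(fco2_with_chl, fco2_without_chl):
    """Stage 3: keep the CHL-dependent value, else use the CHL-free value."""
    return [b if math.isnan(a) else a
            for a, b in zip(fco2_with_chl, fco2_without_chl)]
```

For example, `merge_results([350.0, NAN], [352.0, 361.0])` keeps 350.0 in the first cell and falls back to 361.0 in the second, which mirrors the order of filling described for the merging process.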
We used the climatology of all variables in training. The climatology of fCO2 was obtained by normalizing fCO2 measurements to the reference year of 2000, assuming a mean global increase rate of 1.5 µatm year⁻¹. For the fCO2 reconstruction, the SST climatology was replaced by the monthly SST. Ideally, the monthly SSS and CHL would also be used, but those data were not available for the whole period. Finally, the model outputs of fCO2 were adjusted by the same rate to yield the monthly fCO2 in each year. The adjustment is necessary because the model output is the detrended fCO2. Figure
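The normalization to the reference year and the later adjustment of the model output are inverse linear shifts. The sketch below shows the arithmetic only. The function names are hypothetical, and the time offset is taken as a whole-year difference from 2000, which is an assumption: the text does not state whether a fractional (monthly) time was used.

```python
RATE_UATM_PER_YEAR = 1.5  # mean global increase rate given in the text
REFERENCE_YEAR = 2000     # reference year given in the text


def normalize_to_reference(fco2_obs, year):
    """Remove the assumed linear trend so that fCO2 refers to year 2000."""
    return fco2_obs - RATE_UATM_PER_YEAR * (year - REFERENCE_YEAR)


def restore_trend(fco2_detrended, year):
    """Add the same linear trend back to a detrended model output."""
    return fco2_detrended + RATE_UATM_PER_YEAR * (year - REFERENCE_YEAR)
```

Under these assumptions a measurement of 365 µatm in 2010 is normalized to 350 µatm, and a detrended model value of 350 µatm corresponds to 335 µatm in 1990.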

Dataset location and format
The dataset in NetCDF format is archived at http://doi.pangaea.de/10.1594/PANGAEA.834398. The data file JTECH-D-13-00137.extended.nc contains the following variables.
Area - The grid cell area at 1° × 1° resolution.
Elevation - The grid mean of land topography and ocean bathymetry derived from the 1-arc-minute data of NOAA (http://www.ngdc.noaa.gov/mgg/global/global.html).
SST_MM - The monthly SST climatology derived from SST_YYMM.
Ps_MM - The surface pressure in the reference year of 2000 derived from the ECMWF's Interim data.
WND_YYMM - The monthly wind speed extracted from the ECMWF's Interim data.
Flux_YYMM - The monthly CO2 fluxes calculated by the method of Zeng et al. (2014).
Flux_YY - The sum of Flux_YYMM over the global oceans.
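As an aid to interpreting the Area and Flux variables, the sketch below computes the area of a latitude/longitude grid cell on a sphere and an area-weighted total over cells. It rests on assumptions that are not taken from the dataset documentation: a spherical Earth with a mean radius of 6371 km, flux values expressed per unit area, and NaN marking cells without data. The function names are hypothetical, and the Area variable in the file should be preferred over this approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def cell_area_km2(lat_south_deg, dlat_deg=1.0, dlon_deg=1.0):
    """Area of a lat/lon cell on a sphere: R^2 * dlon * (sin(n) - sin(s))."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + dlat_deg)
    return (EARTH_RADIUS_KM ** 2 * math.radians(dlon_deg)
            * (math.sin(lat_n) - math.sin(lat_s)))


def area_weighted_total(flux_per_area, areas):
    """Sum flux * area over cells, skipping cells whose flux is NaN.

    Assumption: flux is given per unit area in units matching `areas`.
    """
    total = 0.0
    for flux, area in zip(flux_per_area, areas):
        if not math.isnan(flux):
            total += flux * area
    return total
```

Summing `cell_area_km2` over all 180 × 360 cells recovers the surface area of the sphere, which is a convenient check on the formula. Cells shrink towards the poles, so an unweighted sum of per-area fluxes would overweight high latitudes.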