An evaluation of cloud vertical structure in three reanalyses against CloudSat/cloud‐aerosol lidar and infrared pathfinder satellite observations

Cloud fraction is a major source of uncertainty in current models. Using cloudiness fields from CloudSat/cloud‐aerosol lidar and infrared pathfinder satellite observations (CALIPSO), three widely used reanalyses, namely the Interim ECMWF Re‐Analysis (ERA‐Interim), the Japanese 55‐year Reanalysis (JRA‐55), and the Modern‐Era Retrospective Analysis for Research and Applications (MERRA‐2), are assessed for their representation of cloudiness. Results show that all three reanalyses capture the broad horizontal pattern and vertical structure of clouds seen in CloudSat/CALIPSO, yet the magnitude is markedly underestimated, particularly in JRA‐55 and MERRA‐2. In addition, all reanalyses struggle to simulate mid‐level clouds at low latitudes. Beyond these common deficiencies, the three reanalyses have their own distinctive behaviors and differ from one another. While ERA‐Interim and JRA‐55 perform better for low‐level clouds in the tropics, they markedly underestimate high‐level clouds. In contrast, MERRA‐2 succeeds in representing high‐level clouds but dramatically underestimates low‐ and mid‐level clouds at low latitudes. As a measure of the subgrid‐scale variability of moisture, the “critical relative humidity (RH c)” derived from CloudSat/CALIPSO exhibits distinctive vertical structures at different latitudes. It is thus speculated that poor specification or parameterization of RH c is responsible for these biases.


1 | INTRODUCTION
Clouds are of fundamental importance in modulating Earth's energy budget and hydrological cycle (Stephens, 2005). Over the past decades, great efforts have been devoted to improving cloud simulations in weather and climate models, yet cloud parameterization remains a major challenge and contributes one of the largest uncertainties in climate projections (Andrews et al., 2012). Clouds are typically stratified in the vertical, and different vertical structures can lead to striking differences in cloud-radiative forcing (Weare, 2000). Low-level clouds have a net cooling effect through enhanced reflection of solar radiation, while high-level clouds commonly have a net warming effect by trapping outgoing thermal emission. Cloud microphysical processes such as collision and sedimentation are also affected by cloud vertical structure, which in turn influences precipitation (Jakob and Klein, 1999). Knowledge of cloud vertical structure is therefore important for both cloud radiative transfer and microphysical processes.
However, such information was not available on a global scale prior to the launch of NASA CloudSat/CALIPSO, which joined the A-Train in April 2006. Passive sensors such as those used by the International Satellite Cloud Climatology Project (ISCCP) detect clouds based on the integrated effect of the properties of the whole atmospheric column, which makes it impossible to retrieve clouds layer by layer (Rossow and Schiffer, 1999). Field campaigns with active sensors such as surface cloud profiling radar (CPR) can provide cloud vertical structure (Dong et al., 2006), yet they cannot fully sample the global variability in clouds, especially over the oceans. Carrying a 94 GHz cloud-profiling radar, CloudSat provides an unprecedented amount of information on cloud vertical properties on a global scale. While the radar on CloudSat can penetrate thick clouds, the lidar on CALIPSO is capable of detecting weak vapor condensation and thus optically thin cirrus clouds (Stephens et al., 2002). Hence, combining the radar and lidar instruments makes full use of their complementary capabilities for better cloud detection. Over the past decade, CloudSat/CALIPSO have been widely used for cloud investigation and model evaluation (Luo et al., 2011; Yan et al., 2016; Yan et al., 2017; Yamauchi et al., 2018). In the cloud modeling community, a great challenge remains in how to represent subgrid-scale cloud condensation and fractional cloudiness (Quaas, 2012; Wang et al., 2015). Almost all climate models suffer from poor cloudiness simulation (Zhang et al., 2005). The situation is not much better for numerical weather models, in spite of their finer resolution and advanced assimilation systems. For example, Stengel et al. (2018) point out that the total cloud cover in ERA-Interim is generally too low nearly everywhere on the globe except in polar regions.
Naud et al. (2014) found that cloud fraction over the southern oceans is slightly underestimated in ERA-Interim but severely underestimated in MERRA. In this study, we use CloudSat/CALIPSO to comprehensively evaluate cloudiness fields in three widely used reanalysis products, with the aim of identifying their common deficiencies as well as their distinct behaviors. By diagnosing the so-called "critical relative humidity (RH c)" from CloudSat/CALIPSO and auxiliary ECMWF products, we explain why cloudiness biases occur and shed light on potential improvements. The paper is organized as follows. Section 2 describes the data and method. Section 3 presents a critical evaluation of cloud fraction in the reanalyses against CloudSat/CALIPSO. Section 4 discusses the possible causes of the cloudiness biases and potential ways toward future improvement. The last section gives a summary.

2 | DATA AND METHOD
CloudSat/CALIPSO generates 37,081 profiles along each orbit and 125 bins for each profile. To determine whether a pixel is cloudy or not, following Barker (2008), we use a combination of the CPR_Cloud_mask and Radar_Reflectivity fields from 2B-GEOPROF and the CloudFraction field from 2B-GEOPROF-LIDAR. Each volume is classified as cloudy if either of the following two conditions is satisfied: (a) CPR_Cloud_mask ≥ 20 and Radar_Reflectivity ≥ −30 dBZ, or (b) CloudFraction ≥ 99%. We take a horizontal resolution of 2.5° × 2.5° and a vertical resolution of 25 hPa as the standard grid size. The cloud fraction is then defined as the ratio of the number of cloudy pixels to the total number of pixels within a grid box at each layer. Similarly, the total cloud cover is defined as the ratio of the number of cloudy columns, that is, columns containing at least one cloudy pixel in the vertical, to the total number of columns within each grid box. In addition to CloudSat/CALIPSO, the monthly ISCCP-D2 dataset is also used as a guiding reference (Rossow and Schiffer, 1999).
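The classification and gridding rules above can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the study: the function names are hypothetical, the sketch assumes each profile has already been mapped to one pixel per 25 hPa layer inside a single grid box, and it ignores missing or fill values that real 2B-GEOPROF granules contain.

```python
def is_cloudy(cpr_cloud_mask, radar_reflectivity_dbz, lidar_cloud_fraction_pct):
    """Apply the two criteria from the text to a single radar volume."""
    # Condition (a): confident radar detection with sufficient reflectivity.
    radar_hit = cpr_cloud_mask >= 20 and radar_reflectivity_dbz >= -30.0
    # Condition (b): lidar reports the volume as (almost) fully cloud filled.
    lidar_hit = lidar_cloud_fraction_pct >= 99.0
    return radar_hit or lidar_hit


def cloud_fraction_profile(cloudy):
    """Per-layer cloud fraction for one grid box.

    cloudy[p][k] is True when profile p is cloudy at layer k
    (assumed layout: one pixel per profile per layer).
    """
    n_profiles = len(cloudy)
    n_layers = len(cloudy[0])
    return [sum(profile[k] for profile in cloudy) / n_profiles
            for k in range(n_layers)]


def total_cloud_cover(cloudy):
    """Fraction of columns with at least one cloudy pixel in the vertical."""
    return sum(any(profile) for profile in cloudy) / len(cloudy)


# Four hypothetical profiles with three layers each.
flags = [
    [True, False, False],
    [False, False, False],
    [True, True, False],
    [False, False, False],
]
print(cloud_fraction_profile(flags))  # [0.5, 0.25, 0.0]
print(total_cloud_cover(flags))       # 0.5
```

Note that the total cloud cover (0.5) is not the sum or maximum of the layer fractions in general; it depends on how cloudy pixels overlap in the vertical within each column.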
The three reanalyses to be assessed are ERA-Interim from the European Centre for Medium-Range Weather Forecasts (ECMWF) (Dee et al., 2011), the Japanese 55-year Reanalysis (JRA-55) (Ebita et al., 2011), and the Modern-Era Retrospective Analysis for Research and Applications (MERRA-2) (Gelaro et al., 2017). All are taken over the same period, from 2007 to 2010, to match the CloudSat/CALIPSO missions. The cloudiness fields to be assessed are the 3D cloud vertical structure (CF 3D) and the total cloud cover (CF T). These variables are processed to the same horizontal resolution as the gridded CloudSat/CALIPSO data. Auxiliary variables such as relative humidity (RH) from ECMWF products are used for the RH c diagnosis. For brevity, the main characteristics of all the above data are listed in Table 1.

3 | RESULTS AND ANALYSIS
First, we compare the total cloud cover in the reanalysis products against CloudSat/CALIPSO and ISCCP-D2. Figure 1 shows the global distribution of CF T for CloudSat/CALIPSO and the differences between CloudSat/CALIPSO and the other datasets. Positive (negative) values mean that CloudSat/CALIPSO gives a larger (smaller) CF T than a given dataset. From top to bottom, the panels show CloudSat/CALIPSO, ISCCP-D2, ERA-Interim, JRA-55, and MERRA-2. Consistent with Luo et al. (2017), the CF T from CloudSat/CALIPSO is slightly larger than that from ISCCP-D2 at low latitudes, presumably because CloudSat/CALIPSO detects more broken cumulus clouds and high-level cirrus clouds. Overall, all three reanalyses reproduce the spatial pattern of CloudSat/CALIPSO well, for example, the cloudiness centers along the storm tracks and in the middle and high latitudes of the southern hemisphere. The clouds do not show an obvious seasonal change except in the tropics, where they move in step with the seasonal shift of the global monsoon systems and the Intertropical Convergence Zone (ITCZ). The biases in magnitude are, however, considerable. All reanalyses underestimate CF T in comparison with CloudSat/CALIPSO, especially JRA-55 and MERRA-2, where the underestimation reaches as much as 30% over northern hemisphere land. Overall, ERA-Interim most closely resembles CloudSat/CALIPSO among the three reanalyses, despite slight underestimations over land surfaces. As will be shown next, the bias in CF T can to some extent be attributed to the bias in 3D fractional cloudiness (CF 3D). Figure 2 gives the zonally averaged cloud vertical structure from CloudSat/CALIPSO and the three reanalyses. For both summer and winter, CloudSat/CALIPSO exhibits pronounced cloudiness in the lower troposphere at middle and high latitudes, which corresponds to the large CF T in Figure 1.
In the tropics, high-level clouds above 400 hPa shift across the equator from June-August (JJA) to December-February (DJF), in accordance with the seasonal movement of the ITCZ. Meanwhile, due to the prevailing large-scale subsidence in the descending branch of the Hadley cells, few clouds persist over the subtropical zones, and those that do are mainly trade cumuli. Compared with CloudSat/CALIPSO, cloud fraction in all reanalyses is clearly underestimated in spite of similar vertical structures, which is in line with the underestimation in CF T. Moreover, all reanalyses fail to reproduce the vertical cloud structures in the tropics that extend from the surface up to 200 hPa. Again, ERA-Interim is closest to CloudSat/CALIPSO, although considerable biases are still observed. While JRA-55 performs comparably to ERA-Interim for low-level clouds in the tropics, it clearly underestimates high-level clouds. In contrast, MERRA-2 succeeds in representing high-level clouds but dramatically underestimates low and mid-level clouds at low latitudes. Stratocumulus and shallow cumulus clouds have important implications for cloud feedbacks in climate models (Stephens, 2005). The cloud transition, referring to the transition from stratocumulus to shallow cumulus and then to deep convective clouds, remains the subject of numerous studies (Teixeira et al., 2011). Cloud transitions along the Global Energy and Water Cycle Experiment Cloud System Study/Working Group on Numerical Experimentation (GCSS/WGNE) Pacific Cross-Section Intercomparison (GPCI) transect and seasonal evolutions over the southeastern equatorial Pacific (SEP) are two such cases (Yu et al., 2017). Figure 3 presents the cloud transition along the GPCI transect and the seasonal evolution over the SEP region in the three reanalyses and CloudSat/CALIPSO. The GPCI cross-section connects the two locations 1°S, 173°W and 35°N, 125°W.
Along the GPCI transect, low-level clouds decrease while high-level clouds increase with increasing sea surface temperature (SST) from the west coast of California to the equator. Similarly, over the SEP region, low-level clouds decrease while high-level clouds increase from JJA to DJF. In general, all three reanalyses capture the broad cloud transition along the GPCI transect and over the SEP region, yet each shows its own distinctive characteristics. Low-level clouds to the west of California are well reproduced in ERA-Interim, whereas they are only marginally present in JRA-55 and MERRA-2. High-level clouds in the tropics are well represented in MERRA-2 but remain significantly underestimated in ERA-Interim and JRA-55. For clouds over the SEP region, all reanalyses exhibit significant underestimation throughout the troposphere. The low and mid-level clouds in the tropics are dramatically underestimated in MERRA-2, and to a lesser degree in JRA-55 and ERA-Interim. High-level clouds are somewhat overestimated in MERRA-2, whereas they are moderately underestimated in JRA-55 and ERA-Interim. The bias characteristics along the GPCI transect and over the SEP region agree with those found in Figure 2, suggesting that these clouds are potentially representative of clouds globally.
FIGURE 2 Zonally averaged cloud vertical structure from (a-c) CloudSat/CALIPSO, (d-f) ERA-Interim, (g-i) JRA-55, and (j-l) MERRA-2 datasets at different seasons (unit: %)
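To make the GPCI cross-section concrete, the following Python sketch generates sampling points between the two endpoints quoted in the text and maps each point to a 2.5° grid cell. It is only an illustration under stated assumptions: the points are spaced linearly in latitude and longitude rather than along a great circle, the grid is assumed to start at 90°S, 180°W, and the helper names `transect_points` and `grid_cell_index` are hypothetical and not taken from the study.

```python
def transect_points(start, end, n_points):
    """Return n_points (lat, lon) pairs evenly spaced between start and end.

    Latitudes are north-positive and longitudes east-positive,
    so 173°W is written as -173.0.
    """
    if n_points < 2:
        raise ValueError("need at least the two endpoints")
    (lat0, lon0), (lat1, lon1) = start, end
    points = []
    for i in range(n_points):
        t = i / (n_points - 1)  # 0 at the start point, 1 at the end point
        points.append((lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0)))
    return points


def grid_cell_index(lat, lon, cell_deg=2.5):
    """Row and column of the grid cell containing (lat, lon).

    Assumed layout: rows counted northward from 90°S,
    columns counted eastward from 180°W.
    """
    n_rows = int(180 / cell_deg)
    n_cols = int(360 / cell_deg)
    row = min(int((lat + 90.0) // cell_deg), n_rows - 1)
    col = int((lon + 180.0) // cell_deg) % n_cols
    return row, col


# Endpoints from the text: 1°S, 173°W and 35°N, 125°W.
gpci = transect_points((-1.0, -173.0), (35.0, -125.0), 13)
for lat, lon in gpci:
    print(lat, lon, grid_cell_index(lat, lon))
```

With 13 points the spacing is 3° in latitude and 4° in longitude, so neighboring points usually fall in different 2.5° cells; a cross-section of CF 3D would then be read from the gridded profile in each cell.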

4 | DISCUSSIONS
The reason why cloudiness biases occur in reanalyses touches on the basis of cloudiness parameterization in general circulation models (GCMs). In the majority of current models, cloud fraction at each level is diagnosed either by an empirical RH-based formula or by a PDF-based statistical scheme. The key to cloudiness parameterization lies in how to properly account for the subgrid-scale variability of moisture (Tompkins, 2003), which is usually measured by the "critical relative humidity (RH c)" (Quaas, 2012) and expressed as
$$\mathrm{RH}_c = 1 - \frac{1 - \mathrm{RH}}{\left(1 - \mathrm{CF}_{3D}\right)^{2}} \qquad (1)$$
for 0 < CF 3D < 1, where RH denotes the grid-mean relative humidity. Using CF 3D from CloudSat/CALIPSO and RH from auxiliary ECMWF products, we derive the averaged observational RH c at different latitudes, which is shown in Figure 5 with its standard deviation (SD) overlaid in each subplot. At low and middle latitudes, the observational RH c first decreases with altitude and then increases upward. At high latitude, the observed RH c decreases with altitude in the upper troposphere, producing a second peak around 300 hPa. It is thus inappropriate to use one fixed RH c profile everywhere, as in many previous studies (Lohmann and Roeckner, 1996; Quaas, 2012). The specification or parameterization of RH c should accurately account for both horizontal and vertical variations. In Figure 5 we also overlay the RH c profiles for the three reanalyses, which to some extent measure the subgrid-scale variability of moisture in the models. Overall, the RH c profiles in the reanalyses are larger than the observed profile, especially at low and middle latitudes. The RH c profiles for ERA-Interim and JRA-55 more closely resemble the observation from the surface up to 300 hPa, while the RH c profiles for MERRA-2 are comparable to, or even smaller than, the observed RH c in the upper atmosphere. Note that the "observed" RH c is not perfect, as it relies on RH from ECMWF profiles. This also explains why RH c in ERA-Interim more closely resembles the observation.
The biases in the RH c profiles are consistent with the biases in cloud fraction. As implied by Equation (1), a large (small) value of RH c typically leads to an underestimation (overestimation) of cloud fraction, given that RH in the reanalyses closely resembles reality owing to data assimilation. Put differently, an overestimation (underestimation) of cloud fraction is presumably due to too small (too large) a specification of RH c in the models. For climate models, the bias in RH adds one more source of uncertainty, in addition to the uncertainty in RH c.
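The link between RH c and cloud fraction can be illustrated numerically. The sketch below assumes that Equation (1) takes the Sundqvist-type form used by Quaas (2012), CF 3D = 1 − sqrt((1 − RH)/(1 − RH c)), which inverts to RH c = 1 − (1 − RH)/(1 − CF 3D)^2 for 0 < CF 3D < 1. The function names are hypothetical, and RH, RH c, and cloud fraction are all expressed as fractions between 0 and 1 rather than percentages.

```python
import math


def cloud_fraction_from_rh(rh, rh_c):
    """Diagnose cloud fraction from grid-mean RH and critical RH.

    Assumed form: CF = 1 - sqrt((1 - RH) / (1 - RH_c)).
    """
    if rh <= rh_c:
        return 0.0  # below the critical value no cloud forms
    if rh >= 1.0:
        return 1.0  # a saturated grid box is fully cloudy
    return 1.0 - math.sqrt((1.0 - rh) / (1.0 - rh_c))


def critical_rh(rh, cloud_fraction):
    """Invert the diagnostic to obtain RH_c from RH and cloud fraction.

    Only defined for partial cloudiness, 0 < CF < 1.
    """
    if not 0.0 < cloud_fraction < 1.0:
        raise ValueError("RH_c is only diagnosed for 0 < CF < 1")
    return 1.0 - (1.0 - rh) / (1.0 - cloud_fraction) ** 2


# Round trip with hypothetical values: RH = 0.8 and CF = 0.3.
rh_c = critical_rh(0.8, 0.3)
print(rh_c)
print(cloud_fraction_from_rh(0.8, rh_c))

# At fixed RH, a larger RH_c yields a smaller diagnosed cloud fraction.
print(cloud_fraction_from_rh(0.8, 0.5), cloud_fraction_from_rh(0.8, 0.7))
```

Under this assumed form, the last line shows the behavior described in the text: raising RH c from 0.5 to 0.7 at the same grid-mean RH lowers the diagnosed cloud fraction, so an RH c that is too large produces an underestimate.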

5 | CONCLUSIONS
By using the CloudSat/CALIPSO data, we evaluated cloud fraction in three reanalyses (ERA-Interim, JRA-55, and MERRA-2) to identify their quality and reliability for cloud simulation. It is found that while all reanalyses can basically capture the horizontal pattern and vertical structure as in CloudSat/CALIPSO, they show considerable biases against satellite retrievals and differ from one another as well.
The most pronounced feature is that all reanalyses underestimate the total cloud cover relative to CloudSat/CALIPSO to some degree, especially JRA-55 and MERRA-2, with the bias reaching as much as 30%. Further analysis demonstrates that these underestimations are closely related to the underestimation of 3D cloud fraction. All reanalyses struggle to simulate mid-level clouds at low latitudes. We further assessed the transition of clouds along the GPCI transect and the seasonal evolution over the SEP region. While all reanalyses generally reproduce the transition from stratocumulus to shallow cumulus and eventually to deep convective clouds, they exhibit considerable biases in comparison with CloudSat/CALIPSO. ERA-Interim and JRA-55 perform better for low and mid-level clouds but clearly underestimate high-level clouds, whereas MERRA-2 succeeds in representing high-level clouds but dramatically underestimates low and mid-level clouds. We also extended the analysis to East Asia and found bias characteristics similar to those along the GPCI transect and over the SEP region. The "critical relative humidity (RH c)" derived from CloudSat/CALIPSO exhibits distinctive vertical structures at different latitudes. The RH c diagnosed from the reanalyses varies with latitude, yet it does not match the observation well. Moreover, the values of RH c in all reanalyses are larger than the observed values in the lower troposphere, especially for MERRA-2. This explains why cloudiness biases occur in the reanalyses. Hence, the subgrid-scale variability of moisture should be accurately specified or parameterized in models. Failing to do so would yield biases in cloud fraction.