Moist convection and its upscale effects in simulations of the Indian monsoon with explicit and parametrized convection

In common with many global models, the Met Office Unified Model (MetUM) climate simulations show large errors in Indian summer monsoon rainfall, with a wet bias over the equatorial Indian Ocean, a dry bias over India, and with too weak low‐level flow into India. The representation of moist convection is a dominant source of error in global models, where convection must be parametrized, with the errors growing quickly enough to affect both weather and climate simulations. Here we use the first multi‐week continental‐scale MetUM simulations over India, with grid spacings that allow explicit convection, to examine how convective parametrization contributes to model biases in the region.


Introduction
The Indian monsoon (Sperber et al., 2013) is the largest annual reversal in synoptic patterns of wind and rainfall in the world. Its summer rains are critical, socially and economically, to the more than one billion people of the Indian subcontinent. Most of India receives more than 80% of its annual rainfall during the summer monsoon months of June to September (Venkateswarlu and Rao, 2013). It is estimated that a severe drought year reduces the gross domestic product of India by 2-5%, and that this has not changed in the last 50 years (Gadgil and Gadgil, 2006). In May 2002 there was no indication from any empirical or atmospheric general circulation model that all-India rainfall in June and July would be 30% below normal (19% deficit for June to September), with a similar failure in 2004, when there was a seasonal (June to September) rainfall deficit of 13% (Gadgil et al., 2002, 2005). Improving forecasts for the Indian summer monsoon, on all time-scales, has been linked to a need for a better understanding of the role of deep convection in the Tropics. The two major regions of rainfall are the Western Ghats, a mountain range running parallel to the western coast of the Indian peninsula, and the Ganges-Mahanadi Basin (GB) in northeast India (Figure 1). There is also a region that runs northwest from the head of the Bay of Bengal, often referred to as the monsoon zone (Sikka and Gadgil, 1980) or monsoon trough region (MT in Figure 1), where transient low pressure systems (LPSs) which form in the Bay of Bengal or northeast India generate a significant fraction of the total Indian summer monsoon rainfall (Yoon and Chen, 2005). The rainfall variability in the monsoon trough is highly correlated with all-India summer monsoon rainfall (Gadgil, 2003), and so an improved prediction of variability in this region should also project onto the larger-scale predictability.
While global climate models (GCMs) perform reasonably well on the global scale, they fail to resolve important local- to regional-scale processes (Karmacharya et al., 2015). Most typically exhibit a systematic wet bias over the equatorial Indian Ocean, and a dry bias over central India (Sperber et al., 2013). Higher-resolution regional climate models (RCMs), which are able to represent regional forcings, feedbacks, and processes, improve the representation of rainfall in the Indian summer monsoon, particularly over regions of steep orography such as the Himalayas and Western Ghats (Rupa Kumar et al., 2006). However, Lucas-Picher et al. (2011) show significant differences in the representation of the Indian monsoon by a number of RCMs forced with lateral boundary conditions from the 45 year European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40) for the period 1981-2000, highlighting the fact that they fail to properly represent important feedbacks and processes, even when biases introduced by the driving model are reduced.
The representation of convection is a dominant source of error in global models (Jung et al., 2010; Sherwood et al., 2014), and there is evidence that the errors are primarily due to physical processes that occur on a short enough time-scale (within the first few days, often the first 24 h) to affect both weather and climate models (Murphy et al., 2004; Rodwell and Palmer, 2007). Improvements to convective parametrization schemes, based on weather models, should therefore also lead to improvements in climate models. It is expected that in the next 10 years, accounting for increases in computing power, global models of weather and climate will run at grid spacings ranging from several kilometres to about 100 km (Holloway et al., 2012b). Therefore it will be necessary to parametrize convection for the foreseeable future.
Convective parametrization schemes typically produce too many light rain events, too few heavy rain events, and have a diurnal cycle of continental precipitation that peaks too early in the day (Betts and Jakob, 2002;Randall et al., 2003;Guichard et al., 2004;Stephens et al., 2010;Dirmeyer et al., 2012). The intensity and frequency of precipitation influences cloud formation and associated radiative effects, aerosol effects on the radiation balance, latent heating in the atmosphere, and surface hydrological processes (Stephens et al., 2010). Large amounts of moisture in the lower troposphere over India during the summer mean that small perturbations can lead to cloud formation and precipitation. Ground heating of the lower atmosphere due to insolation, which increases the lower-tropospheric instability, is an important control on the diurnal cycle of summertime convection and precipitation over the subcontinent. The diurnal cycle associated with this large and well-defined solar forcing is a fundamental mode of variability in the atmosphere, and as such has been suggested to be an important test for the correctness of any model (Yang and Slingo, 2001). In addition, mesoscale circulations such as land-sea breezes, katabatic-anabatic winds, or mountain-valley winds can modulate the precipitation regime and produce a diurnal cycle with distinct regional variations.
Model configurations with small enough grid spacings to allow convection to be explicitly resolved are known to give a more realistic diurnal cycle of precipitation in the Tropics, with rainfall typically peaking over land in the late afternoon (Guichard et al., 2004;Dirmeyer et al., 2012), and give a better rainfall intensity distribution, but overestimate the amount (Weisman et al., 1997;Holloway et al., 2012b). For the West African monsoon, when run over large domains for many days, convection-permitting simulations have been shown to be much better on the continental scale, due largely to their improved representations of triggering, organisation and the diurnal cycle of precipitation (Marsham et al., 2013;Birch et al., 2014).
As part of the Earth system Model Bias Reduction and assessing Abrupt Climate project (EMBRACE; a collaboration between 19 European partners, with the goal of improving Earth System Models), we analyse a suite of Met Office Unified Model (MetUM) simulations of a 3 week period of the 2011 Indian summer monsoon, over a domain size large enough to capture the monsoon system.
Model configurations with sufficiently high horizontal resolution to permit the explicit resolution of cloud systems, and temporal and spatial domain size large enough to allow the representation of convection to affect the continental-scale circulation, are compared with observational data and parametrized convection model configurations of the same period. Biases are expected in the convection-permitting simulations, particularly as grid spacing increases, but the similarities among them, and their differences from the parametrized convection simulations, provide a unique insight into convection and its upscale effects in the Indian monsoon.
Section 2 describes the EMBRACE simulations and observational datasets. Section 3 presents differences in rainfall and other diagnostics between the simulations, along with their biases compared to the satellite rainfall retrievals and surface and upper-air observations, and discusses the link between the rainfall differences and the larger-scale aspects of the monsoon. Section 4 gives a summary of the results and discussion.

Methods
All simulations use the UK Met Office Unified Model (MetUM) version 8.2. The fully compressible non-hydrostatic deep-atmosphere equations of motion are solved using a semi-implicit, semi-Lagrangian scheme (Davies et al., 2005). It uses a staggered Arakawa C-grid in the horizontal and a terrain-following hybrid-height Charney-Phillips vertical grid. There is a comprehensive set of parametrizations for processes too complex or small-scale to be physically represented, such as surface exchange (Essery et al., 2001), boundary-layer mixing (Lock et al., 2000), mixed-phase cloud microphysics (Wilson and Ballard, 1999), and an optional mass flux convective parametrization scheme (Gregory and Rowntree, 1990).
The simulations (Table 1) are a suite of regional MetUM simulations of a 21 day period starting 18 August 2011 0000 UTC, which was the most anomalously wet period (giving the best signal-to-noise ratio) of the 2011 Indian summer monsoon (domains in Figure 1). There are 2.2, 4, 8, and 12 km grid spacing simulations that treat convection explicitly, with no convective parametrization and a 3D Smagorinsky scheme for  sub-grid mixing. The simulations were originally run at the Met Office to examine the stratospheric gravity wave field above deep tropical convection (Bushell et al., 2015). While grid-spacings of 8 and 12 km would normally be considered too coarse to model without a convective parametrization, the overlap in grid spacings allows the effects of the representation of convection to be isolated from those due to grid spacing (as in Marsham et al., 2013, for the West African monsoon). Simulations with parametrized convection at grid spacings of 8, 12, 24 km (comparable with many global numerical weather prediction models) and 120 km (comparable with many climate models) use the MetUM Global Atmosphere 4.0 (Walters et al., 2014) configuration, with a 1D boundary-layer scheme for the sub-grid mixing. All of these simulations have a rotated-pole horizontal grid. The convection-permitting simulations are configured as in the operational MetUM variable grid spacing NWP model configuration (UKV; Cullen, 1993), but with the differences listed in Table S1. The simulations are nested directly within the MetUM N512L70 (∼ 24 km horizontal grid spacing) global model, which is reinitialized every 6 h with Met Office operational analyses, and provides hourly local boundary conditions for the free-running simulations. Sea surface temperatures (SSTs) are prescribed and are updated daily from Operational Sea Surface Temperature and Sea Ice Analyses (OSTIA; Donlon et al., 2012). 
For reasons of data volume, a limited-area simulation with the same grid and configuration as the global simulation, which is also re-initialised every 6 h, has been used to provide the global model output, which is considered to be the model analysis for the purpose of comparison with the free-running simulations ('Driving', domain in Figure 1).
Three satellite rainfall retrieval products are used for comparison with the model simulations. The Tropical Rainfall Measuring Mission (TRMM) 3B42 (version 7) rainfall product (Huffman et al., 2007) combines precipitation estimates from multiple satellites, and is bias-corrected with rain-gauge data. It has a 0.25° by 0.25° spatial grid spacing, and is 3-hourly. The CMORPH (CPC Morphing technique) product (Joyce et al., 2004; Xie et al., 2013) is on an 8 km horizontal grid and is half-hourly. It combines precipitation estimates from existing low-orbiter microwave rainfall retrieval algorithms with spatial propagation information from infrared satellite data, which are then adjusted with daily rain-gauge analysis. The Global Satellite Mapping of Precipitation (GSMAP) product (Mega et al., 2014) has a grid spacing of 0.1° and a time resolution of 1 h, and uses an algorithm to combine microwave radiometer and infrared data from multiple satellites, which are then adjusted with daily rain-gauge analysis.
One notable difference between these products is the use of global analysis (Japan Meteorological Agency) data, which include precipitation profiles, in the GSMAP algorithm, while TRMM and CMORPH do not use general circulation model data in their algorithms.
In an analysis of the performance of TRMM 3B42 and GSMAP satellite rainfall products over India, Prakash et al. (2015b) find that while they are capable of representing large-scale spatial features and capture interannual variability, there are region-specific biases, and significant biases in rainfall amount over India (±20%). Xin-Xin et al. (2015) find good agreement in the diurnal cycle of rainfall in TRMM and CMORPH products over most of the study domain except, notably, the Tibetan Plateau. In a comparison study of biases in TRMM 3B42 versions 6 and 7, Prakash et al. (2015a) find an overall improvement of 5-10% in V7 over high rainfall regions on the west coast of India and in the northeast and central regions of the country, but there are still large biases in central India regions where monsoon LPSs are common.
Unlike these studies, all the satellite rainfall products used in this study are adjusted with rain-gauge data, but at the time of writing, there was no quantitative assessment of their differences over the study domain. Consequently, multiple satellite rainfall products have been used to allow some understanding of the possible error in these products.
Sea-level pressures (SLPs) measured at three surface stations (Patna, Port Blair and Minicoy in Figure 1), and radiosonde sounding data from Minicoy are compared with the simulations (Durre et al., 2006;Met Office, 2015).

Mean pattern of rainfall
The mean modelled distributions of rainfall are strongly affected by the representation of convection. Figure 2 shows distributions for selected simulations, with plots for other model configurations, and CMORPH and GSMAP in the supporting information (Figure S1). Simulations with parametrized convection give smooth distributions, while explicit convection gives much more patchy rainfall. More coarsely resolved explicit convection produces excessive rain over the ocean, which is consistent with past studies (Holloway et al., 2012a,b). TRMM (Figure 2(a)) shows regions of higher rainfall over the Himalayas, the Myanmar coast, the Bay of Bengal, and the Western Ghats; all the simulations produce excessive rain over the orography of the Himalayas and the west coast of Myanmar, and are too dry over the Bay of Bengal and the north of the Western Ghats. Model performance in the monsoon trough region is discussed below.
The band of monsoon trough rainfall is further north in all the convection-permitting simulations, compared to TRMM (Figures 2(a)-(c)), such that there is a positive/negative dipole in the differences (Figure 2(e) and (f)). In the parametrized simulations, the band of maximum rainfall over central India is further south (Figure 2(d)), in better agreement with TRMM, but there is deficient rainfall there and excess rainfall extending northwards to the Himalayas (Figure 2(d)), so that the dipole of rainfall difference is due to a relatively consistent spread of rainfall over central India north of 20°N, rather than a difference in the location of the rainfall maximum. Mean total rainfall amounts in the monsoon trough from 22 August to 6 September are between 242 and 250 mm for the three satellite rainfall retrieval products, which is relatively well captured by 2.2E, 4E, and 8E (242, 239, 237 mm respectively), although 12E produces significantly less (212 mm). The parametrized simulations produce much less in the monsoon trough, with 8P and 12P total rainfall at 175 and 174 mm respectively. A large proportion of the rainfall in the monsoon trough comes from the propagation of a LPS northwest across India from the Bay of Bengal (discussed further in section 3.2), and differences in the position of the band of monsoon trough rainfall in the free-running simulations are mostly due to the path it takes. It is not clear from these mean spatial fields of rainfall alone that, for example, 2.2E gives a better representation than 8P of this 21 day period. The driving simulation has the lowest rainfall biases (Figure 2(h)) which, as it is re-initialised every 6 h, is to be expected. The rainfall biases over the subcontinent in 2.2E may appear to be larger than those in 8P, but this is due largely to the position of the band of maximum rainfall which, in turn, is due to the path a LPS takes.
Biases in both the convection-permitting and parametrized simulations, such as the deficient rainfall over the Bay of Bengal, can also still be useful in highlighting biases which are, to some degree, insensitive to changing grid spacing or the representation of convection. As will be shown, the convection-permitting simulations do give a better representation of a number of aspects of the rainfall. The convection-permitting simulations also give a significantly different representation of other aspects of the monsoon system, and it is the link between convection and these differences that we aim to better understand here.

Temporal variability in rainfall
The total rainfall, the diurnal cycle of rainfall and rainfall intensities are all much more strongly dependent on the representation of convection than on model grid spacing. Figure 3(a) shows that, over the subcontinent as a whole, the convection-permitting simulations consistently rain more than the satellite retrievals and the parametrized simulations, with the exception of the rainfall minimum centred around 25 August. There is a clear initial 4 day spin-up for the convection-permitting simulations over land; this presumably results from the time required for convective-scale circulations to develop and the adjustment of the large-scale state of the convection-permitting simulations to their preferred atmospheric state, from that of the MetUM operational global model, which parametrizes convection. Even after this spin-up, the convection-permitting simulations tend to rain more than observed over the subcontinent (Figure 3(a)). Over the ocean (Figure 3(c)), it is not clear whether there is a spin-up; if there is, it may be shorter (1-2 days).
Table 2. Pearson Correlation Coefficients (PCCs) between the daily mean rainfall retrievals from TRMM or CMORPH, and the other satellite rainfall retrievals and a number of simulations, for the period 22 August to 7 September, after the convection-permitting simulations have spun up. All correlations are performed after coarse-graining to the TRMM 0.25° (∼27 km) horizontal grid. Regions are described or shown in Figure 1.
There is a large spread in the satellite estimates of total mean rainfall over the subcontinent after spin-up, with CMORPH closer to the parametrized free-running simulations and driving (both ∼0.3 mm h−1), and TRMM closer to the convection-permitting simulations (∼0.37 and ∼0.39 mm h−1 respectively).
Among the free-running simulations, 2.2E, 4E and 8E capture the day-to-day variability over the subcontinent in TRMM the best, between 22 August and 7 September (after the spin-up period) with the highest Pearson Correlation Coefficients (PCC) of 0.5, 0.57, 0.46 respectively (Table 2), although the PCC between TRMM and 12E is very low (0.1). Among the parametrized simulations, there is an increase of PCC with grid spacing in 8P, 12P, and 24P, which is similar to 120P (0.35, 0.36, 0.45, 0.46 respectively). This increase in correlation as grid spacing increases is an interesting result, but further investigation is beyond the scope of this article. The driving simulation, compared to TRMM, captures the day-to-day variability over the subcontinent better than the free-running simulations, with a PCC of 0.68. This is within the spread of the PCCs among the satellite rainfall retrievals (0.6-0.88), which is higher than the PCCs between all of the free-running simulations and TRMM.
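The correlation procedure described above (coarse-grain to a common grid, then correlate daily means) can be sketched as follows. This is a minimal illustration, not the authors' processing code: it assumes the fine grid is an integer multiple of the coarse grid so that simple block averaging can stand in for a proper regridding, and the function names `coarse_grain` and `pearson` and all input values are hypothetical.

```python
import numpy as np


def coarse_grain(field, factor):
    """Block-average a 2D field by an integer factor along each axis.

    Rows and columns that do not fill a complete block are dropped.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    ny_c, nx_c = ny // factor, nx // factor
    trimmed = field[:ny_c * factor, :nx_c * factor]
    return trimmed.reshape(ny_c, factor, nx_c, factor).mean(axis=(1, 3))


def pearson(x, y):
    """Pearson correlation coefficient between two equally sized arrays."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    xa = x - x.mean()
    ya = y - y.mean()
    return float((xa * ya).sum() / np.sqrt((xa ** 2).sum() * (ya ** 2).sum()))


# Toy example: a 4 x 4 field averaged onto a 2 x 2 grid.
fine = np.arange(16.0).reshape(4, 4)
coarse = coarse_grain(fine, 2)

# Made-up daily-mean series standing in for a retrieval and a simulation.
obs_series = [4.0, 6.0, 5.0, 9.0, 7.0]
model_series = [3.0, 5.0, 6.0, 8.0, 7.0]
r = pearson(obs_series, model_series)
```

Because `pearson` flattens its inputs, it can be applied either to area-mean time series or to stacked gridded daily fields; which of the two underlies Table 2 is not specified in the text here.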
In the monsoon trough, while the daily mean rainfall variability is much greater than for the whole domain (Figure 3(b)), the day-to-day variability in rainfall in each of the convection-permitting simulations is similar, and is distinct from the variability in the parametrized models, which are also all similar to each other. This is particularly true after ∼31 August, when the convection-permitting simulations capture the day-to-day variability in the satellite retrievals to some degree, but the rainfall drops off in the parametrized simulations and there is very little variability. Much of the variability after 31 August is associated with the propagation of a LPS northwest along the monsoon trough from the Bay of Bengal, and is discussed further in section 3.2.
There is good correlation with the observed daily variability of rainfall in the monsoon trough in the high-resolution convection-permitting simulations (PCCs for 2.2E and 4E are 0.5 and 0.52 respectively; Table 2), but lower correlation for the lower-resolution convection-permitting simulations (for 8E and 12E, PCCs are 0.05 and 0.11 respectively). The correlation with observations is negative for parametrized simulations, with PCC between −0.2 and −0.27, while the driving simulation, as expected, has a much higher PCC at 0.83. These negative PCCs in the parametrized simulations are, to some degree, also attributed to the propagation of a LPS northwest along the monsoon trough.
Figure 4 shows the cumulative sum of the fractional contribution of rainfall rates to the total rain for the simulations and satellite retrievals. A greater fraction of the total rainfall in the convection-permitting simulations and satellite observations comes from more intense rainfall, compared to the parametrized simulations, and as grid spacing decreases, the convection-permitting distribution moves closer to that of TRMM and CMORPH. The distribution is similar among the parametrized simulations, which includes the driving simulation, with the vast majority of rain coming from light rain. There is a pronounced grid-spacing effect on the distribution among the convection-permitting simulations, with an increase in more intense rain as the grid spacing increases, although their total rainfall amounts are similar (Figure 3). About 80% of the rainfall in the parametrized (free-running and driving) simulations comes from rain rates of <3 mm h−1 and 95% comes from rain rates of <5 mm h−1, while 70-90% (2.2E-12E) of the rainfall in the convection-permitting simulations comes from rain rates of >3 mm h−1 and 75 to 35% (12E-2.2E) comes from rain rates of >10 mm h−1.
The 2.2E distribution of rainfall intensities is a close match to CMORPH while the TRMM product has a lower proportion of the rainfall coming from rain rates between 5 mm h−1 and 35 mm h−1. The GSMAP distribution is a close match to the parametrized simulations, and this is expected to be due to the use of model reanalysis products in its algorithm.
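A diagnostic of the kind plotted in Figure 4 can be sketched as a rain-weighted cumulative histogram. This is an illustrative sketch only: the bin edges, the sample rain rates and the function name `cumulative_rain_fraction` are assumptions, and the binning actually used for the figure is not given in the text.

```python
import numpy as np


def cumulative_rain_fraction(rates, bin_edges):
    """Fraction of total rainfall contributed by rates up to each upper bin edge.

    rates     : rain rates (mm h-1) from all grid points and times
    bin_edges : increasing rain-rate bin edges (mm h-1)

    Rates above the last edge are not binned, so the final value is
    below 1 if such rates are present.
    """
    rates = np.asarray(rates, dtype=float).ravel()
    rates = rates[rates > 0.0]  # dry points contribute no rain
    # Weighting each sample by its own rate gives rain amount per bin.
    amount_per_bin, _ = np.histogram(rates, bins=bin_edges, weights=rates)
    return np.cumsum(amount_per_bin) / rates.sum()


# Made-up sample: three light-rain values and two heavier ones.
sample_rates = [1.0, 1.0, 2.0, 6.0, 10.0]
edges = [0.0, 3.0, 5.0, 20.0]
fractions = cumulative_rain_fraction(sample_rates, edges)
```

In this toy sample, 4 of the 20 units of rain fall at rates below 3 mm h−1, so the first value of `fractions` is 0.2; the paper's statement that about 80% of parametrized rainfall comes from rates below 3 mm h−1 corresponds to a value of about 0.8 at that edge.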
Consistent with past studies in other regions (Sato et al., 2009;Marsham et al., 2013), the phase of the diurnal cycle of rainfall over the subcontinent (Figure 5(a)) is much improved in the convection-permitting simulations, compared to the parametrized, although the convection-permitting simulations rain excessively during the afternoon and evening, compared to the satellite rainfall retrievals. In the convection-permitting simulations, rainfall peaks at 1500-1700 local time (India Standard Time (IST) = UTC + 5.5 h) and is at a minimum in the early morning, from 0800 to 1000 IST, in agreement with the satellite products, whereas rainfall in the parametrized convection simulations peaks too early in the morning between 0900 and 1200 IST, and is at a minimum at ∼1800 IST. There is a shift among the convection-permitting simulations to a later peak in rainfall as the grid spacing increases, consistent with many but not all past studies (Petch et al., 2002;Bryan et al., 2003;Marsham et al., 2013).
The means in Figure 5 are not able to show the variety in the diurnal cycle across the land and ocean regions; this is shown in Figure 6, which shows the timing of the diurnal peak in rainfall across the domain. The convection-permitting simulations capture the high degree of variability seen in TRMM, whilst the parametrized show far too little variability. TRMM and 2.2E peak rainfall timings are very similar over the oceans, with a high degree of variability which is generally not captured by the parametrized simulations. Despite this, the diurnal cycle over the Bay of Bengal, with a change in peak timing here from morning to night-time from northwest to southeast as in TRMM, is still captured to some extent in the parametrized simulations.
The time of peak rainfall in TRMM, over much of the subcontinent (particularly over the Indian peninsula, in the monsoon trough and the northwest of the domain), is 1800-0000 IST, but in 2.2E the time of peak rainfall is much more often ∼1500 IST (Figure 6), which is reflected in the earlier 2.2E mean time of peak diurnal rainfall in the monsoon trough, compared to the satellite observations in Figure 5(b). This difference is most marked over the Indian peninsula, where peak rainfall in the lee of the Western Ghats and inshore from the east coast occurs between 1200 and 1500 IST in 2.2E, and between 2100 and 0000 IST in TRMM. The 8E difference over the peninsula is less pronounced, with the night-time maxima on the east coast extending further inland, and in general more of the subcontinent has later rainfall compared to 2.2E.
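The timing diagnostic discussed above (time of the diurnal rainfall peak, expressed in IST = UTC + 5.5 h) can be sketched as a composite by time of day followed by a maximum search. This is a minimal sketch with made-up data; the function names and the 3-hourly sampling are assumptions, and any smoothing or harmonic fitting that may have been used for Figure 6 is not reproduced.

```python
import numpy as np

IST_OFFSET_H = 5.5  # India Standard Time = UTC + 5.5 h


def utc_to_ist(hour_utc):
    """Convert hour of day in UTC to hour of day in IST (0 <= result < 24)."""
    return (np.asarray(hour_utc, dtype=float) + IST_OFFSET_H) % 24.0


def diurnal_peak_ist(rain, hours_utc):
    """Local (IST) time of the peak of the mean diurnal cycle.

    rain      : array with time as the first axis, e.g. (n_times,) or (n_times, ny, nx)
    hours_utc : hours since 0000 UTC on the first day, one per time step
    """
    rain = np.asarray(rain, dtype=float)
    hour_of_day = np.asarray(hours_utc, dtype=float) % 24.0
    slots = np.unique(hour_of_day)
    # Average all samples that share the same time of day.
    composite = np.array([rain[hour_of_day == h].mean(axis=0) for h in slots])
    peak_utc = slots[np.argmax(composite, axis=0)]
    return utc_to_ist(peak_utc)


# Made-up 3-hourly series over two days with a peak at 0900 UTC.
hours = np.arange(0, 48, 3)
series = np.where(hours % 24 == 9, 5.0, 1.0)
peak = diurnal_peak_ist(series, hours)
```

With this toy series the peak at 0900 UTC maps to 1430 IST. Passing a (time, y, x) array instead returns a map of peak times, which is the kind of field shown in Figure 6.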

Interactions between convection and the monsoon
Having examined the characteristics of the modelled rainfall in section 3.1, we now use these simulations to study the interactions between the moist convection and the monsoon flow. Figure 7 shows how a change in the representation of convection produces a characteristically different monsoon trough, with a deeper trough in the convection-permitting simulations. During the first few days of spin-up, the monsoon trough is too deep in the convection-permitting simulations, but after this period they are in better agreement with driving (i.e. analyses) than the parametrized simulations. After 31 August the parametrized and convection-permitting simulations diverge significantly. After this date, the variability in the convection-permitting simulations continues to correlate well with driving, but there is a sharp increase in pressure in the parametrized simulations. This divergence is due to the propagation of a documented (Khole and Devi, 2012) LPS, northwest from the Bay of Bengal towards Pakistan, which takes less time to move through the monsoon trough in the parametrized simulations, and accounts for the lower 925 hPa geopotential heights in the parametrized simulations during 29-31 August, as well as rainfall differences in the monsoon trough (Figure 3(b)). Therefore the remainder of our analysis focuses on 22-30 August, before the simulations diverge due to differences in synoptic-scale weather, but after the spin-up of the convection-permitting simulations. Contours in Figure 8 show the location of the monsoon trough as a closed low in 925 hPa height over northern India in 8E, with a gradient of increasing height to the southwest over India, and marked gradients over the Arabian Sea and Bay of Bengal, which drive the onshore circulation of moist air into India.
Colours in Figure 8 show that 8E 925 hPa potential temperatures are, for the most part, 1-2 K higher over land and 1-2 K lower in the Bay of Bengal and Arabian Sea, compared to 8P, which will encourage ventilation of the continent by enhancing the monsoon flow. The exception to higher 8E temperatures over land is in the northwest of the domain (25°N, 70°E), which is consistent with advection of cooler oceanic air driven by changes in synoptic-scale flow between the simulations (discussed below), accelerated by the boundary effect of the adjacent highlands of Pakistan, into a region with no orography to impede the flow or cause the condensation of water vapour.
The higher 925 hPa temperatures over land are largely explained by the effect of the change in surface fluxes resulting from explicit convection shown in Figure 9. During the daytime, the land surface in the convection-permitting simulation receives more short-wave radiation (+20 W m−2 mean daily total), as a result of a later peak in clouds and convection (Figures 5(a) and (b)). Changes in net long-wave radiation are smaller (−10 W m−2), and so there is greater net surface heating in the convection-permitting simulation (+10 W m−2). This gives larger sensible and smaller latent heat fluxes in 8E than in 8P (+15 and −7 W m−2 respectively), with a Bowen ratio greater than 1 from ∼1200 to 1500 IST in 8E, and ∼0.5 throughout the day in 8P, indicating a moister surface in 8P. This can be explained by the rainfall in the convection-permitting simulations being both more intense (Figure 4), and later in the day (Figures 5(a) and (b)), resulting in decreased interception of rainfall by the vegetation canopy, greater run-off, and greater penetration into the soil (Best et al., 2011), and since the rain falls after peak insolation, reduced rapid re-evaporation (Birch et al., 2015). An extra 15 W m−2 of sensible heating in 8E would correspond to ∼0.5 K of extra heating for a 2 km boundary layer over 1 day, which is broadly consistent with the magnitude of the differences in 925 hPa potential temperatures in Figure 8. Over the ocean, differences in 925 hPa air temperatures are smaller than over the land, since the SSTs are identical between the simulations, whereas land surface temperatures are free to evolve. Heavier rainfall in 8E over much of the western equatorial Indian Ocean (WEIO) (Figures 2(f) and (g)), with its greater latent heat release, is spatially correlated with the 925 hPa differences in height and potential temperature.
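The boundary-layer warming estimate quoted above can be checked with a simple slab budget, ΔT = F Δt / (ρ c_p h). The sketch below is illustrative only: the air density (1.2 kg m−3) and specific heat (1004 J kg−1 K−1) are assumed round values, not taken from the paper, and the function names are hypothetical.

```python
RHO_AIR = 1.2            # kg m-3, assumed near-surface air density
CP_AIR = 1004.0          # J kg-1 K-1, specific heat of dry air at constant pressure
SECONDS_PER_DAY = 86400.0


def slab_warming(flux_w_m2, depth_m, duration_s=SECONDS_PER_DAY,
                 rho=RHO_AIR, cp=CP_AIR):
    """Temperature change (K) of a well-mixed layer heated uniformly from below."""
    return flux_w_m2 * duration_s / (rho * cp * depth_m)


def bowen_ratio(sensible_w_m2, latent_w_m2):
    """Ratio of sensible to latent surface heat flux."""
    return sensible_w_m2 / latent_w_m2


# Extra 15 W m-2 of sensible heating into a 2 km deep layer for one day.
delta_t = slab_warming(15.0, 2000.0)
```

With these assumed constants `delta_t` is about 0.54 K, in line with the ∼0.5 K quoted in the text. A lower layer-mean density would give a slightly larger value.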
Rainfall differences between the free-running simulations, over both the ocean and the subcontinent, significantly alter the mean low-level pressure distribution and flow into the subcontinent (Figure 10). As will be discussed, this can be seen most clearly in the region of the black box in Figure 10, which covers part of the Arabian Sea and the west coast of India. There is a deeper monsoon trough in 2.2E and 8E than in 8P, consistent with the greater precipitation in the convection-permitting runs (Figures 10(b) and (c), Figure 7). Differences between 2.2E, 8E and 8P (Figures 10(a)-(c)) all have positive/negative dipoles over northern India, which are related to differences in the position of the monsoon trough and rainfall within it, but these are quite localized, with the positive and negative anomalies cancelling each other in the far-field. For this reason, these anomalies due to the shift in location of precipitation features do not influence the continental-scale water vapour convergence budget, compared with the changes in continental-scale gradients. Where 8E rains more than 8P at ∼24°N, 80°E, there is a relative 8E low of 16 m, whereas the relative 8P rainfall maximum at ∼20°N, 89°E corresponds to an 8P low of 2 m. In short, areas of higher rainfall in the convection-permitting simulations correspond to much larger height differences. As a result, there is a deeper monsoon trough in 2.2E and 8E than in 8P (Figure 7).
2.2E rainfall over the WEIO is the most realistic, compared to the observations, while 4E, 8E and 8P rain excessively (Figure 5(d)). Less latent heating through rainfall over the ocean in 2.2E, compared to 8E and 8P (Figures 10(a) and (b)), corresponds to a relative high, which acts to increase the pressure gradient towards the north and onshore, leading to greater southerly flow in the Arabian Sea and onto the west coast of India. 8P rains less than 8E over the WEIO (Figure 10(c)), which will act to increase the land-sea pressure gradient in 8P, and favour an increase in the onshore flow, but 8E has a larger land-sea pressure gradient, as it is the pressure differences over the continent which are dominant in this case.
The differences in the modelled 925 hPa winds are largely consistent with a geostrophic response to these differences in geopotential over land and ocean, with an enhanced southerly cross-equatorial flow (the Somali jet) in the WEIO and Arabian Sea in 2.2E, compared to 8E and 8P, and greater onshore flow in 8E than in 8P. Figures 11(a) and (b) show simulated and observed (radiosonde) vertical profiles of wind at Minicoy (Figure 1), which is in the Indian Ocean, in the region of the largest wind differences. 2.2E is the only simulation with southerly winds below 925 hPa, and has the weakest northerlies at the jet maximum at 850 hPa. All the free-running simulations have too weak westerlies up to ∼400 hPa. It is not clear, from these simulations, what effect the domain has on the wind in the Arabian Sea. Although the enhanced southerly flow in 2.2E is actually further from the observations and analysis than that in 8P, the direction of the flow suggests it may be restricted by the lateral boundary conditions, and in a larger-domain simulation might give an enhanced southwesterly flow, in better agreement with analyses. The increased ageostrophic wind seen on the west coast of the Indian peninsula in Figure 10(c) (over land, ∼18°N, 75°E) is consistent with a response to the increased land-sea contrast discussed above. However, differences in latent heating from continental rainfall are larger than the effect on surface fluxes, and are likely the dominant mechanism behind the changes in the circulation. To quantify this, in the period 22-30 August, when the convection-permitting simulations have 'spun up' and the models do not diverge due to synoptic events (Figure 7), 8E rains 16 mm more than 8P over the subcontinent, which corresponds to ∼47 W m⁻² atmospheric heating from rainfall, compared to ∼16 W m⁻² sensible heating from the surface.
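As a rough check on the conversion from a rainfall difference to an atmospheric heating rate, the following Python sketch turns an accumulated rainfall depth into a mean column latent heating. The latent heat of vaporization (2.5 × 10^6 J kg^-1), the 9-day window length and the function name are assumptions made for illustration and are not taken from the simulations. With these values the result is of the same order as the heating quoted above; the exact figure depends on the averaging window, the value of the latent heat and rounding of the rainfall difference.

```python
# Assumed constants (standard approximate values, not taken from the paper).
LATENT_HEAT_VAPORIZATION = 2.5e6  # J kg^-1
WATER_DENSITY = 1000.0            # kg m^-3
SECONDS_PER_DAY = 86400.0


def rainfall_to_heating(rain_mm, n_days):
    """Mean column latent heating (W m^-2) for rain_mm accumulated over n_days."""
    # 1 mm of rain over 1 m^2 is 1 kg of water.
    mass_per_area = rain_mm * 1.0e-3 * WATER_DENSITY      # kg m^-2
    energy_per_area = mass_per_area * LATENT_HEAT_VAPORIZATION  # J m^-2
    return energy_per_area / (n_days * SECONDS_PER_DAY)   # W m^-2


# 16 mm rainfall difference; a 9-day window (22-30 August inclusive) is assumed.
heating = rainfall_to_heating(16.0, 9)
print(f"Approximate latent heating: {heating:.0f} W m^-2")
```

Because the conversion is linear in the rainfall depth and inversely proportional to the window length, a rainfall difference that is smaller by 1 mm, or a window that is longer by a day, changes the result by several W m^-2.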
Figures 10(a)-(c) show differences between free-running simulations, while Figures 10(d) and (e) show differences between free-running simulations and the analysis. The differences in 925 hPa winds between 2.2E/8P and the driving analysis are relatively large, compared to the differences between, for example, 8E and 8P: compared with the driving analysis, 8E and 8P show too strong southerlies coming onshore from the northwest of the domain, and too weak westerlies and southwesterlies into the southern Indian peninsula and the Bay of Bengal respectively. The free-running simulations also have a northeast to southwest dipole of excess to deficient rainfall in the monsoon trough, which matches the wind differences. Although the differences between the free-running simulations and the analysis are large, they are similarly large in 2.2E and 8P, compared to the differences between them.
The enhanced low-level monsoon circulation in 2.2E and 8E brings more moisture into the subcontinent, which supports the increased rainfall. Figure 11(c) shows simulated and observed vertical profiles of specific humidity at Minicoy. While there are large differences in the low-level flow over Minicoy (Figure 10), the profiles of specific humidity are very similar. As such, differences in the representation of convection and grid spacing do not, in these simulations, have a large impact on the moisture content of air advected over the Arabian Sea, and the change in the transport of moisture into the subcontinent is determined by changes in the flow, not by moisture content.
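The attribution of a change in moisture transport to the flow rather than to the moisture content can be illustrated by splitting the change in the advective flux qv into three terms: a dynamic term (change in wind at fixed humidity), a thermodynamic term (change in humidity at fixed wind) and a nonlinear residual. The Python sketch below is only an illustration of that decomposition; the function name and the input values are hypothetical and are not taken from the simulations or from the Minicoy profiles.

```python
def decompose_flux_change(q_ref, v_ref, q_new, v_new):
    """Split the change in moisture flux q*v between two states.

    q_* are specific humidities (kg kg^-1) and v_* are wind components (m s^-1).
    Returns the dynamic, thermodynamic and nonlinear contributions, which sum
    exactly to the total change q_new*v_new - q_ref*v_ref.
    """
    dq = q_new - q_ref
    dv = v_new - v_ref
    return {
        "dynamic": q_ref * dv,        # wind change at reference humidity
        "thermodynamic": v_ref * dq,  # humidity change at reference wind
        "nonlinear": dq * dv,         # product of the two changes
        "total": q_new * v_new - q_ref * v_ref,
    }


# Hypothetical low-level values: similar humidity, clearly different wind.
terms = decompose_flux_change(q_ref=0.0170, v_ref=5.0, q_new=0.0172, v_new=7.0)
for name, value in terms.items():
    print(f"{name:>13s}: {value:.4f} kg kg^-1 m s^-1")
```

With nearly identical humidity and a large wind difference, as in this hypothetical case, the dynamic term accounts for almost all of the change in the flux.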
In previous work using numerical models, excess rainfall over the WEIO has been found to contribute to a dry bias over India, but the mechanisms by which the rainfall biases are reduced are different to those presented here. Bush et al. (2015) found that increasing the entrainment factor by 1.5 in the WEIO suppresses precipitation there which, unlike in these simulations, increases moisture in the Somali jet, and increases precipitation over the Arabian Sea and Bay of Bengal, just outside the area of increased entrainment, and over central India by a small fraction of the MetUM bias. One theory is that the meridional SST gradient in the WEIO has a large effect on the distribution of precipitation in simulations of the ISM (Bollasina and Ming, 2013). The SST gradient induces low-level wind convergence, and it is the interaction of the model parametrization schemes with this large-scale forcing that leads to excess rainfall over the WEIO. In addition to weakening the low-level monsoon flow, Bollasina and Ming (2013) found that excess rainfall over the WEIO induces a Hadley-type circulation which has a descending branch over northeast India/Indochina which, for example, leads to a more gradual onset over India.

Diurnal cycle of surface pressure
The change in convection affects not only the mean synoptic pattern (Figure 10), but also its diurnal cycle (Figure 12). The simulations are compared here to surface station data, as opposed to model analyses, which are significantly affected by their representation of convection. The diurnal cycle of MSLP at any point depends on atmospheric tides, which are global-scale periodic oscillations of the atmosphere (Woolnough et al., 2004), and have a large amplitude in the Tropics (Basu, 2007). However, the effect of tides is fairly consistent across the domain and so differences in the diurnal cycle in SLP between two points, especially those on a similar longitude, are dominated by other processes. Differences in the diurnal cycle of land-sea pressure gradient between the simulations will affect the low-level onshore advection of moisture by the monsoon circulation, which will have important implications over India and the surrounding oceans. Figure 12 shows the simulated and observed diurnal cycle of SLP difference between the monsoon trough and Port Blair (Bay of Bengal) and Minicoy (Arabian Sea). As the SLP for stations above sea level is derived from the measured surface pressure, differences in the magnitude of the pressure gradient are here considered less important than the relative magnitude and timings of the diurnal variation. In both Figures 12(a) and (b), the most negative land-sea pressure gradient is between 1500 and 1800 IST, at the time of peak rainfall over the continent, which matches well with the convection-permitting simulations, as does the least negative pressure gradient in the morning (1100 IST Port Blair gradient, 0700 IST Minicoy).
The timings of maxima and minima in the diurnal cycle of land-sea pressure gradient differ much more from the observations in the parametrized simulations: between the monsoon trough and Port Blair, the most negative land-sea pressure gradient is around 2100 IST, and between Patna and Minicoy it is around 1200 IST; these are too late and too early respectively. In the parametrized simulations, Minicoy (8.3°N, 73°E) is in a broad region where the diurnal peak in rainfall (Figure 6(d)) is at night, whereas at Port Blair (11.6°N, 92.7°E), the peak is between 0900 and 1200, with the spatial pattern of the diurnal peak timing around Port Blair appearing to be related to the Andaman and Nicobar island chain. The daytime peak in rainfall in the parametrized simulations at Port Blair means that the diurnal cycle of rainfall there is more similar to the diurnal cycle of rainfall at Patna, giving the flatter diurnal cycle in Figure 12(a) compared to Figure 12(b).
The results are consistent with the late afternoon heating from moist convection in the monsoon trough region driving a decrease in the pressure over land in the convection-permitting simulations, and increasing the pressure gradient. The land-sea pressure gradient is then greatest at night, in agreement with the observations, when the drag effect of continental boundary-layer convection is at a minimum. It shows that the ability of the simulations to capture the diurnal cycle of convection is not only important for radiation and surface fluxes (Figure 9), but also for the dynamical couplings between convection and the larger-scale flows.
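A diurnal composite of the land-sea SLP difference of the kind discussed in this section can be sketched as follows. This is a minimal illustration, not the processing used for Figure 12: the function names, the 3-hour bin width and the synthetic hourly series are all assumptions, and the times are taken to be already in local (IST) hours.

```python
import math


def diurnal_composite(hours, land_slp, sea_slp, bin_hours=3):
    """Mean land-minus-sea SLP difference (hPa) in each local-time bin."""
    n_bins = 24 // bin_hours
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for hour, p_land, p_sea in zip(hours, land_slp, sea_slp):
        index = int(hour % 24) // bin_hours
        sums[index] += p_land - p_sea
        counts[index] += 1
    # Bins with no data are returned as NaN.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def hour_of_minimum(composite, bin_hours=3):
    """Start hour of the bin with the most negative pressure difference."""
    valid = [(value, i) for i, value in enumerate(composite) if not math.isnan(value)]
    return min(valid)[1] * bin_hours


# Synthetic example: 5 days of hourly data with a land pressure minimum in the
# late afternoon and a constant ocean pressure (illustrative values only).
hours = list(range(5 * 24))
land = [1000.0 + 1.5 * math.cos(2.0 * math.pi * (h - 4.5) / 24.0) for h in hours]
sea = [1004.0 for _ in hours]

composite = diurnal_composite(hours, land, sea)
print("Most negative gradient in the bin starting at", hour_of_minimum(composite), "h")
```

Compositing the difference between two stations, rather than each station separately, removes signals common to both, which is the reason given above for expecting the tidal contribution to largely cancel.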

Conclusions
Most global climate models have a systematic dry bias over India during the Indian summer monsoon, and a wet bias over the equatorial Indian Ocean. To investigate the role convective parametrization plays in the development of these systematic model biases, convection-permitting simulations with grid spacings of 2.2, 4, 8 and 12 km, and convection-parametrized simulations with grid spacings of 8, 12, 24 and 120 km, are compared with model analyses and satellite and ground station observations. The simulations are of a 3-week period during August and September 2011, with a domain that covers the subcontinent and its surrounding oceans, and captures the monsoon circulation over the subcontinent.
There is more rainfall over the subcontinent in the convection-permitting simulations, which is more intense and peaks later in the day. The 2.2E convection-permitting simulation gives the best representation of the diurnal cycle and intensity of continental rainfall, compared to the observations. In general, there is better day-to-day variability in the amount of rainfall over the continent in the convection-permitting simulations. The convection-permitting simulations rain more, over the subcontinent, than the satellite rainfall retrievals and the parametrized simulations. In the monsoon trough, the convection-permitting simulations show similar amounts to the satellite rainfall retrievals (which have a much lower spread among them than over the whole subcontinent), while the parametrized simulations rain much less. While rainfall in the convection-permitting simulations is excessive over land, the difference between them and the parametrized simulations has been used here to examine the effects of parametrized convection on the dry bias over India.
The relationships between rainfall and some other aspects of the Indian monsoon are shown schematically in Figure 13. Higher rainfall over the subcontinent, from more intense convection, increases the pressure gradient in the convection-permitting simulations and, subsequently, the onshore advection of moisture. The later convection in the convection-permitting simulations also leads to greater surface solar short-wave heating, due to reduced cloud cover during the middle of the day, while the higher intensity of rainfall results in a drier land surface because more rainfall reaches the surface, where it can be lost through runoff and penetration into the soil, rather than being intercepted by the vegetation, where it can be evaporated. The greater insolation and sensible heating at the surface contribute to a larger land-sea temperature gradient, which leads to enhanced onshore flow. As a result of the improved diurnal cycle of rainfall in the convection-permitting simulations over land, the diurnal cycle of the land-sea pressure gradient is improved, and the land-sea pressure gradient is enhanced in the late afternoon and at night, when the drag effect of boundary-layer convection on the synoptic flow is reduced or non-existent.
Rainfall over the equatorial Indian Ocean, through its effect on the onshore pressure gradient, is found to be an important factor in reducing low-level flow and moisture transport into the subcontinent. The 2.2 km convection-permitting simulation rains less than 8P in the WEIO and, among the convection-permitting simulations, decreasing the grid spacing from 8 to 2.2 km substantially reduces the rainfall over the WEIO, in better agreement with the observations. Reduced rainfall there leads to an increase in the onshore pressure gradient, and as a result there is more southerly geostrophic flow onto the Indian peninsula from the WEIO (Figure 13). However, it is difficult to say how the western boundary of the model domain affects these flow differences. The observed and simulated vertical profiles of specific humidity within these large flow differences do not differ greatly, compared to the wind differences; it is the strength of the monsoon circulation, and not the moisture content of the flow, that is important in reducing biases in the transport of moisture into the Indian subcontinent. It is possible that in a larger-domain simulation, which would include the cross-equatorial Somali jet circulation, reduced rainfall over the WEIO would enhance that flow (which may also become moister) rather than the southerly flow shown here.
After the first 4 days of the simulation, when the convection-permitting simulations have spun up and are adjusted to their preferred atmospheric state, they capture the time evolution of the monsoon trough depth for the remainder of the simulated period (22 August to 7 September), whereas the monsoon trough in the parametrized simulations is generally not deep enough. The propagation of an LPS from the Bay of Bengal northwest along the monsoon trough, in the second half of the simulated period, causes significant divergence between the convection-permitting and parametrized simulations, as seen in the monsoon trough 925 hPa height; the convection-permitting simulations capture the daily variability in the analysis, but the height increases significantly in the parametrized simulations. The divergence appears to be related to differences in the speed of propagation of the LPS in the free-running simulations, with it taking less time to propagate northwest in the parametrized simulations. If models that parametrize convection consistently exhibit a similar bias in the propagation of LPSs, this could contribute to a systematic dry bias in parametrized convection simulations over the subcontinent, and would also have an effect on the onshore moisture transport through a weaker land-sea pressure gradient. Further work is needed to determine if there is a systematic bias in the propagation speed of LPSs in the Indian monsoon trough as a result of a convective parametrization.
Figure 13. Schematic over India and the western equatorial Indian Ocean illustrating rainfall, 925 hPa height (contours), wind (arrows) and temperature. (a) shows the 8P 925 hPa mean height structure, while (b) and (c) show the respective height anomalies of (b) 8E and (c) 2.2E from 8P. Wind and rainfall are similarly shown relative to 8P. Darker grey represents more rainfall, with more rainfall coming from more intense events. Darker/lighter land or ocean in the relative panels represents warmer/cooler 925 hPa potential temperatures, compared to 8P.
The convection-permitting simulations have their own biases, and in some respects perform worse than the parametrized simulations, particularly at coarser grid spacings. All the simulations overestimate rainfall over the Himalayas and the orography of the Myanmar coastline, and underestimate rainfall over the Western Ghats. They also fail to capture the broad spread of rainfall over the Bay of Bengal. 2.2E rainfall over the Indian Ocean is comparable to TRMM, but as grid spacing increases, rainfall in the convection-permitting simulations becomes increasingly excessive, while there is little effect due to grid spacing in the parametrized simulations, which have rainfall amounts comparable to TRMM.
The MetUM, in common with many models, has had a long-standing dry bias over India during the monsoon. The results show that an explicit representation of convection affects the entire monsoon circulation, increasing rainfall in the monsoon trough region, and improving key aspects of the circulation such as the magnitude and diurnal cycle of the pressure gradient from the oceans to the continent.
We conclude that it is important for any parametrization of convection to capture its diurnal cycle, and to give an improved representation of rainfall intensities over the Indian subcontinent and the western equatorial Indian Ocean, if it is to give a realistic coupling between convection and the monsoon.

the European Commission's 7th Framework Programme, under grant agreement 282672. Thanks go to Simon Peatman for providing code to compute the diurnal harmonics of rainfall. We acknowledge the TRMM mission scientists and associated NASA personnel for the production of the TRMM data used in this article and are grateful to the Goddard Earth Sciences Data and Information Services Center (GES DISC) for making the data available. The CMORPH data were obtained from NOAA CPC, from their website at ftp://ftp.cpc.ncep.noaa.gov/precip/global_CMORPH/ (accessed 14 January 2017). The GSMaP Project was sponsored by JST-CREST and is promoted by the JAXA Precipitation Measuring Mission (PMM) Science Team, and the GSMaP products were distributed by the Earth Observation Research Center, Japan Aerospace Exploration Agency. The radiosonde data were obtained from NOAA NCEI, from their website at https://www.ncdc.noaa.gov/data-access/weather-balloon/integrated-global-radiosonde-archive (accessed 14 January 2017). Thanks to the British Atmospheric Data Centre, which is part of the NERC National Centre for Atmospheric Science (NCAS), for access to MIDAS surface station data. Finally, we thank two anonymous reviewers for their comments, which improved the article.