On‐ and off‐line evaluation of the single‐layer urban canopy model in London summertime conditions

Urban canopy models are essential tools for forecasting weather and air quality in cities. However, they require many surface parameters, which are uncertain and can reduce model performance if inappropriately prescribed. Here, we evaluate the model sensitivity of the single‐layer urban canopy model (SLUCM) in the Weather Research and Forecasting (WRF) model to surface parameters in two different configurations, one coupled to the overlying atmosphere (on‐line) in a 1D configuration and one without coupling (off‐line). A two‐day summertime period in London is used as a case study, with clear skies and low wind speeds. Our sensitivity tests indicate that the SLUCM reacts differently when coupled to the atmosphere. For certain surface parameters, atmospheric feedback effects can outweigh the variations caused by surface parameter settings. Hence, in order to fully understand the model sensitivity, atmospheric feedback should be considered.


INTRODUCTION
More accurate urban weather forecasts for human thermal comfort, air quality, energy demand and wind in urban areas are necessary for the vast urban population, which is expected to be 66% of the world's population by 2050 (UN, 2014). These forecasts require a good understanding of the physical and chemical processes in the urban boundary layer (UBL) (Barlow, 2014).
Understanding the turbulent transport is essential for correct quantification of the surface energy balance in urban areas (Pigeon et al., 2007). Building height, shape and materials are linked to different surface properties (e.g., albedo, thermal conductivity and heat capacity). Being able to correctly model the physical processes in the urban environment is dependent on the adequate representation of surface properties (Grimmond et al., 2011;Best and Grimmond, 2015). For example, short-wave radiation at building facets is governed by albedo, and radiation trapping in the urban canopy varies with canyon morphology and surface emissivity (Best and Grimmond, 2015). Moreover, surface albedo, emissivity, heat capacity and thermal conductivity of building materials have a direct influence on heat storage in the urban fabric during the daytime and its subsequent release into the atmosphere at night.
To provide forecasts for urban environments, numerical weather prediction (NWP) models usually utilize a so-called urban canopy model (UCM) (Masson, 2000; Martilli et al., 2002; Best, 2005; Chen et al., 2011a). UCMs incorporate the subgrid-scale physical processes of the urban environment and the complexity and heterogeneity of the urban surface. The required surface parameters remain a challenge to define appropriately. While mean building height and other morphological parameters are generally well represented (e.g., Kent et al., 2018), other surface properties such as albedo, emissivity, thermal conductivity and heat capacity are very uncertain. If the parameter settings do not accurately represent the urban environment to be modelled, this can hinder model performance (Grimmond et al., 2011).
Recent studies have investigated the sensitivity of UCMs to these surface parameters, but mainly in off-line mode (no coupling with the overlying UBL). For example, Loridan et al. (2010), Wang et al. (2011) and Zhao et al. (2014) conducted comprehensive analyses of the effects of surface parameters in the off-line Noah-SLUCM. They found that the accuracy of modelled surface energy balance fluxes depended heavily on having correct values for the urban surface parameters. Similar conclusions were reached by the PILPS-Urban intercomparison experiment, where various urban surface parametrization schemes were evaluated. That experiment concluded that the complexity with which UCMs parametrize the physical processes in the urban environment is less important than the correct prescription of the surface properties.
Although off-line modelling studies are valuable for improving UCMs, they use a simplified test setting in which feedback mechanisms with the overlying atmosphere are not accounted for. Coupling of UCMs to the overlying atmosphere is thus essential for understanding the effects of surface parameters on model performance and can assist in the selection of surface parameters that lead to more robust results for specific case studies and for future forecasting purposes (Song and Wang, 2015).
Interactions with the overlying atmosphere in an NWP model can be coupled to a surface model either in a full 3D set-up or in a 1D set-up (single-column model; SCM). Many studies have investigated the effects of surface parametrization on boundary-layer representation (e.g., Pigeon et al., 2007;Flagg and Taylor, 2011;Ferrero et al., 2018), surface energy balance (e.g., Pigeon et al., 2007;Loridan et al., 2013;Demuzere et al., 2017), urban heat islands (e.g., Miao et al., 2009;Bohnenstengel et al., 2011;Nemunaitis-Berry et al., 2017;Ronda et al., 2017) and other mesoscale phenomena (e.g., Chen et al., 2011b). The majority of the studies use a 3D set-up, which limits the ability to link the model response to changes in surface parameters and investigate feedback mechanisms in detail. To circumvent this drawback, the SCM set-up can be used (e.g., Song and Wang, 2015;Nemunaitis-Berry et al., 2017), which allows for a detailed investigation of the model response to changes in surface parameters, while at the same time having a much lower computational cost than the 3D set-up.
Evaluation of a UCM (off-line, on-line 1D or on-line 3D) response to changes in surface parameters may yield different results depending on the atmospheric conditions imposed and the strength of the atmospheric feedback mechanisms. On-line analysis should provide more realistic results of the sensitivity of the surface energy balance to changes in surface parameters, but it needs to be used in combination with an off-line analysis to identify the effect of feedback mechanisms. To our knowledge, only a couple of studies present such a sensitivity analysis (e.g., Loridan et al., 2013;Nemunaitis-Berry et al., 2017). Hence, to date, the effects of the feedback mechanisms on the surface energy balance have not yet been sufficiently quantified. It is not clear whether the effect of the on-line coupling can outweigh the effects of changes in surface parameters, or under what conditions the off-line evaluation is sufficient to understand the model behaviour.
To investigate the effects of the feedback mechanisms on model performance, both an off-line version of the Noah-SLUCM model and an on-line version, WRF-SCM-Noah-SLUCM, are used. The case study set-up and forcing (section 2), as well as the methodology and model set-up (section 3), are described. Model evaluation (section 4), atmospheric feedback mechanisms (section 5) and the sensitivity to changing surface parameters (section 6) are presented prior to discussion (section 7) and conclusions (section 8).

DESCRIPTION OF CASE STUDY
Off-line and on-line models are evaluated, using observations in the dense city centre of London, UK, at the King's College London measurement site (KSS, renamed to KSSW from 4 April 2012 onwards; Figure 1). Air temperature, wind, radiation and surface fluxes measured at 50 m above ground level are used (for details see Kotthaus and Grimmond, 2014a). The mixed-layer height (MLH) for central London is derived from ceilometer measurements at the MR site ( Figure 1; Kotthaus and . Comparing MLH results for two sites within central London at a distance of 4 km apart (MR and NK, Figure 1),  found spatial variations of the MLH to be mostly within the uncertainty of MLH measurements. The area around the KSS site has been extensively described by Kotthaus and Grimmond (2014b), including bulk albedo, urban land cover fractions and roof/road fractions. Unmeasured parameters are assigned based on the literature, including (a) thermal conductivity and heat capacity for the buildings in the study area, which are based on existing literature regarding the material composition of buildings in central London (Oikonomou et al., 2012) and building material thermal properties (Engineering Toolbox, 2010), and (b) emissivities based on Bohnenstengel et al. (2011). All the aforementioned derived parameters are used instead of the Chen et al. (2011a) high-density residential "default" values (Table 1).
To evaluate the model response to surface parameter settings, "simple" atmospheric conditions are chosen (i.e., avoiding clouds, rainfall and high wind speed). Following a detailed evaluation of a long-term measurement period (2010-2015) at KSS (renamed KSSW after 4 April 2012), a two-day period (July 23-25, 2012) was selected. During these days a moderate heat wave (nearly 30 °C) occurred in London, with relatively low wind speeds (<6 m/s at 50 m), no clouds or rain and moderate temperature advection.
KSSW observations of air temperature, wind speed, wind direction, pressure and short- and long-wave downward radiation fluxes at 50 m above ground level are used to drive the off-line model. For the SCM set-up, initial profiles of wind, potential temperature (θ), moisture and surface pressure are prepared from the nearest radio soundings (UWYO, 2012) taken at 0000 UTC at Herstmonceux, Hailsham, UK (nearly 70 km southeast of London) and the KSSW measurements. Large-scale forcing for wind and geostrophic wind is derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) operational reanalysis data (ECMWF, 2012) in combination with a Weather Research and Forecasting (WRF) simulation conducted for the case study period. Temperature and moisture advection are derived from WRF 3D model simulations for London and synthesized with in situ measurements of surface advection from World Meteorological Organization (WMO) stations (NOAA/NCDC, 2012) in and around London. Initial soil temperature and moisture profiles for both set-ups were taken from the WRF 3D simulation (spun up for 12 hr) and then cycled 3 × 2 days in the off-line set-up, until the deeper soil temperature became constant and the ground heat flux showed a similar daytime range for both days of the case study. A detailed description of the case study can be found in the call for participation for the Single-column Urban Boundary Layer Inter-comparison Modelling Experiment (SUBLIME) of Steeneveld et al. (2017).

Model set-up
In both the on-line and off-line cases, the urban surface is represented using the single-layer urban canopy model (SLUCM) (Kusaka et al., 2001; Chen et al., 2011a) and the Mellor-Yamada-Janjić (MYJ) scheme (Janjić, 1994) is used for surface-layer parametrization. For the land surface, the Noah land surface model (Noah-LSM) version 3.4.1 (Chen and Dudhia, 2001) is used. The on-line simulation uses the WRF model version 3.8.1 (Skamarock et al., 2008) in a single-column format. The 2.5-order MYNN (Mellor-Yamada Nakanishi Niino) scheme (Nakanishi and Niino, 2009) is used as the boundary-layer parametrization, which is well tested in combination with the SLUCM and the MYJ surface-layer scheme. For both short-wave and long-wave radiation, the RRTMG (Rapid Radiative Transfer Model for GCMs) schemes (Iacono et al., 2008) are applied. As no cloud or rain occurred during the case study period, the simple WSM 3-class scheme of Hong et al. (2004) is chosen for the microphysics representation. Both models run for 54 hr in total, starting at 0000 UTC on July 23, 2012 and finishing at 0600 UTC on July 25, 2012. For the model evaluation (section 4) the full 54-hr period is used. For the sensitivity analysis the first 6 hr are considered as spin-up time to allow for surface parameter changes to have an effect on the model response. The off-line set-up is forced at each model time step (30 min) with the observations (section 2). The on-line set-up uses initial soil moisture and temperature profiles up to a depth of 1.5 m and atmospheric profiles (θ, moisture and wind) up to 17 km. The advection forcing is prescribed in the 100-250 m layer and then linearly decreased to 0 at the surface and at 1,000 m. The potential temperature, moisture and momentum advection tendency terms are applied at each time step (30 s) of the on-line model and change every 6 hr.
Geostrophic wind is prescribed in the initial time step above 1 km and then evolves in time via a tendency term that is applied at each time step for all model levels. A more detailed description of the forcing data for the SCM can be found in Steeneveld et al. (2017).

Table 1 (fragment; column headings are not preserved here)
C wall (J m−3 K−1): 0.30 × 10⁶, 2.40 × 10⁶, 0.30 × 10⁶, 1.50 × 10⁶
C road (J m−3 K−1): 0.30 × 10⁶, 2.40 × 10⁶, 0.30 × 10⁶, 1.50 × 10⁶
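The vertical shape of the advection forcing and its 6-hr update interval, as described in this section, can be illustrated with a minimal sketch. The function names and the default arguments are hypothetical and only mirror the numbers quoted in the text (full forcing between 100 m and 250 m, linear decrease to zero at the surface and at 1,000 m); this is not the actual WRF-SCM forcing code.

```python
def advection_weight(z_m, z_full_bottom=100.0, z_full_top=250.0, z_zero_top=1000.0):
    """Fraction (0 to 1) of the prescribed advection tendency applied at height z_m (m).

    Assumed shape: zero at the surface, linear increase up to z_full_bottom,
    full strength up to z_full_top, linear decrease to zero at z_zero_top.
    """
    if z_m <= 0.0 or z_m >= z_zero_top:
        return 0.0
    if z_m < z_full_bottom:
        return z_m / z_full_bottom
    if z_m <= z_full_top:
        return 1.0
    return (z_zero_top - z_m) / (z_zero_top - z_full_top)


def forcing_block_index(t_seconds, block_hours=6):
    """Index of the forcing block that is active at model time t_seconds."""
    return int(t_seconds // (block_hours * 3600))


if __name__ == "__main__":
    for z in (0.0, 50.0, 175.0, 625.0, 1000.0):
        print(z, advection_weight(z))
    # A 54-hr run with a 30 s time step uses forcing blocks 0 to 8.
    print(forcing_block_index(54 * 3600 - 30))
```

Because the tendency is held constant within each 6-hr block in this sketch, any change that is faster than the block length cannot be represented by the forcing.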

Urban canopy model description
The single-layer urban canopy model (SLUCM) of WRF (Kusaka et al., 2001; Chen et al., 2011a) is coupled to the Noah land-surface scheme. It uses a tile approach, where one tile represents the urban surface handled by the SLUCM and vegetation is handled by Noah-LSM. Surface radiation and turbulent fluxes are calculated for each tile and averaged according to land cover fractions. The same holds for surface temperatures and surface albedo. The SLUCM urban morphology uses a 2D canyon approach, without street orientation or varying building heights. The urban tile is split into three facets (roof, wall and road), each receiving a normalized contribution of the total urban fluxes. The plan area consists of roof (F roof) and road (F road). The wall fraction (F wall) is calculated as

$$F_{\mathrm{wall}} = \frac{2 z_h}{W_{\mathrm{roof}} + W_{\mathrm{road}}} \qquad (1)$$

Here z h is the mean building height (in m) and W roof and W road are the roof and road widths (in m). Each surface flux (Q, W/m²) is calculated as (Kusaka et al., 2001)

$$Q_{\mathrm{urb}} = F_{\mathrm{roof}} Q_{\mathrm{roof}} + F_{\mathrm{wall}} Q_{\mathrm{wall}} + F_{\mathrm{road}} Q_{\mathrm{road}} \qquad (2)$$

$$Q_{\mathrm{total}} = f_{\mathrm{urb}} Q_{\mathrm{urb}} + (1 - f_{\mathrm{urb}}) Q_{\mathrm{nat}} \qquad (3)$$

where the total flux from the surface (Q total) is based on the urban fraction f urb and the vegetation fraction (1 − f urb) and their respective fluxes (Q urb, Q nat). The sky-view factor of each facet regulates the amount of radiation (short- and long-wave) received. Turbulent sensible heat (Q H) and moisture (Q E) fluxes from each facet are given by

$$Q_H = \rho\, C_P\, C_H\, U_a\, \Delta\theta \qquad (4)$$

$$Q_E = \rho\, L_V\, C_E\, U_a\, (q_{\mathrm{skin}} - q_{\mathrm{air}}) \qquad (5)$$

where ρ is the air density (kg/m³), C P is the specific heat capacity of dry air (J kg−1 K−1) and C H and C E are the exchange coefficients for heat and moisture. Δθ (K) is the potential temperature gradient between the surface and the air. L V is the latent heat of vaporization (J/kg). U a is the wind speed (m/s) and q skin is the specific humidity at the surface (kg/kg), while q air is the specific humidity in the atmosphere. Each facet's fluxes are calculated separately and then averaged in proportion to the contribution of each facet (namely F roof, F road and F wall).
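The tile and facet weighting described above can be sketched in a few lines. This is a minimal illustration, not the SLUCM source code: the function names are hypothetical, the wall fraction is assumed to be 2 z h / (W roof + W road), and the turbulent fluxes are assumed to follow the standard bulk-transfer form built from the quantities defined in the text. All numerical inputs are placeholders.

```python
def wall_fraction(z_h, w_roof, w_road):
    """Normalized wall fraction (assumed form: two walls per canyon)."""
    return 2.0 * z_h / (w_roof + w_road)


def urban_flux(q_roof, q_wall, q_road, f_roof, f_wall, f_road):
    """Facet-weighted flux of the urban tile (W/m^2)."""
    return f_roof * q_roof + f_wall * q_wall + f_road * q_road


def total_flux(q_urb, q_nat, f_urb):
    """Grid-cell flux from the urban and vegetated tiles (W/m^2)."""
    return f_urb * q_urb + (1.0 - f_urb) * q_nat


def sensible_heat_flux(rho, c_p, c_h, u_a, delta_theta):
    """Bulk sensible heat flux (W/m^2) for one facet."""
    return rho * c_p * c_h * u_a * delta_theta


def latent_heat_flux(rho, l_v, c_e, u_a, q_skin, q_air):
    """Bulk latent heat flux (W/m^2) for one facet."""
    return rho * l_v * c_e * u_a * (q_skin - q_air)


if __name__ == "__main__":
    # Placeholder values, chosen only to exercise the functions.
    f_wall = wall_fraction(z_h=10.0, w_roof=10.0, w_road=10.0)
    q_urb = urban_flux(100.0, 50.0, 80.0, f_roof=0.5, f_wall=f_wall, f_road=0.5)
    print(f_wall, q_urb, total_flux(q_urb, q_nat=60.0, f_urb=0.8))
    print(sensible_heat_flux(1.2, 1004.0, 0.01, 3.0, 5.0))
```

In this form the dependence of the grid-cell flux on f urb is linear as long as the tile fluxes themselves do not change, which is the off-line situation; on-line, Q urb and Q nat also respond to the changed atmosphere.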
An anthropogenic heat flux of 38 W/m² (with a diurnal profile), based on yearly estimates for central London (Dong et al., 2017), is added to the first model level and incorporated into the sensible heat flux term. Loridan et al. (2010) found that some of the most important surface parameters (based on their effect on model performance) are: (a) urban fraction (f urb); (b) albedo of roof (α roof), wall (α wall) and road (α road); (c) thermal conductivities (λ roof, λ wall and λ road); and (d) heat capacities (C roof, C wall and C road) for each facet. To test the response in both model set-ups these parameters were varied individually (Table 1), while others were kept at their respective default values.

Normalization of fluxes
While the on-line model performs quite well when default parameters are used (see section 4), some deviation remains between the modelled and observed meteorological values.
To account for the small differences in the atmospheric conditions between both set-ups, we follow the approach of Loridan and Grimmond (2012a) and normalize all energy fluxes with the total incoming radiation:

$$\hat{Q} = \frac{Q}{SW_{\mathrm{down}} + LW_{\mathrm{down}}}$$

Here Q̂ is the normalized flux, Q is the original flux in W/m², SW down is the downward short-wave radiation flux and LW down is the downward long-wave radiation flux, both in W/m². Representing radiation and turbulent fluxes as a fraction of the total incoming radiation allows us to demonstrate how changes in surface parameters affect the distribution of incoming radiation between different components of the surface energy balance. Given that there are no clouds in the case study period, the daily average of incoming radiation over these two days remains almost constant. However, in the on-line SCM cases, changes in surface properties can affect the atmospheric temperature, which alters LW down as a result. The overall change in LW down due to surface changes is <8 W/m² for the 54-hr period (i.e., a 0.9% change for the mean total incoming radiation in the daytime).

Following Best and Grimmond (2015), all fluxes are analysed based on the sign of the net radiation (Q *), with positive Q * indicating daytime and anything else signifying night-time. This means that the different responses of the variables are tested under strong and weak turbulent regimes and the role of each surface parameter in the daytime or night-time energy balance is investigated.
The mean incoming radiation flux is approximately 922 W/m² during the daytime and 355 W/m² during night-time in the on-line set-up. For simplicity, conversions from normalized to actual values (as presented in section 6) use these values for incoming radiation, despite small variations in the response of the latter to changes in surface parameters (as discussed). The off-line modelled (and observed) mean incoming radiation fluxes are 905 W/m² (daytime) and 357 W/m² (night-time), and do not change with varying surface parameters.
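The normalization, the day/night split and the conversion back to physical units can be sketched as follows. The helper names and the dictionary layout are hypothetical; the mean incoming radiation values are the ones quoted in this section, and the normalized Q * values in the usage example are taken from the urban fraction results reported later in the text.

```python
# Mean total incoming radiation (SW down + LW down) in W/m^2, as quoted in the text.
MEAN_INCOMING = {
    ("on-line", "day"): 922.0,
    ("on-line", "night"): 355.0,
    ("off-line", "day"): 905.0,
    ("off-line", "night"): 357.0,
}


def normalize_flux(q, sw_down, lw_down):
    """Express a flux as a fraction of the total incoming radiation."""
    return q / (sw_down + lw_down)


def is_daytime(q_star):
    """Daytime is defined by positive net radiation; anything else is night-time."""
    return q_star > 0.0


def to_physical(q_norm, setup, period):
    """Convert a normalized flux back to W/m^2 using the mean incoming radiation."""
    return q_norm * MEAN_INCOMING[(setup, period)]


if __name__ == "__main__":
    # Normalized daytime Q* at the lowest urban fraction, from the text.
    print(to_physical(0.378, "on-line", "day"))   # close to the quoted 348 W/m^2
    print(to_physical(0.364, "off-line", "day"))  # close to the quoted 329 W/m^2
```

The converted values differ from the quoted ones by less than 1 W/m², which is consistent with the normalized fluxes being rounded to three decimals.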

MODEL EVALUATION
Model runs using the default parameter values are used as reference runs. The variables evaluated are Q *, SW down, LW down, turbulent sensible (Q H) and latent (Q E) heat fluxes, air temperature (T air) and wind components (U and V) at 50 m and the MLH. The SW down and LW down fluxes and 50 m temperature and wind speed are the observations that force the off-line model. Storage heat flux (ΔQ S) is not directly observed. Although it could be derived as the residual of the surface energy balance (as in Kotthaus and Grimmond, 2014b), it is not analysed here given the accumulation of errors (Grimmond and Oke, 1999) and the potential mismatch between the measurement footprints of the turbulent fluxes and the radiation fluxes (Schmid et al., 1991). The modelled SW down (Figure 2b) has a mean bias of 10.8 W/m² for the two days, with a maximum bias of 37.0 W/m² around noon/afternoon. One source of this bias could be the lack of sufficient aerosol loading in the model. Observational uncertainty, such as dust accumulated on the radiation sensor, could also cause a decrease in the short-wave radiation signal measured by the instrument. Given the expected radiometer accuracy of ±10% of the measured values (Kotthaus and Grimmond, 2014a), it is difficult to identify the source of this deviation. The on-line LW down radiation is underestimated by up to −14.0 W/m², with a mean bias of −4.0 W/m². The reason for this underestimation could be a cold bias in the vertical profile of temperature (compared to observations) or a lack of sufficient water vapour in the atmosphere. These model biases are most likely linked to a combination of inaccurate forcing and deficiencies in the parametrization scheme, according to Kleczek et al. (2014).
The on-line case generally underestimates Q * during the daytime, with a peak bias of −49.8 W/m², while at night there is a small overestimation of up to 11.0 W/m², leading to a mean bias of −13.2 W/m² (Figure 2a). Off-line Q * has daytime and night-time peak biases of −66.0 W/m² and 13.1 W/m², respectively, resulting in a slightly larger mean bias of −19.6 W/m². These Q * model biases are similar in magnitude to those reported by Loridan et al. (2013) in their off-line and on-line modelling studies for the KSS site. One source of error for the daytime modelled Q * is the positive bias in modelled SW up (up to 37.1 W/m²) compared to the observed SW up. This indicates a potential overestimation of the bulk surface albedo, which could be caused by a mismatch between the model description and the actual physical characteristics of the urban canopy in the radiometer source area. As stated by Kotthaus and Grimmond (2014b), the effective albedo at this site varies both with solar elevation and azimuth angle (i.e., it changes in time). This effect is not captured in the model and could explain some of the error. The reflection effect in Q * is more evident between 1000 UTC and 1600 UTC. Another constant error in Q * is linked to lower LW down. Finally, the long-wave upward flux (LW up) calculated from the model contributes around 20 W/m² of the negative bias during the daytime, while it also causes Q * to increase by 15 W/m² during night-time.
All modelled turbulent fluxes have substantial deviations from the observed values. On-line Q H is usually overestimated during the daytime (maximum bias of 150 W/m²) and underestimated at night-time (by up to −44 W/m²). Overall the mean overestimation is 35.5 W/m² for the two days. The off-line case has a lower bias, with a mean overestimation of 32.7 W/m² for the whole period. The observed fluctuations in Q E throughout the case study period make the evaluation of the modelled Q E challenging. Previous studies for the KSS area from Loridan et al. (2013) and Kotthaus and Grimmond (2014b) have reported similar variability in the measured Q H and Q E.
According to Kotthaus and Grimmond (2014a), variability in the turbulent transport near the KSS station could explain some of the variability in the observed fluxes. For wind directions 165° to 205°, which predominate in this case study, they found that the observed Q H tends to be substantially lower than for other wind directions. Changes in the source area or the generally lower friction velocity due to the River Thames might have contributed to a reduction in the Q H flux. Generally the source of Q E for this site is primarily from past rainfall, as the vegetation fraction is relatively low and the overall contribution from the River Thames (Kotthaus and Grimmond, 2014a) is not significant. During dry days, such as in this study period, the uncertainty of the small Q E flux is large. Moreover, given the eddy covariance uncertainty, the individual 30-min values should be handled with care.
On-line modelled air temperature (at 50 m) closely follows the diurnal cycle seen in the observations (Figure 3a). A cold bias is observed around noon (around −1.0 K) followed by a warm bias of equal magnitude late in the evening. The mean temperature bias is 0.2 K. The deviation between the modelled and observed temperatures could have many sources linked to the surface energy balance or larger-scale temperature advection. Underestimation of modelled Q * or lower anthropogenic heat flux might also explain this. U and V wind components have mean biases of 0.20 m/s and 0.04 m/s (Figure 3b), respectively. Changes in wind speed and wind direction are captured well by the model, with one exception. Between 39 hr and 45 hr in the model simulation a sudden change of wind direction occurs. This is linked to a sea-breeze intrusion over London (also found by Coceal et al., 2018). The SCM is unable to capture this event because the momentum advection is imposed in 6-hr blocks.
The diagnosed MLH top (Figure 3c) in the on-line model is similar to observations on the first day of the case study, but is underestimated by up to 500 m around 36-44 hr into the simulation. Overestimations in the diagnosed MLH occur primarily during late afternoon and early morning (up to 300 m). The mean underestimation for the two days is 28 m.
The underestimation during the second day is likely caused by the prescribed negative temperature advection, which is imposed up to a height of 1,000 m during the daytime. This could potentially impose an artificial inversion, which could be diagnosed as the MLH by the model.
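The bias statistics quoted in this section (mean bias and peak positive or negative bias) can be computed from paired model and observation series as in the minimal sketch below. The function names are hypothetical and the short series in the usage example are made-up numbers for illustration only, not data from the case study.

```python
def biases(model, obs):
    """Return the model-minus-observation differences for paired series."""
    if len(model) != len(obs) or not model:
        raise ValueError("model and obs must be non-empty and of equal length")
    return [m - o for m, o in zip(model, obs)]


def mean_bias(model, obs):
    """Mean of model minus observation (positive means overestimation)."""
    diffs = biases(model, obs)
    return sum(diffs) / len(diffs)


def peak_biases(model, obs):
    """Most negative and most positive bias in the series."""
    diffs = biases(model, obs)
    return min(diffs), max(diffs)


if __name__ == "__main__":
    # Illustrative values only (W/m^2).
    modelled = [10.0, 20.0, 30.0]
    observed = [12.0, 15.0, 30.0]
    print(mean_bias(modelled, observed))
    print(peak_biases(modelled, observed))
```

Reporting both the mean and the peaks matters here because daytime underestimation and night-time overestimation partly cancel in the mean.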

FEEDBACK MECHANISMS BETWEEN THE SURFACE AND THE OVERLYING ATMOSPHERE
Feedback mechanisms between the surface and the boundary layer may be negative or positive due to the interdependencies between variables (Figure 4). Here, positive feedback is considered when the increase/decrease in the actual value of a variable leads to the increase/decrease of another variable. The opposite effect is considered negative feedback.
An example of negative feedback is when an increase in Q H directly increases θ and the entrainment rate at the MLH top. Higher θ reduces the gradient to surface skin temperature (lower Δθ) and thus drives down Q H (Equation 4). The entrainment of air with higher θ could enhance Q H, but the heating due to entrainment is primarily directed towards increasing the MLH top rather than the near-surface θ. Moreover, the increase in θ will also increase LW down and therefore Q * and Q E, as relative humidity is reduced and evapotranspiration is thus enhanced.
Another feedback mechanism also exists between the increase of the MLH and the consequent entrainment of drier air in the mixed layer, which reduces q air and increases Q E (van Heerwaarden et al., 2009).
Moreover, any variation of a flux due to atmospheric feedback mechanisms can also affect the distribution of Q * to the other fluxes, thus creating an indirect effect. For instance, an increase in the skin temperature of the urban area will decrease Q * due to higher LW up and thus impact the turbulent heat fluxes as well.
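The negative feedback between Q H and θ can be illustrated with a deliberately simple slab model. This is a toy sketch under strong assumptions, not the WRF-SCM: the skin temperature is held fixed, the mixed layer has a constant depth, entrainment and all other fluxes are ignored, and every parameter value (including the product of exchange coefficient and wind speed, `ch_ua`) is a hypothetical placeholder.

```python
RHO = 1.2     # air density (kg/m^3), illustrative value
C_P = 1004.0  # specific heat capacity of dry air (J kg^-1 K^-1)


def run_slab(coupled, theta_skin=305.0, theta_air0=295.0, ch_ua=0.02,
             mixed_layer_depth=1000.0, dt=60.0, n_steps=360):
    """Return the Q_H time series (W/m^2) of a toy slab mixed layer.

    coupled=False mimics an off-line run: the air temperature is prescribed.
    coupled=True mimics an on-line run: Q_H warms the slab, which reduces
    the surface-to-air gradient and therefore Q_H itself.
    """
    theta_air = theta_air0
    q_h_series = []
    for _ in range(n_steps):
        q_h = RHO * C_P * ch_ua * (theta_skin - theta_air)
        q_h_series.append(q_h)
        if coupled:
            # Heating rate of a well-mixed slab of fixed depth.
            theta_air += q_h / (RHO * C_P * mixed_layer_depth) * dt
    return q_h_series


if __name__ == "__main__":
    off_line = run_slab(coupled=False)
    on_line = run_slab(coupled=True)
    print(off_line[-1], on_line[-1])
```

With the prescribed atmosphere the flux stays at its initial value, whereas in the coupled run it decays as the slab warms, so the same surface setting yields a smaller Q H once the feedback is active.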

The urban fraction
The urban fraction (f urb) affects the total surface energy balance (Figure 5). The observed normalized daytime Q * is 0.403 (or 366 W/m²). The off-line value decreases from 0.364 (329 W/m²) to 0.346 (313 W/m²) as f urb increases. The on-line set-up decreases slightly faster from 0.378 (348 W/m²) to 0.351 (324 W/m²). The decrease of Q * with increasing urban fraction is caused by the increased LW up in response to higher T skin (around 7.5 K off-line and 5.0 K on-line) for the urban compared to the vegetated surface, which outweighs the increased short-wave absorption from the slightly lower urban albedo (0.15 vs. 0.16 for vegetation). The more rapid decrease in on-line normalized Q * compared to the off-line version is due to the more rapid increase in normalized LW up (Figure S1g), a result of the increase in the on-line facet temperatures (by up to 1.2 K) with increasing urban fraction. The increased on-line facet temperatures are caused by a faster increase in ΔQ S in the on-line set-up, which is linked to the slower response of Q H because of the increase in θ (see section 5 and Figure 4). In the off-line set-up the facet temperature does not vary with changing f urb since atmospheric θ does not vary. The same mechanism explains the deviations between the on-line and off-line set-ups for night-time Q *.
Daytime normalized Q H (Figure 5) increases with increasing f urb. The on-line normalized Q H ranges between 0.185-0.228 (an increase of approximately 39 W/m²), while  (Figure 5e). The on-line Q H variation is 21% smaller than the off-line one. The slower increase in Q H is attributed to the faster decrease of Q * in the on-line set-up and the increase in near-surface θ with increasing f urb, which partially offset the increase of Q H due to the higher skin temperature (as indicated above). This is confirmed by the slower increase in the near-surface air temperature gradient (5.7-6.7 K) in the on-line set-up compared to the off-line one (8.3-11.0 K). An increase in Q H with increasing f urb is also reported by the modelling study of Loridan et al. (2013) and the observation study of Kotthaus and Grimmond (2014b). However, both also show that Q H is not only dependent on f urb but also on the difference in urban morphology found in the measurement source area for different wind directions, which makes the response of Q H to the urban fraction nonlinear.
Daytime ΔQ S undergoes a strong linear increase with f urb. In the on-line case ΔQ S ranges from 0.107 (99 W/m²) to 0.155 (142 W/m²), while in the off-line case it ranges from 0.118 (107 W/m²) to 0.158 (143 W/m²). This value range is very similar to the findings of Loridan et al. (2013) for London. The increase in ΔQ S is a direct response to the decrease in evaporation due to the lower vegetation fraction. A small variation in the response of ΔQ S exists between the two set-ups, indicating that the slower increase of the skin-to-air temperature gradient in the on-line set-up increases the heat transfer rate to the urban fabric.
Finally, Q E decreases from 0.10 (92 W/m²) to zero as the vegetation fraction decreases, because the SLUCM scheme does not have integrated vegetation and the green roof and anthropogenic latent heat options were both switched off for this experiment.

Albedo
Albedo governs the energy absorption of SW down and thus affects Q * and all surface energy fluxes.
The α roof has the strongest impact on normalized daytime Q * (Figure 6a), which varies from 0.316 to 0.391 in the on-line case. This is effectively a difference of 69 W/m² in Q *. The impact of α wall is limited to 0.035 (33 W/m²) and that of α road to only 0.011 (10 W/m²). The importance of α roof over α wall and α road for the correct estimation of Q * is consistent with the findings of Loridan et al. (2010) and Zhao et al. (2014). Both indicate that the SLUCM urban morphology makes the roof facet albedo critical for absorbed short-wave radiation, while the wall and road facets are less important for Q *. The difference in the Q * response between on-line and off-line only occurs for α roof. This is a small increase (0.005 of normalized Q *) with higher albedo (Figure 6a).
Three mechanisms lie behind these differences in Q *. As albedo increases, LW down decreases (up to 7 W/m²) due to the lower air temperature and moisture content in the mixed layer. On the other hand, SW down increases (up to 7 W/m²) because of increased SW up, which is reflected by the atmosphere back towards the surface (Figure S2a). This mechanism is included in the RRTMG short-wave scheme used. It is based on the two-stream approach of Oreopoulos and Barker (1999). This finding demonstrates the need to include atmospheric feedback in the model during sensitivity tests, and also that multiple reflection radiation schemes (e.g., Iacono et al., 2008) should be selected over the simpler radiation schemes (e.g., Dudhia, 1989) during optimization of albedo values. On-line LW up decreases faster than the off-line one with increasing albedo because of the more rapid decrease in the facet's skin temperature in the on-line model (Figure S2g). This feedback is linked to the decrease in near-surface θ with increasing albedo.
Effectively, only changes in α roof affect the on-line Q H, with the normalized values varying from 0.229 (211 W/m²) to 0.173 (159 W/m²) (Figure 6c). The off-line simulation ranges from 0.225 (204 W/m²) to 0.153 (138 W/m²). Both have a strong linear response, as Zhao et al. (2014) found in their off-line SLUCM study. The decrease of Q H is primarily caused by the decrease of Q * as albedo increases. A difference of 0.016 (15 W/m²) in the Q H response can be seen between the on-line and off-line model results for changing albedo. As explained in section 5, the negative feedback mechanism between θ and Q H (see Figure 4) limits the increase of on-line Q H caused by albedo. Moreover, due to higher Q H there is more entrainment at the MLH top, which increases θ but most likely has a minimal effect on the near-surface temperature since the heating from the entrainment is used to increase the MLH. Furthermore, the increase in evaporation (Figure 6g), which is linked to atmospheric feedback, is a limiting factor for the increase of Q H in response to decreased albedo. This feedback results in a 21% lower variation in Q H in the on-line case compared to the off-line one. Night-time Q H has very small dependency on daytime albedo.
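The 21% figure can be cross-checked from the Q H end-point values quoted in this paragraph. The sketch below uses hypothetical helper names; the only inputs are the four values in W/m² given in the text for the roof albedo experiment.

```python
def flux_range(values):
    """Spread between the largest and smallest flux in a sensitivity experiment."""
    return max(values) - min(values)


def relative_reduction(online_values, offline_values):
    """Fractional reduction of the on-line flux range relative to the off-line one."""
    offline_range = flux_range(offline_values)
    return (offline_range - flux_range(online_values)) / offline_range


if __name__ == "__main__":
    # Daytime Q_H end points (W/m^2) for varying roof albedo, as quoted in the text.
    online_q_h = [211.0, 159.0]
    offline_q_h = [204.0, 138.0]
    print(flux_range(online_q_h), flux_range(offline_q_h))
    print(round(100.0 * relative_reduction(online_q_h, offline_q_h), 1))
```

The on-line range is 52 W/m² against 66 W/m² off-line, a reduction of roughly 21%, in line with the statement in the text.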
Variations in the modelled normalized ΔQ_S during daytime show a maximum range of 0.015 (14 W/m²), expressed as a fraction of the mean daytime downward radiation. Both cases show a similar sensitivity of ΔQ_S to wall and road albedo. However, between the off-line and on-line cases there is an increasing difference (up to 0.013) in the ΔQ_S flux for higher α_roof. The most likely explanation lies in the different energy partitioning between Q_H and Q_E, which moderates the ΔQ_S flux (see Figure 4). The decrease in evaporation in particular could lead to higher ΔQ_S fluxes, an effect also suggested by Loridan et al. (2013). Roof and wall albedo also contribute to variations of up to 0.010 (3.5 W/m²) in ΔQ_S during night-time. However, this difference is small compared to the mean night-time ΔQ_S values (60-65 W/m²).
Atmospheric feedback also influences the response of the Q_E flux (Figure 6g) to changes in the facet's albedo. The off-line case has no variation in Q_E because the atmospheric forcing is fixed and the Noah model calculates the surface energy balance separately for vegetation and urban tiles. In the on-line case, the feedback mechanisms alter the normalized Q_E from the vegetation tile. Two atmospheric feedback mechanisms that affect evaporation are described in section 5. From their combined effect we find a decrease of 0.05 (∼5 W/m²) for the normalized mean daytime Q_E for increasing albedo, nearly 11% of the mean daytime value. Thus, it is essential to include atmospheric feedback during the evaluation of urban surface models, especially those without integrated vegetation.

Thermal conductivity
Thermal conductivity changes the way energy is distributed in the urban fabric and the amount of emitted LW_up. Daytime and night-time Q* (Figure 7a,b) do not show any strong variation in response to changes in roof and road thermal conductivity. A noteworthy variation is the rate of change of Q* during night-time. It decreases sharply for low thermal conductivities (0.15-0.45 J K⁻¹ m⁻¹ s⁻¹), while above 0.45 J K⁻¹ m⁻¹ s⁻¹ a saturation effect or even a reversal of the slope (for the wall facet) occurs. The response of Q* is linked to the LW_up radiation from the surface and consequently the skin temperature of the facets. In both wall and roof facets the increase of conductivity up to 0.60 J K⁻¹ m⁻¹ s⁻¹ results in more energy being stored in the facets during the daytime, which also increases their night-time skin temperature (by about 1 K), resulting in an increased LW_up flux (Figure S3h). For higher conductivity values, the heat stored in the facet during the daytime does not increase as quickly, because it is limited by the heat capacity of the facet. Thus, the energy loss at night-time outweighs the energy gain during the daytime for conductivity values above 0.60 J K⁻¹ m⁻¹ s⁻¹, leading to a decrease in the facets' temperatures.
The increase in thermal conductivity also increases normalized ΔQ_S (Figure 7e). Variations in ΔQ_S due to changes in roof thermal conductivity range from 0.128 (118 W/m²) to 0.156 (143 W/m²) for the on-line case, and from 0.134 (121 W/m²) to 0.165 (149 W/m²) for the off-line case. At night, ΔQ_S ranges from −0.154 (−55 W/m²) to −0.171 (−61 W/m²) for the on-line set-up and from −0.160 (−57 W/m²) to −0.177 (−62 W/m²) for the off-line set-up. A similar response is found for increasing wall thermal conductivity. Above 0.45 J K⁻¹ m⁻¹ s⁻¹, normalized night-time ΔQ_S decreases, because the heat stored in the facet during the daytime is not enough to compensate for the loss during night-time (as explained above).
Normalized daytime Q_H (Figure 7c) decreases by 0.019 (∼17 W/m²) for the on-line model (from 0.212 to 0.193), while for the off-line model the decrease is 0.025 (∼23 W/m²). This decrease is linked to the increase in ΔQ_S (see Figure 7e): faster energy transmission into the urban fabric results in lower skin temperatures and hence a smaller Q_H flux (see the feedback in Figure 4). The difference in the response of normalized Q_H between the two set-ups is 0.006 (∼5 W/m²) and is attributed to the smaller variation in the skin-to-air temperature gradient in the on-line set-up compared to the off-line one.

Heat capacity
Much like thermal conductivity, heat capacity alters the amount of emitted LW_up and the energy partitioning at the surface.
The response of daytime normalized Q* to the heat capacity of the facets is the same as that observed for the conductivity experiment. However, during night-time some changes occur. Instead of abrupt changes in Q*, there is a more gradual change with increasing heat capacity (Figure 8b). For changes in C_roof, night-time Q* ranges from −0.169 (−59 W/m²) to −0.195 (−69 W/m²) for the on-line case. A similar range of variation is seen for the off-line case, which shows a higher Q* compared to the observed value of −0.19 (−67 W/m²). The levelling of Q* above a C_wall of 1.5 × 10⁶ J m⁻³ K⁻¹ is related to the nearly constant LW_up flux (Figure S4h). This indicates that there is no significant variation in the temperature increase of the facets past 1.5 × 10⁶ J m⁻³ K⁻¹, because the fixed value of the wall thermal conductivity limits the transfer of heat to (and from) the facet, thus reducing model sensitivity at higher C_wall. This interdependency between thermal conductivity and heat capacity is in agreement with the conclusion of Loridan and Grimmond (2012b) that combined effects need to be taken into account when optimizing surface parameter values in UCMs.
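The coupling between thermal conductivity and heat capacity can be illustrated with two textbook quantities for a homogeneous slab under sinusoidal diurnal forcing: the thermal admittance, sqrt(λC), and the damping depth, sqrt(2κ/ω) with κ = λ/C. The sketch below is only an idealized illustration and is not how the SLUCM solves heat conduction in its facet layers. The heat capacity of 1.5 × 10⁶ J m⁻³ K⁻¹ and the conductivity values are taken from the ranges discussed in the text, and the function names are hypothetical.

```python
import math

# Idealized homogeneous-slab relations, NOT the SLUCM multi-layer solver.
DIURNAL_PERIOD_S = 86400.0
OMEGA = 2.0 * math.pi / DIURNAL_PERIOD_S  # angular frequency of the diurnal cycle (1/s)


def thermal_admittance(conductivity, heat_capacity):
    """Thermal admittance sqrt(lambda * C) in J m-2 K-1 s-1/2."""
    return math.sqrt(conductivity * heat_capacity)


def damping_depth(conductivity, heat_capacity, omega=OMEGA):
    """Depth (m) at which a sinusoidal surface wave is damped to 1/e."""
    diffusivity = conductivity / heat_capacity  # kappa = lambda / C, in m2/s
    return math.sqrt(2.0 * diffusivity / omega)


heat_capacity = 1.5e6  # J m-3 K-1, value discussed in the text
for conductivity in (0.15, 0.45, 0.60):  # J K-1 m-1 s-1, range discussed in the text
    mu = thermal_admittance(conductivity, heat_capacity)
    depth = damping_depth(conductivity, heat_capacity)
    print(f"lambda={conductivity:.2f}: admittance={mu:.0f}, damping depth={depth:.3f} m")
```

Because both quantities depend on the product or ratio of λ and C, changing one parameter while the other is held fixed limits how far the storage response can move, which is consistent with the saturation behaviour described above.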
Normalized daytime Q_H decreases by up to 0.015 (∼14 W/m²) due to changes in C_roof and C_wall. The decrease of daytime Q_H is caused by the high skin-to-building temperature gradient and ΔQ_S flux. The difference in the response of Q_H to increasing heat capacity between the two set-ups is 0.004 (∼4 W/m²) and is linked to a faster decrease in the skin-to-air temperature difference in the off-line model compared to the on-line one as the roof heat capacity increases. Night-time Q_H shows the opposite response compared to the daytime Q_H flux (Figure 8d). The on-line case shows a minimal 0.010 (4 W/m²) increase in night-time Q_H with increasing C_roof and a 0.016 (6 W/m²) increase for increasing C_wall. For the off-line set-up this increase is 0.018 (6 W/m²). The smaller variation in Q_H flux for the on-line set-up is caused by the smaller variation of the skin-to-air temperature difference.
The mean daytime ΔQ_S flux increases with increasing heat capacity for all facets as a result of increased heat retention. The variation in ΔQ_S for the on-line set-up is 0.020 (18 W/m²) for changes in C_roof, and 0.027 (24 W/m²) for changes in C_wall; for the off-line set-up the increase is 0.024 (22 W/m²) and 0.030 (28 W/m²), respectively. There is a small increase in the difference between the two set-ups of around 0.004 (∼4 W/m²) for higher heat capacities. Night-time ΔQ_S decreases as heat capacity increases, and does so more slowly in the on-line set-up than in the off-line set-up. The difference is 0.003 (∼1 W/m²) for low C_roof values and 0.010 (∼3 W/m²) for high values. For the wall heat capacity the initial difference of 0.002 (∼1 W/m²) between the two set-ups grows to 0.008 (∼3 W/m²).

DISCUSSION
These results may differ from those of other urban surface schemes as a result of different model sensitivities and intensities of the feedback mechanisms. Thus, other schemes (ranging from simple "bulk" to more complex "multi-layer" urban canopy schemes) should be similarly studied in both on-line and off-line configurations.
Our on-line set-up used the MYNN boundary-layer scheme (Nakanishi and Niino, 2009) as it is well tested in combination with the MYJ surface-layer scheme (Janjić, 1994). During evaluation it produced better results for neutral conditions compared to the MYJ boundary-layer scheme and also represented the night-time conditions better compared to the Yonsei University (YSU) scheme (Hong et al., 2006). For the on-line set-up the MYJ surface layer was also tested with the YSU boundary-layer scheme. Although the results differ somewhat from the original on-line set-up, the feedback mechanisms are equally significant in altering the model's performance. However, given the differences in model sensitivity based on the boundary-layer scheme, further investigation of the interaction between boundary-layer schemes and the urban surface scheme would be useful.
Although both the off-line and on-line models use the same tiling approach to calculate the surface energy balance, we discovered that the net short-wave radiation for the vegetation tile is calculated differently for each of the two set-ups. In the off-line model, the net short-wave radiation for the vegetation tile is calculated using the albedo of the vegetation, while in the on-line model the weighted average of the albedo (urban and vegetation) of the tile is used. This can lead to a deviation in the calculated Q* for the vegetation tile and consequently the Q_H and Q_E fluxes. To ensure that both set-ups calculate fluxes in the same way, we decided to use the off-line approach in the on-line set-up as well, assuming that the vegetation is not influenced by variations in urban albedo.
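The size of the deviation described above can be sketched with the basic relation SW_net = (1 − albedo) × SW_down. The example below is a minimal illustration under stated assumptions: the weighting by urban fraction, the function names and all numerical values (albedos, urban fraction, incoming short-wave radiation) are hypothetical and are not taken from the model code or from the study.

```python
def net_shortwave(sw_down, albedo):
    """Net short-wave radiation absorbed by a surface (W/m2)."""
    return (1.0 - albedo) * sw_down


def tile_weighted_albedo(urban_fraction, albedo_urban, albedo_veg):
    """Area-weighted albedo, assuming weights equal to the tile fractions."""
    return urban_fraction * albedo_urban + (1.0 - urban_fraction) * albedo_veg


# Hypothetical illustrative values, not taken from the study
sw_down = 800.0        # incoming short-wave radiation (W/m2)
albedo_veg = 0.20      # vegetation albedo
albedo_urban = 0.10    # effective urban albedo
urban_fraction = 0.8   # fraction of the grid cell covered by the urban tile

# Off-line style: the vegetation tile uses its own albedo
sw_net_vegetation_albedo = net_shortwave(sw_down, albedo_veg)

# On-line style: the vegetation tile uses the weighted albedo
weighted = tile_weighted_albedo(urban_fraction, albedo_urban, albedo_veg)
sw_net_weighted_albedo = net_shortwave(sw_down, weighted)

deviation = sw_net_weighted_albedo - sw_net_vegetation_albedo
print(f"vegetation albedo: {sw_net_vegetation_albedo:.1f} W/m2")
print(f"weighted albedo:   {sw_net_weighted_albedo:.1f} W/m2")
print(f"deviation:         {deviation:.1f} W/m2")
```

Under the weighted approach, any change in urban albedo shifts the energy available to the vegetation tile, which is why using the vegetation albedo in both set-ups removes this dependence.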
The off-line model uses observations from one location (KSSW) as the input, but for the on-line model runs multiple sources are required for the initial input and advection for temperature, moisture and momentum. While the effective forcing of the land surface model and UCM are quite similar for both (see section 4), the differences may affect the behaviour of the urban surface scheme. Therefore a series of tests was performed to assess whether the atmospheric forcing was influencing the model's response significantly during the sensitivity tests. In the off-line model we repeated all the facets' albedo sensitivity tests with forcing derived from the on-line model run with the default surface parameter configuration.
Furthermore, the on-line experiments for roof albedo were repeated without the advection of heat, moisture and momentum, and the results were compared with those for the default off-line experiment. In both cases we still observe differences in the sensitivity responses of the on-line and off-line set-ups.
The effects of facet emissivity on the surface energy balance were only minor (consistent with Loridan et al., 2010). The main effect of the changes in facet emissivity was a slight increase in the skin temperature of the facet and thus a slightly higher LW_up radiation from the surface. We did not find a direct increase in LW_up caused by an increase in the emissivity of the facets. The LW_up radiation calculated in the SLUCM is not transferred to the long-wave radiation scheme and is only used within the SLUCM scheme to calculate Q* for the urban tile. The radiation scheme instead calculates LW_up using the emissivity of the vegetation tile and the surface temperatures for the averaged urban and vegetation tiles. As a result, the radiation scheme and the urban surface model calculate different LW_up values. This mismatch affects only the atmosphere and not Q*, because the latter uses the LW_up calculated by the urban scheme. In this study, urban LW_up from both set-ups was calculated as in the SLUCM.
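The mismatch between the two LW_up estimates can be illustrated with a simple grey-body relation, LW_up = εσT⁴ + (1 − ε)LW_down. This is a deliberately simplified sketch: the SLUCM itself treats canyon radiation exchange in more detail, and the temperature averaging by urban fraction, the function name and all input values below are assumptions made for illustration only.

```python
STEFAN_BOLTZMANN = 5.670374419e-8  # W m-2 K-4


def grey_body_lw_up(emissivity, skin_temperature, lw_down):
    """Emitted plus reflected long-wave radiation for a grey surface (W/m2)."""
    emitted = emissivity * STEFAN_BOLTZMANN * skin_temperature ** 4
    reflected = (1.0 - emissivity) * lw_down
    return emitted + reflected


# Hypothetical illustrative values, not taken from the study
urban_fraction = 0.8
t_skin_urban = 305.0     # K
t_skin_veg = 298.0       # K
emissivity_urban = 0.90
emissivity_veg = 0.95
lw_down = 380.0          # W/m2

# Urban-scheme style: urban emissivity with the urban skin temperature
lw_up_urban_scheme = grey_body_lw_up(emissivity_urban, t_skin_urban, lw_down)

# Radiation-scheme style: vegetation emissivity with the tile-averaged temperature
t_skin_mean = urban_fraction * t_skin_urban + (1.0 - urban_fraction) * t_skin_veg
lw_up_radiation_scheme = grey_body_lw_up(emissivity_veg, t_skin_mean, lw_down)

print(f"urban-scheme style LW_up:     {lw_up_urban_scheme:.1f} W/m2")
print(f"radiation-scheme style LW_up: {lw_up_radiation_scheme:.1f} W/m2")
```

Because the two estimates use different emissivities and temperatures, they generally differ, and a change in facet emissivity alone does not propagate to the value seen by the radiation scheme.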
Other parameters used in the SLUCM could influence the model's performance. For instance, the thickness of each facet layer and the thickness distribution of the layers can affect the way heat is transferred to/from the urban fabric and thus affect the energy balance. Other parameters that affect the model performance include, among others, (a) the empirical coefficient a_k, used to calculate the roughness length for heat over the canyon from the roughness length for momentum (Kanda et al., 2007), (b) anthropogenic heat and latent heat, and (c) internal building temperature. Loridan et al. (2010) and Zhao et al. (2014) cover the sensitivity of most of the surface parameters for the off-line SLUCM model, but an on-line vs. off-line comparison for these parameters would also be useful.
Here, clear sky conditions were chosen to minimize meteorological influence and to maximize the impact of surface parameter changes. However, under other meteorological conditions (e.g., clouds or rain) the model response may differ. For instance, during cloudy days with less short-wave radiation, the predominant role of the facets' albedo will be dampened; therefore the heat capacity of the urban fabric and the urban fraction may be more dominant factors. During rain and immediately following it, water interception by impervious surfaces alters evaporation and the surface energy balance (e.g., Grimmond and Oke, 1991;Yang et al., 2016). As hydrological processes are implemented in SLUCM (Yang et al., 2015), the impact of changes to these parameters under varying weather conditions will be important to explore.
In this study we are concerned with the importance of atmospheric feedback and its representation in an NWP model. Like all models, it will have an incomplete representation of the atmospheric feedback mechanisms. For instance, in a realistic scenario, if the surface albedo of the urban area decreases, then the surface and air temperatures will increase, resulting in a stronger temperature gradient between the city and rural areas. This can increase advection of colder air and counteract the effects of decreasing albedo. Feedback of this kind is always present in the atmosphere and can be important for estimating and understanding model sensitivities. However, a full 3D representation of the atmosphere would be required to explicitly resolve and study it.

CONCLUSIONS
The model behaviour of two Noah-SLUCM set-ups, one off-line and one coupled to the atmosphere, is investigated. After evaluation for a two-day summertime period in London we varied a series of parameters, (a) urban fraction, (b) surface albedo, (c) thermal conductivity and (d) heat capacity of the urban facets, in order to assess their effects on the surface energy balance. We identified several differences in the model response between the two set-ups, which we attributed to various feedback mechanisms between the surface and the overlying atmosphere.
The model evaluation revealed that the on-line set-up performed well at capturing Q*, SW_down and LW_down, with small variations compared to the observations. Both on-line and off-line models show large discrepancies for Q_H and Q_E, due to shortcomings of the model and measurement uncertainties. Air temperature, wind speed and wind direction are relatively well represented in the on-line case. Boundary-layer height is well simulated on the first day of the case study period, but is underestimated on the second day.
A substantial sensitivity of Q* and turbulent fluxes to surface parameters is reported. During the daytime, the urban fraction and albedo are the primary contributors to variations in Q*, Q_H and Q_E, while heat capacity and thermal conductivity greatly affect the ΔQ_S flux. At night-time, the urban fraction, heat capacity and thermal conductivity show stronger effects on Q*, Q_H and the ΔQ_S flux, while variations in albedo have a small effect.
Finally, there are some distinct differences in the sensitivity of the on-line and off-line set-ups, which have been attributed to feedback mechanisms between the surface and the atmosphere. Depending on the surface parameter, the effects of the atmospheric feedback mechanism can outweigh the variation due to the surface parameter change. Overall, Q* is not significantly affected by atmospheric feedback mechanisms. The effects are most pronounced for Q_E and ΔQ_S, where indirect atmospheric feedback can account for nearly 100% and 50% of the reported variability, respectively. Feedback mechanisms also decrease the changes in daytime Q_H by up to 22%. Thus we recommend taking feedback between the surface and the atmosphere into consideration when evaluating the performance of urban canopy models for the aforementioned variables.
for providing operational analysis data, the University of Wyoming for the radio-sounding data and NOAA for providing the data for the WMO stations, all of which were used to create the forcing for the on-line model. We also would like to thank all the co-authors of the SUBLIME case study which is utilized in this article.

ORCID
Sue Grimmond https://orcid.org/0000-0002-3166-9415