# Distributed urban drag parametrization for sub-kilometre scale numerical weather prediction

**Funding information:** EPSRC Mathematics of Planet Earth Centre for Doctoral Training, EP/L016613/1; the UK–China Research & Innovation Partnership Fund through the Met Office Climate Science for Service Partnership (CSSP) China as part of the Newton Fund

## Abstract

A recently developed, height-distributed urban drag parametrization is tested with the London Model, a sub-kilometre resolution version of the Met Office Unified Model over Greater London. The distributed-drag parametrization requires vertical morphology profiles in the form of height-distributed frontal-area functions, which capture the full extent and variability of building heights. London's morphology profiles are calculated and parametrized by an exponential distribution with the ratio of maximum to mean building height as the parameter. A case study evaluates the differences between the new distributed-drag scheme and the current London Model setup using the MORUSES urban land-surface model. The new drag parametrization shows increased horizontal spatial variability in total surface stress, identifying densely built-up areas, high-rise building clusters, parks, and the river. Effects on the wind speed in the lower levels include a lesser gradient and more heterogeneous wind profiles, extended wakes downwind of the city centre, and vertically growing perturbations that suggest the formation of internal boundary layers. The surface sensible heat fluxes are underpredicted, which is attributed to difficulties coupling the distributed momentum exchange with the surface-based heat exchange.

## 1 INTRODUCTION

Urban environments alter aerodynamic, radiative, thermal, and hydrological processes, which can intensify heat waves, flash floods, and air pollution. Accurate urban models are necessary for better warnings of severe weather hazards and to improve weather forecasts and services in the most populated areas in the world. Moreover, urban climate models are crucial for planning how to adapt cities for more extreme weather and how to transform them to be more sustainable and resilient in a changing climate.

Parametrizations for urban environments in numerical weather prediction (NWP) or climate models aim to represent the effects of the built-up system without resolving it explicitly. Numerous urban schemes at different complexities exist (Grimmond *et al*., 2010). The schemes are commonly based on Monin–Obukhov similarity theory to calculate exchange fluxes between the atmosphere and the surface, where urban environments are represented by an increased roughness length that modifies surface friction and heat exchange. Single-layer canopy models are the most widely used type of urban scheme (Garuma, 2018). The urban morphology is represented in these schemes by a simple street-canyon geometry with variable height-to-width ratios, which parametrizes most physical processes in the urban canopy such as different flow regimes, shadowing, and radiation trapping (e.g., Porson *et al*., 2010a). The urban environment is effectively seen as a “bulk surface”, which interacts with the atmosphere close to the canopy top, at the level of the displacement height.

However, the use of these models is increasingly challenged by (a) the growing number of high-rise buildings in cities, which protrude deep into the atmospheric boundary layer, (b) increasing vertical and horizontal resolution of models (Lean *et al*., 2019), so that the resolution scale becomes less than the surface-feature scale instead of vice versa, and (c) the heterogeneity of urban environments across scales, the effects of which become more apparent as grid resolutions increase (Barlow *et al*., 2017). Individual tall buildings cause wake effects that are not represented by the idealised canyon geometry, and deep urban canopies, such as in central London or Asian megacities, displace and modify the airflow at heights well above the lowest atmospheric level (Hertwig *et al*., 2021). As regional NWP models are increasingly used with sub-kilometre horizontal resolutions (e.g., Boutle *et al*., 2016; Lean *et al*., 2019), heterogeneous urban neighbourhoods, from high-rise building clusters to parks, become resolved by the individual grid boxes. Assumptions behind logarithmic wind profiles are no longer valid, as urban fetches become very short due to the increased heterogeneity and the flow is never adjusted fully with the underlying surface.

The vertical depth of the urban canopy and heterogeneity between the surface grid boxes can be addressed by multilayer canopy models that interact with the atmosphere over several vertical levels (e.g., Martilli *et al*., 2002). The lowest atmospheric level is situated at ground level (without the displacement height) and wind profiles are modelled by a drag force instead of surface friction. Sützl *et al*. (2021) developed a height-dependent canopy model for urban drag based on large-eddy simulations of idealised heterogeneous urban neighbourhoods. Unlike most models, buildings are not represented by a simple street-canyon geometry but by a detailed vertical morphology profile, which considers the tallest buildings and the height variability between the buildings and therefore also captures the subgrid heterogeneity of the urban environment. This article describes a case study combining the distributed urban drag model from Sützl *et al*. (2021) with the Met Office Unified Model (UM: Davies *et al*., 2005) in a high-resolution configuration with a limited domain over Greater London and surrounding areas.

This study describes a step towards developing a new multilayer scheme: at this point we only consider momentum exchanges, with a focus on the representation and effects of heterogeneous subgrid morphology. We describe in detail how the additional morphology information is acquired and processed. Section 2 describes how the distributed-drag model is incorporated into the UM and the setup for the case study. Vertical morphology profiles of urban areas are calculated and analysed from 1-m resolution building data, alongside updated urban morphology parameters of plan-area index ${\lambda}_{\mathrm{p}}$ (buildings plan area per grid-box plan area), frontal-area index ${\lambda}_{\mathrm{f}}$ (buildings frontal area per grid-box plan area), and mean building height ${z}_{\mathrm{H}}$ per grid box (Section 3). A model comparison between the standard model setup, the model with the updated urban morphology information, and the model including the distributed-drag scheme (Section 4) is conducted. The results are analysed in Section 5. A discussion on the implications of the distributed-drag model (Section 6) is followed by concluding remarks (Section 7).

## 2 MODELS, OBSERVATIONS, CASE STUDY

### 2.1 The London Model

The London Model (LM: Boutle *et al*., 2016) is a very high resolution version of the Met Office Unified Model. The model domain covers a region of 125 $\times $ 140 km${}^{2}$ and extends over Greater London and surrounding areas (Figure 1a) at a horizontal grid length of 0.003° (approximately 333 m). The initial and boundary conditions are from the operational Met Office forecast model for the United Kingdom (UKV) at 1.5-km horizontal resolution (Tang *et al*., 2013). The vertical resolutions of the UKV and the London Model are identical, and both model convection explicitly. The numerical setup of the LM follows the RAL1-M science settings as described in Bush *et al*. (2020).

Urban areas in the UM are currently modelled by the Joint UK Land Environment Simulator (JULES: Best *et al*., 2011; Clark *et al*., 2011), which parametrizes surface subgrid-scale processes and their exchange with the atmosphere. The model separates different surface types within a grid box into tiles (e.g., different vegetation types, water, soil), where subgrid processes are calculated in parallel for each tile with its own defined properties. The results of these calculations are combined into surface exchange fluxes by weighting the contributions from each tile by their grid-box fraction. The surface fluxes are coupled to the atmosphere implicitly to maintain balance between the land surface and the atmosphere (Best *et al*., 2004). For the regional models UKV and LM, the calculation of urban properties within JULES is covered by the single-layer Met Office-Reading Urban Surface Exchange Scheme (MORUSES: Porson *et al*., 2010a; 2010b; Bohnenstengel *et al*., 2011; Bohnenstengel and Hendry, 2016). MORUSES represents urban areas using two tiles: one for the roof area of the buildings, and one for the street canyon. Material properties of building and road surfaces are fixed, but the street-canyon geometry may vary in each grid box. MORUSES calculates properties such as the heat capacity, albedo, and surface roughness length based on the varying geometry.
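The tile aggregation described above can be sketched in a few lines of Python. This is a minimal illustration and not JULES code: the function name `gridbox_flux`, the choice of four tiles, and all numerical values are hypothetical.

```python
def gridbox_flux(tile_fractions, tile_fluxes):
    """Combine per-tile surface fluxes into a single grid-box flux.

    Each tile's flux is weighted by the fraction of the grid box it covers.
    """
    if len(tile_fractions) != len(tile_fluxes):
        raise ValueError("need exactly one flux per tile")
    if abs(sum(tile_fractions) - 1.0) > 1e-9:
        raise ValueError("tile fractions must sum to 1")
    return sum(f * q for f, q in zip(tile_fractions, tile_fluxes))


# Illustrative values only (not model output): roof, canyon, grass, water.
fractions = [0.3, 0.4, 0.2, 0.1]
sensible_heat = [120.0, 80.0, 40.0, 10.0]  # W m^-2
print(gridbox_flux(fractions, sensible_heat))
```

In this picture the two MORUSES tiles (roof and street canyon) simply enter the weighted sum alongside the non-urban tiles.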

### 2.2 Distributed urban drag model

A recently developed model for the contributions of urban areas to wind stress is based on building-resolving large-eddy simulations in neutrally stable conditions (Sützl *et al*., 2021). The canopy model represents building effects by a height-distributed frontal area, which allows for a detailed representation of the vertical structure of the urban morphology. This representation is particularly important for areas containing high-rise buildings or grid boxes with large subgrid heterogeneity, because the tallest buildings impose a disproportionally large amount of drag on the flow (Xie *et al*., 2008).

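Sützl *et al*. (2021) defined a (dimensionless) normalised frontal area as

$$\zeta (z)=\frac{1}{{A}_{\mathrm{F}}}{\int }_{z}^{{z}_{\mathrm{max}}}L({z}^{\prime })\,\mathrm{d}{z}^{\prime }\qquad (1)$$

at height *z*. Here, ${A}_{\mathrm{F}}$ is the total frontal area of all buildings in a grid box, and ${z}_{\mathrm{max}}$ is the height of the tallest building. The integrand $L(z)$ in Equation 1 represents the height-dependent total width of the buildings, which is the distribution of the surface area of buildings with height *z*, such that

$${A}_{\mathrm{F}}={\int }_{0}^{{z}_{\mathrm{max}}}L(z)\,\mathrm{d}z.\qquad (2)$$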

To keep the morphology representation relatively simple, $L(z)$ does not depend on wind direction but instead represents the average surface from all wind directions as discussed in Appendix A. The normalised frontal area has $\zeta =0$ at the maximum building height and $\zeta =1$ at ground level, and represents the urban morphology as a nonlinear function in between these heights (see Figure 1d for examples of ${\lambda}_{\mathrm{f}}$-scaled $\zeta (z)$ profiles). Only for grid boxes with buildings of uniform heights will $\zeta $ depend linearly on *z*.
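The definition can be illustrated numerically. The sketch below assumes that $\zeta (z)$ is the fraction of the total frontal area lying above height *z*, which is consistent with $\zeta =0$ at ${z}_{\mathrm{max}}$ and $\zeta =1$ at ground level. The function name, the trapezoidal integration, and the sample width profiles are illustrative choices, not taken from the model code.

```python
def normalised_frontal_area(z, width):
    """Return (A_F, zeta) for a width function L(z) sampled at heights z.

    z     : ascending heights from 0 to z_max (m)
    width : total building width L at each height (m)
    zeta[k] is the fraction of the frontal area located above z[k].
    """
    n = len(z)
    above = [0.0] * n  # integral of L from z[k] up to z_max
    for k in range(n - 2, -1, -1):
        dz = z[k + 1] - z[k]
        above[k] = above[k + 1] + 0.5 * (width[k] + width[k + 1]) * dz
    total = above[0]
    if total <= 0.0:
        raise ValueError("grid box contains no frontal area")
    return total, [a / total for a in above]


heights = [0.0, 5.0, 10.0, 15.0, 20.0]

# Uniform building heights: L is constant, so zeta is linear in z.
area_uniform, zeta_uniform = normalised_frontal_area(heights, [100.0] * 5)
print(area_uniform, zeta_uniform)

# Heterogeneous heights: most of the width sits near the ground.
area_mixed, zeta_mixed = normalised_frontal_area(
    heights, [100.0, 100.0, 40.0, 40.0, 10.0]
)
print(area_mixed, zeta_mixed)
```

The heterogeneous profile gives a $\zeta $ that decreases faster than linearly in the lower half of the canopy, because most of the frontal area sits near the ground and only a small share remains at the upper levels.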

The distributed stress ${\tau}_{\mathrm{D},i}$ at time *t*, height *z*, for each horizontal component $i\in \{x,y\}$, per grid box, is

$${\tau}_{\mathrm{D},i}(t,z)={\tau}_{0,i}(t)\,s\left(\zeta (z)\right),\qquad (3)$$

where *s* is the drag distribution function. That is, the distributed stress ${\tau}_{\mathrm{D},i}$, when normalised by the total surface stress ${\tau}_{0,i}$, collapses on to a single function *s* that depends on $\zeta $ only. Equation 3 was shown to hold by analysing the data from a series of large-eddy simulations of idealised urban morphologies, each with identical surface-cover parameters ${\lambda}_{\mathrm{p}}$ and ${\lambda}_{\mathrm{f}}$, but with different building heights and street layouts (Sützl *et al*., 2021). The drag distribution function *s* is represented reasonably well by a third-order polynomial in $\zeta $ (Equation 4). That *s* is not a linear function of $\zeta $ is testament to the earlier statement that taller buildings are responsible for a larger proportion of the drag.

### 2.3 Modifications to the urban surface scheme

[…] *et al*., 2004; Lock *et al*., 2020) […] MacDonald *et al*. (1998) parametrization (Figure 2). The surface exchange of sensible heat, ${Q}_{\mathrm{H}}$, is a function of the lowest model level winds, temperature differences between the surface and the lowest model level, and the surface exchange parametrization ${C}_{\mathrm{H}}$, and has some dependence on atmospheric stability (Porson *et al*., 2010a).

This requires specification of $L(z)$ and ${z}_{\mathrm{max}}$. The Unified Model has a parametrization for distributed orographic drag (form drag induced by unresolved small-scale hills: Wood *et al*., 2001), which was used as the basis for this implementation. Since the urban drag model does not account for the material effects of urban areas, MORUSES parametrizations are used for the radiation and heat exchange.

To prevent applying urban form effects twice, that is, via an effective roughness length and a distributed form drag, the momentum roughness length ${z}_{\text{0m}}$ in MORUSES is reset to the value $\epsilon =1{0}^{-5}$ m representing a small material roughness length, so the JULES momentum exchange (Equation 6) becomes negligible. The roughness length for heat for urban surfaces, ${z}_{\text{0h}}$, remains unchanged as parametrized by MORUSES using the MacDonald *et al*. (1998) momentum roughness length and a resistance network (Harman *et al*., 2004; Porson *et al*., 2010a). We note that this setup causes an inconsistency for the heat-exchange coefficient ${C}_{\mathrm{H}}$ in JULES (see equation 19 in Porson *et al*., 2010a), which depends on both ${z}_{\text{0m}}$ and ${z}_{\text{0h}}$, and is calculated with the small momentum roughness length and the “normal” (MORUSES) heat roughness length. Unfortunately it is far from trivial to resolve this inconsistency in the code; however, its effects will be described in Section 5.3.

The urban drag parametrization and modifications to the momentum roughness length are applied to grid boxes with plan-area indices of ${\lambda}_{\mathrm{p}}>0.1$. This threshold includes suburban grid boxes at the periphery of Greater London and other large towns in the domain, but does not include individual buildings outside these urban patches. Figure 1 shows the highest (model-level) heights of the grid boxes in the London Model domain where the drag parametrization scheme applies, which are derived from the newly calculated ${\lambda}_{\mathrm{p}}$ and ${z}_{\mathrm{max}}$ data (see Section 3).
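The selection rule and the roughness-length reset can be summarised in a short sketch. The threshold ${\lambda}_{\mathrm{p}}>0.1$ and the value $1{0}^{-5}$ m are taken from the text; the function name, the list-based layout, and the sample values are hypothetical and do not reflect how the Unified Model stores its fields.

```python
MATERIAL_ROUGHNESS = 1.0e-5  # m, small material roughness length from the text
PLAN_AREA_THRESHOLD = 0.1    # plan-area index above which the scheme applies


def apply_distributed_drag_mask(lambda_p, z0m):
    """Flag grid boxes that use distributed drag and reset their z0m.

    Returns (use_drag, z0m_new). Boxes at or below the threshold keep
    their original momentum roughness length.
    """
    use_drag = [lp > PLAN_AREA_THRESHOLD for lp in lambda_p]
    z0m_new = [
        MATERIAL_ROUGHNESS if flag else z0
        for flag, z0 in zip(use_drag, z0m)
    ]
    return use_drag, z0m_new


# Illustrative grid boxes: rural, exactly at the threshold, dense urban.
flags, z0m_new = apply_distributed_drag_mask([0.05, 0.1, 0.3], [0.2, 0.5, 1.1])
print(flags, z0m_new)
```

Only the momentum roughness length is changed here; the heat roughness length is left to MORUSES, which is the source of the inconsistency noted above.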

### 2.4 Observations

Automatic Lidar and Ceilometer (ALC) observations from the London Urban Micrometeorological Archive (LUMA network: https://muhd.readthedocs.io) are used to support the model comparison. ALCs measure the light scattered back from aerosols. The distribution of aerosols in the atmosphere is a result of previous atmospheric mixing, and can therefore be used to characterise boundary-layer dynamics. Vertical profiles of aerosol backscatter are measured at the North Kensington (NK) and Marylebone Road (MR) sites in central London using Vaisala CL31 ceilometers. An automatic algorithm detects the mixed-layer height ${z}_{\text{ML}}$ from the attenuated backscatter profiles (Kotthaus and Grimmond, 2018). Kotthaus and Grimmond (2018) report good agreement between ${z}_{\text{ML}}$ inferred from backscatter profiles and the mixed-layer height inferred from observed temperature inversions for clear-sky summer days. The case-study date was therefore chosen to be a clear-sky summer day.

### 2.5 Case study

The case study period is 26 June 2018, with a 36-hr forecast run using the London Model starting from 25 June at 1800 UTC. This cloud-free period had easterly winds and an average wind speed of 5 m s${}^{-1}$ (London City Airport, at 5 m above ground level (agl)). Analysis of the model behaviour will focus on a 10 $\times $ 15 km${}^{2}$ area of central London (Figure 1c), which is densely built up and includes the historical centre and high-rise area of the City. The cluster of high-rise buildings of Canary Wharf is in the east. In the west there are three large parks: The Regent's Park, Hyde Park, and Battersea Park. Detailed analysis of 1 km${}^{2}$ (3 $\times $ 3 grid boxes) neighbourhoods includes the following (see Table S1): Oxford Circus (OC), City (CI), Shard (SH), Wapping (WA), Canary Wharf (CW), and Littlebrook Power Station (LB). The first five are within the central London area. While the City and Canary Wharf neighbourhoods contain high-rise buildings above 200 m and the Shard (London's tallest building) is 304 m high, neither the Oxford Circus nor the Wapping neighbourhood has buildings above 80 m height. Wapping, unlike Oxford Circus, is outside the highest density areas of London, situated between the City and Canary Wharf. Littlebrook Power Station (Figure 1b), a former coal-fired power station in the Thames estuary with a 211-m high chimney, provides a reference for an isolated tall building in a low-density urban area.

## 3 URBAN MORPHOLOGY OF GREATER LONDON

The London Model default urban morphology, described by the plan-area index ${\lambda}_{\mathrm{p}}$, frontal-area index ${\lambda}_{\mathrm{f}}$, and building mean height ${z}_{\mathrm{H}}$, is insufficient to use the distributed-drag parametrization, as it requires vertical morphology profiles $\zeta (z)$ for each grid box. The urban morphology profiles of Greater London and its surroundings are therefore calculated using Ordnance Survey (OS) data of all the individual buildings in this area (Section 3.1). For consistency, ${\lambda}_{\mathrm{p}}$, ${\lambda}_{\mathrm{f}}$, and ${z}_{\mathrm{H}}$ are also updated (Appendix A). Analysis is conducted to parametrize $\zeta (z)$ based on the ratio of the maximum to mean building height $r={z}_{\mathrm{max}}/{z}_{\mathrm{H}}$ (Section 3.2). The great advantage of this approach, apart from the insight it gives about the buildings in London, is that the only parameter needed in addition to those present in the London Model is the maximum building height ${z}_{\mathrm{max}}$.

### 3.1 Urban morphology calculations

Urban morphology datasets for the London Model domain are derived using the Ordnance Survey MasterMap Topography Layer – Building Height Attribute data at 1 m resolution, updated in April 2019 (Ordnance Survey (GB), 2019). The London Model domain contains approximately 9.4 million buildings. Data for around 5% of the domain area are missing, all of which are outside Greater London. For these regions, the original London Model input data are used and the maximum height is assigned a value of ${z}_{\mathrm{max}}=2.3{z}_{\mathrm{H}}$, which is the average ratio of maximum to mean building height of the OS data in the whole LM domain, for this resolution. (It was found in a limited study that the average ratio in central London increases as horizontal resolution decreases, see Figure S2.)
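The fallback for grid boxes without Ordnance Survey data amounts to a one-line rule, sketched below. The factor 2.3 is the domain-average ratio quoted above; the function name and the use of `None` to mark missing values are assumptions made for illustration.

```python
MEAN_HEIGHT_RATIO = 2.3  # domain-average z_max / z_H quoted in the text


def fill_missing_zmax(z_mean, z_max):
    """Replace missing maximum heights with 2.3 times the mean height.

    z_mean : mean building height per grid box (m)
    z_max  : maximum building height per grid box (m), None where missing
    """
    return [
        MEAN_HEIGHT_RATIO * zh if zm is None else zm
        for zh, zm in zip(z_mean, z_max)
    ]


# Illustrative values: the first grid box has no building-height data.
filled = fill_missing_zmax([10.0, 8.0], [None, 25.0])
print(filled)
```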

For each building, the following steps are carried out:

- assign the building to a grid box(es), allowing it to be in multiple grid boxes;
- calculate the building's plan area from the building footprint;
- determine the wind-direction-averaged vertical building width function from the three-dimensional building shape (see Appendix A);
- add the building's width function to the total width $L(z)$ and the building's plan area to the total plan area ${A}_{\mathrm{P}}$ in the appropriate grid box(es);
- compare the building height with the current grid-box maximum and update ${z}_{\mathrm{max}}$ if necessary.

The grid-box profiles $\zeta (z)$ and parameters ${\lambda}_{\mathrm{p}}$, ${\lambda}_{\mathrm{f}}$, and ${z}_{\mathrm{H}}$ are determined after the calculations for all buildings (Appendix A).
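The per-building steps listed above can be sketched as a simple accumulation loop. The data layout is hypothetical: each building is assumed to arrive already split into the shares that fall in each grid box, with its width function sampled on a common vertical grid of spacing `DZ`. How buildings are actually split between grid boxes, and how ${z}_{\mathrm{H}}$ is computed, is described in Appendix A and is not reproduced here. As a further simplification, the full building height is applied to every grid box the building touches.

```python
from collections import defaultdict

DZ = 1.0  # vertical spacing of the discretised width function (m); assumed


def accumulate_morphology(buildings, n_levels):
    """Accumulate per-building contributions into grid-box totals.

    Each building is a dict with a "height" (m) and "parts": a mapping from
    grid-box id to {"plan_area": m^2, "width": [L at each level, m]}.
    """
    boxes = defaultdict(
        lambda: {"L": [0.0] * n_levels, "A_P": 0.0, "z_max": 0.0}
    )
    for building in buildings:
        for box_id, part in building["parts"].items():
            box = boxes[box_id]
            box["A_P"] += part["plan_area"]          # total plan area
            for k, width in enumerate(part["width"]):
                box["L"][k] += width                 # total width L(z)
            box["z_max"] = max(box["z_max"], building["height"])
    return dict(boxes)


def area_indices(box, box_area):
    """Return (lambda_p, lambda_f) for one grid box of plan area box_area."""
    frontal_area = sum(box["L"]) * DZ
    return box["A_P"] / box_area, frontal_area / box_area


# Two made-up buildings: a low block inside one grid box and a tower
# straddling two grid boxes in equal shares.
low_block = {
    "height": 10.0,
    "parts": {(0, 0): {"plan_area": 400.0, "width": [20.0] * 10}},
}
tower = {
    "height": 30.0,
    "parts": {
        (0, 0): {"plan_area": 50.0, "width": [5.0] * 30},
        (0, 1): {"plan_area": 50.0, "width": [5.0] * 30},
    },
}
boxes = accumulate_morphology([low_block, tower], n_levels=30)
lambda_p, lambda_f = area_indices(boxes[(0, 0)], box_area=333.0 ** 2)
print(boxes[(0, 0)]["A_P"], boxes[(0, 0)]["z_max"], lambda_p, lambda_f)
```

Deferring the normalisation until all buildings have been processed mirrors the text: the grid-box profiles and indices are only meaningful once every contribution has been added.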

### 3.2 Parametrizing London's morphology profiles

To reduce the complexity of the required model input, the vertical profiles $L(z)$ and $\zeta (z)$ are parametrized in terms of the mean building height ${z}_{\mathrm{H}}$ and maximum building height ${z}_{\mathrm{max}}$ of the grid box. It is possible to compare the vertical functions directly amongst the grid boxes by a change of variables $\widehat{z}=z/{z}_{\mathrm{max}}$, such that the rescaled and normalised $\widehat{L}(\widehat{z})/{A}_{\mathrm{F}}$ functions and $\zeta (\widehat{z})$ are defined on the interval $[0,1]$. The $\widehat{L}(\widehat{z})/{A}_{\mathrm{F}}$ profiles are categorised according to the ratio of the maximum to the mean building height of the corresponding grid box, $r={z}_{\mathrm{max}}/{z}_{\mathrm{H}}$ (cf. Figure S3).

The data are split into six bins: $r=1$, $r\in (1,2]$, $r\in (2,3]$, $r\in (3,4]$, $r\in (4,5]$, and $r>5$. The first, $r=1$, classifies grid boxes with uniform buildings, where the maximum height is equal to the mean height. Less than 1% of all London Model domain grid boxes fall into this category, and they have very low building densities (${\lambda}_{\mathrm{p}}<0.02$), usually with only one or two buildings in the area. Approximately half of the grid boxes have a building-height ratio $r\in (1,3]$; the remaining three categories contain the other 10% of the grid boxes that include buildings; overall, approximately 40% of the grid boxes do not include any buildings. The average building-height ratio is $r=2.3$. A high *r* is generally associated with a large heterogeneity in building heights. For example, grid boxes with high-rise building clusters like those in the City of London (CI, Figure 1c) have typical ratios of 6–8. The Shard (within SH, Figure 1c), the tallest building in the domain, is situated on the border of two grid boxes, which have values of $r=15$ and $r=17$. Some sparsely populated grid boxes with a single tall structure, for example, a transmission tower, also fall into the $r>5$ category.
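A minimal sketch of the binning is given below. The bin edges are those listed above and the representative ratios are the values the text assigns to the bins; the function name and the zero-based bin index are illustrative choices.

```python
import math

# Representative ratio for each bin, as quoted in the text.
REPRESENTATIVE_R = [1.0, 1.5, 2.5, 3.5, 4.5, 6.0]


def ratio_bin(z_max, z_mean):
    """Return the bin index (0 to 5) for the ratio r = z_max / z_mean.

    Bin 0 is r = 1, bins 1 to 4 are (1,2], (2,3], (3,4], (4,5],
    and bin 5 is r > 5.
    """
    if z_mean <= 0.0 or z_max < z_mean:
        raise ValueError("need z_max >= z_mean > 0")
    r = z_max / z_mean
    if r == 1.0:
        return 0
    if r > 5.0:
        return 5
    return math.ceil(r) - 1  # half-open bins (n, n+1] map to index n


# A grid box with the domain-average ratio of 2.3 falls in the (2,3] bin.
index = ratio_bin(23.0, 10.0)
print(index, REPRESENTATIVE_R[index])
```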

[…] *r*. The values $\tilde{r}=1,\ 1.5,\ 2.5,\ 3.5,\ 4.5,\ 6$ are chosen to represent the bins. Linear regression between $\tilde{r}$ and the empirical values $\tilde{\alpha}$ gives a linear approximation $\alpha (r)$.

The fit of the empirical coefficients to the building-height ratio works remarkably well (Figure 3b). The resulting profiles $L/{A}_{\mathrm{F}}(\widehat{z};\alpha (r))$ are estimated entirely based on *r* (Figure 3c). The functions capture well the gradual change from a uniform profile ($r=1$) to one that has its weight at lower $\widehat{z}$ values (i.e., relatively closer to the ground) with increasing *r* values. Repeated analysis with a larger number of bins yields very similar results, which suggests a robust relation between the building-height ratio *r* and the shape of the vertical functions $L(z)$. The averaged morphology profiles also remain invariant over different grid resolutions in a limited analysis of central London, suggesting the parametrization may be suitable across resolutions (Figure S2). The parametric function in Equation 8 is therefore an appropriate parametrization for the morphology profiles.

[…] *z* (with ${\lambda}_{\mathrm{f}}$ at ground level), which gives an indication of the magnitude of the stress applied at each height.

## 4 MODEL COMPARISON

### 4.1 Description of model configurations

Three different model runs are compared: (a) the *original* London Model with the current experimental configuration, (b) the *control* run with updated morphology input fields ${\lambda}_{\mathrm{p}},{\lambda}_{\mathrm{f}}$, and ${z}_{\mathrm{H}}$ derived from the 1-m resolution Ordnance Survey data, and (c) the *distributed drag* model run with the vertical parametrization of urban drag. Comparisons between (a) and (b) assess the effects of updated urban morphology inputs in MORUSES. Note that the land-cover fractions of the grid boxes have not been assessed (since the analysis did not include vegetation and other surface types) and the control run therefore does not represent a fully updated model configuration. This implies that changes to the plan area of buildings and aspect ratio of street canyons are not reflected in the tile weighting in JULES.

Comparisons between (b) and (c) assess the effects of distributing urban drag over several vertical levels. The simulations are also used to assess spatial variability, since urban drag is modelled differently between the two runs. Of all grid boxes in the London Model domain, 14% are above the threshold ${\lambda}_{\mathrm{p}}>0.1$, where the effective roughness formulation is replaced with the distributed-drag scheme (Figure 1a). Most commonly, the distributed-drag scheme affects the lowest two or three vertical model levels (up to 22 and 45 m agl, respectively), whereas in central London (Figure 1c) the scheme frequently exerts drag on the fourth model level (up to 75 m agl) and occasionally higher. It reaches the ninth vertical level around the tallest building in the domain.

Note that, in the comparison between (a), (b), and (c), the output variables are compared at equal vertical model levels. Although the bottom of the atmospheric model in MORUSES is conceptually situated above the displacement height (not above the ground, Hertwig *et al*., 2020), the model does not actually adjust the surface heights of the individual grid boxes, and the displacement heights therefore have no influence on the surface-layer dynamics. This inconsistency in the typical modelling approach makes it difficult to interpret the model output close to the surface. In the distributed-drag scheme, the bottom of the atmospheric model is at ground level, the building parametrization is immersed in the atmosphere, and the lower levels therefore represent profiles within the urban canopy.

The output variables compared are the following: the magnitude of the surface stress, ${\tau}_{0}(t,x,y)=\sqrt{{\tau}_{0,x}^{2}(t,x,y)+{\tau}_{0,y}^{2}(t,x,y)}$; the surface sensible heat flux ${Q}_{\mathrm{H}}(t,x,y)$; the wind speed $U(t,x,y,z)$ at all vertical model-level heights; and the mixed-layer height ${z}_{\text{ML}}(t,x,y)$. The outputs of ${\tau}_{0}$, ${Q}_{\mathrm{H}}$, and *U* at 10 m are averaged over 15-min time intervals. Wind speed at other heights and ${z}_{\text{ML}}$ are instantaneous outputs. A 24-hr time series is analysed for each model run, which discards the first 6 hr as spin-up. Outputs for the measurement sites (Section 2.4) and 1-km${}^{2}$ neighbourhoods (Section 2.5) are analysed as 3 $\times $ 3 grid-box averages. The central grid box contains the point of interest (i.e., measurement site or landmark). Note that the data location varies slightly by output variable, as the Unified Model uses a staggered Arakawa-C grid (Davies *et al*., 2005). Space averages over central London (Figure 1c) are denoted by $\langle \cdot \rangle $, 24-hr time averages by $\overline{\,\cdot \,}$.
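The diagnostics defined above can be sketched as follows. The helper names and the nested-list field layout are hypothetical, the sample numbers are made up, and the sketch ignores the grid staggering mentioned in the text by assuming both stress components are available at the same points.

```python
import math


def stress_magnitude(tau_x, tau_y):
    """Magnitude of the surface stress from its two horizontal components."""
    return math.hypot(tau_x, tau_y)


def neighbourhood_mean(field, i, j):
    """Mean over the 3 x 3 block of grid boxes centred on interior point (i, j)."""
    values = [
        field[a][b]
        for a in range(i - 1, i + 2)
        for b in range(j - 1, j + 2)
    ]
    return sum(values) / len(values)


def time_mean(series):
    """Plain average of a time series of equally spaced outputs."""
    return sum(series) / len(series)


# Illustrative numbers only (N m^-2), not model output.
tau = stress_magnitude(0.3, 0.4)
field = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
centre_mean = neighbourhood_mean(field, 1, 1)
daily_mean = time_mean([0.2, 0.4, 0.6])
print(tau, centre_mean, daily_mean)
```

Applying `neighbourhood_mean` before or after `time_mean` gives the same result, since both are linear averages.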

### 4.2 Matching the drag coefficient

A test run with the resulting drag coefficient ${c}_{\mathrm{D}}=0.067$ gave a space- and time-averaged surface stress over the central London area, $\langle \overline{{\tau}_{0}}\rangle $, slightly exceeding that of the control run, so the coefficient was changed to ${c}_{\mathrm{D}}=0.06$. This yielded an almost exact match, where $\langle \overline{{\tau}_{0}}\rangle $ of the control and distributed-drag runs differ by less than 0.01% (for comparison, $\langle \overline{{\tau}_{0}}\rangle $ of the original run is about 13% lower). The space-averaged surface stresses are also similar over time, as shown by the $\langle {\tau}_{0}\rangle (t)$ time series for the three model runs (Figure 6a). Note that using Equation 11 with the wind speed at the lowest level height (*z* = 2.5 m) yields ${c}_{\mathrm{D}}\approx 1$. This indicates the close link between surface winds and boundary-layer stresses through the implicit coupling of JULES and the atmospheric model component.

### 4.3 Changes in input morphology

The urban morphology input data for the Unified Model are from the Institute of Terrestrial Ecology (ITE: Bunce *et al*., 1990). The plan-area index, frontal-area index and mean building height are derived from the “urban” (i.e., impervious) land-cover fraction ${F}_{\mathrm{U}}$ of a grid box using empirical relations (Bohnenstengel *et al*., 2011; Bohnenstengel and Hendry, 2016). The relations are derived from high-resolution building data of London (Evans *et al*., 2006) at 1 km grid resolution. With ${\lambda}_{\mathrm{p}},{\lambda}_{\mathrm{f}}$, and ${z}_{\mathrm{H}}$ all functions of ${F}_{\mathrm{U}}$, the three parameters are linked and reduced to one degree of freedom. The original run uses these morphology datasets. The new control-run datasets are derived directly from the 1-m resolution Ordnance Survey building data.

Figure 4 shows the frequency distributions of the original and new morphology grid-box values ${\lambda}_{\mathrm{p}},{\lambda}_{\mathrm{f}}$, ${z}_{\mathrm{H}}$, and ${z}_{\mathrm{max}}$ in central London. The frequency distributions of the plan-area index ${\lambda}_{\mathrm{p}}$ (Figure 4a) are largely similar between the original and new data, with differences only evident in grid boxes with ${\lambda}_{\mathrm{p}}$ < 0.1 and around the mean value (original = 0.3; new = 0.28). The maximum value increases slightly from 0.61 to 0.66. The frequency of low ${\lambda}_{\mathrm{p}}$ grid-box values is resolution-dependent (higher resolutions increase the number of grid boxes that contain very few buildings, cf. Figure S4). Using the 1-km based empirical relations on the London Model grid therefore underestimates grid boxes with very low ${\lambda}_{\mathrm{p}}$ in the original data. Spatially, the original and new plan-area indices of central London appear relatively similar overall (Figure S5). The new data have a larger cluster of high-density grid boxes north of the river Thames; the original data have more individual grid boxes with high ${\lambda}_{\mathrm{p}}$ values outside the most densely built-up area.

The new frontal-area index distribution (Figure 4b) differs from the original data distribution, with a larger mean (original = 0.22; new = 0.35), the maximum reaching 1.2 (0.5 previously), and a much broader distribution of values. The new frontal-area indices are higher overall, and much higher in the most densely built-up areas north of the river and at the high-rise buildings of Canary Wharf (Figure 5). Both datasets show low frontal-area indices along the river and at the three large parks in the west. The higher overall values of the new data may partly be attributed to the definition we applied for the frontal-area index, which estimates higher frontal areas than other methods (see Appendix A); however, the data also clearly reflect the vast increase in high-rise buildings in London over the last decades.

The average value for the mean building height ${z}_{\mathrm{H}}$ increases from 9 to 11 m with the new data (Figure 4c). The distribution has a longer tail, with more grid boxes with higher ${z}_{\mathrm{H}}$ in the new dataset. The maximum ${z}_{\mathrm{H}}$ is 59 m (up from 17 m). Decoupling the mean-height calculations from the urban land-cover fraction results in considerable spatial differences between the original and new ${z}_{\mathrm{H}}$ data (Figure S6). The new data indicate three areas with high average building heights (Figure 1c shows locations): the City (area around CI), Canary Wharf (around CW), and a narrow band southwest of the centre. This area contains the Battersea Power Station development site, which had few completed buildings in 2019. Unlike the original data (linked to impervious fraction), the river is not evident in the new ${z}_{\mathrm{H}}$, as these grid boxes have both river and buildings, which the ${z}_{\mathrm{H}}$ map now reflects. The distribution of the new input parameter, maximum building height ${z}_{\mathrm{max}}$, is of comparable shape to the ${z}_{\mathrm{H}}$ distribution of the new data (Figure 4d). The mean is 39 m and the maximum is 304 m.

## 5 CASE STUDY RESULTS

### 5.1 Surface stresses

Figure 6 shows the magnitude of the surface stress ${\tau}_{0}$ for the central London area. The space-averaged $\langle {\tau}_{0}\rangle (t)$ are largely similar over time for the three model runs, but the surface stress of the original run is consistently less than the other two model runs (Figure 6a). Spatially, the surface stresses have large variations over central London for the 24-hr time-averaged $\overline{{\tau}_{0}}(x,y)$ fields (Figure 6b–d). In the original run (Figure 6b), surface stresses across urban areas are relatively uniform and the spatial structure of London is only visible through the lack of built-up areas (i.e., large parks and the river). High surface stresses in the control run (Figure 6c) occur in Canary Wharf. Higher stresses than in the original run also occur in areas with comparatively low building density in southwest London. Open areas such as the river are not evident. The distributed-drag run (Figure 6d) has high surface stresses for Canary Wharf, the City, and over the most densely built-up area in the centre of London. Distinctively low stresses are found at the parks and river. Overall, the spatial variability of surface stresses is largest with the distributed-drag model, highlighting various local features, such that the spatial patterns of ${\tau}_{0}$ reflect the heterogeneity of the urban area.

The spatial variability between the different model runs stems from changes in morphology inputs and the different parametrizations for the surface stress equations (Equations 5 and 6). The surface stress of the distributed-drag run clearly correlates with the frontal-area index ${\lambda}_{\mathrm{f}}$ (cf. Figure 5b), which results from incorporating ${\lambda}_{\mathrm{f}}$ in the distributed-drag model (Equation 5). The surface stresses of the original and control run are closely linked to the momentum roughness length ${z}_{\text{0m}}$, which is the key element of the standard momentum-exchange parametrization (Equation 6). The roughness length, calculated in MORUSES with the MacDonald *et al*. (1998) parametrization, is most strongly correlated with the mean building height ${z}_{\mathrm{H}}$. The MacDonald *et al*. parametrization derives ${z}_{\text{0m}}$ as a fraction of ${z}_{\mathrm{H}}$, where ${\lambda}_{\mathrm{p}}$ and ${\lambda}_{\mathrm{f}}$ affect the proportionality. The ratio of ${z}_{\text{0m}}$ to ${z}_{\mathrm{H}}$ is higher for low values of ${\lambda}_{\mathrm{p}}$ and ${\lambda}_{\mathrm{f}}$, or low values of ${\lambda}_{\mathrm{p}}$ and high values of ${\lambda}_{\mathrm{f}}$.
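The dependence of ${z}_{\text{0m}}$ on ${z}_{\mathrm{H}}$, ${\lambda}_{\mathrm{p}}$, and ${\lambda}_{\mathrm{f}}$ can be illustrated with a minimal sketch of the MacDonald *et al*. (1998) relations. The constants used here ($A=4.43$, $\beta =1.0$, ${C}_{\mathrm{D}}=1.2$, $\kappa =0.4$) are the values commonly quoted for staggered cube arrays and are an assumption; the settings used in MORUSES may differ, and the function name is hypothetical.

```python
import math


def macdonald_z0m(z_h, lambda_p, lambda_f, a=4.43, c_d=1.2, beta=1.0, kappa=0.4):
    """Return (displacement height, momentum roughness length) in metres.

    Sketch of MacDonald et al. (1998); the default constants are assumed
    (staggered arrays) and are not taken from the London Model setup.
    """
    # Displacement height as a fraction of the mean building height
    d_ratio = 1.0 + a ** (-lambda_p) * (lambda_p - 1.0)
    # Dimensionless drag term that controls the roughness-length fraction
    drag_term = 0.5 * beta * c_d / kappa**2 * (1.0 - d_ratio) * lambda_f
    if drag_term <= 0.0:
        # No frontal area (or fully covered plan area): no form-drag roughness
        return d_ratio * z_h, 0.0
    z0_ratio = (1.0 - d_ratio) * math.exp(-(drag_term ** -0.5))
    return d_ratio * z_h, z0_ratio * z_h


# Illustrative inputs only: a 10 m mean height with moderate density
d, z0m = macdonald_z0m(z_h=10.0, lambda_p=0.3, lambda_f=0.3)
print(f"d = {d:.2f} m, z0m = {z0m:.2f} m")
```

Because ${z}_{\text{0m}}$ scales linearly with ${z}_{\mathrm{H}}$ in this form, grid boxes with high mean building heights obtain large roughness lengths even when building density is low.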

The original dataset values of ${z}_{\text{0m}}$ have little spatial variability (Figure S7). Non-urban features such as the parks and the river are most distinct. Roughness lengths from the new morphology inputs are higher and values have a wider range. In Canary Wharf, where ${z}_{\mathrm{H}}$ and ${\lambda}_{\mathrm{f}}$ are high and ${\lambda}_{\mathrm{p}}$ relatively low, the highest ${z}_{\text{0m}}$ is 12 m. Roughness-length values that are ten times higher than typical for urban areas seem to be a gross overestimation, although there are almost no measured values in high-rise areas to say this with any certainty (Grimmond and Oke, 1999). High ${z}_{\text{0m}}$ values also occur at the Battersea development site, despite low building densities (a result of high ${z}_{\mathrm{H}}$ combined with low ${\lambda}_{\mathrm{p}}$ and ${\lambda}_{\mathrm{f}}$). The correlation with ${z}_{\mathrm{H}}$ causes higher roughness lengths along the Thames river, such that the river is no longer distinct from other neighbourhoods.

The profound differences in surface stress between the runs demonstrate the importance of high-quality and up-to-date morphology data. However, the control-run ${\tau}_{0}$ fields show that more realistic mean building heights do not necessarily improve the parametrization. The data suggest a limit on the grid length in MacDonald *et al*. (1998), such that the roughness-length parametrization is less appropriate for high-resolution models like the LM.

### 5.2 Wind speed

Vertical profiles of wind speed $U(z)$ at 1900 UTC for the reference neighbourhoods illustrate representative features of the different model runs (Figure 7), because vertical mixing through convection is lower than during times with stronger solar radiation. The wind profiles of the original and control run are relatively similar in all neighbourhoods and have the typical boundary-layer profile of air flowing over a flat surface. The wind-speed profiles with the distributed-drag scheme are much more diverse. The City and Oxford Circus differ the most in the lower model levels compared with the other neighbourhoods. The distributed-drag wind profiles change gradually in height, as the wind speeds in the lower vertical levels are lower. Wind speeds become similar to the other two model runs at several hundred metres above the ground. Significant differences between the original and control run are only found for Canary Wharf, where the much higher ${\tau}_{0}$ in the control run is reflected in smaller wind speeds in the lower model levels. The effects of height distribution are particularly evident for this neighbourhood, because the distributed-drag wind speed is higher than in the control run near ground level, but below the control run between approximately 50 and 150 m. All model runs are similar for the isolated tall chimney of the Littlebrook Power Station.

The frontal-area index function ${\lambda}_{\mathrm{f}}\zeta (z)$ for the centre grid box of each neighbourhood illustrates the different underlying morphology profiles and stress magnitudes (via the ${\lambda}_{\mathrm{f}}$-scaling) used in the drag parametrization (Figure 7). The effects of height-distributed drag can go beyond the height of the parametrization $\zeta $, as the Oxford Circus case illustrates. The maximum building height in the neighbourhood is 62 m (37 m in the central grid box), but the distributed-drag wind speed does not approach the speed of the other model runs until 250 m model height. The high-density urban surroundings of Oxford Circus are likely contributing to the large effects on the wind.

Indeed, the effect of large-scale urban areas under the distributed-drag scheme is evident in Figure 8, which shows 24-hr time-averaged wind-speed fields $\overline{U}(x,y)$ at 10-m model height for central London. The distributed-drag run shows relatively high average wind speeds to the east side of the study area (Figure 8c), where easterly winds with high wind speeds along the Thames estuary approach central London. A large wake with low average wind speeds is found downwind of the city centre to the west. This spatial pattern suggests that local disturbances introduced by the distributed drag are advected and grow vertically in the downstream direction, forming an internal boundary layer. Therefore, with an easterly wind, a neighbourhood downwind of the city centre such as Oxford Circus experiences large impacts of the distributed-drag model.

In contrast, the spatial patterns of the original and control run largely resemble the patterns of the surface stresses (cf. Figure 6). The original run (Figure 8a) has higher average wind speeds over the parks and river, where friction is smaller, and the control run (Figure 8b) has particularly low wind speeds around Canary Wharf and the southwest, where friction is high (cf. Section 5.1). However, there is little evidence in the 10-m wind field of any extended wake regions, or the growth of internal boundary layers.

Figure 8d shows the change between the control and distributed-drag run for space-averaged wind speed $\langle U\rangle (t,z)$ in central London, giving a direct comparison between the standard parametrization and the distributed-drag scheme. The standard approach produces the desired effects on wind speeds by slowing down the wind speed at the lowest level and relying on diffusion to adjust the wind speed in the adjacent levels. With the building drag distributed over the full depth of the canopy, wind speeds are considerably higher at the lowest model level (often by more than 50%), but tend to be lower for model heights just above the lowest level and up to several hundred metres. A band of higher wind speeds on top of lower winds in the morning hours suggests a lower boundary-layer height for the distributed-drag run at these times. This is probably related to a reduction in the sensible heat flux (Section 5.3). At other times, there is no clear trend above the lowest few hundred metres and the relative difference becomes smaller at higher levels. This suggests that the distributed-drag forcing can produce the same effects in the upper levels, and at the same time produce wind profiles within the canopy that may be more realistic (or comparable with building-resolving simulations).

### 5.3 Mixed-layer height

Roughness and thermal effects of urban areas generate turbulence, which induces vertical mixing in the atmosphere. The height of the mixed layer ${z}_{\text{ML}}$ above cities is therefore an indication of how much turbulence is generated by surface processes. The model mixed-layer height is determined as the maximum of the surface-based mixed-layer height, which is the depth through which a positively buoyant parcel released at the surface would ascend, and the boundary-layer depth diagnosed by a threshold for the Richardson number, which marks the transition to the stable conditions of the capping inversion (Lock *et al*., 2020). Model mixed-layer heights are evaluated against mixed-layer heights estimated from ALC observations of aerosol-backscatter data. The observed mixed-layer heights at the two measurement sites in central London are largely similar in the morning and afternoon, and differ somewhat at their maximum height at midday. Reasonably good agreement between the observed and modelled mixed-layer heights is found, but the observations suggest that all simulations underestimate ${z}_{\text{ML}}$ (Figure 9). However, ${z}_{\text{ML}}$ values of the distributed-drag run are below those of the original and control run for the day investigated in this study. A typical error estimate for a measurement is around $\pm 10$ m, thus much smaller than the difference between observations and modelled ${z}_{\text{ML}}$.

The lower mixed-layer height in the distributed-drag run is likely caused by an underprediction of the surface sensible heat flux ${Q}_{\mathrm{H}}$. Recall that ${Q}_{\mathrm{H}}$ is modelled as a function of the lowest model level winds, temperature differences, and the surface heat-exchange coefficient ${C}_{\mathrm{H}}$. The distributed-drag scheme yields higher velocities than the standard MORUSES parametrization at the lowest level; however, the exchange coefficient ${C}_{\mathrm{H}}$ is based on the heat roughness length and the altered momentum roughness length ${z}_{\text{0m}}=\epsilon$ (cf. Figure 2), and ${C}_{\mathrm{H}}$ is therefore much lower than in the control run. As a result, surface sensible heat fluxes throughout the day are greatly reduced compared with the standard scheme (Figure S8). For example, ${Q}_{\mathrm{H}}$ at Canary Wharf (similarly Shard and Wapping neighbourhoods, not shown) is reduced to half at midday (Figure 10). A smaller but significant reduction is also found at Oxford Circus (similarly the City, not shown). Night-time surface sensible heat fluxes are more alike and can be even slightly higher in the distributed-drag run (e.g., at Oxford Circus).
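The competing effects of a higher lowest-level wind speed and a lower exchange coefficient can be sketched with the textbook bulk aerodynamic formula under neutral conditions. This is only an illustration and not the JULES implementation: stability corrections and displacement height are neglected, and all numerical values (reference height, roughness lengths, wind speeds, temperatures) are hypothetical.

```python
import math

RHO_AIR = 1.2    # air density in kg m-3 (assumed constant)
CP_AIR = 1005.0  # specific heat capacity of air in J kg-1 K-1
KAPPA = 0.4      # von Karman constant


def neutral_c_h(z_ref, z0m, z0h):
    """Neutral bulk heat-exchange coefficient from logarithmic profiles."""
    return KAPPA**2 / (math.log(z_ref / z0m) * math.log(z_ref / z0h))


def sensible_heat_flux(wind, t_surface, t_air, c_h):
    """Bulk sensible heat flux in W m-2."""
    return RHO_AIR * CP_AIR * c_h * wind * (t_surface - t_air)


# Hypothetical values: a large urban z0m versus a small residual z0m
c_h_standard = neutral_c_h(z_ref=10.0, z0m=1.0, z0h=0.001)
c_h_small_z0m = neutral_c_h(z_ref=10.0, z0m=0.01, z0h=0.001)

# Even with a 50% higher wind speed, the smaller coefficient dominates
q_standard = sensible_heat_flux(2.0, 300.0, 295.0, c_h_standard)
q_small_z0m = sensible_heat_flux(3.0, 300.0, 295.0, c_h_small_z0m)
print(q_small_z0m < q_standard)
```

In this simplified setting the reduction in ${C}_{\mathrm{H}}$ outweighs the gain in wind speed, which is qualitatively the behaviour described above.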

## 6 DISCUSSION

The results indicate that incorporating distributed drag has a profound effect on the wind in the lower atmospheric levels. The distributed-drag scheme causes much more variation in the wind speed over central London than the classical parametrizations. The pressure drag generated by clusters of tall buildings causes extensive wakes that reduce wind speeds over extended areas downstream.

However, the mixed-layer height was underestimated with the distributed-drag model, which was the result of the interaction of the distributed-drag scheme with the sensible heat flux prediction from JULES. This is a fundamental challenge for the next generation of land-surface schemes, since momentum and scalar (heat, moisture) exchange are intrinsically coupled; modelling drag in a distributed manner and heat solely at ground level is bound to lead to inconsistencies. Note that a similar conclusion was drawn for the distributed orographic scheme in a study on inland water (Rooney and Bornemann, 2013): while distributed orographic drag produces more realistic low-level winds than the orographic roughness scheme, it also produces unrealistic lower temperatures with the current model physics settings in JULES.

In order to obtain accurate predictions of sensible and latent heat fluxes with the current model, a recalibration of the scalar exchange coefficients would be required. However, the fundamental mismatch of having part of the physics represented in a distributed manner inside the model and part of the physics represented at ground level as a boundary condition to the model would remain. A “complete” multilayer model for urban areas will need to represent both the drag and scalar exchange within the atmospheric model in a distributed manner.

## 7 CONCLUSIONS

We present results from a trial implementation of a height-distributed urban drag scheme in the London Model high-resolution numerical weather prediction model. The distributed-drag scheme (Sützl *et al*., 2021) relies on a generalised frontal area $\zeta (z)$ to represent the urban morphology as a function of height. Analysis of the $\zeta (z)$ profiles for the Greater London area showed that $\zeta $ could be represented by an exponential distribution with the ratio between the maximum and mean building height (${z}_{\mathrm{max}}/{z}_{\mathrm{H}}$) as the independent variable, and the model therefore only requires ${z}_{\mathrm{max}}$ as additional input. This is remarkable, as it is not evident why London's morphology would be described by such a distribution. It would be intriguing to find out whether this distribution is specific to London or whether this is a universal relation amongst cities. The analysis also revealed that there are very few grid boxes that contain entirely uniform buildings, despite many studies of urban airflow and other processes being based on these idealised building forms. The lack of real morphology data hampers weather and climate predictions globally, as well as the delivery of integrated urban services to city residents.

A comparison between the standard model configuration (with original and updated morphology inputs) and the model with the distributed-drag parametrization shows considerable differences in the lower levels of the atmospheric boundary layer. Spatial variations in the surface stresses highlight the importance of the urban morphology inputs and modelling decisions. Original plan-area index, frontal-area index, and mean building height all depend on the impervious land-cover fraction in each grid box. After decoupling these parameters, regions with high-rise buildings experience higher surface stresses, but other local distinctions like the Thames river disappear, putting the applicability of roughness-length-based parametrizations like MacDonald *et al*. (1998) at such high resolutions into question. The distributed-drag parametrization uses a bluff-body modelling approach (i.e., considering frontal areas) and the resulting surface stresses show the greatest spatial variability: local features such as densely built-up areas, high-rise building clusters, parks, and the river are all clearly distinct.

Vertical effects of the drag distribution include a different distribution of wind speed in the first several hundred metres, with a lesser gradient and higher wind speeds at the lowest level. Wind profiles are more heterogeneous, and the perturbations introduced grow vertically in the downwind direction, suggesting the formation of internal boundary layers. The extended wakes downstream of agglomerations of densely built-up neighbourhoods and high-rise buildings are evidence for a heterogeneous urban boundary-layer flow that is not just the aggregated flow over each neighbourhood, but depends crucially on interactions between the neighbourhoods. These findings show that height-distributed drag and the inclusion of subgrid heterogeneity bear potential for improved urban wind modelling in regions close to the surface.

The model in its current form has several limitations. The morphology profiles are relatively simple: for example, they do not depend on wind direction, and sheltering between buildings is not represented. The drag distribution function is derived from a limited number of building-resolving simulations; using data from a wider range of morphologies could improve the parametrization further. Further research is needed to determine a generally suitable drag coefficient. The current implementation requires adaptation of the scalar exchange, which highlights that the distribution of momentum exchange can only be the first step towards a comprehensive multilayer model that is not based on a simple street-canyon geometry.

## ACKNOWLEDGEMENTS

Elliott Warren and Kit Benjamin's help with observation data is acknowledged. The GIS data for all buildings in Greater London and beyond are provided by EDINA Digimap Ordnance Survey Service.

## AUTHOR CONTRIBUTIONS

**Birgit S. Sützl:** Conceptualization; Formal analysis; Investigation; Software; Visualization; Writing – original draft. **Gabriel G. Rooney:** Conceptualization; Investigation; Software; Writing – review & editing. **Anke Finnenkoetter:** Conceptualization; Software; Writing – review & editing. **Sylvia I. Bohnenstengel:** Conceptualization; Writing – review & editing. **Sue Grimmond:** Conceptualization; Writing – review & editing. **Maarten van Reeuwijk:** Conceptualization; Supervision; Writing – review & editing.

## APPENDIX A: URBAN MORPHOLOGY CALCULATIONS

Consider a building cross-section *B* at fixed height (Figure A1). Let ${\mathbf{e}}_{\mathbf{\theta}}=(\mathrm{cos}\theta ,\mathrm{sin}\theta )$ be the unit wind vector for a wind angle $\theta $ and ${\mathbf{e}}_{\mathbf{\perp}\mathbf{\theta}}=(\mathrm{sin}\theta ,-\mathrm{cos}\theta )$ a vector perpendicular to it. The building width $b(B,\theta )$ (in direction ${\mathbf{e}}_{\mathbf{\perp}\mathbf{\theta}}$) is defined as the distance between two lines parallel to ${\mathbf{e}}_{\mathbf{\theta}}$, such that *B* is entirely between these two lines (Moszynska, 2006). This definition is equivalent to projecting the building cross-section onto a vector perpendicular to ${\mathbf{e}}_{\mathbf{\theta}}$ and calculating the length of the projected cross-section (Figure A1a). The building cross-section is enclosed by a unique minimal convex polygon (i.e., the boundary of the convex hull of *B*; red polygon in Figure A1a) with vertices *V* (red dots in Figure A1a). For calculation of the building width, it is sufficient to project the building vertices in *V* onto ${\mathbf{e}}_{\mathbf{\perp}\mathbf{\theta}}$ and to calculate the distance between the two outermost projected vertices. We may write this as

$$b(B,\theta )=\max_{\mathbf{v}\in V}\left(\mathbf{v}\cdot {\mathbf{e}}_{\mathbf{\perp}\mathbf{\theta}}\right)-\min_{\mathbf{v}\in V}\left(\mathbf{v}\cdot {\mathbf{e}}_{\mathbf{\perp}\mathbf{\theta}}\right).$$

Figure A1a shows building widths of a building cross-section for several wind angles and illustrates the projection of vertices from *V*. The shape of the building cross-section *B* may change with height. We will informally denote this by $B(z)$, such that we can express a height-dependent function $b(B(z),\theta )$ that characterises the vertical structure of a building. The frontal area for wind angle $\theta $ is derived by height integration of the building width, ${\int}_{0}^{{z}_{\mathrm{max}}}b(B(z),\theta )\,\mathrm{d}z$, which is consistent with the typical definition of a frontal area (e.g., Grimmond and Oke, 1999).
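The projection and height integration described above can be sketched in a few lines. The function names and the layered building (a podium with a narrower tower) are hypothetical; the cross-section is assumed to be piecewise constant with height, so that the integral reduces to a sum over layers. Projecting all vertices of the cross-section gives the same width as projecting only the convex-hull vertices *V*, so no hull computation is needed here.

```python
import math


def building_width(vertices, theta):
    """Width b(B, theta) of a cross-section for wind angle theta (radians)."""
    # Unit vector perpendicular to the wind direction: (sin(theta), -cos(theta))
    ex, ey = math.sin(theta), -math.cos(theta)
    projections = [x * ex + y * ey for x, y in vertices]
    # Distance between the two outermost projected vertices
    return max(projections) - min(projections)


def frontal_area(layers, theta):
    """Height-integrated width for layers given as (thickness, vertices)."""
    return sum(dz * building_width(verts, theta) for dz, verts in layers)


# Hypothetical building: 20 m x 10 m podium (10 m high) with a
# 10 m x 10 m tower on top (a further 20 m)
podium = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0), (0.0, 10.0)]
tower = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
layers = [(10.0, podium), (20.0, tower)]

for theta in (0.0, math.pi / 2):
    print(f"theta = {theta:.2f} rad: frontal area = {frontal_area(layers, theta):.1f} m2")
```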

The mean width $\overline{b}(B)$ of the cross-section, averaged over all wind angles, satisfies

$$\overline{b}(B)=\frac{l(V)}{\pi },\qquad \text{(A3)}$$

where $l(V)$ is the perimeter of the convex polygon described by *V*. This relation states that the mean width is equivalent to the diameter of a circle with the same perimeter as the minimal convex polygon enclosing *B* (cf. $l(V)$ in Figure A1a and b). The calculation of the building mean width is computationally inexpensive and straightforward with Equation A3 and it is therefore useful as a basis for calculating vertical profiles and morphology parameters over urban areas; in this case for the grid boxes of the NWP model. For a grid box with buildings described by the vertical building cross-section profiles

*z* as

This quantity resembles a weighted building-height average, where each building height is weighted by the building's (ground-level) mean width $\overline{b}(B(0))$. The maximum height of any building in the grid box is denoted as ${z}_{\mathrm{max}}$. The plan-area index is calculated in the usual way by ${\lambda}_{\mathrm{p}}={A}_{\mathrm{P}}/{A}_{\mathrm{T}}$, where ${A}_{\mathrm{P}}={\sum}_{n}{w}_{n}{A}_{p,n}$ and ${A}_{p,n}$ is the building's plan area.
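The statement that the mean width equals the perimeter of the enclosing convex polygon divided by $\pi $ can be checked numerically. The sketch below is not the authors' code: it uses a standard monotone-chain convex hull, a hypothetical L-shaped footprint, and a simple midpoint average of the projected width over wind angles in $[0,\pi )$ for comparison.

```python
import math


def convex_hull(points):
    """Monotone-chain convex hull; returns vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # Drop points that would make a clockwise or collinear turn
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half_hull(pts) + half_hull(reversed(pts))


def mean_width_from_perimeter(points):
    """Mean width as hull perimeter divided by pi."""
    hull = convex_hull(points)
    perimeter = sum(math.dist(hull[i], hull[(i + 1) % len(hull)])
                    for i in range(len(hull)))
    return perimeter / math.pi


def mean_width_numeric(points, n_angles=3600):
    """Average projected width over wind angles in [0, pi) (midpoint rule)."""
    total = 0.0
    for k in range(n_angles):
        theta = (k + 0.5) * math.pi / n_angles
        proj = [x * math.sin(theta) - y * math.cos(theta) for x, y in points]
        total += max(proj) - min(proj)
    return total / n_angles


# Hypothetical L-shaped (non-convex) building footprint, in metres
footprint = [(0.0, 0.0), (20.0, 0.0), (20.0, 10.0),
             (10.0, 10.0), (10.0, 20.0), (0.0, 20.0)]
print(mean_width_from_perimeter(footprint), mean_width_numeric(footprint))
```

The two estimates agree to within the accuracy of the numerical average, while the perimeter-based calculation requires only a single pass over the hull vertices.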