The numerous approaches to tracking extratropical cyclones and the challenges they present



Introduction
Extratropical cyclones (ETCs) are large-scale low-pressure systems that develop in the mid-latitude regions. These systems can travel thousands of kilometres, can last several days and are often, but not always, associated with high winds and heavy rain (Ulbrich et al., 2009; Catto, 2018). The most powerful ETCs can cause significant socioeconomic damage, costing millions of pounds (Hawcroft et al., 2012; Garnier et al., 2018). For example, large storm surges associated with ETCs can cause loss of life through coastal flooding, while strong winds cause falling trees and debris, in addition to disrupting transport systems and severely damaging property.
Cyclogenesis can occur in numerous ways, the most ubiquitous of which is baroclinic instability, characterised by strong vertical wind shear in the mid-latitudes. This shear, in turn, results, via thermal wind balance, from strong temperature gradients. ETCs act to reduce this gradient through the polewards transport of latent and sensible heat. Consequently, if the gradient is small, there is less potential energy available for cyclogenesis. Decreased equator-to-pole temperature gradients in the lower troposphere, resulting from polar amplification (the increased rate of warming in higher latitudes compared to lower latitudes as a result of increasing concentrations of greenhouse gases (Manabe and Wetherald, 1975)), are believed to be one reason behind the predicted decrease in ETC numbers for the Northern Hemisphere (NH) (Bengtsson et al., 2009; Catto et al., 2011). In addition, the increase in temperatures would enhance latent heat release and is thought to contribute to the deepening and intensification of ETCs (Bengtsson et al., 2006; Michaelis et al., 2017).
The pathways along which ETCs typically travel are known as storm tracks. Climatological storm-track regions are prevalent areas of synoptic-scale disturbances where, for example, there is a maximum polewards transport of energy occurring in the North Pacific and North Atlantic oceans in the NH (Blackmon, 1976; Booth et al., 2017). In the Southern Hemisphere (SH), during summer, the storm track forms a circular pattern around Antarctica, which becomes more asymmetric in winter (Hoskins and Hodges, 2005; Ulbrich et al., 2009). During winter, baroclinicity is at a maximum in both the Pacific and North Atlantic Ocean basins (Nakamura, 1992; Hoskins and Hodges, 2019). In terms of baroclinic wave activity, the North Atlantic storm track reaches maximum intensity during winter, whereas the Pacific storm track has a mid-winter minimum (due to the especially strong jet stream), with maximum intensity occurring during late autumn and early spring (Nakamura, 1992). The SH storm track maximum intensity (i.e. strongest ETCs) also occurs during winter, with enhanced activity in the southern Atlantic and Indian Ocean regions (Hoskins and Hodges, 2005; Ulbrich et al., 2009; Booth et al., 2017).
Storm tracks are part of a very complex coupled system with many different interacting components that can strongly influence an ETC's location and intensity. Changes in the location of storm tracks, both latitudinally and zonally, have been linked to the subtropical jet, baroclinicity and extratropical sea-surface temperatures (Brayshaw et al., 2009, 2011; Woollings et al., 2010; Feser et al., 2015). In addition, storm tracks respond to large-scale phenomena such as the El Niño Southern Oscillation, the North Atlantic Oscillation (NAO), the Quasi-Biennial Oscillation and the Madden-Julian Oscillation (Ulbrich et al., 2009; Feser et al., 2015; Yang et al., 2015; Wang et al., 2017, 2018). For example, Hurrell et al. (2003) illustrated how storm track activity and ETC intensity increase in regions of the North Atlantic Ocean during a positive NAO (Figure 1a). In addition, links have been identified between storm tracks and changes in the stratosphere during winter (Kidston et al., 2015).
The changes in the position and intensity of storm tracks will impact the local climate and weather over large distances (Bengtsson et al., 2006). The North Atlantic jet stream is eddy driven and therefore connected to the North Atlantic storm track. They both normally exhibit a similar southwest-northeast orientation (Figure 1a), directing ETCs towards northern Europe (Woollings et al., 2010). On inter-seasonal timescales in the NH, the North Atlantic and Pacific storm tracks move poleward in the summer, before returning equatorward in the winter (Hoskins and Hodges, 2019). Similarly, there is a poleward shift of SH storm tracks during winter (Lehmann et al., 2014).
The multitude of various dynamics that can control storm track characteristics presents us with a significant challenge in how we measure and understand their impacts across our world. This has resulted in numerous and diverse tracking methods; therefore, this paper aims to (1) give an overview of the methods used in identifying and tracking ETCs, (2) discuss the implications of using different definitions of extreme or intense, (3) give an overview of the current literature where studies have compared a range of ETC statistics using several datasets and methods and (4) compare two North Atlantic transitional ETC tracks using three methods. More emphasis has been placed on NH tracking results due to the greater availability of literature; however, research on the SH storm tracks is continually growing. The paper first describes the multiple methods used for identification. We then explore the obstacles in tracking identified systems through time and the different ways to overcome them, and the significance of using different definitions of extreme or intense. A review of the current literature where studies have compared a range of ETC statistics using several datasets and methods follows, illustrated by a case study of two strong ETCs in the North Atlantic.

Identification
Each tracking algorithm has a set of known obstacles to overcome when trying to identify ETCs within the model and observational data. One such problem is that there is no universally agreed definition of what an ETC is or where its precise location is (Neu et al., 2013). It is agreed, however, that the number of ETCs is simply the number of identified ETCs in the data, ETC frequency is the number of ETCs in a defined area, and track density can be measured by counting the number of storm tracks crossing a region through time (Ulbrich et al., 2009).
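As a minimal sketch of these counting definitions, the following Python example counts the number of ETCs in a set of tracks and the number of tracks that pass through a latitude-longitude box. The track positions, the box limits and the function name track_density are hypothetical and chosen only for illustration; a track is counted if any of its stored positions lies inside the box, which is a simplification of counting tracks that cross a region.

```python
def track_density(tracks, lat_bounds, lon_bounds):
    """Count the tracks that have at least one position inside a lat-lon box."""
    lat_min, lat_max = lat_bounds
    lon_min, lon_max = lon_bounds
    count = 0
    for track in tracks:
        inside = any(
            lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
            for lat, lon in track
        )
        if inside:
            count += 1
    return count


# Hypothetical tracks: each is a list of (lat, lon) positions at successive time-steps.
tracks = [
    [(45.0, -40.0), (50.0, -25.0), (55.0, -10.0)],
    [(35.0, -60.0), (38.0, -50.0)],
    [(52.0, -20.0), (56.0, -5.0), (60.0, 5.0)],
]

n_cyclones = len(tracks)  # number of identified ETCs in the data
density = track_density(tracks, lat_bounds=(50.0, 60.0), lon_bounds=(-15.0, 0.0))
print(n_cyclones, density)
```

Because only stored positions are tested, a fast-moving system could cross a small box between time-steps without being counted; a fuller implementation would test the track segments as well.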
Before the identification and tracking of ETCs, many storm-tracking algorithms apply spatial filters, which remove the large spatial scales or small-scale noise (Anderson et al., 2003; Zappa et al., 2013; Feser et al., 2015; Massey, 2016). This allows ETCs to be more easily identified as extrema, separate from larger-scale systems, and removes any bias towards slower-moving systems (Hoskins and Hodges, 2002; Anderson et al., 2003). As there is no standardised way to achieve this background removal, and some methods do not involve such a step, results can vary from one method to another.
The tracking algorithm by Hodges (1994, 1995, 1999) uses relative vorticity at 850hPa for the identification of ETCs and has frequently been used in feature-tracking studies (Bengtsson et al., 2006). Massey's (2012, 2016) objective feature-tracking algorithm uses re-gridded minimum mean sea-level pressure (MSLP) to identify ETCs at higher latitudes. Using these two different approaches in identification can lead to variations in the output storm track statistics. One reason for this is that results using MSLP represent the low-frequency, large-scale features of the atmosphere, whereas vorticity represents the high-frequency, small-scale features (Hoskins and Hodges, 2002, 2005; Neu et al., 2013). Vorticity is often reduced to a lower resolution to decrease the amount of noise (Hoskins and Hodges, 2002, 2005).
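To illustrate the idea of identifying candidate ETC centres as extrema, the sketch below finds grid points whose MSLP is lower than that of all eight neighbours. It is not the Hodges or Massey identification step; the grid values, the optional threshold and the function name local_minima are assumptions made for this example. The same search applied to the negative of a vorticity field would return vorticity maxima.

```python
def local_minima(field, threshold=None):
    """Return (row, col, value) for interior points lower than all 8 neighbours.

    If a threshold is given, only minima at or below it are kept.
    """
    minima = []
    n_rows, n_cols = len(field), len(field[0])
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            value = field[i][j]
            neighbours = [
                field[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            ]
            if value < min(neighbours) and (threshold is None or value <= threshold):
                minima.append((i, j, value))
    return minima


# Hypothetical MSLP field in hPa on a small regular grid.
mslp = [
    [1012, 1010, 1011, 1013, 1014],
    [1010, 1002, 1008, 1012, 1013],
    [1011, 1007, 1009, 1010, 1012],
    [1013, 1011, 1010, 998, 1009],
    [1014, 1013, 1012, 1008, 1011],
]

candidates = local_minima(mslp)
deep_candidates = local_minima(mslp, threshold=1000)
print(candidates, deep_candidates)
```

Adding a threshold changes how many centres are returned from the same field, which is one simple way in which settings can alter the resulting statistics.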

Tracking
There are two commonly used frameworks for evaluating storm tracks in climate models: Eulerian and Lagrangian. The Eulerian method commonly uses a 2-6-day bandpass filter to highlight synoptic timescale activity, which includes storm tracks (Blackmon, 1976; Hoskins and Hodges, 2002). Although this method computes quick and simple statistics, it does not provide the level of detail about ETC characteristics, such as the number and intensity of ETCs, that is used to determine changes in ETC trends or impacts (Hoskins and Hodges, 2002; Anderson et al., 2003; Zappa et al., 2013; Michaelis et al., 2017). The Lagrangian method, however, involves the temporal and spatial tracking of individual ETCs, known as objective feature tracking (Hoskins and Hodges, 2002; Feser et al., 2015; Catto, 2016; Michaelis et al., 2017). Using tracking algorithms allows for the analysis of long-term trends and the lifecycle of ETCs, along with their speed and intensity (Feser et al., 2015). Most objective feature-tracking methods have two phases: the identification of an ETC and tracking the same system across multiple time-steps (Raible et al., 2008; Massey, 2016; Lakkis et al., 2019).
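The Eulerian approach can be sketched for a single grid point as follows. The example applies a 2-6-day bandpass filter to a 6-hourly time series by zeroing Fourier components outside that band and then takes the standard deviation of the filtered series as a simple measure of synoptic activity. The synthetic series, the function name bandpass and the use of a sharp spectral cut-off are assumptions made for illustration; published studies use a variety of filter designs.

```python
import cmath
import math
import statistics


def bandpass(series, dt_days, min_period=2.0, max_period=6.0):
    """Keep only Fourier components with periods from min_period to max_period (days)."""
    n = len(series)
    # Forward discrete Fourier transform (O(n^2), adequate for a short example).
    spectrum = [
        sum(series[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Frequency in cycles per day; k and n - k describe the same frequency.
        freq = min(k, n - k) / (n * dt_days)
        if not (1.0 / max_period <= freq <= 1.0 / min_period):
            spectrum[k] = 0.0
    # Inverse transform; the imaginary part is rounding error for real input.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# Synthetic 6-hourly MSLP series (hPa) covering 48 days: a constant background,
# a 4-day synoptic wave of amplitude 5 and a slower 24-day wave of amplitude 10.
dt = 0.25
n_steps = 192
series = [
    1000.0
    + 5.0 * math.sin(2 * math.pi * t * dt / 4.0)
    + 10.0 * math.sin(2 * math.pi * t * dt / 24.0)
    for t in range(n_steps)
]

filtered = bandpass(series, dt)
storm_track_measure = statistics.pstdev(filtered)
print(round(storm_track_measure, 3))
```

Only the 4-day wave survives the filter, so the measure reflects synoptic variability alone; it says nothing about how many individual systems produced it, which is the limitation noted above.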
Once identified, an ETC must be tracked through time, giving rise to what is known as the correspondence problem. Tracking algorithms must be able to identify an ETC and then identify that same system in the following time-step. Neighbour point tracking uses a local maximum or minimum value of a climate variable and then tracks this point through time using a nearest-neighbour model (Lakkis et al., 2019). Others use a cost function to improve smoothness and ensure that points match the same track (Hodges, 1994, 1995; Massey, 2012, 2016). In addition, methods implement various constraints to reduce the possibility of matching errors (Hoskins and Hodges, 2002), for instance, setting a search radius based on the average speed of an ETC (Raible et al., 2008; Massey, 2016). Quite often, tracks are filtered so that features are only selected if the total track length exceeds 1000km and/or lasts longer than 24, 48 or even 72 hours (Hoskins and Hodges, 2002, 2005; Hodges et al., 2003; Bengtsson et al., 2006; Raible et al., 2008; Massey, 2012, 2016; Neu et al., 2013; Pinto et al., 2016). Filtering tracks helps to provide some standardisation, which can be implemented across multiple studies (Neu et al., 2013; Grieger et al., 2018).
In addition to identifying and tracking an ETC through time, it is equally important to ensure that it is accurately tracked through space. Issues that can be encountered include latitude-longitude grid boxes whose size decreases with increasing latitude (resolution discrepancy), leading to singularities at the poles. There are multiple approaches to address these problems, including spatial filters, truncating data at a certain wavenumber, re-gridding data and projecting it onto a different grid, all of which create unique tracking algorithms (Hodges, 1994; Hoskins and Hodges, 2002; Massey, 2012; Zappa et al., 2013). Some of these methods can be computationally expensive, while others are limited to only being able to track ETCs one hemisphere at a time.
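The correspondence problem and the subsequent track filtering can be sketched as below. This is a simple greedy nearest-neighbour linker with a fixed search radius, not the cost-function approach of Hodges or Massey. The feature positions, the 500km search radius, the 6-hourly time-step and the function names are assumptions for illustration, and the filter requires both the length and the duration criterion, whereas studies differ on how the two are combined.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(p, q):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def link_nearest_neighbour(frames, search_radius_km):
    """Greedily link feature points across time-steps into tracks."""
    finished, active = [], []
    for points in frames:
        unmatched = list(points)
        still_active = []
        for track in active:
            best = min(unmatched,
                       key=lambda p: great_circle_km(track[-1], p),
                       default=None)
            if best is not None and great_circle_km(track[-1], best) <= search_radius_km:
                track.append(best)
                unmatched.remove(best)
                still_active.append(track)
            else:
                finished.append(track)  # no match within the radius: track ends
        # Points not claimed by an existing track start new tracks.
        still_active.extend([p] for p in unmatched)
        active = still_active
    return finished + active


def keep_track(track, dt_hours=6, min_length_km=1000.0, min_hours=24):
    """Apply the length and lifetime criteria to one track."""
    length = sum(great_circle_km(a, b) for a, b in zip(track, track[1:]))
    duration = (len(track) - 1) * dt_hours
    return length >= min_length_km and duration >= min_hours


# Hypothetical feature positions (lat, lon) at six successive 6-hourly time-steps.
frames = [
    [(50.0, -40.0), (35.0, -10.0)],
    [(51.0, -36.0), (35.2, -9.8)],
    [(52.0, -32.0)],
    [(53.0, -28.0)],
    [(54.0, -24.0)],
    [(55.0, -20.0)],
]

all_tracks = link_nearest_neighbour(frames, search_radius_km=500.0)
kept = [track for track in all_tracks if keep_track(track)]
print(len(all_tracks), len(kept))
```

In this example the short-lived, nearly stationary feature is linked into a track but then removed by the filter, while the long-lived system is retained. Changing the search radius or the filter thresholds changes which tracks survive.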
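The resolution discrepancy can be quantified with a short calculation of the area of a regular latitude-longitude grid box on a sphere. The 1.5° spacing and the Earth radius of 6371km are example values, and grid_cell_area_km2 is a hypothetical helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0


def grid_cell_area_km2(lat_centre_deg, dlat_deg, dlon_deg):
    """Area of a regular lat-lon grid box on a sphere, centred at lat_centre_deg."""
    lat_south = math.radians(lat_centre_deg - dlat_deg / 2)
    lat_north = math.radians(lat_centre_deg + dlat_deg / 2)
    return (EARTH_RADIUS_KM ** 2 * math.radians(dlon_deg)
            * (math.sin(lat_north) - math.sin(lat_south)))


area_equator = grid_cell_area_km2(0.0, 1.5, 1.5)
area_60 = grid_cell_area_km2(60.0, 1.5, 1.5)
area_near_pole = grid_cell_area_km2(89.25, 1.5, 1.5)

# The ratio to the equatorial box equals the cosine of the central latitude.
print(area_60 / area_equator, area_near_pole / area_equator)
```

A box at 60° covers half the area of one at the equator, and the box adjacent to the pole is almost two orders of magnitude smaller, so a search on the raw grid samples high latitudes far more densely than low latitudes.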
New identification and tracking techniques are being created to capture more aspects of ETCs in climate models. Methods commonly identify ETCs as a minimum or maximum point within one level of data and track that point through time. However, ETCs have complex 3-dimensional features that extend through multiple levels in the atmosphere. Lakkis et al. (2019) have created a 4-dimensional (4D) feature-tracking algorithm that identifies and tracks ETCs across multiple levels in the atmosphere. They have adapted the method from Hodges (1995), repeating the process of identifying and tracking an ETC using relative vorticity on multiple vertical levels and then stacking these results to create a 4D representation of the track. All these different approaches to tracking can influence the calculation of ETC characteristics and statistics (Feser et al., 2015). It is important to note that each method has its limitations, and there is no 'correct' way to solve these issues. As a result, it is not advised to apply an algorithm without knowing its limitations.

Defining 'extreme'
Just as there are many different climate variables used in identification, there are numerous methods of defining and classifying what is an 'extreme', 'strong' or 'intense' ETC (Catto, 2016; Chang, 2017). Approaches can involve defining extreme in terms of passing a physical threshold, and others account for the physical damage caused by an ETC, whereas some combine these (Garnier et al., 2018). Lambert (1996, p. 21,320) defined an intense ETC as 'the occurrence of a grid point value of MSLP less than or equal to 970mbar'. This threshold was used to ensure the exclusion of most ETCs, spurious lows and any low pressures caused by high terrain. Alternatively, Zappa et al. (2013) defined strong ETCs as exceeding the 90th percentile of maximum wind speed at 850hPa in the North Atlantic and European storm tracks. More recently, Chang (2017) applied different definitions of extreme based on the exceedance of two set thresholds using variables such as MSLP, 850hPa relative vorticity and winds. Conversely, Grieger et al. (2018) defined extreme as the top 500 most intense winter tracks when using minimum MSLP to measure intensity. To help reduce discrepancies between assigned intensities, it is common to use MSLP (Feser et al., 2015).
It is important to understand that differences may arise in trends when the definition of what represents an extreme ETC is not consistent. This is not only relevant for historic trends but also for future projections as numerous studies use different definitions of ETC intensities (Ulbrich et al., 2009; Zappa et al., 2013; Michaelis et al., 2017). Research by Ulbrich et al. (2009) showed that the results of future hemispheric trends in extreme ETCs depended on how they were defined. A decrease in the number of extreme ETCs averaged over the whole NH was found when extreme was defined as being in the 99th percentile for the Laplacian of pressure, compared to an increase when defined in terms of sea-level pressure. Zappa et al. (2013) used a multi-model approach to investigate the North Atlantic ETC response to RCP4.5 and RCP8.5 future climate scenarios using Hodges' (1995, 1999) objective feature-tracking algorithm. They found a future basin-wide reduction in the number of strong ETCs during winter. However, an increase in number and strength over the British Isles and central Europe was projected. In addition, Michaelis et al. (2017) investigated the impact of climate change on the winter North Atlantic storm track and found an overall decrease in the number of strong ETCs in the North Atlantic when defining strong as passing a minimum threshold in the sea-level pressure field. Alternatively, in the SH, Chang (2017) found that a significant increase in the frequency of future extreme ETCs was not dependent on the definition used.
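The effect of the chosen definition can be seen by applying three of the approaches above to the same set of cyclones. In this sketch the minimum MSLP and maximum wind values are invented, the percentile uses a simple nearest-rank rule, and a 'top 2' selection stands in for the top 500 tracks; the function names are hypothetical.

```python
import math


def intense_by_threshold(min_mslp, threshold_hpa=970.0):
    """Flag cyclones whose minimum MSLP is at or below a fixed threshold."""
    return [p <= threshold_hpa for p in min_mslp]


def intense_by_percentile(values, percentile=90):
    """Flag values strictly above the given percentile (nearest-rank method)."""
    ranked = sorted(values)
    rank = max(1, math.ceil(percentile * len(ranked) / 100))
    cutoff = ranked[rank - 1]
    return [v > cutoff for v in values]


def intense_top_n(min_mslp, n):
    """Flag the n cyclones with the lowest minimum MSLP."""
    deepest = sorted(range(len(min_mslp)), key=lambda i: min_mslp[i])[:n]
    return [i in deepest for i in range(len(min_mslp))]


def flagged(flags):
    """Indices of the cyclones flagged as intense."""
    return [i for i, is_intense in enumerate(flags) if is_intense]


# Hypothetical lifetime minimum MSLP (hPa) and maximum 850hPa wind speed (m/s)
# for ten tracked cyclones.
min_mslp = [985, 968, 990, 975, 962, 995, 980, 972, 988, 969]
max_wind = [28, 35, 22, 41, 38, 20, 30, 33, 25, 44]

by_threshold = flagged(intense_by_threshold(min_mslp))
by_percentile = flagged(intense_by_percentile(max_wind, 90))
by_rank = flagged(intense_top_n(min_mslp, 2))
print(by_threshold, by_percentile, by_rank)
```

The three definitions select three, one and two cyclones respectively from the same ten systems, and the wind-based selection does not include the deepest cyclone, so counts and trends of 'intense' ETCs need not agree between definitions.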

Differences due to datasets and methods
There are differing results in climatological storm-track structures and densities and in historical and future trends. These may result from differences in the data used or differences in the methodology of tracking ETCs. Uncertainties regarding the dataset were identified by Hodges et al. (2003), who used several reanalysis datasets, together with Hodges' (1999) tracking algorithm, to compare the representation of historical storm tracks in both hemispheres. Differences between the reanalyses were greater in the SH, in regions of growth or decay, and were generally larger for weaker ETCs. Fewer observations in the SH generate a greater dependence on model results and consequently increase the uncertainty of historic trends (Hodges et al., 2003; Ulbrich et al., 2009). Raible et al. (2008) compared NH ETC statistics between two reanalysis datasets for the period between 1961 and 1990. Although results for extreme ETCs were in good agreement, the greatest difference was found during summer, with additional discrepancies in the number and intensity of tracks in regions close to significant orography. Ulbrich et al. (2009) reviewed various methods of identification and tracking using different reanalysis datasets for both hemispheres. They also found that most disagreements were for summer months, and there was a better agreement for intense ETCs. The differences when comparing reanalysis datasets were mostly related to the different spatial resolutions.
The role of uncertainties due to the tracking method was highlighted by Neu et al. (2013), who assessed 15 different tracking algorithms as part of an experiment set up by the international Intercomparison of MId LAtitude STorm diagnostics (IMILAST). The experiment was set up so that each tracking algorithm used the same dataset (ERA-Interim reanalysis) for the same period, at the same spatial (1.5° × 1.5°) and temporal resolution (6-hourly time-steps). The largest differences between methods were for the number of ETC tracks. There was a larger spread of results in the NH and over continents than in the SH. Figure 2 shows the differences in the number of NH (30°-90°N) ETCs for December, January and February for each of the methods used, with additional results from Massey (2016). These methods differ by more than 100%, with no well-defined grouping of results based on climate variables. However, there was more agreement in the number of winter ETCs identified in both hemispheres. Winter ETCs tend to be more intense and easier to identify and track, which is in agreement with Hoskins and Hodges (2002, 2005) and Ulbrich et al. (2009). Grieger et al. (2018) used the same approach as Neu et al. (2013) to further understand the SH results. They found many similarities between methods, but like Neu et al. (2013), differences included variations in ETC numbers and intensity, with a greater agreement in intense ETC statistics.
As previously discussed, MSLP and relative vorticity are popular climate variables used in feature tracking. Differences in the location and number of tracks could be due to the choice of variable used (Raible et al., 2008). When using MSLP and 850hPa vorticity, Hoskins and Hodges (2005) found that the SH storm track was strongest during winter. However, when using 250hPa vorticity (upper troposphere), maximum values occurred during summer. Vorticity and MSLP results agreed that the strongest ETCs occur in the southern Atlantic and Indian Ocean regions. In addition, Grieger et al. (2018) found that vorticity identified a greater number of tracks in the SH than MSLP. This is a result of vorticity being more capable of identifying and tracking small-scale features (Hoskins and Hodges, 2002; Neu et al., 2013; Grieger et al., 2018). It may be assumed that regions are dominated by small-scale systems when there are more ETCs identified by vorticity than MSLP.
There are many similarities between MSLP and vorticity tracks in the NH, except for regions such as the Mediterranean and at the beginning and end of tracks (Hoskins and Hodges, 2002). Pinto et al. (2016) discovered that ETC clustering in the North Atlantic and Europe compared well between multiple methods. However, there was less agreement around the initial and final positions of the storm tracks, with vorticity tracks being located further south than MSLP tracks. Conversely, Hewson and Neu (2015) stated that they could not group their results based on the climate variable used; rather, it was the variations in threshold settings that were more significant.

Comparison of three methods on two North Atlantic transitional ETCs
To illustrate the variations that can occur when using different methods, two North Atlantic transitional ETCs, Ophelia (2017) and Oscar (2018), were tracked using three separate tracking methods (Figure 3). First, the National Hurricane Center (NHC) best track was obtained from its hurricane database HURDAT2 (Landsea and Franklin, 2013). The NHC best tracks are created by collating all the observational data available, such as satellite and aircraft measurements, to subjectively determine the location, intensity and size of tropical cyclones and their tracks. Second, the National Aeronautics and Space Administration (NASA) Modeling, Analysis and Prediction Climatology of Midlatitude Storminess (MCMS) tracking algorithm applied a closed contour method using MSLP minima from ERA-Interim to locate and track ETCs (Naud et al., 2012). Finally, the Massey tracks were created by inputting 6-hourly MSLP data from ERA5 reanalysis (Hersbach et al., 2020) into the Massey (2016) storm-tracking algorithm.
All the methods indicate a comparatively good agreement between the locations of the two tracks. There are apparent dissimilarities, especially at the initial and final stages of the tracks. The most evident difference in Figure 3(a) is that the NHC best track begins much earlier, identifying Ophelia as a hurricane before it transitioned into an ETC. Despite the MCMS track beginning earlier than Massey, it diverges for a small section when compared to the other tracks. As Ophelia hits the British Isles, the difference between tracks decreases; however, they begin to separate towards the end of the tracks. All three tracks finish in different locations, with MCMS at a different time-step. Interestingly, the three methods demonstrate a closer agreement for Oscar's track (Figure 3b). Once more, the largest differences are at the beginning and end of the tracks, with the MCMS method identifying and tracking Oscar before the NHC. Agreement between the tracks improves towards the latter half of the storm track. However, as highlighted by the arrow in Figure 3(b), there is a noticeable outlier in the Massey track. There is a closer agreement between Massey and NHC regarding the location of dissipation than with MCMS, which extends the track northeastwards by another time-step.
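A comparison of this kind can be made quantitative by measuring the separation between two tracks at the time-steps they share. The sketch below does this with the haversine distance; the positions and time labels are invented for illustration and are not the Ophelia or Oscar tracks, and track_separation_km is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(p, q):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def track_separation_km(track_a, track_b):
    """Distance between two tracks at every time-step present in both."""
    common = sorted(set(track_a) & set(track_b))
    return {t: great_circle_km(track_a[t], track_b[t]) for t in common}


# Hypothetical tracks from two methods, keyed by time label (hours from a
# common reference), with (lat, lon) positions in degrees.
method_a = {
    "t00": (40.0, -20.0),
    "t06": (44.0, -16.0),
    "t12": (48.0, -13.0),
    "t18": (52.0, -10.0),
}
method_b = {
    "t06": (44.5, -15.0),
    "t12": (48.0, -13.0),
    "t18": (52.0, -8.5),
    "t24": (56.0, -5.0),
}

separation = track_separation_km(method_a, method_b)
for time, distance in separation.items():
    print(time, round(distance))
```

Time-steps present in only one track, here the first of one method and the last of the other, drop out of the comparison, so differences in when each method starts and ends a track need to be reported separately.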
Large differences between tracks can exist at the initial and final time-steps for a variety of reasons, one being that relative vorticity is more capable of identifying a system at an earlier stage than MSLP (Hoskins and Hodges, 2002; Neu et al., 2013; Grieger et al., 2018). Rantanen et al. (2020) tracked Ophelia using Hodges' TRACK algorithm (1994, 1995) with input data from the Open Integrated Forecast System model and compared it to the NHC best track. TRACK identified Ophelia earlier than both MCMS and Massey and had a dissipation location similar to that of MCMS. This highlights that methods using vorticity and MSLP can produce a comparable track, with less similarity shown at either end of the tracks.
Despite both MCMS and Massey using MSLP, they used different reanalyses, approaches and thresholds in identification and tracking. Figure 3 shows how changes in methods can paint a slightly different picture in terms of location of cyclogenesis and cyclolysis. Nevertheless, there is no 'right' answer when analysing storm track statistics. While the NHC best track uses observations, it is restricted by the quality and quantity of the information available. The uncertainty over a cyclone's position in the HURDAT2 dataset depends on its intensity and the availability of aircraft measurements (Landsea and Franklin, 2013).
Both Ophelia and Oscar completed extratropical transition, in that they began as hurricanes and then transitioned to ETCs. Consequently, they represent stronger ETCs that are relatively easy to track. Therefore, it is interesting that when Ophelia reached peak intensity in terms of minimum MSLP (Figure 3a), there was some disagreement in its location, even between the two MSLP methods.

Summary and conclusion
The diversity and complexity of tracking ETCs is increasing as new tracking methods are developed. There are now more approaches to identify and track an ETC and measure its intensity and lifetime. However, due to their complexity, it is extremely challenging to fully represent all types of ETCs in one single agreed-upon method. As a result, we propose that this diversity of methods may lead to a lack of consensus on past ETC trends, which makes it even more difficult to agree on how their numbers, intensities and impacts will change in the future. The main conclusions from this study are as follows:
• Despite many differences, there are some common features among methods that involve filtering tracks based on their duration and distance travelled.
• The largest differences between methods are in the number of ETC tracks identified (Figure 2). There is, however, more agreement for winter and intense ETC statistics than for summer and weaker statistics.
• The greatest differences between tracks when using MSLP or vorticity are towards the first and final time-steps. Vorticity can identify a higher number of ETCs due to its ability to track small-scale features.
Each study has contributed new and significant information that has helped in our growing understanding of these complex physical systems. Nevertheless, when examining trends in ETC statistics derived from only one tracking method, it is crucial to consider that using a different dataset, or the same dataset with another tracking algorithm, may produce significantly different results. Therefore, we agree that there is still a need to continue to compare storm-tracking methods (Raible et al., 2008; Ulbrich et al., 2009; Neu et al., 2013; Grieger et al., 2018).

Introduction
Countries around the world face pressing social, environmental, political and economic issues. Air quality transcends all of the above; poor air quality disproportionally impacts minorities and those on a low income (Di et al., 2017), while contributing to 40 000 premature deaths and an economic burden of £20 billion per year in the UK alone (Royal College of Physicians, 2016). As more than 50% of the world's population now resides in urban environments, it is these relatively small spatial areas in which the most acute pollution episodes are likely to have the largest impact on the greatest number of people.
The sources and profile of pollutants vary greatly throughout the world. In the UK, the key pollutants driving adverse health outcomes are nitrogen dioxide (NO2) and fine particulate matter (PM2.5), with 64% of new paediatric asthma cases in urban centres attributed to elevated NO2 levels (Achakulwisut et al., 2019). The primary source of NO2 in roadside locations is vehicle transport. Yearly net emissions data from road transport is freely available at 1-km resolution from the National Atmospheric Emissions Inventory (NAEI). However, this top-down approach loses the granularity of particular roads and junctions that are emission hotspots at different times of the day.
Measuring road traffic emissions of NO2 at high spatial and temporal resolutions is costly and logistically challenging. In this study, we develop a cheaper and universally applicable methodology to infer road transport emissions at the resolution of an individual road. We utilise the vast amount of data generated by the widespread use of mapping products to better understand traffic flows on city roads. While the total number of vehicles on a busy road link is
Using routing apps to model real-time road traffic emissions