Assimilation of atmospheric infrasound data to constrain tropospheric and stratospheric winds
Funding information: Norges Forskningsråd, UK National Centre for Earth Observation, UK National Environmental Research Council
Abstract
This data assimilation study exploits infrasound from explosions to probe an atmospheric wind component from the ground up to stratospheric altitudes. Planned explosions of old ammunition in Finland generate transient infrasound waves that travel through the atmosphere. These waves are partially reflected back towards the ground from stratospheric levels, and are detected at a receiver station located in northern Norway at 178 km almost due north from the explosion site. The difference between the true horizontal direction towards the source and the backazimuth direction (the horizontal direction of arrival) of the incoming infrasound wavefronts, in combination with the pulse propagation time, is exploited to provide an estimate of the average cross-wind component in the penetrated atmosphere. We perform offline assimilation experiments with an ensemble Kalman filter and these observations, using the ERA5 ensemble reanalysis atmospheric product as background (prior) for the wind at different vertical levels. We demonstrate that information from both sources can be combined to obtain analysis (posterior) estimates of cross-winds at different vertical levels of the atmospheric slice between the explosion site and the recording station. The assimilation has the greatest impact at the 12–60 km levels, with some changes with respect to the prior of the order of 0.1–1.0 m·s−1, which is a magnitude larger than the typical standard deviation of the ERA5 background. The reduction of background variance in the higher levels often reached 2–5%. This is the first published study demonstrating techniques to implement assimilation of infrasound data into atmospheric models. It paves the way for further exploration in the use of infrasound observations – especially natural and continuous sources – to probe the middle atmospheric dynamics and to assimilate these data into atmospheric model products.
1 Introduction
Despite much recent attention to extratropical stratospheric dynamics and their connection to the troposphere, the amount of observational data in the stratosphere available to numerical weather prediction centres remains limited. A better representation of the stratospheric dynamics and the stratosphere–troposphere coupling in models has the potential to enhance tropospheric weather forecasts, in particular on subseasonal time-scales (Baldwin et al., 2003; Polavarapu et al., 2005; Charlton and Polvani, 2007; Mitchell et al., 2013; Kidston et al., 2015; Karpechko et al., 2016; Blanc et al., 2018; Haase et al., 2018; Pedatella et al., 2018; Taguchi, 2018; Kawatani et al., 2019). Moreover, the lid of several atmospheric model products has been raised into the mesosphere (Polavarapu et al., 2005) and it has been demonstrated that this can improve numerical weather and climate models (Orsolini et al., 2011; Charlton-Perez et al., 2013; Kidston et al., 2015). But the full potential of high-top models can only be unlocked if middle atmospheric winds are better represented (Baker et al., 2014; Korhonen et al., 2019; Lee et al., 2019). Hence, it is timely to explore novel datasets and assimilation approaches that can constrain the upper-stratospheric dynamics in atmospheric model products.
Infrasound waves are acoustic waves at frequencies below the human hearing limit (typically around 20 Hz). These waves can be generated by natural sources, such as volcanoes, earthquakes and ocean swell, but also by human sources, such as mining and explosions (e.g., Le Pichon et al., 2018). These waves propagate through the atmosphere and can be recorded by ground-based stations. The wave frequencies of greatest interest for atmospheric characterisation are typically of the order of 1 Hz. The time and form of the received signals provide temperature- and wind-related information about the atmosphere the waves traverse. Infrasound waves may travel from sources on the surface of the Earth, reach a maximum altitude where they are partly or fully reflected or refracted, and then reach back to the surface to be detected by a receiver. Effectively, they probe a slab of the atmosphere in a tomographic fashion since the time it takes for these waves to complete their path is affected by the characteristics of the atmosphere they pass through: in particular, the wind velocity and temperature, but also attenuation-related properties like density and relative humidity. Hence, spatio-temporally integrated information carried by the propagating infrasound waves can be utilised to reconstruct or constrain atmospheric variables. Sound waves are already exploited in other tomographic and imaging problems. For instance, in underwater acoustics, temperature profiles (Dzieciuch et al., 2013) and seafloor bathymetries (Wölfl et al., 2019) are mapped using sound waves. Probabilistic infrasound propagation has been studied by Smets et al. (2015), where measured infrasound wavefront parameters for one year of infrasound explosions were compared to ray-tracing simulations using the ensemble atmospheric wind and temperature fields of the ECMWF ensemble data assimilation system of perturbed analyses (Buizza et al., 1999).
The current study follows directly from a recent paper by Blixt et al. (2019), which used the same dataset to demonstrate that atmospheric cross-winds can be estimated directly from infrasound data using propagation time and back-azimuth deviation observations, and interpreted these results in the context of ERA-Interim reanalysis winds. There is a physical effect which is the basis of this work: when a steady cross-wind acts on a propagating acoustical plane wave, a bending of the wavefront is introduced. This creates a deviation in the apparent back-azimuth direction of infrasound wavefronts impinging on ground-based sensor array stations. We use this physical effect to assess the dynamical evolution of the stratosphere during several events, as sampled by the infrasound waves on their paths between Finland and a ground-based station in Northern Norway. The array signal processing algorithms exploit infrasound signals recorded on a set of 25 sensors distributed on the ground within a 3 km wide aperture (figure 1 of Blixt et al., 2019).
Data assimilation (DA; e.g., Asch et al., 2016; Kalnay, 2003) is a discipline which aims to combine different imperfect and incomplete sources of information to produce a better estimate of a variable of interest. In particular, it takes into account the uncertainty of the information sources. The most ambitious approach obtains and updates descriptions of a system using probability distribution functions (pdfs) by application of Bayes' theorem. In practice, however, sample estimators like mean, covariance and mode of the distributions often suffice. In particular, the Kalman filter (Kalman, 1960; Kalman and Bucy, 1961) and its ensemble implementation (Evensen, 1994; Burgers et al., 1998; Tippett et al., 2003) assume Gaussian statistics in the sources of errors, as well as no or small deviations from linearity in the evolution and observation processes. The filter operates with the first two statistical moments of a distribution. An advantageous feature of the Kalman filter is that it can assimilate an integrated observation variable (in our case an average wind component resulting from vertical integration along the path of propagation) and translate this into increments at different vertical levels. This proved useful, for instance, in the assimilation of radiance satellite observations (Lei et al., 2018). A discussion on the prospects of assimilating atmospheric infrasound data into numerical weather prediction models can be found in Assink et al. (2019).
There are two main objectives of this study. The first is to develop a framework which allows for assimilation of tropospheric and stratospheric wind information based on atmospheric infrasound data. The second is to provide a first demonstration and proof-of-concept with an offline (i.e., no cycling involved) infrasound DA experiment using the developed framework, exploiting a dataset which is already well-characterised in previous works.
We generate an estimate of the averaged cross-wind component along the relevant track from the explosion site in Finland to the station in Northern Norway, as well as an associated measure of uncertainty. We apply the deterministic ensemble Kalman Filter (DEnKF) as described in Sakov and Oke (2008). We select this approach because it allows for model-space localisation, as opposed to observation-space localisation which is not feasible for integrated quantities (Lei et al., 2018). Some specifics of this method are outlined in the Appendix.
The remainder of this paper is organised as follows. Section 2 explains the system set-up, detailing the way observations are related to the state variables of the system under different degrees of simplification from the most general problem to the case considered in the current work. In Section 3, we perform synthetic-data experiments under ideal conditions with an infinite ensemble size, and with different vertical weights in the observation operator. These experiments verify the offline DA process in a controlled setting. Section 4 presents the real-data assimilation experiments using infrasound from 18 years of explosions. In Section 5 we conclude the study, discuss its limitations, and provide ideas and suggestions for future work.
2 Cross-wind effects on the propagation and arrival of infrasound wavefronts
Let us explore the effect of a cross-wind on the propagation of infrasound waves. Recall the basic principle: a background wind field affects the propagation of infrasound waves; specifically, a cross-wind can bend the wavefront. Infrasound waves, however, do not modify the background wind field.
2.1 Propagation within a plane























2.2 3D Propagation
Having explained the basics, we now move to full 3D wave propagation, that is, when the trajectory of the infrasound wave has a vertical component. This is depicted in Figure 2. Figure 2a shows an atmospheric volume discretised to a model grid. Both the source (S) and receiver (R) are at the surface. In this case, the line da is a segment of the great circle between S and R, and it is not necessarily aligned with the grid. A simple example path of an infrasound wave is shown in yellow. The wave travels both in the ra and z directions. It rises to a given maximum altitude, from where it returns to the ground (e.g., due to partial reflection as explained in Blixt et al., 2019), and it is then detected at the receiver. As in the 2D case, the wave travels through a cross-wind field which leads to a change in the back-azimuth angle Δθ towards an apparent source S'. This cross-wind now also depends on altitude: wc(ra,z,t).











3 Synthetic-data assimilation experiments
This section describes basic synthetic DA experiments, before moving to the case of the assimilation of recorded infrasound data. Consider Nz=4 vertical levels in a propagation volume. Let the cross-wind be a Gaussian random variable with zero mean μb=0 and covariance B. We apply the DEnKF with a sample size of Ne=10^4 elements. This sample size is practically free of sampling noise. This allows for the computation of accurate estimates of the associated pdfs, and for the background ensemble mean and covariance to be virtually identical to the real ones, that is, the ensemble mean tends to μb and Pb→B.








Figure 4 shows the results of the assimilation experiments considering the two matrices Pb given above. This figure has three columns, one for each set of weights α. We plot several pdfs in each panel. To ease visualisation, the pdfs are scaled, and hence the vertical axes have no units. The background pdf estimated from the model ensemble is shown with a grey dotted line for the four vertical levels and operators. We also plot the analysis pdfs for the two covariances. When Pb is diagonal, the DA process can only update the levels with non-zero values in α. The analysis pdfs corresponding to this case are shown by black dashed lines. In the left column, only the lowest level is updated, while in the centre column the four levels are updated. In the right column, only the top two levels are updated. All observed levels are updated similarly as we apply a non-zero operator with equal values.

A non-diagonal covariance matrix Pb yields a different result because non-zero off-diagonal values communicate information from observed to unobserved levels. The blue dotted lines show the analysis pdfs for this case. The magnitude of the update decreases with distance between the observed layers and non-observed layers, as expected from the exponential off-diagonal decay in Pb.
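The contrast between the diagonal and non-diagonal cases can be illustrated with a minimal sketch. It is not the DEnKF experiment itself: it uses the exact Kalman update for a single scalar observation, which is the limit the very large ensemble approaches. The covariance values, the observation value `y` and its error variance `obs_var` are hypothetical choices made only for illustration.

```python
import math

def kalman_update_scalar(mean_b, cov_b, alpha, y, obs_var):
    """Exact Kalman update for one scalar observation y = alpha . x + noise."""
    n = len(mean_b)
    # P h: covariance between each level and the observed weighted average
    ph = [sum(cov_b[i][j] * alpha[j] for j in range(n)) for i in range(n)]
    # h^T P h + r: variance of the innovation
    s = sum(alpha[i] * ph[i] for i in range(n)) + obs_var
    gain = [p / s for p in ph]
    innovation = y - sum(a * m for a, m in zip(alpha, mean_b))
    mean_a = [m + k * innovation for m, k in zip(mean_b, gain)]
    # Posterior variance per level: diagonal of P - K h^T P
    var_a = [cov_b[i][i] - gain[i] * ph[i] for i in range(n)]
    return mean_a, var_a

NZ = 4
mean_b = [0.0] * NZ
# Hypothetical covariances: unit variances, exponential decay with level separation
cov_diag = [[1.0 if i == j else 0.0 for j in range(NZ)] for i in range(NZ)]
cov_full = [[math.exp(-abs(i - j)) for j in range(NZ)] for i in range(NZ)]

alpha_lowest = [1.0, 0.0, 0.0, 0.0]  # only the lowest level is observed
y, obs_var = 1.0, 0.25               # hypothetical observation and error variance

mean_diag, var_diag = kalman_update_scalar(mean_b, cov_diag, alpha_lowest, y, obs_var)
mean_full, var_full = kalman_update_scalar(mean_b, cov_full, alpha_lowest, y, obs_var)
print("diagonal Pb:    ", [round(m, 3) for m in mean_diag])
print("non-diagonal Pb:", [round(m, 3) for m in mean_full])
```

With the diagonal covariance only the observed level moves. With the exponentially decaying covariance every level is updated, by an amount that shrinks with distance from the observed level.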
4 Offline assimilation experiments using observed infrasound from explosions
We finally proceed to perform offline DA based on real infrasound recordings. The offline character implies that the assimilation at a given observed time is independent from all other times.
4.1 Observations and background
Our observations come from a dataset recording explosions at the Hukkakero site in northern Finland (Gibbons et al., 2007; Gibbons et al., 2015; Gibbons et al., 2019; Blixt et al., 2019). These explosion series were conducted during August and September, with individual explosions typically separated by about 24 hrs. The dataset considered in the current study covers the years 2001–2018. The infrasound waves produced by these explosions were detected at the ground-based ARCES array station in Norway, which is located 178 km due north from the explosion site. It takes the wave around 10 min (on average) to propagate from the source to the station.
Since we know the exact explosion and detection times, as well as the exact source and receiver locations, we can compute the celerity υ value with high accuracy. In fact, we will consider it to be error-free. The back-azimuth deviation angle Δθ for each explosion is obtained from observations. For these observations we consider an unbiased error following a normal distribution with a standard deviation of 1/20 of a degree. Blixt et al. (2019) or Szuberla and Olson (2004) give details on the estimation of observational error in this case.
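As a rough illustration of how these two observed quantities combine, the sketch below computes the celerity from the source–receiver distance and a travel time, converts a back-azimuth deviation into a cross-wind, and propagates the stated 1/20 degree error to first order. The relation w = υ·tan(Δθ) is an assumed stand-in for the arc-tangent expression (4), and the travel time and deviation angle are hypothetical values.

```python
import math

def celerity(distance_m, travel_time_s):
    """Celerity: horizontal source-receiver distance divided by travel time."""
    return distance_m / travel_time_s

def cross_wind_from_deviation(celerity_ms, dtheta_rad):
    """Assumed relation w = v * tan(dtheta); hypothetical stand-in for (4)."""
    return celerity_ms * math.tan(dtheta_rad)

def cross_wind_std(celerity_ms, dtheta_rad, sigma_theta_rad):
    """First-order error propagation: sigma_w = v * sec^2(dtheta) * sigma_theta."""
    return celerity_ms * sigma_theta_rad / math.cos(dtheta_rad) ** 2

v = celerity(178_000.0, 625.0)          # 178 km in a hypothetical 625 s (about 10 min)
dtheta = math.radians(1.0)              # hypothetical back-azimuth deviation of 1 degree
sigma_theta = math.radians(1.0 / 20.0)  # observation error stated in the text

w = cross_wind_from_deviation(v, dtheta)
sigma_w = cross_wind_std(v, dtheta, sigma_theta)
print(f"celerity = {v:.1f} m/s, cross-wind = {w:.2f} m/s, std = {sigma_w:.2f} m/s")
```

Under these assumptions, an angular error of 1/20 of a degree corresponds to a cross-wind error of roughly 0.25 m·s−1 when the celerity is treated as error-free.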
Figure 5 displays the back-azimuth deviation Δθ and the celerity υ for each explosion. The years are separated by black vertical lines. To facilitate visualisation we do not display the exact time of each explosion. We discard data points where the magnitude of the back-azimuth deviation is |Δθ|≥0.75 rad (not shown in the figure), retaining a total of N=370 valid events. Table 1 lists the number of events used and discarded for each summer.

Table 1. Number of observations included and discarded per year
Year | Included | Discarded |
---|---|---|
2001 | 26 | 0 |
2002 | 20 | 0 |
2003 | 21 | 0 |
2004 | 19 | 1 |
2005 | 20 | 1 |
2006 | 28 | 0 |
2007 | 49 | 0 |
2008 | 34 | 1 |
2009 | 20 | 1 |
2010 | 21 | 1 |
2011 | 18 | 1 |
2012 | 19 | 2 |
2013 | 11 | 0 |
2014 | 15 | 0 |
2015 | 12 | 0 |
2016 | 17 | 2 |
2017 | 11 | 0 |
2018 | 9 | 0 |
Total | 370 | 10 |
We extract the background cross-winds from the ERA5 reanalysis product (Hersbach et al., 2019), which has 10 ensemble members. We interpolate the horizontal winds from the native grid to the along-track and cross directions to the great circle connecting Hukkakero and ARCES. This is done for all the 137 ERA5 vertical levels. The time resolution of ERA5 ensemble product is 3 hrs, so we linearly interpolate the wind values to the origin time of the explosion. The propagation time from source to receiver, which is around 10 min, is disregarded when extracting the ERA5 winds. This simplification would not be valid for longer propagation times.
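The time interpolation and the projection onto the track can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the horizontal interpolation from the native grid is omitted, the wind values are hypothetical, and the sign convention (cross-track positive to the right of the track) is an assumption.

```python
import math

def interp_time(t, t0, x0, t1, x1):
    """Linear interpolation of a value between two reanalysis times."""
    frac = (t - t0) / (t1 - t0)
    return (1.0 - frac) * x0 + frac * x1

def rotate_to_track(u_east, v_north, track_bearing_deg):
    """Split a horizontal wind into along-track and cross-track components.

    track_bearing_deg is the bearing from source to receiver, clockwise from
    north. Cross-track is taken positive to the right of the track
    (a hypothetical sign convention).
    """
    b = math.radians(track_bearing_deg)
    along = u_east * math.sin(b) + v_north * math.cos(b)
    cross = u_east * math.cos(b) - v_north * math.sin(b)
    return along, cross

# Hypothetical winds (m/s) at one level at 12:00 and 15:00; explosion at 13:00
u = interp_time(13.0, 12.0, 6.0, 15.0, 9.0)
v = interp_time(13.0, 12.0, -2.0, 15.0, 1.0)
along, cross = rotate_to_track(u, v, track_bearing_deg=0.0)  # track due north
print(round(along, 3), round(cross, 3))
```

For a track pointing due north, the cross-track component reduces to the eastward wind and the along-track component to the northward wind.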
Figure 6 shows statistics for the background cross-wind velocities for the 137 vertical levels (vertical axis) at the time of each explosion over the 18 years (horizontal axis). The vertical lines show the change of year and again the exact times are not shown in the axis. Note that the vertical levels do not have uniform resolution. Figure 6a displays the sample mean over the ten ensemble members. We scale the colours to cover m·s−1. In general, the mean cross-wind speed is characterised by a strong positive jet in the lower levels (around z=10 km), and a strong negative cross-wind in the upper levels (around z=60 km). However, the cross-wind shows a significant variation in time.

Figure 6b shows the cross-wind sample standard deviation over the ten ensemble members. Lower levels have smaller standard deviations than higher levels. For instance, the region above z=50 km has standard deviations of 2 m·s−1 or larger, whereas the standard deviations at levels below 30 km are rarely higher than 0.5 m·s−1. This is expected since the reanalysis data contain information from atmospheric wind observations from these altitudes. The number of observations generally reduces with height (Duruisseau et al., 2017). This plot suggests the observational impact of the infrasound measurements to be higher in the levels above around z=30 km. However, the other factor for this impact involves the coefficients for different vertical levels, which is something we discuss in the next subsection.
Figure 6 displays a time-varying black line at around z=40 km. This represents the estimated maximum altitude to which the infrasound penetrates before being reflected towards the ground. Altitudes above this line cannot be updated directly from the observations. Therefore, updates above this line are due to vertical covariances in the DA process. The return altitude of the infrasound is estimated by matching the travel time of a modelled infrasound ray through the model atmosphere with the observed infrasound travel time, as explained in Blixt et al. (2019).
4.2 Vertical weights
In the synthetic experiments we prescribed coefficients to compute the weighted cross-wind average. In the current section, we estimate these weights from ray-tracing through wind and temperatures (Blixt et al., 2019) extracted from the ERA-Interim reanalysis atmospheric product (Dee et al., 2011). This is shown in Figure 7 for 14 events in 2016. The lines are coloured according to the corresponding celerity υ as indicated in the label box.

This figure shows un-normalised vertical weights for each explosion. Notice that none of the explosion-generated infrasound waves penetrate higher than 50 km altitude, with the majority reaching only around 40 km. It is clear that the waves spend a significant part of the propagation time within the lowermost 10 km and between 30 and 40 km. The celerity υ ranges between υ=274.4 m·s−1 and υ=292.9 m·s−1 for these events.
This process is applied to all 370 explosions, yielding vertical coefficients and maximum vertical penetration values for all the events over 18 years. These profiles are plotted in Figure 8. The horizontal axis corresponds to time, the vertical axis to altitude, and the colours correspond to the un-normalised coefficients.

4.3 The data assimilation



The normalised weights computed using (26) are plotted in Figure 9 for all the events (horizontal axis) and each DA vertical level. There is temporal variability in the weights, especially for the lowermost four levels. Note that the uppermost level (60–72 km) always has zero weights, and in the next level (48–60 km) infrasound waves penetrate for only a few events per year. These upper levels can only be affected by observations through the sample covariance between different levels.
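One plausible way to go from un-normalised native-level weights to normalised DA-level weights is sketched below. It is an assumption made for illustration and not necessarily the form of (26): native weights are summed within each 12 km DA layer and then divided by their total. The altitudes and raw weights are hypothetical.

```python
def aggregate_weights(altitudes_km, raw_weights, layer_km=12.0, n_layers=6):
    """Sum un-normalised native-level weights into DA layers, then normalise."""
    layer_sums = [0.0] * n_layers
    for z, w in zip(altitudes_km, raw_weights):
        k = int(z // layer_km)  # index of the DA layer containing altitude z
        if 0 <= k < n_layers:
            layer_sums[k] += w
    total = sum(layer_sums)
    return [s / total for s in layer_sums]

def weighted_cross_wind(alpha, layer_winds):
    """Weighted average cross-wind seen by the observation."""
    return sum(a * w for a, w in zip(alpha, layer_winds))

# Hypothetical native levels (km) and ray-tracing weights; wave returns near 40 km
altitudes = [2.0, 8.0, 15.0, 22.0, 28.0, 33.0, 38.0, 45.0, 55.0, 65.0]
raw = [3.0, 2.0, 1.0, 1.0, 1.5, 2.5, 3.0, 0.0, 0.0, 0.0]
alpha = aggregate_weights(altitudes, raw)

layer_winds = [5.0, 8.0, 2.0, -3.0, -10.0, -15.0]  # hypothetical layer means (m/s)
print([round(a, 3) for a in alpha], round(weighted_cross_wind(alpha, layer_winds), 3))
```

In this example the two uppermost layers receive zero weight, so the observation constrains them only through the inter-level covariances.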

4.4 The quality of the background covariance





A clear reason for applying inflation is that the ensemble background covariance is often underestimated. This is inherent to small ensemble sizes; van Leeuwen (1999), Sacher and Bartello (2008) and Amezcua and van Leeuwen (2018) give detailed explanations of direct and indirect effects. There are more tangible mechanisms for the misrepresentation of the background covariance, including differences in the resolution of model and observations, and the imperfect representation in the forecast and observational process.
In our experiment set-up we recognise there are sources of imperfection. These include (a) temporal interpolation from the reanalysis times to the time of the observation, (b) consideration of instantaneous velocities while the infrasound wave propagates over around 10 min, and (c) possibility of erroneous assumptions behind the calculation of the α weights inside the ray-tracing technique. We performed the experiments with several inflation values ρ, and below we discuss the results obtained using two of these values.
4.5 Results
Here, we display the results for the following DA settings: Nx=NzDA=6 state variables per observational instant, Ny=1 observation, Ne=10 ensemble members, vertical localisation with a half-width of 15 km, vertical weights coming from the ray-tracing assumed to be perfect, and two different inflation factors ρ=0 and 1. The second inflation value means the standard deviation of the background is doubled compared to the data in the non-inflated assimilation.
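A compact sketch of one analysis step with these settings is given below, following the DEnKF idea of Sakov and Oke (2008): the mean is updated with the Kalman gain and the anomalies with half of it. Several details are assumptions made for illustration: inflation is applied as a factor (1+ρ) on the anomalies, localisation uses a Gaussian taper on the model-space covariance with a 15 km length-scale, and the ensemble, weights, observation and observation variance are hypothetical.

```python
import math
import random

def denkf_scalar_obs(ens, alpha, y, obs_var, rho=0.0, z_mid=None, loc_km=15.0):
    """DEnKF-style update for one scalar observation y = alpha . x + noise.

    ens is a list of Ne members, each a list of Nx layer cross-winds.
    Returns the background mean, the analysis mean and the analysis ensemble.
    """
    ne, nx = len(ens), len(ens[0])
    mean_b = [sum(m[i] for m in ens) / ne for i in range(nx)]
    # Inflated anomalies: factor (1 + rho), so rho = 1 doubles the spread
    anom = [[(1.0 + rho) * (m[i] - mean_b[i]) for i in range(nx)] for m in ens]
    cov = [[sum(a[i] * a[j] for a in anom) / (ne - 1) for j in range(nx)]
           for i in range(nx)]
    if z_mid is not None:
        # Model-space localisation with a Gaussian taper (assumed shape)
        cov = [[cov[i][j] * math.exp(-0.5 * ((z_mid[i] - z_mid[j]) / loc_km) ** 2)
                for j in range(nx)] for i in range(nx)]
    ph = [sum(cov[i][j] * alpha[j] for j in range(nx)) for i in range(nx)]
    s = sum(alpha[i] * ph[i] for i in range(nx)) + obs_var
    gain = [p / s for p in ph]
    innov = y - sum(a * m for a, m in zip(alpha, mean_b))
    mean_a = [m + k * innov for m, k in zip(mean_b, gain)]
    # DEnKF: anomalies are updated with half the Kalman gain
    ens_a = []
    for a in anom:
        ha = sum(al * ai for al, ai in zip(alpha, a))
        ens_a.append([mu + ai - 0.5 * k * ha for mu, ai, k in zip(mean_a, a, gain)])
    return mean_b, mean_a, ens_a

random.seed(0)
z_mid = [6.0, 18.0, 30.0, 42.0, 54.0, 66.0]  # centres of the 12 km layers
alpha = [0.35, 0.15, 0.30, 0.20, 0.0, 0.0]   # hypothetical normalised weights
ens = [[random.gauss(0.0, 0.5 + 0.3 * i) for i in range(6)] for _ in range(10)]
mean_b, mean_a, ens_a = denkf_scalar_obs(ens, alpha, y=1.0, obs_var=0.06,
                                         rho=1.0, z_mid=z_mid)
increments = [a - b for a, b in zip(mean_a, mean_b)]
print([round(d, 3) for d in increments])
```

The two uppermost layers have zero weight in `alpha`, so any increment they receive comes only from the localised sample covariances with the observed layers.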
Figure 11 shows the weighted cross-wind solved from (4) for observations (black line) and computed from (19) for the background (blue line), as well as the resulting analysis (red and green lines, depending on the inflation). To facilitate visualisation, we only display the years 2001 and 2002.

The background and observation cross-winds are similar for some events, but for most events the DA produces changes. In fact, for some events the difference is up to 1 or 2 m·s−1. In the absence of inflation, the background and analysis values are quite close. However, the use of inflation increases the differences between analysis and background, as expected.
The impact of the observations is in general low, especially in the absence of inflation. Several factors can explain this. First, the variance of the background ensemble is small, which is expected since this is a reanalysis product already containing information. The observation impact might be greater if instead we used an ensemble forecast as background. Second, as already mentioned, the ensemble size is small with only Ne=10 members. A larger ensemble would allow us to select different state variables, for instance a larger number of DA vertical levels. Less vertical averaging of the original variables would give a prior with larger variance, hence allowing for larger observational impact. It is important to point out that, in an online setting, the background would come from an ensemble forecast and the infrasound observations would not be the only data assimilated. Another aspect is that the stratospheric winds are in general significantly weaker and less variable in August and September than in winter. It will be interesting, in future work, to perform these experiments for wintertime explosions and to assess the observational impact.
How do changes in the vertically averaged cross-wind translate to the different DA vertical levels? These results are shown in Figure 12, which has two panels corresponding to two selected vertical levels: 0−12 km and 48−60 km for the 2001 and 2002 events. The blue line shows the background mean, with the cyan lines to each side indicating one standard deviation. The red line denotes the analysis mean, with the magenta lines to each side indicating one standard deviation. This analysis was produced using inflation. There are some changes in the values of the cross-wind in the lower level; however, these tend to be small. The difference between background and analysis is more noticeable at higher DA levels, which are not even updated directly (recall most explosions do not penetrate these altitudes) but based on the inter-level covariances. In the no-inflation case, there are still changes, but they are less distinguishable in the plot.




Figure 13a displays the increments, with typical magnitudes between –0.2 and 0.2 m·s−1. Pink colours represent positive increments and green colours represent negative increments. Remember that these increments are the changes that the infrasound observations produce in the background. Figure 13b displays the resulting variance ratios. The plot confirms that these values, as expected, are always smaller than 1. The darkest colours correspond to the greatest reduction in uncertainty. The experiment results in a ratio which descends to around 0.9. However, we keep in mind that the reanalysis data ensembles already contained small statistical uncertainty.
Figure 14a,b summarises the increments dab and the variance ratios rab obtained for each vertical level. Box plots provide a non-parametric summary (with outliers omitted for clarity). These box plots are complemented by the mean as shown by blue dots. There are non-zero increments at all vertical levels, with the largest typically within the level 24–36 km, and the smallest typically within 48–60 km altitude. In the three uppermost altitude layers, at least 75% of the increments are negative. Note that in at least three levels, the mean and the median differ significantly, indicating asymmetry in the distribution of the increments.

Figure 14b shows the resulting variance ratio, a quantity bounded between 0 and 1. In our experiments it falls between 0.9 and 1. Since this has a non-symmetric distribution, the mean and median do not coincide. Notice that the reduction of the variance is largest in the four upper levels, that is, 24–72 km. This is expected because these levels have greatest background uncertainty. Since the waves penetrate only to around 40 km, updating the DA levels at these altitudes is done both directly and via covariances. The lowermost two levels exhibit a limited variance reduction. Although the coefficients for the lowest DA level are significant, the background winds are already well-constrained there, and hence they allow the assimilated infrasound-based data to impact the analysis only to a minor extent.
5 Summary and future work
This is the first study to explore assimilation of atmospheric infrasound data into atmospheric models in order to constrain atmospheric winds. The back-azimuth deviation of infrasound waves carries integrated information related to the cross-winds acting on the wave along its atmospheric propagation path. We show that assimilation of this information using an ensemble Kalman Filter is able to provide corrections to the wind at stratospheric and tropospheric altitudes.
We performed DA experiments for 370 explosion events over 18 years (2001–2018). We know the accurate time and location of the explosions and arrivals of the infrasound waves. This allows us to accurately calculate the propagation time and the celerity υ. It also allows us to perform complementary ray-tracing to determine the vertical sensitivity at different vertical levels, which is needed in the observation operator. To reduce the dimensionality of the problem, we consider average values corresponding to Nz=6 DA levels, each 12 km thick. This is opposed to the original Nz=137 levels of the reanalysis. Here, there might be room for improvement and subsequent work can explore in detail the effect of selecting different numbers of DA vertical levels.
The results of the DA experiments yielded non-zero analysis increments (defined as analysis mean minus background mean) for most times, with the largest values in the 24–36 km layer. More than 75% of the increments calculated above 36 km are negative, suggesting a bias in the background values. As required by construction, the variance in the cross-wind values at all levels has been reduced for all data points assimilated, while for the uppermost levels the reduction reaches up to 2–5%. This implies a reduction of the uncertainty in the estimation.
It would be desirable to apply this framework to existing datasets for explosions performed during the winter season when the stratosphere is more dynamic than in August and September. However, larger cross-wind magnitudes can reduce the linearity of the arc-tangent in (4), which may present a challenge to the DEnKF. We can instead try techniques which better handle departures from linearity. For instance, the iterative ensemble Kalman smoother (e.g., Evensen et al., 2019) is a useful candidate to solve this problem.
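The size of this departure from linearity can be gauged with a small numerical sketch. It assumes that (4) has the form Δθ = arctan(w/υ), which is a stand-in chosen for illustration, and uses hypothetical cross-wind magnitudes with a celerity of 285 m·s−1.

```python
import math

def linearisation_error(cross_wind_ms, celerity_ms):
    """Relative error of replacing arctan(w / v) by its linearisation w / v."""
    x = cross_wind_ms / celerity_ms
    return (x - math.atan(x)) / math.atan(x)

for w in (5.0, 20.0, 60.0):  # hypothetical cross-wind magnitudes in m/s
    err = linearisation_error(w, 285.0)
    print(f"w = {w:5.1f} m/s -> relative linearisation error {100 * err:.3f} %")
```

Under this assumed form the error grows roughly as (w/υ)²/3, so it stays small for the weak late-summer cross-winds and becomes more noticeable for the stronger winds expected in winter.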
For future work, we suggest exploiting signals from natural continuous sources like microbaroms. These are atmospheric infrasound waves produced by ocean surface hot-spots where counter-propagating surface waves are prevalent (Posmentier, 1967; Donn and Rind, 1971; Le Pichon et al., 2006; De Carlo et al., 2020; den Ouden et al., 2020).
In this work we rely on many simplifications. In the future we aim to solve a set-up akin to Figure 2b. This would consider the cross-wind variation both in the vertical and the along-track direction of the infrasound wave. This becomes especially important when the distance between source and receiver increases. An example is the detection in Norway of infrasound from microbaroms generated near Iceland; in this case the separation is about 2000 km and considering a single horizontal slab may be detrimental to the usefulness of the estimation. In this case we may also not be able to consider the winds constant in time for each position along the trajectory.
We have an important advantage when working with the explosions dataset: the times and locations of both the emission and detection of the infrasound waves are known accurately. This, in turn, allows us to consider the celerity υ as perfectly known, which we have done in this work. In the case of microbaroms, for instance, the time and location of the detection may be accurately known, but the location and time of the emission may prove much more elusive. In these cases, there may be large uncertainty on the values of celerity. This, added to the uncertainty in the propagation medium, makes it necessary to consider celerity as another random function. Several previous works establish the pdf for celerity in infrasound propagating under stratospheric waveguide conditions. For example, Blom et al. (2015) used simulations to establish the expected celerity to be between 250 and 350 m·s−1 for propagation distances at around 200 km. Similarly, Morton and Arrowsmith (2014) analysed both simulations and measurements to find a celerity distribution at 275 km distance with values between around 280 and 310 m·s−1. A data-based study presented in Nippress et al. (2014) estimates the celerity distribution at 200 km distance to span the 270 to 300 m·s−1 range.
Regarding the dynamics of the wave propagation, we recognise that the framework applied here requires an auxiliary ray-tracing method to determine the sensitivity to the wind at different vertical levels (first native and then DA levels) in the weighted sum giving the average cross-wind impacting the observation. Follow-up studies could include the development of approaches to instead estimate these sensitivity weights as part of the assimilation process. Then the implementation of an expression akin to (13) might be required. In turn, this would require the state variable to include the along-track wind and the temperature (which the sound speed is a function of) at each of the grid points traversed by the wave. However, this would also provide an opportunity to estimate along-track winds.
An important detail to mention is that a DA process requires a verification step to assess the quality of the analysis field obtained. For identical-twin experiments, one produces a synthetic truth from which the simulated observations were extracted. Then the analysis can be assessed with respect to this reference truth. In operational DA, the true state of the system is unknown, so verification becomes more elusive. One option is to have independent observations or independent reanalysis data which can verify the analysis. In the current study, we do not have independent observations for validation. This in turn restricted us from tuning the values of localisation radius and inflation parameter. Although this is outside the scope of the current study, prospective future studies might have access to independent measurements to allow for tuning and verification. Here, the Atmospheric Dynamics Mission Aeolus satellite project will likely be a reliable benchmark for winds up to 30 km altitude (Tan et al., 2008). Likewise, future validation may be possible using data from portable lidars; for example, the CORAL system (Kaifler et al., 2017; Kaifler et al., 2015) might be upgraded to provide direct wind measurements.
Finally, the DA experiments of this study were made offline. In order to perform online assimilation experiments, it would be necessary to implement the methodology in a dynamic forecasting system. In an operational or quasi-operational setting, infrasound measurements can be added to the rest of the available observations at the moment of assimilation. Although implementation in an operational assimilation system still requires substantial further work, the methodology described in the present study provides a starting point for such developments. This is an objective of a next step following up the Atmospheric dynamics Research InfraStructure in Europe (ARISE and ARISE2) projects (Blanc et al., 2018; Blanc et al., 2019).
Given that single-station infrasound measurements provide atmospheric wind measurements within a sparsely observed altitude range for a given geographical region, an extended or even global multi-station wind sampling might be feasible using, for example, infrasound station data recorded by the International Monitoring System network (Dahlman et al., 2009; Marty, 2019). Hence, there are several opportunities yet to explore in further work related to atmospheric probing and data assimilation using infrasound datasets. A long-term objective is to enhance or constrain the representation of stratospheric winds in global models, thereby contributing to enhanced surface weather predictions on subseasonal-to-seasonal time-scales (Domeisen et al., 2020a; 2020b).
ACKNOWLEDGEMENTS
We thank two anonymous reviewers whose insightful comments and suggestions helped improve and clarify this manuscript. JA acknowledges support and funding from the UK National Centre for Earth Observation. This work was supported by the project Middle Atmosphere Dynamics: Exploiting Infrasound Using a Multidisciplinary Approach at High Latitudes (MADEIRA), funded by the Research Council of Norway basic research programme FRIPRO/FRINATEK under contract no. 274377. SPN and EMB also acknowledge NORSAR institute funding. This study was facilitated by previous research performed within the framework of the ARISE and ARISE2 projects (Blanc et al., 2018; Blanc et al., 2019) funded by the European Commission FP7 and Horizon 2020 programmes (grant agreements 284387 and 653980). The authors declare no conflicts of interest.
Appendix: The ensemble Kalman Filter framework applied in this study
In this work we use the Deterministic Ensemble Kalman Filter (Sakov and Oke, 2008). The Kalman filter (Kalman, 1960; Kalman and Bucy, 1961) is a minimum-variance DA algorithm which relies on the mean and covariance of the state variable. It has two steps: forecast and analysis. It is optimal under Gaussian statistics for the sources of additive error, and linear observation and evolution operators. Under these conditions, the process yields a full Bayesian solution of the problem (e.g., Asch et al., 2016).
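For orientation, the analysis step of the Kalman filter in its standard textbook form can be sketched as follows. This is not a transcription of the appendix equations; the function name, variable names and numerical values are assumptions made for illustration, with a linear observation operator H and observation-error covariance R.

```python
import numpy as np

def kalman_analysis(x_f, P_f, H, R, y):
    """Textbook Kalman filter analysis step for a linear operator H.

    x_f: forecast mean, shape (n,); P_f: forecast covariance, shape (n, n)
    H: observation operator, shape (m, n); R: obs-error covariance, (m, m)
    y: observations, shape (m,)
    """
    S = H @ P_f @ H.T + R               # innovation covariance
    # K = P_f H^T S^{-1}; S and P_f are symmetric, so solve for K^T.
    K = np.linalg.solve(S, H @ P_f).T
    x_a = x_f + K @ (y - H @ x_f)       # analysis mean
    P_a = (np.eye(x_f.size) - K @ H) @ P_f  # analysis covariance
    return x_a, P_a

# Hypothetical two-level state observed through its average.
x_f = np.zeros(2)
P_f = np.eye(2)
H = np.array([[0.5, 0.5]])
R = np.array([[0.5]])
y = np.array([1.0])
x_a, P_a = kalman_analysis(x_f, P_f, H, R, y)
```

In this toy case a single averaged observation moves both levels equally and introduces a negative covariance between them, since the observation constrains only their mean.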
An alternative is to use sample estimators for mean and covariance and to work with ensembles. Evensen (1994) and Hunt et al. (2007) describe how nonlinear operators are handled in this approach.
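A minimal sketch of an ensemble analysis step in the spirit of the Deterministic Ensemble Kalman Filter of Sakov and Oke (2008) is given below: the ensemble mean is updated with the Kalman gain built from sample covariances, and the anomalies are updated with half of that gain. The function name, the restriction to a linear observation operator, and the random test ensemble are assumptions of this sketch; localisation and inflation are omitted.

```python
import numpy as np

def denkf_analysis(ensemble, H, R, y):
    """DEnKF-style analysis for a linear observation operator H.

    ensemble: forecast members as columns, shape (n_state, n_members)
    H: shape (n_obs, n_state); R: shape (n_obs, n_obs); y: shape (n_obs,)
    Returns the analysis ensemble with the same shape as `ensemble`.
    """
    n_members = ensemble.shape[1]
    mean_f = ensemble.mean(axis=1, keepdims=True)
    A_f = ensemble - mean_f                    # forecast anomalies
    HA = H @ A_f                               # anomalies in observation space
    PHt = A_f @ HA.T / (n_members - 1)         # sample estimate of P H^T
    S = HA @ HA.T / (n_members - 1) + R        # sample estimate of H P H^T + R
    K = np.linalg.solve(S, PHt.T).T            # gain (S is symmetric)
    innovation = np.asarray(y, dtype=float).reshape(-1, 1) - H @ mean_f
    mean_a = mean_f + K @ innovation           # update of the mean
    A_a = A_f - 0.5 * K @ HA                   # anomaly update with half gain
    return mean_a + A_a

# Hypothetical three-level state, ten members, one averaged observation.
rng = np.random.default_rng(0)
ensemble = rng.normal(size=(3, 10))
H = np.array([[0.2, 0.3, 0.5]])
R = np.array([[0.25]])
y = np.array([1.0])
analysis = denkf_analysis(ensemble, H, R, y)
```

The half-gain anomaly update avoids perturbing the observations while still reducing the ensemble spread in the observed direction.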
The sample elements in the EnKF are naturally subject to sampling errors which reduce as Ne increases. Localisation (Hamill et al., 2001) is implemented using a straightforward Schur multiplication of (A14a) and (A14b) by an adequate tapering matrix. A compact support approximation to a Gaussian off-diagonal decay is often used for this purpose (Gaspari and Cohn, 1999).
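The localisation step can be sketched with the fifth-order compactly supported function of Gaspari and Cohn (1999), applied as an element-wise (Schur) product. In this sketch the taper multiplies a generic sample covariance matrix, which may differ from the exact terms localised in the appendix; the level heights, half-width and ensemble values are hypothetical.

```python
import numpy as np

def gaspari_cohn(distance, half_width):
    """Fifth-order piecewise rational taper of Gaspari and Cohn (1999).

    Equals 1 at zero separation and exactly 0 beyond 2 * half_width.
    """
    r = np.abs(np.asarray(distance, dtype=float)) / half_width
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    taper[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0
                    - 2.0 / (3.0 * ro))
    return taper

# Hypothetical vertical levels (km) and localisation half-width (km).
levels_km = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
separation = np.abs(levels_km[:, None] - levels_km[None, :])
taper_matrix = gaspari_cohn(separation, half_width=10.0)

# Sample covariance from a hypothetical ten-member ensemble.
rng = np.random.default_rng(1)
members = rng.normal(size=(levels_km.size, 10))
sample_cov = np.cov(members)

# Schur (element-wise) product damps spurious long-range covariances.
localised_cov = taper_matrix * sample_cov
```

With these settings, covariances between levels separated by 20 km or more are set exactly to zero, while the variances on the diagonal are unchanged.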