On diagnosing observation error statistics with local ensemble data assimilation

Recent research has shown that the use of correlated observation errors in data assimilation can lead to improvements in analysis accuracy and forecast skill. As a result, there is increased interest in characterizing, understanding and making better use of correlated observation errors. A simple diagnostic for estimating observation-error statistics makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This diagnostic is derived assuming that the analysis is calculated using a best linear unbiased estimator. In this work, we consider whether the diagnostic is still applicable when the analysis is calculated using ensemble assimilation schemes with domain localization. We show that the diagnostic equations no longer hold: the statistical averages of observation-minus-background and observation-minus-analysis residuals no longer result in an estimate of the observation-error covariance matrix. Nevertheless, we are able to show that, under certain circumstances, some elements of the observation-error covariance matrix can be recovered. Furthermore, we provide a method to determine which elements of the observation-error covariance matrix can be estimated correctly. In particular, the correct estimation of correlations is dependent on both the localization radius and the observation operator. We provide numerical examples that illustrate these mathematical results.


Introduction
A key component in numerical weather prediction (NWP) is the use of data assimilation systems. Data assimilation techniques combine model states, known as forecasts or backgrounds, with observations, weighted by their respective error statistics, to provide a best estimate of the state, known as the analysis.
To obtain an accurate analysis, it is essential to have an accurate representation of the background- and observation-error statistics.
Much attention has been devoted to the estimation and representation of the background-error covariance matrix in variational assimilation systems (e.g. Bannister, 2008). In addition, an entire class of ensemble data assimilation schemes has been developed with the specific aim that the assimilation system itself should provide an estimate of the flow-dependent background-error statistics. First introduced by Evensen (1994), the ensemble Kalman filter estimates the background-error statistics from a statistical sample, or ensemble, of background states during the assimilation-forecast cycle. Many forms of ensemble Kalman filter have been developed: for example Burgers et al. (1998), Houtekamer and Mitchell (1998), Anderson (2001), Evensen (2003) and Tippett et al. (2003). These methods can be split into two categories: deterministic filters and stochastic filters. Stochastic filters make use of a set of perturbed observations, which are required to maintain the correct statistics of the filter (Burgers et al., 1998; Lewis et al., 2006). Deterministic filters do not require these perturbed observations and therefore no additional errors are introduced in the observations. It is these deterministic filters that we will focus on in this article.
A key limitation of ensemble filters is the prohibitively large number of ensemble members required to obtain an accurate representation of the background-error statistics (Whitaker and Hamill, 2002). Therefore, additional constraints are required for ensemble methods to prove effective when used to provide operational weather forecasts.
One approach is to 'localize' the problem by considering only a part of the state or observation space, therefore reducing the necessary ensemble size. The two most common localization methods are state-space (covariance) localization (Hamill et al., 2001; Houtekamer and Mitchell, 2001; Petrie and Dance, 2010) and domain localization (local analysis: Houtekamer and Mitchell, 1998; Ott et al., 2004; Janjić et al., 2011; Nerger et al., 2012), which is used in conjunction with observation localization (Sakov and Bertino, 2011). In covariance localization, the estimated background-error covariance matrix is Schur-multiplied by a localization matrix to suppress spurious background correlations that appear at long range due to sampling error. In domain localization, each model grid point is updated individually using a subset of observations within a given distance. In this article, we will consider the impact of domain localization on the estimation of observation-error statistics.
The quantification of observation errors has been a recent area of research. Typically, observation errors have been assumed uncorrelated and, in an attempt to satisfy this assumption, the data is often thinned or 'superobbed' (Lorenc, 1981). However, it is known that a number of different sources contribute to the observation error, some of which may be correlated, state-dependent and dependent on the model resolution (Lorenc, 1986; Janjić and Cohn, 2006; Waller, 2013; Waller et al., 2014a, 2014b; Hodyss and Nichols, 2015). Research has shown that observation errors can indeed exhibit significant correlations. Furthermore, the inclusion of correlated interchannel errors for satellite observations in data assimilation systems has been shown to lead to a more accurate analysis, the inclusion of more observation information content and improvements in the forecast skill score (Stewart et al., 2008; Stewart, 2010; Weston et al., 2014; Bormann et al., 2016).
One difficulty in quantifying observation-error correlations is that they can only be estimated in a statistical sense, not calculated directly. The method proposed by Desroziers et al. (2005) has become popular for estimating observation-error statistics due to its simplicity (a detailed discussion of this diagnostic is given in section 3). The diagnostic provides an estimate of the observation-error covariance matrix using the statistical average of observation-minus-background and observation-minus-analysis residuals, assuming that the analysis is calculated using least-variance linear statistical estimation. It has also been shown to be applicable to scenarios where the analysis is calculated using both 3D and 4D variational assimilation methods (Desroziers et al., 2005; Stewart, 2010). However, the diagnostic only provides a correct estimate of the observation-error covariance matrix if the assumed background- and observation-error statistics used in the assimilation are correct. As well as the impact of the assumed error statistics, the diagnostic has further limitations, such as the error introduced when using nonlinear observation operators (Terasaki and Miyoshi, 2014) and the fact that an ergodic assumption is often made in order to obtain sufficient sample residuals (Todling, 2015). However, Desroziers et al. (2005) also show that the result may be improved if successive iterations of the diagnostic are applied. Furthermore, with careful interpretation of the results, the diagnostic can still provide useful information about the true observation-error statistics when the assumed statistics, used in the assimilation, are not exact (Ménard, 2016; Waller et al., 2016b). Despite these limitations, the diagnostic has been used successfully in some studies to estimate observation-error variances and correlations. It has been used in simple model experiments (Li et al., 2009; Stewart, 2010; Miyoshi et al., 2013) and to estimate time-varying observation errors (Waller et al., 2014a). The diagnostic has also been applied to operational NWP observations to calculate interchannel error covariances (Bormann and Bauer, 2010; Bormann et al., 2010, 2016; Stewart et al., 2014; Weston et al., 2014; Waller et al., 2016a) and spatial error covariances (Cordoba et al., 2016; Waller et al., 2016a, 2016c) in variational assimilation systems.
Another application of the diagnostics is their use in learning about the assimilation system: for example, to test self-consistency in the system (Desroziers et al., 2005) and to determine sources of errors (Waller et al., 2016a, 2016c).
One issue that appears to have been overlooked is that the diagnostics are derived assuming that the analysis is calculated using a best linear unbiased estimator. In recent work, the diagnostic has been applied to calculate observation errors where the analysis has been calculated using an ensemble assimilation scheme employing domain and observation localization techniques (Lange and Janjić, 2016; Schraff et al., 2016). In this article, we consider whether the diagnostics of Desroziers et al. (2005) are still appropriate for calculating observation-error statistics for observations used in a local assimilation scheme. We provide a new derivation of the diagnostics using the analysis calculated by a local ensemble assimilation. From this derivation, we show that the diagnostic equations no longer hold and that the statistical averages of observation-minus-background and observation-minus-analysis residuals no longer result, in general, in an estimate of the observation-error covariance matrix. However, further analysis of our derived diagnostics shows that, under certain circumstances, some elements of the observation-error covariance matrix can, in principle, be recovered exactly. Those elements that cannot, in principle, be derived exactly we describe as 'incorrectly estimated'. Furthermore, we provide a method to determine which elements of the observation-error covariance matrix can be estimated correctly. In particular, the correct estimation of correlations is dependent on both the localization radius and the observation operator. We provide some special cases showing that, dependent on the specific background- and observation-error statistics and observation operators, one may be lucky and able, in theory, to recover all elements of the observation-error covariance matrix, or unlucky and able to recover none. We also use examples to show that it is possible that, theoretically, some elements will be estimated incorrectly by the diagnostic but, due to the choice of
specific background- and observation-error statistics and observation operators, the estimated values may be close to the true values. However, some prior knowledge of the true statistics is required to be able to validate the quality of the incorrect estimates. Therefore, if the estimated error statistics are to be utilized further, it is necessary to find another method to assign values to those elements of the covariance matrix that cannot be estimated correctly by the diagnostic. This may be achieved by, for example, applying techniques such as those in Higham (2002) to provide a nearest approximate correlation matrix.
Despite these problems, the estimated observation-error covariance matrices can still be useful. It is known that even the use of approximate observation-error covariance matrices in data assimilation can improve analysis accuracy (Healy and White, 2005; Stewart, 2010; Stewart et al., 2013).
This article is organized as follows. We begin in section 2 by describing deterministic ensemble Kalman filters and their local implementation. We review the standard diagnostics of Desroziers et al. (2005) in section 3. In section 4, we derive the diagnostics where the analysis is calculated using a localized assimilation scheme and determine when the diagnostic can be used to estimate error correlations. In section 5, we demonstrate our theoretical results using numerical examples. Finally, we present our conclusions in section 6.

Notation
Data assimilation techniques combine observations, y ∈ R p , available at time t with a model prediction of the state, the background x b ∈ R n , which is often determined by a previous forecast. Here, p and n denote the dimensions of the observation and model state vectors respectively. In the assimilation, the observations and background are weighted by their respective error statistics, using the background- and observation-error covariance matrices, B ∈ R n×n and R ∈ R p×p , to provide a best estimate of the state, x a ∈ R n , known as the analysis. After an assimilation step, the analysis is then evolved forward in time using a (possibly nonlinear) model to provide a background at the next assimilation time.
One of the simplest forms of data assimilation scheme is the best linear unbiased estimator. Using this scheme, the analysis is obtained using

x a = x b + K̃ (y − H(x b)), (1)

where H : R n → R p is the (possibly nonlinear) observation operator and H is the observation operator linearized about the current state. Here, we restrict ourselves to the use of a linear observation operator. The innovation, or observation-minus-background residual, is denoted by

d ob = y − H(x b). (2)

The assumed observation- and background-error covariance matrices R̃ and B̃ are used to weight the observations and background in the assimilation. The matrix

K̃ = B̃H T (HB̃H T + R̃) −1 (3)

is the Kalman gain (K̃ ∈ R n×p) used in the assimilation. We make the distinction between the assumed error statistics, denoted by a tilde, and the exact error statistics, as this is important for the derivation of the diagnostic in section 3.
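The BLUE update of Eqs (1)-(3) can be written down directly in a few lines of linear algebra. This is a minimal sketch: the function name is illustrative, a linear observation operator is assumed and a dense matrix inverse is used for simplicity.

```python
import numpy as np

def blue_analysis(xb, y, H, B_tilde, R_tilde):
    """Best linear unbiased estimator: return the analysis x^a (Eq. 1)
    and the innovation d^ob (Eq. 2), given assumed covariances."""
    S_tilde = H @ B_tilde @ H.T + R_tilde        # assumed innovation covariance
    K = B_tilde @ H.T @ np.linalg.inv(S_tilde)   # Kalman gain, Eq. (3)
    d_ob = y - H @ xb                            # innovation, Eq. (2)
    xa = xb + K @ d_ob                           # analysis, Eq. (1)
    return xa, d_ob
```

As a sanity check, with H = I and a vanishingly small assumed R̃ the analysis is drawn almost entirely towards the observations.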

Deterministic ensemble filters
We now describe the general form of deterministic square-root ensemble filter. At each assimilation time we have an ensemble, a statistical sample of m state estimates x (k) for k = 1, . . ., m. From this ensemble it is possible to calculate the ensemble mean,

x̄ = (1/m) Σ (k=1..m) x (k), (4)

and ensemble perturbation matrix, X ∈ R n×m :

X = (1/√(m − 1)) ( x (1) − x̄, . . ., x (m) − x̄ ). (5)

Using the background ensemble members to calculate the background ensemble mean x̄ b and background ensemble perturbation matrix X b allows us to write the background ensemble covariance matrix as

B̃ = X b X b T . (6)

We note that the background ensemble covariance matrix will not be exact and therefore can be thought of as the assumed background-error covariance matrix. The background ensemble mean is updated to provide the analysis ensemble mean, x̄ a , as

x̄ a = x̄ b + K̃ (y − Hx̄ b), (7)

where the Kalman gain is constructed as in Eq. (3) using the background ensemble covariance matrix given in Eq. (6). (If the observation operator is nonlinear, the matrix K is defined differently, as in e.g. Hunt et al. (2007). Alternatively, the nonlinearity may be dealt with using an augmented state (Evensen, 2003).) The ensemble perturbation matrix update then gives information on the analysis-error covariance matrix. We do not describe the update of the ensemble perturbations in detail, but instead note that a number of different approaches are available (e.g. Tippett et al., 2003), though one must be careful to ensure that the chosen form is unbiased (Livings et al., 2008). We do not include the equations here, as we only need to understand the updated analysis mean for use with the Desroziers et al. (2005) diagnostic, which is introduced in section 3.

Figure 1. When domain localization is applied, the component of the state vector located at the highlighted grid point x 2 would be updated using observations y 1 and y 2 that fall within the shaded area.
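The ensemble quantities above (mean, scaled perturbation matrix and the ensemble covariance B̃ = X b X b T) can be sketched as follows, assuming the 1/√(m − 1) scaling is folded into the perturbation matrix as in Eqs (5)-(6); the function name is illustrative.

```python
import numpy as np

def ensemble_stats(members):
    """members: n x m array whose columns are ensemble states.
    Returns the ensemble mean (Eq. 4), the scaled perturbation
    matrix X (Eq. 5) and the ensemble covariance X X^T (Eq. 6)."""
    m = members.shape[1]
    xbar = members.mean(axis=1, keepdims=True)
    X = (members - xbar) / np.sqrt(m - 1)   # scaled perturbations
    return xbar.ravel(), X, X @ X.T
```

With this scaling, the returned covariance agrees with the usual unbiased sample covariance of the ensemble members.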

Local assimilation
In this section, we consider how domain localization is applied to ensemble assimilation schemes. In domain localization, each component of the background state vector is updated individually using a subset of observations located within a given distance, δ.
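The selection of a local observation subset can be sketched in a few lines, assuming one-dimensional observation and grid-point locations; the function and variable names are illustrative. The sketch returns a 0-1 selection matrix that picks out only the observations within distance δ of the analysis point.

```python
import numpy as np

def selection_matrix(x_loc, obs_locs, delta):
    """Build the selection matrix for the analysis point at x_loc:
    a p{i} x p matrix of ones and zeros that selects the observations
    within distance delta (1-D locations assumed for simplicity)."""
    obs_locs = np.asarray(obs_locs, dtype=float)
    idx = np.where(np.abs(obs_locs - x_loc) <= delta)[0]  # local observations
    rho = np.zeros((idx.size, obs_locs.size))
    rho[np.arange(idx.size), idx] = 1.0
    return rho
```

For instance, with four observations at unit spacing and δ = 1, an analysis point between the first two observations selects exactly those two.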
In general, the ith component of the mean background vector, x̄ i , is updated independently using a local set of p{i} observations, y{i} ∈ R p{i} . The local observations can be defined as a sub-vector of the full observation vector,

y{i} = ρ{i} y, (8)

where ρ{i} ∈ R p{i}×p is a selection matrix, with elements 1 and 0, that selects only observations within a given distance, δ, from the corresponding location of x̄ i . (We remark that it is possible to update a sub-vector of states simultaneously if each component in the sub-vector is updated using the same local set of observations.)

Example. We present an example of the update of a single component in Figure 1. In this schematic, the component x 2 would be updated using only those observations (black dots) that lie within a given radius (shaded grey circle). For this example, we are able to determine that the selection matrix for the highlighted grid point would be

ρ{2} = ( 1 0 0 0
         0 1 0 0 ),

which selects observations y 1 and y 2 . (For this example, it would be possible to update x 1 and x 2 simultaneously, as both of these points would be updated using only observations y 1 and y 2 .) As only a sub-vector of observations is used, it is also necessary to consider only a sub-vector of the modelled observations,

ρ{i} Hx̄ b , (9)

and a local Kalman gain, K̆{i} ∈ R n×p{i} , defined as

K̆{i} = B̃ (ρ{i}H) T { ρ{i}H B̃ (ρ{i}H) T + ρ{i} R̃ ρ{i} T } −1 . (10)

The local update to the ensemble mean is calculated using

x̄ a i = x̄ b i + ( K̆{i} ρ{i} (y − Hx̄ b) ) i . (11)

We note that only the ith row of the ith local Kalman gain is required for the update. Following the update of the ensemble mean, it is necessary to update the ensemble perturbations. Again, we do not describe the update of the ensemble perturbations in detail; rather, we note that the ensemble perturbation updates are also performed locally using the local forms of the required matrices (Sakov and Bertino, 2011).


Diagnostics of Desroziers et al. (2005)

Desroziers et al. (2005) present a set of diagnostics that provides estimates of the observation- and background-error covariance statistics based on combinations of observation-minus-background (OmB, also known as the innovation, given in Eq. (2)), observation-minus-analysis (OmA) and analysis-minus-background (AmB) residuals. In this article, we focus primarily on use of the diagnostics to estimate observation-error statistics. (Use of the diagnostics to estimate background-error statistics is discussed in Appendix A.) In the derivation of the diagnostic in Desroziers et al. (2005), it is assumed that the analysis is determined using the best linear unbiased estimator described in Eq. (1). Calculating the analysis in this way allows the analysis residual (OmA) to be defined as

d oa = y − Hx a = R̃ S̃ −1 d ob . (12)

The standard observation-space diagnostic
Under the assumption that the forecast and observation errors are uncorrelated, Desroziers et al. (2005) show that an estimate of the observation-error covariance matrix can be obtained by taking the statistical expectation of the outer product of the analysis and background residuals:

R e = E[ d oa d ob T ] = R̃ S̃ −1 S, (13)

where R e is the estimated observation-error covariance matrix and B and R are the exact background and observation covariance matrices. Here, we define S = R + HBH T and S̃ = R̃ + HB̃H T .
If the observation and forecast errors used in the assimilation are exact, R̃ = R and B̃ = B, then

R e = E[ d oa d ob T ] = R. (14)

However, in practice the estimate will be subject to sampling error. Here, we will refer to Eq. (14) as the 'standard' form of the observation diagnostic. The calculation of these statistics for non-homogeneous, irregular datasets, where different observations are used in each assimilation cycle, cannot be performed using the matrix multiplication above. Instead, this calculation can be achieved by pairing components of the background and analysis residuals and binning the d oa and d ob pairs. The pairing and binning of observations will depend on the type of correlation being estimated. For example, if we wish to calculate spatial correlations, we may consider pairing only residuals that occur simultaneously in time; the binning of the d oa and d ob pairs will then depend on the distance between the spatial locations of d oa and d ob . If we were to estimate time correlations, then we would consider pairs of residuals at the same location; the binning of the residual pairs would then be dependent on the time difference between d oa and d ob . The covariance, cov(β), is then computed individually for each bin, β, using

cov(β) = (1/N β) Σ (k=1..N β) d oa k d ob k − { (1/N β) Σ (k=1..N β) d oa k } { (1/N β) Σ (k=1..N β) d ob k }, (15)

where (d oa k , d ob k) is the kth pair of elements of d oa and d ob in bin β and N β is the number of residual pairs in bin β. We note that the second term ensures that the calculation is not affected by bias (Waller et al., 2016a).
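The bias-corrected binned covariance of Eq. (15) can be sketched as follows for a single bin β; the function name is illustrative.

```python
import numpy as np

def binned_covariance(doa_pairs, dob_pairs):
    """Bias-corrected sample covariance for one bin, Eq. (15):
    (1/N) sum d^oa_k d^ob_k - (1/N sum d^oa_k)(1/N sum d^ob_k)."""
    doa = np.asarray(doa_pairs, dtype=float)
    dob = np.asarray(dob_pairs, dtype=float)
    N = doa.size
    return (doa * dob).sum() / N - doa.mean() * dob.mean()
```

Subtracting the product of the bin means is what removes the effect of any bias in the residuals, as noted above.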

Observation-space diagnostics using localized analyses
In this section, we revisit the derivation of the diagnostics, but in this case we assume that the analysis is calculated using the local ensemble assimilation described in section 2. For the purposes of this derivation we make several assumptions.
(1) We consider a scalar case where each individual state variable is updated using a local set of observations. We note that this can be extended to the case where a local analysis update is applied to a vector of state variables (e.g. all variables in a given column at a given latitude and longitude) that share the same set of local observations.
(2) We note that in this section we are concerned only with the background and analysis ensemble means and not the individual ensemble members; therefore, to simplify the notation, we drop the overbar.
(3) In section 3, we demonstrated that the standard diagnostic in Eq. (14) is only correct when the R̃ and B̃ used in the assimilation are correctly specified; therefore, for the derivation in this section, we make the assumption that the error covariance statistics used in the assimilation are exact, that is, R̃ = R and B̃ = X b X b T = B.

Derivation of diagnostic using local analyses
If we then assume that the analysis is calculated using a local assimilation scheme, as in Eq. (11), the analysis residual is given by

d oa = y − Hx a = d ob − H(x a − x b). (16)

Taking the statistical expectation of the outer product of the analysis and background residuals, the diagnostic can be written in component form as

R e i,j = R i,j + (HBH T) i,j − (HF) i,j . (17)

In Eq. (17), we define F ∈ R n×p as

F k,j = (G{k}) k,j for j = 1, . . ., p, (18)

where

G{k} = BH T ρ{k} T (ρ{k} S ρ{k} T) −1 ρ{k} S. (19)

We note that F ≠ G{k}, but that the kth row of F is equal to the kth row of G{k}. Furthermore, the rows of the matrix F have a relation to the analysis grid points, with the kth row related to x k , and the columns of the matrix F have a relation to the observations, with the jth column related to the observation y j .
Equation (17) relies on the use of the correct error covariances in the assimilation and is based on updating the state using an assimilation scheme employing localization. Under these conditions, the standard diagnostic gives a correct estimate of the observation-error covariance matrix; however, when local analyses are used, it is not clear whether the diagnostic still gives a correct estimate of the matrix R.
From Eq. (17), we see that the diagnostic will only result in R e i,j = R i,j if (HBH T) i,j − (HF) i,j = 0. To determine whether this holds, we first consider which elements of the matrix F are equal to the corresponding elements of the matrix BH T . To understand which elements of F have the correct value, we must consider the elements of the kth row of each of the matrices G{k}. We note that G{k} is dependent on ρ{k}, which is a selection matrix containing zeros and ones. By reordering the vector of observations, the matrix ρ{k} can always be arranged into the block form ρ{k} = ( I 0 ), where I ∈ R p{k}×p{k} is the identity matrix. Applying this reordering (denoted by a hat, e.g. Ŝ) to all the required matrices, G{k} can be calculated as follows:

Ĝ{k} = B̂H T ρ̂{k} T (ρ̂{k} Ŝ ρ̂{k} T) −1 ρ̂{k} Ŝ
     = B̂H T [1,1] (Ŝ [1,1]) −1 ( Ŝ [1,1]  Ŝ [1,2] )
     = ( B̂H T [1,1]  B̂H T [1,1] (Ŝ [1,1]) −1 Ŝ [1,2] ). (20)

We see that the first set of columns of the matrix, i.e. the columns related to the observations selected by ρ{k} and used in the local update, are equal to the corresponding elements of BH T .
In contrast, the second set of columns are not equal to B̂H T [1,2]. They are, in fact, related to the first set of columns:

Ĝ{k} [1,2] = B̂H T [1,1] (Ŝ [1,1]) −1 Ŝ [1,2] .

In Appendix B, we provide a simple example that demonstrates why the calculation of F fails to produce the desired elements of HBH T and how, in turn, this impacts on the estimate of the observation-error covariance matrix.
Rearranging the columns back to the original observation vector ordering results in G{k} having its jth column equal to the jth column of BH T only if the jth observation was used in the local update of x k . Using this information, we are able to determine that

F k,j = (BH T) k,j if observation y j is used in the local update of x k ,
F k,j ≠ (BH T) k,j otherwise. (21)

In other words, the (k, j)th element of F is equal to the (k, j)th element of BH T only if the analysis for state x k was calculated using the observation y j . We remark that no calculation is required to determine which elements of F have the correct value: it is sufficient to know only which observations are used to update which analysis states. The kth row of F will only have correct elements where the observation y j has been used to update analysis state x k , i.e. is within the localization distance. As well as considering a localization distance around a grid point x k , it is possible to consider a 'region of influence' of an observation. The region of influence of an observation y j is the set of analysis states that are updated in the assimilation using the observation y j .
Using this definition, we are able to determine that the jth column of F will only have correct elements where the analysis state x k has been updated using the jth observation.
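The construction of F from the matrices G{k} in Eqs (18)-(19) can be sketched as follows, assuming explicit (dense) B, H, R and a list of selection matrices; function and variable names are illustrative. Without localization (every selection matrix equal to the identity), the sketch recovers F = BH T exactly, and the columns selected by each ρ{k} match BH T as stated in Eq. (21).

```python
import numpy as np

def build_F(B, H, R, selections):
    """Build F (Eqs 18-19): the kth row of F is the kth row of
    G{k} = B H^T rho{k}^T (rho{k} S rho{k}^T)^(-1) rho{k} S,
    where selections[k] is the selection matrix rho{k}."""
    S = H @ B @ H.T + R
    BHt = B @ H.T
    n, p = BHt.shape
    F = np.zeros((n, p))
    for k, rho in enumerate(selections):
        local_S = rho @ S @ rho.T                      # rho{k} S rho{k}^T
        G_k = BHt @ rho.T @ np.linalg.inv(local_S) @ rho @ S
        F[k, :] = G_k[k, :]                            # only the kth row is needed
    return F
```

This makes the key property above concrete: columns of a row of F corresponding to observations used in that local update agree with BH T, while the remaining columns generally do not.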
Example. In Figure 1 in section 2, we introduced a simple example in which four observations are available to update nine states. Using Eq. (21) and the diagram in Figure 1, we can determine which elements F i,j are equal to (BH T) i,j , denoting elements where F i,j = (BH T) i,j by ✓ and elements where F i,j ≠ (BH T) i,j by ✗. For instance, using Figure 1, we see that x 2 would be updated using only observations y 1 and y 2 and hence, in the second row of F, only the first two columns have correct entries.
In Figure 2, we see the same example as in Figure 1, but now the coloured shaded areas show the region of influence for each corresponding observation. Figure 2 shows that observation y 2 will only be used to update x 1 , . . ., x 6 ; as a result, the second column of F only has the first six elements estimated correctly.

Determining correctly estimated elements of the observation-error covariance matrix for regular datasets
To determine which elements of the observation-error covariance matrix can be estimated correctly, we require only two things:
• to know which elements of the observation operator are zero and
• to know which elements of F are incorrect.
It is useful to store this information in two matrices. Let C ∈ R p×n contain information on the non-zero elements of H, where

C i,j = 1 if H i,j ≠ 0, and C i,j = 0 otherwise, (22)

and let D ∈ R n×p contain information on the incorrect elements of F, where

D i,j = 1 if F i,j ≠ (BH T) i,j , and D i,j = 0 otherwise. (23)

We note that, to compute D, it is sufficient only to know the location of the observations in relation to the location of the analysis states. The matrix C provides information about the 'domain of dependence' of an observation.
The domain of dependence of an observation y i is the set of elements of the model state that are used to calculate the model equivalent of y i (i.e. the set of states x k for which H i,k ≠ 0). The form of C and D ensures that the product of these two matrices, L = CD, can be used to determine which elements of R e can be estimated correctly using the diagnostic. In other words,

if L i,j = 0, then R e i,j = R i,j ; otherwise, R e i,j ≠ R i,j . (24)

We note that, for the elements where L i,j ≠ 0, the estimated value of R e i,j is given by Eqs (17), (18) and (19). In section 5, we provide examples showing that the values estimated incorrectly by the diagnostic may be close to the true values or far from the truth; hence, without some prior knowledge of the true statistics, one cannot be certain of the quality of the incorrect estimates.
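The bookkeeping of Eqs (22)-(24) can be sketched as follows: C flags the non-zero entries of H, D flags the incorrect entries of F, and the zeros of L = CD mark the elements of R that the diagnostic can, in principle, recover. The function name and tolerance are illustrative.

```python
import numpy as np

def recoverable_elements(H, F, BHt, tol=1e-10):
    """Return a boolean p x p array that is True where R^e_{i,j} = R_{i,j}
    (i.e. where L_{i,j} = 0 in Eq. 24)."""
    C = (np.abs(H) > tol).astype(int)        # Eq. (22): non-zero entries of H
    D = (np.abs(F - BHt) > tol).astype(int)  # Eq. (23): incorrect entries of F
    L = C @ D                                # Eq. (24)
    return L == 0
```

Note that, as discussed below, the resulting pattern need not be symmetric: element (i, j) can be recoverable while (j, i) is not.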
Using the definitions of the domain of dependence and region of influence, we can interpret Eq. (24) further as follows: the correlation between the errors of observations y i and y j can be estimated using the diagnostic only if the domain of dependence for observation y i lies within the region of influence of observation y j .
From this statement, we can see that using the diagnostic will not necessarily result in a symmetric matrix; it is possible that the domain of dependence for observation y i lies within the region of influence of observation y j , but that the domain of dependence for observation y j does not lie within the region of influence of observation y i . Hence, in this case, the element R i,j can be estimated whereas the element R j,i cannot.
Although developed in the framework of a linear observation operator, our conclusion should hold when a nonlinear observation operator is applied, as it depends only on which states the observation operator acts on and not how the observation operator acts on the state.
Example. We must now assume a form for our observation operator in our example. We assume that, to calculate a model observation equivalent, the observation operator acts on the four closest state points to the observation itself. This is shown in Figure 2, where, for each coloured observation, the state points required to make the model observation are selected with a correspondingly coloured square. Using an observation operator of this form and the previous step of our example, we are able to determine the matrices C and D. By comparing D with the F previously calculated in our example, we see that D represents in binary the correct and incorrect elements of F. Finally, using C and D, we are able to determine that

L = CD = ( 0 0 ∗ ∗
           ∗ 0 ∗ ∗
           ∗ ∗ 0 0
           ∗ ∗ 0 0 ),

where ∗ denotes a non-zero entry. The matrix L tells us which elements of the observation-error covariance matrix can be estimated accurately. Using L along with Figure 2, we see the following.
• All the observation-error variances (diagonal elements) are estimated correctly.
• We are not able to estimate the elements R 1,3 , R 3,1 , R 1,4 and R 4,1 . This is intuitive, since we see from Figure 2 that observations y 3 and y 4 do not fall within the region of influence of y 1 .
• We are able to estimate R 3,4 and R 4,3 , since the domain of dependence for observation y 3 lies within the region of influence of y 4 and the domain of dependence for observation y 4 lies within the region of influence of y 3 .
• We are able to estimate R 1,2 but cannot estimate R 2,1 , despite the fact that y 1 and y 2 lie within each other's regions of influence. This is because the domain of dependence for observation y 1 lies within the region of influence of y 2 , but the domain of dependence for observation y 2 does not lie within the region of influence of y 1 .
• The remaining entries of R cannot be estimated correctly.
In our example, we are able to estimate correctly 7 of the 16 elements of R. In general, the number of elements that can be estimated correctly will be dependent on the localization radius and observation operator. We note two particular special cases of H that show that one may be lucky and recover all elements of the observation-error covariance matrix, or be unlucky and recover none.
• If H contains no zero entries (a scenario that is likely to be unphysical), then the diagnostics may provide no correct information when applied to analyses calculated using localization.
• If H = I and n = p, i.e. the system is fully observed, then the diagnostic will have the same correct entries as the matrix F. If, in addition, B and R are diagonal, then a local assimilation scheme does produce the correct solution (this follows since, for all G{k}, the components of Ŝ [1,2] are zero).

This methodology for determining correct elements is applicable to regular datasets, where the same observation locations are used in each assimilation cycle. In this scenario, since the same set of local observations will be used for each local analysis update in each assimilation cycle, it will only ever be possible to recover certain elements of the observation-error covariance matrix.
To use the estimated statistics in an assimilation scheme, it will be necessary to 'fill in' those elements of the covariance matrix that cannot be estimated correctly by the diagnostic. It may be possible to 'fill in' some of these elements using the calculated information: e.g., if we have been able to estimate R e i,j correctly, but not R e j,i , then, using the symmetric nature of a correlation matrix, we can replace R e j,i with R e i,j . For the elements of the observation-error covariance matrix for which we have no reliable information, it may be necessary to assume no correlation or to estimate the elements by employing more sophisticated techniques such as those in Higham (2002) to provide a nearest approximate correlation matrix.
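As an illustration of this 'fill in' step, the sketch below symmetrizes an estimate and then removes negative eigenvalues before rescaling to unit diagonal. This eigenvalue-clipping approach is a crude stand-in for the nearest-correlation-matrix algorithm of Higham (2002), not the algorithm itself; the function name is illustrative.

```python
import numpy as np

def make_valid_correlation(A):
    """Return a symmetric positive semi-definite matrix with unit
    diagonal, obtained from A by eigenvalue clipping (a simple
    substitute for Higham's 2002 nearest-correlation algorithm)."""
    A = 0.5 * (A + A.T)                      # symmetrize (e.g. copy R^e_{i,j} into R^e_{j,i})
    w, V = np.linalg.eigh(A)
    A = (V * np.clip(w, 1e-8, None)) @ V.T   # clip negative eigenvalues
    d = np.sqrt(np.diag(A))
    return A / np.outer(d, d)                # restore unit diagonal
```

Because the final rescaling is a positive diagonal congruence, positive semi-definiteness is preserved while the diagonal returns to one.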

Application to irregular datasets
If the dataset is irregular, then we must modify the above approach slightly. As discussed in section 3, if we wish to calculate observation errors for irregular observations, then we must use the summation form of the diagnostic and bin samples, rather than a simple multiplication of samples d oa and d ob . Instead of calculating R e and determining which elements have been estimated correctly, we must be more careful and only consider pairs of d oa and d ob that give the correct result. This can be achieved by calculating C, D and L for each assimilation cycle. We can then use the values in the matrix L to determine which d oa d ob pairs can be used in the observation-error statistics calculation: the pair d oa i d ob j may be used only if L i,j = 0. Equivalently, it is appropriate to use d oa i d ob j in the calculation of observation-error correlation only if the observation operator applied to calculate observation y i acts only on states that have been updated using the observation y j . In other words: when using the diagnostic to estimate observation-error statistics, it is appropriate to use the pair of residuals d oa i d ob j only if the domain of dependence for observation y i falls within the region of influence of y j .
When calculating spatial correlations, one often assumes that the observation-error correlations are isotropic. Because of this assumption, we may be more successful in recovering the observation-error statistics than in the regular-dataset case. Even so, it is possible that we will not be able to recover information on all the correlations we require. In this case, it may be necessary to fit a correlation function to the estimated data.

Experiment design
In this section, we describe ensemble-filter and local-ensemble-filter twin experiments based on the example presented in Figures 1 and 2 in sections 2 and 4. We set an explicit grid spacing of one unit and a localization distance of δ = 1.5. We use the analyses from these experiments to estimate the observation-error covariance matrix and use the results to illustrate the theoretical results presented in section 4.
We define the observation operator to be bilinear interpolation with equal weighting, using the matrix C defined in the example in section 4.2. We determine the true observation- and background-error covariance matrix elements using

R_{i,j} = σ_r² exp(−y_{i,j}/L_r),    (26)
B_{i,j} = σ_b² exp(−x_{i,j}/L_b),    (27)

where y_{i,j} is the distance between observations y_i and y_j, and x_{i,j} is the distance between states x_i and x_j. Both B and R thus have Markov correlation structures (Wilks, 1995).
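For a one-dimensional set of positions, covariance matrices of this exponential (Markov) form can be constructed as in the sketch below; the function name is ours.

```python
import numpy as np

def markov_covariance(positions, sigma, L):
    """Covariance matrix with an exponential (Markov) correlation structure:
    C_{i,j} = sigma^2 * exp(-|pos_i - pos_j| / L)."""
    d = np.abs(positions[:, None] - positions[None, :])  # pairwise distances
    return sigma**2 * np.exp(-d / L)
```

For unit grid spacing with sigma = 1 and L = 1, this gives unit variances and a nearest-neighbour correlation of exp(−1) ≈ 0.368; the resulting matrix is symmetric positive definite, as required of a covariance matrix.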
For the assimilation experiments we have a single true solution; from this truth, pseudo-observations are created by adding errors drawn from N(0, R), and the background is determined by adding errors drawn from N(0, B). The background is then perturbed to create the ensemble members. We perform a single assimilation step using both a standard ensemble Kalman filter (Eq. (7)) with 1000 members and a local ensemble filter (Eq. (11)) with either 1000 or 100 members. To ensure the accuracy of the background ensemble covariance matrix, P^f, and to avoid the ergodic assumption of the diagnostic, we do not implement a dynamical model. Instead, the single-assimilation-cycle experiment is repeated 5000 times, with a new background, new ensemble members and new pseudo-observations created each time. The resulting observation-minus-analysis (OmA) and observation-minus-background (OmB) residuals are used to provide an estimate of the observation-error covariance matrix. (We note that we choose a large number of residual samples, since we wish to reduce the influence of sampling error on the diagnostic.) We analyse the quality of the estimated observation-error correlation matrix by calculating, for each element, the percentage error compared with the true error covariance:

100 × |R^e_{i,j} − R_{i,j}| / |R_{i,j}|.    (28)
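The repeated single-cycle procedure can be sketched as follows. For illustration we assume a fully observed state (H = I), exact B and R of the exponential form above, and a standard (non-local) BLUE update computed directly from the Kalman gain, for which the diagnostic should recover R up to sampling error; the dimensions and parameter values here are our own, not those of the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4                                   # state dimension, fully observed here
H = np.eye(n)                           # identity observation operator
x = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])  # pairwise distances
B = 4.0 * np.exp(-x / 0.5)              # true background-error covariance
R = 1.0 * np.exp(-x / 1.0)              # true observation-error covariance
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)  # Kalman gain (BLUE update)

n_samples = 5000
acc = np.zeros((n, n))
Lb = np.linalg.cholesky(B)
Lr = np.linalg.cholesky(R)
for _ in range(n_samples):
    xt = np.zeros(n)                              # true state
    xb = xt + Lb @ rng.standard_normal(n)         # background: truth + N(0, B) error
    y = H @ xt + Lr @ rng.standard_normal(n)      # pseudo-observations: + N(0, R) error
    d_ob = y - H @ xb                             # OmB residual
    xa = xb + K @ d_ob                            # analysis
    d_oa = y - H @ xa                             # OmA residual
    acc += np.outer(d_oa, d_ob)                   # accumulate E[d_oa d_ob^T]
R_est = acc / n_samples                           # diagnostic estimate of R
```

With 5000 samples, R_est matches R to within sampling error of a few percent; replacing the BLUE update with a domain-localized ensemble update is the setting analysed in section 4, where the diagnostic is no longer exact.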

Experimental results
We now consider how well the diagnostic estimates R^e for two different choices of B. We note that the example in section 4.2 shows which elements we expect to be estimated correctly and incorrectly. We first consider the case where σ_b = σ_r = 1 and L_b = L_r = 1. We plot the percentage errors in Figure 3(a) for R^e estimated using 5000 residual samples from a 1000-member ensemble filter, in Figure 3(b) for R^e estimated using 5000 residual samples from a 1000-member local ensemble filter, and in Figure 3(c) for R^e estimated using 5000 residual samples from a 100-member local ensemble filter. We see that, for this choice of B and R, the errors in the 'incorrectly estimated' elements are small, with the largest error in Figure 3(b) being similar to the percentage error of the reference R^e calculated using analyses from a standard ensemble filter. We also note that reducing the number of ensemble members appears to reduce the percentage error values slightly.
We next consider the case where σ_b = 2, σ_r = 1, L_b = 0.5 and L_r = 1. We plot the percentage errors in Figure 4(a) for R^e estimated using 5000 residual samples from a 1000-member ensemble filter, in Figure 4(b) for R^e estimated using 5000 residual samples from a 1000-member local ensemble filter, and in Figure 4(c) for R^e estimated using 5000 residual samples from a 100-member local ensemble filter. We note that the colour scale for Figure 4 is a factor of four larger than that in Figure 3. We see that, for this choice of B, the errors in the 'incorrectly estimated' elements are large, with some of the errors as large as the elements themselves. In this example, it is clear that the correctly estimated elements provide a reasonable estimate of the true error statistics, whereas the incorrectly estimated elements provide poor information.
These two examples highlight that it is not possible, in general, to determine whether the incorrectly estimated elements will give estimates close to the true values. This suggests that one can never be certain about the quality of the incorrect estimates.

Conclusions
To obtain an accurate analysis, it is important that the observation- and background-error statistics are specified accurately in the data assimilation system. A number of operational NWP centres now use ensemble assimilation schemes that provide flow-dependent background-error statistics. However, for the computational cost of these ensemble methods to be affordable, it is necessary to employ additional constraints such as variance inflation and localization. The improved treatment of observation-error statistics is a more recent area of research. The positive impact of including correlated observation-error statistics in operational data assimilation has highlighted the need to improve their specification and inclusion in assimilation systems. One method available to estimate observation-error statistics is the diagnostic of Desroziers et al. (2005). The diagnostic makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals to provide an estimate of the observation-error covariance matrix. However, the diagnostic only gives a correct estimate of the observation-error statistics if the background- and observation-error statistics used in the analysis update are correctly specified. Despite this limitation, the diagnostic has been used successfully to estimate observation-error statistics for numerical weather prediction (Bormann and Bauer, 2010; Bormann et al., 2010, 2016; Stewart et al., 2014; Weston et al., 2014; Cordoba et al., 2016; Waller et al., 2016a, 2016c). One important fact that appears to have been overlooked is that the diagnostic is derived assuming that the analysis is calculated using a best linear unbiased estimator.
Here, we consider whether domain localization affects the results of the diagnostic of Desroziers et al. (2005). We provide a new derivation of the diagnostic, in which we assume that the analysis has been calculated using a local ensemble filter. This derivation shows that, in general, even when the assumed background- and observation-error statistics are exact, the diagnostic no longer provides a correct estimate of the observation-error covariance matrix.
Although we show that the diagnostic no longer provides a correct estimate of the observation-error covariance matrix, we are able, under the assumption that the assumed background- and observation-error statistics are correctly specified, to determine which elements of the observation-error covariance matrix are recoverable. It is possible to estimate the error correlation between two observations only if the observation operator applied to calculate the model equivalent of observation y_i acts only on states that have been updated using the observation y_j. In other words, one can determine the following.
The correlation between the errors of observations y_i and y_j can be estimated using the diagnostic only if the domain of dependence for observation y_i lies within the region of influence of observation y_j.
Using the same logic, it is possible to determine which pairs of analysis and background residuals can be used to provide an accurate estimate of the observation-error statistics. For special (and unlikely) cases, we show that one may be lucky and recover all elements of the observation-error covariance matrix, or unlucky and recover none. Using examples, we show that the values estimated incorrectly by the diagnostic may be close to the true values or far from the truth; hence, without some prior knowledge of the true statistics, one cannot be certain of the quality of the incorrect estimates.
To use the estimated statistics in an assimilation scheme, it will be necessary to 'fill in' those elements of the covariance matrix that cannot be estimated correctly by the diagnostic. Some elements may be determined by using the symmetric nature of a correlation matrix. For those correlations for which we have no reliable information, it may be necessary to assume a particular correlation structure, or to estimate the elements by employing more sophisticated techniques, such as those of Higham (2002), to provide a nearest approximate correlation matrix.
If the assumed background- and observation-error statistics are not exact, then those elements that can be estimated will also be affected. In this case, the theoretical work of Waller et al. (2016b) may be used to understand how the assumed statistics affect the estimated observation-error statistics.
We note that we have not considered the impact of variance inflation here; if the inflation factor is not correct, then the variances of the background-error statistics will be either under- or overestimated. In either case, the theoretical work of Waller et al. (2016b) can be used to provide an understanding of the impact on the estimated observation-error statistics. We do not discuss other forms of localization here, as a different derivation of the diagnostic would be required.
Our new derivation may, at first glance, suggest that it is no longer possible to use the diagnostic to estimate observation-error statistics using analyses that have been calculated with an assimilation system employing localization. However, with careful consideration, the diagnostic may still be used to calculate some elements of the observation-error covariance matrix correctly.

We must first determine the elements of the matrix F, which can be calculated using Eq. (19). We note that the variances are estimated correctly, but the covariances are not; furthermore, the estimated matrix is not even symmetric. We now consider three cases to illustrate how the 'incorrect' elements can be very poor, or may indeed give reasonable estimates, although for the wrong reasons.

Figure 1. Schematic of domain localization, with grid points (pluses), observations (dots) and a local domain (shaded grey circle) with localization radius δ. Note that we order the grid points from left to right and top to bottom, with x_1 (top left) to x_9 (bottom right). For clarity, we label only selected grid points in the figure. When domain localization is applied, the component of the state vector located at the highlighted grid point x_2 would be updated using observations y_1 and y_2, which fall within the shaded area.
© 2017 The Authors. Quarterly Journal of the Royal Meteorological Society published by John Wiley & Sons Ltd on behalf of the Royal Meteorological Society. Q. J. R. Meteorol. Soc. 143: 2677–2686 (2017)

Figure 2. Schematic of the regions of observation influence when domain localization is applied to ensemble assimilation schemes. Grid points (pluses) and observations (dots), with observations coloured according to their corresponding regions of observation influence (shaded coloured circles). Assuming that the model equivalent observations are calculated using the four nearest model states, the coloured squares around grid points mark the points that would be utilized by the observation operator for the observation of the corresponding colour. Note that we order the grid points from left to right and top to bottom, with x_1 (top left) to x_9 (bottom right). For clarity, we label only selected grid points in the figure.


Figure 3. Percentage error in the estimated observation-error covariance matrices when σ_r = 1, σ_b = 1, L_r = 1 and L_b = 1. (a) Percentage error in elements of R^e estimated using analysis residuals from a 1000-member ensemble Kalman filter; largest error 13.4%. (b) Percentage error in elements of R^e estimated using analysis residuals from a 1000-member local ensemble Kalman filter; largest error 14.0%. (c) Percentage error in elements of R^e estimated using analysis residuals from a 100-member local ensemble Kalman filter; largest error 12.2%. Note that the colour scale is a factor of four smaller than that in Figure 4.

Figure 4. Percentage error in the estimated observation-error covariance matrices when σ_r = 1, σ_b = 2, L_r = 1 and L_b = 2. (a) Percentage error in elements of R^e estimated using analysis residuals from a 1000-member ensemble Kalman filter; largest error 8.5%. (b) Percentage error in elements of R^e estimated using analysis residuals from a 1000-member local ensemble Kalman filter; largest error 97.7%. (c) Percentage error in elements of R^e estimated using analysis residuals from a 100-member local ensemble Kalman filter; largest error 97.7%. Note that the colour scale is a factor of four larger than that in Figure 3.
The estimated covariance matrix is given by

R^e = R + HBH^T − HF.    (B1)

B2. Case 2
If we let R and B be correlation matrices (a = c = d = f = 1), with B uncorrelated (e = 0), then

(b + e)/(a + d) = (b + e)/(c + f) = b/2.

The estimated correlation is half the value it should be.

B3. Case 3
If we let R and B be correlation matrices (a = c = d = f = 1), with −b = e, then

(b + e)/(a + d) = (b + e)/(c + f) = 0.

The estimate provided by the diagnostic suggests no correlation, when in fact there should be correlations, possibly very strong.