The role of cross-domain error correlations in strongly coupled 4D-Var atmosphere-ocean data assimilation

Strongly coupled atmosphere–ocean data assimilation offers the ability to improve information exchange across the modelled air-sea interface by enabling observations in one domain to have a direct influence on the analysis in the other. For incremental 4D-Var assimilation a strongly coupled approach enables both domains to be updated at the beginning of the assimilation window, whether they are observed or not, and is hence more likely to produce consistent initial model states. This is made possible by the explicit inclusion of cross-domain forecast error covariance information in the coupled forecast error covariance matrix. In this study we use an idealised 1D single-column coupled atmosphere–ocean model to examine the extent to which explicit cross-domain forecast error covariances play a role in shaping the coupled analysis increments compared to those implicitly generated in the inner-loop of the incremental formulation of the 4D-Var algorithm. This is done via a set of single-observation experiments with and without initial cross-domain forecast error covariances prescribed. Using single observations allows us to obtain explicit expressions for the atmosphere and ocean analysis updates, separating out the individual effects of the explicitly prescribed and implicitly generated cross-domain covariances. Our experiments show that when only one domain is observed, including explicit cross-domain error covariances allows more consistent adjustment of the unobserved domain


INTRODUCTION
The potential benefits of coupled data assimilation (DA) are well recognised and widely documented (e.g., Lawless, 2012;Penny et al., 2017). The field is still relatively young and methodologies being explored extend across a range from (very) weakly to fully or strongly coupled. In weakly coupled DA, the model-observation misfits (the "innovations") for each individual domain are measured against the coupled model forecast state but the analysis is computed independently for each model component; in strongly coupled DA, the individual components of the Earth system are brought together and analysed within a single seamless assimilation framework.
Although the specific details of application and extent of coupling vary, exploratory studies employing both variational (e.g., Lea et al., 2015;Smith et al., 2015;Fowler and Lawless, 2016;Laloyaux et al., 2016) and ensemble (e.g., Frolov et al., 2016;Sluka et al., 2016) based approaches have shown that coupled DA offers important advantages over single-domain DA.
In weakly coupled atmosphere-ocean data assimilation, the atmosphere and ocean model components only directly interact during the coupled model forecast. There is some indirect exchange of information via the innovations, but the immediate impact of the observations is limited to the domain in which they reside. A consequence of this is that the atmosphere and ocean analyses are likely to be disjointed or unbalanced (Smith et al., 2015). An option for increasing the level of coupling in incremental variational DA formulations is to perform multiple outer-loop updates. Such an approach is used at the European Centre for Medium-Range Weather Forecasts (ECMWF) and is termed "quasi-strongly coupled" DA (Laloyaux et al., 2016). In a recent study, Laloyaux et al., (2018) compared how closely this approach approximates a strongly coupled ensemble Kalman filter (with coupled ensemble covariances) via a series of single sea-surface temperature (SST) observation experiments. They found that the implicit outer-loop coupling method produced reliable estimates of the coupled state, as measured by the reduction of error in the surface air temperature estimate relative to a specified truth state. However, because there was no correction to the initial surface air temperature field with this approach, a long (at least 12 hr) assimilation window was needed to allow initial imbalances in the atmosphere and ocean increments to synchronize. The authors acknowledge that, for assimilation systems using short windows, explicit coupling may be more beneficial.
A key motivation for transitioning to fully or strongly coupled atmosphere-ocean data assimilation is the ability to increase information exchange across the modelled air-sea interface by enabling the atmosphere observations to directly influence the ocean analysis and viceversa. For variational methods that assimilate observations across a given time window, a strongly coupled approach enables both domains to be updated at the initial time, whether they are observed or not, and is hence more likely to produce consistent initial model states. This is made possible by the explicit inclusion of cross-domain forecast error covariance information in the coupled forecast (or background) error covariance matrix B. The role of the matrix B in determining the shape and size of the analysis increments in DA is well known and easily demonstrated (e.g., Kalnay, 2003;Bannister, 2008a). In particular, the forecast error correlations govern how information contained in the observations is spread throughout the model domain, passing information from observed to unobserved variables (Johnson et al., 2005). Theoretically, the case for using a strongly coupled approach is simple; where there are significant non-zero cross-correlations between the atmosphere and ocean model forecast errors, the strongly coupled DA formulation allows us to account for these and thus encourage consistency between the analysed atmosphere and ocean model states. In practice, things are less straightforward as the true structure of the coupled error covariance matrix is unknown. Cross-domain error covariances exist on multiple space-and time-scales and vary with the dynamics of the situation being modelled, and so constructing an appropriate representation of them presents a significant challenge (e.g., Han et al., 2013;Lu et al., 2015;Smith et al., 2017;Penny et al., 2017;Penny et al., 2019).
Our initial work (described in Smith et al., 2015) using an idealised 1D single-column coupled atmosphere-ocean model and employing an incremental 4D-Var assimilation algorithm found that, even when the prescribed matrix B is assumed to be diagonal (i.e., the prior atmosphere and ocean forecast errors are assumed to be univariate and vertically uncorrelated), the strongly coupled incremental 4D-Var approach, which uses a fully coupled tangent linear model, generally outperformed both the uncoupled and weakly coupled formulations in terms of producing more balanced initial analysis fields and reducing initialisation shock in subsequent forecasts. This is attributed to the ability of the incremental 4D-Var algorithm to implicitly generate forecast error covariances by evolving B across the assimilation window to the time of each observation according to the tangent linear model dynamics (Bannister, 2008a). A strongly coupled 4D-Var system can therefore produce cross-domain forecast error covariances between the atmosphere and ocean variables, even if they are initially set to zero. The assumption of a diagonal B is a great simplification; it is expected that the inclusion of cross-domain information in the initial B matrix will have further positive impact on the coupled assimilation in that it will enable greater information exchange between atmosphere and ocean model components and thus allow near-surface observations to be used to even greater effect.
In a follow-up to the work in Smith et al., (2015), we used an ensemble of strongly coupled 4D-Var assimilations to derive estimates of the atmosphere-ocean forecast error cross-correlations for our system, comparing their strength and structure between summer and winter and between day and night (Smith et al., 2017). This study found notable variation in the atmosphere-ocean error cross-correlations within the near-surface atmosphere-ocean boundary layer for different times of day and year. These were explained using knowledge of the underlying coupled model physics, external forcing and known atmosphere-ocean feedback mechanisms.
The purpose of this paper is to examine the impact of using the ensemble error correlations from Smith et al., (2017) to explicitly prescribe non-zero a priori cross-domain covariances between the atmosphere and ocean forecast errors in a strongly coupled incremental 4D-Var assimilation, where the nonlinear, tangent linear and adjoint models are all fully coupled. In addition to a fully coupled matrix B full , we also consider using a block diagonal representation, B diag , in which the initial cross-domain error covariances are assumed to be zero, but full error covariances are prescribed for the atmosphere and ocean domains. This allows us to examine the extent to which the explicit cross-domain error covariances play a role in shaping the analysis increments compared to those implicitly generated by the incremental 4D-Var algorithm. Our focus is on separating out the effects of the explicit versus implicit cross-domain error covariances within the inner-loop; therefore we do not additionally investigate the correlations introduced by using multiple outer-loop updates; all our experiments are run with only one outer-loop. This paper is organised as follows: in Section 2 we introduce our 1D coupled atmosphere-ocean model system and briefly describe the strongly coupled 4D-Var approach, including implementation of the coupled forecast error covariance matrices. Details of the experimental design are given in Section 3. In particular we present explicit expressions for the atmosphere and ocean analysis increments for the special case of a single outer-loop and direct observations taken at a single time point, and explain how these will differ if we use B = B diag rather than B = B full . Results are presented in Section 4, followed by a summary and discussion in Section 5.

THE 1D COUPLED SYSTEM
In this section we provide a brief overview of our 1D coupled model system; a complete description of the atmosphere and ocean model components and strongly coupled incremental 4D-Var assimilation algorithm is given in Smith et al., (2015); this includes the model equations and details of the model validation. The ensemble methodology used to estimate the coupled forecast error correlations is described in Smith et al., (2017), together with a detailed discussion of the different interactions between the atmosphere and ocean model variables and their errors.

The model
The model comprises a streamlined version of the European Centre for Medium-range Weather Forecasts (ECMWF) single-column atmosphere model, which is based on an early version of their Integrated Forecast System (IFS) code, coupled to a 1D mixed-layer ocean model which was developed by the National Centre for Atmospheric Science (NCAS) Centre for Global Atmospheric Modelling (Woolnough et al., 2007) and is based on the K-Profile Parametrization (KPP) vertical mixing scheme of Large et al., (1994). The atmosphere model uses a hybrid vertical coordinate with 60 unevenly spaced levels which extend from the surface (∼10 m height) to 0.1 hPa with the finest resolution in the planetary boundary layer. The prognostic atmosphere model variables are temperature, specific humidity, zonal and meridional winds. The ocean model has 35 fixed levels which stretch from 1 to 250 m depth with finest resolution in the top 25 m. The prognostic ocean model variables are temperature, salinity, zonal and meridional currents. The atmosphere and ocean model components communicate at each time-step via the exchange of SST and surface fluxes of heat, moisture and momentum, as described in Smith et al., (2015).

Strongly coupled incremental 4D-Var data assimilation
The problem of variational data assimilation is to find the initial model state such that the model forecast gives the best fit to a set of observations distributed across a given time window but at the same time staying close to an a priori estimate (the initial "background" or "forecast" state), and allowing for the uncertainty in each. The standard 4D-Var algorithm is formulated as the minimisation of a nonlinear weighted least-squares cost function where x 0 ∈ ℝ n is the initial model state vector at t 0 , x k = (t k , t 0 , x 0 ) represents the model state at a given time t k , where  is a nonlinear model operator, x b 0 ∈ ℝ n is the background state at t 0 , y k ∈ ℝ r k represents a set of imperfect observations at times t k , k = 0, … , N, h k ∶ ℝ n → ℝ r k is a (generally) nonlinear observation operator, and B ∈ ℝ n×n and R k ∈ ℝ r k ×r k are the background (forecast) and observation error covariance matrices respectively. The minimising state (the "analysis") is denoted x a ; this state should be consistent with both the observations and the system dynamics.
The incremental 4D-Var formulation (Courtier et al., 1994) replaces the nonlinear problem (Equation (1)) with a sequence of linear least-squares problems by linearising about the current state estimate. Rather than searching for the optimal initial state directly, it searches for increments, x 0 , to the initial background state estimate, x b 0 ; this is done iteratively via a series of linearised inner-loop quadratic cost function minimisations and nonlinear outer-loop update steps (e.g., Lawless et al., 2005). For strongly coupled incremental 4D-Var, the control vector, x, contains both the atmosphere and ocean prognostic variables; the observation vector can contain both atmosphere and ocean observations and the coupled model is used in both the inner-and outer-loops of the incremental 4D-Var algorithm.

Coupled forecast error covariances
represents the atmosphere increment and x O the ocean increment, the fully coupled atmosphere-ocean forecast error covariance matrix B full can be written as where n = n A + n O is the dimension of the combined atmosphere-ocean model state vector. The blocks B AA ∈ ℝ n A ×n A and B OO ∈ ℝ n O ×n O are square matrices representing the atmosphere and ocean state forecast error covariances respectively. These blocks can be further decomposed into sub-matrices containing the auto-error covariances for individual variables and the error cross-covariances between different variables within the same fluid. The off-diagonal block B AO ∈ ℝ n A ×n O contains the cross-covariances between errors in the atmosphere and ocean state variables. If we assume that these blocks are zero, we have Decomposing B full and B diag in this way allows us to see how the atmosphere-ocean blocks contribute to the atmosphere and ocean analysis increments, as we describe in section 3.2.
In our system, the coupled forecast error covariance matrices are implemented by using the control variable transform (CVT) technique (Bannister, 2008b). The first step is to decompose the matrix B (the procedure is the same for both B full and B diag ) where D 1∕2 ∈ ℝ n×n is a diagonal matrix of error standard deviations, (with inverse D −1/2 ), and C ∈ ℝ n×n is the forecast error correlation matrix, with the same block structure as in Equations (2) or (3). The assimilation problem is then re-posed in terms of a new control vector, v which is related to the original incremental 4D-Var control vector, Here E is a matrix containing the eigenvectors of C, and is a diagonal matrix of its eigenvalues. Note that since C is symmetric, its eigenvectors are orthogonal, that is, EE T = I. Using the eigen-decomposition of C rather than B gives us the freedom to vary the prescribed forecast error standard deviations. The transform is designed such that in v coordinates, elements of the background state vector are uncorrelated with each other and have unit variance. This makes the incremental 4D-Var cost function minimisation much simpler as B cancels from the background term so that there is no longer an n-dimensional background error covariance matrix to invert; we now have (for a single outer-loop) where , and H k ∈ ℝ r k ×n is the linearisation of the nonlinear observation operator h k .

EXPERIMENTS
A simple way to illustrate the role of the cross-domain terms in B full is to consider a situation in which all observations are direct and taken at a single time point within the assimilation window, that is, h k = H, where H is a matrix of zeros and ones that picks out the element(s) of the state vector that are being observed. In this case the 4D-Var equations are greatly simplified, allowing us to obtain explicit expressions for the atmosphere and ocean analysis increments in terms of the blocks of B full . We begin this section by describing the configuration of our experiments, we then present the 4D-Var equations for each of the different observing scenarios and explain how these will differ if we use B diag rather than B full .

Configuration
We compare using a fully coupled matrix B full and a block diagonal matrix B diag in three experiments: (a) a single observation of the v-wind component is assimilated at the lowest atmosphere model level, at the end of the assimilation window; (b) a single observation of the ocean temperature is assimilated at the top level of the ocean model (i.e., the bulk SST), at the end of the assimilation window; and (c) cases (a) and (b) combined.
These are single-cycle identical-twin experiments with a 12 hr assimilation window. The coupled model is assumed to be perfect; the true initial state is a 24 hr coupled model forecast from 0000 UTC on 1 December 2013, valid at 0000 UTC on 2 December 2013. The initial data for this forecast come from two separate products, the ECMWF ERA Interim reanalysis (Dee et al., 2011) and the Mercator Ocean reanalysis (Lellouche et al., 2013), and so this initial 24 hr spin-up period is used to allow the atmosphere and ocean states to come into balance. The initial background state is given by a second 24 hr coupled model forecast from 0000 UTC on 1 December 2013, valid at 0000 UTC on 2 December 2013, but initialised from perturbed data. The perturbed state is generated by adding white Gaussian noise to the data used to forecast the true initial state. The standard deviation of this noise was computed from the variance of a time series of true coupled forecast states over a 96 hr (4-day) period; this corresponds to the duration of the cycled ensemble 4D-Var experiments presented in Smith et al. (2017) and was chosen to allow time for variations in the ocean to be captured. These variances were also used to set the background error variances in D 1/2 so that the DA system has knowledge of the actual initial background error variances at the start of the assimilation cycle. For the coupled forecast error correlation matrix C, we use the December 0000 UTC ensemble error correlations from Smith et al. (2017). We intentionally chose a case where the cross-domain error correlations are strong; in this winter case, greater turbulent heat fluxes and a pronounced day-night contrast in the net heat flux lead to a strong diurnal response and strong air-sea coupling. Note that the raw ensemble correlation matrix had zero eigenvalues and so was rank deficient; it was therefore first re-conditioned using the method described in section 2.1 of Smith et al. (2018); no localisation was applied. For the block diagonal error correlation matrix C diag , we simply zero out the off-diagonal atmosphere-ocean blocks, C AO and C OA , of the full error correlation matrix C full .
The observations are taken from the truth with random noise added; the observation error variances are chosen so that 2 y ∕ 2 b = 0.1, where 2 y is the observation error variance and 2 b is the prescribed background error variance at the observation location; this ensures that the assimilation weights towards the observation(s) rather than the background state and allows us to understand the influence of a good quality observation in terms of reducing errors in the coupled state. Because we only assimilate one or two observations, if we weight towards the background state the observation(s) have little, if any, impact on the analysis; consequently, the difference between the background and the analysis is minimal whether we include explicit cross-domain covariances or not. Of course, in a more realistic observing scenario, it would be appropriate to experiment with different observation to background error variance ratios. We assume in case (c) that the atmosphere and ocean observation errors are uncorrelated, so that the observation error covariance matrix R is a 2 × 2 diagonal matrix.

Single-observation experiments
When observations are direct and taken at a single time t k > t 0 , the incremental 4D-Var cost function reduces to (8) Equation (8) is minimised when where x a 0 = x a 0 − x b 0 is the analysis increment at t 0 , and M = M(t k , t 0 ) is the tangent linear of the nonlinear model operator (t k , t 0 , x 0 ) that takes the model state from time t 0 to time t k . For a single observation, the quantities d k , HMBM T H T and R in Equation (9) will all be scalar; if we assume that the observation corresponds to the jth element of the coupled model state vector, then BM T H T is the jth column of BM T , and HMBM T H T is the diagonal element (j, j) of the coupled forecast error covariance matrix evolved to the observation time t k . Further, if we assume that B = B full and decompose B and M into blocks (as shown in the Appendices), we can write the atmosphere and ocean components of the coupled analysis increment at t 0 as: where [⋅] j denotes the jth column of the given matrix, 2 yA is the atmosphere observation error variance,̃2 A is element (j, j) of the evolved atmosphere forecast error covariance (Equation (A4) where 2 yO is the ocean observation error variance, 2 O is element (j − n A , j − n A ) of the evolved ocean forecast error covariance matrix (Equation (A5)), and The cross-domain terms (those with subscript AO) in Equations (10)-(11) enable the ocean to influence how information from an atmosphere observation shapes both the atmosphere and ocean increments. Similarly, the cross-domain terms in Equations (12)-(13) enable the atmosphere to influence how information from an ocean observation shapes both the atmosphere and ocean increments.
If we have both a single atmosphere and a single ocean observation corresponding to the ith and jth elements of the model state vector respectively (and both at time t k ), we can write the atmosphere and ocean analysis increments as (the Appendices give full details) (14)-(15) with (10)-(11) and (12)-(13) shows that having observations of both the atmosphere and ocean allows greater feedback between the two components and therefore should promote greater consistency between their analyses.
If we assume that B = B diag , then all BAO terms in Equations (10)-(15) disappear; this means that the ocean cannot influence the structure of the atmosphere analysis increments at t 0 when only the atmosphere is observed (since only the BAAMAA T term remains in Equation (10)), and the atmosphere cannot influence the structure of the ocean analysis increments at t 0 when only the ocean is observed (since only the BOOMOO T term remains in Equation (13)). In this case, the coupling is essentially one-way; the update to the observed domain is independent of the unobserved domain, meaning that the atmosphere and ocean analysis states at t 0 are unlikely to be consistent. As we show in Section 4.2, this has important consequences for the balance of the coupled initial analysis state and subsequent coupled model forecast. However, we note that the ocean will influence the amplitude of the atmosphere increments by contributing to the evolved atmosphere forecast error covariance (Equation (B3)) and the atmosphere will influence the amplitude of the ocean increments by contributing to the evolved ocean forecast error covariance (Equation (B4)). When both domains are observed, there will still be two-way coupling with B = B diag , but the consistency of the initial atmosphere and ocean increments will depend on the strength of coupling in the linearised model and the structure of the single-domain error covariances.
Comparing the results of experiments using B full with those using B diag gives us an indication of the extent to which the implicit cross-domain error covariances (terms involving MAO or MOA) act to shape the analysis increments compared to those that are prescribed a priori. Where the prescribed cross-domain covariances are small, or the air-sea coupling is weak, we would expect B full and B diag to produce similar results.

RESULTS
In the following sections we present the results of the three assimilation experiments described in Section 3.1, and use the analysis equations derived in Section 3.2 together with our knowledge of atmosphere-ocean feedback mechanisms within the coupled model to interpret the effects we see. Results are shown for the bottom of the atmosphere between ∼ 500 hPa and the surface, and for the top of the ocean from the surface down to ∼ 100 m depth as these are where the cross-domain error correlations were found to be the strongest (Smith et al., 2017) and so most likely to have an impact on the coupled analysis.

Implicit versus explicit cross-covariances
We begin by discussing the results of experiment (a) in which we assimilate a single atmosphere surface v-wind observation; this is followed by experiment (b) with a single ocean surface temperature observation and then experiment (c) where the atmosphere surface v-wind and ocean surface temperature observations are assimilated together. In each case, we first consider the impact on the ocean analysis increments followed by the atmosphere analysis increments.

4.1.1
Single surface v-wind observation Figures 1a,b show the ocean temperature analysis increments for B full and B diag when only a single observation of the surface v-wind is assimilated. Although we show only the increments for ocean temperature as an example, the pattern of results is similar for all of the ocean state variables. Using the block-diagonal coupled error covariance matrix B diag does not produce initial increments at t 0 in any of the ocean fields, although small increments are produced in the near-surface region throughout the rest of the assimilation window (t k > t 0 ) as a result of increments to the initial atmosphere state. From Equation (11) we expect the initial ocean increments to be proportional to a column of BOOM T AO; however, assumptions made in the development of the tangent linear (TL) and adjoint model codes mean that an observation of the v-wind can have no influence on the initial ocean fields in this case, even if multiple outer-loops are used. This is due to the fact that perturbations to the diffusion coefficients are neglected in the linearisation of the atmosphere-ocean vertical turbulent flux parametrizations, resulting in the u and v winds not depending on the ocean model variables in the TL model. The result of this simplification is that will be zero when j corresponds to an atmosphere wind component. This assumption is not uncommon in atmosphere and ocean assimilation systems (e.g., Janisková et al., (1999), Mahfouf (1999), Weaver et al., (2003)) and is explained in more detail in section 3.2 of Smith et al., (2015). This result highlights an important point: the extent to which a strongly coupled DA approach is able to facilitate information transfer between the atmosphere and ocean will depend on the complexity of the coupled TL (and adjoint) model dynamics; feedback will obviously be maximised by using an exact TL model in combination with good quality estimates of the a priori cross-domain forecast error covariances. Developing and maintaining TL and adjoint models is non-trivial, especially when modelling large geophysical systems where the underlying physics are complex and the model code typically includes highly nonlinear and/or non-differentiable functions. As a result, the use of approximations in deriving TL models is very common and an exact TL model is unlikely in practice. A potential solution may be to use accurate ensemble approximations to the TL model (e.g., Bishop et al., 2017), but we do not explore this here. When the full matrix B full is used, the B T AO term in Equation (11) is non-zero and so we get initial ocean increments proportional to the jth column of the matrix B T AOM T AA, and weighted by the sum of the observation error variance and the evolved surface v-wind forecast error variance. The initial ocean temperature increment is close to the surface, but this propagates down into the mixed layer producing increments throughout the assimilation window and appears to result in a perturbation to the location of the mixed-layer depth at around 8-12 hrs, as shown in Figure (1 The initial increments to the atmosphere fields are near-identical for B full and B diag , as illustrated in Figure 2a,b for the atmosphere u-wind field. We show the increments for the u-wind component as an example, but the pattern of results is similar for all of the atmosphere state variables and also for the derived wind-speed field. Qualitatively it is difficult to see differences between the increments across the whole of the assimilation window, but they do exist. Although the differences are overall small in magnitude ((10 −3 )), they become larger towards the end of the window as the error covariances implicitly evolve, as illustrated in Figure 2c. It is also worth noting that the magnitude of the differences increases if multiple outer-loops are used as a result of the nonlinear trajectory update step. The reason for the increments being so similar  (10) is zero regardless of whether we are using B full or B diag ; this means that the ocean has no influence on the shape/structure of the atmosphere increments at t 0 . Instead, they are determined by the structure of the atmosphere covariance [ j . Again, this is due to the limitations of the TL model as already discussed above.

Single SST observation
For the ocean fields, the differences between the initial increments produced using B full versus B diag are smallest when only a single SST observation is assimilated, particularly for the ocean temperature and salinity. The profiles of the ocean temperature analysis increment at t 0 are shown Column (i) shows the results of using the full a priori forecast error covariance matrix B, column (ii) the block-diagonal a priori forecast error covariance matrix B, and column (iii) the difference between columns (i) and (ii). The approximate mean height of the atmospheric boundary layer is 967 hPa (this value is diagnosed from the truth trajectory) in Figure 3 but the same pattern of results is seen in the differences across the whole of the assimilation window as shown in Figure 1f (note the colour scale). When we use the block diagonal matrix, B diag , the cross-domain term (13) disappears and so this result tells us that the contribution from this term must be small in the B full case; instead the ocean covariance term, j , is dominating the structure of the ocean increments, particularly for the ocean temperature and salinity. The differences are more obvious for the u and v currents but are still small compared to the other observation cases (not shown). The near-surface cross-domain error correlations are slightly stronger for the currents, which suggests a greater loss of information when the BAO are ignored. The atmosphere u-wind analysis increments are completely different in the single SST observation case compared to the single surface v-wind observation case (compare top and middle rows in Figure 2); the increments themselves are smaller in magnitude, but the differences between using B full and B diag are much larger (a maximum of ∼1 K difference for temperature, and 1.5 m⋅s −1 for winds). In the B diag case, the ocean is only able to influence the initial atmosphere analysis increments via the term [ B AA M T OA ] j in Equation (12). Equation (12) also tells us that the difference in the initial increments when using B full compared to B diag will be proportional to j . The large differences in this experiment imply that the evolved a priori cross-domain error covariances are enabling the ocean to have a greater influence on the shape of the atmosphere increments across the whole assimilation window. Indeed, our previous study (Smith et al., 2017) found that the wind speed-ocean temperature error correlations are particularly strong and positive throughout the atmospheric boundary layer (ABL) for this case. These correlations then switch to negative above the ABL, which would explain the change in sign of the differences at around 850 hPa. Errors in surface wind speed enhance the magnitude of errors in the latent and sensible heat fluxes, which in turn drive errors in near-surface temperature.
The results of these two single-observation experiments suggest that, when only a single domain is observed, the impact of including a priori cross-domain forecast error covariances (versus BAO = 0) is mostly seen in the unobserved domain; any differences seen in the analysis increments of the observed domain are comparatively small since here the spreading of the observational information is mainly driven by the single-domain covariances and adjoint model dynamics. The analysis equations derived in Section 3.2 suggest that the exception to this will be fields where there is strong coupling in the TL and adjoint models and/or the prescribed error cross-correlations are particularly strong. Without explicit cross-domain error covariances, the updates in the unobserved domain rely on cross-domain error covariances implicitly generated by the TL model; this limits the adjustments that can be made and hinders the ability of the matrix B to impose balance between the two domains. We see an example of this in the ocean temperature and salinity fields, as we discuss in Section 4.2.

Surface v-wind and SST observations combined
When both the atmosphere and ocean are observed, the analysis increments are essentially a weighted combination of the individual observation cases (compare Equations (10) and (12) with Equation (14), and Equations (11) and (13) with Equation (15)). As in the previous single-observation experiments, when using B diag  (15)) is zero for this case; this means that when B = B diag the structure of the initial ocean analysis increments is proportional to j and therefore determined solely by the ocean covariances and adjoint model dynamics. Comparing this with the update equation for the single ocean observation case (Equation (13) where the BAO term is neglected) tells us that the structure of the initial ocean increments will be the same as in the single-SST observation experiment, although here the surface v-wind observation will have some influence on the scaling via its contribution to the weighting term. This result is illustrated in Figure 3, which shows the profiles of the analysis increments at t 0 for ocean temperature for each of the different experiments (compare the grey lines in Figure 3b and c). Comparing Figures 3a, b and c shows how using B = B full allows the surface v-wind observation to alter the structure of the initial ocean analysis increments in combination with the SST observation. Even without referring to Equation (15), the amalgamation of the individual effects of the SST and surface v-wind observations is quite clear; this is also true across the rest of the assimilation window (Figure 1a,d,g).
For the atmosphere, the differences in the increments when using B full versus B diag are perhaps less obvious. When B = B diag , the ocean is still able to influence the shape of the initial atmosphere increments via the term j . Figure 4 shows the profiles of the analysis increment profiles at t 0 for the atmosphere u-wind for each of the different observation experiments; comparing the structure of the initial increments for this case (Figure 4c) against those for the single-observation experiments (Figure 4a,b) shows that the contribution from the [ j term is small and it is the atmosphere covariance term i that is having the most influence on the shape of the initial atmosphere increments in this case. It is easier to see the differences between the atmosphere increments produced using B full versus B diag across the rest of the assimilation window ( Figure 2); in both cases the analysis increments are closest in structure to the single surface v-wind observation experiment (compare top and bottom rows). However, comparing the differences (Figure 2 column (iii)) for each of the different observation experiments shows that the differences in this double observation experiment are actually most similar in magnitude and structure to the single-SST observation case. This indicates that using B full enables the SST observation to have a greater impact on the atmosphere analysis.
This experiment has demonstrated how using a full matrix B full allows the atmosphere and ocean observations to work in synergy; as we see in the next section, this has a positive impact on the balance of the system.

Balance
A key motivation for coupled DA is the potential to produce more consistent initial atmosphere and ocean model states for use in coupled forecasting. Figure 5(i) shows that the background state used in our experiments is such that an error develops in the gradient of the ocean temperature profile around the mixed-layer depth approximately 7-8 hr into the control forecast. The same effect is also seen in the background ocean salinity profile (not shown). Figure 1 column (i), showing the ocean temperature analysis increments over the assimilation window, and Figure 5 column (ii), showing the ocean temperature analysis errors over the assimilation window, reveal how with B = B full the assimilation is able to correct this error and bring the temperature gradient back in line with the truth by producing opposing positive and negative increments at between 50 m and 75 m depth (the pattern of results for salinity are similar but not shown). The crucial point here is that the correction is actually being produced by the assimilation of the surface v-wind observation rather than the SST observation (compare Figures 1a,d,g). In Smith et al., (2017), we found that there was a strong (but opposing) error correlation between wind speed and ocean temperature and wind speed and salinity in the atmosphere-ocean boundary layer for the December 0000 UTC test case. Explicitly including this information in the matrix B enables the coupled algorithm to maximise the use of the v-wind observation. Assimilating both the v-wind and SST observations together further improves the temperature and salinity analyses throughout the whole of the mixed layer.
As previously explained, the assimilation of the surface v-wind has no impact on the structure of the initial ocean increments when B = B diag is used (Figure 3a). Here, the implicit cross-domain error covariances are not enough to balance the initial atmosphere and ocean increments, and so the errors in the background temperature and salinity profiles persist in the analyses (Figure 5b,e,h). The difference between the results using B full versus B diag for this case tells us that the prescribed cross-domain error covariances are playing a key role in ensuring that the atmosphere and ocean analysis fields are consistent with each other.

SUMMARY AND DISCUSSION
A key motivation for strongly coupled atmosphere-ocean DA is the ability to improve information exchange across the modelled air-sea interface by enabling observations in one domain to have a direct influence on the analysis in the other. In variational DA this information exchange can be maximised by specification of a priori cross-domain forecast error covariances. However, constructing an appropriate representation of these cross-domain forecast error covariances presents a significant challenge. A potential option is to ignore them and rely on the implicit covariances generated by the 4D-Var algorithm; this could be viewed as a way of transitioning from weakly or quasi-strongly coupled approaches which use independent error covariance matrices for each domain.
In this study we have used an idealised strongly coupled incremental 4D-Var framework to compare the impact of using a fully coupled matrix, B = B full , with explicitly prescribed atmosphere-ocean forecast error covariances, against a block diagonal representation, B = B diag , in which the initial cross-domain forecast error covariances are assumed to be zero, but full error covariances are prescribed for the individual atmosphere and ocean domains. Single-observation, identical-twin, experiments were used in order to allow us to obtain explicit expressions for the individual atmosphere and ocean incremental 4D-Var analysis updates. This allows us to demonstrate the mechanisms by which strongly coupled 4D-Var allows the atmosphere and ocean to interact by separating out the individual effects of the explicitly prescribed and implicitly generated cross-domain forecast error covariances. Although we present results for particular examples, the equations we derive for the coupled analysis updates are not specific to our coupled model system; these equations, and the conclusions we draw from them, apply generally irrespective of any approximations made within the coupled TL and adjoint models.
Our experiments have shown that, when only one domain is observed, including explicit cross-domain forecast error covariances mostly impacts the unobserved domain. When the a priori cross-domain covariances, BAO, are set to zero, the structure of the initial analysis increments in the observed domain are solely determined by its individual covariance and TL model dynamics and therefore take no account of the coupled model dynamics. The initial increments in the unobserved domain rely on the implicitly generated cross-domain error covariances which in turn depend on the strength of coupling in the linearised model. This restricts the ability of the matrix B to impose balance between the two domains and so is unlikely to produce atmosphere and ocean updates that are consistent with one another. Further, for our system, limitations of the coupled TL model mean that a single surface v-wind observation is unable to produce any adjustment to the initial ocean state when BAO = 0. This result highlights the fact that the ability of the strongly coupled 4D-Var algorithm to implicitly generate cross-domain covariance information will potentially be restricted by approximations made in the development of the TL and adjoint models. When a full matrix B = B full is used, the single-domain terms still dominate the initial analysis increments in the observed domain, but the prescribed cross-domain forecast error covariances allow more consistent adjustments to the unobserved domain. In particular, we found that a single surface v-wind observation was able to correct errors in the ocean temperature and salinity fields that a single SST observation could not. Additionally, when both the atmosphere and ocean observations were assimilated simultaneously, the matrix B full allowed the observations to work together to reduce the errors in these fields and improve the consistency of the coupled system.
The analysis update equations derived in Section 3.2 suggest that, if both domains are observed, a block diagonal B = B diag may be sufficient when the coupling in the linearised model is strong and the prescribed forecast error covariances of the individual components are consistent with the coupled background state. If only a single domain is observed, using B = B diag will always lead to a loss of information; in this case the coupling is essentially one-way due to the unobserved domain being unable to influence the structure of the increments in the observed domain and so is unlikely to produce a balanced coupled initial state. Unless the true BAO are small (in which case a coupled DA approach may not be necessary anyway), using a fully coupled matrix B will always offer the greatest potential gains.
It is perhaps unfair to measure the differences between these two approaches in terms of their analysis errors when only assimilating at a single observation time. However, these experiments have given a useful insight into how the presence (or absence) of a priori cross-domain error covariances can influence the interaction between two domains, and in turn the structure of the coupled analysis increments, in different observing scenarios. Ultimately the quality of the strongly coupled 4D-Var analyses will rely on the quality of the definition of the a priori cross-domain covariances and the exactness of the linearised model dynamics.
Current operational uncoupled DA systems have seen years of development and tuning and so naturally it will take time for coupled DA systems to mature to the same level. Our simple experiments confirm that there are clear benefits to be gained from a strongly coupled approach; the important thing now is to ensure that current research efforts are maintained so that these benefits can eventually be realised in real-world operational applications.
and block form of B (given by Equation (2) where [⋅] i and [⋅] j denote the ith and jth columns respectively, wherẽ2 A is the evolved atmosphere forecast error variance of the ith element of the model state vector (this is entry (i, i) ofB AA ),̃2 O is the evolved ocean forecast error variance of the jth element of the model state vector (this is entry (j − n A , j − n A ) ofB OO ) and̃2 AO is the evolved atmosphere-ocean forecast error cross-covariance between points i and j (this is entry (i, j − n A ) ofB AO ), and wherew = (̃2 A + 2 yA )(̃2 O + 2 yO ) −̃2 AO . This gives ( where d A = y A − x i and d O = y O − x j . The atmosphere and ocean analysis increments at time t 0 can be written as and ( wherê2A is entry (i, i) of the evolved atmosphere forecast error covarianceB AA (Equation (B3)),̂2 O is entry (j − n A , j − n A ) of the evolved ocean forecast error covariancê B OO (Equation (B4)), 2 AO is entry (i, j − n A ) of the evolved atmosphere-ocean forecast error cross-covarianceB AO (Equation (B5)), andŵ = (̂2A + 2 yA )(̂2 O + 2 yO ) − 2 AO . In this case, the atmosphere and ocean analysis increments at time t 0 are