Implicit Equal-Weights Particle Filter

Filter degeneracy is the main obstacle to implementing particle filters in nonlinear, high-dimensional models. A new scheme, the implicit equal-weights particle filter (IEWPF), is introduced. In this scheme samples are drawn implicitly from proposal densities with a different covariance for each particle, such that all particle weights are equal by construction. We test and explore the properties of the new scheme using a 1,000-dimensional simple linear model and the 1,000-dimensional nonlinear Lorenz96 model, and compare its performance to a local ensemble transform Kalman filter (LETKF). The experiments show that the new scheme is easily implemented in high-dimensional systems and is never degenerate, with good convergence properties in both systems.


Introduction
Geophysical systems such as the atmosphere and the ocean are inherently nonlinear. Numerical models used to simulate these systems often have state spaces of over one million variables, resulting from discretising the physical variables on a three-dimensional spatial grid. The dimension of the state space keeps growing with the sustained increase in model resolution and in the computational capacity of supercomputers. Observations of the true system are used to update the probability density function, or posterior pdf, of the model state. This update process is called data assimilation (DA). Present-day DA methods are tailored to specific statistics of the posterior pdf, e.g. its mean, covariance, or modes. Variational methods such as 3DVar and 4DVar (?) search for the mode of the posterior pdf through the minimisation of a cost function. It cannot be guaranteed that the mode found by a variational method is the global mode of the posterior pdf: the search may stop at a local mode. Furthermore, it is hard to generate an uncertainty estimate for a variational solution because of its implicit use of the covariances involved. The ensemble Kalman filter (EnKF) (??) estimates the mean and covariance of the posterior pdf under implicit assumptions of linearity and Gaussianity. Neither of these two popular methods can describe non-Gaussian posterior pdfs accurately, and it is still unclear what they estimate when the posterior pdf is multimodal. Hybrids between the two methods, such as Ens4DVar, in which the ensemble from an EnKF informs the background covariance of 3- or 4DVar about previous observations and flow structures, and 4DEnsVar, in which the space-time covariances in 4DVar are generated explicitly from a forecast ensemble, as in an ensemble Kalman smoother, do not solve this issue.
The particle filter (PF) is a sequential Monte Carlo method that uses an ensemble of particles to represent the posterior pdf directly, without linear or Gaussian assumptions, see e.g. (?). It has been applied successfully in systems with low dimensions, e.g. (??). But for geophysical systems with high dimensions, limited by the computational resources of modern supercomputers, we cannot run enough model simulations to represent the posterior pdf while avoiding the so-called "curse of dimensionality".
Different flavours of PF exist, but all of them share two steps: forecast (also known as mutation) and weighting. When the numerical model equations contain errors (as they always do, of course, although these are often ignored), it is advantageous to change the stochastic forecast model slightly to steer the model closer to future observations. This is allowed in particle filtering as long as the weight of that model run is lowered accordingly in a well-specified way. Statistically this is known as drawing from a proposal density. When the model reaches the observation time, these weights are multiplied by the likelihood of the observations, assuming they have been generated from that model state. The closer the model run is to the original model, and the closer the model is to the observations, the higher its weight will be. Most of the particle weights degenerate to very small values as time evolves, simply because it is hard to stay close to all observations. In high-dimensional situations with a large number of independent observations, one particle obtains a weight close to one while the others have weights very close to zero. This degeneracy of the particle weights leads to a loss of statistical information, since the effective ensemble size reduces to 1. It is the main obstacle to applying the PF operationally as an alternative DA method (?). ? argue that the ensemble size must scale exponentially with the "effective size" of the problem (proportional to the number of independent observations) for a particle filter to avoid degeneracy. They show that this is the case even for the proposal density that has the lowest variance in the weights, which they showed to be equal to the so-called optimal proposal density, itself known to be optimal in a slightly different sense. The analytical calculations were backed up by convincing experiments using a simple linear test case.
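The weight collapse described above is easy to reproduce with a short numerical sketch (illustrative, not from the paper): a plain importance-weighting step in a linear-Gaussian setting, in which the effective ensemble size N_eff = 1/Σ w_i² falls towards one as the number of independent observations grows. All numerical values here are arbitrary choices.

```python
import numpy as np

def effective_ensemble_size(log_w):
    """N_eff = 1 / sum(w_i^2) for normalised weights, computed stably in log space."""
    log_w = log_w - log_w.max()
    w = np.exp(log_w)
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

rng = np.random.default_rng(0)
N = 20                                   # particles
neff = {}
for Ny in (1, 10, 100, 1000):            # number of independent observations
    x = rng.standard_normal((N, Ny))     # prior particles
    y = rng.standard_normal(Ny)          # observations, unit error variance
    log_w = -0.5 * np.sum((y - x) ** 2, axis=1)   # Gaussian log-likelihoods
    neff[Ny] = effective_ensemble_size(log_w)
    print(Ny, neff[Ny])
```

With a handful of observations most particles retain appreciable weight; with a thousand independent observations the effective ensemble size is essentially one, which is exactly the degeneracy the IEWPF is designed to avoid.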
? and ? introduce an implicit proposal-density method that chooses a map from the implicit sampling space to the original state space. Examples in 100-500 dimensional spaces show that the method is more robust than the original particle filter with resampling, but it is easy to show that the method reduces to the optimal proposal density when observations are present at every time step and the model noise is state independent, so it has the same degeneracy issues.
The equivalent-weights particle filter (EWPF) of ?? and ?? explores a proposal density of a different class than studied before. It allows a proposal density for each particle that depends not only on the position of that particle at the previous time, but on all particles at the previous time. Since all particles are involved, it is straightforward to ensure that the final weights of part, or all, of the particles are equal. The scheme has been shown to be non-degenerate even in high-dimensional spaces, e.g. a 65,000-dimensional barotropic vorticity model and, recently, the over two-million-dimensional climate model HadCM3 (?). It has a few tuning parameters that can be adjusted for optimal performance, measured by e.g. rank histograms. It can be shown, however, that such a scheme is biased. It is well known that particle filters that use resampling are always biased, but the bias is of a stronger nature in this filter. A bias is not an issue in itself as long as it is smaller than the statistical noise in the method. In the EWPF, in order to enforce weights that are close together, the particles are forced to lie close to a hyper-ellipsoidal shell, one for each particle. This means that the proposal density of all particles together does not explore the full state space. The EWPF works extremely well for small ensemble sizes of order 10-100 in high-dimensional (order 1,000 or much more) systems, when the statistical noise is relatively large, but it performs less favourably when large ensemble sizes are used and the bias becomes apparent. Although we typically cannot afford more than 10-100 particles in geophysical systems, this limits the usefulness of the scheme.
In this article a new PF is proposed, which we label the implicit equal-weights particle filter (IEWPF). This scheme uses a proposal transition density in which each particle is drawn implicitly from a slightly different proposal density, the difference being a factor in front of the covariance of the proposal. This factor depends on all other particles such that the equal-weight property is fulfilled. The scheme is applied during the last transition step before the observations. In between observation times a simple relaxation scheme is used, as in the equivalent-weights particle filter. One strong advantage of the new scheme is that the number of tuning parameters is reduced drastically, and we will show that the bias is much smaller than for the EWPF. This article is organised as follows. Section 2 describes the implicit equal-weights particle filter in detail; its performance on the linear model used by ? and on a 1,000-dimensional Lorenz96 model is discussed in section 3, together with a comparison with the LETKF. A summary and conclusions are provided in section 4.

The basic idea
Bayes' theorem shows how the prior density p(x) is changed when multiplying it with the density of the observations y given a specific model state x, the likelihood. The posterior pdf of the model state given the observations, p(x|y), is thus given by:

p(x|y) = \frac{p(y|x)\, p(x)}{p(y)}.   (1)

The posterior pdf of a filter is the probability of the state variable x^n at time-step n given the observations y^{1:n} up to and including that time. For a Markovian system with observational errors that are independent from one time to another, the posterior pdf can be written as

p(x^n|y^{1:n}) = \frac{p(y^n|x^n)}{p(y^n|y^{1:n-1})} \int p(x^n|x^{n-1})\, p(x^{n-1}|y^{1:n-1})\, dx^{n-1}.   (2)

The transition density p(x^n|x^{n-1}) is related to the model equation via

x^n = f(x^{n-1}) + \beta^n,   (3)

in which f(\cdot) is the nonlinear deterministic model equation, and \beta^n is a stochastic perturbation with mean zero that can, in principle, depend on x^{n-1}.
Let us assume for the moment that we run a particle filter and that the particle weights in the ensemble at the previous time-step n-1 are equal:

p(x^{n-1}|y^{1:n-1}) = \frac{1}{N} \sum_{i=1}^{N} \delta(x^{n-1} - x_i^{n-1}).   (4)

When plugging equation (4) into equation (2), we find:

p(x^n|y^{1:n}) = \frac{1}{N\, p(y^n|y^{1:n-1})} \sum_{i=1}^{N} p(y^n|x^n)\, p(x^n|x_i^{n-1}).   (5)

One can now multiply the numerator and denominator of equation (5) by the same factor q(x^n|x^{n-1}, y^n), in which x^{n-1} is defined as the collection of all particles at time n-1,
where the support of q(x^n|x^{n-1}, y^n) should be equal to or larger than that of p(x^n|x_i^{n-1}). The density q(x^n|x^{n-1}, y^n) is the so-called proposal transition density.
© 2015 Royal Meteorological Society. Prepared using qjrms4.cls.

We assume that observations appear at every time-step and draw samples from the proposal transition density q(x^n|x^{n-1}, y^n) instead of from the original transition density p(x^n|x_i^{n-1}). This leads the posterior pdf to be expressed as:

p(x^n|y^{1:n}) = \frac{1}{N\, p(y^n|y^{1:n-1})} \sum_{i=1}^{N} \frac{p(y^n|x^n)\, p(x^n|x_i^{n-1})}{q(x^n|x^{n-1}, y^n)}\, q(x^n|x^{n-1}, y^n).

Consequently, the posterior pdf of the model state at time-step n can be written as

p(x^n|y^{1:n}) \approx \sum_{i=1}^{N} w_i\, \delta(x^n - x_i^n),

where the w_i are the particle weights, given by

w_i \propto \frac{p(y^n|x_i^n)\, p(x_i^n|x_i^{n-1})}{q(x_i^n|x^{n-1}, y^n)}.

Now assuming that the model system is Markovian and using Bayes' theorem, the numerator in the expression for the weights can be expressed as

p(y^n|x_i^n)\, p(x_i^n|x_i^{n-1}) = p(x_i^n|x_i^{n-1}, y^n)\, p(y^n|x_i^{n-1}).

Therefore the particle weight of ensemble member i at an observed time-step becomes

w_i \propto \frac{p(x_i^n|x_i^{n-1}, y^n)\, p(y^n|x_i^{n-1})}{q(x_i^n|x^{n-1}, y^n)}.

In the so-called optimal proposal density (?) one chooses

q(x^n|x^{n-1}, y^n) = p(x^n|x_i^{n-1}, y^n),

leading to weights

w_i \propto p(y^n|x_i^{n-1}).

For systems with a large number of independent observations these weights are degenerate, see e.g. (?).
The implicit part of our scheme follows from drawing samples implicitly from a standard Gaussian proposal density q(\xi) instead of from the original one q(x^n|x^{n-1}, y^n) (?). These two pdfs are related by:

q(x_i^n|x^{n-1}, y^n)\, \left\| \frac{dx}{d\xi} \right\| = q(\xi_i),

where \| dx/d\xi \| denotes the absolute value of the determinant of the Jacobian matrix of the R^{Nx} \to R^{Nx} transformation x_i = g(\xi_i).
In the implicit equal-weights particle filter this function g(\cdot) is defined via

x_i^n = x_i^a + \alpha_i^{1/2} P^{1/2} \xi_i,   (13)

with x_i^a the mode of q(x_i^n|x^{n-1}, y^n), P a measure of the width of that pdf, and \alpha_i a scalar. In the implicit particle filter of ?,
\alpha_i is determined by choosing the proposal density as the optimal proposal density, so again q(x^n|x^{n-1}, y^n) = p(x^n|x_i^{n-1}, y^n), and using the expression for x_i^n from equation (13) directly in the weights, leading to a nonlinear scalar equation for \alpha_i.
Our scheme is different in that we choose the \alpha_i such that all particles obtain the same weight w_{target}, so we determine the scalar \alpha_i for each particle from:

w_i(\alpha_i) = w_{target}.

This equation is at the heart of the IEWPF, showing the equal-weights part of the scheme. It ensures that the filter is not degenerate in systems with arbitrarily large dimension and with an arbitrarily large number of independent observations. We can expand this as follows. Sampling implicitly from q(\xi) instead of from q(x_i^n|x^{n-1}, y^n), the particle weights are now given by

w_i = w_i^{prev}\, \frac{p(y^n|x_i^n)\, p(x_i^n|x_i^{n-1})}{q(\xi_i)} \left\| \frac{dx}{d\xi} \right\|,

where q(\xi) is the standard Gaussian distribution and w_i^{prev} introduces the weight from previous time-steps. This equation demonstrates the implicit part of the scheme.
The determinant of the Jacobian depends only on the transformation from \xi to x, and is independent of the pdfs of these variables. Hence we can obtain it directly from (13):

\frac{dx}{d\xi} = \alpha_i^{1/2} P^{1/2} + P^{1/2} \xi_i \left( \frac{d\alpha_i^{1/2}}{d\xi_i} \right)^T.

Factorising \alpha_i^{1/2} P^{1/2} out from the right-hand side leads to:

\frac{dx}{d\xi} = \alpha_i^{1/2} P^{1/2} \left( I + \alpha_i^{-1/2} \xi_i \left( \frac{d\alpha_i^{1/2}}{d\xi_i} \right)^T \right).

The last factor in this equation can be simplified to a scalar by using Sylvester's determinant lemma. Hence, with \alpha_i depending on \xi_i only through \gamma_i = \xi_i^T \xi_i, the equation for \| dx/d\xi \| reduces to:

\left\| \frac{dx}{d\xi} \right\| = \alpha_i^{Nx/2}\, \| P^{1/2} \| \left( 1 + 2 \gamma_i\, \alpha_i^{-1/2} \frac{d\alpha_i^{1/2}}{d\gamma_i} \right).   (19)
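The rank-one determinant reduction used here is easy to verify numerically. The following sketch (illustrative dimensions and random matrices, not the paper's code) checks Sylvester's lemma det(I + u vᵀ) = 1 + vᵀu, and then the resulting scalar form of the determinant for a Jacobian of the shape a P^{1/2}(I + c ξξᵀ):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
u = rng.standard_normal(n)
v = rng.standard_normal(n)

# Sylvester's determinant lemma for a rank-one update
lhs = np.linalg.det(np.eye(n) + np.outer(u, v))
rhs = 1.0 + v @ u

# The same reduction for a Jacobian of the form a * Phalf @ (I + c * xi xi^T):
# det = a^n * det(Phalf) * (1 + c * gamma), with gamma = xi^T xi
A = rng.standard_normal((n, n))
Phalf = A @ A.T + n * np.eye(n)          # a symmetric positive-definite 'square root'
a, c = 1.3, 0.2
xi = rng.standard_normal(n)
gamma = xi @ xi
J = a * Phalf @ (np.eye(n) + c * np.outer(xi, xi))
det_full = np.linalg.det(J)
det_scalar = a ** n * np.linalg.det(Phalf) * (1.0 + c * gamma)
print(det_full, det_scalar)
```

The full determinant and the scalar formula agree to machine precision, which is why the Jacobian factor enters the weights as a single scalar rather than a matrix computation.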

Gaussian observation and model errors, and linear observation operator
In this section the new scheme is explored for the case in which the observation errors and model errors are assumed Gaussian and the observation operator H \in R^{Ny \times Nx} is linear. With these assumptions we can write:

p(x^n|x_i^{n-1}) = N(f(x_i^{n-1}), Q),   p(y^n|x^n) = N(H x^n, R),

where x_i^a in equation (13) is the mode of p(x^n|x_i^{n-1}, y^n), given by

x_i^a = f(x_i^{n-1}) + K \left( y^n - H f(x_i^{n-1}) \right),   K = Q H^T (H Q H^T + R)^{-1},

and the covariance of that density is P = (Q^{-1} + H^T R^{-1} H)^{-1}. For ease of presentation we introduce:

\phi_i = \left( y^n - H f(x_i^{n-1}) \right)^T (H Q H^T + R)^{-1} \left( y^n - H f(x_i^{n-1}) \right),   \gamma_i = \xi_i^T \xi_i.

Let J_i and J_i^{prev} stand for minus two times the logarithmic particle weights at analysis time and at previous time-steps, respectively. Taking minus two times the logarithm of the expression for the weights derived in the previous section then leads to:

J_i = \phi_i + \alpha_i \gamma_i - \gamma_i - 2 \log \left\| \frac{dx}{d\xi} \right\| + J_i^{prev}.

Substituting the Jacobian factor obtained in equation (19) we find:

J_i = \phi_i + (\alpha_i - 1)\gamma_i - Nx \log \alpha_i - 2 \log\left( 1 + 2 \gamma_i \alpha_i^{-1/2} \frac{d\alpha_i^{1/2}}{d\gamma_i} \right) + J_i^{prev},

in which the constant term common to all particles (including 2 \log \|P^{1/2}\|) is ignored as it plays no role in the following. Writing \alpha_i = 1 + \varepsilon_i, setting the weights of all particles equal to the target weight w_{target} is equivalent to setting all J_i equal to a constant number C, leading to the following equation for \varepsilon_i:

\gamma_i \varepsilon_i - Nx \log(1 + \varepsilon_i) - 2 \log\left( 1 + 2 \gamma_i \alpha_i^{-1/2} \frac{d\alpha_i^{1/2}}{d\gamma_i} \right) = c_i,   c_i = C - \phi_i - J_i^{prev}.

Although this is a scalar equation, the derivative makes this implicit equation hard if not impossible to solve in general. Since we are interested in high-dimensional problems we consider the equation in the limit of large state dimension Nx. As detailed in Appendix A we can integrate the equation in this limit, leading to the much simpler equation:

\gamma_i \varepsilon_i - Nx \log(1 + \varepsilon_i) = c_i.   (32)

If this equation can be solved and real solutions for \varepsilon_i obtained, a particle filter with exactly equal weights, free of filter degeneracy, results. The equation can be solved by iterative methods, such as Newton's method, but interestingly analytical solutions exist. These are based on the so-called Lambert W function, as detailed in the next section.
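As a concrete check on the Gaussian expressions above, the sketch below (illustrative random H, Q and R, not the paper's code) computes the mode of p(x^n|x_i^{n-1}, y^n) in two algebraically equivalent ways: the gain form f + K(y − Hf) with K = QHᵀ(HQHᵀ+R)⁻¹, and the information form P(Q⁻¹f + HᵀR⁻¹y) with P = (Q⁻¹ + HᵀR⁻¹H)⁻¹.

```python
import numpy as np

rng = np.random.default_rng(2)
Nx, Ny = 8, 5
H = rng.standard_normal((Ny, Nx))                 # linear observation operator
A = rng.standard_normal((Nx, Nx))
Q = A @ A.T + Nx * np.eye(Nx)                     # SPD model-error covariance
B = rng.standard_normal((Ny, Ny))
R = B @ B.T + Ny * np.eye(Ny)                     # SPD observation-error covariance
fx = rng.standard_normal(Nx)                      # deterministic forecast f(x^{n-1})
y = rng.standard_normal(Ny)                       # observations

# Gain form of the mode of p(x^n | x^{n-1}, y^n)
K = Q @ H.T @ np.linalg.inv(H @ Q @ H.T + R)
xa_gain = fx + K @ (y - H @ fx)

# Information form, via the proposal covariance P
P = np.linalg.inv(np.linalg.inv(Q) + H.T @ np.linalg.inv(R) @ H)
xa_info = P @ (np.linalg.inv(Q) @ fx + H.T @ np.linalg.inv(R) @ y)
print(np.max(np.abs(xa_gain - xa_info)))
```

The two forms agree, so whichever is cheaper for a given Ny versus Nx can be used when computing x_i^a and P in practice.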

Lambert W Function
The Lambert W function (??), also called the omega function or the product logarithm, is the inverse function of

f(W) = W e^{W},

i.e. z = W(z)\, e^{W(z)} for any complex number z.
The W function is multivalued (except at zero) because f(\cdot) is not injective. In this new scheme our attention is restricted to real-valued W, so the complex variable z is replaced by the real variable x; W(x) exists for x \ge -1/e and is double-valued on (-1/e, 0), see Figure 1. The branch satisfying W(x) \ge -1 is denoted W_0, and the branch satisfying W(x) \le -1 is denoted W_{-1}, as indicated in Figure 1. As can be seen in the figure, W_0(0) = 0 and W_0(-1/e) = -1.
The branch W_{-1}(x) decreases from -1 at x = -1/e to -\infty as x \to 0^-. Crucial identities of the Lambert W function are its derivative

\frac{dW}{dz} = \frac{W(z)}{z \left( 1 + W(z) \right)},   z \notin \{0, -1/e\},

and the equation

W(z e^z) = z,   (35)

which follows directly from its definition.
Its interest for our problem is that the Lambert W function gives the solution of the generalized problem

a x - \log x = d,   (36)

whose solution is x = -\frac{1}{a} W\left( -a\, e^{-d} \right). This allows us to solve equation (32) to obtain an analytical solution for \varepsilon_i.
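These properties are easy to check numerically; SciPy exposes the function as scipy.special.lambertw with a branch index k (k=0 for W_0, k=-1 for W_{-1}). A short sketch:

```python
import numpy as np
from scipy.special import lambertw

# Identity W(z e^z) = z on the principal branch W_0 (valid for z >= -1)
zs = np.array([-0.9, -0.5, 0.0, 1.0, 2.5])
w_back = lambertw(zs * np.exp(zs), k=0).real

# On (-1/e, 0) the real W is double-valued: branch W_0 lies in (-1, 0),
# branch W_{-1} lies below -1; both satisfy W e^W = x.
x = -0.2
w0 = lambertw(x, k=0).real
wm1 = lambertw(x, k=-1).real
print(w_back, w0, wm1)
```

Both returned branch values satisfy the defining equation W e^W = x, which is the property exploited in the next section.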

Solutions for α i
Equation (32) can be brought into the form of equation (36): dividing by Nx and using \alpha_i = 1 + \varepsilon_i gives

\frac{\gamma_i}{Nx} \alpha_i - \log \alpha_i = \frac{c_i + \gamma_i}{Nx}.

Following equation (36), the analytical solution is found as

\alpha_i = -\frac{Nx}{\gamma_i} W\left( -\frac{\gamma_i}{Nx} \exp\left( -\frac{\gamma_i + c_i}{Nx} \right) \right),

so that \varepsilon_i = \alpha_i - 1. To ensure real-valued solutions the argument of W must satisfy

-\frac{\gamma_i}{Nx} \exp\left( -\frac{\gamma_i + c_i}{Nx} \right) \ge -\frac{1}{e},

which holds for all \gamma_i whenever c_i \ge 0. In accordance with the characteristics of the Lambert W function, we find the following characteristics for \alpha_i. First, there are in general two real solutions for W(\cdot), and hence for \varepsilon_i and thus \alpha_i: \varepsilon_i has a positive real solution given by the W_{-1} branch, and a negative real solution, always larger than -1, given by the negative-x part of the W_0 branch. Second, if the value of c_i is zero, identity (35) gives W_0\left( -\frac{\gamma_i}{Nx} e^{-\gamma_i/Nx} \right) = -\frac{\gamma_i}{Nx}, so that the W_0 branch yields the single constant solution \alpha_i = 1. Practically, the solutions can be obtained numerically via the Lambert W function, or by using numerical approximation methods on the original equation (32) directly. The Lambert W function itself can be evaluated using Newton's method or Halley's method (?).
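A small sketch of the solution procedure (illustrative numbers): solve γ(α−1) − Nx log α = c for both branches with scipy.special.lambertw, and confirm that both roots satisfy the original scalar equation, with the negative-ε root below 1 and the positive-ε root above it.

```python
import numpy as np
from scipy.special import lambertw

def alpha_solutions(gamma, c, Nx):
    """Both real roots of gamma*(alpha - 1) - Nx*log(alpha) = c for c >= 0,
    via alpha = -(Nx/gamma) * W_k( -(gamma/Nx) * exp(-(gamma + c)/Nx) )."""
    arg = -(gamma / Nx) * np.exp(-(gamma + c) / Nx)
    a_minus = -(Nx / gamma) * lambertw(arg, k=0).real   # W_0 branch: negative epsilon
    a_plus = -(Nx / gamma) * lambertw(arg, k=-1).real   # W_-1 branch: positive epsilon
    return a_minus, a_plus

def residual(alpha, gamma, c, Nx):
    """Left-hand side minus right-hand side of the scalar equation."""
    return gamma * (alpha - 1.0) - Nx * np.log(alpha) - c

Nx, gamma, c = 1000, 950.0, 30.0        # illustrative values
a_minus, a_plus = alpha_solutions(gamma, c, Nx)
print(a_minus, a_plus)
```

In a filter implementation one such scalar solve is done per particle, which is negligible compared with the cost of the model propagation.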

Structure of solutions
The analytical solution for \alpha_i as a function of \gamma_i is shown in Figure 2. The \alpha_i values tend to be closer to 1 when Nx becomes larger, mainly because the fluctuation of \gamma_i/Nx tends to be smaller when Nx increases.
When c = 0 the two branches of \alpha_i meet in one point, where \alpha_i = 1 and \gamma_i = Nx. With increasing c the gap between the two branches becomes larger and larger, and the values of \alpha_i move further away from 1.

Discussion
Figure 2 illustrates how \alpha_i behaves with varying c. There is a gap when c > 0, restricting the state space that \alpha_i^{1/2} P^{1/2} \xi_i^n explores.
Since P^{1/2} is a constant matrix, we ignore it in this section and consider the function

f(\xi_i^n) = \alpha_i^{1/2}(\gamma_i)\, \xi_i^n.

We choose Nx = 1 as the simplest case, and let c/Nx take the three values 0, 1/2 and 1. Figure 3 shows the state space that f(\xi_i^n) explores for these values of c/Nx when Nx = 1.
The importance of the gap lies in the fact that the proposal density does not explore the full state space for those particles that have a gap, i.e. all particles except one. This means that the new scheme will be biased, although it is unclear what form this bias takes. The position of the gap will be different for each particle, so the space missed by several particles together is much smaller than the gap of an individual particle. And, because one particle has no gap, the ensemble as a whole will explore the full state space.
The scheme is tailored to high-dimensional systems, so we studied the importance of the gap when Nx increases. For each particle there are two high-probability hyper-spheres surrounding the gap, and we show in Appendix B that the ratio of the gap volume to the volumes of the two high-probability hyper-spheres becomes smaller when the state-space dimension increases, suggesting that the bias decreases when Nx increases.
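The concentration effect behind this argument can be illustrated with the χ² distribution: for ξ ~ N(0, I_{Nx}) the squared radius γ = ξᵀξ is χ²-distributed with Nx degrees of freedom, and the probability mass inside a shell of fixed relative width around γ = Nx tends to one as Nx grows, so a gap of fixed relative size in radius carries vanishing mass. A sketch with an illustrative ±10% shell:

```python
from scipy.stats import chi2

# P( 0.9 < gamma/Nx < 1.1 ) for gamma ~ chi-squared with Nx degrees of freedom
mass = {}
for Nx in (10, 100, 1000, 10000):
    mass[Nx] = chi2.cdf(1.1 * Nx, Nx) - chi2.cdf(0.9 * Nx, Nx)
    print(Nx, mass[Nx])
```

The mass in the shell grows monotonically towards one with Nx, consistent with the claim that the relative importance of the gap, which sits away from this shell, shrinks in high dimensions.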

Multi Time-steps Between Observations
In typical geophysical systems several model time-steps exist between observation times. In principle one can extend the formulation above over a number of time-steps, as e.g. the implicit particle filter does. In that case x_i^a becomes a model trajectory over time, which can be found as a weak-constraint 4DVar solution with fixed initial condition. The random vector \xi_i^n then extends over space and time, and so does P. This again results in a highly nonlinear equation for \alpha_i, which can be solved numerically.
In this paper we use a simpler approach: the relaxation proposal density also explored in e.g. ?. If it is assumed that the original model error is Gaussian with known covariance matrix Q, the model transition density is expressed as

p(x^n|x_i^{n-1}) = N\left( f(x_i^{n-1}), Q \right).

The relaxation proposal transition density for the time-steps before the last step towards the observations is chosen as

q(x^n|x_i^{n-1}, y^m) = N\left( f(x_i^{n-1}) + B(\tau)\, v_i,\; \hat{Q} \right),

in which the second term in the mean is the relaxation term, pulling the model state towards the observations y^m at the next observation time m, and where we introduce the short-hand notation

v_i = y^m - H f(x_i^{n-1}).

The relaxation strength B(\tau) is given by

B(\tau) = b\, \tau\, Q H^T R^{-1},

where \tau increases linearly from zero to one over the time-steps between observations and b is a constant. B(\tau) controls the strength of the relaxation but, via its dependence on Q, also spreads the information from the observed grid points to unobserved grid points. The covariance \hat{Q} of the random forcing in the modified model is chosen equal to Q in our experiments.
This expression for p/q allows us to generate w_i^{prev} simply by multiplying the particle weight by the p/q factor at each time-step; see ? for more details.
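The relaxation step can be sketched as follows. This is an illustrative implementation, not the paper's code: the specific form B(τ) = b τ Q Hᵀ R⁻¹, the numerical values, and the identity deterministic model f(x) = x are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(3)
Nx = 40
Q = 0.1 * np.eye(Nx)                  # model-error covariance (illustrative)
R = 0.16 * np.eye(Nx)                 # observation-error covariance (illustrative)
H = np.eye(Nx)                        # every grid point observed
b = 0.2                               # relaxation constant (illustrative)
sqrtQ = np.linalg.cholesky(Q)

def relaxed_step(x, y_next, tau):
    """One time-step of the relaxation proposal: deterministic model (identity
    here), plus a nudge towards the next observations, plus model noise."""
    fx = x                                            # f(x), identity model for brevity
    B_tau = b * tau * Q @ H.T @ np.linalg.inv(R)      # assumed relaxation strength
    eta = sqrtQ @ rng.standard_normal(Nx)             # model-noise draw
    return fx + B_tau @ (y_next - H @ fx) + eta

x = rng.standard_normal(Nx)
y_next = np.zeros(Nx)                                 # pretend next observations are zero
for tau in np.linspace(0.2, 1.0, 5):                  # tau grows towards observation time
    x = relaxed_step(x, y_next, tau)
x_end = x
```

In a full implementation each such step would also multiply the particle weight by the corresponding p/q factor, so that the relaxation is properly accounted for in w_i^{prev}.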

Linear Model Experiments
Although geophysical models tend to be high-dimensional nonlinear systems, linear models are still a simple benchmark for testing new DA schemes. Furthermore, analytical solutions are usually available in these cases. Consider the model equation and the observations (?):

x^n = x^{n-1} + \eta^{n-1},   (51)
y^n = x^n + \epsilon^n,

where x^n is the state variable at time-step n and y^n is the observation vector at time-step n. Random model perturbations \eta are drawn from the model-error pdf N(0, Q), observation errors \epsilon are drawn from the observation-error pdf N(0, R), and we sample the initial ensemble members x_i^0 from the background-error pdf N(0, B).
Observation, model and background errors are mutually uncorrelated, and the three matrices Q, R and B are diagonal. This means that effectively we are running Nx independent data-assimilation systems at the same time, in which Nx is the dimension of the state space. We can also look at the shape and structure of the posterior pdf. Since we know the true posterior pdf is Gaussian, we can test how good our ensemble is. However, since the ensemble size is small, a direct calculation of the posterior pdf is not very useful. To check whether the scheme is feasible for a large number of ensemble members, the ensemble size is increased to 1,000, which leads to bias issues in the equivalent-weights particle filter. Figure 7 shows that a choice of 50% positive \varepsilon_i leads to a pdf that is slightly too wide. Decreasing the percentage to 35% gives a better result. This suggests that we might be able to choose the sign of \varepsilon_i in a better way, e.g. according to the probability mass on each side of the gap. This is left for future research.
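Because the components decouple, the exact posterior of this linear model is available from a scalar Kalman filter, which is what makes it a convenient benchmark. A minimal sketch (illustrative variances q, r and b0, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(4)
q, r, b0 = 1.0, 0.16, 1.0            # model, observation, background variances
n_steps = 50

x = rng.normal(0.0, np.sqrt(b0))     # scalar truth
m, v = 0.0, b0                       # exact posterior mean and variance
for n in range(n_steps):
    x = x + rng.normal(0.0, np.sqrt(q))      # model step x^n = x^{n-1} + eta
    y = x + rng.normal(0.0, np.sqrt(r))      # observation y^n = x^n + eps
    vf = v + q                               # forecast variance
    k = vf / (vf + r)                        # Kalman gain
    m = m + k * (y - m)                      # analysis mean
    v = (1.0 - k) * vf                       # analysis variance

# Steady-state analysis variance solves v = r*(v + q)/(v + q + r)
v_star = 0.5 * (-q + np.sqrt(q * q + 4.0 * r * q))
print(v, v_star)
```

The recursion converges quickly to the steady-state variance, so comparing an ensemble's sample variance against v_star gives a direct check on whether a filter reproduces the true posterior spread.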

High-Dimensional Lorenz96 Model Experiments
In this section the new scheme is compared to the LETKF in a moderately high-dimensional setting of the Lorenz 1996 model.
The Lorenz 96 model (?) is a dynamical model often used as a test bed for new DA methods. It is defined as

\frac{dx_j}{dt} = (x_{j+1} - x_{j-2})\, x_{j-1} - x_j + F,

with cyclic boundary conditions, where x_j is the state variable of the model at position j and F is a forcing constant, typically chosen as 8 for chaotic behaviour. The dimension of the Lorenz 96 model can easily be extended from the usual 40 to 1,000 or more. In this section a 1,000-dimensional Lorenz 96 model with 20 ensemble members is used for all experiments.
As described in the previous section, \varepsilon_i has two different real solutions in this new scheme, one positive and one negative. We will explore the sensitivity to different choices of \varepsilon_i. To mimic realistic geophysical situations the grid points are observed every five time-steps, and three scenarios of spatial observation density are explored: observations at every grid point, observations at every other grid point, and observations in the first half of the domain. We choose the model-error covariance matrix Q and the background-error covariance matrix B as tridiagonal matrices. We use a time-step of \Delta t = 0.05 with an RK4 scheme for the deterministic part and an Euler scheme for the stochastic part of the model.
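An illustrative implementation of this model setup (RK4 for the deterministic tendency, additive Euler-Maruyama noise; the noise amplitude here is an arbitrary assumption):

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Lorenz 96: dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F, cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def step(x, dt=0.05, q_std=0.1, rng=None):
    """One model step: RK4 for the deterministic part, Euler for the noise."""
    k1 = l96_tendency(x)
    k2 = l96_tendency(x + 0.5 * dt * k1)
    k3 = l96_tendency(x + 0.5 * dt * k2)
    k4 = l96_tendency(x + dt * k3)
    xn = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    if rng is not None:
        xn = xn + np.sqrt(dt) * q_std * rng.standard_normal(x.size)
    return xn

rng = np.random.default_rng(5)
x = 8.0 + 0.01 * rng.standard_normal(1000)   # perturbed rest state, Nx = 1000
for _ in range(200):                          # 10 model time units
    x = step(x, rng=rng)
print(x.min(), x.max())
```

Starting from a small perturbation of the unstable rest state x_j = F, the trajectory quickly saturates onto the chaotic attractor, which is why a short spin-up precedes the assimilation experiments.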
The LETKF uses the same background-error covariance matrix, model-error covariance matrix, observation-error covariance matrix and observation operator, and the initial ensembles of the two methods are the same. After some experimentation the localisation radius of the LETKF was set to one grid point for best performance on the RMSE, which is the standard measure of performance for the LETKF.

Parameter setting for the IEWPF
In this section we explore the parameter values of the IEWPF to determine the optimal settings for the Lorenz96 model, in the case where all variables are observed directly every 5th time-step.
IEWPF without relaxation term. In this experiment the relaxation term in the IEWPF is not used. The background covariance matrix B is tridiagonal with main-diagonal value 1 and sub- and super-diagonal values 0.25. The main-diagonal value of Q is 0.10 and both sub- and super-diagonal values are 0.025. R is a diagonal matrix with main-diagonal value 0.16. We observe every grid point every 5th time-step, and use 50% positive \varepsilon_i. The ensemble needs some spin-up time to reach a more-or-less steady spread, as depicted in Figure 8.

IEWPF with relaxation term. In the next experiment the relaxation term is included. All other settings are the same as in the experiment without relaxation term, and the observational scheme is also the same. \varepsilon_i is again randomly chosen to be 50% positive.

Figure 10 shows that an ensemble of only 20 trajectories can follow the truth properly under the implicit equal-weights particle filter. Furthermore, the ensemble members are closer to the truth when the relaxation term is present. The spin-up time of the ensemble runs can be seen in Figure 10 to be around 90 time-steps. To investigate the sensitivity of the IEWPF to the sign of \varepsilon_i, Figures 12 and 13 illustrate the trajectories and rank histograms of the ensemble members for all-positive (left) and all-negative (right) \varepsilon_i. The system still has 1,000 dimensions and 20 ensemble members, and every grid point is observed every 5 time-steps. Comparing these two experiments, the IEWPF with relaxation term performs better than that without, but the relaxation parameters need to be tuned. This is similar to ensemble Kalman filters for systems that are not too nonlinear: the raw schemes are consistent, and by tuning the inflation factor and the localisation area a better performance can be achieved.
This relaxation strength is used for all further experiments on this model, with 50% of the ε i positive.

Comparison of IEWPF and LETKF
In this section the IEWPF is compared to the LETKF. The localisation radius of the LETKF is set to 2 grid points and the covariance inflation factor to 1.05, found to give the lowest root-mean-square error. Figure 14 shows that the RMSE of the LETKF is systematically lower than that of the IEWPF, apart from at the unobserved points.
There it is outperformed by the IEWPF. Arguably more important than the actual RMSE is the ratio of the RMSE to the ensemble spread. Figure 15 shows that both methods perform well on this measure for the first two experiments, and that the spread of the LETKF is far too low in experiment 3, while the IEWPF still performs well. As a further comparison we look at the rank histograms for all experiments in Figure 16. The LETKF is slightly overdispersive for the observed grid points in experiment 3, and strongly underdispersive at the unobserved grid points in that experiment. In contrast, the IEWPF performs well in all settings.
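The two diagnostics used here, the RMSE-to-spread ratio and the rank histogram, can be sketched for a synthetic, perfectly calibrated ensemble (truth and members drawn from the same distribution), for which the ratio should be close to one and the histogram close to flat. Illustrative code, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(6)
N, n_cases = 20, 5000                       # members, independent verification cases
ens = rng.standard_normal((n_cases, N))     # ensemble members
truth = rng.standard_normal(n_cases)        # truth drawn from the same pdf

# RMSE of the ensemble mean versus the average ensemble spread
rmse = np.sqrt(np.mean((ens.mean(axis=1) - truth) ** 2))
spread = np.sqrt(np.mean(ens.var(axis=1, ddof=1)))
ratio = rmse / spread

# Rank histogram: position of the truth among the sorted members
ranks = np.sum(ens < truth[:, None], axis=1)
hist = np.bincount(ranks, minlength=N + 1)
print(ratio, hist)
```

For a calibrated N-member ensemble the expected ratio is sqrt((N+1)/N), about 1.02 here, and each of the N+1 rank bins should hold roughly n_cases/(N+1) counts; a too-narrow ensemble instead piles ranks into the extreme bins, as seen for the LETKF in the sparsely observed experiment.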

Conclusions
A new DA method, the implicit equal-weights particle filter, has been presented in this paper. A flexible proposal density with a covariance that varies with the performance of each sample is used to make all particle weights equal. It is essential that a model-error term is included for the new method to work. This is not a serious drawback, as it is well recognised that model errors are present and significant, although in practical applications they tend to be ignored. A relaxation term is included in the time-steps between two adjacent observations to make the scheme more efficient; its contribution is included in the proposal-density weights. The equal-weights condition leads to a complicated ordinary differential equation involving a matrix determinant, which is hard to solve in general. We concentrated on high-dimensional systems, which allow for approximations that make the problem tractable. Interestingly, one particle can explore the full state space while all others experience a gap with zero proposal probability. However, this gap diminishes with the size of the system, making the new filter ideal for high-dimensional geophysical applications.
The IEWPF can be easily implemented in high-dimensional chaotic models.Two series of high-dimensional model experiments have been conducted, using 1,000 dimensional linear model and 1,000 dimensional non-linear Lorenz 96 model.
The linear-model experiment using only 20 ensemble members shows that the particle weights do not degenerate as the dimension of the model state increases, without the exponential growth of the ensemble size reported by ?. The new scheme preserves the Gaussian posterior pdf in the linear-model experiment.
Increasing the ensemble size to 1,000, the simulated posterior pdf shows a Gaussian distribution which is slightly too wide.
After decreasing the percentage of positive \varepsilon_i for 1,000 ensemble members, the simulated posterior pdf does resemble the true posterior. This bias is subject to further study and is likely related to the fact that we should choose the percentage of positive \varepsilon_i equal to the percentage of the probability mass that lies on the positive-\varepsilon_i side of the gap.
The performance of implicit equal-weights particle filter is also examined in 1000-dimensional non-linear Lorenz 96 model.
Again the experiments show that this new scheme has very good consistency and convergence properties without filter degeneracy.
A comparison with a tuned LETKF reveals that the RMSEs of the latter tend to be smaller than those of the IEWPF, but the ensemble spread in the LETKF is too small when the observation density decreases, while the ensemble spread is always equal to the RMSE in the IEWPF. This is also reflected in the rank histograms, which are too narrow for the LETKF when large portions of the system are unobserved. A lesson to learn from this is that concentrating only on the RMSE is not good practice in nonlinear data-assimilation systems.
The IEWPF was implemented with a weak relaxation term between observations to control the spread and to achieve better-converging trajectories of the ensemble members. This is a weaker part of the scheme, and it needs tuning. More sophisticated proposal densities could be used to improve performance further and to increase the robustness of the scheme. For instance, one could extend the implicit equal-weights step over the whole trajectory between observations, as the implicit particle filter does. The drawback of such a proposal is that an adjoint of the model is needed to make it efficient, although ensemble schemes like 4DEnsVar might also be explored.
The new scheme has been implemented in the EMPIRE data-assimilation software system (?), and experiments on high-dimensional geophysical systems are being planned.

Appendices
A. The high-dimensional limit

We need to solve the equal-weights equation for \alpha_i in the limit of large state dimension Nx. Going backward to recover the full expression of the original equation, we plug equation (C4) into (C3), and then turn back to equation (C2) to obtain the solution for \alpha_i, from which we immediately see that the gap appears where \gamma_i = Nx.

Figure 1. The Lambert W function, real-valued branches W_0(x) and W_{-1}(x).

Figure 2 shows the plot of \alpha_i as a function of \gamma_i for varying c values and different state-space dimensions. The solution for \alpha_i has two branches, related to the two branches of the Lambert W function: in Figure 2 the dashed line is the W_{-1} branch of the \alpha_i solutions and the full line is the W_0 branch. Different line colours represent different c values.

Figure 3. Function f(\xi_i^n) for different c/Nx values.

The second term in the mean of the relaxation proposal is the relaxation term, forcing the model state towards the observations at the next observation time. \hat{Q} is the covariance of the random forcing in the modified model, which we choose equal to Q in our experiments. Sampling from the proposal transition density instead of the original model equation leads to an expression for the ratio p/q for each time-step.

Figure 4. Rank histograms of grid point 200 using different percentages of positive \varepsilon_i in the linear-model experiment, for Nx = 1000.

Fifty percent positive \varepsilon_i generates a flat rank histogram, which indicates a good-quality ensemble. Comparing the ratio of the RMSE to the spread of the analysis ensemble over the first 200 time-steps in Figure 5, fifty percent positive \varepsilon_i shows a stable ratio almost equal to one after some spin-up time-steps. Increasing or decreasing the percentage of positive \varepsilon_i causes a degradation of the ensemble spread, as can be seen clearly from Figure 5.

Figure 5. Ratio of the RMSE to the spread of the ensemble for different percentages of positive \varepsilon_i.

Figure 6. Posterior pdf represented by the particles for different percentages of positive \varepsilon_i in the linear experiment.

Figure 7. Posterior pdf with 1,000 ensemble members for different percentages of positive \varepsilon_i.

Figure 8. Trajectories of grid point 345 in the new scheme without relaxation term. The black line is the truth and the blue lines depict the evolution of the particles. Note the different time resolutions in the different plots.

Figure 9. Rank histogram of grid point 345 after a 10,000 time-step model run.

Figure 9 illustrates the rank histogram of the observations with respect to the perturbed forecast ensemble members after a 10,000 time-step model run for grid point 345. The flatness of the histogram shows that the ensemble has a good spread and quality.

Figure 10. Trajectories of grid point 856 in the new scheme with relaxation term. The black line is the truth and the blue lines depict the evolution of the particles.

Figure 11. Rank histogram of grid point 856 after a 10,000 time-step model run.

Figure 11 shows the rank histogram of the observations with respect to the perturbed forecast ensemble members after a 10,000 time-step model run for grid point 856. The flatness of the histogram shows that the observations are indistinguishable from any perturbed ensemble member when every grid point is observed. The histogram of the state variable itself is not Gaussian (not shown).

Figure 12. Trajectories of grid point 856 in the new scheme with different percentages of positive \varepsilon_i. The black line is the truth and the blue lines depict the evolution of the particles.

Figure 13. Rank histograms of the ensemble members for the experiments of Figure 12.

Figure 14. Root-mean-square error (RMSE) of the two methods for the different experiments, red for the IEWPF and blue for the LETKF. The subscript 'all' denotes the RMSE over all grid points, 'o' the RMSE over only the observed grid points, and 'u' the RMSE over only the unobserved grid points.

Figure 15. Ratio of the RMSE to the ensemble spread for the different experiments, red for the IEWPF and blue for the LETKF.

Figure 16. Rank histograms for the LETKF and the IEWPF for the different experiments.
have absorbed the constant factor \|P^{1/2}\| in the constant C. Let us now introduce the notation a = \alpha_i^{1/2}, g = \gamma_i, n = Nx, \xi = \xi_i^n and c = C + 2 \log \|P^{1/2}\| - \phi_i - J_i^{prev}, so that each particle has a different g, \xi and c, and we need to solve for a. We look for solutions in which \alpha_i, and hence a, is a function of \xi_i only via \gamma_i = \xi_i^T \xi_i. The derivative then becomes

\frac{da}{d\xi} = 2 \frac{da}{dg}\, \xi,

so that we end up with

(a^2 - 1)\, g - 2 n \log a - 2 \log\left( 1 + \frac{2g}{a} \frac{da}{dg} \right) = c,

the solution of which can be expressed in terms of the Lambert W function.