
Consider a simple Kalman filter:

State update equation $x_t = x_{t-1} + w_t, w_t\sim N(0,Q)$

Observation equation $z_t = x_{t} + v_t, v_t\sim N(0,R)$

I'm curious: under what conditions will we have the steady state $P_{\infty} = R$? In other words, the final estimation error covariance is exactly our observation noise covariance $R$. (Here my $P, Q, R$ follow the Wikipedia convention.)

It seems that, according to https://en.wikipedia.org/wiki/Kalman_filter#Asymptotic_form,

we would have $P_{\infty} = P_{\infty} - P_{\infty}(P_{\infty}+R)^{-1}P_{\infty}+Q$, or equivalently $P_{\infty}(P_{\infty}+R)^{-1}P_{\infty}=Q$. If we enforce $P_{\infty} = R$, then we get $Q=\frac{R}{2}$, but I'm not sure what this intuitively means...
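
(A quick scalar sanity check of my own does seem to confirm this algebra; the value $R=2$ and the starting covariance are arbitrary. Iterating the covariance recursion with $Q=R/2$ drives the predicted covariance to $R$:)

```python
# My own scalar sanity check: iterate the covariance recursion
# P_prior -> K -> P_post for the model x_t = x_{t-1} + w_t, z_t = x_t + v_t.
R = 2.0          # arbitrary observation noise variance
Q = R / 2.0      # the condition derived above
P_post = 10.0    # arbitrary initial posterior variance

for _ in range(200):
    P_prior = P_post + Q                 # prediction step
    K = P_prior / (P_prior + R)          # Kalman gain
    P_post = (1.0 - K) * P_prior         # update step

print(P_prior)   # converges to R (= 2.0)
print(P_post)    # converges to R/2 (= 1.0)
```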

I'd have guessed, intuitively, if $Q=0$, I'd have $P_{\infty}=R$. As in if the states never change, I keep observing with uncertainty of $R$, I'd eventually have an estimation error covariance of $R$. My estimation won't be better than my observation error, but it won't be worse, as the states don't change and I have infinite observations.

But why is it that we need $Q=\frac{R}{2}$? What's the intuition behind this condition? What's wrong with my intuition or algebra?

skf
  • I assume the observation equation should be $z_t=x_t+v_t$? – Kwin van der Veen Dec 03 '24 at 06:46
  • @KwinvanderVeen yes, sorry that was a typo – skf Dec 03 '24 at 12:24
  • I don't know if this will help at all (the answer by Kwin van der Veen looks quite good but is currently too involved for me to understand) because you are asking a more general question, but that particular model is called a "random walk plus noise" model in the econometrics literature. Harvey's KF text shows that the Kalman filter for this model actually reduces to exponential smoothing. If you can get your hands on this book, it derives the updating details. But, as far as I know, it's not going to answer the specific question you are asking. This is Harvey's text. – mark leeds Dec 05 '24 at 06:24
  • https://www.amazon.com/Forecasting-Structural-Time-Harvey/dp/0521405734 – mark leeds Dec 05 '24 at 06:29
  • @markleeds thanks, checking it out. Really, even if $Q$ and $R$ are not diagonal? – skf Dec 08 '24 at 03:28
  • Hi skf: You're welcome, but I thought that $x_t$ and $z_t$ were scalars. In the case where they aren't, I guess it would be called a multivariate random walk plus noise model, but I wouldn't bother checking the literature for it because I've never seen such a thing and I don't think you'll find anything. My apologies for the misunderstanding and noise. – mark leeds Dec 08 '24 at 14:25

3 Answers

1

In order to get a bit more insight into this, it helps to consider the two extreme cases, namely $Q=0$ and $R=0$. When doing this it also helps to know how your algebraic Riccati equation for $P_\infty$ can be derived and what that matrix actually represents. For this I will consider the following, more general, model

\begin{align} x_{k} &= F_k\,x_{k-1} + B_k\,u_k + w_k, \quad w_k \sim \mathcal{N}(0,Q_k) \tag{1} \\ z_k &= H_k\,x_k + v_k, \quad v_k \sim \mathcal{N}(0,R_k) \tag{2} \end{align}

where $u_k$ is deterministic and known, and $v_k$ and $w_k$ are uncorrelated. The Kalman filter equations for this model are given by

\begin{align} \hat{x}_{k|k-1} &= F_k\,\hat{x}_{k-1|k-1} + B_k\,u_k, \tag{3} \\ P_{k|k-1} &= F_k\,P_{k-1|k-1}\,F_k^\top + Q_k, \tag{4} \\ K_k &= P_{k|k-1}\,H_k^\top (H_k\,P_{k|k-1}\,H_k^\top + R_k)^{-1}, \tag{5} \\ \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k\,(z_k - H_k\,\hat{x}_{k|k-1}), \tag{6} \\ P_{k|k} &= (I - K_k\,H_k) P_{k|k-1}, \tag{7} \end{align}

with $\hat{x}_{k|j}$ the expected value of $x_{k}$ given the measurements up to and including $z_j$, and $P_{k|j}$ the covariance of the estimation error, i.e. $\mathbb{E}[(x_{k}-\hat{x}_{k|j})(x_{k}-\hat{x}_{k|j})^\top]$. Here it can be noted that $P_{k|k-1}$ is not the same quantity as $P_{k|k}$; in most cases they are not equal to each other, not even in the limit as $k\to\infty$.
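
To make the notation concrete, here is a minimal NumPy sketch of one predict/update cycle implementing $(3)$-$(7)$. It is my own illustration of the equations above, not code from any particular library, and the function names are my own choice:

```python
import numpy as np

def kf_predict(x_post, P_post, F, B, u, Q):
    """Prediction step, equations (3) and (4)."""
    x_prior = F @ x_post + B @ u
    P_prior = F @ P_post @ F.T + Q
    return x_prior, P_prior

def kf_update(x_prior, P_prior, z, H, R):
    """Update step, equations (5)-(7)."""
    S = H @ P_prior @ H.T + R                               # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)                    # Kalman gain, (5)
    x_post = x_prior + K @ (z - H @ x_prior)                # corrected state estimate, (6)
    P_post = (np.eye(P_prior.shape[0]) - K @ H) @ P_prior   # corrected covariance, (7)
    return x_post, P_post
```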

In order to obtain a discrete-time algebraic Riccati equation in the covariance, one must substitute $(4)$ and $(7)$ into each other. For example, incrementing $k$ by one in $(4)$ and substituting $(7)$, with the gain from $(5)$, into it yields

$$ P_{k+1|k} = F_{k+1}\,P_{k|k-1}\,F_{k+1}^\top - F_{k+1}\,P_{k|k-1}\,H_k^\top (H_k\,P_{k|k-1}\,H_k^\top + R_k)^{-1}\,H_k\,P_{k|k-1}\,F_{k+1}^\top + Q_{k+1}. \tag{8} $$

This simplifies to your discrete-time algebraic Riccati equation when using $P_{k+1|k} = P_{k|k-1} = P_\infty$, $F_k=H_k=I$, $R_k=R$ and $Q_k=Q$. If instead one wants the limit of $P_{k|k}$ as $k\to\infty$ (assuming that $F_k$, $H_k$, $Q_k$ and $R_k$ are constant), one could formulate another algebraic Riccati equation, or just apply $(7)$ once to the solution obtained from solving $(8)$. In order to distinguish between the two limits I will denote the limit of $P_{k|k-1}$ as $P_\infty$ and the limit of $P_{k|k}$ as $P_\infty^*$.
Since $F_k$, $H_k$, $Q_k$ and $R_k$ are assumed constant in order to solve for $P_\infty$ and $P_\infty^*$, I will just drop the $k$ subscript, which simplifies $(8)$ to the following algebraic Riccati equation in $P_\infty$

$$ P_\infty = F\,P_\infty\,F^\top - F\,P_\infty\,H^\top (H\,P_\infty\,H^\top + R)^{-1}\,H\,P_\infty\,F^\top + Q. \tag{9} $$
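
A small sketch (again my own illustration) that finds $P_\infty$ by simply iterating $(8)$ with constant matrices until it converges, and then obtains $P_\infty^*$ by applying $(7)$ once; the tolerance, the starting covariance and the example values are arbitrary:

```python
import numpy as np

def steady_state_covariances(F, H, Q, R, tol=1e-12, max_iter=10_000):
    """Iterate (8) with constant F, H, Q, R until P_{k+1|k} stops changing,
    then apply (7) once to get the steady-state posterior covariance."""
    n = F.shape[0]
    P = 10.0 * np.eye(n)                                 # arbitrary positive-definite start
    for _ in range(max_iter):
        S = H @ P @ H.T + R
        P_next = F @ P @ F.T - F @ P @ H.T @ np.linalg.solve(S, H @ P @ F.T) + Q
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)         # steady-state gain, (5)
    P_star = (np.eye(n) - K @ H) @ P                     # P_infinity^*, (7)
    return P, P_star

# The scalar model from the question: F = H = 1, R = 1, Q = R/2
P_inf, P_star = steady_state_covariances(np.eye(1), np.eye(1),
                                          0.5 * np.eye(1), 1.0 * np.eye(1))
print(P_inf, P_star)   # approximately [[1.0]] and [[0.5]], i.e. R and R/2
```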


For the limit case of $Q=0$ it can be shown using $(9)$ that $P_\infty=0$ is a solution. Similarly, using $(7)$ it also follows that $P_\infty^*=0$. It can be noted that $Q=0$ means that the dynamics from $(1)$ are actually deterministic and only the measurements $z_k$ are stochastic. This can also be seen as a least squares problem: as more and more noisy data (the measurements $z_k$) are used in the fit, the error gets smaller and smaller until, in the limit, it is zero; see also recursive least squares. Using this solution to obtain the Kalman gain from $(5)$ gives that this gain is also zero. Note that this is only the gain as $k\to\infty$, denoted $K_\infty$. However, running $(3)-(7)$ starting from a non-zero $P_{k|k-1}$ will initially yield non-zero $K_k$; the gain only tends to zero over time.

In short, for the case of $Q=0$ the Kalman filter relies more and more on the state predictions, due to the deterministic dynamics from $(1)$, and in the end does not use the measurements at all, i.e. $K_\infty=0$.
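
A minimal scalar sketch of this behaviour (my own illustration; $R=1$ and $P_{0|0}=1$ are arbitrary) shows both the gain and the posterior covariance shrinking towards zero:

```python
R, Q = 1.0, 0.0       # arbitrary R, deterministic dynamics
P_post = 1.0          # arbitrary initial posterior variance
for k in range(1, 6):
    P_prior = P_post + Q            # (4) with F = 1
    K = P_prior / (P_prior + R)     # (5) with H = 1
    P_post = (1.0 - K) * P_prior    # (7)
    print(k, K, P_post)             # K and P_post both equal 1/(k+1) here, tending to 0
```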


For the limit case of $R=0$ together with $H=I$ it can be shown using $(9)$ that $P_\infty=Q$. Similarly, using $(7)$ it follows that $P_\infty^*=0$. I will start with the result $P_\infty^*=0$, which can be understood by noting that the measurement $z_k$ measures the complete state $x_k$ without noise, so the best estimate of the state given such a measurement is just the measurement itself. Why it should hold that $P_\infty^*=0$ can be seen from the fact that $P_\infty^*$ is the limit of $P_{k|k}$, the covariance of $\hat{x}_{k|k}$, with $\hat{x}_{k|k} = z_k = x_k$. From $P_\infty^*=0$ the result for $P_\infty$ can be derived using $(4)$: $P_\infty$ is the limit of $P_{k|k-1}$, the covariance of the state estimate predicted one time step into the future, which gives $P_\infty = F\,P_\infty^*\,F^\top + Q = Q$. Using this solution to obtain the Kalman gain from $(5)$ gives $K_\infty = I$, which combined with $H=I$ simplifies $(6)$ to the expected result $\hat{x}_{k|k} = z_k$.

In short, for the case of $R=0$ together with $H=I$, the Kalman filter relies entirely on the noise-free measurements of the full state.
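
The same kind of scalar sketch for this case (my own illustration; $Q=0.3$ is arbitrary, and $R=0$ requires $P_{k|k-1}>0$ so that the inverse in $(5)$ exists):

```python
R, Q = 0.0, 0.3       # noise-free measurements, arbitrary process noise
P_post = 1.0          # arbitrary initial posterior variance
for k in range(1, 4):
    P_prior = P_post + Q            # (4) with F = 1
    K = P_prior / (P_prior + R)     # (5) with H = 1; equals 1 whenever P_prior > 0
    P_post = (1.0 - K) * P_prior    # (7); equals 0
    print(k, P_prior, K, P_post)    # after the first step: P_prior = Q, K = 1, P_post = 0
```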


When $Q\neq0$ and $R\neq0$, then at each time step the Kalman filter relies neither only on prediction steps nor only on measurements, but on a weighted combination of the two. It can be noted that asking for the meaning of $P_\infty=R$ is not, in general, a very meaningful question, because the left and right hand sides represent the covariances of different variables with different units. For example, one could always transform the state using $q_k = T\,x_k$ and obtain an equivalent state space model.
However, some insight might be gained by using your model with $H = I$ and $F=I$, and setting $P_\infty = \alpha\,R$, with $\alpha$ some positive scalar. Using this and solving $(9)$ for $Q$, $(5)$ for $K$ and $(7)$ for $P_\infty^*$ yields

$$ Q = \frac{\alpha^2}{1+\alpha} R, \\ K = \frac{\alpha}{1+\alpha} I, \\ P_\infty^* = \frac{\alpha}{1+\alpha} R. $$

Setting $\alpha=0$ recovers the solution for $Q=0$, where the Kalman filter relies more and more on the prediction step. When $\alpha\to\infty$ one gets $Q\to\alpha\,R$, $K\to I$ and $P_\infty^*\to R$. It can be noted that this approaches the solution for $R=0$, but with all covariance matrices scaled by $\alpha$: the Kalman filter relies more and more only on the measurements, since the dynamics are subject to far too much noise, making the prediction step unreliable.
It is worth noting that for all $\alpha$ it holds that $P_\infty \geq Q$ and $P_\infty^* \leq R$, but $P_\infty$ can be larger or smaller than $R$, and $P_\infty^*$ can be larger or smaller than $Q$. The constraint on $P_\infty$ can be understood by looking at $(4)$: since $P_{k-1|k-1}\geq0$ we have $P_{k|k-1} \geq Q$, or in other words the covariance of the state estimate after one prediction step is always affected by the uncertainty of the prediction step itself. The constraint on $P_\infty^*$ can intuitively be understood from the fact that if $P_\infty^* > R$ held, then one could obtain a better state estimate by just taking the measurement as the new state estimate, which would have covariance $R$.
As far as I know there is no meaningful intuition for your case with $\alpha=1$. One thing that might be relevant: instead of starting from the Riccati equation $(9)$, start with the Kalman filter equations at $(5)$ using $P_{k|k-1} = R$. This gives $K_k = I/2$, so in $(6)$ the corrected state estimate is the average of the predicted state and the measurement. The prediction and the measurement have independent distributions with the same covariance, which also intuitively explains why $(7)$ gives $P_{k|k}=P_{k|k-1}/2=R/2$. The only remaining step is $(4)$: to obtain $P_{k|k-1}=R$ again requires $Q=R/2$.
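
The relations for $Q$, $K$ and $P_\infty^*$ above, together with the single-step argument for $\alpha=1$, can be checked with a small scalar sketch (my own illustration; $R=1$ and the chosen $\alpha$ values are arbitrary):

```python
R = 1.0
for alpha in (0.25, 0.5, 1.0, 2.0, 4.0):
    Q = alpha**2 / (1 + alpha) * R
    # iterate the scalar version of (8) with F = H = 1 to steady state
    P = 1.0
    for _ in range(10_000):
        P = P - P * P / (P + R) + Q
    K = P / (P + R)                      # steady-state gain, (5)
    P_star = (1 - K) * P                 # steady-state posterior covariance, (7)
    print(alpha, P / R, K, P_star / R)   # P/R -> alpha, K -> alpha/(1+alpha), P*/R -> alpha/(1+alpha)
```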

This answer might not have given a very clear explanation of the relation between $R$ and $Q$ when $P_\infty=R$, but I hope that the explanations for the cases of $Q=0$ and $R=0$ have given you more insight into the stationary solutions (the solutions obtained by solving the algebraic Riccati equation) of the Kalman filter.

0

I don't have a complete answer to your question, but I will address why your intuition quoted below is wrong:

I'd have guessed, intuitively, if $Q=0$, I'd have $P_{\infty}=R$. As in if the states never change, I keep observing with uncertainty of $R$, I'd eventually have an estimation error covariance of $R$. My estimation won't be better than my observation error, but it won't be worse, as the states don't change and I have infinite observations.

More noisy observations add up to a more accurate estimate. Imagine you keep taking samples from $N(0,1)$ to compute the average. Each sample is a noisy observation with a standard deviation of 1, but as you keep taking more samples, the average gets closer and closer to the true mean of 0, with higher and higher confidence (law of large numbers: the standard error of the mean shrinks like $1/\sqrt{n}$).
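
A tiny simulation sketch of that averaging argument (my own illustration; the sample sizes and number of repetitions are arbitrary). The spread of the sample mean shrinks roughly like $1/\sqrt{n}$ even though each individual observation keeps standard deviation 1:

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 100, 1_000, 10_000):
    # standard deviation of the sample mean over many repetitions
    means = rng.standard_normal((1_000, n)).mean(axis=1)
    print(n, means.std())   # close to 1/sqrt(n)
```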

0

If we enforce $P_{\infty} = R$, then we get $Q=\frac{R}{2}$, but I'm not sure what this intuitively means...

To my mathematical intuition, it means that you can't freely enforce $P_{\infty} = R$, because doing so amounts to forcing one free variable ($Q$) to be a function of another ($R$).

Stepping back from that, $Q$ tells you how much information about $\mathbf x$ you lose at each time step, and $R$ tells you how much information you gain from each measurement. So intuitively, the amount of information you have about $\mathbf x$ should settle out to a function of both $Q$ and $R$ -- and that information can be read off from $P_{\infty}$. So of course (intuitively) $P_{\infty}$ must be a function of both $Q$ and $R$.
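
To make that concrete in the scalar case of the question (a small derivation of my own from the Riccati equation quoted there): writing $P_\infty(P_\infty+R)^{-1}P_\infty = Q$ with scalars gives

$$ P_\infty^2 = Q\,(P_\infty + R) \quad\Longrightarrow\quad P_\infty = \frac{Q + \sqrt{Q^2 + 4QR}}{2}, $$

which indeed depends on both $Q$ and $R$. Plugging in $Q=R/2$ gives $P_\infty = \tfrac{1}{2}\left(\tfrac{R}{2} + \tfrac{3R}{2}\right) = R$, and $Q=0$ gives $P_\infty = 0$.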