Whiteness hypothesis in Kalman filtering

Question

In the mathematical treatment of the Kalman filter, I have always read that a fundamental hypothesis is the whiteness of the process noise. I have tried to work through the steps of the Kalman filter derivation again, but I can't see where this hypothesis is crucial.

Could you help me by showing where the mathematical proof fails if I remove this hypothesis?

Thanks.

EDIT: Let me try to state my doubt more precisely.

Let's consider the formulas in Fig. 1 in this article. These two formulas have been obtained without assumptions on the process noise. Then, again without assumptions on the process noise, the author arrives at formula (5), in which the probability density $p(x_k | x_{k-1})$ appears. Taking into account the first equation of system (1), he then says that if we assume the process noise is Gaussian and white, we can write this probability density as $\mathcal{N}(f(x_{k-1}),Q)$.

It is exactly at this point that my doubt arises: it seems to me that I could have written $$p(x_k | x_{k-1})=\mathcal{N}(f(x_{k-1}),Q)$$ even if I had only assumed that the process noise is Gaussian, without whiteness, because $x_k$ is conditioned only on $x_{k-1}$ and not also on $q_{k-1},q_{k-2},\dots$.

Yes, it is also white, but relative to my question we can separate process and measurement noise and consider only the first to do our reasoning. — Nameless, Jan 15 '22 at 11:31

score 0 · Accepted Answer · answered Jan 23 '22 at 13:48


In the section of the paper mentioned in the question (just below equation 5, page 3), the authors state:

By assuming that $w_k$ is a white Gaussian noise independent of $x_k$, we have $x_k | x_{k-1} \sim \mathcal{N}(f(x_{k-1}), Q)$

The above distribution is derived from the model described in equation 1, page 2, which states that $x_{k+1} = f(x_k) + w_k$, along with other descriptions. To obtain the above distribution for a deterministic function $f(\cdot)$, three things are required: that $w_k$ is independent of $x_k$, that the distribution of $w_k$ is Gaussian with mean zero, and that the covariance matrix of $w_k$ does not vary over $k$ and remains fixed at $Q$. The whiteness of the Gaussian noise $w_k$ corresponds to the third condition, that the covariance matrix of $w_k$ is constant over $k$. This is somewhat different from the usual univariate white noise random variable: here, the white noise is a multivariate random vector.
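
The first of the three conditions can be checked numerically. The following is a minimal sketch, assuming a scalar linear $f(x) = a x$ and an AR(1) model for time-correlated noise (both are illustrative choices, not taken from the paper; the names `simulate`, `noise_state_correlation`, `a`, and `phi` are hypothetical). It estimates the correlation between $w_k$ and $x_k$ for temporally uncorrelated noise and for time-correlated noise with the same constant variance $Q$.

```python
import numpy as np

def simulate(a, Q, phi, n, rng):
    """Simulate x[k+1] = a*x[k] + w[k], where w is a zero-mean Gaussian AR(1)
    sequence w[k] = phi*w[k-1] + e[k]. Setting phi = 0 gives white noise."""
    # Innovation variance chosen so that Var(w[k]) = Q for every k.
    e = rng.normal(0.0, np.sqrt(Q * (1.0 - phi**2)), size=n)
    w = np.empty(n)
    w[0] = rng.normal(0.0, np.sqrt(Q))
    for k in range(1, n):
        w[k] = phi * w[k - 1] + e[k]
    x = np.empty(n + 1)
    x[0] = 0.0
    for k in range(n):
        x[k + 1] = a * x[k] + w[k]
    return x, w

def noise_state_correlation(x, w):
    """Sample correlation between w[k] and x[k]."""
    return np.corrcoef(w, x[:-1])[0, 1]

rng = np.random.default_rng(0)
x_white, w_white = simulate(a=0.5, Q=1.0, phi=0.0, n=50_000, rng=rng)
x_col, w_col = simulate(a=0.5, Q=1.0, phi=0.8, n=50_000, rng=rng)

corr_white = noise_state_correlation(x_white, w_white)
corr_colored = noise_state_correlation(x_col, w_col)
print(f"white noise:   corr(w_k, x_k) = {corr_white:+.3f}")
print(f"colored noise: corr(w_k, x_k) = {corr_colored:+.3f}")
```

With these settings the correlation should be close to zero for `phi = 0` and clearly positive for `phi = 0.8`. In the time-correlated case $w_k$ is therefore no longer independent of $x_k$, even though its variance is still the constant $Q$.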


joy


Suppose that the noise was not white. Then, can I write the same distribution for $x_k|x_{k-1}$, except for the fact that $Q$ now becomes $Q_k$? – Nameless Jan 23 '22 at 17:29
Anyway, I gave you the bounty, it was expiring. – Nameless Jan 23 '22 at 17:31
"Suppose that the noise was not white. Then, can I write the same distribution for $x_k | x_{k-1}$, except for the fact that $Q$ now becomes $Q_k$?" -- Of course you can. Thank you for the bounty, and let me know if you have other questions. – joy Jan 23 '22 at 17:44
@Nameless And could you please let me know whether my response is acceptable as the answer to your question, or whether you have any other questions? – joy Jan 23 '22 at 17:49

No other query, thank you. – Nameless Jan 23 '22 at 18:55
I have another question: if I can use $Q_k$ (instead of $Q$) in the case of non-white process noise, then why does the literature treat the "whiteness procedure" as necessary in this case? We find this type of analysis, for example, in "Kalman Filtering: With Real-Time Applications" by Charles K. Chui and Guanrong Chen, chapter 5. – Nameless Jan 25 '22 at 15:44
The principal issue is the estimation of parameters. If the $Q_k$ are unknown, you need to estimate them, and that adds a lot of extra parameters. Because of these excess parameters, the estimation variance would be high, if they can be estimated from the data at all (for example, in a linear model setup, you cannot uniquely estimate more than $n$ parameters from $n$ sample points). So, some restrictions are added to reduce the number of parameters. In chapter 5 of the book you mentioned, the restrictions are given in terms of $\Gamma_k$, $M_k$ and $N_k$. – joy Jan 25 '22 at 17:50
And by reducing the model so as to obtain white noise, you can apply methodology or results you have already developed or proved for white noise models. This is the significance of the first line of section 5.1 there. – joy Jan 25 '22 at 17:53
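
One common way of reducing a model so as to obtain white noise is state augmentation. The sketch below assumes a linear model $x_{k+1} = A x_k + w_k$ whose noise follows $w_{k+1} = M w_k + \beta_k$ with white $\beta_k$. The function `augment` and all numerical values are hypothetical, and the notation only loosely echoes the symbols mentioned above; it is not a transcription of the book's formulation. Stacking $z_k = [x_k; w_k]$ gives a system driven only by the white sequence $\beta_k$.

```python
import numpy as np

def augment(A, M, Q_beta):
    """Build the augmented transition matrix and process-noise covariance
    for z = [x; w], where x[k+1] = A x[k] + w[k] and w[k+1] = M w[k] + beta[k]."""
    n = A.shape[0]
    zeros = np.zeros((n, n))
    A_aug = np.block([[A, np.eye(n)],
                      [zeros, M]])
    # Only the w-part of z is driven directly by the white noise beta.
    Q_aug = np.block([[zeros, zeros],
                      [zeros, Q_beta]])
    return A_aug, Q_aug

# Illustrative values (assumptions, not from the book or the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
M = 0.8 * np.eye(2)
Q_beta = 0.01 * np.eye(2)
A_aug, Q_aug = augment(A, M, Q_beta)

# One step of the augmented system reproduces one step of the original pair.
x = np.array([1.0, 2.0])
w = np.array([0.3, -0.1])
beta = np.array([0.05, 0.02])
z_next = A_aug @ np.concatenate([x, w]) + np.concatenate([np.zeros(2), beta])
print(z_next[:2], A @ x + w)      # next state x
print(z_next[2:], M @ w + beta)   # next colored-noise value w
```

A standard white-noise Kalman filter can then be run on $z_k$ with `A_aug` and `Q_aug`, at the cost of a larger state vector.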
