I am learning about hidden Markov models, and I have some trouble understanding how independence is used in the following calculation:

$$\begin{aligned} \mathbb{P}(O(t) \mid y(t), \lambda) &= \prod_{j=1}^{t} \mathbb{P}\left(O_{j} \mid y_{j}, \lambda\right) \\ &= \prod_{j=1}^{t} b_{y_{j}}\left(O_{j}\right) \\ &= b_{y_{1}}\left(O_{1}\right) b_{y_{2}}\left(O_{2}\right) \cdots b_{y_{t}}\left(O_{t}\right) \end{aligned}$$

Here, $O(t)$ denotes the observed variables from time $1$ to time $t$, $y(t)$ is the hidden Markov chain, and $\lambda$ is the model's parameters. How do we get the first product? That is what I don't understand. By the definition of conditional probability,

$$\mathbb{P}(O(t) \mid y(t), \lambda) = \frac{\mathbb{P}(O_1,\ldots,O_t,y_1,\ldots,y_t \mid \lambda)}{\mathbb{P}(y_1,\ldots,y_t \mid \lambda)}.$$

How do I use the independence properties of the hidden Markov chain here to obtain the result?

1 Answer

The hidden Markov model can be drawn as a directed acyclic graph. In that graph, each observed state is d-separated from the other observed states given the hidden states, i.e., the observations are conditionally independent. Given the hidden state at time $t$, the observation at time $t$ is also conditionally independent of the other hidden states. For more on d-separation, any introduction to probabilistic graphical models will do.

We get $$p(O\mid y)=\prod_{i=1}^{n} p(O_i\mid y)=\prod_{i=1}^{n} p(O_i\mid y_i),$$

where the first equality comes from the conditional independence of the observations given the hidden states, and the second from the conditional independence of $O_i$ and $y_j$, $j\ne i$, given $y_i$.
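To make this concrete, here is a minimal numerical sketch in Python. All numbers and names (`pi`, `A`, `B`, `y`, `O`) are made up for illustration, not taken from the question; it just evaluates $p(O\mid y)$ as the product of per-step emission probabilities $b_{y_i}(O_i)$:

```python
import numpy as np

# Illustrative two-state HMM (hypothetical numbers, not from the post).
pi = np.array([0.6, 0.4])                 # initial distribution p(y_1)
A = np.array([[0.7, 0.3],                 # transition matrix p(y_i | y_{i-1})
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],                 # emission matrix b_j(o) = p(O = o | y = j)
              [0.3, 0.7]])

y = [0, 0, 1]  # a fixed hidden path
O = [0, 1, 1]  # an observed sequence

# Given the hidden path, p(O | y) is just the product of the
# per-step emission probabilities, one factor per time step.
p_O_given_y = np.prod([B[yi, oi] for yi, oi in zip(y, O)])
print(p_O_given_y)  # 0.9 * 0.1 * 0.7 = 0.063
```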

It is also not necessary to be familiar with graph theory to get this result; we can use the definition of a hidden Markov model together with the rules of conditional probability:

$$\frac{p(O_1,\ldots,O_n,y_1,\ldots,y_n)}{p(y_1,\ldots,y_n)}=\frac{p(y_1)\,p(O_1\mid y_1)\prod_{i=2}^n p(y_i\mid y_{i-1})\,p(O_i\mid y_i)}{p(y_1)\prod_{i=2}^n p(y_i\mid y_{i-1})}.$$

Cancelling the common factor $p(y_1)\prod_{i=2}^n p(y_i\mid y_{i-1})$ leaves $\prod_{i=1}^n p(O_i\mid y_i)$, which is the desired result.
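The same cancellation can be checked numerically. Continuing the hypothetical example above (again, all numbers are illustrative), the ratio of the joint to the marginal of the hidden path agrees with the emission product:

```python
import numpy as np

# Same illustrative HMM as in the earlier sketch (hypothetical numbers).
pi = np.array([0.6, 0.4])                # p(y_1)
A = np.array([[0.7, 0.3], [0.2, 0.8]])   # p(y_i | y_{i-1})
B = np.array([[0.9, 0.1], [0.3, 0.7]])   # b_j(o) = p(O = o | y = j)
y, O = [0, 0, 1], [0, 1, 1]

# Joint p(O_1..n, y_1..n) from the HMM factorization.
joint = pi[y[0]] * B[y[0], O[0]]
for i in range(1, len(y)):
    joint *= A[y[i - 1], y[i]] * B[y[i], O[i]]

# Marginal p(y_1..n): the same factors, minus the emissions.
marginal = pi[y[0]]
for i in range(1, len(y)):
    marginal *= A[y[i - 1], y[i]]

# After the common factors cancel, only the emission terms remain.
print(joint / marginal)                              # 0.063
print(np.prod([B[yi, oi] for yi, oi in zip(y, O)]))  # 0.063, equal
```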
