Hidden Markov Model with transition probabilities modeled by an exponential distribution

Question

I'm looking at implementing an algorithm described in a paper, but I'm having trouble understanding how the transition probabilities for a Hidden Markov Model are defined.

In the first sections, I have segmented an image into a number of areas, each denoted as "text" or "gap". The algorithm then uses an HMM to refine these regions, and I'm stuck in the section of the paper that describes the parameters for the HMM.

It starts out easy enough by defining two states (c₀ = text, c₁ = gap) and their initial probabilities π₀, π₁ both being 0.5, but then the transition probabilities are described and there's something I'm missing. It goes:

"The transition probabilities are modeled by an exponential distribution with parameters m_j, j ∈ {0,1}, the mean height of each region for the whole document image, as follows

a_jj(i) = P(s_{[h,h+H_i]} = c_j | s_{[h-H_{i-1},h]} = c_j)

= exp(-H_i/m_j), j ∈ {0,1}

where H_i denotes the height of the i-th area and s_{[h,h+H_i]} the state of the region that starts at height h and extends up to h+H_i. The transition probabilities of a₀₁(i) and a₁₀(i) result as the difference of a₀₀(i) and a₁₁(i) from 1, respectively."

From Papavassiliou, Stafylakis, Katsouros, Carayannis: "Handwritten document image segmentation into text lines and words" (Pattern Recognition 43 (2010) pp. 369-377)
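
To make the quoted formula concrete, here is a minimal Python sketch of one way to read it: one 2x2 matrix per area i, with a_jj(i) = exp(-H_i/m_j) on the diagonal and the off-diagonal entries taken as the difference from 1. The function name, the argument names `heights` and `mean_heights`, and the example numbers are all hypothetical and not taken from the paper.

```python
import numpy as np

def transition_matrices(heights, mean_heights):
    """Return an array of shape (n, 2, 2); entry [i] is the matrix for area i.

    heights      -- H_i, the height of each area (hypothetical input name)
    mean_heights -- (m_0, m_1), mean heights of text and gap regions
    """
    H = np.asarray(heights, dtype=float)
    m = np.asarray(mean_heights, dtype=float)
    # stay[i, j] = a_jj(i) = exp(-H_i / m_j)
    stay = np.exp(-H[:, None] / m[None, :])
    A = np.empty((len(H), 2, 2))
    A[:, 0, 0] = stay[:, 0]          # a_00(i)
    A[:, 0, 1] = 1.0 - stay[:, 0]    # a_01(i) = 1 - a_00(i)
    A[:, 1, 0] = 1.0 - stay[:, 1]    # a_10(i) = 1 - a_11(i)
    A[:, 1, 1] = stay[:, 1]          # a_11(i)
    return A

# Made-up example: three areas, m_0 = 30 (text), m_1 = 10 (gap).
A = transition_matrices([30.0, 12.0, 28.0], mean_heights=(30.0, 10.0))
print(A.shape)        # one 2x2 matrix per area
print(A.sum(axis=2))  # each row of each matrix sums to 1
```

Under this reading every matrix is row-stochastic by construction, but there are as many matrices as there are areas rather than a single one.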

I can easily do the exp() calculations with the area data that I have, but I don't understand how this becomes a 2x2 matrix. As I understand the formula, the result depends on two variables, i (the area index) and j (0 or 1), but how do I select i, that is, the area it describes?

Update: Additionally, the regions are alternately text and gap, so the P() formula above should equal 0.0, as there are no two consecutive regions i-1, i that have the same state. This would give a transition probability matrix of:

$\begin{bmatrix} 0.0 & 1.0 \\ 1.0 & 0.0 \end{bmatrix}$

It will always transition to the opposite state at each point in time – which definitely does not seem useful.

There's most likely something that I'm misunderstanding in the above; my theoretical grounding isn't very solid.

Note: The transition probability matrix has to be 2 x 2 (it describes the probability of transition between states, including staying in the same state, so both dimensions are equal to the number of states).

The transition matrix is going to be $$\begin{pmatrix}a_{00} & 1-a_{00} \\ 1-a_{11} & a_{11} \end{pmatrix}$$ — Kirill, Oct 03 '14 at 00:43
Yes, but what are a00 and a11? I only have the functions a11(i) and a00(i). — meide, Oct 03 '14 at 00:53
Um, $a_{00}$ is $a_{00}(i)$. The transition matrix depends on $i$. — Kirill, Oct 03 '14 at 00:55
But all the implementations of HMMs that I've seen require a matrix of calculated values. I cannot calculate the values without knowing i. — meide, Oct 03 '14 at 00:58
Put differently: Without knowing i, I can create n 2x2 matrices, where n is the number of areas. How do I go from this set of matrices to selecting the "final" matrix to use in the HMM? Also, how is there a non-zero probability that two consecutive items in an alternating sequence have the same value? — meide, Oct 03 '14 at 10:40
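
If the transition matrix really does depend on i, as the comments above suggest, one standard option is to skip the "final" matrix altogether and run Viterbi decoding with a different matrix at each step (a time-inhomogeneous HMM). The sketch below is only an illustration of that general technique and is not claimed to be the paper's implementation. The function name, the emission likelihoods `B`, and all numbers are hypothetical, since the quoted passage does not define an emission model.

```python
import numpy as np

def viterbi_inhomogeneous(pi, A, B):
    """Most likely state path when the transition matrix changes per step.

    pi -- initial state probabilities, shape (S,)
    A  -- A[t] is used for the transition from step t-1 into step t,
          shape (T, S, S); A[0] is never used
    B  -- B[t, s] = likelihood of observation t in state s, shape (T, S)
          (hypothetical: the emission model is an assumption here)
    """
    pi, A, B = (np.asarray(x, dtype=float) for x in (pi, A, B))
    T, S = B.shape
    with np.errstate(divide="ignore"):        # log(0) = -inf is acceptable
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = log_pi + log_B[0]                 # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)        # best predecessor per step and state
    for t in range(1, T):
        scores = delta[:, None] + log_A[t]    # scores[prev, cur]
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):             # follow the back-pointers
        path[t - 1] = back[t, path[t]]
    return path

# Made-up example with three steps and a different matrix per step.
pi = [0.5, 0.5]
A = [[[1.0, 0.0], [0.0, 1.0]],   # A[0]: placeholder, unused
     [[0.9, 0.1], [0.1, 0.9]],
     [[0.5, 0.5], [0.2, 0.8]]]
B = [[0.9, 0.1], [0.9, 0.1], [0.01, 0.99]]
path = viterbi_inhomogeneous(pi, A, B)
print(path)
```

Most library implementations expect a single fixed matrix, so a per-step matrix like this generally means writing the recursion by hand, as above.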
