I am currently studying Markov Reward Processes, and I would like to gain a deeper understanding of how the Bellman equation relates to them.
I learned the Bellman equation in the following form:
$v = R + \gamma Pv$
Here, $R$ is the reward vector, $P$ is the transition probability matrix, $\gamma$ is the discount factor, and $v$ is the value vector we are trying to find. The equation can be rearranged to
$v = (I - \gamma P)^{-1} R$
However, this rearrangement is only possible if $I - \gamma P$ is invertible. Is there a concise way to state the conditions under which this is the case?
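
For concreteness, here is a minimal numerical sketch of the rearrangement in Python with NumPy. The 3-state chain, the entries of `P` and `R`, and `gamma = 0.9` are made up for illustration only. In this particular case the linear solve goes through, but that does not tell me when it works in general.

```python
import numpy as np

# Hypothetical 3-state MRP; P, R, and gamma are made-up illustration values.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.0, 1.0],   # state 3 is absorbing
])
R = np.array([1.0, -2.0, 0.0])
gamma = 0.9

# Solve (I - gamma * P) v = R instead of forming the inverse explicitly.
# np.linalg.solve raises LinAlgError if the matrix is singular.
A = np.eye(3) - gamma * P
v = np.linalg.solve(A, R)

# Check that v also satisfies the original form v = R + gamma * P v.
residual = np.max(np.abs(v - (R + gamma * P @ v)))
print("v =", v)
print("max residual =", residual)
```

Solving the linear system directly is used here rather than `np.linalg.inv`, since it gives the same $v$ without computing the inverse.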