$m_i$ is the true, unobserved label for pixel $i$ (0 for background, 1 for the class being segmented: building, road, etc.).
$\tilde{m}_i$ is the observed (possibly noisy) label for pixel $i$.
$\hat{m}_i$ is the model's prediction for pixel $i$.
$\theta_0$ and $\theta_1$ are the probabilities of false positives and false negatives in the labels:
\begin{equation} \label{thetas}
\begin{split}
\theta_0 &= p(\tilde{m}_i = 1 | m_i = 0) \\
\theta_1 &= p(\tilde{m}_i = 0 | m_i = 1)
\end{split}
\end{equation}
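As an illustration, this noise model can be simulated by flipping the pixels of a clean mask with probabilities $\theta_0$ and $\theta_1$. A minimal sketch (the function name `corrupt` and the list-of-pixels representation are illustrative, not part of the method):

```python
import random

def corrupt(mask, theta0, theta1, rng):
    """Flip each pixel of a clean binary mask according to the noise model:
    a background pixel (0) becomes 1 with probability theta0 (false positive),
    a foreground pixel (1) becomes 0 with probability theta1 (false negative)."""
    noisy = []
    for m in mask:
        if m == 0:
            noisy.append(1 if rng.random() < theta0 else 0)
        else:
            noisy.append(0 if rng.random() < theta1 else 1)
    return noisy
```

Counting the flips on a large simulated mask recovers $\theta_0$ and $\theta_1$ as the empirical false-positive and false-negative rates.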
We no longer try to minimize the difference between the label and the prediction ($\epsilon = \tilde{m}_i - \hat{m}_i$); instead we minimize the difference between the probability that the true, unobserved label is 1 and the prediction (for an input $s$: $\epsilon = p(m_i = 1 | \tilde{m}_i, s) - \hat{m}_i$).
Bayes' rule (dropping the conditioning on the input $s$ for brevity) gives us:
\begin{equation}
p(m_i = 1 | \tilde{m}_i) - \hat{m}_i = \frac{p(\tilde{m}_i | m_i = 1) * p(m_i=1)}{p(\tilde{m}_i)} -\hat{m}_i
\end{equation}
and since $m_i$ can only be $0$ or $1$,
\begin{equation}
p(\tilde{m}_i) = p(\tilde{m}_i | m_i=1) * p(m_i=1) + p(\tilde{m}_i | m_i=0) * p(m_i=0)
\end{equation}
The definitions of $\theta_0$ and $\theta_1$, together with the Bernoulli distribution, give these two likelihoods:
\begin{equation}
\begin{split}
p(\tilde{m}_i | m_i=0) & = \theta_0^{\tilde{m}_i} * (1-\theta_0)^{(1-\tilde{m}_i)} \\
p(\tilde{m}_i | m_i=1) & = \theta_1^{(1-\tilde{m}_i)} * (1-\theta_1)^{\tilde{m}_i}
\end{split}
\end{equation}
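These two expressions just encode the case analysis over $\tilde{m}_i \in \{0, 1\}$, which is easy to check numerically (the function names `lik_bg` and `lik_fg` are illustrative):

```python
def lik_bg(m_tilde, theta0):
    # p(m_tilde | m = 0) = theta0^m_tilde * (1 - theta0)^(1 - m_tilde)
    return theta0 ** m_tilde * (1 - theta0) ** (1 - m_tilde)

def lik_fg(m_tilde, theta1):
    # p(m_tilde | m = 1) = theta1^(1 - m_tilde) * (1 - theta1)^m_tilde
    return theta1 ** (1 - m_tilde) * (1 - theta1) ** m_tilde
```

At $\tilde{m}_i = 1$ these reduce to $\theta_0$ (a false positive on a background pixel) and $1 - \theta_1$ (a correctly labeled foreground pixel), as expected.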
Since $p(m_i = 0) = 1 - p(m_i = 1)$ and, using the prediction as the prior, $p(m_i = 1) = \hat{m}_i$, substituting $\tilde{m}_i = 0$ into the likelihoods gives $p(\tilde{m}_i = 0 | m_i = 0) = 1 - \theta_0$ and $p(\tilde{m}_i = 0 | m_i = 1) = \theta_1$, so
\begin{equation}
\epsilon = \frac{\theta_1 * \hat{m}_i}{\theta_1 * \hat{m}_i + (1- \theta_0) * (1-\hat{m}_i)} - \hat{m}_i
\end{equation}
and if $\tilde{m}_i = 1$
\begin{equation}
\epsilon = \frac{(1-\theta_1) * \hat{m}_i}{(1-\theta_1) * \hat{m}_i + \theta_0 * (1-\hat{m}_i)} -\hat{m}_i
\end{equation}
This is what is plotted in Figure 2.
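Putting the derivation together, the posterior and the error term fit in a few lines (assuming, as above, that the prediction $\hat{m}_i$ is used as the prior $p(m_i = 1)$; the function names are illustrative):

```python
def posterior(m_hat, m_tilde, theta0, theta1):
    """p(m_i = 1 | m_tilde_i) via Bayes' rule, with the prediction
    m_hat used as the prior p(m_i = 1)."""
    lik1 = theta1 ** (1 - m_tilde) * (1 - theta1) ** m_tilde  # p(m_tilde | m = 1)
    lik0 = theta0 ** m_tilde * (1 - theta0) ** (1 - m_tilde)  # p(m_tilde | m = 0)
    return lik1 * m_hat / (lik1 * m_hat + lik0 * (1 - m_hat))

def epsilon(m_hat, m_tilde, theta0, theta1):
    """Error between the posterior foreground probability and the prediction."""
    return posterior(m_hat, m_tilde, theta0, theta1) - m_hat
```

With no label noise ($\theta_0 = \theta_1 = 0$) this reduces to the usual error: $\epsilon = -\hat{m}_i$ when $\tilde{m}_i = 0$ and $\epsilon = 1 - \hat{m}_i$ when $\tilde{m}_i = 1$.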