Sinkhorn theorem for doubly stochastic matrices

Question

I was reading something about doubly stochastic matrices and got stuck while reading the original proof of the uniqueness part of the Sinkhorn theorem. I'm not able to understand the logic. Could someone help me in figuring it out? I state below the theorem and the proof as written in the original paper.

Theorem: To a given strictly positive $N \times N$ matrix $A$ corresponds exactly one doubly stochastic matrix $T=D_1AD_2$ where $D_1$ and $D_2$ are diagonal matrices with positive diagonals. The matrices $D_1$ and $D_2$ are themselves unique up to a scalar factor.

Uniqueness proof: if there exist two different pairs of diagonal matrices $C_1, C_2$ and $D_1, D_2$ such that both $C_1 A C_2$ and $D_1 A D_2$ are doubly stochastic, then this means that there exist a positive doubly stochastic matrix $P$ and matrices

$$B_1 = \mbox{diag}[b_{11}, b_{12},\dots, b_{1N}]$$

and

$$B_2 = \mbox{diag}[b_{21}, b_{22}, \dots, b_{2N}]$$

which are not multiple of identity matrix, for which $B_1 P B_2$ is also doubly stochastic. But this is impossible since by convexity one obtains:

$$\min_jb_{2j} \le \frac{1}{b_{1i}} \le \max_j b_{2j}$$

and

$$\min_ib_{1j}\le \frac{1}{b_{2j}} \le \max_jb_{1i}$$

which leads to a contradiction if $b_{1i} b_{2j} \ne 1$ for some $i$ and $j$. It follows that $C_1 = p D_1$, $C_2 = p^{-1} D_2$ for some $p > 0$.

What I do not understand: why from the existence of $C_1,C_2$ and $D_1,D_2$ it will exist $P$, $B_1$ and $B_2$? What is the convexity argument? And why there is a contradiction? And how does it imply the thesis?

Thanks.

Related: https://math.stackexchange.com/q/179940/339790 – Rodrigo de Azevedo Mar 06 '21 at 09:34 — Rodrigo de Azevedo, Mar 06 '21 at 09:34

score 3 · Accepted Answer · answered Oct 12 '15 at 07:15

3

Clearly, the author means $P=D_1AD_2,\ B_1=C_1D_1^{-1},\ B_2=D_2^{-1}C_2$.

I'm not sure what he/she exactly meant by "convexity", but here is an alternative proof. Let $b_{1i_1} = \min_i b_{1i}$ and $b_{2j_2} = \max_j b_{2j}$. Observe that the $(i,j)$-th entry of $B_1PB_2$ is $b_{1i}b_{2j}p_{ij}$. Since $B_1PB_2$ is row stochastic, it follows that $\sum_j b_{1i}b_{2j}p_{ij}=1$ and in turn $\sum_j b_{2j}p_{ij}=\frac1{b_{1i}}$ for every $i$. Therefore, $$ b_{2j_2} = b_{2j_2}\sum_j p_{i_1j} \ge \sum_j b_{2j}p_{i_1j} = \frac1{b_{1i_1}}.\tag{1} $$ By similar reasoning, if we consider the column stochasticity of $B_1PB_2$ instead, we get $$ b_{1i_1} = b_{1i_1}\sum_i p_{ij_2} \le \sum_i b_{1i}p_{ij_2} = \frac1{b_{2j_2}}.\tag{2} $$ Therefore $b_{1i_1}b_{2j_2}=1$ and equalities must hold in both $(1)$ and $(2)$. Consequently, all $b_{2j}$s are identical to each other and all $b_{1i}$s are equal. That is, $B_1,B_2$ are scalar multiples of the identity matrix (and in fact, $B_2=B_1^{-1}$). Since $B_1=C_1D_1^{-1}$ and $B_2=D_2^{-1}C_2$, it follows that $D_1,D_2$ are respectively scalar multiples of $C_1$ and $C_2$.

answered Oct 12 '15 at 07:15

user1551

149,263

Hi, thanks so much for the answer. It gives me more insight even if I don't see your first sentence so clear. The author calls $T=D_1AD_2$, while P is a positive stochastic matrix that should exists by the fact that there exist $T=D_1AD_2$ and $R=C_1AC_2$ (why?). We have to show that $T=R$ and $Cs=Ds$ up to a scalar factor. $B_1PB_2$ is doubly stochastic but not necessary strictly positive as P (why?). It seems to me that you simplified a little bit the demonstration. – alecsphys Oct 12 '15 at 21:02
@user26945 You can use the fact that $B_2=B_1^{-1}$ to directly verify that $D_1AD_2=C_1AC_2$. Since $D_1,D_2$ are positive diagonal matrices and $A$ is entrywise positive, it follows that $D_1AD_2$ is positive. As to the existence of $P$ (or $D_1,D_2$), of course I haven't demonstrated it, because all your question asks is about the uniqueness part. – user1551 Oct 13 '15 at 09:12
We are negating the thesis assuming that there exist $C_1,C_2$ and $D_1,D_2$ different but for which $P=D_1AD_2=C_1AC_2$. Thanks to your reply we can easily show that there exist at least one couple $B_1,B_2$ for which $B_1PB_2$ is doubly stochastic and $B_1B_2=I$. So we can write the following chain: $B_1PB_2=B_1D_1AD_2B_2=B_1C_1AC_2B_2$, so $B_1D_1=B_1C_1$ and $D_2B_2=C_2B_2$ that means $D_1=pC_1$ and $D_2=(1/p)B_2$ contraddicting the negation of the hypothesis. So we have done. What I'm missing now is: we're assuming $P=D_1AD_2=C_1AC_2$, but why the negation is not $P=D_1AD_2$ $R=C_1AC_2$ – alecsphys Oct 15 '15 at 11:44
Sorry, but I think the better way to show the result is: by negating the thesis we have $P=D_1AD_2$, $R=C_1AC_2$. So $A=D_1^{-1}PD_2^{-1}$ and $R=C_1D_1^{-1}PD_2^{-1}C_2$. So we can call $B_1=C_1D_1^{-1}$ and $B_2=D_2^{-1}C_2$ and because R is doubly stochastic, thanks to your demonstration we have $B_1B_2=I$ and after substituing the previous expressions we get $D_1=pC_1$ and $D_2=p^{-1}C_2$. This solve all my doubts and I can pass to the existence part of the theorem. Thanks again! – alecsphys Oct 15 '15 at 12:47
1

For anyone curious, the author of this paper says "by convexity" a few times to refer to the fact $\text{min} x_{i}\le \sum_{i=1}^{n}x_{i}p_{i}\le \text{max}{i}x{i}$ when $\sum_{}^{}p_{i}x_{i}$ is a convex combination of $x_{i}$. So I think that this solution is the same as what Sinkhorn had in mind in his paper. – user3281410 Mar 15 '18 at 05:29

score 0 · Answer 2 · answered Dec 30 '23 at 12:50

Let $P = (p_{ij})$ be a doubly stochastic matrix, and $C = (c_{ij})$ another matrix. Consider their Hadamard product $P\odot C=(p_{ij} c_{ij})$. We have

$$\sum_{j} p_{ij} c_{ij}\le \max_{j} c_{ij}$$ and so

$$\min_i (\sum_{j} p_{ij} c_{ij}) \le \min_i \max_i c_{ij} $$

Similarly

$$\max_j \min_i c_{ij} \le \max_j (\sum_{i} p_{ij} c_{ij})$$

Now let's assume moreover that

$$\min_i (\sum_{j} p_{ij} c_{ij}) = \max_j (\sum_{i} p_{ij} c_{ij}) ( = s) $$

( in general, it is $\le$, since sum by rows equals sum by columns). This is equivalent to all the rows and all the columns of $(p_{ij} c_{ij})$ has the same sum $s$. We conclude

$$\max_j \min_i c_{ij} \le s \le \min_i \max_j c_{ij}$$

Now the above inequality for the extremes is not a new one. However, let's assume moreover that

$$\max_j \min_i c_{ij}= \min_i \max_j c_{ij}$$

in other words, the matrix $(c_{ij})$ has a saddle point $(i_0, j_0)$ ( and also that all $p_{ij}\ne 0$, so $>0$). From the above we conclude thas the $i_0$ row and the $j_0$ column are constant ( $=s$).

Now we can prove uniqueness. Assume that $P$ is doubly stochastic with all entries $>0$, $C= (\alpha_i \cdot \beta_j)$, with $\alpha_i$, $\beta_i\ge 0$, and the matrix $P\odot C$ has all the row sums and column sums equal. Now the matrix $C$ has a saddle point (indeed, $\min_i \max_j \alpha_i \beta_j = \min_i \alpha_i \cdot \max_j \beta_j$), so $C$ has one constant row and one constant column. We conclude that all of the entries of $C$ are the same.

Sinkhorn theorem for doubly stochastic matrices

2 Answers2