
Problem:

Given $\alpha\in\mathbb{R}^+$, $N\in \mathbb{Z}^+$ and $D\in \{N, N+1, N+2, \cdots\}$, a random matrix $\mathbf{A}$ is generated by the following steps:

$(1)$ Randomly select $N$ distinct numbers (uniformly, without replacement) from $\{1,2,\cdots,D\}$ to form a sequence $p=\{p_i\}_{i=1}^N$.

$(2)$ Then calculate $\mathbf{A}=[a_{ij}]_{N\times N}$, where $a_{ij}=e^{-\alpha |p_i - p_j|}$.

Please prove or disprove the following proposition:

$\mathbf{A}$ converges to $\mathbf{I}_N$ in probability, i.e., for any $\epsilon>0$ and any choice of norm $\|\cdot\|$, we have $$ \mathbb{P}[\|\mathbf{A}-\mathbf{I}_N\|\geq\epsilon]\to0\quad(D\rightarrow \infty). $$


My Efforts:

I am not sure how to start.

I do know that the diagonal elements of $\mathbf{A}$ are all ones, since $|p_i-p_i|=0$.

I also know that all elements of $\mathbf{A}$ lie in $(0,1]$ and that $\mathbf{A}$ is symmetric.

Intuitively, I expect that as $D$ increases, the pairwise distances $|p_i-p_j|$ tend to grow, so the off-diagonal entries $a_{ij}$ should become smaller and smaller.

I also wrote the following Python program for numerical validation:

import numpy as np
import random
from scipy import spatial

alpha = 1
N = 10
I = np.eye(N)
for D in range(N, 10000):
    MSE = 0.0
    for i in range(100):
        # sample N distinct positions from {1, ..., D} without replacement
        p = np.array(random.sample(range(1, D + 1), N)).reshape(N, 1)
        # build A with a_ij = exp(-alpha * |p_i - p_j|)
        A = np.exp(-alpha * spatial.distance.cdist(p, p))
        MSE += np.sum((A - I) ** 2.0)
    MSE /= (100 * N * N)  # mean squared entrywise error over 100 trials
    print(MSE)

The output shows that as $D$ increases, the mean squared error between $\mathbf{A}$ and $\mathbf{I}_N$ decreases toward zero:

0.027683220252563596
0.02508590350202309
0.02317795057344325
...
0.0001934704436327538
0.00032059290537374806
0.0003270223508894337
...
5.786435956425624e-05
1.1065792791574203e-05
5.786469182583059e-05

How can I prove or disprove the proposition by rigorously analysing the limit $D\rightarrow \infty$?

BinChen

1 Answer


The (original, now edited out of the question) claim that $A=I_N$ with high probability is false.

For any $j\neq k$, the $(j,k)$th entry of $I_N$ is $0$. But the $(j,k)$th entry of $A$ is $e^{-\alpha|p_j-p_k|}$. Thus \begin{align*} \mathbb{P}[{A=I_N}]&\leq\mathbb{P}[{0=e^{-\alpha|p_j-p_k|}}] \\ &=\mathbb{P}[{\infty=\alpha|p_j-p_k|}] \\ &=0 \end{align*} since $\alpha$ is a constant and $|p_j-p_k|$ is a.s. finite.
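As a quick numerical sanity check of this (a minimal sketch reusing the question's setup; the values of alpha, N, and D below are arbitrary choices):

import numpy as np
import random
from scipy import spatial

alpha, N, D = 1.0, 10, 1000  # arbitrary illustrative parameters
p = np.array(random.sample(range(1, D + 1), N)).reshape(N, 1)
A = np.exp(-alpha * spatial.distance.cdist(p, p))

# every off-diagonal entry is exp(-alpha * |p_j - p_k|) >= exp(-alpha * (D - 1)) > 0,
# so no realization of A is exactly I_N
off_diag = A[~np.eye(N, dtype=bool)]
print(off_diag.min() >= np.exp(-alpha * (D - 1)))  # True
print(np.array_equal(A, np.eye(N)))                # False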

With that said, $A$ does converge to $I_N$ in probability: for any $\epsilon>0$ and any choice of norm $\|\cdot\|$, $$\mathbb{P}[\|A-I_N\|\geq\epsilon]\to0\quad(D\to\infty).$$

To see this, note that all norms are equivalent on the finite-dimensional vector space of $N\times N$ matrices; I will choose the entrywise $\ell^1$ norm $$\|M\|=\sum_{j,k}{|M_{j,k}|}$$ (for matrices with nonnegative entries, such as $A-I_N$ here, this coincides with the $\infty\to1$ operator norm).

Then \begin{align*} \|A-I_N\|&=\sum_{j,k}{|(A-I_N)_{j,k}|} \\ &=\sum_{j<k}{2e^{-\alpha|p_j-p_k|}} \tag{1} \end{align*} since the diagonal entries of $A-I_N$ vanish and the off-diagonal part is symmetric.
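As a numerical cross-check of $(1)$ (a minimal sketch reusing the question's setup; the parameters are arbitrary choices):

import numpy as np
import random
from scipy import spatial

alpha, N, D = 1.0, 10, 1000  # arbitrary illustrative parameters
p = np.array(random.sample(range(1, D + 1), N)).reshape(N, 1)
A = np.exp(-alpha * spatial.distance.cdist(p, p))

lhs = np.abs(A - np.eye(N)).sum()  # entrywise l1 norm of A - I_N
rhs = sum(2 * np.exp(-alpha * abs(p[j, 0] - p[k, 0]))
          for j in range(N) for k in range(j + 1, N))
print(np.isclose(lhs, rhs))  # True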

Now, pick your favorite function $f(D)$ that is $\omega(1)$ (i.e., grows without bound) but $o(D)$ (for example, $f(D)=\sqrt{D}$). Conditioned on $p_j$, the sample $p_k$ is uniform over the remaining $D-1$ values, of which at most $2f(D)$ lie within distance $f(D)$ of $p_j$; hence $$\mathbb{P}[{|p_j-p_k|\leq f(D)}]\leq\frac{2f(D)}{D-1}\to0$$ On the complementary event, $|p_j-p_k|>f(D)$, so $$2e^{-\alpha|p_j-p_k|}\leq2e^{-\alpha f(D)}\to0$$ with probability arbitrarily close to $1$.
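To see this bound in action, here is a quick Monte Carlo estimate (a sketch under the arbitrary choice $f(D)=\sqrt{D}$; the trial count is likewise arbitrary):

import random

# Monte Carlo check of P[|p_j - p_k| <= f(D)] <= 2 f(D) / (D - 1),
# under the arbitrary choice f(D) = sqrt(D)
D, trials = 10000, 100000
f = D ** 0.5
hits = 0
for _ in range(trials):
    pj, pk = random.sample(range(1, D + 1), 2)  # without replacement
    if abs(pj - pk) <= f:
        hits += 1
print(hits / trials, "<=", 2 * f / (D - 1))  # empirical frequency vs. bound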

The value $(1)$ is a sum of $\binom{N}{2}$ such terms, and $\binom{N}{2}$ is finite and independent of $D$. By a union bound, the probability that some pair satisfies $|p_j-p_k|\leq f(D)$ is at most $\binom{N}{2}\frac{2f(D)}{D-1}\to0$; on the complementary event every term is at most $2e^{-\alpha f(D)}$, so $$\|A-I_N\|\leq2\binom{N}{2}e^{-\alpha f(D)}\to0.$$ Hence $\mathbb{P}[\|A-I_N\|\geq\epsilon]\to0$ for every $\epsilon>0$, as claimed.
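Putting it together, one can also estimate $\mathbb{P}[\|A-I_N\|\geq\epsilon]$ directly by simulation (a minimal sketch; alpha, N, eps, the trial count, and the grid of $D$ values are arbitrary choices):

import numpy as np
import random
from scipy import spatial

alpha, N, eps, trials = 1.0, 10, 0.5, 2000  # arbitrary illustrative parameters
for D in [20, 100, 1000, 10000, 100000]:
    exceed = 0
    for _ in range(trials):
        p = np.array(random.sample(range(1, D + 1), N)).reshape(N, 1)
        A = np.exp(-alpha * spatial.distance.cdist(p, p))
        if np.abs(A - np.eye(N)).sum() >= eps:
            exceed += 1
    print(D, exceed / trials)  # empirical P[||A - I_N|| >= eps]; tends to 0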

Jacob Manaker
  • Hi @JacobManaker, thanks for your help! I think my previous statement of the proposition was not accurate; the correct one should be $\mathbb{P}[\|A-I_N\|\geq\epsilon]\to0$, as you mentioned. I also do not know how to prove or disprove $\mathbb{P}[\|A-I_N\|\geq\epsilon]\to0$. Could you please provide some ideas about that? – BinChen Sep 08 '22 at 01:59
  • Hi @JacobManaker, I have edited the problem to make the statements accurate. – BinChen Sep 08 '22 at 02:04
  • 1
    @BinChen: In general, it's considered bad form here to edit a question in a way that invalidates existing answers (but no harm, no foul in this case). I've updated my answer to include more details for the problem you actually meant. – Jacob Manaker Sep 09 '22 at 04:16
  • Hi @JacobManaker, thanks for your kind reminder! I understand what you mean, and I will pay more attention to this; such problematic edits will not happen again. – BinChen Sep 09 '22 at 04:53
  • Hi @JacobManaker, thanks for providing more details! I can partially understand your answer now; I think your introduction of $f(D)$ is the key step. I also have three questions, as follows: – BinChen Sep 09 '22 at 05:03
  • $(1)$ "To see this, note that all norms are equivalent on the finite-dimensional vector space of $N\times N$ matrices". I understand that the main body of your analysis works with one particular norm, but I do not understand why "all norms are equivalent". – BinChen Sep 09 '22 at 05:11
  • $(2)$ I have not encountered the concept of the "$\infty \rightarrow 1$ norm" before. I guess it is similar to the $\ell_1$-norm of a vector; I also know the Frobenius norm of a matrix, but I could not find this concept at https://en.wikipedia.org/wiki/Matrix_norm. Did I miss something? – BinChen Sep 09 '22 at 05:19
  • For $(1)$: I notice that https://en.wikipedia.org/wiki/Matrix_norm introduces the concept of "Equivalence of norms" at the bottom of the page. – BinChen Sep 09 '22 at 05:23
  • $(3)$ Can the original proposition be written as: $\left(\forall \epsilon >0,~\underset{D\rightarrow \infty}{\lim}\mathbb{P}\left[ \lVert \mathbf{A}-\mathbf{I}_N \rVert \ge \epsilon \right] =0\right)$? Can I deduce the following proposition from it? $\underset{\epsilon \rightarrow 0^+}{\lim}\left(\underset{D\rightarrow \infty}{\lim}\mathbb{P}\left[ \lVert \mathbf{A}-\mathbf{I}_N \rVert \ge \epsilon \right] \right)=\underset{D\rightarrow \infty}{\lim} \left( \underset{\epsilon \rightarrow 0^+}{\lim}\mathbb{P}\left[ \lVert \mathbf{A}-\mathbf{I}_N \rVert \ge \epsilon \right] \right) = 0$ – BinChen Sep 09 '22 at 05:29
  • 1
    @BinChen: Following links, Wikipedia eventually cites this article by Keith Conrad, which explains quite well (IMHO) why all the norms are equivalent. The equation immediately following "$\infty\to1$ norm" is the definition thereof; it is the operator norm with the inputs measured in $\ell^{\infty}$ but the outputs in $\ell^1$. – Jacob Manaker Sep 09 '22 at 22:19
  • 1
    Wrt (3): Yes, the proposition I prove is your first equation, $(\forall\epsilon>0\text{, ...})$. The interchange of limits you ask for is false; since probability measures are continuous, taking the limit in $\epsilon$ first gives $\mathbb{P}[{|A-I_N|=0}]$, which we know has probability $0$ for any $D$. – Jacob Manaker Sep 09 '22 at 22:21
  • I accept this answer. I am particularly grateful to @JacobManaker for his/her efforts and kind help with such a long answer and detailed explanations. – BinChen Sep 10 '22 at 02:10
  • Hi @JacobManaker, if $\mathbf{A}$ converges to $\mathbf{I}_N$ in probability, can I prove or disprove that $\mathbb{E}[\ln(\det(\mathbf{A}))]$ converges to $0$? I tried to learn something from https://math.stackexchange.com/questions/4008712/convergence-of-the-expectation-of-a-bounded-random-variable, but this problem is too complicated for me and I cannot handle it yet. What should I do? – BinChen Sep 12 '22 at 08:49
  • Hi @JacobManaker, I am sorry to disturb you again. I have posted this problem separately: https://math.stackexchange.com/questions/4529767/prove-disprove-mathbbe-ln-det-mathbfa-rightarrow-0-when-bounded. – BinChen Sep 12 '22 at 09:17