
Suppose $n$ is some large integer, and consider the following two matrices:

$$ S = \begin{pmatrix} 0 &1 & 0 & \dots & 0\\ 0 & 0 & 1 & \dots & 0 \\ \vdots & & & \ddots & \vdots\\ 0 & 0 & 0 & \dots & 1 \\ 0 & 0 & 0 & \dots & 0 \end{pmatrix}$$

(i.e., a non-circulant shift) and $$ A = \begin{pmatrix} 0 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0\\ 0 & 0 & 2 & \dots & 0\\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \dots & n-1\end{pmatrix}$$ (i.e., a diagonal matrix with increasing diagonal entries).
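For concreteness, here is a small NumPy sketch of these two matrices (the dimension $n$ is an arbitrary illustrative choice):

```python
import numpy as np

n = 8  # illustrative dimension

# S: ones on the first superdiagonal (non-circulant shift)
S = np.diag(np.ones(n - 1), k=1)

# A: diagonal entries 0, 1, ..., n-1
A = np.diag(np.arange(n, dtype=float))
```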

Suppose I have a vector $v\in\mathbb{R}^n$ satisfying $v^Tv=1$ and $v^TSv \geq 1-\epsilon$. What kind of lower bound can I get for $v^TAv$?

It seems like for $v^TSv\geq 1-\epsilon$ to hold, $v$ must be highly "spread out". If $v$ is spread out, then more of its weight multiplies the larger diagonal entries of $A$, which should make $v^TAv$ large. But I can't get any precise bounds.

I tried an optimization approach: take $v^Tv=1$ and $v^TSv\geq 1-\epsilon$ as constraints and $v^TAv$ as the objective function, and minimize with Lagrange multipliers. But this yields a horrid recursive formula, and solving for the Lagrange multipliers looks hopeless. The solution I expected to be optimal -- the first $m\geq \frac{1}{\epsilon}$ entries exactly $\frac{1}{\sqrt{m}}$, the remainder $0$, which gives $v^TSv=1-\frac{1}{m}$ -- doesn't seem to satisfy the Lagrange conditions of this optimization.

EDIT: Solving this numerically, the results look almost exactly like the square root of a Poisson distribution (i.e., the $v_i^2$ follow a Poisson distribution), with parameters that are nearly independent of $n$ once $n$ is sufficiently large.
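Here is a minimal scipy sketch of this kind of experiment (the solver choice, the values of $n$ and $\epsilon$, and the starting point are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

n, eps = 50, 0.1                       # illustrative values
S = np.diag(np.ones(n - 1), k=1)
A = np.diag(np.arange(n, dtype=float))

cons = [
    {"type": "eq",   "fun": lambda v: v @ v - 1.0},            # v^T v = 1
    {"type": "ineq", "fun": lambda v: v @ S @ v - (1 - eps)},  # v^T S v >= 1 - eps
]

# start from the "flat" guess: first m entries equal to 1/sqrt(m)
m = int(np.ceil(1 / eps))
v0 = np.zeros(n)
v0[:m] = 1 / np.sqrt(m)

res = minimize(lambda v: v @ A @ v, v0, method="SLSQP", constraints=cons)
v = res.x
print(res.fun)                          # numerical minimum of v^T A v

# compare v_k^2 against a Poisson pmf with the same mean
lam = np.sum(np.arange(n) * v**2)
print(np.c_[v**2, poisson.pmf(np.arange(n), lam)][:10])
```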

Using Lagrange multipliers, I get

$$\nabla v^TAv + \mu_1\nabla v^TSv + \mu_2\nabla v^Tv = 0,$$ which produces $n$ equations of the form $$ 2(k-1)v_k + \mu_1(v_{k-1}+v_{k+1}) + 2\mu_2 v_k = 0$$ (with the convention $v_0=v_{n+1}=0$). Letting $v_k = \sqrt{\text{Poisson}(k;\lambda)}$ does not solve this, however.
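One way to check this: fit $(\mu_1, \mu_2)$ to all $n$ equations by least squares and inspect the residual (a sketch; the values of $n$ and $\lambda$ are illustrative):

```python
import numpy as np
from scipy.stats import poisson

n, lam = 40, 3.0                          # illustrative values
k = np.arange(1, n + 1)
v = np.sqrt(poisson.pmf(k - 1, lam))      # candidate v_k = sqrt(Poisson(k-1; lam))
vm = np.r_[0.0, v[:-1]]                   # v_{k-1}, with v_0 = 0
vp = np.r_[v[1:], 0.0]                    # v_{k+1}, with v_{n+1} = 0

# stationarity: 2(k-1) v_k + mu1 (v_{k-1} + v_{k+1}) + 2 mu2 v_k = 0 for all k;
# solve the overdetermined linear system for (mu1, mu2) in least squares
M = np.column_stack((vm + vp, 2 * v))
rhs = -2 * (k - 1) * v
mu, residual, *_ = np.linalg.lstsq(M, rhs, rcond=None)
print(mu, residual)                       # nonzero residual: no exact (mu1, mu2)
```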

Any ideas?

Sam Jaques
  • Any restriction on $\epsilon$? e.g. is $\epsilon\in(0,1)$? – Golden_Ratio Jan 05 '22 at 19:18
  • Yes, $\epsilon\in (0,1)$. I think we also need $\epsilon \geq \frac{1}{n}$ for this to even be feasible in $n$ dimensions. – Sam Jaques Jan 05 '22 at 20:15
  • You can show that if the minimum exists, then it must be attained at $x_1\geq x_2\geq\dots\geq x_n$ by looking at $(x_i, x_{i+1})\to \left(\sqrt{\dfrac{x_i^2+x_{i+1}^2}{2}}, \sqrt{\dfrac{x_i^2+x_{i+1}^2}{2}}\right)$. So if you could show that $x_n = 0$, then it should be easy to conclude from there. (Here, $v = (x_1,x_2,\dots, x_n)$.) – dezdichado Jan 05 '22 at 20:19
  • That technique preserves $v^Tv$ but can decrease $v^TSv$. Consider a case where all entries are $0$ except at some index $i$, where $x_i=0.49$, $x_{i+1}=0.62$, and $x_{i+2}=\sqrt{1-x_i^2-x_{i+1}^2}$. This satisfies $x_{i+1}\geq x_{i+2}$, but the value of $v^TSv$ decreases after the transformation you suggest (from $\approx$0.684 to $\approx$0.655). – Sam Jaques Jan 06 '22 at 10:13
  • @SamJaques Did you use SDP (semidefinite programming) to solve it? Do you want an analytical lower bound (function of $\epsilon$)? – River Li Jan 13 '22 at 05:28
  • I'd like an analytical solution. A numerical solution from SDP could get me started but I have no experience in SDP. When I try to use the cvxpy library it complains that the $v^TSv\geq 1-\epsilon$ constraint is not a DCP constraint. – Sam Jaques Jan 13 '22 at 09:34
  • SDP works well (numerically). You may look at https://math.stackexchange.com/questions/4167787/minimize-xaax-such-that-xaax-1-and-xx-1/4173309#comment8709252_4173309 – River Li Jan 13 '22 at 11:46
  • I figured out how to get a numerical answer with scipy, and updated the question. Thanks! – Sam Jaques Jan 13 '22 at 15:15
  • @SamJaques So, you can find the minimum of $v^\mathsf{T}Av$ numerically subject to the constraints, but do not know if it has some desired property? – River Li Jan 13 '22 at 16:04
  • My ultimate goal is a theorem of the form "if you satisfy $v^TSv\geq 1-\epsilon$, then you must satisfy $v^TAv \geq f(n,\epsilon)$" for some function $f$. Numerical results give the shape of $f$, but I still need to find $f$ and get a proof that it really is a lower bound. – Sam Jaques Jan 13 '22 at 16:15

1 Answer


Some thoughts:

Let $B = S + S^\mathsf{T}$. This is a symmetric tridiagonal matrix [2].
The eigenvalues of $B$ are given by $2\cos \frac{k\pi}{n + 1}$, $k=1, 2, \cdots, n$.
We have $v^\mathsf{T}S v = \frac12 v^\mathsf{T}B v$, and $\max_{v^\mathsf{T}v = 1} v^\mathsf{T}B v = 2\cos \frac{\pi}{n+1}$. Thus, for $v^\mathsf{T}S v \ge 1 - \epsilon$ to be feasible, we need $\epsilon \ge 1 - \cos \frac{\pi}{n + 1}$.
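(A quick numerical check of this spectrum, with an illustrative $n$:)

```python
import numpy as np

n = 10                                   # illustrative
B = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
k = np.arange(1, n + 1)
print(np.allclose(np.sort(np.linalg.eigvalsh(B)),
                  np.sort(2 * np.cos(k * np.pi / (n + 1)))))   # True
```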

Let $1 - \cos \frac{\pi}{n + 1} < \epsilon < 1$ be given.
Clearly, there exists $v_0 \in \mathbb{R}^n$ such that $v_0^\mathsf{T}B v_0 > 2(1 - \epsilon)v_0^\mathsf{T}v_0$ (this strict feasibility is required by the S-Lemma below).
Let $\alpha \ge 0$ be a constant such that $$v^\mathsf{T}A v \ge \alpha$$ for all $v\in \mathbb{R}^n$ with $v^\mathsf{T}v = 1$ and $v^\mathsf{T}B v \ge 2(1 - \epsilon)$. Equivalently, $$v^\mathsf{T}B v \ge 2(1 - \epsilon)v^\mathsf{T}v \quad \Longrightarrow \quad v^\mathsf{T}A v \ge \alpha v^\mathsf{T}v. \tag{1}$$ By the S-Lemma (Theorem 2, [1]), (1) holds if and only if there exists $\beta \ge 0$ such that $$A - \alpha I \succeq \beta[B - 2(1 - \epsilon)I],$$ i.e., $$A - \beta [B - 2(1 - \epsilon)I] \succeq \alpha I. \tag{2}$$ From (2), we have $$\lambda_{\min}\Big(A - \beta [B - 2(1 - \epsilon)I]\Big) \ge \alpha,$$ where $\lambda_{\min}(\cdot)$ denotes the smallest eigenvalue of a symmetric matrix. Thus, the best (largest) $\alpha$ is given by $$\alpha = \max_{\beta \ge 0}~ \lambda_{\min}\Big(A - \beta [B - 2(1 - \epsilon)I]\Big). \tag{3}$$

Numerically, we can use convex programming to solve (2) for the best $\alpha$; a sketch is given below. Here we focus on analytical lower bounds for the best $\alpha(\epsilon)$, based on (2).
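Since (2) is affine in $(\alpha, \beta)$, it is a small SDP (this also avoids the non-DCP issue with the original quadratic constraint in $v$). A minimal CVXPY sketch, with illustrative $n$ and $\epsilon$:

```python
import numpy as np
import cvxpy as cp

n, eps = 20, 0.2                         # illustrative values
S = np.diag(np.ones(n - 1), 1)
A = np.diag(np.arange(n, dtype=float))
B = S + S.T
I = np.eye(n)

alpha = cp.Variable()
beta = cp.Variable(nonneg=True)
# (2): A - beta [B - 2(1-eps) I] - alpha I  is positive semidefinite
constraints = [A - beta * (B - 2 * (1 - eps) * I) - alpha * I >> 0]
cp.Problem(cp.Maximize(alpha), constraints).solve(solver=cp.SCS)
print(alpha.value, beta.value)
```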

For $n = 2, 3$, the best $\alpha$ admits a closed form. For $n = 4$, it seems that the best $\alpha$ cannot be expressed in closed form. The details are given below.

I will try to find some analytical lower bounds for $n\ge 4$. The idea is to use results for tridiagonal matrices [2]. To be continued.

Some simple cases:

Case $n = 2$:

For given $1/2 < \epsilon < 1$, we need to find $\alpha\ge 0$ and $\beta \ge 0$ such that $$\begin{pmatrix} 2(1 - \epsilon)\beta - \alpha & -\beta \\ -\beta & 2(1 - \epsilon)\beta - \alpha + 1 \end{pmatrix} \succeq 0. $$ It is easy to get the best (largest) $\alpha = \frac12 - \frac12\sqrt{(2\epsilon - 1)(3 - 2\epsilon)}$ (and $\beta = \frac{1 - \epsilon}{\sqrt{(2\epsilon - 1)(3 - 2\epsilon)}}$).

Case $n = 3$:

For given $1 - \frac{1}{\sqrt2} < \epsilon < 1$, we need to find $\alpha \ge 0$ and $\beta \ge 0$ such that $$\begin{pmatrix} 2(1 - \epsilon)\beta - \alpha & -\beta & 0 \\ -\beta & 2(1 - \epsilon)\beta - \alpha + 1 & -\beta \\ 0 & -\beta & 2(1 - \epsilon)\beta - \alpha + 2 \end{pmatrix} \succeq 0. $$ It is easy to get the best (largest) $\alpha = 1 - \sqrt{-2\epsilon^2 + 4\epsilon - 1}$ (and $\beta = \frac{1 - \epsilon}{\sqrt{- 2\epsilon^2 + 4\epsilon - 1}}$).

Case $n = 4$:

For given $1 - \cos\frac{\pi}{5} < \epsilon < 1$, we need to find $\alpha \ge 0$ and $\beta \ge 0$ such that $$\begin{pmatrix} 2(1 - \epsilon)\beta - \alpha & -\beta & 0 & 0 \\ -\beta & 2(1 - \epsilon)\beta - \alpha + 1 & -\beta & 0 \\ 0 & -\beta & 2(1 - \epsilon)\beta - \alpha + 2 & -\beta \\ 0 & 0 & -\beta & 2(1 - \epsilon)\beta - \alpha + 3 \end{pmatrix}\succeq 0. $$ The best (largest) $\alpha$ is given by $$\alpha = 2(1 - \epsilon)\beta + \frac32 - \frac12 \sqrt{6\beta^2 + 2\sqrt{5\beta^4 + 12\beta^2 + 4} + 5}$$ where $\beta > 0$ is the unique positive real solution of $$2(1 - \epsilon) = \frac{3\beta\sqrt{5\beta^4 + 12\beta^2 + 4} + 5\beta^3 + 6\beta}{\sqrt{30\beta^6 + 97\beta^4 + 84\beta^2 + 20 + 2(5\beta^4 + 12\beta^2 + 4)\sqrt{5\beta^4 + 12\beta^2 + 4}}}.$$
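(As a sanity check of the closed forms above, one can evaluate (3) directly with a one-dimensional search over $\beta$; the $\epsilon$ values and the search window below are illustrative choices.)

```python
import numpy as np
from scipy.optimize import minimize_scalar

def best_alpha(n, eps):
    """Evaluate (3): max over beta >= 0 of lambda_min(A - beta [B - 2(1-eps) I])."""
    S = np.diag(np.ones(n - 1), 1)
    A = np.diag(np.arange(n, dtype=float))
    B = S + S.T
    I = np.eye(n)
    f = lambda b: -np.linalg.eigvalsh(A - b * (B - 2 * (1 - eps) * I))[0]
    return -minimize_scalar(f, bounds=(0, 100), method="bounded").fun

# n = 2, eps = 0.7: closed form 1/2 - (1/2) sqrt((2 eps - 1)(3 - 2 eps))
print(best_alpha(2, 0.7), 0.5 - 0.5 * np.sqrt((2*0.7 - 1) * (3 - 2*0.7)))
# n = 3, eps = 0.5: closed form 1 - sqrt(-2 eps^2 + 4 eps - 1)
print(best_alpha(3, 0.5), 1 - np.sqrt(-2*0.5**2 + 4*0.5 - 1))
```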

Reference:

[1] “On the S-procedure and some variants”, http://www.thesis.bilkent.edu.tr/0002558.pdf

[2] “Tridiagonal matrix”, Wikipedia, https://en.wikipedia.org/wiki/Tridiagonal_matrix

River Li
  • Very cool! Since I only need a lower bound on $\alpha$, finding $\lambda_{\min}$ for any particular $\beta$ would give such a bound. But computing $\lambda_{\min}$ for an arbitrary $\beta$ analytically sounds tricky. – Sam Jaques Jan 14 '22 at 10:36
  • @SamJaques If we choose $\beta$ arbitrarily, the bound may be bad. Also, we should analyze what happens as $n \to \infty$ (does anything interesting happen?). – River Li Jan 14 '22 at 12:06
  • Some thoughts: We know $A \succeq \begin{pmatrix} 0 & 0 \\ 0 & kI_{n-k}\end{pmatrix}$ for $1\leq k\leq n-1$. Let $T_n = 2(1-\epsilon)I_n - B_n$ ($n$ being the dimension); this is tridiagonal Toeplitz and we know its eigenvalues. We almost have $T_n = \begin{pmatrix} T_k & 0 \\ 0 & T_{n-k}\end{pmatrix}$. If we did, we could argue that for any $n$-dimensional $v$, we can split it into $v = \sqrt{p}\,v_1 + \sqrt{1-p}\,v_2$, where $v_1$ has support only on the first $k$ coordinates, $v_2$ on the last $n-k$, and both are unit vectors... – Sam Jaques Jan 14 '22 at 14:38
  • Then $v^T(A+\beta T)v \geq p\beta\, v_1^T T_kv_1 + (1-p)\beta\, v_2^T T_{n-k}v_2 + (1-p)k\, v_2^Tv_2$, and we can argue that $v_1^T T_kv_1 \geq 2(1-\epsilon) + 2\cos\frac{k\pi}{k+1}$ and $v_2^TT_{n-k}v_2 \geq 2(1-\epsilon) + 2\cos\frac{(n-k)\pi}{n-k+1}$. This gives a lower bound of $\beta\left(2(1-\epsilon) + 2p\cos\frac{k\pi}{k+1} + 2(1-p)\cos\frac{(n-k)\pi}{n-k+1}\right) + k(1-p)$. From there, optimizing over $p$, $\beta$, and $k$ would be easier.

    The problem is that $T_n\neq \begin{pmatrix} T_k & 0 \\ 0 & T_{n-k}\end{pmatrix}$, because of the pesky off-diagonal entries coupling the two blocks (the $-\beta$ terms in $\beta T_n$). Could we argue that those have a minimal contribution?

    – Sam Jaques Jan 14 '22 at 14:41
  • @SamJaques Did you do some numerical simulation for your thoughts (how good is your lower bound)? – River Li Jan 14 '22 at 15:19