Here is a statement of the famous Kantorovich inequality.
Theorem (Kantorovich). Let $A$ be an $n\times n$ symmetric positive definite matrix with eigenvalues $0 < \lambda_1 \leq \dots \leq \lambda_n$. Then the following inequality holds for all nonzero $\mathbf{x}\in\mathbb{R}^n$: \begin{equation} \frac{(\mathbf{x}^{\top}A\mathbf{x})(\mathbf{x}^{\top}A^{-1}\mathbf{x})}{(\mathbf{x}^{\top}\mathbf{x})^2} \leq \frac{1}{4}\frac{(\lambda_1+\lambda_n)^2}{\lambda_1\lambda_n} = \frac{1}{4}\Bigg(\sqrt{\frac{\lambda_1}{\lambda_n}}+\sqrt{\frac{\lambda_n}{\lambda_1}}\Bigg)^2. \end{equation}
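As a quick numerical sanity check of the statement, here is a minimal NumPy sketch (the random SPD test matrix, sample count, and tolerance are all arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary symmetric positive definite test matrix.
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)

lam = np.linalg.eigvalsh(A)                     # eigenvalues, sorted ascending
bound = 0.25 * (lam[0] + lam[-1]) ** 2 / (lam[0] * lam[-1])

A_inv = np.linalg.inv(A)
for _ in range(10_000):
    x = rng.standard_normal(n)
    lhs = (x @ A @ x) * (x @ A_inv @ x) / (x @ x) ** 2
    assert lhs <= bound + 1e-9                  # no violations observed
print("Kantorovich bound:", bound)
```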
There are a variety of proofs of this inequality. My aim in asking this question is threefold: first, to gather a list of nice proofs of this inequality; second, to see whether a proof via constrained optimization techniques is possible; third, to learn how Kantorovich himself thought about the problem. Here are the main questions.
Questions
- What are different approaches (excluding those mentioned below) for proving the Kantorovich inequality?
- Can it be proved via constrained optimization techniques, continuing the attempt I describe below?
- How did Kantorovich prove it himself?
Different Approaches
- This is an elegant proof based on probabilistic techniques.
- This is another proof using simple but clever algebra.
A Constrained Optimization Way
I am wondering, however, whether the inequality can be proved via the most naive idea that comes to mind: simply maximizing its left-hand side! For this purpose, we can rewrite the left-hand side by introducing $\mathbf{y} = \frac{\mathbf{x}}{\lVert\mathbf{x}\rVert}$ as
\begin{equation} f(\mathbf{y}) = (\mathbf{y}^{\top}A\mathbf{y})(\mathbf{y}^{\top}A^{-1}\mathbf{y}). \end{equation}
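This normalization loses nothing, since the quotient depends on $\mathbf{x}$ only through $\mathbf{x}/\lVert\mathbf{x}\rVert$; a short NumPy check (with an arbitrary SPD test matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # arbitrary SPD test matrix
A_inv = np.linalg.inv(A)

x = rng.standard_normal(n)
y = x / np.linalg.norm(x)

lhs = (x @ A @ x) * (x @ A_inv @ x) / (x @ x) ** 2
f_y = (y @ A @ y) * (y @ A_inv @ y)
print(np.isclose(lhs, f_y))        # True: the quotient depends only on x/||x||
```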
Now, it seems natural to maximize $f(\mathbf{y})$ subject to the constraint $\mathbf{y}^{\top}\mathbf{y} = 1$. To simplify the problem further, one can use the spectral decomposition $A = Q^{\top}\Lambda Q$ to write $f(\mathbf{y})$ as
\begin{equation} g(\mathbf{z}) = \big(\sum_{i=1}^{n} \lambda_i z_i^2\big) \big(\sum_{i=1}^{n} \frac{1}{\lambda_i} z_i^2\big), \end{equation}
where $\mathbf{z} = Q \mathbf{y}$; note that $\lVert\mathbf{z}\rVert = \lVert\mathbf{y}\rVert = 1$ since $Q$ is orthogonal. Finally, let $\xi_i = z_i^2$ to arrive at
\begin{equation} \phi(\boldsymbol{\xi}) = \big(\sum_{i=1}^{n} \lambda_i \xi_i\big) \big(\sum_{i=1}^{n} \frac{1}{\lambda_i} \xi_i\big) = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\lambda_i}{\lambda_j}\xi_i\xi_j = \boldsymbol{\xi}^{\top}B\boldsymbol{\xi}, \qquad B_{ij} = \frac{\lambda_i}{\lambda_j}, \end{equation}
with the constraints
\begin{equation} \sum_{i=1}^{n}\xi_i = 1, \qquad \xi_i \ge 0. \end{equation}
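Here is a small NumPy sketch verifying that the chain $f(\mathbf{y}) = g(\mathbf{z}) = \phi(\boldsymbol{\xi})$ and the constraint $\sum_i \xi_i = 1$ come out as claimed (the test matrix is arbitrary; note that NumPy's `eigh` returns $A = V\Lambda V^{\top}$, so $Q = V^{\top}$):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # arbitrary SPD test matrix

lam, V = np.linalg.eigh(A)           # A = V diag(lam) V^T, so Q = V^T
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

z = V.T @ y                          # z = Q y
xi = z ** 2                          # xi_i = z_i^2

f_y = (y @ A @ y) * (y @ np.linalg.inv(A) @ y)
g_z = (lam * z**2).sum() * (z**2 / lam).sum()
phi = (lam * xi).sum() * (xi / lam).sum()

print(np.allclose([g_z, phi], f_y))  # True: all three expressions agree
print(np.isclose(xi.sum(), 1.0))     # True: xi lies on the simplex
```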
As we are usually fond of symmetric matrices, we can replace $B$ by $\frac{1}{2}(B + B^{\top})$: indeed, $B = \frac{1}{2} (B + B^{\top}) + \frac{1}{2}(B - B^{\top})$ and $\frac{1}{2}\boldsymbol{\xi}^{\top}(B - B^{\top})\boldsymbol{\xi} = 0$. Consequently, $\phi$ can be rewritten as
\begin{equation} \phi(\boldsymbol{\xi}) = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \Bigg(\frac{\lambda_i}{\lambda_j} + \frac{\lambda_j}{\lambda_i}\Bigg)\xi_i\xi_j = \frac{1}{2}\boldsymbol{\xi}^{\top}H\boldsymbol{\xi}, \qquad H = B + B^{\top}. \end{equation}
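As a quick check of this symmetrization step (with arbitrary positive eigenvalues and an arbitrary simplex point):

```python
import numpy as np

rng = np.random.default_rng(3)
lam = np.sort(rng.uniform(0.5, 5.0, size=4))   # arbitrary positive eigenvalues

B = np.outer(lam, 1.0 / lam)                   # B_ij = lam_i / lam_j
H = B + B.T                                    # H_ij = lam_i/lam_j + lam_j/lam_i

xi = rng.dirichlet(np.ones(4))                 # a point on the simplex
print(np.isclose(xi @ B @ xi, 0.5 * xi @ H @ xi))  # True: skew part drops out
```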
Can we find the maximizer of $\phi(\boldsymbol{\xi})$ subject to the aforementioned constraints via constrained optimization techniques?
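For what it is worth, here is a hedged numerical sketch of this maximization using SciPy (the eigenvalues, the number of restarts, and the choice of the SLSQP solver are all arbitrary; SLSQP is a local method and $\phi$ is an indefinite quadratic, hence the random restarts):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
lam = np.sort(rng.uniform(0.5, 5.0, size=5))    # arbitrary positive eigenvalues
n = lam.size

def neg_phi(xi):
    # Minimizing -phi is equivalent to maximizing phi.
    return -(lam @ xi) * ((1.0 / lam) @ xi)

constraints = [{"type": "eq", "fun": lambda xi: xi.sum() - 1.0}]
bounds = [(0.0, None)] * n

# SLSQP is a local method and phi is indefinite, so restart from
# several random simplex points and keep the best result.
best = min(
    (minimize(neg_phi, rng.dirichlet(np.ones(n)), method="SLSQP",
              bounds=bounds, constraints=constraints) for _ in range(20)),
    key=lambda res: res.fun,
)

kantorovich = 0.25 * (lam[0] + lam[-1]) ** 2 / (lam[0] * lam[-1])
print("numerical max :", -best.fun)
print("Kantorovich   :", kantorovich)
print("maximizer     :", best.x.round(4))   # expect weight 1/2 on xi_1 and xi_n
```

If the solver behaves, the best restart should land on $\xi_1 = \xi_n = \tfrac{1}{2}$ (with all other $\xi_i = 0$), which indeed attains the right-hand side of the inequality; the open part of my question is turning this observation into a rigorous argument, e.g., by analyzing the KKT points of this problem.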