
Here is a statement of the famous Kantorovich inequality.

Theorem (Kantorovich). Let $A$ be an $n\times n$ symmetric positive definite matrix, and assume that its eigenvalues are $0 < \lambda_1 \leq \dots \leq \lambda_n$. Then, the following inequality holds for all nonzero $\mathbf{x}\in\mathbb{R}^n$ \begin{equation} \frac{(\mathbf{x}^{\top}A\mathbf{x})(\mathbf{x}^{\top}A^{-1}\mathbf{x})}{(\mathbf{x}^{\top}\mathbf{x})^2} \leq \frac{1}{4}\frac{(\lambda_1+\lambda_n)^2}{\lambda_1\lambda_n} = \frac{1}{4}\Bigg(\sqrt{\frac{\lambda_1}{\lambda_n}}+\sqrt{\frac{\lambda_n}{\lambda_1}}\Bigg)^2. \end{equation}
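As a quick numerical sanity check (not part of any proof), the statement can be tested with NumPy on a randomly generated symmetric positive definite matrix; all names below are illustrative choices, not from the theorem itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric positive definite matrix A.
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)

# eigvalsh returns eigenvalues in ascending order.
eigvals = np.linalg.eigvalsh(A)
lam1, lamn = eigvals[0], eigvals[-1]
bound = 0.25 * (lam1 + lamn) ** 2 / (lam1 * lamn)

# The Rayleigh-type ratio never exceeds the Kantorovich bound.
A_inv = np.linalg.inv(A)
for _ in range(1000):
    x = rng.standard_normal(n)
    lhs = (x @ A @ x) * (x @ A_inv @ x) / (x @ x) ** 2
    assert lhs <= bound + 1e-12
```

Note that the bound is always at least $1$, consistent with the Cauchy–Schwarz lower bound $(\mathbf{x}^{\top}A\mathbf{x})(\mathbf{x}^{\top}A^{-1}\mathbf{x}) \geq (\mathbf{x}^{\top}\mathbf{x})^2$.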

There are a variety of proofs of this inequality. My aim in asking this question is threefold. First, to gather a list of nice proofs of this inequality. Second, to see if a proof with constrained optimization techniques is possible. Third, to learn how Kantorovich thought about the problem. Here are the main questions.

Questions

  1. What are different approaches (excluding those mentioned below) for proving the Kantorovich inequality?
  2. Can it be proved via constrained optimization techniques, continuing what I described below?
  3. How did Kantorovich prove it himself?

Different Approaches

  • This is an elegant and beautiful proof based on probability techniques.
  • This is another proof by simple and clever algebra.

A Constrained Optimization Way

However, I am wondering if it can be proved via the most naive idea that comes to mind: by maximizing the left-hand side of the inequality! For this purpose, we can rewrite the left-hand side by introducing $\mathbf{y} = \frac{\mathbf{x}}{\lVert\mathbf{x}\rVert}$ as below

\begin{equation} f(\mathbf{y}) = (\mathbf{y}^{\top}A\mathbf{y})(\mathbf{y}^{\top}A^{-1}\mathbf{y}). \end{equation}

Now, it seems natural to maximize $f(\mathbf{y})$ subject to the constraint $\mathbf{y}^{\top}\mathbf{y} = 1$. To make the problem even simpler, one can use the spectral decomposition $A=Q^{\top}\Lambda Q$ to write $f(\mathbf{y})$ as

\begin{equation} g(\mathbf{z}) = \big(\sum_{i=1}^{n} \lambda_i z_i^2\big) \big(\sum_{i=1}^{n} \frac{1}{\lambda_i} z_i^2\big), \end{equation}

where $\mathbf{z} = Q \mathbf{y}$. Finally, let $\xi_i = z_i^2$ to arrive at

\begin{equation} \phi(\boldsymbol{\xi}) = \big(\sum_{i=1}^{n} \lambda_i \xi_i\big) \big(\sum_{i=1}^{n} \frac{1}{\lambda_i} \xi_i\big) = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\lambda_i}{\lambda_j}\xi_i\xi_j = \boldsymbol{\xi}^{\top}B\boldsymbol{\xi}, \qquad B_{ij} = \frac{\lambda_i}{\lambda_j}, \end{equation}

with the constraints

\begin{equation} \sum_{i=1}^{n}\xi_i = 1, \qquad \xi_i \ge 0. \end{equation}

As we are usually fond of symmetric matrices, we can replace $B$ by $\frac{1}{2}(B + B^{\top})$, because $B = \frac{1}{2} (B + B^{\top}) + \frac{1}{2}(B - B^{\top})$ and $\frac{1}{2}\boldsymbol{\xi}^{\top}(B - B^{\top})\boldsymbol{\xi} = 0.$ Consequently, $\phi$ can be rewritten as

\begin{equation} \phi(\boldsymbol{\xi}) = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \Bigg(\frac{\lambda_i}{\lambda_j} + \frac{\lambda_j}{\lambda_i}\Bigg)\xi_i\xi_j = \frac{1}{2}\boldsymbol{\xi}^{\top}H\boldsymbol{\xi}. \end{equation}

Can we find the maximizer of $\phi(\boldsymbol{\xi})$ subject to the aforementioned constraints via constrained optimization techniques?
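As an empirical hint at what the answer should look like (this is an illustration, not a solution of the optimization problem), one can sample the simplex and compare against the candidate maximizer that splits its mass evenly between $\lambda_1$ and $\lambda_n$; the eigenvalues and names below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = np.array([1.0, 2.0, 5.0, 10.0])  # hypothetical eigenvalues, ascending
bound = 0.25 * (lam[0] + lam[-1]) ** 2 / (lam[0] * lam[-1])

def phi(xi):
    # phi(xi) = (sum lam_i xi_i)(sum xi_i / lam_i)
    return (lam @ xi) * ((1.0 / lam) @ xi)

# Random points of the simplex never beat the Kantorovich bound.
best = 0.0
for _ in range(20000):
    xi = rng.dirichlet(np.ones(len(lam)))
    best = max(best, phi(xi))
assert best <= bound + 1e-12

# The bound is attained by putting half the mass on lambda_1 and lambda_n.
xi_star = np.zeros(len(lam))
xi_star[0] = xi_star[-1] = 0.5
assert np.isclose(phi(xi_star), bound)
```

This suggests the constrained maximizer is $\boldsymbol{\xi}^* = (\tfrac{1}{2}, 0, \dots, 0, \tfrac{1}{2})$, which is what a KKT analysis of the problem above would need to confirm.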

2 Answers


I will borrow your notation and definitions.

Semi-optimization proof with AM-GM

Using AM-GM we have for any $s>0$ that

\begin{align} 2\left(\sum_{i=1}^n \lambda_i\xi_i\right)^{1/2}\left(\sum_{i=1}^n \lambda_i^{-1}\xi_i\right)^{1/2} &= 2\left(s^{-1}\sum_{i=1}^n \lambda_i\xi_i\right)^{1/2}\left(s\sum_{i=1}^n \lambda_i^{-1}\xi_i\right)^{1/2} \\ &\leq \sum_{i=1}^n \xi_i\left(\frac{\lambda_i}{s} + \frac{s}{\lambda_i}\right) \tag{AM-GM}\\ &\leq \max_{\lambda_1\leq \lambda \leq \lambda_n}\left(\frac{\lambda}{s} + \frac{s}{\lambda}\right) \sum_{i=1}^n \xi_i \\ &= \max_{\lambda_1\leq \lambda \leq \lambda_n}\left(\frac{\lambda}{s} + \frac{s}{\lambda}\right). \end{align}

Notice that the function $g(\lambda) = \frac{\lambda}{s}+\frac{s}{\lambda}$ is convex for any $s>0$, so on $[\lambda_1, \lambda_n]$ it attains its maximum at one of the endpoints $\lambda_1$ or $\lambda_n$. But can we choose $s$ so that $g(\lambda_1)=g(\lambda_n)$? Indeed, if that were the case, then

$$ \frac{\lambda_1}{s} + \frac{s}{\lambda_1}=\frac{\lambda_n}{s} + \frac{s}{\lambda_n}\iff s^2(\lambda_1^{-1}-\lambda_n^{-1})=\lambda_n-\lambda_1 \iff s=\sqrt{\lambda_1\lambda_n}. $$

So now we just need to evaluate $g(\lambda_1)$. Indeed

$$ g(\lambda_1)=\frac{\lambda_1}{\sqrt{\lambda_1\lambda_n}}+\frac{\sqrt{\lambda_1\lambda_n}}{\lambda_1}=\frac{\lambda_1^2+\lambda_1\lambda_n}{\lambda_1\sqrt{\lambda_1\lambda_n}}=\frac{\lambda_1+\lambda_n}{\sqrt{\lambda_1\lambda_n}}. $$

Finally, putting it all together (divide the chain of inequalities by $2$ and square) we get

$$ \left(\sum_{i=1}^n \lambda_i\xi_i\right)\left(\sum_{i=1}^n \lambda_i^{-1}\xi_i\right)\leq\frac{1}{4}\frac{(\lambda_1+\lambda_n)^2}{\lambda_1\lambda_n}. $$
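The equalizing choice $s=\sqrt{\lambda_1\lambda_n}$ can be checked on concrete numbers; the values below are arbitrary, picked only to make the arithmetic transparent.

```python
import math

# Hypothetical extreme eigenvalues, so that s = sqrt(2 * 18) = 6.
lam1, lamn = 2.0, 18.0
s = math.sqrt(lam1 * lamn)

def g(lam):
    return lam / s + s / lam

# The choice s = sqrt(lam1 * lamn) equalizes g at both endpoints...
assert math.isclose(g(lam1), g(lamn))
# ...with common value (lam1 + lamn) / sqrt(lam1 * lamn).
assert math.isclose(g(lam1), (lam1 + lamn) / math.sqrt(lam1 * lamn))
# Halving and squaring recovers the Kantorovich bound.
assert math.isclose((g(lam1) / 2) ** 2,
                    0.25 * (lam1 + lamn) ** 2 / (lam1 * lamn))
```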

V.S.e.H.

Let me present an alternative algebraic approach to the inequality.

Theorem (Kantorovich, discrete form). Let $a_1, \dots, a_n$ be positive numbers satisfying $\sum_{i=1}^n a_i = 1$, and let $\lambda_n \geq \cdots \geq \lambda_1 > 0$. Then, $$\sum_{i=1}^n \lambda_i a_i \cdot \sum_{i=1}^n \frac{a_i}{\lambda_i} \leq \frac{(\lambda_1 + \lambda_n)^2}{4 \lambda_1 \lambda_n}.$$

Let's dive into the proof.

Starting with the assumption $\lambda_n \geq \cdots \geq \lambda_1 > 0$, we have:

\begin{align} \lambda_i \geq \lambda_1 > 0 & \implies \frac{\lambda_i}{\lambda_1} \geq 1 \geq \frac{\lambda_1}{\lambda_i} \implies \sqrt{\frac{\lambda_i}{\lambda_1}} - \sqrt{\frac{\lambda_1}{\lambda_i}} \geq 0. \end{align}

By the same reasoning, we can establish $\sqrt{\frac{\lambda_n}{\lambda_i}} - \sqrt{\frac{\lambda_i}{\lambda_n}} \geq 0$ for every $i = 1, \dots, n$.

Hence, we can conclude

\begin{align} 0 &\leq \sum_{i=1}^n \left(\sqrt{\frac{\lambda_i}{\lambda_1}} - \sqrt{\frac{\lambda_1}{\lambda_i}}\right) \left(\sqrt{\frac{\lambda_n}{\lambda_i}} - \sqrt{\frac{\lambda_i}{\lambda_n}}\right) a_i \\ &= \sum_{i=1}^n \left(\sqrt{\frac{\lambda_n}{\lambda_1}} - \sqrt{\frac{\lambda_i^2}{\lambda_1 \lambda_n}} - \sqrt{\frac{\lambda_1 \lambda_n}{\lambda_i^2}} + \sqrt{\frac{\lambda_1}{\lambda_n}}\right) a_i \\ &= \left(\sqrt{\frac{\lambda_n}{\lambda_1}} + \sqrt{\frac{\lambda_1}{\lambda_n}}\right) \sum_{i=1}^n a_i - \frac{1}{\sqrt{\lambda_1 \lambda_n}} \sum_{i=1}^n \lambda_i a_i - \sqrt{\lambda_1 \lambda_n} \sum_{i=1}^n \frac{a_i}{\lambda_i}. \end{align}

Using $\sum_{i=1}^n a_i = 1$ and the AM-GM inequality, we find

\begin{align} \sqrt{\frac{\lambda_n}{\lambda_1}} + \sqrt{\frac{\lambda_1}{\lambda_n}} & \geq \frac{1}{\sqrt{\lambda_1 \lambda_n}} \sum_{i=1}^n \lambda_i a_i + \sqrt{\lambda_1 \lambda_n} \sum_{i=1}^n \frac{a_i}{\lambda_i} \\ & \geq 2 \left(\frac{1}{\sqrt{\lambda_1 \lambda_n}} \sum_{i=1}^n \lambda_i a_i \cdot \sqrt{\lambda_1 \lambda_n} \sum_{i=1}^n \frac{a_i}{\lambda_i}\right)^{1/2} \\ & = 2\left(\sum_{i=1}^n \lambda_i a_i \cdot \sum_{i=1}^n \frac{a_i}{\lambda_i}\right)^{1/2}. \end{align}

Squaring the last expression yields Kantorovich's inequality.
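The discrete form of the inequality is easy to stress-test numerically; the eigenvalue range and sample counts below are arbitrary choices for the demo, assuming NumPy.

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.sort(rng.uniform(0.5, 10.0, size=6))  # lam[0] = lambda_1, lam[-1] = lambda_n
bound = (lam[0] + lam[-1]) ** 2 / (4 * lam[0] * lam[-1])

# Random positive weights a_i summing to 1 never violate the bound.
for _ in range(1000):
    a = rng.dirichlet(np.ones(len(lam)))
    lhs = (lam @ a) * ((1.0 / lam) @ a)
    assert lhs <= bound + 1e-12
```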

Souza