17

Let $\mathbf{X},\mathbf{Y}$ be two positive definite matrices. Does the following Jensen-like inequality hold, $$(1-\lambda)\mathbf{X}^{-1}+\lambda\mathbf{Y}^{-1} \succeq((1-\lambda)\mathbf{X}+\lambda\mathbf{Y})^{-1}$$ for every $\lambda \in (0,1)$, where $\mathbf{A}\succeq \mathbf{B}$ means that the matrix $\mathbf{A}-\mathbf{B}$ is positive semi-definite?

ViktorStein
  • 5,024
mewmew
  • 523

4 Answers

18

Assume $X, Y$ are positive definite and $\lambda \in [0,1]$. Then $Z(\lambda) = (1-\lambda) X + \lambda Y$ is positive definite, and so is $Z^{-1}(\lambda)$. Writing $(\ldots)'$ for the derivative with respect to $\lambda$, we have:

$$ Z Z^{-1} = I_n \implies Z'Z^{-1} + Z ( Z^{-1} )' = 0_n \implies (Z^{-1})' = - Z^{-1} Z' Z^{-1} $$ Differentiating once more and noting that $Z'' = 0_n$, we get: $$(Z^{-1})'' = - (Z^{-1})' Z' Z^{-1} - Z^{-1}Z' (Z^{-1})' = 2 Z^{-1}Z' Z^{-1} Z' Z^{-1}\tag{*1}$$
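As a quick sanity check of $(*1)$, consider the scalar case $n = 1$ (the lowercase $x, y, z$ below are just the $1\times 1$ versions of $X, Y, Z$, introduced here for illustration):

$$z(\lambda) = (1-\lambda)x + \lambda y,\qquad z' = y - x,\qquad \left(\frac1z\right)' = -\frac{y-x}{z^2},\qquad \left(\frac1z\right)'' = \frac{2(y-x)^2}{z^3},$$

which agrees with $2 Z^{-1}Z' Z^{-1} Z' Z^{-1}$ when all factors commute.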

Pick any non-zero vector $u$ and consider the following pair of functions, the first vector-valued and the second scalar-valued:

$$v(\lambda) = Z'(\lambda) Z^{-1}(\lambda) u\quad\quad\text{ and }\quad\quad \varphi(\lambda) = u^T Z^{-1}(\lambda) u$$

$(*1)$, together with the symmetry of $Z'$ and $Z^{-1}$, tells us

$$\varphi''(\lambda) = u^T (Z^{-1})''(\lambda) u = 2 v^T(\lambda) Z^{-1}(\lambda) v(\lambda) \ge 0\tag{*2}$$

because $Z^{-1}(\lambda)$ is positive definite. Hence $\varphi$ is a convex function of $\lambda$ on $[0,1]$. As a result, for any $\lambda \in (0,1)$, we have:

$$\begin{align}&(1-\lambda)\varphi(0) + \lambda\varphi(1) - \varphi(\lambda) \ge 0\\ \iff& u^T \left[ (1-\lambda) X^{-1} + \lambda Y^{-1} - ((1-\lambda) X + \lambda Y)^{-1}\right] u \ge 0\tag{*3} \end{align}$$ Since $u$ is arbitrary, this implies the matrix within the square bracket in $(*3)$ is positive semi-definite and hence:

$$(1-\lambda) X^{-1} + \lambda Y^{-1} \succeq ((1-\lambda) X + \lambda Y)^{-1}$$

Note that when $Z' = Y - X$ is invertible, $v(\lambda)$ is non-zero for every non-zero $u$. The inequalities in $(*2)$ and $(*3)$ then become strict, and the matrix within the square bracket in $(*3)$ is positive definite rather than merely positive semi-definite.
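The conclusion can also be checked numerically. The following is only a minimal sketch for $2\times 2$ matrices in plain Python; the helper names (`inv2`, `comb`, `jensen_gap`, `is_psd2`) and the sample matrices are hypothetical choices made for this illustration, not part of the proof.

```python
# Hypothetical helpers for 2x2 matrices stored as nested lists (illustration only).

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def comb(s, m, t, n):
    """Entrywise linear combination s*m + t*n."""
    return [[s * m[i][j] + t * n[i][j] for j in range(2)] for i in range(2)]

def jensen_gap(x, y, lam):
    """(1-lam) X^{-1} + lam Y^{-1} - ((1-lam) X + lam Y)^{-1}."""
    lhs = comb(1 - lam, inv2(x), lam, inv2(y))
    rhs = inv2(comb(1 - lam, x, lam, y))
    return comb(1.0, lhs, -1.0, rhs)

def is_psd2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    (a, b), (_, d) = m
    return a + d >= -tol and a * d - b * b >= -tol

# Sample positive definite matrices (assumed values, chosen arbitrarily).
X = [[2.0, 1.0], [1.0, 2.0]]
Y = [[1.0, 0.0], [0.0, 3.0]]

# Check the matrix in the square bracket of (*3) for lambda = 0.1, ..., 0.9.
gaps_ok = all(is_psd2(jensen_gap(X, Y, k / 10)) for k in range(1, 10))
print(gaps_ok)
```

A check like this only samples a few values of $\lambda$ for one pair of matrices, so it illustrates the inequality rather than proving it.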

achille hui
  • 125,323
  • How is the convexity in $\lambda$ concluded from $(\ast 2)$? – Farz May 06 '14 at 12:42
  • 1
    @Arry Assume $x < y$ and $f \in C^2([x,y])$ such that $f''(z) \ge 0$ on $[x,y]$. Then for any $\lambda \in (0,1)$, we can repeatedly apply the MVT to find three numbers $\lambda_1, \lambda_2, \lambda_3$ which satisfy $1 > \lambda_1 > \lambda > \lambda_2 > 0$, $\lambda_1 > \lambda_3 > \lambda_2$ and – achille hui May 06 '14 at 13:11
  • $$\begin{align} & \lambda f(x) + (1-\lambda)f(y) - f(\lambda x + (1-\lambda)y)\\ = & \lambda\left[ f(x) - f(\lambda x + (1-\lambda) y )\right] - (1-\lambda) \left[ f(\lambda x + (1-\lambda) y ) - f(y) \right]\\ = & \lambda (1-\lambda)(x-y) \left[ f'(\lambda_1 x + (1-\lambda_1) y) - f'(\lambda_2 x + (1-\lambda_2)y) \right]\\ = & \lambda (1-\lambda)(\lambda_1 - \lambda_2)(x-y)^2 f''(\lambda_3 x + (1-\lambda_3)y )\\ \ge & 0 \end{align} $$ In short, $f''(z) \ge 0$ on $[x,y] \implies f$ convex on $[x,y]$. – achille hui May 06 '14 at 13:12
11

Let $\mathbf{P}=\mathbf{X}^{-1/2}\mathbf{Y}\mathbf{X}^{-1/2}$. Multiplying both sides of the inequality on the left and on the right by $\mathbf{X}^{1/2}$ (a congruence, which preserves $\succeq$) shows that it is equivalent to $(1-\lambda)I+\lambda\mathbf{P}^{-1} \succeq((1-\lambda)I+\lambda\mathbf{P})^{-1}$. Since $\mathbf P$ is positive definite, it can be unitarily diagonalised, and both sides are then diagonalised by the same unitary matrix, so we may assume that $\mathbf P$ is a diagonal matrix. The inequality thus reduces to the scalar case $(1-\lambda)+\lambda p_{ii}^{-1} \ge ((1-\lambda)+\lambda p_{ii})^{-1}$ with $p_{ii}>0$, which is true because the function $f(t)=\frac1t$ is convex for $t>0$.
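The scalar inequality can also be verified by direct computation rather than by appealing to convexity. Writing $p > 0$ for any one of the diagonal entries $p_{ii}$ (a shorthand introduced here), and using $(1-\lambda)^2 + \lambda^2 - 1 = -2\lambda(1-\lambda)$:

$$\begin{align} \left((1-\lambda)+\frac{\lambda}{p}\right)\bigl((1-\lambda)+\lambda p\bigr) - 1 &= \lambda(1-\lambda)\left(p + \frac1p - 2\right) = \frac{\lambda(1-\lambda)(p-1)^2}{p},\\ (1-\lambda)+\frac{\lambda}{p} - \frac{1}{(1-\lambda)+\lambda p} &= \frac{\lambda(1-\lambda)(p-1)^2}{p\,\bigl((1-\lambda)+\lambda p\bigr)} \ge 0. \end{align}$$

Equality holds for $\lambda\in(0,1)$ only when $p = 1$.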

user1551
  • 149,263
  • could you please explain how the first inequality is drawn? – Farz May 06 '14 at 12:44
  • as you mentioned: "By left- and right- multiplying both sides of the equation by $X^{1/2}$", so I assume you mean : $ X^{1/2} P X^{1/2}=X^{1/2}X^{-1/2} Y X^{-1/2}X^{1/2} \Rightarrow X^{1/2} P X^{1/2}=Y$ but how the inequality $(1−\lambda)I+\lambda P^{−1} \succeq ((1−\lambda)I+\lambda P)^{−1}$ can be concluded? – Farz May 06 '14 at 13:37
  • 1
    @Arry Hmm, I still don't quite get what you mean. Anyway, a real matrix $M$ is PSD iff $C^TMC$ is PSD for any invertible matrix $C$. So, $(1-\lambda)X^{-1}+\lambda Y^{-1} - ((1-\lambda)X+\lambda Y)^{-1}\succeq0$ iff $X^{1/2}\left[(1-\lambda)X^{-1}+\lambda Y^{-1} - ((1-\lambda)X+\lambda Y)^{-1}\right]X^{1/2}\succeq0$, i.e. iff $(1-\lambda)I+\lambda P^{-1}-((1-\lambda)I+\lambda P)^{-1}\succeq0$. – user1551 May 06 '14 at 13:48
  • "and hence we may assume that it is a diagonal matrix" @user1551 - do you mind explaining that in a little bit more detail? I do not follow you there. thank you. – makansij Sep 10 '17 at 01:57
  • @Sother Just perform a change of basis. – user1551 Sep 10 '17 at 06:30
  • like this? https://math.stackexchange.com/questions/1203362/change-basis-so-that-a-positive-definite-matrix-a-is-now-seen-as-i – makansij Sep 10 '17 at 18:04
  • 2
    @Sother No, I meant a unitary diagonalisation. Do you know that every Hermitian matrix (including the PSD ones) can be unitarily diagonalised? – user1551 Sep 11 '17 at 07:02
2

Let $ S^{n}_{++} $ be the set of $ n \times n $ positive definite matrices, and $ S^{n}_{+} $ be the set of $ n \times n $ positive semi-definite matrices.

$ \forall X, Y \in S^{n}_{++}, \theta \in [0, 1], $

$$ \left(\begin{array}{cc} X^{-1} & I \\ I & X \end{array}\right), \left(\begin{array}{cc} Y^{-1} & I \\ I & Y \end{array}\right) \in S^{2n}_{+}, $$

Therefore, $$ M = \left(\begin{array}{2} \theta X^{-1} + (1 - \theta)Y^{-1} & I \\ I & \theta X + (1 - \theta)Y \end{array}\right) \in S^{2n}_{+}. $$

Let $ S = (\theta X + (1 - \theta)Y) - (\theta X^{-1} + (1 - \theta)Y^{-1})^{-1} $, which is the Schur complement of $ \theta X^{-1} + (1 - \theta)Y^{-1} $ in $ M $.

Since $$ \theta X^{-1} + (1 - \theta)Y^{-1} \in S^{n}_{++}, $$ $$ M \in S^{2n}_{+}, $$ we have $ S \in S^{n}_{+}, $ or $$ (\theta X + (1 - \theta)Y) - (\theta X^{-1} + (1 - \theta)Y^{-1})^{-1} \succeq O. $$

This step follows https://math.stackexchange.com/a/2835328/1459075.

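Since both sides are positive definite and matrix inversion reverses the order $\succeq$ on positive definite matrices, it follows that $$ \theta X^{-1} + (1 - \theta)Y^{-1} \succeq (\theta X + (1 - \theta)Y)^{-1} $$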

Inspired by Positive Definite Matrices (Rajendra Bhatia).

  • I think one of the $I$'s in each matrix has to be negative for the matrices to be psd. How does the last step follow? – ConnFus Nov 02 '24 at 17:01
  • @ConnFus No. The last step is Theorem 1.3.3 in the book mentioned in the answer. I will update the proof after finishing my exam. – snylonue Nov 06 '24 at 13:52
  • If we take $1\times 1$ blocks and make the matrix $\left(\begin{array}{cc} X^{-1} & I \\ I & X \end{array}\right) := \pmatrix{x&1\\1&\frac1x}$ for some $x>0$, then we have $\det\pmatrix{x&1\\1&\frac1x}=0$, so it's only semi-definite – ConnFus Nov 06 '24 at 16:32
  • 1
    @ConnFus I have fixed the problem and added more details about last step. – snylonue Dec 22 '24 at 06:05
0

Edit: The proof is wrong, as pointed out by @mewmew. I am trying to fix it.

Let $\lambda\in [0,1]$. Let, $$A=(1-\lambda){X}^{-1}+\lambda {Y}^{-1}\\ B=\left((1-\lambda){X}+\lambda Y\right)$$

Then, we note that $A$ and $B$ are positive definite since $X,\ Y$ are positive definite. Hence $X,Y,A,B$ are invertible.

Then \begin{align} AB-I=&\left((1-\lambda){X}^{-1}+\lambda {Y}^{-1}\right)\left((1-\lambda){X}+\lambda Y\right)-I\\ =& \left(1-\lambda\right)^2I+\lambda(1-\lambda)\left(X^{-1}Y+Y^{-1}X\right)+\lambda^2I-I\\ =& 2\lambda(\lambda-1)I+\lambda(1-\lambda)\left(X^{-1}Y+Y^{-1}X\right)\\ =& \lambda(1-\lambda)\left(-X^{-1}X-Y^{-1}Y+X^{-1}Y+Y^{-1}X\right)\\ =& \lambda(1-\lambda)(Y^{-1}-X^{-1})(X-Y) \end{align}

So, for any vector $u\ne 0$, \begin{align} u^T(B^TAB-B^T)u=& u^TB^T(AB-I)u=\lambda(1-\lambda)u^TB^T(Y^{-1}-X^{-1})(X-Y)u\\ =&\lambda(1-\lambda)u^T((1-\lambda)X^T+\lambda Y^T)(Y^{-1}-X^{-1})(X-Y)u \end{align}

Now, for any vector $v\ne 0$ there exists a vector $u\ne 0$ such that $Bu=v$. Hence, for any vector $v\ne 0$, \begin{align} v^T(A-B^{-1})v=& (Bu)^T(A-B^{-1})Bu\\ =& u^T(B^TAB-B^T)u\ge 0 \end{align}

Hence $$A\ge B^{-1}\\ \Rightarrow (1-\lambda)X^{-1}+\lambda Y^{-1}\ge \left((1-\lambda){X}+\lambda Y\right)^{-1}\quad \forall \lambda\in [0,1]\hspace{0.6cm}\Box$$