Coordinate-free proof that dual Minkowski norm is indeed a Minkowski norm on the dual space

Question

Context: let's say that a Minkowski norm on a vector space $V$ is a map $F\colon V\to \Bbb R_{\geq 0}$ such that $F$ is smooth on $V\setminus \{0\}$, $F$ is positive-homogeneous of degree $1$, and for every $x\in V\setminus \{0\}$, the symmetric bilinear form $g_x$ on $V$ defined by $$g_x(v,w) = \frac{1}{2} \frac{\partial^2}{\partial t\partial s}\bigg|_{t=s=0}F^2(x+tv+sw)= \frac{1}{2} D^2(F^2)(x)(v,w)$$is positive-definite. Now define $F^*\colon V^* \to \Bbb R_{\geq 0}$ by $F^*(\xi) =\max \{\xi(x) \mid F(x)=1\}$.

I want to understand the proof that $F^*$ is a Minkowski norm, in the above sense, on $V^*$. Everything is clear except the positivity condition.

I am following the notes by Matias Dahl (clearly based on the book by Zhongmin Shen). He does a heavy coordinate computation on page 7 that, as I see it, hides the actual idea behind the proof. This is Lemma 3.1.2 in Shen's book. So, I want a proof without coordinates, but I'm struggling to translate what he did there.

We are free to use the Legendre transform $\ell\colon V\to V^*$ given by $\ell(x) = g_x(x,\cdot)$ and $\ell(0) = 0$, and its properties --- in particular that $F = F^*\circ \ell$.

From $F^2 = (F^*)^2 \circ \ell$, we have that $$D(F^2)(x)(v) = D((F^*)^2)(\ell(x)) \circ D\ell(x)v,\tag{$1$}$$and thus $$D^2(F^2)(x)(v,w) = D^2((F^*)^2)(\ell(x))(D\ell(x)v,D\ell(x)w) + D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)),\tag{$2$}$$so the proof is concluded once we establish that $$D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)) = 0,\tag{$3$}$$as it will follow that $g = \ell^*(g^*)$, where $g$ is the assignment $x\mapsto g_x$ on $V\setminus \{0\}$ and $g^*$ is the assignment $\xi \mapsto g^*_\xi$ on $V^*\setminus \{0\}$ induced by $F^*$. So each $g_x$ being positive-definite implies each $g^*_{\ell(x)}$ positive-definite as well, and we're done as $\ell$ is surjective.

I cannot verify that $(3)$ is true. From Dahl's coordinate computation, the homogeneity relation $g_{\lambda x} = g_x$ (for $\lambda > 0$) must enter. This implies that $\ell(\lambda x) = \lambda\ell(x)$ (for $\lambda>0$), and (1) reads $g_x(x,v) = g^*_{\ell(x)}(\ell(x), D\ell(x)v)$, while $(D\ell(x)v)w = g_x(v,w)$, but we only have that $(D^2\ell)(x)(x,\cdot) = 0$ and I don't see how this is enough.

score 1 · Answer 1 · answered Nov 05 '21 at 01:38

I have cooked up a proof. Differentiating the relation $(1/2)F^2=(1/2)(F^*)^2\circ \ell$ at $x$ once (evaluating at $v$) and twice (evaluating at $v$ and $w$), and using Euler's relation for the degree-$2$ homogeneous functions $F^2$ and $(F^*)^2$ in the first line, we have that \begin{align*} g_x(x,v) &= g_{\ell(x)}^*(\ell(x),D\ell(x)v), \\ g_x(v,w) &= \ell^*(g^*)_x(v,w)+ \frac{1}{2} D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)). \end{align*}It remains to show that the error term in the second line vanishes. First, let $z\in V$ be the vector such that $g_x(z,\cdot) = (D^2\ell)(x)(v,w)$; it exists and is unique because $g_x$ is non-degenerate. Since $D\ell(x)z = g_x(z,\cdot)$, this says that $D\ell(x)z = (D^2\ell)(x)(v,w)$. With this in place, the above relations give that \begin{align*} \frac{1}{2} D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)) &= g_{\ell(x)}^*(\ell(x),(D^2\ell)(x)(v,w) ) \\ &= g_{\ell(x)}^*(\ell(x),D\ell(x)z) \\ &= g_x(x,z) \\ &= [(D^2\ell)(x)(v,w)]x. \end{align*} As for why we have $[(D^2\ell)(x)(v,w)]x=0$, we differentiate the identity $(D\ell(x)w)x = \ell(x)w$ (both sides equal $g_x(x,w)$) at the point $x$, in the direction of $v$, to obtain $$ [(D^2\ell)(x)(v,w)]x + (D\ell(x)w)v = (D\ell(x)v)w. $$Since $(D\ell(x)w)v = (D\ell(x)v)w$ (as $g_x$ is symmetric), we are done.
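For readers who want the homogeneity facts used in this argument spelled out, here is a short sketch. The shorthand $\phi^* = \frac{1}{2}(F^*)^2$ and the letters $\xi,\eta\in V^*$ are notation introduced only for this note. Since $\phi^*$ is positive-homogeneous of degree $2$, its differential is positive-homogeneous of degree $1$, so differentiating $D\phi^*(\lambda\xi)\eta = \lambda\, D\phi^*(\xi)\eta$ with respect to $\lambda$ at $\lambda = 1$ should give $$ D^2\phi^*(\xi)(\xi,\eta) = D\phi^*(\xi)\eta, \qquad\text{that is,}\qquad g^*_\xi(\xi,\eta) = \frac{1}{2}D((F^*)^2)(\xi)\eta. $$ The same computation with $F$ in place of $F^*$ gives $g_x(x,v) = \frac{1}{2}D(F^2)(x)v$. Finally, the identity differentiated in the last step follows from the symmetry of $g_x$ and the definition of $\ell$: $$ (D\ell(x)w)x = g_x(w,x) = g_x(x,w) = \ell(x)w. $$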

Deane · Accepted Answer · 2021-11-05T05:07:26.543

Here's how I think of this:

$\newcommand{\R}{\mathbb{R}}$ First, if $F: V \rightarrow \R$ is a convex homogeneous function positive away from the origin, then the function $F^*: V^* \rightarrow \R$, defined to be $$ F^*(\xi) = \sup \left\{ \frac{\langle \xi,x\rangle}{F(x)} \;:\; x \in V\setminus\{0\} \right\}, $$ is also convex and homogeneous, because it is the sup of linear functions.
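The convexity claim can be checked in one line. As a sketch, take $\xi,\eta\in V^*$, $t\in[0,1]$ and any $x\neq 0$ (symbols chosen here only for illustration). Then $$ \frac{\langle t\xi + (1-t)\eta,\, x\rangle}{F(x)} = t\,\frac{\langle \xi,x\rangle}{F(x)} + (1-t)\,\frac{\langle \eta,x\rangle}{F(x)} \leq t\,F^*(\xi) + (1-t)\,F^*(\eta), $$ and taking the supremum over $x$ on the left-hand side gives $F^*(t\xi+(1-t)\eta)\leq t\,F^*(\xi)+(1-t)\,F^*(\eta)$. Homogeneity is similar: for $\lambda > 0$ the factor $\lambda$ comes out of every quotient, so $F^*(\lambda\xi) = \lambda F^*(\xi)$.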

Now assume that $F$ is $C^2$ and let $\phi = \frac{1}{2}F^2$ and $\phi^* = \frac{1}{2}(F^*)^2$. Their differentials are maps \begin{align*} D\phi = F\,DF: V &\rightarrow V^*\\ D\phi^* = F^*\,DF^*: V^* &\rightarrow V. \end{align*} The Hessian $D^2\phi = F\,D^2F + DF\otimes DF$ is a symmetric $2$-tensor-valued function on $V$ that is homogeneous of degree $0$. Let $g$ be the restriction of this tensor to the unit sphere, $F = 1$. Then $g$ is a Riemannian metric on the sphere if and only if $D^2\phi$ is always positive definite.

If we assume that the Hessian $D^2\phi$ is always positive definite away from the origin, then $D\phi$ and $D\phi^*$ are homogeneous diffeomorphisms away from the origin. Moreover, you can show that the maps $D\phi$ and $D\phi^*$ are inverses of each other. By the chain rule, this implies that, if $\xi = D\phi(v)$, then $$ D^2\phi^*(\xi) = (D^2\phi(v))^{-1}. $$ In particular, $D^2\phi^*$ is always positive definite and therefore its restriction to the tangent space of the unit sphere $F^* = 1$ is positive definite.
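Here is a sketch of the last two steps. The names $H = D^2\phi(v)\colon V\to V^*$ and $K = D^2\phi^*(\xi)\colon V^*\to V$ are introduced only for this note. Differentiating $D\phi^*\circ D\phi = \mathrm{id}_V$ at $v$ gives $$ K\circ H = \mathrm{id}_V, $$ so $K = H^{-1}$. For positivity, any nonzero $\eta\in V^*$ can be written as $\eta = Hu$ with $u\neq 0$, and then $$ \langle \eta, K\eta\rangle = \langle Hu, u\rangle = D^2\phi(v)(u,u) > 0. $$ As a sanity check, consider the Euclidean case (an illustrative example, not taken from the survey): $V=\mathbb{R}^n$ and $F(x)=\sqrt{x^{T}Ax}$ with $A$ symmetric positive definite. Then \begin{align*} \phi(x) &= \tfrac{1}{2}x^{T}Ax, & D\phi(x) &= Ax, & D^2\phi &= A,\\ \phi^*(\xi) &= \tfrac{1}{2}\xi^{T}A^{-1}\xi, & D\phi^*(\xi) &= A^{-1}\xi, & D^2\phi^* &= A^{-1}, \end{align*} where $F^*(\xi)=\sqrt{\xi^{T}A^{-1}\xi}$ follows from the Cauchy–Schwarz inequality for the inner product defined by $A$. The maps $D\phi$ and $D\phi^*$ are visibly inverse to each other, and $D^2\phi^*$ is the inverse of $D^2\phi$, as expected.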

You can find this all spelled out in a little survey paper I wrote: https://www.math.nyu.edu/~yangd/papers/affine_survey.pdf

Thank you Deane! This is a very nice and reader-friendly write-up. For whoever sees this, the part of the survey addressing the issue here is Section 5.6. — Ivo Terek, Nov 05 '21 at 14:43
