Context: let's say that a Minkowski norm on a vector space $V$ is a map $F\colon V\to \Bbb R_{\geq 0}$ such that $F$ is smooth on $V\setminus \{0\}$, $F$ is positive-homogeneous of degree $1$, and for every $x\in V\setminus \{0\}$, the symmetric bilinear form $g_x$ on $V$ defined by $$g_x(v,w) = \frac{1}{2} \frac{\partial^2}{\partial t\partial s}\bigg|_{t=s=0}F^2(x+tv+sw)= \frac{1}{2} D^2(F^2)(x)(v,w)$$is positive-definite. Now define $F^*\colon V^* \to \Bbb R_{\geq 0}$ by $F^*(\xi) =\max \{\xi(x) \mid F(x)=1\}$.
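As a sanity check on the definition (my own illustration, not taken from the notes): suppose $V$ carries an inner product $\langle\cdot,\cdot\rangle$ and take $F(x) = \sqrt{\langle x,x\rangle}$. Then $$\begin{aligned} F^2(x+tv+sw) &= \langle x,x\rangle + 2t\langle x,v\rangle + 2s\langle x,w\rangle + 2ts\langle v,w\rangle + t^2\langle v,v\rangle + s^2\langle w,w\rangle,\\ g_x(v,w) &= \frac{1}{2}\cdot 2\langle v,w\rangle = \langle v,w\rangle, \end{aligned}$$so $g_x$ is the inner product itself for every $x\neq 0$, and $F^*(\xi) = \max\{\xi(x) \mid \langle x,x\rangle = 1\}$ is the usual operator norm of $\xi$.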
I want to understand the proof that $F^*$ is a Minkowski norm, in the above sense, on $V^*$. Everything is clear except the positivity condition.
I am following the notes by Matias Dahl (clearly based on the book by Zhongmin Shen). He does a heavy coordinate computation on page 7 that, as I see it, hides the actual idea behind the proof. This is Lemma 3.1.2 in Shen's book. So I want a coordinate-free proof, but I'm struggling to translate what he did there.
We are free to use the Legendre transform $\ell\colon V\to V^*$ given by $\ell(x) = g_x(x,\cdot)$ and $\ell(0) = 0$, and its properties, in particular that $F = F^*\circ \ell$.
From $F^2 = (F^*)^2 \circ \ell$, we have that $$D(F^2)(x)(v) = D((F^*)^2)(\ell(x))(D\ell(x)v),\tag{$1$}$$and thus $$D^2(F^2)(x)(v,w) = D^2((F^*)^2)(\ell(x))(D\ell(x)v,D\ell(x)w) + D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)),\tag{$2$}$$so the proof is concluded once we establish that $$D((F^*)^2)(\ell(x))((D^2\ell)(x)(v,w)) = 0,\tag{$3$}$$as it will follow that $g = \ell^*(g^*)$, where $g$ is the assignment $x\mapsto g_x$ on $V\setminus \{0\}$ and $g^*$ is the assignment $\xi \mapsto g^*_\xi$ on $V^*\setminus \{0\}$ induced by $F^*$. Since each $D\ell(x)$ is a linear isomorphism (because $(D\ell(x)v)w = g_x(v,w)$ and $g_x$ is nondegenerate), each $g_x$ being positive-definite implies that each $g^*_{\ell(x)}$ is positive-definite as well, and we're done as $\ell$ maps $V\setminus \{0\}$ onto $V^*\setminus \{0\}$.
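For reference, here is how I get $(2)$ from $(1)$, writing $\Phi = (F^*)^2$ as a shorthand of my own. Differentiate $(1)$ at $x$ in the direction $w$, using the chain rule on the first factor and the product rule for the evaluation pairing: $$\begin{aligned} D^2(F^2)(x)(v,w) &= \frac{d}{ds}\bigg|_{s=0} D\Phi(\ell(x+sw))\big(D\ell(x+sw)v\big)\\ &= D^2\Phi(\ell(x))\big(D\ell(x)w, D\ell(x)v\big) + D\Phi(\ell(x))\big((D^2\ell)(x)(v,w)\big). \end{aligned}$$If $(3)$ holds, the last term drops out, and multiplying by $\frac{1}{2}$ gives $g_x(v,w) = g^*_{\ell(x)}(D\ell(x)v, D\ell(x)w)$, which is what I mean by $g = \ell^*(g^*)$.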
I cannot verify that $(3)$ is true. Judging from Dahl's coordinate computation, the homogeneity relation $g_{\lambda x} = g_x$ (for $\lambda > 0$) must enter. It implies that $\ell(\lambda x) = \lambda\ell(x)$ (for $\lambda>0$). Moreover, after cancelling a factor of $2$, $(1)$ reads $g_x(x,v) = g^*_{\ell(x)}(\ell(x), D\ell(x)v)$, and $(D\ell(x)v)w = g_x(v,w)$. But from homogeneity I only get $(D^2\ell)(x)(x,\cdot) = 0$, and I don't see how this is enough.
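To make explicit where these identities come from, here is a sketch that uses only the degree-$1$ homogeneity of $F$ and the definitions above. Differentiating $F^2(\lambda x) = \lambda^2F^2(x)$ in $x$ once and twice gives $$\begin{aligned} \lambda\, D(F^2)(\lambda x)(v) &= \lambda^2 D(F^2)(x)(v),\\ \lambda^2 D^2(F^2)(\lambda x)(v,w) &= \lambda^2 D^2(F^2)(x)(v,w), \end{aligned}$$so $D(F^2)$ is homogeneous of degree $1$ and $g_{\lambda x} = g_x$. Hence $$\ell(\lambda x) = g_{\lambda x}(\lambda x,\cdot) = \lambda\, g_x(x,\cdot) = \lambda\,\ell(x).$$Differentiating $D(F^2)(\lambda x)(v) = \lambda D(F^2)(x)(v)$ in $\lambda$ at $\lambda = 1$ gives $D^2(F^2)(x)(x,v) = D(F^2)(x)(v)$, that is, $$\ell(x) = \frac{1}{2} D(F^2)(x), \qquad\text{and so}\qquad D\ell(x)v = \frac{1}{2} D^2(F^2)(x)(v,\cdot) = g_x(v,\cdot).$$Finally, differentiating $\ell(\lambda x) = \lambda\ell(x)$ in $x$ gives $D\ell(\lambda x) = D\ell(x)$, and differentiating this in $\lambda$ at $\lambda = 1$ gives $(D^2\ell)(x)(x,v) = 0$ for all $v$.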