
Consider using Lagrange multipliers to solve the following constrained optimization problem.

Let $\boldsymbol{y} \in \mathbb{R}^{n}$ be a column vector, let $X \in \mathbb{R}^{n \times d}$ be an $n \times d$ matrix such that $X X^{⊤}$ is invertible, and let $\boldsymbol{w}=\left(w_{1}, w_{2}, \ldots, w_{d}\right)^{⊤} \in \mathbb{R}^{d}$ be a $d$-dimensional column vector. The problem is

$$ \min _{\boldsymbol{w}} \frac{1}{2}\|\boldsymbol{w}\|^{2} \text { subject to } \boldsymbol{y}=X \boldsymbol{w} $$

where $\|\boldsymbol{w}\|=\sqrt{w_{1}^{2}+w_{2}^{2}+\ldots+w_{d}^{2}}$. The Lagrange function is given by

$$ L(\boldsymbol{w}, \boldsymbol{\mu})=\frac{1}{2}\|\boldsymbol{w}\|^{2}+\boldsymbol{\mu}^{⊤}(\boldsymbol{y}-X \boldsymbol{w}) $$

where $\boldsymbol{\mu} \in \mathbb{R}^{n}$ is the vector of Lagrange multipliers.

Suppose we want to express the stationary point of $L(\boldsymbol{w}, \boldsymbol{\mu})$ in the form $\boldsymbol{w}=A \boldsymbol{y}$ and $\boldsymbol{\mu}=B \boldsymbol{y}$. How can the matrices $A \in \mathbb{R}^{d \times n}$ and $B \in \mathbb{R}^{n \times n}$ be expressed using only $X$?


I only got as far as substituting these forms into $L$:

$$ \begin{aligned} L(\boldsymbol{w}, \boldsymbol{\mu}) =& \frac{1}{2}\|\boldsymbol{w}\|^{2}+\boldsymbol{\mu}^{⊤}(\boldsymbol{y}-X \boldsymbol{w})\\ =&\frac{1}{2}\|A \boldsymbol{y}\|^{2}+(B \boldsymbol{y})^{⊤}(\boldsymbol{y}-X A \boldsymbol{y}) \end{aligned} $$

and then I got stuck.

AsukaMinato

1 Answer


The stationary points of $L(w,\mu)$ are the solutions of the system obtained by setting both gradients to zero: $$\frac{\partial L}{\partial w}=w-X^T\mu=0,\qquad \frac{\partial L}{\partial \mu}=y-Xw=0.$$
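As a sketch of where these two equations come from, one can write $L$ out in components (using the non-bold notation of this answer, with $X_{ij}$ denoting the entries of $X$) and differentiate term by term:

$$ \begin{aligned} L(w,\mu) &= \frac{1}{2}\sum_{j=1}^{d} w_j^{2}+\sum_{i=1}^{n}\mu_i\Big(y_i-\sum_{j=1}^{d}X_{ij}w_j\Big),\\ \frac{\partial L}{\partial w_k} &= w_k-\sum_{i=1}^{n}X_{ik}\mu_i=\left(w-X^T\mu\right)_k,\\ \frac{\partial L}{\partial \mu_i} &= y_i-\sum_{j=1}^{d}X_{ij}w_j=\left(y-Xw\right)_i. \end{aligned} $$

Stacking the components over $k=1,\ldots,d$ and $i=1,\ldots,n$ recovers the two vector equations.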

Multiplying $w-X^T\mu=0$ on the left by $X$ and substituting $Xw=y$ gives $$X(w-X^T\mu)=0\Longrightarrow y=XX^T\mu.$$ Since $XX^T$ is invertible, $\mu=By$ with $B=(XX^T)^{-1}$.

Substituting this back into $w=X^T\mu$ gives $$w=X^TBy,$$ so $w=Ay$ with $A=X^TB=X^T(XX^T)^{-1}$.
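If you want to sanity-check these formulas numerically, here is a minimal NumPy sketch. The sizes `n`, `d`, the random seed, and the random test data are assumptions made for illustration only; `n <= d` is chosen so that $XX^T$ can be invertible. It builds $A$ and $B$ from $X$, then checks both stationarity conditions.

```python
import numpy as np

# Hypothetical test data: a random X with n <= d has full row rank
# almost surely, so X X^T is invertible.
rng = np.random.default_rng(0)
n, d = 3, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Matrices from the derivation: B = (X X^T)^{-1}, A = X^T B.
B = np.linalg.inv(X @ X.T)
A = X.T @ B

# Candidate stationary point.
w = A @ y
mu = B @ y

# Both gradients of L should vanish (up to rounding).
grad_w = w - X.T @ mu   # dL/dw
grad_mu = y - X @ w     # dL/dmu

print("dL/dw  is zero:", np.allclose(grad_w, 0))
print("dL/dmu is zero:", np.allclose(grad_mu, 0))

# For full-row-rank X, A coincides with the Moore-Penrose pseudoinverse.
print("A equals pinv(X):", np.allclose(A, np.linalg.pinv(X)))
```

The explicit inverse is used here only to mirror the formulas; in practice `np.linalg.solve(X @ X.T, y)` would give $\mu$ without forming $B$.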

Note: If you need help with the partial derivatives of $L(w,\mu)$, please see "Differentiate matrix expression". You can find more by searching for "$\langle f(w),g(w)\rangle$ derivative" on SearchOnMath.