
I understand that in projected gradient descent, I need to project my (intermediate) solution onto the feasible set with the shortest distance, as in *What is the difference between projected gradient descent and ordinary gradient descent?*:

$$ \min_x f(x) \text{ subject to } x \in C $$

$$ y_{k+1} = x_k - t_k \nabla f(x_k) \\ x_{k+1} = \arg\min_{x \in C} \|y_{k+1}-x\| $$
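
For concreteness, a minimal sketch of this iteration in Python, where `grad_f` and `project_C` are hypothetical callables (not from any particular library) for the gradient and the projection:

```python
def projected_gradient_descent(grad_f, project_C, x0, step=1e-2, iters=500):
    """Sketch of projected gradient descent.

    grad_f(x)    : returns the gradient of f at x (assumed callable)
    project_C(y) : returns the Euclidean projection of y onto C (assumed callable)
    """
    x = x0
    for _ in range(iters):
        y = x - step * grad_f(x)  # y_{k+1} = x_k - t_k * grad f(x_k)
        x = project_C(y)          # x_{k+1} = argmin_{x in C} ||y_{k+1} - x||
    return x
```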

So in my specific case, my optimization problem is $\min_X f(X) = \operatorname{tr}\left((I-X)A(I-X)^T + XBX^T\right)$ subject to $l \leq Xt \leq u$, where $A, B$ are positive semidefinite matrices, $X$ is an $n \times m$ matrix, $l, u$ are known $n \times 1$ vectors, and $t$ is a known $m \times 1$ vector.
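
For the gradient step, since $A$ and $B$ are symmetric (they are positive semidefinite) and assuming the dimensions make $I - X$ well defined, standard matrix calculus gives

$$ \nabla f(X) = -2(I-X)A + 2XB = 2\left(X(A+B) - A\right). $$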

How do I project a given $X$ onto the above feasible set?

Basically, in the projection step I want to do something like $\min \|X - Y\|^2$ s.t. $Xt \leq b$, but this seems a bit messy, as in I don't know how to exactly define $\|X - Y\|^2$ when $X, Y$ are matrices.

[EDIT] If it's a problem with $x, y, a$ as vectors and $b$ as a scalar, such as $\min_{x} \| x - y \|^{2}$ subject to ${a}^{T} x \leq b$, then I can find

$$ x = \begin{cases} y & \text{ if } \; {a}^{T} y \leq b \\ y - \frac{{a}^{T} y - b}{ {\left\| a \right\|}_{2}^{2} } a & \text{ if } \; {a}^{T} y > b \end{cases} $$
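
In code, this closed form is a one-liner; a minimal NumPy sketch:

```python
import numpy as np

def project_halfspace(y, a, b):
    """Project y onto the half space {x : a^T x <= b} (closed form above)."""
    r = a @ y - b
    if r <= 0:
        return y  # y is already feasible
    return y - (r / (a @ a)) * a  # step back along a by the violation

# Example: project y = [2, 2] onto {x : x1 + x2 <= 1}
# project_halfspace(np.array([2., 2.]), np.array([1., 1.]), 1.0) -> [0.5, 0.5]
```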


1 Answer


You may use the trick in Orthogonal Projection onto Half Space of Matrices: with $\boldsymbol{x} = \operatorname{vec} \left( \boldsymbol{X} \right)$ and $\boldsymbol{T} = \boldsymbol{t}^{T} \otimes \boldsymbol{I}_{n}$ one has $\boldsymbol{T} \boldsymbol{x} = \boldsymbol{X} \boldsymbol{t}$, so the constraint becomes the equivalent problem:

$$ \boldsymbol{l} \leq \boldsymbol{T} \boldsymbol{x} \leq \boldsymbol{u} $$

Then you may reformulate it as:

$$ \begin{bmatrix} \phantom{-} \boldsymbol{T} \\ - \boldsymbol{T} \end{bmatrix} \boldsymbol{x} \leq \begin{bmatrix} \phantom{-} \boldsymbol{u} \\ - \boldsymbol{l} \end{bmatrix} $$

This is equivalent to the problem in Orthogonal Projection onto a Polyhedron (Matrix Inequality).
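
For illustration, a minimal sketch of the full projection step, using CVXPY as a stand-in QP solver (an assumption, not part of the linked posts; the polyhedron-projection post gives a solver-free route):

```python
import numpy as np
import cvxpy as cp

def project_onto_feasible_set(Y, t, l, u):
    """Project Y (n x m) onto {X : l <= X t <= u} in the Frobenius norm.

    Uses vec(X t) = (t^T kron I_n) vec(X), so with x = vec(X) and
    T = t^T kron I_n the constraint becomes l <= T x <= u.
    """
    n, m = Y.shape
    T = np.kron(t.reshape(1, -1), np.eye(n))  # T is n x (n*m)
    y = Y.flatten(order="F")                  # column-major vec(Y)

    x = cp.Variable(n * m)
    constraints = [T @ x <= u, -T @ x <= -l]  # stacked form from above
    cp.Problem(cp.Minimize(cp.sum_squares(x - y)), constraints).solve()
    return x.value.reshape((n, m), order="F")
```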

Royi
  • Thanks Royi, can you see the edit to my problem body? I can solve the simpler vector case, but not the messy matrix case. – Taylor Fang Sep 28 '24 at 13:21
  • @TaylorFang, if you look at the link I provided, you'd see this is the subproblem to solve. See Orthogonal Projection onto a Half Space. – Royi Sep 28 '24 at 14:07
  • @TaylorFang, If you share the data / function, I would be able to show you in code. – Royi Sep 28 '24 at 14:08
  • Yes, I see your link, but that's the vector/scalar case, which I can solve. I don't know how to solve the matrix case. My objective function is $(I-X)A(I-X)^{T} + X B X^T$, but I don't think it matters? It's just a matter of projecting onto the constraint feasible set; the objective function shouldn't matter? – Taylor Fang Sep 28 '24 at 14:19
  • @TaylorFang, I don't understand. Are you after $\boldsymbol{X}$ or $\boldsymbol{t}$? – Royi Sep 28 '24 at 15:12
  • @TaylorFang, Just write your problem completely. I will give a full solution. – Royi Sep 28 '24 at 15:13
  • I am after $X$, but $t$ is a known vector as a part of the constraint. The constraint is $l \leq Xt \leq u$ – Taylor Fang Sep 28 '24 at 15:13
  • I edited my question body. My optimization problem is $f(X) = (I-X)A(I-X)^T + XBX^T$ subject to $l \leq Xt \leq u$. $X$ is an $n$ by $m$ matrix, $l, u$ are known $n$ by $1$ vectors and $t$ is a known $m$ by $1$ vector. Thanks a lot for your help. – Taylor Fang Sep 28 '24 at 15:16
  • @TaylorFang, The function you describe does not yield a scalar. Maybe you missed the Trace operator? – Royi Sep 28 '24 at 15:20
  • Yes, sorry, there is a trace. But if I understand correctly, the projected gradient descent method just cares about projecting onto the constrained feasible set, not the objective function. – Taylor Fang Sep 28 '24 at 15:20
  • @TaylorFang, I have updated the answer. You have a link to a trick you should employ I wrote for you. – Royi Sep 28 '24 at 16:07
  • Beautiful, thank you. – Taylor Fang Sep 28 '24 at 19:18