I'd like to perform constrained gradient descent in a high-dimensional space. My plan is to compute the gradient in the high-dimensional space and then project it onto a lower-dimensional subspace that satisfies multiple constraints. I don't know how to express that lower-dimensional subspace mathematically, or how to project a higher-dimensional vector onto it.
The objective is to have a matrix of $\gamma$ values that are as close to one as possible.
$$ \begin{bmatrix} \gamma_a & \gamma_b \\ \gamma_c & \gamma_d \end{bmatrix} $$
The $\gamma$ values are non-negative weights that are multiplied by non-negative $n$ values. The constraints act upon the products of the $\gamma$ and $n$ values.
$$ \begin{bmatrix} \gamma_a & \gamma_b \\ \gamma_c & \gamma_d \end{bmatrix} \odot \begin{bmatrix} n_a & n_b \\ n_c & n_d \end{bmatrix} = \begin{bmatrix} \gamma_a n_a & \gamma_b n_b \\ \gamma_c n_c & \gamma_d n_d \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} $$
The constraints are that the two rows of the $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ matrix sum to the same value, and likewise the two columns:
$$ a + b = c + d $$ $$ a + c = b + d $$
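In code, I'd check the constraints on the product matrix like this (numpy; the function name is just illustrative):

```python
import numpy as np

def satisfies_constraints(prod, tol=1e-9):
    """True if both rows of `prod` sum to the same value and both columns do too."""
    row_sums = prod.sum(axis=1)  # [a + b, c + d]
    col_sums = prod.sum(axis=0)  # [a + c, b + d]
    return np.isclose(row_sums[0], row_sums[1], atol=tol) and \
           np.isclose(col_sums[0], col_sums[1], atol=tol)
```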
Since $\gamma$ is a weight I'd like the objective function to score $2$ and $\frac{1}{2}$ equivalently (and $3$ and $\frac{1}{3}$ equivalently), as if they're 'equally far from 1'. I've come up with the following objective function to optimize: $$ f = \sum_{i \in \{a, b, c, d\}} \log(\gamma_i)^2 $$
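In numpy the objective is a one-liner, and the symmetry between $x$ and $\frac{1}{x}$ is easy to verify:

```python
import numpy as np

def objective(gamma):
    """Sum of squared logs: log(x)^2 == log(1/x)^2, so x and 1/x score identically."""
    return np.sum(np.log(gamma) ** 2)

assert np.isclose(objective(np.array([2.0])), objective(np.array([0.5])))
```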
Let's say - for example's sake - that the values of $n$ are as follows: $$ \begin{bmatrix} n_a & n_b \\ n_c & n_d \end{bmatrix} = \begin{bmatrix} 1 & 60 \\ 99 & 40 \end{bmatrix} $$
How do I figure out the optimal values for $ \begin{bmatrix} \gamma_a & \gamma_b \\ \gamma_c & \gamma_d \end{bmatrix} $?
I've tried using Lagrange multipliers for this, but it requires solving a system of equations with logarithms and reciprocals in it, which breaks my automatic equation solver: sympy throws a NotImplementedError when solving the system for me. I'm looking to extend this idea to many dimensions, so I need an automatic solver; solving the system by hand is not an option.
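For reference, my Lagrange multiplier attempt looks roughly like this sketch (using the example $n$ values above; the final `solve` call is what raises the NotImplementedError):

```python
import sympy as sp

# gamma values (positive) and Lagrange multipliers for the two constraints
g_a, g_b, g_c, g_d = sp.symbols('g_a g_b g_c g_d', positive=True)
l1, l2 = sp.symbols('l1 l2')

# example n values
n_a, n_b, n_c, n_d = 1, 60, 99, 40

# objective and the two constraints on the products
f = sum(sp.log(g)**2 for g in (g_a, g_b, g_c, g_d))
c1 = g_a*n_a + g_b*n_b - (g_c*n_c + g_d*n_d)  # a + b = c + d
c2 = g_a*n_a + g_c*n_c - (g_b*n_b + g_d*n_d)  # a + c = b + d

# stationarity of the Lagrangian plus the constraint equations
L = f - l1*c1 - l2*c2
eqs = [sp.diff(L, v) for v in (g_a, g_b, g_c, g_d)] + [c1, c2]

# this is the call that fails for me
sp.solve(eqs, (g_a, g_b, g_c, g_d, l1, l2), dict=True)
```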
The derivative of the objective function is: $$ \frac{\partial f}{\partial \gamma_i} = 2 \log(\gamma_i)\frac{1}{\gamma_i} $$
I'm also fine with redefining the objective function (so long as $x$ and $\frac{1}{x}$ receive equal scores):
$$ f = \sum_{i \in \{a, b, c, d\}} \left( (1 - \gamma_i)^2 + \left(1 - \frac{1}{\gamma_i}\right)^2 \right) $$
But this didn't help me with the Lagrange Multiplier approach.
Back to constrained gradient descent. I'm guessing that this is a convex problem, which would mean that gradient descent would converge to the global optimum.
I got the idea of constrained gradient descent from this post: Gradient descent with constraints
The idea is to project the high-dimensional gradient onto the subspace that meets the constraints. In their case the projection is onto a sphere, which can be done by normalizing the vector. Their solution doesn't fit my needs, because I don't know how to formulate and perform the projection onto a lower-dimensional subspace that satisfies my constraints.
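For concreteness, this is the skeleton of what I'm trying to write; `project_to_constraints` is the piece I don't know how to formulate (the name is just a placeholder):

```python
import numpy as np

n = np.array([[1.0, 60.0], [99.0, 40.0]])
gamma = np.ones_like(n)  # starting point (it would also need to be made feasible first)

def gradient(gamma):
    """Elementwise gradient of f = sum(log(gamma)^2)."""
    return 2.0 * np.log(gamma) / gamma

def project_to_constraints(step, gamma, n):
    """The missing piece: keep the step inside the subspace where the row sums
    and column sums of gamma * n remain equal."""
    raise NotImplementedError  # this is what I'm asking about

lr = 0.01
for _ in range(1000):
    gamma = gamma + project_to_constraints(-lr * gradient(gamma), gamma, n)
```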
Any help is appreciated.