
I am studying the following problem. I want to obtain all the local minima of $$ \min_{x\in D} x^Tx \quad \quad \quad \quad \quad \quad (P.1) $$ where $x\in\mathbb{R}^n$, $D = \{x: x^TA_ix = 1, \forall i=1,\dots,m\}$, and $A_1,\dots,A_m$ is a collection of $m<n$ positive definite $n\times n$ matrices. According to the Lagrange multiplier theorem ("Nonlinear Programming" by Dimitri Bertsekas):

Proposition 3.1.1 (Lagrange Multiplier Theorem - Necessary Conditions) Let $x^*$ be a local minimum of $f$ subject to $h(x)=(h_1(x),\dots,h_m(x))=0$ and assume that the gradients $\nabla h_1(x^*),\dots,\nabla h_m(x^*)$ are linearly independent. Then, there exists a unique vector $\lambda^*=(\lambda_1^*,\dots,\lambda_m^*)$, called the Lagrange multiplier vector, such that $$ \nabla f(x^*) + \sum_{i=1}^m\lambda_i^*\nabla h_i(x^*) =0 \quad \quad \quad \quad \quad \quad (3.3) $$

Thus, the set of ALL local minima can be divided into two groups: those at which $\{\nabla h_i\}_{i=1}^m$ are linearly independent (regular points), which may be found using (3.3), and those at which $\{\nabla h_i\}_{i=1}^m$ are linearly dependent (nonregular points).

My question is: I already know how to find the regular points; my interest is in the nonregular ones. Can those nonregular points be studied (obtained) in my case? For reference, note that $h_i(x) = x^TA_ix-1$ and $\nabla h_i(x) = 2A_ix$.
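
For concreteness, a minimal numerical check of regularity at a given feasible point (using the diagonal pair from the example further below; the helper name is just illustrative) is to stack the gradients $2A_ix$ as columns and look at the rank:

```python
import numpy as np

def is_regular(x, As, tol=1e-10):
    """Check whether the constraint gradients {2 A_i x} are linearly independent at x."""
    G = np.column_stack([2.0 * A @ x for A in As])   # n x m matrix of gradients
    return np.linalg.matrix_rank(G, tol=tol) == len(As)

# Example pair from further below: two positive definite matrices in R^3 (m = 2 < n = 3).
A1 = np.diag([1.0, 1.0, 0.1])
A2 = np.diag([1.0, 1.0, 0.5])

x = np.array([1.0, 0.0, 0.0])      # feasible: x^T A1 x = x^T A2 x = 1
print(is_regular(x, [A1, A2]))     # False: here A1 x = A2 x, so the gradients are dependent
```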

From this point, let me explain what I have found up to now:

Suppose $\hat{x}$ is one such point. I can partition the set of vectors $\mathcal{A}=\{A_i\hat{x}\}_{i=1}^m$ into two disjoint sets $\mathcal{B}=\{B_i\hat{x}\}_{i=1}^s\subset \mathcal{A}$ and $\mathcal{C}=\{C_i\hat{x}\}_{i=1}^r\subset \mathcal{A}$ with $r+s=m$, such that the vectors in $\mathcal{B}$ are linearly independent and the ones in $\mathcal{C}$ are linear combinations of the ones in $\mathcal{B}$.

Then, there exist real constants $\{\alpha_{ij}\}_{j=1}^s$ such that $$ C_i\hat{x} = \sum_{j=1}^s \alpha_{ij}B_j\hat{x}, \quad i=1,\dots,r. $$ For this point to lie in $D$ I require $\hat{x}^TC_i\hat{x} = 1$ but also $\hat{x}^TB_j\hat{x}=1$ for $j=1,\dots,s$. Hence $\hat{x}^TC_i\hat{x} = \sum_{j=1}^s \alpha_{ij}\,\hat{x}^TB_j\hat{x} = \sum_{j=1}^s \alpha_{ij}$, so it must be the case that $\sum_{j=1}^s \alpha_{ij}=1$ for each $i=1,\dots,r$. (Is this useful?)
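
As a quick sanity check of this condition (on the example further below, at a point of the intersection; recovering the $\alpha_{ij}$ by least squares is just one convenient way to get them):

```python
import numpy as np

A1 = np.diag([1.0, 1.0, 0.1])   # example pair from further below
A2 = np.diag([1.0, 1.0, 0.5])

# A feasible point with dependent gradients (see the example below): x3 = 0, x1^2 + x2^2 = 1.
xh = np.array([np.sqrt(0.5), np.sqrt(0.5), 0.0])

B = np.column_stack([A1 @ xh])            # "independent" set B = {A1 xh}  (s = 1)
c = A2 @ xh                               # dependent vector C_1 xh        (r = 1)

alpha, *_ = np.linalg.lstsq(B, c, rcond=None)
print(alpha, alpha.sum())                 # alpha = [1.], and indeed sum(alpha) = 1
```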

My original intention was to cast this as another (reduced) nonlinear program in which the gradients $\{\nabla h_i\}$ are linearly independent. My attempt was to show that $\hat{x}$ will be a local minimum of $(P.1)$ after ignoring the "linearly dependent components". That is, if I solve (P.1) for all combinations of $\{A_i\}_{i=1}^m$, for example building $D$ with each $A_i$ separately, then by pairs, etc., then $\hat{x}$ would show up as a solution of one of those reduced problems, one in which the corresponding gradients $\{\nabla h_i\}$ are linearly independent. Is this true? The more I think about it, the more I convince myself it is not the case. But what do you think? As an example of what I mean, consider $m=2$, $n=3$ and $A_1 = \text{diag}(1,1,0.1)$, $A_2 = \text{diag}(1,1,0.5)$. The constraints $x^TA_1x=1$, $x^TA_2x=1$ look like the red and blue surfaces here: [figure: the two ellipsoids $x^TA_1x=1$ (red) and $x^TA_2x=1$ (blue), with their intersection highlighted as a yellow circle]

Their intersection is the yellow circle. Note that points in that intersection have linearly dependent gradients, basically because $A_1x=A_2x$ for $x$ in the intersection. However, if I study (P.1) taking into account only $A_1$, I could still find those points. Of course this is just an example that motivated my idea, but it seems not to generalize.
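
To make this concrete, here is a small numerical check that the points on the yellow circle are feasible for both constraints, have linearly dependent gradients, and attain the minimum value $1/\lambda_{\max}(A_1)=1$ of the reduced problem that keeps only $A_1$ (the sample points are arbitrary):

```python
import numpy as np

A1 = np.diag([1.0, 1.0, 0.1])
A2 = np.diag([1.0, 1.0, 0.5])

# Points on the intersection (the yellow circle): x3 = 0, x1^2 + x2^2 = 1.
for t in np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False):
    x = np.array([np.cos(t), np.sin(t), 0.0])
    assert np.isclose(x @ A1 @ x, 1.0) and np.isclose(x @ A2 @ x, 1.0)   # feasible for both
    assert np.allclose(A1 @ x, A2 @ x)          # gradients 2*A1*x and 2*A2*x coincide
    # Minimum of x^T x over x^T A1 x = 1 alone is 1/lambda_max(A1) = 1, attained here.
    assert np.isclose(x @ x, 1.0 / np.linalg.eigvalsh(A1).max())
print("all sampled points on the circle verified")
```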

My other attempt was to try to find explicitly all points that have $\{A_i\hat{x}\}_{i=1}^m$ linearly dependent. That is, find coefficients $\alpha_1,\dots,\alpha_m$ such that the matrix $H(\alpha)=\sum_{i=1}^m\alpha_iA_i$ is singular, then look for $\hat{x}$ in the kernel of $H(\alpha)$, and check whether that set of points intersects the problem constraints, etc. However, I haven't been able to obtain anything new from here.
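
For $m=2$ this attempt can at least be made computational, under the assumption that working with the matrix pencil $(A_1,A_2)$ is acceptable: since $A_2$ is positive definite, $H(\alpha)=\alpha_1A_1+\alpha_2A_2$ is singular exactly when $\alpha\propto(1,-t)$ with $t$ a generalized eigenvalue of $A_1v=tA_2v$, so the candidate $\hat{x}$ live in the corresponding kernels and only need to be rescaled and tested for feasibility. A rough sketch (only kernel basis vectors are tested; a multidimensional kernel, as in the example above, would in general require a search over the whole subspace):

```python
import numpy as np
from scipy.linalg import eigh, null_space

A1 = np.diag([1.0, 1.0, 0.1])
A2 = np.diag([1.0, 1.0, 0.5])

# H(alpha) = alpha1*A1 + alpha2*A2 is singular (up to scaling of alpha) exactly when
# alpha = (1, -t) with t a generalized eigenvalue of the pencil (A1, A2).
t_vals = eigh(A1, A2, eigvals_only=True)

for t in np.unique(np.round(t_vals, 12)):
    K = null_space(A1 - t * A2, rcond=1e-8)      # kernel of H((1, -t)), loose tolerance
    for v in K.T:                                # candidate directions for x_hat
        x = v / np.sqrt(v @ A1 @ v)              # rescale so that x^T A1 x = 1
        feasible = np.isclose(x @ A2 @ x, 1.0)
        print(f"t = {t:.3f}, x = {np.round(x, 3)}, feasible for both constraints: {feasible}")
```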

What do you suggest I do next? Do you think it is possible to obtain such points? Actually, any suggestion, advice, or comment would be helpful to me.

  • Without looking at the proof, I can't be certain, but your first question regarding Prop. 3.1.1 looks like the answer is no. This only states that "linearly independent gradients at a local min $x^*$ implies (3.3)." It does not say that (3.3) implies you are at a local minimum with linearly independent gradients. You can certainly still partition the minima into {$x^*$ with linearly independent gradients} $\sqcup$ {$x^*$ without linearly independent gradients}. It's just that the former are not entirely characterized by (3.3). – Zim Sep 30 '20 at 14:58
  • I totally agree with you. Do you have a suggestion regarding the other part of the question? thanks – FeedbackLooper Sep 30 '20 at 16:08
  • I'll let you know if I find some time to sit down and really think about the other questions. By the way, this problem is nonconvex, but you could numerically solve the convex relaxation (obtained by replacing $x^\top Ax=1$ with $\leq 1$) pretty easily using a quadratic programming software package. – Zim Sep 30 '20 at 20:21
  • Thanks for the feedback. Do you think it is ok now? And yes, that's one way to put it. However, I have been able to obtain all regular points (the ones that comply with (3.3)); my question is regarding the nonregular ones. – FeedbackLooper Oct 02 '20 at 10:41
  • For the case where $m=1$ and $n=2$, take a look at this. – Rodrigo de Azevedo Oct 02 '20 at 11:07
  • Thanks @RodrigodeAzevedo! Actually I have a very similar solution for the case with $m>1$ in order to find all other points which comply with (3.3), that is, all regular points. Moreover, when $m=1$, there are no nonregular points, so I think I have that case completely understood. My complications come when I have $m>1$, when the nonregular points may exist. – FeedbackLooper Oct 02 '20 at 11:11

1 Answer


$$\begin{array}{ll} \underset{\mathrm x \in \Bbb R^n}{\text{minimize}} & \mathrm x^\top \mathrm x\\ \text{subject to} & \mathrm x^\top \mathrm A_1 \mathrm x = 1\\ & \qquad\vdots\\ & \mathrm x^\top \mathrm A_m \mathrm x = 1\end{array}$$

where $m < n$. We define the Lagrangian

$$\mathcal L ({\rm x}, \mu) := \mathrm x^\top \mathrm x + \mu_1 \left( \mathrm x^\top \mathrm A_1 \mathrm x - 1 \right) + \cdots + \mu_m \left( \mathrm x^\top \mathrm A_m \mathrm x - 1 \right)$$

Taking the partial derivatives and finding where they vanish,

$$\begin{aligned} \left( \mathrm I_n + \mu_1 \mathrm A_1 + \cdots + \mu_m \mathrm A_m \right) \mathrm x &= 0_n \\ \mathrm x^\top \mathrm A_1 \mathrm x &= 1 \\ &\vdots\\ \mathrm x^\top \mathrm A_m \mathrm x &= 1 \end{aligned}$$

I have no idea how to solve this system of $n+m$ equations in $n+m$ unknowns. Since we are not looking for the solution ${\rm x} = 0_n$, from the first $n$ equations, we have

$$\det \left( \mathrm I_n + \mu_1 \mathrm A_1 + \cdots + \mu_m \mathrm A_m \right) = 0$$

but I do not know what to do with this. Suggestions are most welcome.
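
For what it is worth, here is a minimal sketch of the simplest case $m=1$ (with an arbitrary positive definite example matrix), where the system above reduces to an ordinary eigenvalue problem: a nonzero $\mathrm x$ with $\left(\mathrm I_n + \mu_1 \mathrm A_1\right)\mathrm x = 0_n$ forces $-1/\mu_1$ to be an eigenvalue of $\mathrm A_1$, and the constraint $\mathrm x^\top \mathrm A_1 \mathrm x = 1$ fixes the scale of the corresponding eigenvector.

```python
import numpy as np

# m = 1 case of the system above: (I + mu*A) x = 0 with x != 0, and x^T A x = 1.
A = np.diag([1.0, 1.0, 0.1])                  # arbitrary positive definite example

eigvals, eigvecs = np.linalg.eigh(A)
for lam, v in zip(eigvals, eigvecs.T):
    mu = -1.0 / lam                           # multiplier making I + mu*A singular
    x = v / np.sqrt(v @ A @ v)                # rescale so that x^T A x = 1
    print(f"mu = {mu:.3f}, x = {np.round(x, 3)}, objective x^T x = {x @ x:.3f}")
# Among these candidates the smallest objective is 1/lambda_max(A), the global minimum value.
```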

  • Thanks! I obtained something similar before. Actually, the problem can be studied using duality analysis, and the Lagrange multipliers can be found by solving a set of polynomial equations (showing zero duality gap is easy). So I already know how to obtain those. However, as I described before, those regular points (which have $\nabla \mathcal{L} = 0$) are not really my main interest. I'm interested in the nonregular ones, those which may not comply with $\nabla \mathcal{L} = 0$ due to linear dependency in the gradients of the constraints, but are still local minima. – FeedbackLooper Oct 02 '20 at 14:43