8

I have an equation system of the form $A_i x + B_i y + C_i z = D_i$, where $(x,y,z)$ is a unit vector, and $(A_i, B_i, C_i, D_i)$ are sets of measurements from a noisy system (with typically 3-5 independent readings).

My first intuition to solve this problem was to pose it as an overdetermined linear equation system $AX = B$ where $X = (x,y,z)$, and to solve for $X$. However, with that approach, I have no way to enforce that the solution for the vector $X$ is a unit vector.

Is there an elegant (or standard) solution to that problem, or should I simply dive into non-linear equation solving solutions?

Bossykena
  • Suggestion: If you have $n$ measurement sets, then you have $\binom{n}{3}=\frac{n(n-1)(n-2)}{3!}$ ways of choosing $3$ of them, and for each one you can solve the corresponding linear equation system $AX=B$. You could determine $X^{\ast}=(x^{\ast},y^{\ast},z^{\ast})$ so that $\left\vert 1-\left(x^{\ast}\right)^{2}-\left(y^{\ast}\right)^{2}-\left(z^{\ast}\right)^{2}\right\vert$ is minimum. Finally you would compute $X^{\ast\ast}=(x^{\ast\ast},y^{\ast\ast},z^{\ast\ast})=\frac{X^{\ast}}{\left\vert X^{\ast}\right\vert}$. (A sketch of this appears after these comments.) – Américo Tavares Nov 05 '10 at 22:31
  • This was originally addressed by the following paper. Walter Gander, "Least squares with a quadratic constraint", Numerische Mathematik, 1980. – GDumphart Apr 03 '24 at 21:44
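
A minimal sketch of the triple-enumeration suggestion from the first comment above; the function name, and the stacking of the measurements into an $n\times 3$ array `A` and a length-$n$ vector `d`, are illustrative assumptions:

import itertools
import numpy as np

def best_triple_solution(A, d):
    # Solve every 3x3 subsystem; keep the solution X* whose squared norm
    # is closest to 1, then normalize: X** = X*/|X*|.
    best, best_err = None, np.inf
    for rows in itertools.combinations(range(len(d)), 3):
        sub = list(rows)
        try:
            x = np.linalg.solve(A[sub], d[sub])
        except np.linalg.LinAlgError:
            continue  # skip singular (degenerate) triples
        err = abs(1.0 - x @ x)  # |1 - x^2 - y^2 - z^2|
        if err < best_err:
            best, best_err = x, err
    return best / np.linalg.norm(best)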

3 Answers

6

Unfortunately, the problem is almost certainly innately non-linear, since your solution space is inherently so. Pragmatically, the best approach is probably to treat your overdetermined linear system as a least-squares minimization problem; this gives you a (positive definite) function to be minimized. Find the (exact) minimum in $\mathbb{R}^3$, project it down to $S^2$ (i.e., normalize your vector $X$) to find the closest point on the sphere, and use that as a starting point for a descent-based method: find the component of the gradient of your goodness-of-fit function tangent to the sphere, add it to your current point on $S^2$, renormalize down to the sphere, and repeat until you meet whatever convergence criterion you want to use. There's no guarantee that this will work if your data is particularly ill-conditioned, but in practice it's likely to be the most straightforward method.
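
A minimal sketch of this project-and-descend loop (the tolerance, iteration cap, and choice of step size are illustrative, not part of the answer):

import numpy as np

def sphere_descent(A, b, tol=1e-10, max_iter=1000):
    # Start from the exact least-squares minimum in R^3, projected to S^2.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    x /= np.linalg.norm(x)
    step = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)  # safe step for the quadratic
    for _ in range(max_iter):
        g = 2 * A.T @ (A @ x - b)        # gradient of ||Ax - b||^2
        g_tan = g - (g @ x) * x          # component tangent to the sphere
        if np.linalg.norm(g_tan) < tol:  # convergence criterion
            break
        x = x - step * g_tan             # descend along the tangent...
        x /= np.linalg.norm(x)           # ...and renormalize to the sphere
    return x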

4

Your objective is to solve $Ax=b$ subject to the constraint $||x||=1$. One way to do this is to recognize that the solution set of $Ax=b$ is an affine space (a translated subspace). This means that out of all the solutions to $Ax=b$, there is a unique "smallest" one, i.e. one where $||x||$ is as small as possible. To find this solution, you can solve the minimum-norm optimization problem, whose solution is: $$ x_0 = \lim_{\lambda \to 0^+} (A^TA+\lambda I)^{-1}A^Tb $$ In the case where $A$ is skinny and full-rank, this simplifies to $(A^TA)^{-1}A^Tb$. In general, the solution can be found by taking the SVD of $A$.
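
For instance, in NumPy the minimum-norm solution comes out of the SVD-based pseudoinverse; a quick sketch with made-up data:

import numpy as np

# A made-up underdetermined example (a wide A has many solutions):
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])

# Minimum-norm solution via the SVD-based pseudoinverse; this is the
# lambda -> 0+ limit of the ridge formula above:
x0 = np.linalg.pinv(A) @ b

# np.linalg.lstsq returns the same minimum-norm minimizer:
x0_check, *_ = np.linalg.lstsq(A, b, rcond=None)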

Once you have verified that $x_0$ is indeed a solution, the next thing to do is check that $||x_0|| \leq 1$, for if that were not the case, then you could confidently assert that there is no solution to $Ax=b$ with norm 1.

The next step is to find a basis for the nullspace of $A$, i.e. the set of vectors $x$ such that $Ax=0$. This can be done using the SVD again. If $N$ is a matrix whose columns form a basis for the nullspace of $A$, then the set of all solutions to $Ax=b$ is: $$ x = x_0 + Ny $$ where $y$ is an arbitrary vector of appropriate dimension. The key point here is that $x_0$ is orthogonal to all vectors in the nullspace of $A$ (because $x_0$ is the solution point closest to the origin). Thus, you may compute the norm of $x$ using the Pythagorean theorem: $$ ||x||^2 = ||x_0||^2 + ||Ny||^2 $$ Now we see that if we want $||x||=1$, we simply need to choose $y$ so that $$ ||Ny|| = \sqrt{1 - ||x_0||^2} $$ We may always choose the columns of $N$ to be orthonormal, so let's assume we have done that. Then the set of desired $y$ is precisely the sphere of radius $\sqrt{1 - ||x_0||^2}$. By choosing any $y$ on this sphere, we parametrize all solutions to your problem. If you only care about a particular solution, then you can choose $y$ of the form $[\alpha,0,\dots,0]$, where $\alpha = \sqrt{1 - ||x_0||^2}$.
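
Putting the construction together, a sketch assuming $A$ genuinely has a nontrivial nullspace (per the author's comment below, otherwise one can take $N$ to be the right singular vector of the smallest singular value):

import numpy as np
from scipy.linalg import null_space

def unit_norm_solution(A, b):
    x0 = np.linalg.pinv(A) @ b   # minimum-norm solution
    if np.linalg.norm(x0) > 1:
        raise ValueError("no unit-norm solution: ||x0|| > 1")
    N = null_space(A)            # orthonormal nullspace basis (via SVD)
    if N.shape[1] == 0:
        raise ValueError("A has a trivial nullspace")
    alpha = np.sqrt(1 - x0 @ x0)
    y = np.zeros(N.shape[1])
    y[0] = alpha                 # y = [alpha, 0, ..., 0]
    return x0 + N @ y            # ||x||^2 = ||x0||^2 + ||Ny||^2 = 1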

  • The 2-norm may or may not be appropriate here, I think; the OP never said anything about how noisy his data is. – J. M. ain't a mathematician Nov 05 '10 at 22:33
  • The 2-norm is the appropriate norm if the noise is IID Gaussian, which is as good an assumption as any. "how noisy" it is makes no difference -- the only thing that matters is the pdf. – Laurent Lessard Nov 06 '10 at 00:21
  • Yes, I'm just giving a note that one should be careful, since not all noisy data have Gaussian noise. – J. M. ain't a mathematician Nov 06 '10 at 00:26
  • The solution to $Ax = b$ would be a subspace if the system were underdetermined, wouldn't it? The OP stated that their system was overdetermined. –  Nov 06 '10 at 01:04
  • I meant to say "affine space", i.e. a subspace that has been translated so it no longer necessarily passes through the origin. I corrected my solution. With regards to your comment, the solutions to $Ax=b$, if any exist, ALWAYS belong to an affine space. This is true regardless of whether the equations are over- or under-determined. It is also possible for no solutions to exist in either case. – Laurent Lessard Nov 06 '10 at 02:22
  • The problem is that in practice, it's always the case that no solutions exist for these 'noisy' overdetermined equations. Imagine shooting three rays 'at' a point in the plane, each missing by some small random amount; generically they'll pairwise intersect in a triangle somewhere near the target point, but seldom (more precisely, with probability 0) will all three have a common intersection. – Steven Stadnicki Nov 07 '10 at 17:57
  • Indeed. I assumed in my write-up that the problem actually had a solution -- my method is then exact. In the case where it does not ($Ax_0 \neq b$), $x_0$ is the least-squares estimate, which minimizes the mean squared error across all measurements. Similarly, if $A$ has no nullspace, we can let $N$ be the right singular vector corresponding to the smallest singular value of $A$. – Laurent Lessard Nov 08 '10 at 03:51
  • I probably would have upvoted if your last comment was in your answer. – Tim Seguine Jan 19 '13 at 15:07
  • @LaurentLessard, What about the case the least squares solution has a norm bigger than 1? – Royi Nov 17 '23 at 22:08
  • @LaurentLessard, Also, in the overdetermined case the last vector of $V$, the right singular vectors, won't have the property $AN = \boldsymbol{0}$, which means the trick won't work. – Royi Nov 17 '23 at 22:29
  • @StevenStadnicki, I think the solution works when the system is underdetermined and the least-norm least-squares solution has a norm not bigger than one. Otherwise it will fail. Am I missing something? – Royi Nov 18 '23 at 09:02

0

Distance to ellipsoid

The problem is equivalent to finding the closest point on an ellipsoid.

Let the SVD of $A$ be $U \Sigma V^T$. We can substitute $y=\Sigma V^T x - U^T b$. Then the problem $$\min\|A x - b\|_2, \quad\|x\|_2=1$$ can be written as $$\min\|y\|_2, \quad \|V \Sigma^{-1} ( y+U^Tb)\|_2=1$$ The first expression is the minimization of the Euclidean distance from the origin, and the second constrains the point to lie on the surface of an ellipsoid with center $-U^T b$ and semi-axes given by the diagonal elements of $\Sigma$.

The distance from a point to the ellipsoid can be efficiently computed, as described in "Distance from a Point to an Ellipse, an Ellipsoid, or a Hyperellipsoid", by minimizing a univariate function. Below is a sample implementation in Python, which uses the distance calculation from Geometric Tools.

import numpy as np
# DCPQuery and Hyperellipsoid come from a Python port of the
# Geometric Tools distance-to-hyperellipsoid code referenced above.

def solve_ellipsoid(A, b):
    # note: numpy returns V already transposed, so A == U @ np.diag(S) @ V
    U, S, V = np.linalg.svd(A, full_matrices=False)
    center = -U.T @ b         # ellipsoid center in y-coordinates
    axes = np.eye(len(S))     # axis directions (standard basis)
    point = np.zeros(len(S))  # query point: the origin
    distance, points = DCPQuery()(point, Hyperellipsoid(center=center, extent=S, axes=axes))
    return V.T @ ((points[1] + U.T @ b) / S)  # map the closest y back to x

Second-order cone programming

Another option is to use an SOCP solver. Unfortunately, this works only when the unconstrained least-squares solution lies outside the unit sphere; otherwise it simply returns the least-squares solution.

The general formulation of an SOCP program is: $$\min f^T y, \quad \|A_i y + b_i\|_2 \le c_i^T y + d_i$$

The problem can be solved by introducing a slack variable $s$ and solving for $y = \begin{bmatrix} x \\ s \end{bmatrix}$ with the following substitutions. Then $x$ is a solution to the original problem (again, provided the least-squares solution lies outside the unit sphere).

$$A_1 = \begin{bmatrix} A & 0_n \\ 0_n & 0 \end{bmatrix} , \quad b_1 = \begin{bmatrix} -b \\ 0 \end{bmatrix} , \quad c_1 = \begin{bmatrix} 0_n \\ 1 \end{bmatrix}, \quad d_1=0$$ $$A_2 = \begin{bmatrix} I_{n\times n} & 0_n \\ 0_n & 0 \end{bmatrix}, \quad b_2 = 0_{n+1} , \quad c_2=0_{n+1}, \quad d_2=1$$ $$f = \begin{bmatrix} 0_n \\ 1 \end{bmatrix}$$
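
Rather than assembling these blocks by hand, one would typically pass the problem to a modeling layer; below is a sketch using CVXPY (a swapped-in tool, not part of the original answer), which compiles the norm constraint into exactly this cone form. The same caveat applies: the constraint is $\|x\|_2 \le 1$, so $\|x\|=1$ holds only when the least-squares solution lies outside the unit ball.

import cvxpy as cp

def socp_solve(A, b):
    # min ||Ax - b||_2  s.t.  ||x||_2 <= 1, as a second-order cone program.
    x = cp.Variable(A.shape[1])
    prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 2)),
                      [cp.norm(x, 2) <= 1])
    prob.solve()
    return x.value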


Old answer (incorrect):

This problem can be solved using singular value decomposition. The original problem can be stated as $$\min\|Ax-b\|_2, \quad \|x\|_2=1$$ By substituting $$A'=\begin{bmatrix}A & -b\end{bmatrix}, \quad x'=w\begin{bmatrix}x\\1\end{bmatrix}$$ and choosing $w=\frac{\|x\|_2}{\left\|\begin{bmatrix}x\\1\end{bmatrix}\right\|_2}$, we get $\|x'\|_2=w\left\|\begin{bmatrix}x\\1\end{bmatrix}\right\|_2=\|x\|_2$, so the problem becomes $$\min\|A'x'\|_2, \quad \|x'\|_2=1$$ which is the same as in this question: Minimizing $\|Ax\|_2$ subject to $\|x\|_2 = 1$. It can be solved by decomposing $A'$ into $U \Sigma V^T$ and then taking the last column of $V$, which corresponds to the smallest singular value of $A'$.

Finally, $x$ is recovered by dividing $x'$ by its last element, dropping that last element, and then normalizing the result.

  • I tried this in MATLAB and it doesn't yield the correct answer. I think the normalization step invalidates the solution. Your derivation doesn't show that the solution to the original problem, scaled and with a 1 appended, is the solution to the 2nd problem you created. Hence the way back isn't working. – Royi Nov 17 '23 at 22:50
  • @Royi Thank you for noticing the issue. I wasn't able to figure out how to fix this solution, but I found another one that works and updated the answer. – user502144 Nov 26 '23 at 19:46