
Given constants $\ell, u \in \mathbb{R}^{3 \times 3}$ and the following system of constraints in $P \in \mathbb{R}^{3 \times 3}$ $$ P^T P = I_{3 \times 3},\quad \ell_{ij} \leq P_{ij} \leq u_{ij}, $$ I would like to find a matrix $P$ which satisfies this system, or determine that it is infeasible. Is there a computationally efficient way to perform this task?

The solution doesn't have to be in closed form, but it should be an algorithm, implementable on a computer, which runs quickly by exploiting the fact that this is a $9$-dimensional problem.

A numerical algorithm which converges to a solution is also a very good option, but there should be a proof that it indeed converges to a solution of the problem.

  • While the dimension is small, this is a constrained quartic (not quadratic) programming problem. Anyway, if you pick a $P_0$ randomly within the given margin and $USV^T$ is its singular value decomposition, $P=UV^T$ would make a good first guess. If the existence of a solution is assumed, you may try a rejection method (a sketch of this appears after these comments). I have done a quick experiment in Octave. Provided that the margin $u-\ell$ is not too large (say, smaller than 0.1), a solution can usually be obtained after a few (<10) guesses. – user1551 Jul 11 '18 at 05:23
  • Yes. I understand the difficulty. I am not sure, however, what the probability of success is, so that I could estimate the number of iterations. – Alex Shtoff Jul 11 '18 at 05:26
  • @RodrigodeAzevedo Yes, it's quadratic that way, but if you turn it into a single objective function, it becomes $\|P^\top P-I\|_F^2$, which is quartic. – user1551 Jul 11 '18 at 22:27
  • Looking at the system, I understand that it defines a semialgebraic set of low dimension. So there are algorithms to test feasibility in that case. I wonder whether they can be efficient when the set is compact and the dimension is small. – Alex Shtoff Jul 12 '18 at 08:26
  • Another numerical approach could be gradient descent on the manifold $\mathcal O=\{P: P^\top P=I\}$ towards the minimum of $\operatorname{dist}_{\mathcal B}(P)$, where $\mathcal B=\{P: \ell_{ij}\le P_{ij}\le u_{ij}\}$. With the parameters of Rodrigo's experiment, and an initial guess taken uniformly from $\mathcal B$, this converges in no more than two descent steps over 90% of the time. Unfortunately, about 0.2% of the time it gets stuck in a local minimum. –  Jul 14 '18 at 14:35
  • This is not a complete answer to your question, but I doubt you will make much progress until you can first isolate the truly independent parameters in your orthogonal matrix $P$ (of which there are only 3, plus 3 additional choices of sign). See here for one way (but probably not the only way) of doing this. – John Polcari Jul 15 '18 at 13:43
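A minimal NumPy sketch of the rejection method from user1551's first comment (sample a matrix uniformly from the box, project it onto $O(3)$ via the SVD, accept if the projection stays in the box); the function names are illustrative and not part of any answer below:

import numpy as np

def project_to_orthogonal(P):
    # nearest orthogonal matrix in the Frobenius norm: if P = U S V^T, return U V^T
    U, _, Vt = np.linalg.svd(P)
    return U @ Vt

def rejection_search(L, U, max_guesses=1000, rng=np.random.default_rng()):
    # sample P0 in the box [L, U], project onto O(3), accept if still in the box
    for _ in range(max_guesses):
        P0 = rng.uniform(L, U)
        P = project_to_orthogonal(P0)
        if np.all(P >= L - 1e-9) and np.all(P <= U + 1e-9):
            return P
    return None   # no certificate of infeasibility, just failure to find one

As the comments note, this gives no bound on the failure probability, so it cannot certify infeasibility.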

3 Answers


We have a system of quadratic equations and linear inequalities in $\mathrm X \in \mathbb R^{3 \times 3}$

$$\begin{aligned} \rm X^\top X = \rm I_3\\ \rm L \leq X \leq \rm U\end{aligned}$$

where $\rm L \leq U$, of course. The convex hull of the orthogonal group $\mathrm O (3)$ is defined by $\mathrm X^\top \mathrm X \preceq \mathrm I_3$, or, equivalently, by the inequality $\| \mathrm X \|_2 \leq 1$. We then have the following (convex) feasibility problem

$$\begin{array}{ll} \text{minimize} & 0\\ \text{subject to} & \rm L \leq X \leq \rm U\\ & \| \mathrm X \|_2 \leq 1\end{array}$$

In order to avoid the zero matrix solution, let us look for a solution on the boundary of the (convex) feasible region. Hence, we generate a random matrix $\mathrm C \in \mathbb R^{3 \times 3}$ and minimize $\rm \langle C, X \rangle$ instead

$$\begin{array}{ll} \text{minimize} & \langle \mathrm C, \mathrm X \rangle\\ \text{subject to} & \rm L \leq X \leq \rm U\\ & \| \mathrm X \|_2 \leq 1\end{array}$$


Numerical experiment

Suppose we would like to find a $3 \times 3$ matrix that is orthogonal, whose entries on the main diagonal are in $[-0.95, 0.95]$, and whose entries off the main diagonal are in $[-0.5, 0.5]$.

Using NumPy to randomly generate the matrix $\rm C$ and CVXPY to solve the convex program (the script below uses Python 2 and the pre-1.0 CVXPY API),

from cvxpy import *
import numpy as np

n = 3

C = np.random.rand(n, n)

I = np.identity(n)
J = np.ones((n, n))

L = -0.95 * I + (-0.5 * (J - I))
U =  0.95 * I + ( 0.5 * (J - I))

X = Variable(n, n)

# define optimization problem
prob = Problem(Minimize(trace(C.T * X)),
               [L <= X, X <= U, norm(X, 2) <= 1])

# solve optimization problem
print prob.solve()
print prob.status

# print results
print "X = \n", X.value
print "Spectral norm of X = ", np.linalg.norm(X.value, 2)
print "Round(X.T * X) = \n", np.round((X.value).T * (X.value), 1)

which outputs the following

-2.04689293246
optimal
X = 
[[-0.94511744  0.20404426 -0.2650367 ]
 [-0.30732554 -0.84243613  0.44563059]
 [ 0.13860874 -0.499968   -0.85597357]]
Spectral norm of X =  1.00521536292
Round(X.T * X) = 
[[ 1. -0. -0.]
 [-0.  1. -0.]
 [-0. -0.  1.]]

I ran the Python script a few (maybe $5$) times until I obtained a matrix $\rm X$ that is (approximately) orthogonal. In other words, the algorithm does not produce such nice results for all choices of $\rm C$.

  • Thank you for your effort. Unfortunately, I already tried many relaxations. Some of them are successful, like yours. However, I am looking for a provable solution. For example, if there was a way to bound the probability of failure of your random method, I could estimate how many times I should run it until I am convinced that the problem is infeasible. I do not see an immediate way to prove such a bound for your method. – Alex Shtoff Jul 14 '18 at 08:45
  • You can use real quantifier elimination to prove infeasibility, though $9$ dimensions may be too many. If it is indeed feasible, then you can use the relaxation in my answer. – Rodrigo de Azevedo Jul 14 '18 at 09:06
  • Perhaps there is a finite set of matrices $C$ such that if you try them all you are guaranteed to find the solution whenever it exists. Is it enough to try all $2\cdot9$ choices of $C$ orthogonal to the constraint hyperplanes? Or all $2^9$ choices pointing towards the vertices of the box $L\le X\le U$? –  Jul 17 '18 at 10:49

We use the following pseudo-Newton iteration, which very quickly produces an orthogonal matrix:

$V_{k+1}=0.5(V_k+V_k^{-T})$.

Indeed, let $f:V\mapsto V^TV-I_n$ and $Df_V:H\mapsto H^TV+V^TH$. The standard Newton iteration is $V_{k+1}=V_k-(Df_{V_k})^{-1}(f(V_k))$, but $Df_{V_k}$ is not invertible; yet $H=0.5(V_k-V_k^{-T})$ is a solution of $Df_{V_k}(H)=f(V_k)$. With this choice of $H$, the update $V_{k+1}=V_k-H$ becomes $V_{k+1}=0.5(V_k+V_k^{-T})$.
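A minimal NumPy sketch of this iteration (not part of the original answer); as a comment below points out, its limit is exactly the orthogonal (SVD-based) projection of $V_0$ onto $O(n)$, provided $V_0$ is nonsingular:

import numpy as np

def pseudo_newton(V0, iters=10):
    # V_{k+1} = 0.5 * (V_k + V_k^{-T})
    V = V0.copy()
    for _ in range(iters):
        V = 0.5 * (V + np.linalg.inv(V).T)
    return V

rng = np.random.default_rng(0)
V0 = rng.uniform(-1.0, 1.0, size=(3, 3))
V = pseudo_newton(V0)
U, _, Vt = np.linalg.svd(V0)
print(np.allclose(V, U @ Vt))            # matches the SVD projection, for generic V0
print(np.allclose(V.T @ V, np.eye(3)))   # V is orthogonal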

A trial consists of the following $3$ steps.

Step 1. We choose an element of $[L,U]$: $V_0=kL+(1-k)U$ where $k$ has the form $i/10,i=0,\cdots,10$.

Step 2. We use the above iterative algorithm with initial point $V_0$ and with $10$ iterations.

Step 3. If the obtained limit is in the box, then we stop. Otherwise, we continue with the next value of $k$.

The computation time for $10^3$ trials is $17$ seconds. There are on average $5$ failures per $10^4$ trials.
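For readers without Maple, a rough NumPy sketch of one trial (Steps 1–3) might look as follows; the names are illustrative, and the Maple original is given below.

import numpy as np

def orthogonalize(V, iters=10):
    # Step 2: pseudo-Newton iterations V <- 0.5 * (V + V^{-T})
    for _ in range(iters):
        V = 0.5 * (V + np.linalg.inv(V).T)
    return V

def trial(L, U, tol=1e-9):
    # Step 1: sweep V0 = k*L + (1-k)*U over k = 0, 0.1, ..., 1
    for k in np.linspace(0.0, 1.0, 11):
        V = orthogonalize(k * L + (1.0 - k) * U)
        # Step 3: stop as soon as the orthogonalized matrix lies in the box
        if np.all(V >= L - tol) and np.all(V <= U + tol):
            return V
    return None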

Below is a copy of the procedure in Maple (the result is stored in $V$).

restart;
with(LinearAlgebra);
Digits := 20;

roll := rand(1 .. 3);                  # random integer in {1, 2, 3}

n := 3; id := IdentityMatrix(n):

ve := Vector(11);
vv := Vector([.5, .6, .4, .7, .3, .8, .2, .9, .1, 1, 0]);   # sweep order for k

t := time():
for l from 1 to 10000 do
  # random orthogonal matrix via the Cayley transform of a random skew-symmetric K
  K := RandomMatrix(n, n, shape = skewsymmetric):
  U := evalf((id - K) . (id + K)^(-1)):
  # random box [U + Z2, U + Z1] that contains the orthogonal matrix U
  Z1 := Matrix(3, 3, (i, j) -> 0.3*roll()):
  Z2 := Matrix(3, 3, (i, j) -> -0.1*roll()):
  U1 := U + Z1: U2 := U + Z2:
  test := [10, 10]; co := 0;
  while test <> [0, 0] and co < 11 do
    co := co + 1: k := vv[co]:
    V := k*U1 + (1 - k)*U2:               # Step 1: point on the segment between U2 and U1
    for i to 10 do
      V := .5*(V + 1/Transpose(V))        # Step 2: pseudo-Newton iterations
    end do;
    # Step 3: count the box violations of V
    test := [0, 0]:
    R1 := U + Z1 - V: R2 := U + Z2 - V:
    for i from 1 to 3 do
      for j from 1 to 3 do
        if R1[i, j] < 0 then test := test + [1, 0]: fi:
        if R2[i, j] > 0 then test := test + [0, 1]: fi:
      od: od:
    ve[co] := test:
    if co = 11 and test <> [0, 0] then print(ve[7 .. 11], U, Z1, Z2); fi:
  od:
od:
time() - t;

EDIT. Case 1. There are solutions in the box and we want to find an approximation of one of them. A Maple specialist advised me to use the command NLPSolve; unfortunately, NLPSolve only works (in general) if you give it an initial point not too far from a solution. Yet we can proceed as follows:

STEP 1. As in my procedure above, we pick the points $V_0=kL+(1-k)U$, where $k$ has the form $i/10$, $i=0,\dots,10$, on the segment $[L,U]$, and we project each of them onto $O(3)$, obtaining $V_1$. In a few cases, none of the $11$ points $V_1$ is in the box; yet some are very close to the edge of the box.

STEP 2. We use NLPSolve (this software accepts only closed boxes) with $V_1$ as the initial point; the software then uses the tangent space of $O(3)$ at $V_1$ and, with a gradient method, pushes this point towards the edge of the box. In general we obtain in this way a point $V_2$ that is very close to the edge of the box, and we are done. For $2\cdot 10^4$ trials (as in my procedure above), I did not encounter a single failure (of the $11$ points $V_1$, NLPSolve always succeeds for many of them)!
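A rough Python stand-in for this NLPSolve step, assuming SciPy (a sketch, not the author's Maple code): starting from the projected point $V_1$, search over nearby rotations $V_1\exp([\omega])$, which stay on $O(3)$ by construction, for one whose distance to the box vanishes. All names are illustrative.

import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def skew(w):
    # the skew-symmetric matrix [omega] built from the 3 parameters w
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def box_distance_sq(V, L, U):
    # squared Euclidean distance from V to the box [L, U]
    return np.sum(np.maximum(L - V, 0.0)**2) + np.sum(np.maximum(V - U, 0.0)**2)

def refine_on_O3(V1, L, U):
    # local search over rotations near V1; V1 @ expm(skew(w)) remains in O(3)
    obj = lambda w: box_distance_sq(V1 @ expm(skew(w)), L, U)
    res = minimize(obj, np.zeros(3), method="Nelder-Mead")
    V2 = V1 @ expm(skew(res.x))
    return V2, res.fun   # res.fun close to 0 means V2 is (numerically) in the box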

Of course, there is no proof that the method never fails. One should even be able to construct a counter-example by ensuring that there is only one solution (on the boundary of the box). Of course, the OP did not take much risk by offering a big bounty for a rigorous proof. He will be forgiven, because 500,000 dollars is the price of a nice house.

Case 2. There are no solutions in the box and we want to prove this fact. Unless you find a condition of the form $x_{ij}>1$, I think a formal proof cannot be avoided. Under these conditions, I do not see any method other than Gröbner bases. Over $\mathbb{C}$, there are black boxes that do the job when the number of relations is not too large; unfortunately, it is known that the problem is in general much harder in the real case; in particular, it is necessary to use gradient methods. Anyway, I don't know how to do this job with Maple.

  • (i) Why use this "pseudo-Newton" procedure instead of the usual SVD-based projection $P \mapsto UV^T$, where $U\Sigma V^T=P$? (ii) Any guarantees that one only needs to sample initial guesses of the form $kL+(1-k)U$, and not any other matrices satisfying $L\le P\le U$? –  Jul 15 '18 at 04:53
  • @Rahul. For 1), it's a gag: the limit of the pseudo-Newton method is exactly the orthogonal projection onto $O(n)$. For 2), I don't agree. It is ridiculous to use a sophisticated method that is valid for all cases (physicists are less stupid than us) whereas most cases are resolved in $10^{-2}$ seconds. Afterwards, we deal with the recalcitrant cases... –  Jul 15 '18 at 11:51
  • Rahul, what about infeasibility? What if the box does not contain any orthogonal matrix? – Alex Shtoff Jul 15 '18 at 12:13
  • You don't agree... with my question? About whether your method can fail to find a solution when one exists? –  Jul 15 '18 at 12:28
  • I saw your question 2) as an objection, but I was wrong. Of course you are right; my method can fail ($\approx 5$ times per $10^4$ trials) to find a solution when one exists; moreover, I report this failure rate because I don't see any better result. You yourself report $2$ failures per $10^3$ trials. –  Jul 15 '18 at 12:39
  • Sorry, I missed that in your original post. I don't understand where the randomness is coming from, since you say that you choose the initial guess of the form $V_0 = kL + (1-k)U$ with $k\in\{0,\frac1{10},\frac2{10},\dots,\frac9{10},1\}$, so there are only $11$ different possibilities. Or are $L$ and $U$ not the fixed lower and upper bounds on the matrix entries? –  Jul 16 '18 at 05:24
  • Also you seem to have missed the OP's comment above (apparently misaddressed to me). –  Jul 16 '18 at 05:28
  • Indeed misaddressed to you, @Rahul – Alex Shtoff Jul 20 '18 at 08:37

Here is one suggestion. You can eliminate the $P^TP=I$ constraint as follows.

$P^TP=I$ means that $P$ is orthogonal; if moreover $\det P = 1$, it is a rotation matrix (the case $\det P = -1$ is discussed in the comments below). Every rotation matrix can be written as $\exp([\omega])$, where $\exp$ is the matrix exponential and $$[\omega]=\left( \begin{array}{ccc} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \\ \end{array} \right)$$ is a skew-symmetric matrix. Also, from here (Matrix exponential of a skew symmetric matrix) you can compute the matrix exponential analytically as: $$x = \sqrt{a_1^2+a_2^2+a_3^2}; \quad P=\exp([\omega])= I + \dfrac{\sin x}{x}[\omega] + \dfrac{1-\cos x}{x^2}[\omega]^2.$$

What I can think of is to perform a brute-force search over $a_1$, $a_2$, and $a_3$ to see whether any values of them yield a $P$ that satisfies your bound constraints (a rough sketch is given after the comments below).

Hashimoto
  • The exponentials of skew symmetric matrices generate only matrices in $SO(3,\mathbb R)$. Orthogonal matrices $P$ with determinant $-1$ cannot be generated this way. – user1551 Jul 18 '18 at 03:50
  • @user1551 You are right. Can it be solved by also trying to negate a column? – Hashimoto Jul 18 '18 at 03:52
  • This approach has two major drawbacks from my standpoint. First, it seems that the parameterization is unbounded, so there is no compact domain to try the 'brute force' on. Second, I believe that guessing is much harder if the bounding box becomes small. – Alex Shtoff Jul 20 '18 at 08:28
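On the compactness concern in the last comment: every rotation corresponds to a rotation vector $(a_1,a_2,a_3)$ of norm at most $\pi$, so the cube $[-\pi,\pi]^3$ is a compact domain that covers all of $SO(3)$ (with some duplication). A rough NumPy sketch of the brute-force search, also negating a column to reach $\det P=-1$ as suggested in the comments; the names are illustrative, and, as the OP notes, a very fine grid is needed when the box is small.

import numpy as np

def rotation_from_vector(a):
    # Rodrigues' formula: P = I + (sin x / x) [w] + ((1 - cos x) / x^2) [w]^2
    x = np.linalg.norm(a)
    if x < 1e-12:
        return np.eye(3)
    W = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + (np.sin(x) / x) * W + ((1.0 - np.cos(x)) / x**2) * (W @ W)

def brute_force(L, U, steps=25, tol=1e-9):
    # grid over rotation vectors in [-pi, pi]^3; try both det +1 and det -1
    grid = np.linspace(-np.pi, np.pi, steps)
    flip = np.diag([1.0, 1.0, -1.0])   # negates the last column
    for a1 in grid:
        for a2 in grid:
            for a3 in grid:
                P = rotation_from_vector(np.array([a1, a2, a3]))
                for Q in (P, P @ flip):
                    if np.all(Q >= L - tol) and np.all(Q <= U + tol):
                        return Q
    return None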