
I have N SE(3) matrices that define the rigid body transformation between a LiDAR and an IMU. These SE(3) matrices are estimated using a calibration procedure at different trials. I want to find the resultant SE(3) by estimating the mean of the SE(3) matrices.

I can decompose each matrix into an (x, y, z) translation vector and an SO(3) rotation matrix, and the mean translation is easy to calculate, but I'm not sure whether converting the SO(3) rotation matrix to Euler angles and then calculating the mean for each rotational axis is the right way.

Additionally, in Euler space the angles can have flipped signs compared to previous trials, even though they still represent the same rotation. I'm not quite sure how to handle this.

Example data in Euler space:

[image: Euler-angle estimates from the different trials]

Can you please point me in the right direction? Anything really helps.

Edit 1: To be clear, I want to combine all the SO(3) matrices estimated during different trials to get the best representation/resultant matrix of the rotation.

  • What do you mean by the mean of a bunch of rotation matrices? One can ask about the mean of a random variable with numerical values, and the answer would be a numerical value. The mean of a bunch of rotation matrices does not really make sense, not without some further explanation on your part. – Lee Mosher Aug 01 '24 at 19:00
  • When I say the mean of the rotation matrices, I mean the mean of the rotation angles w.r.t. each axis represented by the rotation matrix. Does that make sense? I'm sorry if I am missing something. – spacemanspiff Aug 01 '24 at 19:26
  • @spacemanspiff if you want the means with respect to the $x,y$ and $z$ axes separately then the Euler representation is the correct way to go. – CyclotomicField Aug 01 '24 at 19:53
  • @CyclotomicField, I need the rotation matrix that best represents the average of the matrices estimated during various trials. One way of thinking about it was to take the averages in Euclidean space and then convert back to a rotation matrix. Does that still hold good? – spacemanspiff Aug 01 '24 at 21:00
  • @spacemanspiff What do you mean by the average in Euclidean space? Are you looking for the center of mass of certain vectors? What vectors are you considering to define the rotation? – CyclotomicField Aug 01 '24 at 21:05
  • Is the example you give the typical situation? Where the resulting matrices are all rather "close" together, with the numbers appearing different only due to periodicity ambiguities of the parameterization? You know you could describe them all in a single "nice" coordinate chart? Do you need the method to be robust enough to handle when this is not the case (can it give bad results or an error in that situation)? – AHusain Aug 27 '24 at 18:40
  • There is a problem right away when the bunch of matrices forms a (finite) group. Then it's a very symmetric situation and the mean should be invariant under the symmetries of the group... no matter how we get it (as an optimization problem), it would have several solutions (as a rotation). – orangeskid Sep 01 '24 at 01:29

3 Answers

5

As a first approximation I might try the following. A rotation by angle $\theta$ about the axis spanned by the unit vector $\vec{u}$ is represented by the quaternion $$q=\cos(\theta/2)+\sin(\theta/2)\vec{u}.$$ Here $\cos(\theta/2)$ can always be arranged to be non-negative. We can then form the average of those 4-vectors (it will also have a non-negative real part), normalize it, and use that. There are some anomalous situations (such as an average of 180 degree rotations), but one of the reasons quaternions are used in computer graphics is exactly that you can (save for similar anomalous situations) smoothly interpolate from one rotation to another, without having to worry about discontinuities like the gimbal lock that may appear if you use Euler angles. Calculating an average is an extension of the same idea.
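As a concrete sketch of this recipe in Python (assuming NumPy and SciPy are available; the function name `average_rotations_quat` is my own):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def average_rotations_quat(rots):
    """Average a list of 3x3 rotation matrices via their quaternions."""
    quats = Rotation.from_matrix(np.array(rots)).as_quat()  # (n, 4), scalar-last
    # Handle the sign ambiguity q ~ -q: flip every quaternion into the same
    # hemisphere as the first one before averaging.
    signs = np.sign(quats @ quats[0])
    signs[signs == 0] = 1.0
    mean = (quats * signs[:, None]).mean(axis=0)
    mean /= np.linalg.norm(mean)  # project the 4-vector back onto the unit sphere
    return Rotation.from_quat(mean).as_matrix()
```

As noted above, this can misbehave when the rotations are far apart (e.g. an average of 180 degree rotations, where the mean 4-vector can be close to zero).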

But I won't claim the above is the "correct" definition for an average of rotations. You might prefer to use a quantity like $1-\cos\theta$ to measure how far two rotations are from each other, and then try to minimize the mean square error. I don't know of a practical way of solving that, but it could be possible. Again using quaternions to represent rotations, the quotient $q_1q_2^{-1}$ (or rather its real part) can be used to calculate the above "distance" between rotations $q_1$ and $q_2$. Similar caveats apply, in the sense that there is no guarantee of a unique answer this way either!

Jyrki Lahtonen
4

Usually, taking the "average" of matrices would mean the usual arithmetic mean, $A = \frac 1n\sum_{i=1}^n A_i$. The problem with this, however, is that there is no reason to believe that this arithmetic mean would be another rotation.

One solution to this problem is to use a representation of rotations within a space that allows for linear combinations (and hence allows for arithmetic means). Jyrki's suggestion is to use the unit quaternion representation. I will suggest using a matrix logarithm.

The idea is as follows: the matrix exponential defines an onto map from the skew-symmetric matrices (the "Lie algebra" of $SO(3)$) to $SO(3)$. If we reverse this map, average the resulting skew-symmetric "representations" of the matrices, and then apply the matrix exponential to this average, then in a sense we have the "average" rotation. (Aside: the exponential map over a Lie algebra is generally onto if the corresponding Lie group is connected and compact.)

For a unit vector $\mathbf v = (v_x,v_y,v_z)$, let $K_\mathbf v$ denote the corresponding cross-product matrix, $$ K_{\mathbf v} = \pmatrix{ 0&-v_z & v_y\\ v_z&0&-v_x\\ -v_y & v_x & 0}. $$ The Rodrigues rotation formula tells us that $$ \exp(\theta K_{\mathbf v}) = I + \sin \theta \, K_{\mathbf v} + (1 - \cos \theta)K_{\mathbf v}^2. $$ Reversing this formula is tricky, but here's an approach that works as long as $\theta \neq 180^\circ$. In order to reverse the formula and find the values of $\theta$ and $K_{\mathbf v}$ corresponding to a given rotation $R$, we can get the angle of the rotation from the trace of the rotation: $$ \operatorname{Tr}(R) = 1 + 2 \cos \theta \implies \theta = \arccos\left[ \frac{\operatorname{Tr}(R) - 1}{2}\right]. $$ From there, we can use the skew-symmetric part of $R$ to find the axis: $$ R - R^T = 2 \sin \theta \, K_{\mathbf v} \implies K_{\mathbf v} = \frac 1{2 \sin \theta}(R - R^T). $$ This fails if $\theta = 0$, but in that case we can simply take the zero matrix as the logarithm. Putting it all together, we have $$ \theta(R) = \arccos\left[ \frac{\operatorname{Tr}(R) - 1}{2}\right], \quad \log(R) = \begin{cases} 0 & \theta(R) = 0,\\ \frac {\theta(R)}{2 \sin \theta(R)}(R - R^T) & 0 < \theta(R) < \pi,\\ ??? & \theta(R) = \pi. \end{cases} $$ So, given rotation matrices $R_1,\dots,R_k$, we compute the skew-symmetric matrix $$ X = \frac 1k \sum_{j=1}^k \log(R_j). $$ From there, you could compute the matrix exponential in the usual way. Alternatively, you could apply the Rodrigues formula: note that if $X = \theta K_{\mathbf v}$, then $\theta = \sqrt{\frac 12 \operatorname{tr}(X^TX)}$ and $K_{\mathbf v} = \frac 1{\theta} X$.
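A minimal NumPy sketch of the whole procedure (the helper names `so3_log`, `so3_exp` and `log_average` are mine; the $\theta \approx 180^\circ$ case is not handled):

```python
import numpy as np

def so3_log(R):
    """Matrix logarithm of R in SO(3), valid for rotation angles in [0, pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros((3, 3))       # log of (near-)identity is the zero matrix
    return theta / (2.0 * np.sin(theta)) * (R - R.T)

def so3_exp(X):
    """Rodrigues formula: exponential of a skew-symmetric 3x3 matrix X."""
    theta = np.sqrt(0.5 * np.trace(X.T @ X))   # angle recovered from ||X||_F
    if np.isclose(theta, 0.0):
        return np.eye(3)
    K = X / theta
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def log_average(rotations):
    """Average the matrix logarithms, then exponentiate back to SO(3)."""
    X = sum(so3_log(R) for R in rotations) / len(rotations)
    return so3_exp(X)
```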


Alternatively, I suspect that the following might lead to more consistent behavior of the average: if the rotations are all close to each other, then we can use one of the rotations as a reference point and, in essence, use the tangent space to $SO(3)$ at that point rather than the tangent space at the identity.

For a given reference rotation $R_0$ and other rotations $R_1,\dots,R_k$, this amounts to the following: compute $$ X = \frac 1{k+1} \sum_{j=0}^k \log(R_0^T R_j) $$ (of course the $j = 0$ term vanishes, since $R_0^TR_0 = I$), and then take your "average" rotation to be $R_0 \exp(X)$.

One advantage of this method is that, as long as the rotations are close to each other, we can be sure that we won't need the logarithm in (or near) the $\theta = \pi$ case.
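Here is a sketch of this reference-point variant, using SciPy's rotation-vector (axis-angle) representation to play the role of the matrix logarithm (the function name `log_average_about` is mine):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def log_average_about(R0, rotations):
    """Average rotation matrices in the tangent space at the reference R0."""
    # log(R0^T R_j), expressed as rotation vectors theta * v
    vecs = [Rotation.from_matrix(R0.T @ Rj).as_rotvec() for Rj in rotations]
    X = np.mean(vecs, axis=0)
    return R0 @ Rotation.from_rotvec(X).as_matrix()   # R0 * exp(X)
```

Here `rotations` is the full list of trials (it may or may not include `R0` itself); a natural choice of `R0` is one of the estimated rotations.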


Another alternative to consider: you could take the usual average of the matrices, $A = \frac 1n\sum_{i=1}^n A_i$, and then find the nearest orthogonal matrix to this average. As in the solution to Wahba's problem, you can obtain this using the singular value decomposition of $A$. If $A$ has the SVD $A = U \Sigma V^T$, then the nearest element of $SO(3)$ is given by $R = U \tilde \Sigma V^T$, where $$ \tilde \Sigma = \pmatrix{1&0&0\\0&1&0\\0&0&\det(U)\det(V)}. $$ Notably, if the matrices $A_i$ are sufficiently close to each other, then the continuity of the determinant implies that we'll have $\det(U)\det(V) = \operatorname{sgn}(\det(A)) = 1$, so that $\tilde \Sigma = I$.

Also noteworthy: in the case that $\tilde \Sigma = I$, this nearest element of $SO(3)$ is also the orthogonal matrix $U$ from the polar decomposition of $A$.
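A sketch of this projection approach in NumPy (the name `projected_average` is mine):

```python
import numpy as np

def projected_average(rotations):
    """Arithmetic mean of rotation matrices, projected back onto SO(3)."""
    A = np.mean(rotations, axis=0)   # generally not itself a rotation
    U, _, Vt = np.linalg.svd(A)
    # Fix the sign so the result has determinant +1 (a rotation, not a reflection).
    S = np.diag([1.0, 1.0, np.linalg.det(U) * np.linalg.det(Vt)])
    return U @ S @ Vt
```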

Ben Grossmann
2

As others have pointed out, there are multiple interpretations to what the "mean" of rotations actually is, and in practice people optimize objectives that are computationally convenient and work well enough for the purpose at hand. These methods typically trade off ease of implementation with robustness as well as how well the result matches our expectation of what the "average" should be.

I'll suggest an alternative which, in my opinion, is a more natural objective to minimize: the sum of squared geodesic distances (i.e. lengths of shortest paths) on the $SO(3)$ manifold. The "mean" rotation then corresponds to the $L^2$ center of mass, and the geodesic distance between two rotations is the magnitude of the rotation angle needed to align them. The other proposed methods do not actually optimize this metric, mainly because it is difficult to do so (for two rotations a closed-form expression exists for quaternions, given by SLERP, but I'm not aware of one for more than two). The Euclidean average in the tangent space proposed by @Ben Grossmann is similar, but that objective is not invariant to rotation (i.e. to the choice of the point where the tangent space is defined).

I did come across an interesting gradient descent method for quaternions to compute this geodesic-minimizing average that I'll briefly share below, but I recommend checking out the original paper$^1$ if you're interested in more details. The paper also mentions algorithms for computing similar median and minimax centers.

With $n$ quaternions and corresponding weights $\alpha_i$, an initial guess $q_0$, and a gradient-descent step size $\beta>0$, iterate the following, starting from $\overline{q}_0 = q_0$, to converge to the mean quaternion $\overline{q}$: $$\overline{q}_{k+1} = \overline{q}_k^{\frac{1}{2}}\left(\exp\left(\beta\sum_{i=1}^n \alpha_i \log(\overline{q}_k^{-\frac{1}{2}}q_i\overline{q}_k^{-\frac{1}{2}})\right)\right)\overline{q}_k^{\frac{1}{2}}$$

The algorithm terminates when $||\log(\overline{q}_{k+1}^{-\frac{1}{2}}\overline{q}_k\overline{q}_{k+1}^{-\frac{1}{2}})|| < \epsilon$ for some small convergence threshold $\epsilon$. The downside of this approach is that it requires an initial guess, but in my experience it converges rather quickly in many applications if the initial guess is chosen as one of the rotations being averaged. It may also be more complicated to implement than the other approaches, depending on how comfortable you are with quaternion math. However, I think it gives more "accurate" estimates than the other objectives. Hope this is helpful!
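For illustration, here is a minimal NumPy sketch of the iteration above, using scalar-first unit quaternions $[w, x, y, z]$ (all function names are mine, and I assume the inputs have already been flipped into the same hemisphere, i.e. the $q \sim -q$ ambiguity has been resolved):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions in [w, x, y, z] (scalar-first) order."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qlog(q):
    """Logarithm of a unit quaternion, returned as its pure part (a 3-vector)."""
    v = q[1:]
    n = np.linalg.norm(v)
    if n < 1e-12:
        return np.zeros(3)
    return np.arctan2(n, q[0]) * v / n

def qexp(v):
    """Exponential of a pure quaternion given as a 3-vector."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * v / n))

def geodesic_mean(quats, weights=None, beta=1.0, eps=1e-10, max_iter=100):
    """Gradient descent for the geodesic (L2) mean of unit quaternions."""
    n = len(quats)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    qbar = np.asarray(quats[0], dtype=float)   # initial guess: first rotation
    for _ in range(max_iter):
        half = qexp(0.5 * qlog(qbar))          # qbar^{1/2}
        ihalf = qexp(-0.5 * qlog(qbar))        # qbar^{-1/2}
        grad = sum(wi * qlog(qmul(qmul(ihalf, qi), ihalf))
                   for wi, qi in zip(w, quats))
        qnew = qmul(qmul(half, qexp(beta * grad)), half)
        # Stop when qbar and qnew agree to within eps (geodesic distance).
        inew = qexp(-0.5 * qlog(qnew))
        if np.linalg.norm(qlog(qmul(qmul(inew, qbar), inew))) < eps:
            return qnew
        qbar = qnew
    return qbar
```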

$^1$ Angulo, Jesús (2014). "Riemannian $L^p$ Averaging on Lie Group of Nonzero Quaternions." Advances in Applied Clifford Algebras 24. doi:10.1007/s00006-013-0432-2

Edit: fixed typo in formula