Component-wise averaging of similar quaternions while handling quaternion "double cover issue"

Question

To average together quaternions in a well-defined way, the eigendecomposition method of Markley et al. may be used, from Averaging Quaternions, Journal of Guidance, Control, and Dynamics, 30(4):1193-1196, June 2007, Eqs. (12) and (13).

However, if all the quaternions in a set are close to each other (meaning that they represent very similar rotations), then element-wise averaging of the quaternions followed by normalization may produce a sufficiently "central" quaternion. (Element-wise averaging is much faster than the eigendecomposition, which is important for some applications.)

However, the quaternions $\bf{q}$ and $-\bf{q}$ represent the same rotation (sometimes called the "double cover issue" of quaternions), so element-wise averaging cannot be applied without first somehow making sure that any quaternions that are to be averaged lie within the same "half" of the rotation group SO(3).
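
To make the problem concrete, here is a minimal Python sketch of what goes wrong. It assumes unit quaternions stored as `(w, x, y, z)` tuples; the names `naive_average` and `about_z` are made up for this illustration.

```python
import math

def dot(p, q):
    """Four-component dot product of two quaternions stored as (w, x, y, z)."""
    return sum(a * b for a, b in zip(p, q))

def naive_average(quats):
    """Average componentwise and normalize, with no sign handling at all."""
    mean = [sum(c) / len(quats) for c in zip(*quats)]
    norm = math.sqrt(sum(c * c for c in mean))
    return tuple(c / norm for c in mean)

def about_z(angle):
    """Unit quaternion for a rotation by `angle` radians about the z axis."""
    return (math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2))

q_a = about_z(0.2)
q_b = about_z(0.4)
q_b_flipped = tuple(-c for c in q_b)  # same rotation as q_b

good = naive_average([q_a, q_b])         # lands on the rotation by 0.3
bad = naive_average([q_a, q_b_flipped])  # lands far from both inputs
```

In this example `good` agrees with `about_z(0.3)`, while `bad` is orthogonal to it as a 4-vector, i.e. it represents a rotation that differs from the expected one by half a turn.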

There are several possible methods for "standardizing" each quaternion in a set of quaternions so that the double-cover issue is not a problem, and I wrote about these in this answer, but I am not sure which of these methods is correct (or optimal, and under what assumptions). Some possible methods for standardizing all quaternions ${\bf q}_i \in Q$ (while ensuring that each quaternion still represents the same rotation) include the following:

1. If the $w$ component is negative, negate the quaternion (i.e. replace ${\bf q}_i$ with $-{\bf q}_i$), so that the $w$ component is non-negative for all quaternions in the set $Q$.
2. Take the dot product of ${\bf q}_1$ with all subsequent quaternions ${\bf q}_i$, for $2 \le i \le N$, and negate any of the subsequent quaternions whose dot product with ${\bf q}_1$ is negative.
3. For each quaternion, measure the angle of rotation about the rotation axis of the quaternion, and normalize it so it always rotates the "short way around", i.e. such that $-\pi \le \theta \le \pi$. If it rotates the "long way around", i.e. $\theta \lt -\pi$ or $\theta \gt \pi$, then negate the quaternion.
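
The first two methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: quaternions are unit-norm `(w, x, y, z)` tuples, and the function names `canonicalize_by_w` and `canonicalize_by_first` are hypothetical.

```python
import math

def dot(p, q):
    """Four-component dot product of two quaternions stored as (w, x, y, z)."""
    return sum(a * b for a, b in zip(p, q))

def negate(q):
    return tuple(-c for c in q)

def canonicalize_by_w(quats):
    """Negate any quaternion whose w component is negative."""
    return [negate(q) if q[0] < 0 else q for q in quats]

def canonicalize_by_first(quats):
    """Negate any quaternion whose dot product with the first one is negative."""
    first = quats[0]
    return [negate(q) if dot(first, q) < 0 else q for q in quats]

# Two similar rotations about the x axis, by slightly less and slightly
# more than half a turn; their w components have opposite signs.
s, c = math.sin(0.01), math.cos(0.01)
quats = [(s, c, 0.0, 0.0), (-s, c, 0.0, 0.0)]

by_w = canonicalize_by_w(quats)          # negates the second quaternion
by_first = canonicalize_by_first(quats)  # leaves both unchanged
```

Here the two inputs already have a positive dot product, so the dot-product rule keeps them as they are, whereas the sign-of-$w$ rule flips the second one and leaves the pair almost antipodal.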

These sometimes produce the same result, but they all produce different results in some cases (i.e. they all can negate different quaternions in a set of quaternions) -- therefore they are not equivalent.

What is the best way to deal with quaternions in a standardized way in order to overcome the double cover issue in situations like this?

Note that it is not just element-wise averaging of quaternions that can cause the double cover issue to affect the results. Another example is the swing-twist decomposition: in a naive implementation, the recovered rotation component around a given axis can represent either a rotation "the short way around" or a rotation "the long way around", which can lead to some unexpected or unstable results if you care only about the rotation about the axis, not the full quaternion.

If the quaternions are really for similar rotations, wouldn't method 2 always produce one of two results representing the same rotation? — David K, Oct 31 '20 at 04:26
It would be natural to require that the resulting average be invariant under multiplication by a unit quaternion, and independent of the choice of ordering of the arguments. None of the three options have both of these properties. One way that would be to choose a quantity that measures how "close together" the arguments are, and choose signs so as to maximize this quantity. — Kajelad, Oct 31 '20 at 06:48
@DavidK I'm not sure what you mean? If $\bf{q}$ and $\bf{p}$ are quaternions, then $\bf{p} \cdot (-\bf{q}) = -(\bf{p} \cdot \bf{q})$. Can you please rephrase? — Luke Hutchison, Nov 01 '20 at 11:44
@Kajelad that's exactly what these three suggestions are trying to achieve: finding which version of the quaternion (itself or its negation) brings it "closest" to the other quaternions when treating the quaternions as 4-dimensional vectors. Can you please elaborate? — Luke Hutchison, Nov 01 '20 at 11:57
There are all kinds of criteria for "closeness" which may or may not be useful. One option would be choose the signs so as to minimize the variance of the set of quaternions; this would have the advantage of being both rotation-invariant and independent of ordering. — Kajelad, Nov 01 '20 at 12:27
What I mean is, if the quaternions in a set $S$ all represent similar rotations, you can partition them into subsets $A$ and $B$ such that all pairs of quaternions from $A$ have positive dot products, all pairs from $B$ have positive dot products, but any quaternion from $A$ and one from $B$ have a negative dot product. So in method 2 you will either negate all quaternions in $A$, producing $\mathbf q_A=\frac1n(-\sum A+\sum B),$ or negate all quaternions in $B$, producing $\mathbf q_B= \frac1n(\sum A-\sum B).$ But $\mathbf q_A=-\mathbf q_B,$ the same rotation. — David K, Nov 01 '20 at 14:11
I'm assuming your "dot product" uses only the imaginary part of the quaternion. If you also use the real part I'm not sure the partition in my previous comment works. I also assume "similar rotation" means "around nearby axes". — David K, Nov 01 '20 at 14:27
@DavidK I'm suggesting to use the full dot product, because that is the suggestion I see everywhere as a way to measure similarity of quaternions (people say use $\mathrm{cos}^{-1}({\bf p} \cdot {\bf q})$, and it's wrong half the time, if the sign of one of the two flips). — Luke Hutchison, Nov 03 '20 at 20:24
@Kajelad this is interesting -- however how are you defining variance? For starters, variance is only defined additively (as the squared difference from the mean) -- and even calculating the mean is not defined if quaternion addition is not defined. You also run into a circular problem here: you can't add quaternions (to calculate the mean) until you solve the sign-flipping problem. Perhaps you mean that Lagrange multipliers, constrained to $-1$ or $+1$, should be used to find the sign assignment with the lowest variance, treating the quaternions as vectors? — Luke Hutchison, Nov 03 '20 at 20:30
@LukeHutchison I'm not quite sure what you mean; addition, mean, and variance are all perfectly well defined in $\mathbb{H}$. Your last sentence seems to be on the right track: given a set of quaternions $q_1,\cdots,q_n$, choose $s_i\in\{1,-1\}$ such that the variance of $s_1q_1,\cdots,s_nq_n$ is minimized, and then define the "rotational average" as$$\frac{\sum_is_iq_i}{\left|\sum_is_iq_i\right|}$$This will be rotationally invariant, independent of order, and will give sensible results if the corresponding rotations are "close together". — Kajelad, Nov 03 '20 at 20:54
Let's see: taking four-element dot products, $\bf p \cdot \bf p = 1$ and $\bf p \cdot -\bf p = -1$, so $\cos^{-1}(\bf p \cdot \bf p) = 0$ and $\cos^{-1}(\bf p \cdot -\bf p) = \pi,$ even though $\bf p$ and $-\bf p$ represent the same rotation. So whatever kind of "similarity of quaternions" this rule is good for, it seems no good for similarity of rotations represented by quaternions. Who recommends this? One high-quality, accessible source will do. — David K, Nov 03 '20 at 23:47
@Kajelad I wouldn't even know where to begin finding the $s_i$ values efficiently, when there are $O(2^{|Q|})$ of them. Any pointers or suggestions please? — Luke Hutchison, Nov 04 '20 at 07:49
@DavidK correct, they are connected: $\mathrm{cos}^{-1}({\bf p})=\pi-\mathrm{cos}^{-1}(-{\bf p})$. But admittedly most of the places I have seen people suggest taking the arc cosine of the dot product of two quaternions to measure their similarity are game development resources, e.g. https://3dgep.com/understanding-quaternions/#Quaternion_Dot_Product And there are various suggestions that the result of arccos should be doubled, e.g. https://www.mathworks.com/matlabcentral/answers/415936-angle-between-2-quaternions#answer_432283 and https://stackoverflow.com/a/21513697/3950982 — Luke Hutchison, Nov 04 '20 at 08:14
@DavidK this answer is promising... https://math.stackexchange.com/a/90098/365886 This seems to indicate a better measure of quaternion similarity is $1 - ({\bf p} \cdot {\bf q})^2$, which handles the fact that ${\bf p} = -{\bf p}$, because of the squaring. However with this squaring, it's impossible to know which signs of which quaternions to flip to enable element-wise averaging. — Luke Hutchison, Nov 04 '20 at 08:15
Indeed it seems https://math.stackexchange.com/a/90098/139123 gives a measure of similarity (or dissimilarity) that has the properties we want. You have convinced me that the four-component dot product is the correct one to use, not the three-component dot product; I was mistakenly fixated on rotation axes in some of my earlier comments. (I may want to clean out some of my comments at some point.) I think I have managed to formalize my intuition about the two sets $A$ and $B$; I've written it up as an answer. — David K, Nov 06 '20 at 04:24
@DavidK Right, taking the three-element dot-product doesn't help for this particular problem, because if you wanted to flip the sign of the three imaginary parts, you can just also flip the sign of the real part to compensate. — Luke Hutchison, Nov 07 '20 at 05:42

score 3 · Accepted Answer · answered Nov 06 '20 at 04:16

As in this answer, let's define $d(\mathbf p, \mathbf q) \triangleq 1 - (\mathbf p \cdot \mathbf q)^2$ to represent the dissimilarity (or "distance") between two quaternions, where $\mathbf p \cdot \mathbf q$ is the usual componentwise inner product of the quaternions treated as four-dimensional vectors.
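
As a small sanity check, the dissimilarity can be written directly in Python. This sketch assumes unit quaternions stored as `(w, x, y, z)` tuples; the function name `dissimilarity` is chosen just for this example.

```python
def dissimilarity(p, q):
    """d(p, q) = 1 - (p . q)^2 for unit quaternions stored as (w, x, y, z)."""
    d = sum(a * b for a, b in zip(p, q))
    return 1.0 - d * d

p = (1.0, 0.0, 0.0, 0.0)        # identity rotation
q = (0.0, 1.0, 0.0, 0.0)        # half turn about the x axis
minus_p = tuple(-c for c in p)  # same rotation as p
```

Because the dot product is squared, `dissimilarity(p, minus_p)` is zero just like `dissimilarity(p, p)`, while the orthogonal pair `p`, `q` gives the maximum value of 1.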

On the assumption that we are only going to average together quaternions that represent similar orientations, let's suppose that we have a set $Q$ containing some finite positive number of unit quaternions and that there exists some unit quaternion $\mathbf q_0$ (not necessarily a member of $Q$) such that for every $\mathbf q \in Q,$

$$ d(\mathbf q_0, \mathbf q) < \frac12. \tag1 $$

For component-wise averaging to be a good method, I think we would actually want the dissimilarity to be much smaller than this bound. I chose $\frac12$ merely because it is small enough to establish a property I want. If a set $Q$ admits a tighter bound, that's fine; what follows will be just as true, but the final result may be even better.

In particular, $d(\mathbf q_0, \mathbf q) < \frac12$ implies that $\lvert \mathbf q_0 \cdot \mathbf q\rvert > \frac{\sqrt2}2$, which implies that either $\mathbf q_0 \cdot \mathbf q > \frac{\sqrt2}2$ and the angle between $\mathbf q_0$ and $\mathbf q$ is less than $\frac\pi4$, or $-\mathbf q_0 \cdot \mathbf q > \frac{\sqrt2}2$ and the angle between $-\mathbf q_0$ and $\mathbf q$ is less than $\frac\pi4$.

This also implies for any two quaternions $\mathbf p,\mathbf q \in Q,$ that $\mathbf q_0 \cdot \mathbf p$ and $\mathbf q_0 \cdot \mathbf q$ both have signs (positive or negative), that if these signs are the same then the angle between $\mathbf p$ and $\mathbf q$ is less than $\frac\pi2$ and therefore $\mathbf p \cdot \mathbf q > 0,$ and that if the signs are opposite then the angle between $\mathbf p$ and $\mathbf q$ is greater than $\frac\pi2$ and therefore $\mathbf p \cdot \mathbf q < 0.$
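
One way to spell out this step, as I read it, is with the triangle inequality for angles on the unit $3$-sphere. Write $\angle(\mathbf a, \mathbf b) = \cos^{-1}(\mathbf a \cdot \mathbf b)$ for unit quaternions (this notation is introduced only here). If both signs are positive, then

$$ \angle(\mathbf p, \mathbf q) \le \angle(\mathbf p, \mathbf q_0) + \angle(\mathbf q_0, \mathbf q) < \frac\pi4 + \frac\pi4 = \frac\pi2, $$

so $\mathbf p \cdot \mathbf q > 0.$ If both signs are negative, the same argument with $-\mathbf q_0$ in place of $\mathbf q_0$ gives the same conclusion. If the signs are opposite, say $\mathbf q_0 \cdot \mathbf p > 0$ and $\mathbf q_0 \cdot \mathbf q < 0,$ then the argument applies to $\mathbf p$ and $-\mathbf q$:

$$ \mathbf p \cdot (-\mathbf q) > 0 \quad\Longrightarrow\quad \mathbf p \cdot \mathbf q < 0. $$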

So we can partition $Q$ into two subsets: the subset $Q_+ = \{\mathbf q\in Q \mid \mathbf q_0 \cdot \mathbf q > 0\}$ and $Q_- = \{\mathbf q\in Q \mid \mathbf q_0 \cdot \mathbf q < 0\}$. Any two quaternions from one subset will have a positive dot product, whereas any two quaternions from different subsets will have a negative dot product.
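
The partition can be checked numerically on a small example. The sketch below assumes `(w, x, y, z)` tuples and uses hypothetical helper names (`partition_by_reference`, `about_z`); the reference quaternion is taken to be the identity purely for convenience.

```python
import math
from itertools import combinations

def dot(p, q):
    """Four-component dot product of two quaternions stored as (w, x, y, z)."""
    return sum(a * b for a, b in zip(p, q))

def negate(q):
    return tuple(-c for c in q)

def about_z(angle):
    """Unit quaternion for a rotation by `angle` radians about the z axis."""
    return (math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2))

def partition_by_reference(quats, q0):
    """Split quats by the sign of their dot product with the reference q0."""
    q_plus = [q for q in quats if dot(q0, q) > 0]
    q_minus = [q for q in quats if dot(q0, q) < 0]
    return q_plus, q_minus

q0 = about_z(0.0)
# Small rotations about z; two of them are stored with flipped sign.
quats = [about_z(0.2), negate(about_z(-0.3)), about_z(0.1), negate(about_z(0.4))]
q_plus, q_minus = partition_by_reference(quats, q0)

# Pairs within one subset should have positive dot products,
# pairs across the two subsets negative ones.
same_subset_ok = all(
    dot(p, q) > 0 for subset in (q_plus, q_minus) for p, q in combinations(subset, 2)
)
cross_subset_ok = all(dot(p, q) < 0 for p in q_plus for q in q_minus)
```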

Now consider method 2. If the quaternion $\mathbf q_1$ is in $Q_+$, then after replacing $\mathbf q_i$ with $-\mathbf q_i$ whenever $\mathbf q_1\cdot\mathbf q_i<0,$ all the quaternions will be in $Q_+$ and the final result of averaging these quaternions and normalizing the result will be some quaternion $\bar{\mathbf q}.$ On the other hand, if $\mathbf q_1$ is in $Q_-$, then after replacing $\mathbf q_i$ with $-\mathbf q_i$ whenever $\mathbf q_1\cdot\mathbf q_i<0,$ all the quaternions will be in $Q_-$ and the final result will be $-\bar{\mathbf q},$ that is, the exact opposite of the quaternion we would have gotten if $\mathbf q_1$ were in $Q_+$, representing the exact same rotation.

Hence, given a finite set of orientations that are sufficiently similar, the final result is completely independent of which of the two possible quaternions is selected to represent each orientation. Moreover, the quaternions that are figured into the final average are all relatively close together on the $3$-sphere; whereas if you take any method that is not equivalent to this one, the difference between the methods must manifest in the fact that the alternative method averages one or more quaternions from $Q_+$ with one or more quaternions from $Q_-$, which will certainly introduce worse undesired cancellation effects than using quaternions from only one subset.
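
A short numerical check of this independence claim, as a sketch only: quaternions are assumed to be unit-norm `(w, x, y, z)` tuples, and `average_method_2` and `about_z` are names invented for the example.

```python
import math

def dot(p, q):
    """Four-component dot product of two quaternions stored as (w, x, y, z)."""
    return sum(a * b for a, b in zip(p, q))

def negate(q):
    return tuple(-c for c in q)

def about_z(angle):
    """Unit quaternion for a rotation by `angle` radians about the z axis."""
    return (math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2))

def average_method_2(quats):
    """Align every quaternion with the first, average componentwise, normalize."""
    first = quats[0]
    aligned = [negate(q) if dot(first, q) < 0 else q for q in quats]
    mean = [sum(c) / len(aligned) for c in zip(*aligned)]
    norm = math.sqrt(sum(c * c for c in mean))
    return tuple(c / norm for c in mean)

rotations = [about_z(a) for a in (0.10, 0.20, 0.30)]
avg_a = average_method_2(rotations)
# Same rotations, but the first one is stored with the opposite sign.
avg_b = average_method_2([negate(rotations[0]), rotations[1], rotations[2]])
# Same rotations again, reordered and with a different sign flipped.
avg_c = average_method_2([rotations[2], negate(rotations[1]), rotations[0]])
```

In this example `avg_b` comes out as the negation of `avg_a`, and `avg_c` matches `avg_a` up to rounding, so all three represent the same rotation.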

I would therefore choose method 2.

Many thanks for persisting -- I think you have the correct answer here. That makes logical sense. — Luke Hutchison, Nov 07 '20 at 05:56
