Angle eigenvector makes with $x$-axis

Question

Say I have a symmetric matrix that lives in $xy$ space. I will write it as $$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$ I am interested in the covariance matrix, so let me also say $a,c>0$, and $A$ is positive-definite, which means $|b|<\sqrt{ac}$. The eigenvalues of $A$ are therefore $>0$, and the eigenvectors are orthogonal. The eigenvalue problem $A\mathbf{x}=\lambda\mathbf{x}$ reads $$\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix}.$$ For $b\neq 0$, the system of equations can be written as $$\begin{cases} y = x(\lambda-a)/b \\ x = y(\lambda-c)/b, \end{cases}$$ and combined to yield $$ \frac{c-a}{b}=\frac{y}{x}-\frac{x}{y}.$$

In the figure above, the red and blue vectors are the eigenvectors with the larger and smaller eigenvalue respectively. Working with the larger one, I can write the equation above the figure as \begin{align} \frac{c-a}{b} &= \tan{\theta}-\tan{\beta} \\ &= (1+\tan{\theta}\tan{\beta})\tan(\theta-\beta) \\ &= 2\tan(2\theta-\pi/2) \\ &=-2\cot(2\theta), \end{align} where to write the second equality I have used the trigonometric identity for the tangent of a difference. The third line follows from writing $\tan \theta \tan \beta = 1$ and $\beta = \pi/2 - \theta$. Lastly, I used the trigonometric relations between complementary angles to write the last equality, which can be recast as \begin{equation} \tan(2\theta) = \frac{2b}{a -c}. \end{equation}

The following equation $$\theta = \frac{1}{2}\textrm{atan2}(2b,a-c),$$ where $\textrm{atan2}(y,x)=\textrm{Arg}(x+iy)$, can be used to compute the angle that the eigenvector with the larger eigenvalue, $\mathbf{x}$, makes with the $x$-direction. Choosing the principal branch, the range of $\theta$ is $(-90,90]^\circ$, i.e. the right half of the plane.
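A quick numerical check of this formula can be sketched in Python using only the standard library. The helper names and the sample matrices below are assumptions made for illustration; the eigenvalues come from the usual closed form for a symmetric $2\times 2$ matrix.

```python
import math

def eigenvalues(a, b, c):
    """Closed-form eigenvalues (larger, smaller) of [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean + radius, mean - radius

def theta_atan2(a, b, c):
    """Candidate angle theta = atan2(2b, a - c) / 2, in radians."""
    return 0.5 * math.atan2(2.0 * b, a - c)

def residual(a, b, c, theta, lam):
    """Norm of A v - lam v for v = (cos theta, sin theta).

    It vanishes (up to rounding) when v is an eigenvector with eigenvalue lam.
    """
    x, y = math.cos(theta), math.sin(theta)
    return math.hypot(a * x + b * y - lam * x, b * x + c * y - lam * y)

# Hypothetical test matrices (a, b, c), all positive-definite.
samples = [(2.0, 0.5, 1.0), (1.0, 0.5, 2.0), (3.0, -1.0, 2.0), (1.0, -0.9, 1.0)]
for a, b, c in samples:
    lam_large, lam_small = eigenvalues(a, b, c)
    theta = theta_atan2(a, b, c)
    print(f"a={a}, b={b}, c={c}: theta={math.degrees(theta):.2f} deg, "
          f"residual vs larger eigenvalue={residual(a, b, c, theta, lam_large):.1e}")
```

For each sample the residual against the larger eigenvalue should be at rounding level, which is the behaviour described above.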

PROBLEM

As you may have noticed, I eliminated the eigenvalue $\lambda$ in the equation right before the figure. This means the rest of the derivation should hold for both eigenvectors of $A$. However, I have tested the calculation with multiple cases and, magically, the angle $\theta$ always turns out to be the one between the $x$-direction and the eigenvector with the larger eigenvalue.

As an example, we can look at the special case $$A = \begin{pmatrix} a & 0 \\ 0 & c \end{pmatrix},$$ where the eigenvectors are aligned with the coordinate axes, and $a$ and $c$ are the eigenvalues.

If $a>c$, we get $\theta=0^\circ$. However, if $a < c$, we get $\theta = 90^\circ$.
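The diagonal case is easy to reproduce. The snippet below is a minimal sketch with a hypothetical helper `theta_deg`; it assumes $b$ is the floating-point value `0.0` (a negative zero would land on the other side of the branch cut of `atan2` and give $-90^\circ$ when $a<c$).

```python
import math

def theta_deg(a, b, c):
    """theta = atan2(2b, a - c) / 2, returned in degrees."""
    return math.degrees(0.5 * math.atan2(2.0 * b, a - c))

# Diagonal matrices: the eigenvectors lie along the axes,
# and the eigenvalues are a and c.
print(theta_deg(2.0, 0.0, 1.0))  # a > c: larger eigenvalue along the x axis
print(theta_deg(1.0, 0.0, 2.0))  # a < c: larger eigenvalue along the y axis
```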

Can anybody provide me with an explanation of why the ambiguity in the eigenvector that $\theta$ is describing seems to fade away somehow? I'd really appreciate it!

NEW OBSERVATION

I have realised something interesting. If I multiply $$\tan(2\theta) = \frac{2b}{a -c}$$ by $-1$ twice, getting $$ \tan(2\theta) = \frac{-2b}{c-a},$$ and I proceed as before and write $$\theta = \frac{1}{2}\textrm{atan2}(-2b,c-a),$$ now the angle is calculated between the $x$-axis and the eigenvector with the smaller eigenvalue!
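This observation can also be checked numerically. The sketch below (function names and the sample matrix are assumptions for illustration) evaluates the quadratic form $\mathbf{v}^T A \mathbf{v}$ along both angles; for a unit eigenvector this equals its eigenvalue. Since $\textrm{atan2}(-y,-x)$ differs from $\textrm{atan2}(y,x)$ by $\pm\pi$, the two angles should differ by $90^\circ$.

```python
import math

def theta_l(a, b, c):
    """atan2(2b, a - c) / 2, in radians."""
    return 0.5 * math.atan2(2.0 * b, a - c)

def theta_s(a, b, c):
    """atan2(-2b, c - a) / 2, in radians."""
    return 0.5 * math.atan2(-2.0 * b, c - a)

def quadratic_form(a, b, c, theta):
    """v^T A v for the unit vector v = (cos theta, sin theta)."""
    x, y = math.cos(theta), math.sin(theta)
    return a * x * x + 2.0 * b * x * y + c * y * y

def eigenvalues(a, b, c):
    """Closed-form eigenvalues (larger, smaller) of [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean + radius, mean - radius

a, b, c = 1.0, 0.5, 2.0  # hypothetical positive-definite example
tl, ts = theta_l(a, b, c), theta_s(a, b, c)
lam_large, lam_small = eigenvalues(a, b, c)

print("theta_l, theta_s (deg):", math.degrees(tl), math.degrees(ts))
print("form at theta_l:", quadratic_form(a, b, c, tl), "larger eigenvalue:", lam_large)
print("form at theta_s:", quadratic_form(a, b, c, ts), "smaller eigenvalue:", lam_small)
print("|theta_l - theta_s| (deg):", abs(math.degrees(tl - ts)))
```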

In the wikipedia article on $\textrm{atan2}$, specifically the section called "East-counterclockwise, north-clockwise and south-clockwise conventions, etc.", it says:

Apparently, changing the sign of the x- and/or y-arguments and swapping their positions can create 8 possible variations of the $\mathrm{atan2}$ function and they, interestingly, correspond to 8 possible definitions of the angle, namely, clockwise or counterclockwise starting from each of the 4 cardinal directions, north, east, south and west.

I think this brings me closer to the answer to my question but I need some help putting everything together.

VISUAL AID

Let me call the eigenvectors with larger and smaller eigenvalue $\mathbf{L}$ and $\mathbf{S}$ respectively. Let's look at what a few eigenvectors might look like in the $(a-c,b)$ plane. Consider the sketch that follows. The eigenvectors $\mathbf{L}$ and $\mathbf{S}$ are drawn in red and blue respectively, and the angles they make with the $x$ axis are called $\theta_l$ and $\theta_s$ respectively in the top subplot. As you can see, traversing the $(a-c,b)$ plane clockwise leads to the direction of the eigenvectors in the $xy$ plane rotating clockwise too. The arrow heads indicate the side of the vectors which falls in the range $(-90,90]^\circ$.

I have used Mathematica to produce the following surface plots of the angles in the $(a-c,b)$ plane. This is what the angle between the $x$ axis and $\mathbf{L}$, given by $$\theta_l=\frac{1}{2}\textrm{atan2}(2b,a-c),$$ looks like:

And this is what the angle the $x$ axis makes with $\mathbf{S}$, $$\theta_s=\frac{1}{2}\textrm{atan2}(-2b,c-a),$$ looks like:

If you check the surface values you can see they match what my sketch described.

Let's come back to what was described in the wikipedia link.

We can see $\textrm{atan2}(y,x)$ follows the "East-anticlockwise" convention. We have $\theta_l=0^\circ$ when $\mathbf{L}$ is pointing East, and the angle grows as $\mathbf{L}$ rotates anticlockwise.
Now, if we try to make sense of the $\theta_s$ values as if they described $\mathbf{L}$ too, we can see $\theta_s=0^\circ$ when $\mathbf{L}$ is pointing North, and grows as $\mathbf{L}$ rotates anticlockwise. Hence, the $\textrm{atan2}(-y,-x)$ convention might be "North-anticlockwise".

But again, I don't know what is special about $\mathbf{L}$. From the derivation of the angles, either equation could have corresponded to either eigenvector. There is still a missing piece of the puzzle which I believe must lie in the derivation of the equation for $\theta_l$. Can anybody give me a hand? Any insights would be greatly appreciated.

SUMMARY

The equation $$\tan(2\theta) = \frac{2b}{a -c}$$ holds for the azimuths of both eigenvectors, L and S, of the matrix $$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$

Let me define 2 new vectors:

$\mathbf{p}=(a-c,2b)$, with azimuth $\gamma_p=\textrm{atan2}(p_y,p_x)$.
$\mathbf{n}=-\mathbf{p}$, with azimuth $\gamma_n=\textrm{atan2}(-p_y,-p_x)$.

We can write $$\tan(2\theta) = \tan(\gamma_p) = \tan(\gamma_n).$$

It turns out using the first equality we get $$2\theta_l=\gamma_p=\textrm{atan2}(2b,a-c),$$ where $\theta_l$ is the angle between the $x$ axis and the eigenvector with larger eigenvalue, $\mathbf{L}$.

The second equality $$2\theta_s = \gamma_n = \textrm{atan2}(-2b,c-a)$$ gives us $\theta_s$, the angle between the $x$ axis and the eigenvector with smaller eigenvalue, $\mathbf{S}$.

Now the question remaining is why does that happen? How could I have predicted that $\mathbf{p}$ and $\mathbf{n}$ would always have an azimuth that is twice that of $\mathbf{L}$ and $\mathbf{S}$ respectively?

DIAGONALISATION

I have found this derivation by Howard E. Haber from his Physics 116A class of Winter 2011. He obtains the same equation for $\tan(2\theta)$ by diagonalising the matrix $A$ (note the difference in notation: he uses $b$ in $A_{22}$, and $c$ in the off-diagonal terms). He then proceeds by setting constraints on the angle $\theta$. When he plugs his eq. 1 into his eq. 8 he makes it explicit that $\theta$ is measuring the angle between the positive $x$ axis and the eigenvector with the larger eigenvalue. The conclusions drawn are the same as mine, but I somehow bypassed all that when I decided to use the function atan2 (unjustifiably, but it works). The question remains: why does my approach of using atan2 work?

I don't think this affects anything, but it is worth noting that $a$ and $c$ being positive is not sufficient to make all eigenvalues positive; consider the matrix $[[1,-3];[-3,1]]$. — Daniel Gendin, Oct 25 '21 at 22:08
Interesting, that is true. Consider the covariance matrix, which describes an ellipse whose semi-major and semi-minor axes are given by the direction of the eigenvectors, and the dispersion along those directions given by the larger and smaller eigenvalue respectively. How can the dispersion be negative... I am confused — Luismi98, Oct 25 '21 at 22:17
A covariance matrix must be positive definite (this is equivalent to saying all of its eigenvalues are positive). But both diagonal entries being positive is insufficient to establish positive definiteness. So in order for your matrix to be a proper covariance matrix you need to make sure that it is positive definite. — Daniel Gendin, Oct 25 '21 at 22:22
I see! Following this question in this particular case I believe I'd need the condition $|b|<\sqrt{a c}$. I agree it doesn't affect my Q but thanks for the insight! — Luismi98, Oct 26 '21 at 00:07
This derivation from the university of california is very relevant: http://scipp.ucsc.edu/~haber/ph116A/diag2x2_11.pdf — Luismi98, Mar 04 '22 at 16:37
The "atan2 unjustifiably, but it works" phenomenon reminds me of the use of atan2 in Jim Blinn's solution to the 2x2 svd, summarized here and here. Those two inverse tangent calls, implemented using atan2, always annoyingly magically robustly give the singular vectors corresponding to the larger and smaller singular values respectively-- I have never understood why. — Don Hatch, May 20 '25 at 15:36
... furthermore, when I try to rewrite Blinn's algorithm without trig function calls, I mostly succeed except that it's a bloody mess to try to get the two singular vectors to correspond to the two singular values in the correct order; my best attempt required lots of messy case testing. Somehow Blinn's use of atan2 magically picks the correct one of the two choices, always. Infuriating. — Don Hatch, May 20 '25 at 15:43

5xum · Answer 1 · 2021-10-26T07:08:35.110

4

Assuming that $b\neq 0$, the equation $$\frac{c-a}{b} = -2\cot(2\theta)$$ holds for both eigenvectors, and, assuming $c-a\neq 0$, so does the equation $$\tan(2\theta) = \frac{2b}{a-c}.$$

However, you then go from that final equation to the equation $\theta=\frac12\arctan2(2b, a-c)$, and that is where you are making the selection.

Basically, you don't know that you can use $\arctan2$ to calculate your angle $\theta$. Taking this function means picking one of two possible $\theta$ values, and the one you discard belongs to the other eigenvector.

What is interesting is that you always select the eigenvector with the larger eigenvalue. From what I can tell from playing around with the numbers, the point $(a-c, 2b)$ always lies in the same quadrant as the eigenvector of the larger eigenvalue, which is actually quite an interesting property in itself!
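One possible way to see where the selection comes from, offered as a sketch rather than as part of the derivation above: take the unit vector $\mathbf{v}(\theta)=(\cos\theta,\sin\theta)$ and the vector $\mathbf{p}=(a-c,\,2b)$, and expand the quadratic form with the double-angle identities,

$$\begin{aligned} \mathbf{v}^T A\,\mathbf{v} &= a\cos^2\theta + 2b\sin\theta\cos\theta + c\sin^2\theta \\ &= \frac{a+c}{2} + \frac{a-c}{2}\cos 2\theta + b\sin 2\theta \\ &= \frac{a+c}{2} + \frac{1}{2}\,\mathbf{p}\cdot(\cos 2\theta,\sin 2\theta). \end{aligned}$$

Setting the derivative with respect to $\theta$ to zero gives $\tan(2\theta)=\frac{2b}{a-c}$, which both eigenvectors satisfy. The quadratic form is largest when $(\cos 2\theta,\sin 2\theta)$ points along $\mathbf{p}$, i.e. $2\theta=\textrm{atan2}(2b,a-c)$, and smallest when it points along $-\mathbf{p}$, i.e. $2\theta=\textrm{atan2}(-2b,c-a)$. Since the maximum and minimum of $\mathbf{v}^T A\,\mathbf{v}$ over unit vectors are the larger and smaller eigenvalue, attained at the corresponding eigenvectors, the first choice picks out the eigenvector with the larger eigenvalue $\frac{a+c}{2}+\frac{1}{2}|\mathbf{p}|$ and the second the one with the smaller eigenvalue $\frac{a+c}{2}-\frac{1}{2}|\mathbf{p}|$.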

edited Oct 26 '21 at 07:08

answered Oct 25 '21 at 15:43

5xum


Thanks for your answer. You said when I take arctan2 I am "picking one of two possible $\theta$ values". Can you clarify which is the other one? – Luismi98 Oct 25 '21 at 18:59
@Luismi98 I mean the fact that $\tan(\theta)=\tan(\theta+\pi)$. For each value $y$ there exist two angles $\phi_0, \phi_1\in[0,2\pi)$ such that $\tan(\phi)=y$. In your case, there exist two angles $\theta_0,\theta_1$ such that $\tan(2\theta)=\frac{2b}{a-c}$, and $\arctan2$ only selects one of them - the one that belongs to the eigenvector with the higher eigenvalue. – 5xum Oct 26 '21 at 07:04
@Luismi98 Note that $\arctan2$ selects one of the two values based on which quadrant the point $(a-c, 2b)$ lies in, which is where your connection comes from. At its core, your connection comes from the fact that the eigenvector with the larger eigenvalue always lies in the same quadrant as the point $(a-c,b)$. – 5xum Oct 26 '21 at 07:09
I see what you mean, thanks! "the eigenvector with the larger eigenvalue always lies in the same quadrant as the point $(a-c,b)$". Yes! Now the question remaining is why? Why not the eigenvector with smaller eigenvalue, $\mathbf{S}$? And why does $\mathbf{S}$ lie in the same quadrant as $-(a-c,b)$? – Luismi98 Oct 26 '21 at 11:38
@Luismi98 I suspect these answers would be most clearly answered by diagonalizing $A$ first, but I'm not sure... – 5xum Oct 26 '21 at 12:00
This is a full derivation of how diagonalising $A$ looks like: http://scipp.ucsc.edu/~haber/ph116A/diag2x2_11.pdf – Luismi98 Mar 04 '22 at 16:38
In eq. (8) they are explicitly treating $\lambda_1$ and $\lambda_2$ differently. Their whole thing makes sense and it's in agreement with everything in my Q. I still do not understand why my use of the function atan2 works the way it does though – Luismi98 Mar 04 '22 at 16:43
