8

In learning about solutions to linear recurrences of the form $a_n = ia_{n-1}+ ja_{n-2}$, where $i, j$ are known constants, I came across the technique of using characteristic equations. I was learning this as a part of contest prep, and not as an undergraduate student, so I am not familiar with linear algebra.

The technique showed how we assume $a_n = \lambda^n$ to be a solution of the recurrence, and thus obtain: $\lambda^n - i\lambda^{n-1} - j\lambda^{n-2} = 0$, or equivalently, $\lambda^2 - i\lambda - j = 0$. Then, if the equation has two roots $\alpha, \beta$, the solution of the recurrence is $a_n = c_1\alpha^n + c_2\beta^n$, where you determine $c_1, c_2$ using initial conditions.

I am confused regarding the motivation for the assumption, which popped seemingly out of nowhere, that $a_n = \lambda^n$ is a solution. Is there any motivation that is not based on linear algebra? If not, can someone explain the linear algebra involved in simple terms? I've seen the terms linear maps and shift operators being used, but couldn't understand any of the notation or idea behind them. Also, is there a proof of the fact that $a_n = c_1\alpha^n + c_2\beta^n$ is indeed the solution, that does NOT use induction?


7 Answers

3

In the remainder, I'll use $\alpha$ and $\beta$ instead of $i$ and $j$, since complex numbers will be involved and $i$ may be confusing when it denotes something other than $\sqrt{-1}$.

I am confused regarding the motivation for the assumption

So here is an approach that doesn't use some ansatz out of the blue: We have $$ a_n=\alpha a_{n-1} + \beta a_{n-2} \qquad;\quad n\geqslant 2 \tag 1 $$ which can be written as

$$ \binom{a_{n+1}}{a_n} = \underbrace{\left(\matrix{\alpha&\beta \\ 1&0}\right)}_{\textstyle =:M} \binom{a_n}{a_{n-1}}\qquad;\quad n\geqslant1 \tag2 $$ with a constant 2×2 matrix $M$. The recurrence is hence $$ \binom{a_{n+1}}{a_n} = M \binom{a_n}{a_{n-1}}\qquad;\quad n\geqslant1 \tag3 $$ and after $n$ multiplications with $M$ we get:

$$ \binom{a_{n+1}}{a_n} = M^n \binom{a_1}{a_0}\qquad;\quad n\geqslant0 \tag4 $$ Thus, the computation of the $n$-th value boils down to computing $M^n$ for a given 2×2 matrix $M$.
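For instance, (4) can be checked numerically with a few lines of Python (a minimal sketch; the function name is ad hoc):

```python
import numpy as np

def a_n(alpha, beta, a0, a1, n):
    """Compute a_n for a_n = alpha*a_{n-1} + beta*a_{n-2} via (4)."""
    M = np.array([[alpha, beta],
                  [1.0,   0.0]])
    v = np.array([a1, a0], dtype=float)   # the vector (a_1, a_0)^T
    for _ in range(n):                    # n multiplications with M, as in (4)
        v = M @ v
    return v[1]                           # lower component of (a_{n+1}, a_n)^T

# Fibonacci numbers: alpha = beta = 1, a_0 = 0, a_1 = 1
print([a_n(1, 1, 0, 1, k) for k in range(8)])   # 0, 1, 1, 2, 3, 5, 8, 13 (as floats)
```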

Now suppose that $M$ has two distinct eigenvalues $\lambda_1$ and $\lambda_2$ with corresponding eigenvectors $v_1$ and $v_2$, i.e.

$$ M v_m = \lambda_m v_m \qquad;\quad m\in\{1,2\} \tag 5 $$

then we have

$$ M^n v_m = \lambda_m^n v_m \qquad;\quad m\in\{1,2\} \;;\; n\geqslant 0\tag 6 $$

As the eigenvalues are distinct, the eigenvectors are linearly independent and hence span all of $\Bbb R^2$ (or $\Bbb C^2$, if the eigenvalues are complex), so there is a representation

$$ \binom{a_1}{a_0} = c_1v_1 + c_2v_2 \tag7 $$ and therefore $$\begin{align} \binom{a_{n+1}}{a_n} &= M^n \binom{a_1}{a_0} \\ &\stackrel{(7)}= M^n c_1v_1 + M^n c_2v_2 \\ &\stackrel{(6)}= \lambda_1^n c_1v_1 + \lambda_2^n c_2v_2 \tag{10} \end{align}$$

So all you have to do is to determine $\lambda_m$, $v_m$ and $c_m$ where the $c_m$ are solutions of the 2×2 linear system (7).

What remains is the corner case where the two eigenvalues are the same. This can be handled by taking the limit $\lim_{\lambda_2\to\lambda_1}$, which leads to solutions of the form $(c_1+c_2n)\lambda_1^n$.

All this generalizes nicely to the $n$-dimensional case, i.e. to linear recurrences of order $n$. See for example this answer.

Example: $\alpha = \beta=1$

With $\alpha = \beta=1$, the eigenvalues and eigenvectors of $M=\left(\matrix{1&1 \\ 1&0}\right)$ are such that $$ \lambda_m ^2 = \lambda_m+1 \tag 8 $$ which you can get by finding the roots of the characteristic polynomial of $M$. Also notice that due to (8), we have: $$ M\cdot\binom{\lambda_m}1=\left(\matrix{1&1 \\ 1&0}\right)\cdot\binom{\lambda_m}1 =\binom{\lambda_m+1}{\lambda_m} \stackrel{(8)}= \lambda_m\cdot\binom{\lambda_m}1 \tag9 $$ and hence we can take $$ v_m=\binom{\lambda_m}1 $$ as eigenvector for the respective eigenvalue. The condition (7) for the coefficients $c_m$ becomes:

$$ \binom{a_1}{a_0} = c_1\binom{\lambda_1}1 + c_2\binom{\lambda_2}1 = \left(\matrix{\lambda_1&\lambda_2\\1&1}\right)\binom{c_1}{c_2} \tag{7a} $$ $$\begin{align} \iff \binom{c_1}{c_2} &= \left(\matrix{\lambda_1&\lambda_2\\1&1}\right)^{-1} \binom{a_1}{a_0} \\ &= \frac1{\lambda_1-\lambda_2} \left(\matrix{1&-\lambda_2\\-1&\lambda_1}\right) \binom{a_1}{a_0} \end{align}$$

When you plug in $\lambda_1=\phi$ and $\lambda_2=-1/\phi$, where $\phi=\frac12(1+\sqrt5)$ is the Golden Ratio, and insert everything into (10), you get Binet's formula for the Fibonacci numbers.
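As a numerical cross-check of (10) (a sketch assuming the Fibonacci initial values $a_0=0$, $a_1=1$):

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2            # lambda_1
psi = -1 / phi                        # lambda_2

a0, a1 = 0.0, 1.0                     # Fibonacci initial values
c1 = (a1 - psi * a0) / (phi - psi)    # coefficients from inverting (7a)
c2 = (-a1 + phi * a0) / (phi - psi)

# the lower component of (10) is a_n = c1*phi^n + c2*psi^n (Binet's formula)
print([round(c1 * phi**n + c2 * psi**n) for n in range(10)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```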

  • I will catch up with eigenvectors and eigenvalues, after which I hope I can understand the second part of the answer – Illusioner_ Jan 05 '25 at 05:38
  • 1
    @Illusioner_ I added a sketch of an example calculation for $\alpha=\beta=1$, maybe it's clearer now how to arrive at the final explicit formula. – emacs drives me nuts Jan 05 '25 at 11:28
2

Before you read the rest of it, let me summarise: a quadratic difference equation ought to, just as an ordinary quadratic does, factorise into linear parts. If the application of both linear parts in succession maps the sequence to zero, then the application of the first linear part to a solution gives you something which solves the linear equation associated to the second part; since solving linear equations is easier, we are happy about this. Since factorisation and root-finding are closely linked in the polynomial case, it is reasonable and correct to expect them to be linked in this more abstract setting.


Let me pass to the continuous analogue of this. If we want to solve $$y''+\alpha y'+\beta y=0$$ for twice-differentiable functions defined on a neighbourhood of $0$, with $\alpha,\beta$ real constants, we can remember that solving $y'+\sigma y=0$ is rather easy, and hope to break our degree-$2$ equation/behaviour/operations into a sequence of degree-$1$ equations/operations. In physics, we enjoy the Hamiltonian formalism for many reasons, in no small part because Newton's second-order equation $F=ma$ was reduced to two first-order equations. This is a well-motivated, natural idea, I think, and it also takes inspiration from completing the square, an ancient technique for solving quadratic equations: by understanding $x^2+2bx+a=(x+b)^2+(a-b^2)$, we are applying the simpler, linear, expression $(x+b)$ twice, in some sense, and this leads to its solution.

We might want to say $y''+\alpha y'=(y'+\alpha y)'$; using the idea of decoupling the system into two first-order equations, we might try to declare a variable $z=y'+\alpha y$, and then we have to solve $z'=-\beta y$, but it is not clear how to express $y$ in terms of $z$. The right hand side has no $y'$ term in it, for instance; ideally, we just want to solve $z'+\sigma z=0$ because, as I mentioned, this is easy. Maybe $z:=y'+\alpha y$ doesn't work, then. It is reasonable to imagine $z:=y'+\lambda y$ for some $\lambda$ could work, though, and if we want to identify $\lambda$ and $\sigma$ such that $z'+\sigma z=0=y''+\alpha y'+\beta y$ then we realise we need $\lambda+\sigma=\alpha$ and $\sigma\lambda=\beta$; these are just the negatives of the roots of the auxiliary equation.

In terms of difference equations, if the sequence $z_n=a_{n+1}+\lambda a_n$ is understood, then the sequence $a_n$ is potentially easy to "integrate" (sum up); if we want to understand $z_n$, we might want it to satisfy a linear equation $z_{n+1}+\sigma z_n=0$, i.e. $a_{n+2}+(\lambda+\sigma)a_{n+1}+\sigma\lambda a_n=0$, and then indeed we have to solve the auxiliary equation. $z_{n+1}=\gamma z_n$ follows, where $\gamma=-\sigma$ is a root of the auxiliary equation, so $z_n=\gamma^n\cdot z_0$ easily follows, and then $a_{n+1}=\gamma' a_n+\gamma^n\cdot z_0$ for the "other" (sometimes they are the same) root $\gamma'=-\lambda$, and this one is also not too challenging to solve. The technique of integration in calculus becomes summation, here: $$a_n-(\gamma')^na_0=(a_n-\gamma'a_{n-1})+\gamma'(a_{n-1}-\gamma' a_{n-2})+(\gamma')^2(a_{n-2}-\gamma'a_{n-3})+\cdots\\=z_0\cdot(\gamma^{n-1}+\gamma'\cdot\gamma^{n-2}+(\gamma')^2\cdot\gamma^{n-3}+\cdots)$$ Such sums can be evaluated, and the value of $a_n$ obtained in general. You can think of this as integrating the "differential operator" $a\mapsto(a[+1]-\gamma'a)$. In the case of calculus, given $x^2+\alpha x+\beta=(x+\lambda)(x+\sigma)$, we can express: $$y''+\alpha y'+\beta y=\left(\frac{d}{dx}+\lambda\right)\left(\frac{d}{dx}+\sigma\right)y=0$$ and so it follows that $z:=(d/dx+\sigma)y$ must satisfy $(d/dx+\lambda)z=0$; we solve this first, and then solve for $y$ by "integrating the differential operator", so to speak.
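For the curious, the two-step procedure above fits in a few lines of Python (a minimal sketch; all names are ad hoc, and $\gamma,\gamma'$ denote the roots of the auxiliary equation):

```python
from math import sqrt

def solve_two_step(gamma, gamma_p, a0, a1, N):
    """Solve a_{n+2} = (gamma + gamma_p)*a_{n+1} - gamma*gamma_p*a_n by
    two first-order steps: z_n := a_{n+1} - gamma_p*a_n satisfies
    z_n = gamma^n * z_0, hence a_{n+1} = gamma_p*a_n + gamma^n * z_0."""
    z0 = a1 - gamma_p * a0
    a = [a0]
    for n in range(N):
        a.append(gamma_p * a[-1] + gamma**n * z0)   # "integrate" the linear part
    return a

# Fibonacci: the auxiliary equation is t^2 - t - 1 = 0
g, gp = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
print([round(x) for x in solve_two_step(g, gp, 0, 1, 9)])   # 0, 1, 1, 2, ..., 34
```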

FShrike
  • 46,840
2

I think that, unfortunately, linear algebra is deeply involved in the motivation, but I will try to provide a simple explanation. Let $A$ be the set of sequences $(a_n)$ (where $n=1,2,\dots$) of real (or complex, if you wish) numbers satisfying the given recurrence condition $a_n=ia_{n-1}+ja_{n-2}$ for each natural $n\ge 3$. We can do the following operations with elements of $A$. Namely, if $(a_n)\in A$ and $c$ is a real (or complex, if you wish) number, then $c(a_n)$ is the sequence whose consecutive members are $ca_1,ca_2,\dots$; it is easy to see that the sequence $c(a_n)$ also belongs to $A$. Moreover, if $(b_n)\in A$, then $(a_n)+(b_n)$ is the sequence whose consecutive members are $a_1+b_1,a_2+b_2,\dots$; it is easy to see that the sequence $(a_n)+(b_n)$ also belongs to $A$. So if we have two sequences $(a_n),(b_n)$ belonging to $A$, then we have a big bunch of sequences of the form $c_1(a_n)+c_2(b_n)$ belonging to $A$, for any real (or complex) $c_1$ and $c_2$.

Why is this useful? Suppose that we have to find a sequence $(d_n)\in A$ given only $d_1$ and $d_2$ (which determine the whole sequence $(d_n)$ by the recurrence relation). If we find some real (or complex) $c_1$ and $c_2$ such that $c_1a_1+c_2b_1=d_1$ and $c_1a_2+c_2b_2=d_2$, then $c_1(a_n)+c_2(b_n)$ is the required sequence. Geometric progressions provide an easy way to find the needed partial solutions $(a_n)$ and $(b_n)$, because if there exists a nonzero real (or complex) number $\lambda$ such that $a_n=\lambda^n$ for each natural $n$, then $(a_n)\in A$ if and only if $\lambda^2-i\lambda-j=0$. Finally, I note that if the equation has equal roots, then for the first basic sequence $(a_n)$ we have $a_n=\lambda^n$ for each natural $n$, but for the second basic sequence $(b_n)$ we have $b_n=n\lambda^n$ for each natural $n$.
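A small Python sketch of this recipe (the values $d_1,d_2$ below are arbitrary examples): take the two geometric solutions, then solve the $2\times2$ system for $c_1,c_2$.

```python
import numpy as np

i, j = 1, 1                               # recurrence a_n = i*a_{n-1} + j*a_{n-2}
lam1, lam2 = np.roots([1, -i, -j])        # roots of lambda^2 - i*lambda - j = 0

d1, d2 = 1.0, 3.0                         # the given first two members of (d_n)
# solve c1*lam1 + c2*lam2 = d1 and c1*lam1^2 + c2*lam2^2 = d2
c1, c2 = np.linalg.solve([[lam1, lam2], [lam1**2, lam2**2]], [d1, d2])

d = [c1 * lam1**n + c2 * lam2**n for n in range(1, 9)]
# check the recurrence d_n = i*d_{n-1} + j*d_{n-2} for n >= 3
print(np.allclose(d[2:], [i * d[k - 1] + j * d[k - 2] for k in range(2, 8)]))  # True
```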

Alex Ravsky
  • 106,166
2

Mmmmh, trying to focus on the questions that you ask.

Q1: Is there a motivation that is not linear algebra?
A1: Yes, just play. I would call $a_n=\lambda^n$ an Ansatz, a guess, not an assumption. Actually, it is quite natural if you start with $a_0=1$ and $a_1=\lambda$ to get going...

Q2: Is there a proof of the fact that $a_n=c_1α^n+c_2β^n$ is indeed the solution, that does NOT use induction?
A2: I don't quite see an induction in plugging the solution in and testing it for general $n$.

In my eyes, the key non-trivial thing is that there are no other solutions. I guess you would show that

  • by demonstrating that $a_0$ and $a_1$ fix a solution (given $a_0$ and $a_1$, the solution is determined; there is no other one)
  • by demonstrating that you can cover all choices of $a_0$ and $a_1$ with the given general solution

I guess that will be using induction and a system of 2 equations with 2 unknowns.
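The second point can even be checked symbolically; a sketch with sympy (assuming distinct roots $\alpha\neq\beta$):

```python
import sympy as sp

a0, a1, alpha, beta, c1, c2 = sp.symbols('a0 a1 alpha beta c1 c2')

# match the general solution c1*alpha^n + c2*beta^n at n = 0 and n = 1
sol = sp.solve([c1 + c2 - a0, c1*alpha + c2*beta - a1], [c1, c2])
print(sp.simplify(sol[c1]))   # (a0*beta - a1)/(beta - alpha)
print(sp.simplify(sol[c2]))   # (a1 - a0*alpha)/(beta - alpha)
# a solution exists for every (a0, a1) exactly because beta - alpha != 0
```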

Michael T
  • 1,742
2

Some curious facts: think of

$$ a_n = i a_{n-1} $$

the solution is given by

$$ a_n = i^n a_0 $$

now in the case of

$$ a_n = i a_{n-1}+j a_{n-2} $$

we can equivalently represent it as

$$ \left(\matrix{\alpha_1\\ \alpha_2}\right)_n=\left(\matrix{0&1 \\ j & i}\right)\left(\matrix{\alpha_1\\ \alpha_2}\right)_{n-1} $$

with solution

$$ \left(\matrix{\alpha_1\\ \alpha_2}\right)_n=\left(\matrix{0&1 \\ j & i}\right)^n\left(\matrix{\alpha_1\\ \alpha_2}\right)_{0} $$

and in general, for constant $M$ we have

$$ \alpha_n = M^n\alpha_0 $$

with $$ M = \left(\matrix{0&1&0&0&\cdots&0\\ 0&0&1&0&\cdots&0\\ 0&0&0&1&\cdots&0\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ -c_1&-c_2&-c_3&-c_4&\cdots&-c_n}\right) $$

Now, with the help of the Cayley-Hamilton theorem, we can affirm that

$$ p(\lambda) = \det\left(\lambda I-M\right) = \lambda^n + \sum_{k=1}^{n}c_k\lambda^{k-1} $$

and the roots of $p(\lambda)=0$ are the eigenvalues of $M$. Note that the polynomial coefficients are, up to sign, the recurrence constants!
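A quick numerical illustration (a sketch for the $2\times2$ Fibonacci-type case):

```python
import numpy as np

i, j = 1, 1                               # a_n = i*a_{n-1} + j*a_{n-2}
M = np.array([[0.0, 1.0],
              [j,   i]])                  # companion matrix from above

print(np.sort(np.linalg.eigvals(M)))      # eigenvalues of M
print(np.sort(np.roots([1, -i, -j])))     # roots of lambda^2 - i*lambda - j
# both lines print the same pair: -0.618..., 1.618...
```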

Cesareo
  • 36,341
  • $\left(\matrix{\alpha_1\\ \alpha_2}\right)_n$ What is this notation? I saw the same "type" of matrix vector multiplication in @emacs drives me nuts answer. Is this just different notation for the same thing? – Illusioner_ Jan 05 '25 at 05:42
  • 1
    $$ \left(\matrix{\alpha_1\\ \alpha_2}\right)_n \equiv \left(\matrix{\alpha_{1,n}\\ \alpha_{2,n}}\right) $$ – Cesareo Jan 05 '25 at 10:02
  • Great insight. I'm assuming the general case ties to a relation $a_n = c_1a_{n-1} + c_2a_{n-2} + \cdots + c_ka_{n-k}$? – Illusioner_ Jan 05 '25 at 11:52
1

This bothered me a lot as a student, too. To make the point made here, it is unfortunately necessary to involve the notions of eigenvalues and eigenfunctions, but only in a relatively elementary way.

To see where the out-of-the-blue "ansatz" $\lambda^n$ comes from, let's introduce a shift operator $S$ defined as: $$ S(a_n)=a_{n+1} $$

If we look at:

$$a_n=i a_{n-1}+j a_{n-2}$$

we can, for convenience, rewrite this as:

$$a_{n+2}=i a_{n+1}+j a_n$$

Let's now use the shift operator to express this (in a bit clunky way) as:

$$S(S(a_{n}))=i S(a_n)+j a_n$$

Now, the recurrence relationship is linear and homogeneous (no constant term) so it is fair to assume that a term in the sequence is just a constant times the previous term:

$$ S(a_{n})=\alpha a_{n} $$

Using this, we get

$$\alpha^2 a_n=i \alpha a_n +j a_n$$

which is very useful since we can divide by $a_n$ (assuming the terms are not all zero) and solve for $\alpha$.

So, this will give us the value of $\alpha$ but we still don't know what $a_{n}$ is.

We can now ask ourselves in more general terms which very special expression $\alpha$ is such that: $$ S(a_{n})=\alpha a_{n} $$

It turns out (by trial and error) that sequences of the type $a_n=\lambda^n$, or, more generally, $a_n=c\lambda^n$, will answer this question.

To see why, start with the equation:

$$ S(a_{n})=\alpha a_{n} $$

Now, by inserting $a_n=\lambda^n$ we get:

$$ S(a_{n})=a_{n+1}=\lambda^{n+1} $$

However, at the same time, assuming $a_n=\lambda^n$, we also have that:

$$ \alpha a_n=\alpha \lambda^n $$

We now would like these two relationships to be equal for all $n$, which means that we must have:

$$ \lambda^{n+1}=\alpha \lambda^n $$

Assuming $ \lambda \neq 0 $, we can divide both sides by $\lambda^n $ and get:

$$ \lambda=\alpha $$

To sum all up, by introducing the shift operator:

$$ S(a_n)=a_{n+1} $$

and the homogeneous linear relationship:

$$ S(a_{n})=\alpha a_{n} $$

we have concluded that $a_{n}=\lambda^n$ are the eigenfunctions of the shift operator and that $\lambda$ is the corresponding eigenvalue.

It is the fact that these are the eigenfunctions and eigenvalues of the shift operator, usually left unsaid, that makes the introduction of $\lambda^n$ seem completely out of the blue. This is quite typical and, as shown in some of the other posted answers, is also characteristic of the continuous case of differential equations and their eigenvalues and eigenfunctions. You can say that $\lambda^n$ plays the same role for linear recurrence relations as $\cos(2\pi n x)$, $\sin(2\pi n x)$, and $e^{i 2 \pi n x}$ play for linear differential equations.

Note: In general you have that the solution looks like $a_n=c_1\lambda_1^n+c_2\lambda_2^n$, since there are two roots of the equation in $\alpha$ above. Furthermore, in the case that $\alpha$ is a double root, we instead get that the eigenfunctions are $a_n=(c_1+c_2 n)\lambda^n$.
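A tiny Python check of the eigenfunction property (a sketch; truncating to finitely many terms is just for illustration):

```python
lam = 3.0
a = [lam**n for n in range(10)]         # candidate eigenfunction a_n = lambda^n

shifted = a[1:]                         # S(a)_n = a_{n+1}
scaled = [lam * x for x in a[:-1]]      # lambda * a_n

print(all(abs(s - t) < 1e-9 for s, t in zip(shifted, scaled)))  # True: S(a) = lambda*a
```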

Jap88
  • 1,508
  • "so it is fair to assume that a term in the sequence is just a constant times the previous term: $S(a_{n})=\alpha a_{n}$". But don't we already have a well defined recurrence for $S(a_n)$, that is $S(a_n) = ja_{n-1} + iS(a_{n-1})$. – Illusioner_ Jan 05 '25 at 05:54
  • 1
    Good question. $S(a_n)=\alpha a_n$ is just another guess, just like $\lambda^n$. It will only help us on the way to the final solution, which typically is a superposition of multiple solutions. In this case we get two solutions superimposed, since we have a second-order relation. But $S(a_n)=\alpha a_n$ is a structured guess that breaks down the problem into simpler parts. This is typically the role of finding eigenvalues and eigenfunctions (also for differential equations). In the end you find the total solution by summing them up, since linearity means the superposition principle applies. – Jap88 Jan 05 '25 at 06:11
0

This becomes really quite transparent once you formulate the problem in the proper terms; this does require the language of linear algebra, but you do not need to know much of its theory.

The first point is to view as basic objects entire infinite sequences of numbers of your favourite variety (rational, real or complex); these numbers are called scalars. Since we are also going to deal with polynomials, it is best to work with complex numbers (so that all non-constant polynomials will have roots), so I will take $\def\C{\Bbb C}\C$ as the set of scalars. Technically, a sequence is then defined as a map $\def\N{\Bbb N}\N\to\C$, so that every term $a_i$ of the sequence is associated to a particular natural number$~i$ (its position in the sequence). The universe $U$ of all sequences is vast, but by imposing recurrence relations we will be interested only in a minuscule part of it. There is structure that makes it easier to understand $U$: we can add together two sequences (by adding terms at each position separately), and we can multiply an entire sequence by a scalar (by multiplying by that same number at each position). This allows us to call $U$ a (vector) space of sequences.

One can interpret a recurrence relation as a system of infinitely many equations, one for each position at which the recurrence applies, and by the nature of these equations (called their linearity), the set of solutions forms a subspace of $U$: the sum of two solutions, or a scalar multiple of a solution, is again a solution. However, there is a point of view that simplifies things: view each recurrence relation as a single (linear) equation relating elements of $U$. To this end we use a (linear) map $f:U\to{U}$ (such a map is called an operator on$~U$), and impose on a sequence $S$ the condition $f(S)=C$ for some specific sequence $C$. In practice $C$ will always be the zero sequence, and in this case we say $S$ lies in the kernel of the operator$~f$. For each recurrence relation$~R$, the subspace of sequences that satisfy it will form the kernel of a specific operator associated to$~R$.

To produce a repertoire of operators, we use a single basic operator$~\phi$, and the possibility to iterate it, as well as the possibility to add operators and multiply them by a scalar (the latter via the rules $(f+g)(S)=f(S)+g(S)$ and $(\lambda f)(S)=\lambda f(S)$ for any sequence $S\in{U}$ and $\lambda\in\C$). Iteration of $\phi$ is written $\phi\circ\phi$ for the composite map $S\mapsto \phi(\phi(S))$; this second iteration is abbreviated $\phi^2$, and one similarly defines $\phi^n$ for any $n\in\N$ (with $\phi^0$ the identity operator $S\mapsto S$). By combining with addition and scalar multiplication we can form polynomials in $\phi$ like $3\phi^5-6\phi^4-\phi^2+7\phi-2\phi^0$, which is naturally associated to the polynomial $3X^5-6X^4-X^2+7X-2$. If $P$ is that polynomial in $X$, we shall write $P[\phi]$ for the corresponding polynomial in $\phi$. It is not hard to see that multiplication of polynomials corresponds to composition of the corresponding polynomials in $\phi$ as linear operators: $(PQ)[\phi]=P[\phi]\circ Q[\phi]$; at bottom, this comes from the relation $\phi^k\circ\phi^l=\phi^{k+l}$ for $k,l\in\N$. This formal calculus does not depend on what the linear operator $\phi$ is precisely.

Now for recurrence relations, we use as $\phi$ the operator of shifting back one position: for each $i\in\N$, the term at position $i$ in $\phi(S)$ is the number that was the term at position $i+1$ of the sequence $S$. (So the term at position $0$ in $S$ has no effect on $\phi(S)$ (it is "forgotten" by the operator), and every term of $S$ beyond that is represented in $\phi(S)$, but at a position one less than where it was originally.) If you try to express that as a formula, it reads $\phi:((a_i)_{i\in\N})\mapsto((a_{i+1})_{i\in\N})$, but you may find that not very easy to read. Then $\phi^k$ ends up forgetting $k$ initial terms, and shifting the remaining terms back to start at position $0$. The kernel of $\phi$ consists of the sequences that have an arbitrary initial term but are $0$ at all further positions. Somewhat more interestingly, the kernel of $\phi-\lambda\phi^0$ (with $\lambda\in\C$) consists of the sequences $((a_i)_{i\in\N})$ that at each position $i$ satisfy $a_{i+1}-\lambda a_i=0$ (because the left hand side is the term at position $i$ of $(\phi-\lambda\phi^0)((a_i)_{i\in\N})$, and it is required to be zero). But that is just the recursion relation $a_{i+1}=\lambda a_i$ (which is why we chose to write a minus sign before $\lambda$ in the operator). Similarly, the kernel of the operator $\phi^2-\phi-\phi^0$ consists of the sequences that satisfy the Fibonacci recurrence $a_{i+2}=a_{i+1}+a_i$. Every (constant coefficient) linear recurrence relation thus corresponds to the kernel of a particular polynomial $P[\phi]$ in $\phi$, and $P$ is the characteristic equation (polynomial really) of that recurrence relation.

Once you get this correspondence, here is the relation to your question. The geometric sequence you write as $a_n=\lambda^n$, in the above notation the sequence $(\lambda^n)_{n\in\N}$, is just the solution of the first-order recurrence relation $a_{i+1}=\lambda a_i$ mentioned above, with $1$ as initial term. The characteristic polynomial for that recurrence is $X-\lambda$, and the full set of solutions of the recurrence relation consists of all the geometric sequences of ratio$~\lambda$ (so $(c\lambda^n)_{n\in\N}$ for any $c\in\C$), in other words the scalar multiples of the given geometric sequence. The reason this case is so important is the general fact that for a polynomial multiple $QP$ of some polynomial $P$, the kernel of $(QP)[\phi]$ contains the kernel of $P[\phi]$. This is because if $P[\phi](S)=0$, then also $$ (QP)[\phi](S)=(Q[\phi]\circ P[\phi])(S)=Q[\phi](P[\phi](S))=Q[\phi](0)=0 $$ (this does use that $Q[\phi]$ is a linear operator, which therefore maps the zero sequence to itself). So for any polynomial multiple $P$ of $X-\lambda$, the geometric sequences of ratio $\lambda$ are solutions of the recurrence relation with characteristic polynomial $P$. But the condition on the polynomial $P$ is just that $\lambda$ be a root; working backwards, once one has the characteristic polynomial of a recurrence relation, one knows that all geometric sequences whose ratio is a root of that polynomial will satisfy the recurrence. And since the set of solutions is a subspace, we can also take sums of such geometric sequences.

It is a theorem (the kernel decomposition theorem) that as long as the polynomials $P,Q$ have no common factors (no common roots in$~\C$), the kernel of $(PQ)[\phi]$ is precisely the (direct) sum of the kernels of $P[\phi]$ and of $Q[\phi]$ (this is true regardless of the linear operator $\phi$ used). In our case it means that as long as the characteristic polynomial of our recurrence relation has only simple roots, the mentioned sums of geometric sequences form the complete subspace of solutions to the recurrence. To handle the case of roots with multiplicity as well, one needs to generalise the case of geometric sequences to that of recurrences with characteristic polynomial of the form $(X-\lambda)^k$, but since my answer is already getting too long, I will leave this as an exercise.
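For readers who want to experiment with this operator calculus, here is a minimal Python sketch (truncated lists stand in for elements of $U$; all names are ad hoc): it applies $P[\phi]$ for $P=X^2-X-1$ and confirms that a geometric sequence whose ratio is a root of $P$ lies in its kernel.

```python
def shift(s):
    """The operator phi on a truncated sequence: forget the first term."""
    return s[1:]

def apply_poly(coeffs, s):
    """Apply P[phi] to s, where coeffs = [p_0, p_1, ..., p_d] encodes
    P = p_0 + p_1*X + ... + p_d*X^d; the result loses d trailing positions."""
    d = len(coeffs) - 1
    result = [0.0] * (len(s) - d)
    t = list(s)
    for p in coeffs:                      # t equals phi^k(s) on the k-th pass
        result = [r + p * x for r, x in zip(result, t)]
        t = shift(t)
    return result

# P = X^2 - X - 1, the characteristic polynomial of the Fibonacci recurrence
ratio = (1 + 5**0.5) / 2                  # a root of P
geo = [ratio**n for n in range(12)]       # geometric sequence of that ratio
print(all(abs(x) < 1e-6 for x in apply_poly([-1.0, -1.0, 1.0], geo)))   # True
```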