
I read recently that every linear function $f : \mathbb{R}^m \to \mathbb{R}^n$ can be written as $f(\mathbf{x}) = \mathbf{Ax}$ where $\mathbf{A}$ is a matrix of constants.

On the one hand this is intuitive because the elements of $f(\mathbf{x})$ will be linear combinations of the elements of $\mathbf{x}$. On the other hand, the one-dimensional case that students usually learn as their first "linear" equation is a little different:

$$y = mx + b$$

Notice the presence of the intercept term $b$, which doesn't appear to have an analogue in $f(\mathbf{x}) = \mathbf{y} = \mathbf{Ax}$.

As far as I can tell, $g(\mathbf{x}) = \mathbf{Ax} + \mathbf{b}$ still preserves linearity, and can express transformations that $f(\mathbf{x}) = \mathbf{Ax}$ cannot (e.g. add 50 to every element of $\mathbf{x}$ even when $\mathbf{x}=\mathbf{0}$).

We could define a function $h$ that is equivalent to $g$ by extending $\mathbf{x}$ with an extra $1$ appended on the end (call it $\mathbf{x}_{+1}$) and giving $\mathbf{A}$ an extra column to act on it, e.g. $\begin{bmatrix}b \\ 0 \\ 0 \\ \vdots\end{bmatrix}$. Call the new matrix $\mathbf{C}$. Then we could say $h(\mathbf{x})=\mathbf{Cx}_{+1}$. But we're no longer purely multiplying by a constant matrix.

So:

  • When we say $y = mx + b$ is a linear equation and we say $\mathbf{y} = \mathbf{Ax}$ is a linear transformation, do we mean the same thing?
  • Is it correct to say $f(\mathbf{x}) = \mathbf{Ax}$ can represent all linear transformations $\mathbb{R}^m \to \mathbb{R}^n$, provided we are free to modify the elements of $\mathbf{A}$?
  • Is $g(\mathbf{x}) = \mathbf{Ax} + \mathbf{b}$ not also a linear transformation?
  • When people say $f(\mathbf{x})$ can represent all linear transformations do they really mean $h(\mathbf{x})$ or $g(\mathbf{x})$?
Joseph Garvin
  • @DonThousand Say $\mathbf{A}=0$, $\mathbf{x}=0$, and $\mathbf{b}=\vec{50}$. What is $\mathbf{B}$? – Joseph Garvin Mar 21 '20 at 21:08
  • @DonThousand they call it an affine transformation -- so $\mathbf{y}=\mathbf{Ax}$ is linear, and $\mathbf{y}=\mathbf{Ax}+\mathbf{b}$ is affine? Then is only $y=mx$ linear, and $y=mx+b$ being called a "linear equation" is a misnomer and it's really affine? – Joseph Garvin Mar 21 '20 at 21:17
  • How is $g:\mathbf x\mapsto A\mathbf x+\mathbf b$ linear? $g(\mathbf x_1+\mathbf x_2) = A(\mathbf x_1+\mathbf x_2)+\mathbf b \ne g(\mathbf x_1)+g(\mathbf x_2)$; $g(c\mathbf x)=cA\mathbf x+\mathbf b \ne cg(\mathbf x)$. – amd Mar 21 '20 at 21:59
  • Unfortunately, terminology is not consistent across all of mathematics. “Linear” means different things in different contexts. This comes up fairly frequently here. See, for example, https://math.stackexchange.com/questions/331460/concept-of-linearity and https://math.stackexchange.com/q/279252/265466. Another common term that’s overloaded is “normal.” – amd Mar 21 '20 at 22:05
  • As another example of inconsistent usage, “orthogonal” doesn’t even have a single meaning within linear algebra, sadly. (Compare “orthogonal matrix” and “orthogonal basis.”) – amd Mar 22 '20 at 00:43

3 Answers

2

I think there is just a minor terminological confusion here. A linear transformation $f$ is required to satisfy $f(\mathbf{0}) = \mathbf{0}$ and is represented by matrix multiplication: for some constant matrix $\mathbf{A}$, $f(\mathbf{x}) = \mathbf{A}\mathbf{x}$ for all $\mathbf{x}$. If you compose $f$ with a translation along a constant vector $\mathbf{b}$, say, you get a transformation $g$ satisfying $g(\mathbf{x}) = \mathbf{A}\mathbf{x} +\mathbf{b}$ for all $\mathbf{x}$. Such a $g$ is called an affine transformation.
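As a rough illustration of the distinction, here is a minimal Python sketch. The matrix `A`, the vector `b`, and the helper names `matvec`, `make_linear` and `make_translation` are hypothetical choices made for this example only. It builds $g$ by composing a linear map with a translation and checks where each map sends the zero vector.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def make_linear(A):
    """Return the linear map x -> A x."""
    return lambda x: matvec(A, x)

def make_translation(b):
    """Return the translation x -> x + b."""
    return lambda x: [xi + bi for xi, bi in zip(x, b)]

# Hypothetical example data, chosen only for illustration.
A = [[1, 2], [3, 4]]
b = [50, 50]

f = make_linear(A)          # linear: f(x) = A x
t = make_translation(b)     # translation along b
g = lambda x: t(f(x))       # affine: g(x) = A x + b

zero = [0, 0]
print(f(zero))    # [0, 0]   -> a linear map fixes the origin
print(g(zero))    # [50, 50] -> the affine map moves the origin to b
print(g([1, 1]))  # [53, 57]
```

Checking the image of the zero vector is only a quick necessary test: a map that moves the origin cannot be linear, but fixing the origin does not by itself prove linearity.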

Rob Arthan
  • Is it not misleading then that $y=mx+b$ is referred to as a "linear equation"? Shouldn't it be an "affine equation"? – Joseph Garvin Mar 21 '20 at 21:29
  • Sadly, mathematical terminology is not always as consistent as one might like. If you say "affine equation" you'll likely just get blank looks. "Linear equation" is usually used for an equation of the form $\mathbf{A}\mathbf{x} =\mathbf{b}$ (so "linear" is referring to the l.h.s.). – Rob Arthan Mar 21 '20 at 21:37
2

Just to sum up what others have said or alluded to: 'linear' has slightly different meanings in mathematics. It is well established terminology that a 'linear transformation' is one that can be represented by $\mathbf{x}\mapsto A\mathbf{x}$ where $A\mathbf{x}$ means ordinary multiplication of matrices. On the other hand $y=mx+c$ is the equation of a (non-vertical) line and so has some claim to be a 'linear' equation even though the map $x\mapsto mx+c$ is not a linear map (unless $c$ happens to be zero) as you rightly point out.

It is however worth noting that affine maps (in other words ones that map $\mathbf{x}\mapsto A\mathbf{x}+\mathbf{b}$), while not typically linear, can nevertheless be represented by matrices -- not by the usual matrix multiplication but by a slight modification of it. Write the input vector $\mathbf{x}$ with an extra bottom entry equal to 1. Then put $$\left(\begin{array}{c}\mathbf{y}\\ 1\end{array}\right)=\left(\begin{array}{cl}A & \mathbf{b}\\ \mathbf{0} & 1\end{array}\right)\left(\begin{array}{c}\mathbf{x}\\ 1\end{array}\right).$$

(Here $\mathbf{0}$ is a row vector of the appropriate length.) Then $\mathbf{y}$ is the image of $\mathbf{x}$ under the affine map $\mathbf{x}\mapsto A\mathbf{x}+\mathbf{b}$. Moreover the composition of affine maps corresponds to multiplication of matrices of this form.
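The augmented-matrix construction can be sketched in plain Python as follows. This is only an illustration: the helper names `matmul`, `augment` and `apply_affine`, and the two example maps, are assumptions made for this sketch rather than anything standard. It checks on one input that applying two affine maps in turn agrees with applying the product of their augmented matrices.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def augment(A, b):
    """Build the block matrix [[A, b], [0, 1]] for the map x -> A x + b."""
    rows = [list(row) + [bi] for row, bi in zip(A, b)]
    rows.append([0] * len(A[0]) + [1])  # bottom row: zeros, then 1
    return rows

def apply_affine(M, x):
    """Append 1 to x, multiply by M, and drop the trailing 1."""
    column = [[xi] for xi in x] + [[1]]
    result = matmul(M, column)
    return [r[0] for r in result[:-1]]

# Two hypothetical affine maps on R^2, chosen only for illustration.
A1, b1 = [[2, 0], [0, 3]], [1, -1]
A2, b2 = [[0, 1], [1, 0]], [50, 50]

M1, M2 = augment(A1, b1), augment(A2, b2)
x = [4, 5]

# Apply the first map, then the second.
direct = apply_affine(M2, apply_affine(M1, x))
# Apply the single matrix M2 * M1 instead.
via_product = apply_affine(matmul(M2, M1), x)

print(direct)       # [64, 59]
print(via_product)  # [64, 59]
```

Note the order of the product: the map applied first sits on the right, just as with ordinary composition of linear maps.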

1

In general, the property of being linear is defined as:

$f(x)$ is linear if $f(ax) = af(x)$ and $f(x+y)=f(x)+f(y)$. If $A$ is representing a linear transformation $T$, then the above tells us we need both of the following to be true:

$T(a{\bf x}) = aT({\bf x})$ and $T({\bf x} + {\bf y}) = T({\bf x}) + T({\bf y})$.

Now consider the two possibilities:

  1. $T({\bf x}) = A{\bf x} + {\bf b}$
  2. $T({\bf x}) = A{\bf x}$

Only the second satisfies the definition given above (the first does so only when ${\bf b} = {\bf 0}$, in which case it reduces to the second), so only the second is called linear.
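
To spell out the check for the first possibility, here is a short worked computation using the same symbols, under the assumption that ${\bf b} \neq {\bf 0}$:

$$T({\bf x} + {\bf y}) = A{\bf x} + A{\bf y} + {\bf b}, \qquad T({\bf x}) + T({\bf y}) = A{\bf x} + A{\bf y} + 2{\bf b},$$

$$T(a{\bf x}) = aA{\bf x} + {\bf b}, \qquad aT({\bf x}) = aA{\bf x} + a{\bf b}.$$

The two sides of the first line differ by ${\bf b}$, and the two sides of the second differ by $(a-1){\bf b}$, so additivity fails whenever ${\bf b} \neq {\bf 0}$ and homogeneity fails for every $a \neq 1$.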