Why can a system of linear equations be represented as a linear combination of vectors?

Question

I was watching Gilbert Strang's first Linear Algebra lecture, and the very first thing he does is relating the standard view of a system of linear equations as lines -in $\mathbb{R}^2$ of course- (what he calls the row picture) with the notion of taking a linear combination of the vectors given by the columns of the matrix (the column picture).

Now, I can see it works in practice to reach the same solution, but I don't intuitively understand why this is the case. A priori, they seem like very different things, and it's mysterious that these two views somehow correspond to each other.

Could anyone shed some light on this? I'd greatly appreciate it!

score 12 · Answer 1 · 2019-12-19T01:45:56.890


If you are wondering why the linear system $$ 2x-y=1,\\x+y=5\tag{1} $$ and the so called column form (by Strang) $$ x\begin{bmatrix} 2\\1 \end{bmatrix} +y\begin{bmatrix} -1\\1 \end{bmatrix}= \begin{bmatrix} 1\\5 \end{bmatrix}\tag{2} $$

above are the same, then the short answer is:

it is so by "definition". You will learn the "definition" later and see exactly what (2) means, which will tell you why (1) and (2) are the same thing. Here is a list of things you might want to pay attention to in later lectures:

What is a vector $[2,1]^T?$
What does it mean to multiply the vector $[2,1]^T$ by a real number $x$?
What does it mean by adding two vectors together?
When are two vectors the same?

Here is Strang's own explanation in his textbook: Consider the example:

edited Dec 19 '19 at 01:45

answered Oct 09 '16 at 17:03

Hold on, I'm curious to know what is the definition of what you're talking about. Is it related to Eric's answer below concerning vector addition and multiplication by scalars? (That is, the properties of a linear transformation if I'm not wrong). – Matt24 Oct 11 '16 at 00:32
Same here, could you point to that definition? – Joaquin Brandan Apr 23 '17 at 16:22
The "definition" I'm referring to is the one for "additions" and "scale multiplications" of vectors in $\mathbb{R}^2$. For instance $(a,b)^T+(x,y)^T$ is defined to be the vector $(a+x,b+y)^T$. – Apr 23 '17 at 17:08

erfink · Answer 2 · 2021-09-17T17:52:27.157

You're right to be curious about why these seemingly disparate things can be viewed as the same solution: this is a rather deep question underpinning the field of linear algebra. For me, the key insight is that we're asking for both equations to be fulfilled at the same time. We aren't looking at this as equations 1 and 2 for lines; we're looking at this as the system of equations and what the solution to this system is.

I find it useful to break down how we went from one representation to the other in a very pedantic and slow way, seeing what insight we can gain in the process. First, note that two vectors $\vec{v}= \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ and $\vec{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$ are equal if and only if the components are equal, that is, $v_1 = w_1$ and $v_2 = w_2$. Since we want the system of equations to be satisfied by satisfying every individual equation, using vector equalities is a natural construct: $$\begin{bmatrix} 2x -y \\ x+y \end{bmatrix}=\begin{bmatrix} 1 \\ 5\end{bmatrix}.$$ Note that saying these two vectors are equal is the exact same statement as asking for the system of equations to be satisfied: these two vectors are equal if and only if they're equal in the first component and equal in the second component. From here, we can use vector addition to break apart our expression: $$\begin{bmatrix} 2x \\ x \end{bmatrix} + \begin{bmatrix} -y \\ y \end{bmatrix} = \begin{bmatrix}1 \\5 \end{bmatrix}$$ and then use scalar multiplication to write $$x\begin{bmatrix} 2 \\ 1 \end{bmatrix} + y\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix}1 \\5 \end{bmatrix}.$$
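The three rewriting steps above can be checked numerically. This is only a minimal sketch in plain Python; the helper names (`add`, `scale`, `stacked`, `split`, `column_form`) are made up for illustration and are not from the lecture.

```python
# Three ways of writing the left-hand side of 2x - y = 1, x + y = 5.
# All helper names are illustrative assumptions.

def add(v, w):
    """Componentwise vector addition."""
    return [vi + wi for vi, wi in zip(v, w)]

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * vi for vi in v]

def stacked(x, y):
    """Both left-hand sides stacked into one vector."""
    return [2 * x - y, x + y]

def split(x, y):
    """The same vector, broken apart with vector addition."""
    return add([2 * x, x], [-y, y])

def column_form(x, y):
    """The same vector as a linear combination of the two columns."""
    return add(scale(x, [2, 1]), scale(y, [-1, 1]))

# The three forms agree for every (x, y), not only at the solution.
for x, y in [(0, 0), (1, -4), (2, 3), (7, 5)]:
    assert stacked(x, y) == split(x, y) == column_form(x, y)

print(column_form(2, 3))  # [1, 5], so x = 2, y = 3 solves the system
```

Since the three expressions are equal for every input, asking when any one of them equals $(1, 5)$ is the same question.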

Mathematically, these are all just different ways of expressing the same set of equalities--you should convince yourself that a solution of $x$ and $y$ to the system of equations is also a solution to our vector problem. However, they place the focus on different aspects. If we view it as a system of equations, we might naturally ask what set of points satisfy the first equation (a line) and what set of points satisfy the second equation (another line) and then ask where both equations are satisfied (the intersection of the two lines). Looking at the problem as a vector equation, the focus is now on the two vectors $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Taking linear combinations of them gives me two distinct directions I can head and asks me how far to go in each direction in order to end up at $\begin{bmatrix} 1 \\ 5 \end{bmatrix}$. I envision this as almost like playing with an Etch-A-Sketch: there are two knobs labelled $x$ and $y$ that move the stylus in different directions. Unlike a normal Etch-A-Sketch, however, these knobs don't move straight horizontally and vertically at similar rates. Instead, they move the stylus in funky directions at different rates and we're tasked with turning the $x$ and $y$ knobs juuuuusssstt right so that we end at a specified location. Same problem, different focus.

A priori, there isn't a reason to prefer one over the other--they're just different. Just like how I can write a line as $y = mx+b$ or $(y-y_0) = m (x-x_0)$, they express the same thing in different forms. Neither one is necessarily better or worse, they're just different and place the emphasis on different aspects. As we move deeper into the rabbit hole of linear algebra, there are a few reasons why we might prefer the vector version:

In the plane, it's pretty easy to visualize how two lines can intersect or fail to intersect: they can have different slopes and intersect at a unique point, they can be parallel and never intersect, or they could be parallel and actually the exact same line (infinitely many intersections / solutions). In vector land, these cases correspond to our vectors associated with $x$ and $y$ pointing in distinct directions (giving a unique solution), our vectors pointing in the same direction with the RHS pointing somewhere else (no solution), or both vectors aimed directly at the vector on the RHS (infinitely many solutions). Not too bad....
Now let's move up a dimension to 3D ($\mathbb{R}^3$). Can you visualize all of the ways for three planes to intersect (or not) in three dimensions? It's possible to draw them all out, but there are many more possibilities. How about yet higher dimensions? 4 hyperplanes in $\mathbb{R}^4$? 10,000 hyperplanes in $\mathbb{R}^{10,000}$?
In comparison, using linear combinations of vectors (the column picture) is much easier to contextualize in higher dimension. Do the 10,000 knobs on your hyper-Etch-A-Sketch allow a way to move the stylus to the desired point, or will you never get to the correct location no matter how furiously you crank them? Are any of the knobs redundant, giving you multiple solutions?
Looking forward to where you'll be headed with linear algebra, we can rewrite the vector equation $$x\begin{bmatrix} 2 \\ 1 \end{bmatrix} + y\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix}1 \\5 \end{bmatrix}$$ as the matrix/vector equation $$A \vec{x} = \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \vec{b}.$$ Now, our focus has shifted to matrix $A$, highlighting the importance of the coefficients in our original system of equations. Writing the system this way also brings our focus onto the associated function that inputs a vector $\vec{x}$ and spits out a new vector $\vec{y} = A \vec{x}$ (mathematically, this would be notated as $\vec{x} \mapsto A \vec{x}$ ). With a regular function of one variable, we can ask questions like "what values of $x$ solve $f(x) = c$?" or "what is the range of $f$?". We can ask similar questions about our new function $\vec{x} \mapsto A \vec{x}$: Is there a vector $\vec{x}$ such that $A \vec{x} = \vec{b}$? Is this solution unique? What are all the possible vectors I can get out, i.e., what is the range of the function $\vec{x} \mapsto A \vec{x}$?
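To make the two readings of $A\vec{x}$ concrete, here is a small sketch in plain Python (no libraries). The function names `matvec_by_rows` and `matvec_by_columns` are hypothetical; the first computes one dot product per row (the row picture), the second accumulates $x_j$ times column $j$ (the column picture).

```python
def matvec_by_rows(A, x):
    """Row picture: entry i is the dot product of row i with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matvec_by_columns(A, x):
    """Column picture: add up x_j times column j of A."""
    n_rows = len(A)
    result = [0] * n_rows
    for j, xj in enumerate(x):
        column = [A[i][j] for i in range(n_rows)]
        result = [r + xj * c for r, c in zip(result, column)]
    return result

A = [[2, -1],
     [1, 1]]

print(matvec_by_rows(A, [2, 3]))     # [1, 5]
print(matvec_by_columns(A, [2, 3]))  # [1, 5]
```

Both functions do the same multiplications and additions, only grouped differently, which is why they cannot disagree in any dimension.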

Again, neither representation is "correct" or "better," they're just different and can be more or less useful depending on the context. This is actually a pretty useful lens through which most of linear algebra can be viewed (and my personal favorite aspect of the subject): many statements mean fundamentally the same thing, they're just different points of view. For example, by the end of Chapter Two in Strang, you can construct the following string of "if and only if" statements for a square matrix $A_{n \times n}$:

$A$ is invertible $\iff$ $A^{-1}$ exists $\iff$ the columns of $A$ are linearly independent $\iff$ $A$ has $n$ pivots $\iff$ the determinant of $A$ is non-zero ($\det(A) \neq 0$) $\iff$ the equation $A \vec{x} = \vec{0}$ has a unique solution

Without focusing on what these individual statements mean, I want you to think about the structure. It says that any single one of these statements is interchangeable with any other: you either get all or none of these statements being true. This is just like the "row picture" versus the "column picture":

$(x,y)$ solves our system of equations $\iff$ $x \vec{v} + y\vec{w} = \vec{b}$ $\iff$ $A \vec{x} = \vec{b}$

We either get a solution to all of them, or none of them. The different statements highlight different aspects of our solution (intersection of lines vs linear combinations vs finding the correct input(s) for a function), but it's still the exact same solution. It's actually a pretty useful (albeit difficult) exercise to step back and think about what is really happening in each of these contexts as you learn different algorithms and theorems throughout your course.

Wow, thanks for such a clear explanation! I can now see where the so called "column picture" comes from.
The only thing I'm still not comfortable with is the idea of multiplying a row vector times a matrix (in that order). It's evident that the multiplication algorithm follows naturally from the column picture, but it's not that obvious why it doesn't work the same way when you change the order, or in other words, why we can't define this operation to be commutative.

Anyway, you've already helped me out a lot! And by the way, I really enjoyed the Etch-A-Sketch analogy! :) — Matt24, Oct 11 '16 at 00:47
Thanks for the feedback! I even use the Etch-A-Sketch analogy when teaching linear algebra. I just need my engineering students to get busy on making me one that operates in $\mathbb{R}^n$ for $n>3$ =) — erfink, Oct 11 '16 at 04:30
As for row vector times a matrix, it actually falls into a similar sort of analogy. If we know how to multiply a matrix times a column vector, then a row vector times a matrix is just the transpose operation: $(A\vec{x})^T = \vec{x}^T A^T$. Same thing, same result. The difference is if we want to record vectors as columns $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ or as rows $\begin{bmatrix} 1 & 2 \end{bmatrix}$. Again, there's no "right" answer, just different levels of usefulness depending on the context. – erfink, Oct 11 '16 at 04:36
Oh, so a multiplication by a row vector is nothing more than the "reverse" of another system of equations? — Matt24, Oct 11 '16 at 15:47

Emilio Novati · Answer 3 · 2016-10-09T15:08:01.877

3

Hint:

write $$ \begin{cases} ax+by=h\\ cx+dy=k \end{cases} $$ as: $$ x\begin{bmatrix} a\\c \end{bmatrix} +y\begin{bmatrix} b\\d \end{bmatrix}= \begin{bmatrix} h\\k \end{bmatrix} $$

This is a linear combination of the column vectors. And, yes, this is a different picture from the intersection of two straight lines.

edited Oct 09 '16 at 15:08

answered Oct 09 '16 at 14:52


I know. But why is this the case? It's like magical that both things are the same... – Matt24 Oct 09 '16 at 15:25
It's surprising, yes. And there is also another interpretation as inversion of a linear transformation. – Emilio Novati Oct 09 '16 at 15:37
What is that other interpretation @Emilio? where can I find that? – Alejo Mar 01 '23 at 17:27
@Alejo. You can see the answer at: https://math.stackexchange.com/questions/4090259/whats-the-relationship-between-linear-transformations-and-systems-of-equations – Emilio Novati Mar 01 '23 at 18:04

Red Banana · Answer 4 · 2016-10-11T18:50:44.173

There is an interesting article by Barry Mazur called "When are two things equal?", in which he discusses the problem of equality in relation to the thousand faces that mathematical objects have. Which face should you show first when teaching people about it? I read the other answers and I feel that the other answerers tried to tell you how instead of why, and from your comments on their answers, it's clear that you knew that.

Imagine that your house is located to your right, but you can only walk forward. Can you reach your house? No. You need to walk to the right if you want to get there. You'll find in my answer that the reason to write a system of equations as vectors is that you can walk to more places than you could before, and with just a few ideas!

The thing is that when you have the system of equations represented in vectorial form, you gain certain powers of expression and the ability to say more$[1]$ things about that object. This is one of the interesting things in mathematics, seeing mathematical objects through different representations and gaining new insights based on these new representations.

For the problem you asked, I'll show some examples of what I just said. For example:

$$ax+by=\alpha\\cx+dy=\beta$$

You can rewrite it as:

$$\begin{bmatrix} x &y \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix}=\begin{bmatrix} \alpha & \beta \end{bmatrix}$$

What does it reveal?$[2]$ An interesting feature. You can treat it as an equation of matrices, and finding solutions $x,y$ amounts to finding the inverse of the matrix $\begin{bmatrix}a & c \\ b & d \end{bmatrix}$ (when it is invertible) and then right-multiplying both sides of the equation by it:

$$\begin{bmatrix} x &y \end{bmatrix}\overbrace{\begin{bmatrix} a & c \\ b & d \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix} ^{-1}}^{ I}=\begin{bmatrix} \alpha & \beta \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix} ^{-1}$$

And then:

$$\begin{bmatrix} x &y \end{bmatrix}=\begin{bmatrix} \alpha & \beta \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix} ^{-1}$$
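As a concrete check of this formula, here is a sketch that applies it to the system $2x-y=1$, $x+y=5$ from earlier in the thread, so $a=2$, $b=-1$, $c=1$, $d=1$, $\alpha=1$, $\beta=5$. The helpers `inverse_2x2` and `row_times_matrix` are hypothetical names written for this illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (p, q), (r, s) = M
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(s, det), Fraction(-q, det)],
            [Fraction(-r, det), Fraction(p, det)]]

def row_times_matrix(v, M):
    """Row vector times matrix: entry j is the sum over i of v[i] * M[i][j]."""
    return [sum(v[i] * M[i][j] for i in range(len(v)))
            for j in range(len(M[0]))]

# The matrix [[a, c], [b, d]] for the system 2x - y = 1, x + y = 5.
M = [[2, 1],
     [-1, 1]]

# [x y] = [alpha beta] * M^{-1}
solution = row_times_matrix([1, 5], inverse_2x2(M))
print(solution == [2, 3])  # True: x = 2, y = 3
```

Multiplying back, `row_times_matrix([2, 3], M)` gives `[1, 5]`, which is the original right-hand side.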

The solution is the product of those two matrices (this is different from Gaussian elimination). Here you've gained a shorthand notation for the system of equations and a new way of finding a solution, and you can use a lot of tools from matrix theory. Now, take as an example the dot product of two vectors: $\langle (x,y,z),(a,b,c) \rangle=ax+by+cz$. The dot product expresses a geometric property: the dot product equals zero when two vectors are orthogonal. Now what does this mean for the following system of equations?

$$\langle (x,y,z),(a,b,c) \rangle=0\\\langle (x,y,z),(d,e,f) \rangle=0\\\langle (x,y,z),(g,h,i) \rangle=0$$

It means, geometrically, that the solution vector $(x,y,z)$ is orthogonal to each of the vectors $(a,b,c),(d,e,f),(g,h,i)$. The property is preserved if you do the following:

$$\langle (x,y,z,1),(a,b,c,-\alpha) \rangle=0\\\langle (x,y,z,1),(d,e,f,-\beta) \rangle=0\\\langle (x,y,z,1),(g,h,i,-\gamma) \rangle=0$$

Expand it and see that it is an ordinary system of $3$ equations in $3$ variables! You can construct the very important idea of a cross product with this: the cross product $a\times b$ gives you a vector which is orthogonal to both $a$ and $b$. With this, you are mixing the ideas of a system of equations with geometric notions, so you can use some geometric gadgetry there. Indeed, you can treat geometric objects as sets of vectors, and some geometric transformations can be carried out with just matrix multiplication!
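The extra-component trick can be tried out on numbers. In the sketch below the coefficients, right-hand sides and candidate point are made-up values chosen only for illustration, and the function names are hypothetical.

```python
def dot(u, v):
    """Standard dot product of two equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def solves_directly(point, rows, rhs):
    """Check a*x + b*y + c*z == alpha for every equation."""
    return all(dot(point, row) == r for row, r in zip(rows, rhs))

def solves_by_orthogonality(point, rows, rhs):
    """Check that (x, y, z, 1) is orthogonal to every (a, b, c, -alpha)."""
    lifted = tuple(point) + (1,)
    return all(dot(lifted, tuple(row) + (-r,)) == 0
               for row, r in zip(rows, rhs))

# Hypothetical system: x + y + z = 6, x - y = -1, y - z = -1.
rows = [(1, 1, 1), (1, -1, 0), (0, 1, -1)]
rhs = (6, -1, -1)
candidate = (1, 2, 3)

print(solves_directly(candidate, rows, rhs))          # True
print(solves_by_orthogonality(candidate, rows, rhs))  # True
```

The two checks agree on every point because the fourth components multiply to $1 \cdot (-\alpha)$, which just moves the right-hand side over to the left.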

In analytic geometry courses, you usually see the idea of converting conic sections to a canonical equation in which it's easier to decide if that quadratic form is a circle, a pair of lines, an ellipse, a parabola, etc. You do this via two changes of coordinates and the calculation is usually long. Now, there is a clever way to represent these conic sections in a matrix equation, and converting it to the canonical equation amounts to some basic matrix operations, one of which involves eigenvalues and eigenvectors. If you just want to know which conic section it is without having the canonical equation (which provides you further information), you can just compute the rank of the matrices in the equation, and the rank is just the order of the greatest non-vanishing sub-determinant of the matrix$[3]$! How awesome is that?! Any stranger could take me to bed by just whispering this in my ear$[4]$!

Further in your linear algebra course, you'll see that you can say some things about linear transformations using the very idea of converting systems of linear equations to a matrix product. For example: for a system such as $\begin{bmatrix} x &y \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix}=\begin{bmatrix} \alpha & \beta \end{bmatrix}$, there is a short rewrite: $xA=b$. There is something called the null space, which is the set of solutions of $xA=0$. If the only solution is $x=[0,0]$, then the linear transformation is injective. And to know that, you just need to check whether $\det A \neq 0$. Some linear transformations can be coded as matrices, and to check several of their properties, you can use just some standard methods you'll learn there. I could proceed with my enthusiasm, but I guess you got the point. With just a few computational tools, you have access to a lot of mathematical concepts mixed in one package and you gain revealing insights with it.

Also, you are able to express those ideas in terms of some core ideas of linear algebra: linear combinations, basis, change of basis, change of coordinates, etc. And as you'll see further on, these are quite general ideas that can be applied to calculus, abstract algebra, etc.

$[1] : $ Whereof one cannot speak, thereof one must be silent. - Wittgenstein's Tractatus $(7)$.

$[2] : $ Notice that $$\begin{bmatrix} x &y \end{bmatrix}\begin{bmatrix} a & c \\ b & d \end{bmatrix}=\begin{bmatrix} \alpha & \beta \end{bmatrix}$$ is closely related to

$$x\begin{bmatrix} a\\c \end{bmatrix} +y\begin{bmatrix} b\\d \end{bmatrix}= \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

Just think of the rows in the matrix as the columns in the second equation.

$[3]: $ See Howard Eves' Elementary Matrix Theory. p.104: An affine classification of conics and conicoids according to the ranks of the associated matrices.

$[4]:$ $♥♥♥♥♥$.

This is a great post, thanks! You're right that having so many perspectives, one never gets to understand something completely. I'll try to spend more time thinking about them, but for now, I'd like to ask you about the orthogonal vectors view. It's quite a cool approach; thing is, that 4th component like <1,-alpha> seems weird to me. I mean, it makes sense algebraically of course, but it's still strange. Maybe I just need to think about these concepts a bit more. Thanks anyway! — Matt24, Oct 11 '16 at 01:02
@Matt24 Yes. You can look a proof of it here. I made it for the plane, Adriano extended it with generality. But I understand your feeling of weirdness. I guess Von Neumann was right after all: In mathematics, you don't understand things. You get used to it. — Red Banana, Oct 11 '16 at 01:21
@Matt I think - for example - limits are a very weird thing. It's like having an infinite straight path and you're actually able to go to the end of the path (???). Being able to find volumes with integrals is very very weird. I find it odd that some people take weird mathematical ideas as natural or obvious. As for the orthogonality with the additional component, you should ask yourself: How can you see that it's true? — Red Banana, Oct 11 '16 at 01:27
Oh, BTW: $$\langle (x,y,z,1),(a,b,c,-\alpha) \rangle = ax+by+cz-\alpha=0\\ax+by+cz=\alpha$$
Perhaps I was not so clear with it. But you should multiply the last component of the first vector with the last component of the second vector. Component-wise. — Red Banana, Oct 11 '16 at 01:31
First of all, thank you again for your efforts! It's people like you who make StackExchange a very special place for students and math enthusiasts. That Von Neumann quote you mention hits the nail on the head. I feel like I understand limits only because I worked a lot with them and now it feels natural to take for granted those important small philosophical issues you mention. And I wish I could say the same things about volume calculations by integrals, but that still needs some getting used to for me (and getting used to the idea of what taking an integral means too). — Matt24, Oct 11 '16 at 15:29
It's a shame university courses don't spend some more time trying to shape our intuitions. It bothers people like me (like us?) that don't feel too comfortable with a concept and a proof until you boil it down to its essentials. But I guess that knowledge of mathematics wouldn't advance at this rate if we spent so much time dwelling on basic math. What do you think about this? — Matt24, Oct 11 '16 at 15:33
And regarding the orthogonality of the last component, I'm trying to understand Adriano's proof but I can't precisely see how they're related. The proof shows that non-standard vectors can be orthogonal when they make up a right triangle, correct? How can I see that in relation to $$\langle (x,y,z,1),(a,b,c,-\alpha) \rangle$$? – Matt24, Oct 11 '16 at 15:40
@Matt24 It's complicated to say. Mathematics is made to be compact/economic, I guess they do this for you to dwell and eventually find out the things for yourself. I always wanted to know the motivation for the concepts, but I found out that this is a little heavier than it seems to be. You have to obtain a lot of texts in history of mathematics, trying to think what motivated the giants is usually a bit hard. But I find it interesting to do it this way. — Red Banana, Oct 11 '16 at 15:42
@Matt24 What do you mean with "see"? This vector is in $4-$dimensions, you actually can't see - unless you use some trick for it (Ex: You can show $4-$dimensions in two planes of $2-$dimensions, but there is no certainty that the new visualization will be meaningful). The thing is: With the development of analytic geometry, the proofs for $2-$dimensions can be extended easily to proofs in $n-$dimensions. That's what Adriano did. — Red Banana, Oct 11 '16 at 15:47
Agreed. It's a longer process than just taking it at face value, but it's what makes math so beautiful (in my opinion). — Matt24, Oct 11 '16 at 15:50
Ah, sorry, I shouldn't have copy-pasted the full 4D dot product, but rather one in 3D like this one: $$\langle(x,y,1),(a,b,-\alpha)\rangle$$ – Matt24, Oct 11 '16 at 15:53

sureshkumar Shanmugam · Answer 5 · 2021-01-01T11:06:19.277

@Matt24, I was thinking about the same question you posed in your thread: how did Strang start by equating a system of linear equations to a column form? Not sure if this thread is still open. Here are my 2 cents.

After poring over many books on linear algebra, I finally found the answer in W. W. Sawyer's book "An Engineering Approach to Linear Algebra". This is available as a PDF; just search for it.

The clue to the missing link between the row form (the system of linear equations, SLE) and the column form is the omission of the following 3 items by Strang and others when they present the vector form.

1. Strang hasn't provided the exact definition of the problem statement that the 2 linear equations represent. In the real world, the 2 equations could represent anything, for example price (first equation) and weight (second equation). This is a colossal omission for engineers, though acceptable for mathematicians, as they breathe the abstract form.
2. Because of (1), the X and Y axis definitions that tie back to the actual problem statement are not provided for the vector form. I.e., what do the X axis and Y axis in the vector form represent about the original real-world problem?
3. The row form of an SLE is drawn as a single graph with 2 lines representing the 2 equations in this case. The arrangement/intersection of the lines shows whether there is one solution, many solutions, or no solution and, if solutions exist, what they are. This is not intuitive or suitable for studying or solving a real-world problem that has multiple attributes. By the end of this response, we will understand what this means.

I believe that if only mathematicians had spent some more time explaining the above 3 items in their textbooks, this could have been intuitively understood by mortals like us.

To make it simpler and understand better, I have provided the problem statement as below.

The price of milk is 2 dollars per bottle and the price of cheese is 1 dollar per box. The weight of milk is 1 pound per bottle and the weight of cheese is 1 pound per box. Due to a promotion, when you buy a bottle of milk, a box of cheese is free, and delivery to your home is free if the combined weight is up to 5 pounds. What quantities of milk and cheese can be bought for 1 dollar while making use of the free-delivery promotion (up to 5 pounds), if that is possible at all?

The above problem statement can be represented by 2 linear equations. 1 for price view and 1 for weight view.

Total Price = (cost of milk per bottle * number of milk bottles) - (cost of cheese per box * number of cheese boxes) ---> Price Equation

Total Weight = (weight of milk per bottle * number of milk bottles) + (Weight of cheese per box * number of cheese boxes) --> Weight equation

There is a '-' in the price equation to accommodate the free cheese: you don't have to pay for it because of the promotion; otherwise you would have. The '+' in the weight equation accommodates the combined weight of milk and cheese. The cheese is free, whereas its weight still needs to be carried and hence is counted.

After applying the price, weight and constraints from the problem statement, the 2 linear equations are

2x - y = 1 (price equation)

x + y = 5 (weight equation)

Note: Have deliberately kept the price, weight as 2 dollars, 1 pound for milk and 1 dollar, 1 pound for cheese just to highlight there may or may not be a solution.

This is the key here. Both equations represent different constraints of the problem statement. i.e. both are not price equations/both are not weight equations. Can both represent same constraint like 2 price equations? Probably yes, but that would be like price equation for 2 separate occasions etc. and not this problem statement, maybe different problem statement.

As explained above, let's assume the first equation is representing price view and the second equation is representing weight view. With this explanation, let's try to answer your questions.

Simplified progression from SLEs to Vectors

Vector view with dimensions

What is a vector $[2, 1]^T$?

From the 2 linear equations, there are 3 vectors.

Milk Vector = (+2x dollars from first equation, +1x pounds from second equation) = (2, 1).

Cheese Vector = (-1y dollars from first equation, +1y pounds from second equation) = (-1, 1).

Combined Milk & cheese vector (Resultant vector) = (+1 dollar from first equation, +5 pounds from second equation) = (1, 5).

So the vector form represents 2 attributes (price and weight, in 2 dimensions) of milk or cheese or both! Because the attributes (dimensions) we are interested in are different, price and weight cannot be added by usual arithmetic (say, number-line addition), as that doesn't make any sense. This calls for a different arithmetic (linear algebra) with its own rules and methods. This is the crux of linear algebra, which heavily uses vectors, matrices and complex numbers; all of these aid the addition and multiplication of numbers that are represented by 2 or more quantities (tuples).

What does it mean to multiply the vector $[2, 1]^T$ by a real number x?

By now, this should be evident. In multiplying the vector (2, 1) by a real number x, x is a scalar that represents the number of milk bottles. Each milk bottle has 2 attributes, i.e. each milk bottle represents 2 dollars of cost and 1 pound of weight.

What does it mean by adding two vectors together?

In this case vector addition means, addition of milk vector (price, weight) and cheese vector (price, weight) to obtain combined milk and cheese vector or resultant vector.

When are two vectors the same?

In this case, when price and weight of both milk (say 2 dollars & 1 pound) and cheese (say 2 dollars and 1 pound) are same, milk and cheese vectors shall be same. The book definition says when magnitude and directions are same, both vectors are same. When same attributes of milk and cheese vectors are added in the X-Y plane, the magnitude and direction will also be same.

What is the linear function or linear map for the above problem statement? The linear function or linear map for the above example can be depicted as below.

f(x) is mapped to f(x*)

f(milk bottle) is mapped to (milk price + milk weight)

f(y) is mapped to f(y*)

f(cheese box) is mapped to (cheese price + cheese weight)

f(x,y) is mapped to f(x*,y*) or f(x+y) is mapped to f(x*+y*)

f(milk bottle, cheese box) = ( price of milk and cheese, weight of milk and cheese)

f(x,y) = (2x-y, x+y) The price/weight does not contain any powers or combinations of each other.

f(1 bottle of milk, 1 box of cheese) ==> f(x,y) ==> f(1,1) = (2·1−1, 1+1) = (1,2), i.e. f(1,1) is mapped to (1,2)

f(2 bottles of milk, 1 box of cheese) ==> f(2x,y) ==> f(2,1) = (2·2−1, 2+1) = (3,3), i.e. f(2,1) is mapped to (3,3). Here x is replaced with 2x in both equations.
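The map f(x,y) = (2x-y, x+y) can be written down directly. The sketch below, with the hypothetical helper `combine`, shows that the milk and cheese vectors are just f applied to one bottle and one box, and that f of any order equals the corresponding combination of those two vectors.

```python
def f(x, y):
    """Map (bottles of milk, boxes of cheese) to (total price, total weight)."""
    return (2 * x - y, x + y)

milk = f(1, 0)    # one bottle alone: (2, 1)
cheese = f(0, 1)  # one box alone: (-1, 1)

def combine(x, y):
    """x copies of the milk vector plus y copies of the cheese vector."""
    return (x * milk[0] + y * cheese[0], x * milk[1] + y * cheese[1])

print(f(1, 1), combine(1, 1))  # (1, 2) (1, 2)
print(f(2, 3), combine(2, 3))  # (1, 5) (1, 5)
```

So 2 bottles and 3 boxes give the target of 1 dollar and 5 pounds, which is the solution x = 2, y = 3 of the two equations.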

This is an example of 2 SLE with 2 unknowns. The same can be extended to any set of SLEs with any unknowns. Hope this makes the intuition clear. If there are any glaring holes in my argument, let me know. I can share some of the worksheets and simulations in excel which show the above clearly.

You have posted what is, essentially, a copy-paste of the same answer multiple times. This is not appropriate for Math SE, and is considered "noise". If you believe that one answer is appropriate for multiple questions, please post one answer, then flag the other posts as duplicates. — Xander Henderson, Jan 04 '21 at 12:19
Thanks for the feedback. Will keep one response and take down the others. — sureshkumar Shanmugam, Jan 04 '21 at 13:15

score 0 · Answer 6 · answered Oct 09 '16 at 16:53

Well, since you are watching the first lecture, I suppose you might not know many of the prerequisites for the answer I am going to give. Still, I advise you to read this again in a couple of months, when you will be able to fully understand it.

Using the following notation: $$A.X=b$$ where $A$ is the coefficient matrix of the system, $X$ is the unknown vector and $b$ is the vector whose coordinates are the right hand side of each linear equation.

The column view is the idea of viewing the solution space of a system of linear equations as the set of vectors whose image through the linear transformation represented by the matrix $A$ is the vector $b$.

The row view is a way of understanding a system of linear equations in the light of the linear functionals that have as the domain the vector space considered. Now, you will eventually learn that, in the finite dimensional case, a linear transformation can be seen as a function whose "coordinate functions" are linear functionals (that is, each coordinate in the image is given by a linear functional). So the row view is a way of seeing the linear transformation represented by $A$ as a bunch of "coordinate linear functionals", that is observing the same linear transformation (and this is why they coincide) in each coordinate separately.

I can't understand. I also have the same doubt for nearly two months. — Sathasivam K, Oct 09 '16 at 16:59
What is a linear functional exactly? That's why I couldn't follow the argument. — Matt24, Oct 11 '16 at 01:03
A linear functional is a linear transformation whose domain is a vector space $V$ over a field $F$ and whose codomain is $F$. What is really important in order to understand the explanation is that if $V$ is finite dimensional, then every linear functional is of the form $f(x_1,...,x_n)=a_1x_1+...+a_nx_n$, where the $a_i$ are constants in $F$. — Arthur, Oct 14 '16 at 03:02

score 0 · Answer 7 · edited Jun 12 '20 at 10:38

I was watching Gilbert Strang's first Linear Algebra lecture, and the very first thing he does is relating the standard view of a system of linear equations as lines -in $\mathbb{R}^2$ of course- (what he calls the row picture) with the notion of taking a linear combination of the vectors given by the columns of the matrix (the column picture).

Now, I can see it works in practice to reach the same solution, but I don't intuitively understand why this is the case. A priori, they seem like very different things, and it's mysterious that these two views somehow correspond to each other.

Consider this system of equations: $$ \begin{array}{r} x\, +\, y=4 \\ 2x-2y=4 \end{array} $$

We can view a system of equations like this in two ways. If we look at the rows, we can say that these are two equations that have to be true at the same time.

I.e. each equation defines a relationship between $x$ and $y$, and we are asking "for what $x$ and $y$ do both these relationships equal 4?"

The other way to look at these equations is as a system that takes a tuple $(x,y)$. The question is then "which tuple $(x,y)$ does this system transform into $(4,4)$?"

Notice, we didn't need any new notation to justify these two ways of interpreting the equations above. We can look at the equations as a pair of $x \mapsto f(x)$ and ask a question about when they are the same, or we can look at the equations as a whole, i.e. as $(x,y) \mapsto f(x,y)$ and ask about what 'the system' $f$ does to some tuple.

Tuples already have a nice interpretation as arrows, we don't need to invent anything new for that. So intuitively, the way I would put it is that the row view and column view are both there at the same time. They are two aspects of systems of (linear) equations that exist simultaneously.

By the way, the solution to the equations above is $x = 3$ and $y = 1$ or $ \begin{bmatrix} 3 \\ 1 \end{bmatrix} $ depending on how you look at it.
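A brute-force sketch makes the "both views at once" point tangible: search a small integer grid once for tuples where both equations hold, and once for tuples that the whole system sends to $(4,4)$. The grid bounds and variable names are arbitrary choices made for this illustration.

```python
def system(x, y):
    """The whole system viewed as one map on tuples."""
    return (x + y, 2 * x - 2 * y)

grid = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]

# View 1: two separate conditions that must hold at the same time.
rows_hold = [(x, y) for x, y in grid if x + y == 4 and 2 * x - 2 * y == 4]

# View 2: one map, and we ask which tuple lands on (4, 4).
maps_to_target = [(x, y) for x, y in grid if system(x, y) == (4, 4)]

print(rows_hold)       # [(3, 1)]
print(maps_to_target)  # [(3, 1)]
```

Two tuples are equal exactly when their components are equal, so the two filters test the same condition and must return the same list.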

Imagine if the solutions from both approaches didn't correspond. Then we would not be able to look at a system of equations in these two different ways, which would basically break math :-)
