@Matt24, I was thinking about the same set of questions you posed in your thread: how did Strang get from a system of linear equations to the column form? Not sure if this thread is still open, but here are my 2 cents.
After poring over many books on linear algebra, I finally found the answer in W. W. Sawyer's book "An Engineering Approach to Linear Algebra". It is available as a PDF; just search for it.
The clue to the missing link between the row form (SLE) and the column form is the glaring omission of the following 3 items by Strang and others in their depiction of the vector form.
1. Strang doesn't state the actual problem that the 2 linear equations represent. In the real world, the 2 equations could represent anything, for example price (first equation) and weight (second equation). This is a colossal omission for engineers, though acceptable for mathematicians, as they breathe the abstract form.
2. Because of (1), the meaning of the X and Y axes is not given for the vector form. I.e., what do the X axis and the Y axis in the vector form represent about the original real-world problem?
3. The row form of an SLE is drawn as a single graph with 2 lines, one per equation in this case. The arrangement/intersection of the lines shows whether there is one solution, many solutions, or no solution, and what the solution(s) are if they exist. This is not intuitive or suitable for studying or solving a real-world problem that has multiple attributes. By the end of this response, we will understand what this means.
I believe that if only mathematicians had spent some more time explaining the above 3 items in their textbooks, this could have been intuitively understood by mortals like us.
To make it simpler and understand better, I have provided the problem statement as below.
The price of milk is 2 dollars per bottle and the price of cheese is 1 dollar per box. The weight of milk is 1 pound per bottle and the weight of cheese is 1 pound per box.
Due to a promotion, when you buy a bottle of milk, a box of cheese is free and the delivery to your home is free if the combined weight is up to 5 pounds.
What quantities of milk and cheese can be bought for 1 dollar while still qualifying for the free-delivery promotion (up to 5 pounds), if that is possible at all?
The above problem statement can be represented by 2 linear equations: one for the price view and one for the weight view.
Total Price = (cost of milk per bottle * number of milk bottles) - (cost of cheese per box * number of cheese boxes) --> Price equation
Total Weight = (weight of milk per bottle * number of milk bottles) + (weight of cheese per box * number of cheese boxes) --> Weight equation
There is a '-' in the price equation to account for the free cheese: because of the promotion you don't pay for it, whereas otherwise you would have.
The '+' in the weight equation accounts for the combined weight of milk and cheese. The cheese is free, but its weight still has to be carried and hence counted.
After applying the price, weight and constraints from the problem statement, the 2 linear equations are
2x - y = 1 (price equation)
x + y = 5 (weight equation)
Note: I have deliberately kept the price and weight as 2 dollars, 1 pound for milk and 1 dollar, 1 pound for cheese, just to highlight that there may or may not be a solution.
This is the key here. The two equations represent different constraints of the problem statement, i.e. they are not both price equations and they are not both weight equations.
Can both represent the same constraint, like 2 price equations? Probably yes, but that would be something like price equations for 2 separate occasions, which is a different problem statement from this one.
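Whether this particular example has a solution can be checked directly from the two equations 2x - y = 1 and x + y = 5. Below is a minimal Python sketch using Cramer's rule for a 2x2 system. The function name `solve_2x2` is my own and purely illustrative, and Cramer's rule is just one convenient way to do this, not something taken from Strang or Sawyer.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f with Cramer's rule.

    Returns (x, y), or None when the determinant is zero
    (then there is either no solution or infinitely many).
    """
    det = a * d - b * c
    if det == 0:
        return None
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - e * c, det)
    return x, y

# price equation:  2x - y = 1
# weight equation:  x + y = 5
solution = solve_2x2(2, -1, 1, 1, 1, 5)
print(solution)
```

For the numbers in this example the result is x = 2 bottles of milk and y = 3 boxes of cheese, which can be confirmed by substituting back: 2*2 - 3 = 1 and 2 + 3 = 5.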
As explained above, let's assume the first equation is representing price view and the second equation is representing weight view. With this explanation, let's try to answer your questions.
Simplified progression from SLEs to Vectors
Vector view with dimensions
What is a vector [2, 1]^T?
From the 2 linear equations, there are 3 vectors.
Milk Vector = (+2x dollars from first equation, +1x pounds from second equation) = (2, 1).
Cheese Vector = (-1y dollars from first equation, +1y pounds from second equation) = (-1, 1).
Combined Milk & cheese vector (Resultant vector) = (+1 dollar from first equation, +5 pounds from second equation) = (1, 5).
So the vector form represents 2 attributes (price and weight, in 2 dimensions) of milk or cheese or both! Because these are different attributes (dimensions), price and weight cannot be added by the usual arithmetic (say, number-line addition), as that wouldn't make any sense. So this calls for a different arithmetic (linear algebra) with its own rules and methods. This is the crux of linear algebra, which makes heavy use of vectors, matrices and complex numbers, all of which aid the addition and multiplication of numbers that are represented by 2 or more quantities (tuples).
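The step from the row form to the three vectors above is just reading the same coefficients column by column instead of row by row. Here is a small Python sketch of that bookkeeping; the dictionary layout and the variable names are my own choices for illustration.

```python
# Each row is one equation (one attribute of the problem):
# the coefficient of x (milk) followed by the coefficient of y (cheese).
rows = {
    "price":  [2, -1],   # 2x - y = 1
    "weight": [1,  1],   #  x + y = 5
}
rhs = {"price": 1, "weight": 5}

# Reading the same numbers column by column gives one vector per item,
# with components ordered as (price, weight).
attributes = ["price", "weight"]
milk_vector = tuple(rows[a][0] for a in attributes)
cheese_vector = tuple(rows[a][1] for a in attributes)
result_vector = tuple(rhs[a] for a in attributes)

print("milk  :", milk_vector)
print("cheese:", cheese_vector)
print("result:", result_vector)
```

Each vector keeps the price in its first slot and the weight in its second slot, so the two attributes never get mixed.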
What does it mean to multiply the vector [2, 1]^T by a real number x?
By now, this should be evident. When the vector (2, 1) is multiplied by a real number x, x is a scalar that represents the number of milk bottles. Each milk bottle has 2 attributes, i.e. each bottle contributes 2 dollars of cost and 1 pound of weight, so x bottles contribute (2x, x).
What does it mean to add two vectors together?
In this case, vector addition means adding the milk vector (price, weight) and the cheese vector (price, weight) component by component to obtain the combined milk-and-cheese vector, or resultant vector.
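Scalar multiplication and vector addition for this example can be sketched in a few lines of Python. The helper names `scale` and `add` are hypothetical, and the quantities x = 2 and y = 3 are the values that satisfy both equations (2*2 - 3 = 1 and 2 + 3 = 5).

```python
def scale(k, v):
    """Multiply every component of vector v by the number k."""
    return tuple(k * c for c in v)

def add(u, v):
    """Add two vectors component by component."""
    return tuple(a + b for a, b in zip(u, v))

milk = (2, 1)     # (dollars, pounds) per bottle of milk
cheese = (-1, 1)  # (dollars, pounds) per box of cheese

x, y = 2, 3  # number of milk bottles and cheese boxes

milk_part = scale(x, milk)        # price and weight of all the milk
cheese_part = scale(y, cheese)    # price and weight of all the cheese
combined = add(milk_part, cheese_part)
print(milk_part, cheese_part, combined)
```

Dollars are only ever added to dollars and pounds to pounds, and the combined vector comes out as (1, 5), the right-hand side of the two equations.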
When are two vectors the same?
In this case, when the price and weight of milk (say 2 dollars and 1 pound) and of cheese (say 2 dollars and 1 pound) are the same, the milk and cheese vectors are the same. The book definition says two vectors are the same when their magnitudes and directions are the same. When the milk and cheese vectors have identical attributes, they also have the same magnitude and direction in the X-Y plane.
What is the linear function or linear map for the above problem statement?
The linear function or linear map for the above example can be depicted as below.
f(x) is mapped to f(x*)
f(milk bottle) is mapped to (milk price, milk weight)
f(y) is mapped to f(y*)
f(cheese box) is mapped to (cheese price, cheese weight)
f(x,y) is mapped to f(x*,y*) or f(x+y) is mapped to f(x*+y*)
f(milk bottle, cheese box) = ( price of milk and cheese, weight of milk and cheese)
f(x,y) = (2x-y, x+y)
Neither the price nor the weight contains powers of x and y or products of them, which is what makes f linear.
f(1 bottle of milk, 1 box of cheese) ==> f(x,y) ==> f(1,1) = (2*1-1, 1+1) = (1,2)
i.e. f(1,1) is mapped to (1,2)
f(2 bottles of milk, 1 box of cheese) ==> f(2,1) = (2*2-1, 2+1) = (3,3). Compared with the previous example, x is doubled in both equations.
i.e. f(2,1) is mapped to (3,3)
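The map f(x,y) = (2x-y, x+y) can also be written as a small Python function, which makes it easy to check the two properties that make it linear: f of a sum is the sum of the f values, and f of a scaled input is the scaled f value. This is only a sketch; the helper names `add` and `scale` and the sample inputs are my own.

```python
def f(x, y):
    """Map (milk bottles, cheese boxes) to (total price, total weight)."""
    return (2 * x - y, x + y)

def add(u, v):
    """Add two pairs component by component."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, v):
    """Multiply both components of the pair v by k."""
    return tuple(k * c for c in v)

u = (1, 1)  # 1 bottle of milk, 1 box of cheese
v = (1, 2)  # 1 bottle of milk, 2 boxes of cheese

# Additivity: f(u + v) equals f(u) + f(v).
lhs_add = f(*add(u, v))
rhs_add = add(f(*u), f(*v))

# Homogeneity: f(3 * u) equals 3 * f(u).
lhs_scale = f(*scale(3, u))
rhs_scale = scale(3, f(*u))

print(f(*u), lhs_add == rhs_add, lhs_scale == rhs_scale)
```

Here u + v is (2, 3), i.e. 2 bottles and 3 boxes, and f maps it to (1, 5), the right-hand side of the original two equations.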
This is an example of an SLE with 2 equations and 2 unknowns. The same can be extended to any SLE with any number of unknowns. Hope this makes the intuition clear. If there are any glaring holes in my argument, let me know. I can share some of the worksheets and simulations in Excel which show the above clearly.