Sep 30, 2018 - Linear Algebra Intuition
Continuing on my quest to have the best unifying intuition about linear algebra, I came across the excellent series Essence of Linear Algebra on YouTube by 3Blue1Brown.
This series really expanded on the intuition behind the fundamental concepts of Linear Algebra by illustrating them geometrically. I had vaguely arrived at the same sort of intuition by thinking about these concepts before, but never this explicitly. My notes are here.
Chapter 1: Vectors, what even are they
Vectors can be visualized geometrically as arrows in 1-D, 2-D, 3-D, ..., n-D coordinate systems, and can also be represented as lists of numbers (where the numbers are the coordinate values).
The geometric interpretation here is important for building intuition and can later be generalized to more abstract vector spaces.
Chapter 2: Linear combinations, span, and basis vectors
Vectors (arrows, or lists of numbers) can form linear combinations, which involve multiplication by scalars and addition of vectors. Geometrically, multiplication by a scalar is equivalent to scaling the length of a vector by that factor. Vector addition means putting the vectors tail to head and finding the resulting vector.
Any vector in 2D space, for example, can be described as a linear combination of a set of basis vectors.
In other words, 2D space is spanned by a set of basis vectors. Usually, the basis vectors are chosen to be the unit vectors $\hat{i} = (1, 0)$ and $\hat{j} = (0, 1)$ along the x- and y-axes.
As a corollary, if a 3-by-3 matrix has linearly dependent columns, its columns span only a plane or a line rather than all of 3D space.
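To make the span/linear-combination idea concrete, here is a minimal numpy sketch; the particular non-standard basis vectors are just an arbitrary illustrative choice.

```python
import numpy as np

# Standard basis vectors in 2D
i_hat = np.array([1.0, 0.0])
j_hat = np.array([0.0, 1.0])

# Any 2D vector is a linear combination of the basis vectors: v = x*i_hat + y*j_hat
x, y = 3.0, -2.0
v = x * i_hat + y * j_hat
print(v)  # [ 3. -2.]

# The same works for any pair of linearly independent basis vectors,
# because such a pair spans the whole plane.
b1 = np.array([1.0, 1.0])
b2 = np.array([-1.0, 1.0])
w = 2.0 * b1 + 0.5 * b2
print(w)  # [1.5 2.5]
```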
Chapter 3: Linear transformations and matrices
Multiplication of a 2-by-1 vector by a 2-by-2 matrix can be interpreted as a linear transformation of the 2D plane, where the columns of the matrix record where the basis vectors $\hat{i}$ and $\hat{j}$ land after the transformation.
Specifically, assume the original vector is $\vec{v} = x\hat{i} + y\hat{j}$. After the transformation, $\vec{v}$ becomes $x$ times the first column of the matrix plus $y$ times the second column, i.e. the same linear combination of the transformed basis vectors.
Therefore, the matrix-vector multiplication then has the geometric meaning of representing the original vector in the transformed coordinate system.
This is a very powerful idea -- the specific operation of matrix-vector multiplication makes clear sense in this context.
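A quick numpy check of this "linear combination of the columns" reading (the matrix and vector here are arbitrary examples):

```python
import numpy as np

# Columns of A record where i_hat and j_hat land after the transformation.
A = np.array([[2.0, -1.0],
              [1.0,  1.0]])

v = np.array([3.0, 2.0])   # v = 3*i_hat + 2*j_hat
x, y = v

# Matrix-vector product = same linear combination of the transformed basis vectors
direct = A @ v
by_columns = x * A[:, 0] + y * A[:, 1]

print(direct)                           # [4. 5.]
print(by_columns)                       # [4. 5.]
print(np.allclose(direct, by_columns))  # True
```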
Chapter 4: Matrix multiplication as compositions and Chapter 5: Three-dimensional linear transformations
If a matrix represents a transformation of the coordinate system (e.g. shear, rotation), then multiplication of matrices represents a sequence of transformations. For example, rotating and then shearing the plane corresponds to the single matrix obtained by multiplying the shear matrix by the rotation matrix (the transformation applied first sits on the right of the product).
This also makes it obvious why matrix multiplication is NOT commutative. Shearing a rotated coordinate system can have a different end result than rotating a sheared coordinate system.
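A minimal numpy sketch of composition and non-commutativity, using a 90-degree rotation and a horizontal shear as example transformations:

```python
import numpy as np

theta = np.pi / 2                       # 90-degree rotation
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
shear = np.array([[1.0, 1.0],           # horizontal shear
                  [0.0, 1.0]])

v = np.array([1.0, 0.0])

# "First rotate, then shear" is the product shear @ rotation
# (the matrix applied first sits on the right).
print(shear @ rotation @ v)             # approximately [1. 1.]
print(rotation @ shear @ v)             # approximately [0. 1.]

# Matrix multiplication is not commutative:
print(np.allclose(shear @ rotation, rotation @ shear))  # False
```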
The inverse of a matrix then represents the transformation that undoes the original one, returning the coordinate system to where it started.
Chapter 6: The determinant
This is a really cool idea. In many statistics formulas, there will be a condition that reads like "assuming the matrix is non-singular" (i.e. its determinant is non-zero).
In the 2D case, if a matrix represents a transformation of the coordinate system, the determinant then represents the area in the transformed coordinate system of a unit area in the original system. Imagine a unit square being stretched, what is the area of the resulting shape -- that is the value of the determinant.
A negative determinant would mean a change in the orientation of the resulting shape. Imagine a linear transformation as performing operations on a piece of paper (parallel lines on the paper are preserved through the operations). A matrix with a determinant of -1 would correspond to operations that involve flipping the paper.
In 3D, the determinant then measures the scaling of a volume, and a negative determinant represents transformations that do not preserve the handedness of the basis vectors (i.e. the right-hand rule).
If a negative determinant represents flipping a piece of 2D paper, then a determinant of 0 would mean a unit area becoming zero. Similarly, in 3D, a determinant of 0 would mean a unit volume becoming zero. What shapes have zero volume? A plane, a line, and a point. What shapes have zero area? A line and a point.
So if the determinant of a matrix is 0, the transformation collapses space into a lower dimension, and the columns of the matrix are linearly dependent.
From this intuition, the computation of the determinant also makes a bit more sense (at least in the 2D case).
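A small numpy sketch of the area-scaling picture, with an axis-aligned stretch, a flip, and a squashed (singular) matrix as illustrative examples:

```python
import numpy as np

# A transformation that stretches x by 3 and y by 2 scales any area by 6.
A = np.array([[3.0, 0.0],
              [0.0, 2.0]])
print(np.linalg.det(A))      # 6.0

# Flipping one axis reverses orientation: the determinant becomes negative.
flip = np.array([[-1.0, 0.0],
                 [ 0.0, 1.0]])
print(np.linalg.det(flip))   # -1.0

# A matrix with linearly dependent columns squashes the plane onto a line,
# so the unit square is mapped to zero area.
squash = np.array([[1.0, 2.0],
                   [2.0, 4.0]])
print(np.linalg.det(squash)) # 0.0
```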
Chapter 7: Inverse matrices, column space and null space
This chapter combines the geometric intuition from before with solving systems of linear equations to explain inverse matrices, column space, and null space.
Inverse matrices
Solving a system of linear equations in the form of $A\vec{x} = \vec{b}$ can be viewed as looking for the vector $\vec{x}$ that lands on $\vec{b}$ after the transformation $A$.
In the nominal case where $\det(A) \neq 0$, the inverse $A^{-1}$ exists and the unique solution is $\vec{x} = A^{-1}\vec{b}$.
Intuitively this makes sense -- to get the vector pre-transformation, we simply reverse the transformation.
Now suppose $\det(A) = 0$. The transformation squashes space into a lower dimension, so $A^{-1}$ does not exist, and a solution exists only if $\vec{b}$ happens to lie in that lower-dimensional space.
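A minimal numpy illustration of the nominal case versus the singular case (the matrices are arbitrary examples; np.linalg.solve is used rather than forming the inverse explicitly):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# det(A) != 0, so A^{-1} exists and x = A^{-1} b is the unique solution.
print(np.linalg.det(A))            # approximately 5.0
x = np.linalg.solve(A, b)
print(x)                           # [0.8 1.4]
print(np.allclose(A @ x, b))       # True

# A singular matrix squashes the plane onto a line, so there is no inverse.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(S))            # 0.0
try:
    np.linalg.solve(S, b)
except np.linalg.LinAlgError as e:
    print("no unique solution:", e)
```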
Column Space and Null Space
So in the nominal case, each vector in 3D is mapped to a different one in 3D space. Therefore the set of all possible outputs $A\vec{x}$, called the column space of $A$, is all of 3D space.
The set of vectors $\vec{x}$ that get mapped onto the origin, i.e. those satisfying $A\vec{x} = \vec{0}$, is called the null space of $A$; in the nominal case it contains only the zero vector.
In cases where $\det(A) = 0$, the transformation squashes space onto a plane, a line, or a point, and the column space is that lower-dimensional subspace. The rank of $A$ is the dimension of the column space.
If this rank-deficient $A$ squashes space into a lower dimension, then many different vectors are mapped onto the same output; in particular, a whole set of vectors is mapped onto the origin, so the null space is no longer just the zero vector.
Geometrically, this means that if a transformation compresses a cube into a plane, then an entire line of points is mapped onto the origin of the resulting plane (imagine a vertical compression of a cube: the entire z-axis is mapped onto the origin and is therefore the null space of this transformation). Similarly, if a cube is compressed into a line, then an entire plane of points is mapped onto the origin of the resulting number line.
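Here is a small numpy sketch of the "cube compressed onto a plane" picture: the matrix below simply zeroes out the z-component, so the column space is the xy-plane (rank 2) and the null space is the z-axis. Extracting the null space from the SVD is just one convenient way to do it.

```python
import numpy as np

# A "vertical compression" of 3D space: everything is flattened onto the xy-plane.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

print(np.linalg.matrix_rank(A))   # 2 -> the column space is a plane

# The null space is the set of vectors mapped to the origin; here it is the z-axis.
# Right singular vectors with (near-)zero singular values span the null space.
_, s, vt = np.linalg.svd(A)
null_space = vt[s < 1e-10]
print(null_space)                 # [[0. 0. 1.]]  (the z-axis direction, up to sign)
print(A @ null_space[0])          # [0. 0. 0.]
```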
Left and Right Inverse
Technically, the inverse discussed above is the two-sided inverse, which only exists for square matrices with full rank (non-zero determinant). For non-square matrices, only a left inverse or a right inverse can exist.
For matrices with more columns than rows, the situation is a bit different. Using our intuition from before treating the columns of a matrix as the mapping of the original basis vectors into a new coordinate system, consider a matrix with 2 rows and 3 columns: it maps the three basis vectors of 3D space onto vectors in the 2D plane, i.e. it squashes 3D space down to 2D.
Consequently, if we solve $A\vec{x} = \vec{b}$ for such a matrix (with full row rank), there are infinitely many solutions, since a whole line of 3D vectors lands on $\vec{b}$; such a matrix can have a right inverse but not a left inverse.
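As a sketch of this right-inverse behaviour, the snippet below uses numpy's pseudoinverse (np.linalg.pinv) on an arbitrary full-row-rank 2-by-3 matrix:

```python
import numpy as np

# A 2-by-3 matrix: the three 3D basis vectors are mapped into the 2D plane,
# so the whole of 3D space is squashed down onto 2D.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# The pseudoinverse acts as a right inverse here (A has full row rank)...
A_pinv = np.linalg.pinv(A)
print(np.allclose(A @ A_pinv, np.eye(2)))   # True  (right inverse)
# ...but not as a left inverse: information lost in squashing 3D -> 2D
# cannot be recovered.
print(np.allclose(A_pinv @ A, np.eye(3)))   # False

# Consequently A x = b has infinitely many solutions; pinv picks one of them.
b = np.array([2.0, 3.0])
x = A_pinv @ b
print(np.allclose(A @ x, b))                # True
```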
These explanations, in my opinion, are much more intuitive and easier to remember than the rules and diagrams taught in all of the linear algebra courses I have ever taken, including 18.06.
Chapter 8: Nonsquare matrices as transformations between dimensions
This is a straightforward application of the idea that the columns of a matrix represent the mapping of the basis vectors: a nonsquare matrix maps the basis vectors of one space into a space of a different dimension.
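For instance, a minimal sketch with a 3-by-2 matrix (an arbitrary example), which takes the 2D basis vectors to two vectors in 3D and hence maps the plane into 3D space:

```python
import numpy as np

# A 3-by-2 matrix: its two columns are where the 2D basis vectors land in 3D,
# so it maps the 2D plane into 3D space.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

v = np.array([2.0, 3.0])   # a 2D vector
print(B @ v)               # [2. 3. 5.] -- a 3D vector in the plane spanned by the columns
```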
Chapter 9: Dot product and duality
This one is pretty interesting. The dot product represents projection. At the same time, we can think of the dot product in 2D, $\vec{u} \cdot \vec{v}$, as a matrix-vector multiplication $\vec{u}^{T} \vec{v}$, where the 1-by-2 matrix $\vec{u}^{T}$ is a linear transformation from the 2D plane onto the 1D number line.
This is the idea of vector-transformation duality. Each n-dimensional vector's transpose represents an n-to-1-dimensional linear transformation.
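A tiny numpy check of this duality: the dot product agrees with applying the 1-by-2 matrix $\vec{u}^{T}$ (the vectors are arbitrary examples).

```python
import numpy as np

u = np.array([2.0, 3.0])
v = np.array([4.0, -1.0])

# The dot product...
print(np.dot(u, v))          # 5.0

# ...is the same as applying the 1-by-2 matrix u^T (a 2D -> 1D linear
# transformation) to v.
u_as_transform = u.reshape(1, 2)
print(u_as_transform @ v)    # [5.]
```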
Chapter 10: Cross products
This one is also not too insightful, perhaps because the cross product's definition as a vector was strongly motivated by physics.
The cross product of two 2D vectors $\vec{v}$ and $\vec{w}$ can be interpreted as the signed area of the parallelogram they span, which is the determinant of the matrix with $\vec{v}$ and $\vec{w}$ as its columns. In 3D, $\vec{v} \times \vec{w}$ is a vector perpendicular to both, whose length is the area of that parallelogram and whose direction follows the right-hand rule.
Then the triple product $\vec{u} \cdot (\vec{v} \times \vec{w})$ gives the signed volume of the parallelepiped spanned by the three vectors, which equals the determinant of the matrix with $\vec{u}$, $\vec{v}$, and $\vec{w}$ as its columns.
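A small numpy sanity check of the triple-product-as-determinant statement, on three arbitrary (here axis-aligned) vectors:

```python
import numpy as np

v = np.array([1.0, 0.0, 0.0])
w = np.array([0.0, 1.0, 0.0])
u = np.array([0.0, 0.0, 2.0])

# Cross product: perpendicular to v and w, length = area of their parallelogram.
print(np.cross(v, w))                       # [0. 0. 1.]

# Triple product = signed volume of the parallelepiped spanned by u, v, w,
# which equals the determinant of the matrix with u, v, w as columns.
triple = np.dot(u, np.cross(v, w))
det = np.linalg.det(np.column_stack([u, v, w]))
print(triple, det)                          # 2.0 2.0 (up to floating-point error)
```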
Chapter 11: Cross products in the light of linear transformations
This one basically explains the triple product by interpreting the dot-product operation (dotting with $\vec{v} \times \vec{w}$) as a 3D-to-1D linear transformation, i.e. through the duality idea from Chapter 9.
Not too useful.
Chapter 12: Change of basis
This also follows from the idea that a matrix represents mapping of the individual vectors of a coordinate system.
The idea is that coordinate systems are entirely arbitrary. We are used to having $\hat{i}$ and $\hat{j}$ as basis vectors, but any other pair of linearly independent vectors can serve as a basis, and the same vector will have different coordinates in that alternative system.
This is done by a change of basis -- by multiplying a vector's coordinates in the alternative basis by the matrix whose columns are the alternative basis vectors (written in our coordinates), we translate the vector into our coordinate system; multiplying by the inverse of that matrix translates in the other direction.
Change of basis can also be applied to translate linear transformations. For example, suppose we want to rotate a vector that is expressed in someone else's basis by 90 degrees, but we only know how to write the 90-degree rotation matrix in our own basis.
So we can achieve this easily by first transforming the vector into our coordinate system (multiplying by the change-of-basis matrix $P$), applying the rotation $R$, and then transforming back (multiplying by $P^{-1}$). The whole operation is the single matrix $P^{-1} R P$, which is the same rotation expressed in the other basis.
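A short numpy sketch of the $P^{-1} R P$ recipe, with an arbitrary change-of-basis matrix $P$ and a 90-degree rotation $R$:

```python
import numpy as np

# Columns of P are the other basis vectors written in our coordinates.
P = np.array([[2.0, -1.0],
              [1.0,  1.0]])
P_inv = np.linalg.inv(P)

# 90-degree rotation, written in our standard basis.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# The same rotation expressed in the other basis: translate into our
# coordinates, rotate, translate back.
R_in_other_basis = P_inv @ R @ P

# Check on a vector given in the other basis:
v_other = np.array([1.0, 2.0])
lhs = P @ (R_in_other_basis @ v_other)   # rotate in their basis, then convert to ours
rhs = R @ (P @ v_other)                  # convert to ours, then rotate
print(np.allclose(lhs, rhs))             # True
```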
This is a powerful idea and relates closely to eigenvectors, PCA, and SVD. SVD especially represents a matrix by a series of coordinate transformations (rotation, scaling and rotation).
Chapter 13: Eigenvectors and eigenvalues
This is motivated by the problem $A\vec{v} = \lambda\vec{v}$.
The eigenvectors are vectors that, after transformation by $A$, stay on their own span: they are only stretched or shrunk by a factor $\lambda$, the corresponding eigenvalue.
The imagery is this: suppose we have a square piece of paper/mesh and we pull at the top-right and bottom-left corners to stretch it along the diagonal. A line connecting these two corners will still have the same orientation, though a longer length. Thus this diagonal direction is an eigenvector, and the eigenvalue will be positive.
However, if a matrix represents a pure rotation, then no direction is left unchanged: there are no real eigenvectors, and the eigenvalues will be imaginary. Imaginary eigenvalues mean that a rotation is involved in the transformation.
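A quick numpy check of both cases, using a stretch along the diagonal and a 90-degree rotation as example matrices:

```python
import numpy as np

# A stretch along the diagonal: pulling the opposite corners of a square apart.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vals, vecs = np.linalg.eig(A)
print(vals)        # [3. 1.] (order may vary)
print(vecs[:, 0])  # roughly [0.707 0.707], i.e. along the diagonal [1, 1]

# A pure 90-degree rotation keeps no direction fixed: the eigenvalues are imaginary.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.linalg.eig(R)[0])   # [0.+1.j 0.-1.j]
```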
Chapter 14: Abstract vector spaces
This lesson extrapolates vectors, normally thought of as arrows in 1- to 3-D space and corresponding lists of numbers, to any construct that obeys the axioms of a vector space. Essentially, the members of a vector space follow the rules of linear combination (i.e. linearity).
An immediate generalization is thinking of functions as infinite-dimensional vectors (we can add scalar multiples of functions together and the result obeys the linearity rules). We can further define linear transformations on these functions, similarly to how matrices are defined on geometric vector spaces.
An example of a linear transformation on functions is the derivative operator.
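For instance, restricting to polynomials of degree at most 3 (a finite-dimensional slice of the function space), the derivative can literally be written as a matrix acting on the coefficient vector; a minimal numpy sketch:

```python
import numpy as np

# Represent a polynomial a0 + a1*x + a2*x^2 + a3*x^3 by its coefficient vector.
# The derivative is then a linear transformation, i.e. a matrix acting on that vector.
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])

# p(x) = 5 + 4x + 3x^2 + 2x^3  ->  p'(x) = 4 + 6x + 6x^2
p = np.array([5.0, 4.0, 3.0, 2.0])
print(D @ p)   # [4. 6. 6. 0.]
```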
This insight is pretty cool and really ties a lot of math and physics concepts together (e.g. modeling wavefunctions as vectors in quantum mechanics, with eigenfunctions as analogs of eigenvectors).