Let $\mathbb{F}$ be a field. Previously, we discussed at length how matrices with $m$ rows and $n$ columns of entries in $\mathbb{F}$ are basically the same as linear transformations from $\mathbb{F}^n$ to $\mathbb{F}^m$.
In this post, we explore the idea of invertible transformations and how we can use these ideas to help us solve simultaneous linear equations. It turns out that solving this problem is key to helping us understand dimensions in vector spaces.
Let $V, W$ be vector spaces over $\mathbb{F}$. Recall that a linear transformation $T : V \to W$ is a vector space isomorphism if $T$ is bijective. The collection of such bijections is a fascinating subject of study.
Lemma 1. For any nonempty set $X$, let $S_X$ denote the set of bijections (more technically, symmetries) from $X$ to itself.
- For any $f, g \in S_X$, $g \circ f \in S_X$.
- For any $f, g, h \in S_X$, $(h \circ g) \circ f = h \circ (g \circ f)$.
- The identity map $\operatorname{id}_X \in S_X$, and for any $f \in S_X$, $f \circ \operatorname{id}_X = f = \operatorname{id}_X \circ f$.
- For any $f \in S_X$, $f^{-1} \in S_X$ and $f \circ f^{-1} = \operatorname{id}_X = f^{-1} \circ f$.
Together, we call $S_X$ a group under function composition, called the symmetric group over $X$. Furthermore, if $V$ is a vector space over $\mathbb{F}$, then the general linear group of $V$ is defined by
$$\operatorname{GL}(V) := \{ T \in S_V : T \text{ is linear} \},$$
and is the group of vector space isomorphisms (under function composition) from $V$ to itself. In the special case $V = \mathbb{F}^n$, we call $\operatorname{GL}_n(\mathbb{F}) := \operatorname{GL}(\mathbb{F}^n)$ the general linear group of dimension $n$ over $\mathbb{F}$.
It turns out that if there exists an isomorphism $T : \mathbb{F}^n \to \mathbb{F}^m$, then we must have $m = n$. This will be the key result we aim to prove in this blogpost. But first, a motivating question:
What are the solutions (in $\mathbb{R}^3$) to
$$x + y + z = 0?$$
Intuitively, we would think that $x = -y - z$. This means that $y$ can be any free parameter $s \in \mathbb{R}$, and $z$ can be any free parameter $t \in \mathbb{R}$. In that sense, we have the solution
$$(x, y, z) = (-s - t, s, t) = s(-1, 1, 0) + t(-1, 0, 1).$$
Now we could represent this information in terms of vectors. Denote the vector
$$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in \mathbb{R}^3.$$
The equation tells us that we need to combine the recipe $A = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}$ with the ingredients $\mathbf{x}$ to obtain the dish
$$A\mathbf{x} = x + y + z = 0.$$
Indeed, any $\mathbf{x}$ that solves the given equation $A\mathbf{x} = \mathbf{0}$ is simply an element of
$$\ker(A) := \{ \mathbf{v} \in \mathbb{R}^3 : A\mathbf{v} = \mathbf{0} \},$$
and vice versa. From our exploration, we can actually compute this set, since
$$\ker(A) = \{ s(-1, 1, 0) + t(-1, 0, 1) : s, t \in \mathbb{R} \}.$$
Therefore, $\ker(A) = \operatorname{span}\{(-1, 1, 0), (-1, 0, 1)\}$. With $1$ equation in $3$ variables, we get a kernel with basis containing $2$ elements. What if we had $2$ equations?
What are the solutions (in $\mathbb{R}^3$) to the following system?
$$\begin{cases} x + y + z = 0 \\ x + 3y + 5z = 0 \end{cases}$$
Denoting $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 3 & 5 \end{pmatrix}$ and $\mathbf{x} = (x, y, z)^T$, we now want to calculate $\ker(A)$. A careful analysis yields
$$\ker(A) = \{ \mathbf{x} \in \mathbb{R}^3 : x + y + z = 0 \text{ and } x + 3y + 5z = 0 \}.$$
Unfortunately, evaluating the right-hand side will inevitably lead us back to where we started. We need some clever tricks. Well, recalling the method of elimination, if we subtracted the first equation from the second, we get
$$\begin{cases} x + y + z = 0 \\ 2y + 4z = 0 \end{cases}$$
Multiplying by $\tfrac{1}{2}$ then allows the coefficient of $y$ to become $1$, yielding
$$\begin{cases} x + y + z = 0 \\ y + 2z = 0 \end{cases}$$
Finally, adding $-1$ times the second equation to the first yields
$$\begin{cases} x - z = 0 \\ y + 2z = 0 \end{cases}$$
Denoting $D = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{pmatrix}$, the coefficient matrix of the final system, we conclude that
$$\ker(A) = \ker(D) = \operatorname{span}\{(1, -2, 1)\}.$$
The key idea is that in each step, we could “undo” the process (e.g. multiplying by $2$ “undoes” multiplying by $\tfrac{1}{2}$). At each step, we are transforming each equation, which corresponds to transforming the rows of $A$. The final matrix $D$ is known as the reduced row-echelon form, where the columns are arranged as follows:
- The columns $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_r$ are arranged right-ward. These are called pivot columns. Columns that are not pivot columns are non-pivot columns.
- Before the column that is $\mathbf{e}_1$, all columns are $\mathbf{0}$.
- Any column between $\mathbf{e}_i$ and $\mathbf{e}_{i+1}$ belongs to $\operatorname{span}(\mathbf{e}_1, \ldots, \mathbf{e}_i)$. Here, we use round brackets to emphasise the order of the basis.
Using induction, we can prove that every matrix has a unique reduced row-echelon form.
Therefore, given $A$ with reduced row-echelon form $D$, we want to prove that $\ker(A) = \ker(D)$. In fact, this result will hold if there exists an invertible (i.e. bijective) linear transformation $P$ such that
$$PA = D,$$
for then $A\mathbf{x} = \mathbf{0} \iff PA\mathbf{x} = \mathbf{0} \iff D\mathbf{x} = \mathbf{0}$. If we can prove that our suggested moves do indeed correspond with bijective linear transformations, we can prove the result. We will use matrices to illustrate the technique.
Lemma 2. Suppose a linear transformation $T : \mathbb{F}^n \to \mathbb{F}^n$ with matrix $A$ is invertible (i.e. bijective). Then there exists a matrix $B$ such that
$$T^{-1}(\mathbf{x}) = B\mathbf{x} \quad \text{for all } \mathbf{x} \in \mathbb{F}^n.$$
Furthermore, $AB = I_n = BA$.

Proof. By invertibility, $T^{-1}$ exists and is itself linear, since
$$T^{-1}(\mathbf{x} + c\mathbf{y}) = T^{-1}\big(T(T^{-1}(\mathbf{x})) + cT(T^{-1}(\mathbf{y}))\big) = T^{-1}(\mathbf{x}) + cT^{-1}(\mathbf{y}),$$
so $T^{-1}$ is represented by some matrix $B$. By the definition of $T^{-1}$,
$$AB\mathbf{x} = T(T^{-1}(\mathbf{x})) = \mathbf{x} = T^{-1}(T(\mathbf{x})) = BA\mathbf{x} \quad \text{for all } \mathbf{x} \in \mathbb{F}^n.$$
We can check that the map $T \mapsto A$ induces a group structure on the set of invertible $n \times n$ matrices, with identity $I_n$. For the inverse of $A$,
$$AB = I_n = BA,$$
yielding $A^{-1} = B$. $\square$
Henceforth, we will make no distinction between a linear transformation $T : \mathbb{F}^n \to \mathbb{F}^m$ and its matrix $A$ (with the standard ordered bases), and in particular, regard $T^{-1} = A^{-1}$ for invertible matrices $A$.
Corollary 1. A matrix $A$ is invertible if and only if it belongs to $\operatorname{GL}_n(\mathbb{F})$.
Let $n = 2$ for discussion purposes.
Lemma 3. Multiplying row $1$ by the constant $c \neq 0$ is defined by the invertible matrix
$$E = \begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}.$$
Proof. To show that we multiply the row, we carry out the matrix multiplication:
$$\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ d & e \end{pmatrix} = \begin{pmatrix} ca & cb \\ d & e \end{pmatrix}.$$
We can similarly verify that the inverse of $E$ is
$$E^{-1} = \begin{pmatrix} c^{-1} & 0 \\ 0 & 1 \end{pmatrix},$$
left as an exercise. $\square$
Lemma 4. Adding $c$ times row $2$ to row $1$ is defined by the invertible matrix
$$E' = \begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}.$$
Swapping row $1$ with row $2$ is defined by the invertible matrix
$$E'' = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Proof. Exercise. Check that
$$(E')^{-1} = \begin{pmatrix} 1 & -c \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad (E'')^{-1} = E''. \qquad \square$$
The matrices of the forms in Lemmas 3 and 4 are called elementary matrices, which can be defined similarly for $m \times m$ matrices, as well as for operations between different rows.
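As a quick sanity check on Lemmas 3 and 4, the following Python sketch (my own illustration, using exact rational arithmetic and a sample matrix $A$) builds the three kinds of $2 \times 2$ elementary matrices and verifies both the row operations they perform and their claimed inverses.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

c = Fraction(3)
scale = [[c, Fraction(0)], [Fraction(0), Fraction(1)]]           # multiply row 1 by c
scale_inv = [[1 / c, Fraction(0)], [Fraction(0), Fraction(1)]]

add = [[Fraction(1), c], [Fraction(0), Fraction(1)]]             # add c times row 2 to row 1
add_inv = [[Fraction(1), -c], [Fraction(0), Fraction(1)]]

swap = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(0)]]  # swap rows 1 and 2

A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]     # sample matrix

assert matmul(scale, A) == [[3, 6], [3, 4]]   # row 1 scaled by 3
assert matmul(add, A) == [[10, 14], [3, 4]]   # row 1 += 3 * row 2
assert matmul(swap, A) == [[3, 4], [1, 2]]    # rows swapped
assert matmul(scale, scale_inv) == I2
assert matmul(add, add_inv) == I2
assert matmul(swap, swap) == I2               # the swap is its own inverse
```

Each inverse undoes its operation: for example, applying `add` and then `add_inv` returns any matrix to where it started.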
Lemma 5. For any matrix $A$ with reduced row-echelon form $D$, there exists an invertible matrix $P$ such that $PA = D$. In particular, for vectors $\mathbf{x} \in \mathbb{F}^n$, $\mathbf{b} \in \mathbb{F}^m$,
$$A\mathbf{x} = \mathbf{b} \iff D\mathbf{x} = P\mathbf{b}.$$
This procedure of converting a matrix into its reduced row-echelon form is called Gaussian elimination (or more precisely, Gauss-Jordan elimination).
Proof. Thanks to the matrices in Lemmas 3 and 4, there exists a sequence of elementary matrices $E_1, E_2, \ldots, E_k$ such that
$$E_k \cdots E_2 E_1 A = D.$$
Denoting $P = E_k \cdots E_2 E_1$, which is invertible, $PA = D$. $\square$
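The proof of Lemma 5 is constructive, and translates directly into code. Below is a minimal Python sketch of Gauss-Jordan elimination over the rationals (my own illustration, run on a sample $2 \times 3$ matrix): every row operation applied to $A$ is mirrored on an identity matrix, so the accumulated matrix $P$ is precisely the product of the corresponding elementary matrices, and satisfies $PA = D$.

```python
from fractions import Fraction

def gauss_jordan(A):
    """Return (P, D): D is the reduced row-echelon form of A and P A = D.

    Row operations applied to A are mirrored on an identity matrix,
    so P is the product of the corresponding elementary matrices.
    """
    m, n = len(A), len(A[0])
    D = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(i == j) for j in range(m)] for i in range(m)]
    pivot_row = 0
    for col in range(n):
        # Find a row (at or below pivot_row) with a nonzero entry in this column.
        r = next((i for i in range(pivot_row, m) if D[i][col] != 0), None)
        if r is None:
            continue  # non-pivot column
        D[pivot_row], D[r] = D[r], D[pivot_row]        # swap rows
        P[pivot_row], P[r] = P[r], P[pivot_row]
        c = D[pivot_row][col]
        D[pivot_row] = [x / c for x in D[pivot_row]]   # scale pivot row
        P[pivot_row] = [x / c for x in P[pivot_row]]
        for i in range(m):                             # eliminate the column elsewhere
            if i != pivot_row and D[i][col] != 0:
                f = D[i][col]
                D[i] = [x - f * y for x, y in zip(D[i], D[pivot_row])]
                P[i] = [x - f * y for x, y in zip(P[i], P[pivot_row])]
        pivot_row += 1
    return P, D

def matmul(P, A):
    return [[sum(P[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(P))]

A = [[1, 1, 1], [1, 3, 5]]   # sample matrix
P, D = gauss_jordan(A)
assert matmul(P, A) == D
assert D == [[1, 0, -1], [0, 1, 2]]
```

Since $P$ is a product of elementary matrices, it is invertible by Lemmas 3 and 4.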
We have come to the fundamental theorem of invertible matrices.
Theorem 1. Let $A$ be an $n \times n$ matrix. The following are equivalent.
- $A$ is invertible.
- The solution to $A\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$.
- The reduced row-echelon form of $A$ is $I_n$.
- $A$ can be expressed as a product of elementary matrices.
Proof. If $A$ is invertible, then $A$ is injective, so $A\mathbf{x} = \mathbf{0} = A\mathbf{0}$ forces $\mathbf{x} = \mathbf{0}$, and the second proposition holds.
Suppose the solution to $A\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$. We claim that the reduced row-echelon form $D$ of $A$ has no zero rows and no non-pivot columns.

If $D$ has a zero row, then $D$ has at most $n - 1$ pivot columns, so that some column of $D$ is a non-pivot column. Hence, by the argument below, $A\mathbf{x} = \mathbf{0}$ has a non-trivial solution, a contradiction.
Suppose $D$ has a non-pivot column, say column $j$. If it is the first column, it must be $\mathbf{0}$, and so $\mathbf{e}_1$ will be a non-trivial solution to $D\mathbf{x} = \mathbf{0}$, a contradiction.

If instead it is not the first column, then the first column is $\mathbf{e}_1$. Suppose column $j$ equals $c_1\mathbf{e}_1 + \cdots + c_k\mathbf{e}_k$, where $\mathbf{e}_1, \ldots, \mathbf{e}_k$ are the pivot columns to its left, for some nonzero $c_1$ without loss of generality. Then the entries of row $1$ will resemble
$$\begin{pmatrix} 1 & 0 & \cdots & 0 & * & \cdots & * & c_1 & * & \cdots \end{pmatrix},$$
with $c_1$ in column $j$. This implies that $x_1 = -c_1 x_j - \cdots$, and $x_j$ can be regarded as a free parameter $t \in \mathbb{F}$ so that
$$x_1 = -c_1 t - \cdots.$$
In particular, choosing $t = 1$ yields a nonzero $\mathbf{x}$ such that $D\mathbf{x} = \mathbf{0}$, which yields $A\mathbf{x} = \mathbf{0}$, a contradiction.
Since $D$ has no zero rows or non-pivot columns, all $n$ columns are pivot columns. Therefore,
$$D = \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_n \end{pmatrix} = I_n.$$
Suppose the reduced row-echelon form of $A$ is $I_n$, which is an elementary matrix. By the proof in Lemma 5, there exists a sequence of elementary matrices $E_1, \ldots, E_k$ such that
$$E_k \cdots E_1 A = I_n.$$
Since inverses of elementary matrices are themselves elementary matrices,
$$A = E_1^{-1} \cdots E_k^{-1}$$
is a product of elementary matrices. Since each elementary matrix is invertible, so is $A$, as required. $\square$
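The last equivalence can be illustrated numerically. The Python sketch below (a hypothetical example of mine, not from the post) assembles a matrix as a product of $2 \times 2$ elementary matrices, and then inverts it by composing the inverses of those factors in reverse order, exactly as in the proof above.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

# Elementary factors: scale row 1 by 2, add 3 * row 2 to row 1, swap the rows.
E1 = [[Fraction(2), Fraction(0)], [Fraction(0), Fraction(1)]]
E2 = [[Fraction(1), Fraction(3)], [Fraction(0), Fraction(1)]]
E3 = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(0)]]

# Their inverses, from Lemmas 3 and 4.
E1_inv = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1)]]
E2_inv = [[Fraction(1), Fraction(-3)], [Fraction(0), Fraction(1)]]
E3_inv = E3

A = matmul(E1, matmul(E2, E3))                   # A = E1 E2 E3
A_inv = matmul(E3_inv, matmul(E2_inv, E1_inv))   # A^{-1} = E3^{-1} E2^{-1} E1^{-1}

assert matmul(A, A_inv) == I2
assert matmul(A_inv, A) == I2
```

Note the reversal of order when inverting: undoing a sequence of row operations means undoing the last one first.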
Analysing pivot columns is itself interesting, since we can generalise the arguments above to deduce the necessary sizes of the matrices.
Corollary 2. Let $A$ be an $m \times n$ matrix.
- If $A$ is injective, then $n \leq m$.
- If $A$ is surjective, then $m \leq n$.
Proof. Let $D$ denote the reduced row-echelon form of $A$. Write $D = PA$ for some invertible matrix $P$, and let $r$ denote the number of pivot columns of $D$. We have the following relationships:
$$r \leq m \quad \text{and} \quad r \leq n,$$
since each row of $D$ contains at most one pivot, and each pivot occupies its own column. If $A$ is injective, then $D$ has no non-pivot columns (by the argument in Theorem 1), so $r = n$ and thus $n \leq m$. On the other hand, we claim that if $A$ is surjective, then $D$ has no zero rows, so that every row of $D$ contains a pivot, giving $r = m$ and thus $m \leq n$. We will prove this claim by contrapositive.
Suppose $D$ has at least one zero row, say row $i$. Then the equation $D\mathbf{x} = \mathbf{e}_i$ has no solution, since row $i$ yields $0 = 1$ for any $\mathbf{x}$. Therefore, the equation $A\mathbf{x} = P^{-1}\mathbf{e}_i$ has no solution. In other words, there exists a vector, namely $P^{-1}\mathbf{e}_i$, such that for any $\mathbf{x}$, $A\mathbf{x} \neq P^{-1}\mathbf{e}_i$. Therefore, $A$ is not surjective. $\square$
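Corollary 2 can also be seen numerically by counting pivots. The Python sketch below (my own, with hypothetical example matrices) computes the reduced row-echelon form over the rationals and counts its pivot columns; since $r \leq \min(m, n)$, injectivity ($r = n$) forces $n \leq m$ and surjectivity ($r = m$) forces $m \leq n$.

```python
from fractions import Fraction

def rref(A):
    """Reduced row-echelon form of A over the rationals (pure Python)."""
    m, n = len(A), len(A[0])
    D = [[Fraction(x) for x in row] for row in A]
    p = 0  # next pivot row
    for col in range(n):
        r = next((i for i in range(p, m) if D[i][col] != 0), None)
        if r is None:
            continue  # non-pivot column
        D[p], D[r] = D[r], D[p]                      # swap rows
        D[p] = [x / D[p][col] for x in D[p]]         # scale pivot row
        for i in range(m):                           # eliminate elsewhere
            if i != p and D[i][col] != 0:
                f = D[i][col]
                D[i] = [x - f * y for x, y in zip(D[i], D[p])]
        p += 1
    return D

def num_pivots(A):
    """Number of pivot columns, i.e. nonzero rows of the RREF."""
    return sum(1 for row in rref(A) if any(x != 0 for x in row))

# A tall 3x2 matrix that is injective: every column is a pivot column (r = n = 2).
tall = [[1, 0], [0, 1], [1, 1]]
assert num_pivots(tall) == 2

# A wide 2x3 matrix that is surjective: every row contains a pivot (r = m = 2).
wide = [[1, 1, 1], [1, 3, 5]]
assert num_pivots(wide) == 2
```

The pivot count $r$ is exactly the quantity bounded in the proof above, so both inequalities of Corollary 2 are visible in these two examples.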
Now that we have established the relative sizes of matrices given their injectivity or surjectivity status, we are finally ready to discuss dimensions. Roughly speaking, a linearly independent set corresponds to an injective matrix, while a spanning set corresponds to a surjective matrix. The dimension counts will then follow accordingly. But this post is getting as lengthy as it needs to be, and so we delay this discussion to next time.
—Joel Kindiak, 1 Mar 25, 1431H