Lecture 8

Inverses and Elementary Matrices

Inverses

We have defined a multiplication which, for square matrices of the same size, yields an output which is again a square matrix of the same size. We have already seen that in this case there is a multiplicative identity. It is thus natural to ask whether a square matrix has a multiplicative inverse.

Def. An n x n matrix A is invertible if there is an n x n matrix B such that AB = BA = I_n. In this case, B is called an inverse of A.

Examples: diagonal matrices with and without inverses; rotation matrices
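These examples can be checked numerically. The following is a sketch assuming NumPy is available; the particular matrices and angle are chosen only for illustration.

```python
import numpy as np

# A diagonal matrix is invertible exactly when every diagonal entry is
# nonzero; its inverse is the diagonal matrix of reciprocals.
D = np.diag([2.0, 5.0, -3.0])
D_inv = np.diag([1 / 2.0, 1 / 5.0, -1 / 3.0])
assert np.allclose(D @ D_inv, np.eye(3))
assert np.allclose(D_inv @ D, np.eye(3))

# A diagonal matrix with a zero on the diagonal has no inverse:
# np.linalg.inv(np.diag([2.0, 0.0, -3.0])) would raise LinAlgError.

# Rotation by angle t is inverted by rotation by -t.
t = 0.7
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
R_inv = np.array([[np.cos(-t), -np.sin(-t)],
                  [np.sin(-t),  np.cos(-t)]])
assert np.allclose(R @ R_inv, np.eye(2))
assert np.allclose(R_inv @ R, np.eye(2))
```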

Uniqueness Theorem
If A is an invertible matrix, then it has only one inverse. (This inverse will then be denoted A^-1.)

To see this, note that if both B and C are inverses to A, then we have

C = C I_n = C(AB) = (CA)B = I_n B = B.


Solving Systems of Equations: If A is invertible, and Ax = b, then x = A^-1 b.

Indeed, A^-1 Ax = A^-1 b and the result follows. Notice, of course, that this method of "solving" the system of equations only applies if the number of equations and unknowns is the same and, in addition, the matrix of coefficients of the system is invertible. But many other types of systems can also be solved - though not by this particular method.
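A small numerical sketch of this method, assuming NumPy; the system shown is made up for illustration.

```python
import numpy as np

# Solve Ax = b by forming the inverse, as in the text: x = A^-1 b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.inv(A) @ b
assert np.allclose(A @ x, b)   # x = [0.8, 1.4] satisfies the system
```

In numerical practice, np.linalg.solve(A, b) is preferred over forming A^-1 explicitly, since it is faster and more stable; the version above mirrors the formula in the text.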

Properties of Inverses Let A and B be invertible n x n matrices. Then
(a) A^-1 is invertible, and (A^-1)^-1 = A.
(b) AB is invertible, and (AB)^-1 = B^-1 A^-1. (Backwards!)
(c) A^T is invertible, and (A^T)^-1 = (A^-1)^T.

To see these results, notice first that (A^-1)A = A(A^-1) = I_n and so, by definition, A inverts A^-1. Next, (B^-1 A^-1)(AB) = B^-1 (A^-1 A)B = B^-1 B = I_n and similarly (AB)(B^-1 A^-1) = I_n, so B^-1 A^-1 inverts AB. Finally, (A^-1)^T A^T = (A A^-1)^T = I_n^T = I_n and similarly A^T (A^-1)^T = (A^-1 A)^T = I_n, so (A^-1)^T inverts A^T.
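The three properties can be spot-checked numerically. This sketch assumes NumPy; the random matrices are made strictly diagonally dominant (an assumption of the example) so that they are guaranteed to be invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Adding n*I makes each matrix strictly diagonally dominant, hence invertible.
A = rng.random((n, n)) + n * np.eye(n)
B = rng.random((n, n)) + n * np.eye(n)

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

# (a) (A^-1)^-1 = A
assert np.allclose(np.linalg.inv(A_inv), A)
# (b) (AB)^-1 = B^-1 A^-1  -- note the reversed order
assert np.allclose(np.linalg.inv(A @ B), B_inv @ A_inv)
# (c) (A^T)^-1 = (A^-1)^T
assert np.allclose(np.linalg.inv(A.T), A_inv.T)
```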

Example A similar argument shows that if all of the matrices involved are invertible, then

(A_1 A_2 . . . A_k)^-1 = A_k^-1 . . . A_2^-1 A_1^-1.
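The reverse-order rule can be checked for a product of three matrices (a sketch assuming NumPy; as above, the matrices are shifted to be diagonally dominant so that they are invertible):

```python
import numpy as np

rng = np.random.default_rng(1)
# Three invertible 3 x 3 matrices (diagonally dominant by construction).
A1, A2, A3 = (rng.random((3, 3)) + 3 * np.eye(3) for _ in range(3))

lhs = np.linalg.inv(A1 @ A2 @ A3)
rhs = np.linalg.inv(A3) @ np.linalg.inv(A2) @ np.linalg.inv(A1)
assert np.allclose(lhs, rhs)   # inverses multiply in reverse order
```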


Elementary Matrices

Row operations on an m x n matrix can also be viewed as multiplication (on the left) of that matrix by certain other m x m matrices. In each case, the matrix which performs a given row operation can be found by applying that row operation to the m x m identity matrix. We will use "E" to denote one of these matrices, generically. Then what we are saying is: for any of the three row operations

(1) interchange row i and row j,
(2) multiply row i by a nonzero constant c, or
(3) replace row j with (row j) + k (row i),

let E be the result of applying that operation to the identity matrix. Then, for any m x n matrix A, the matrix product EA yields the matrix with the corresponding row operation performed on A.

Def. An elementary matrix E is one obtained from I_m by any of the three row operations, as above.

Theorem
Elementary matrices are invertible, and their inverses are again elementary matrices.

Indeed, the inverses are clearly seen to be, respectively, the elementary matrices corresponding to interchanging row j and row i, multiplying row i by (1/c), and replacing row j with (row j) - k (row i).
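The correspondence between row operations and elementary matrices, and the form of their inverses, can be illustrated as follows (a sketch assuming NumPy; the sizes and constants are arbitrary):

```python
import numpy as np

m = 3
I = np.eye(m)

# Build each elementary matrix by applying the row operation to I.
E_swap = I[[1, 0, 2]]        # interchange rows 0 and 1

E_scale = np.eye(m)          # multiply row 1 by c = 4
E_scale[1, 1] = 4.0

E_add = np.eye(m)            # replace row 2 with row 2 + 5 * row 0
E_add[2, 0] = 5.0

A = np.arange(12.0).reshape(3, 4)   # any 3 x 4 matrix

# Left-multiplying by E performs the row operation on A.
assert np.allclose(E_swap @ A, A[[1, 0, 2]])

# The inverses are the elementary matrices of the reversing operations.
assert np.allclose(np.linalg.inv(E_swap), E_swap)        # swap again
assert np.allclose(np.linalg.inv(E_scale)[1, 1], 0.25)   # scale by 1/c
assert np.allclose(np.linalg.inv(E_add)[2, 0], -5.0)     # subtract 5 * row 0
```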

Theorem Let A be an m x n matrix with reduced row echelon form R. Then there exists an invertible m x m matrix P with PA = R.

We construct P out of the elementary matrices E_1, . . . , E_k that reduce A to R via steps 1 through k, in that order. Then R = E_k · · · E_1 A. If we set P = E_k · · · E_1, then P is the invertible matrix such that PA = R, and its inverse is P^-1 = E_1^-1 · · · E_k^-1.
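For a concrete instance of this construction (a sketch assuming NumPy; the matrix and the sequence of elementary matrices are hand-chosen for this small example, not produced by a general reduction routine):

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 3.0]])

E1 = np.array([[0.5, 0.0],    # multiply row 0 by 1/2
               [0.0, 1.0]])
E2 = np.array([[1.0, 0.0],    # row 1 <- row 1 - row 0
               [-1.0, 1.0]])
E3 = np.array([[1.0, -2.0],   # row 0 <- row 0 - 2 * row 1
               [0.0, 1.0]])

P = E3 @ E2 @ E1              # accumulate the steps in order
R = P @ A
assert np.allclose(R, np.eye(2))   # this invertible A reduces to I

# P^-1 is the product of the inverses in reverse order.
assert np.allclose(np.linalg.inv(P),
                   np.linalg.inv(E1) @ np.linalg.inv(E2) @ np.linalg.inv(E3))
```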

Application
Let S = { u_1, . . . , u_m } be a set of vectors in R^m. Let A be the matrix with these vectors as its columns, that is, A = [u_1 . . . u_m]. Let PA = R be the reduction of A to reduced row echelon form. Then R has as its columns a collection of standard vectors (we will use subscripted vectors f for these) and has additional columns which are sums of multiples of those unit vectors (denoted by subscripted vectors g). If, for example, we have

R = [f_1 f_2 g_3 f_4 f_5 g_6],

then from the reduced row echelon form we know that there are constants such that

g_3 = a_1 f_1 + a_2 f_2, and

g_6 = b_1 f_1 + b_2 f_2 + b_4 f_4 + b_5 f_5.

But we know that P[u_1 . . . u_6] = [f_1 f_2 g_3 f_4 f_5 g_6], and comparing this equation column by column shows that

Pu_1 = f_1, Pu_2 = f_2, Pu_3 = g_3, Pu_4 = f_4, Pu_5 = f_5, and Pu_6 = g_6.

By linearity we then have

Pu_3 = g_3 = a_1 f_1 + a_2 f_2 = a_1 Pu_1 + a_2 Pu_2 = P(a_1 u_1 + a_2 u_2),

and therefore (multiplying the far left and far right hand sides of this equation by P^-1), we conclude that

u_3 = a_1 u_1 + a_2 u_2.

Similarly

u_6 = b_1 u_1 + b_2 u_2 + b_4 u_4 + b_5 u_5.

In other words, the same linear relationship (meaning the same coefficients in corresponding places!) is satisfied by the original columns of the matrix, as by the columns of the reduced row echelon form of the matrix. We can state this result formally, since it is clearly true in general, not just in our example.

Linear Correspondence Property Any linear relationship among the columns of a reduced row echelon form matrix corresponds to the same linear relationship among the corresponding columns of the original matrix.
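The property can be demonstrated computationally. The reduction routine below is a minimal sketch written for this illustration (not a library function, and not necessarily the algorithm used in class); NumPy is assumed, and the example columns are constructed so that u_3 = 2 u_1 + 3 u_2 by design.

```python
import numpy as np

def rref(A, tol=1e-12):
    """Reduce A to reduced row echelon form by Gaussian elimination
    using the three elementary row operations (a minimal sketch)."""
    R = A.astype(float).copy()
    rows, cols = R.shape
    r = 0
    for c in range(cols):
        piv = r + np.argmax(np.abs(R[r:, c]))   # pivot at or below row r
        if abs(R[piv, c]) < tol:
            continue                             # no pivot in this column
        R[[r, piv]] = R[[piv, r]]                # interchange rows
        R[r] /= R[r, c]                          # scale the pivot row
        for i in range(rows):                    # clear the rest of the column
            if i != r:
                R[i] -= R[i, c] * R[r]
        r += 1
        if r == rows:
            break
    return R

# Columns with a known linear relationship: u3 = 2*u1 + 3*u2.
u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([0.0, 1.0, 1.0])
u3 = 2 * u1 + 3 * u2
A = np.column_stack([u1, u2, u3])

R = rref(A)
# The same relationship, with the same coefficients, holds in R.
assert np.allclose(R[:, 2], 2 * R[:, 0] + 3 * R[:, 1])
```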

Examples
