Lecture 19

Diagonalization of Matrices

We have already seen that when a square matrix describes the change in a population distribution over a fixed period of time, the positive integer powers of that matrix give the population distributions after multiple time intervals. Similar statements can be made about a variety of input-output situations in which the unit-time transition is described by the action of a square matrix. Unfortunately, for a general matrix of any reasonably large size, the number of computations involved in forming its successively higher powers is enormous.

On the other hand, we do know matrices whose powers are easy to compute: the diagonal ones. As is easily verified, if D is a diagonal matrix with diagonal entries $d_1, \dots, d_n$, then $D^k$ is the diagonal matrix with diagonal entries the corresponding powers $d_1^k, \dots, d_n^k$.
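
As a quick computational check (not part of the original lecture notes), the following NumPy sketch verifies this for one arbitrarily chosen diagonal matrix and exponent:

```python
import numpy as np

# Sketch: powers of a diagonal matrix are computed entrywise.
# The diagonal entries and the exponent k below are arbitrary illustrative choices.
d = np.array([2.0, -1.0, 0.5])   # diagonal entries d_1, ..., d_n
D = np.diag(d)
k = 6

Dk_by_multiplication = np.linalg.matrix_power(D, k)   # repeated matrix multiplication
Dk_entrywise = np.diag(d ** k)                        # entrywise powers d_i^k on the diagonal

print(np.allclose(Dk_by_multiplication, Dk_entrywise))  # True
```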

Are there more general matrices whose powers can be computed this easily? Suppose A is a square matrix which is similar (in the sense defined before) to a diagonal matrix D; this means that $A = PDP^{-1}$ for some invertible matrix P. Conveniently,

$$A^k = (PDP^{-1})(PDP^{-1}) \cdots (PDP^{-1}) = PD^kP^{-1}.$$

Thus a high power of a matrix that is similar to a diagonal matrix is almost as easy to compute as the same power of the diagonal matrix itself.
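
As a sketch of how this is used (the matrices P and D below are my own illustrative choices, not an example from the lecture), one can compare the two ways of computing a power:

```python
import numpy as np

# A = P D P^{-1} for an arbitrarily chosen invertible P and diagonal D.
P = np.array([[1.0, 1.0],
              [1.0, 2.0]])             # invertible: its determinant is 1
D = np.diag([3.0, -2.0])
P_inv = np.linalg.inv(P)
A = P @ D @ P_inv
k = 8

Ak_by_multiplication = np.linalg.matrix_power(A, k)            # k - 1 matrix multiplications
Ak_via_diagonalization = P @ np.diag(np.diag(D) ** k) @ P_inv  # only the diagonal entries are powered

print(np.allclose(Ak_by_multiplication, Ak_via_diagonalization))  # True
```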

Def. An n x n matrix A is called diagonalizable if $A = PDP^{-1}$ for some diagonal n x n matrix D and some invertible n x n matrix P.

Example

Example: a 2 x 2 matrix which is not diagonalizable.
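
The example worked in the lecture is not reproduced in these notes. One standard illustration (an assumption on my part, not necessarily the matrix used in class) is the shear matrix below: its only eigenvalue is 1, with multiplicity 2, but its eigenspace is one-dimensional, so there is no basis of $\mathbb{R}^2$ consisting of its eigenvectors.

```python
from sympy import Matrix

# Illustrative (not necessarily the lecture's example): a 2 x 2 matrix that is not diagonalizable.
A = Matrix([[1, 1],
            [0, 1]])

# eigenvects() returns triples (eigenvalue, algebraic multiplicity, eigenspace basis).
print(A.eigenvects())         # one eigenvalue 1 of multiplicity 2, but only one basis eigenvector
print(A.is_diagonalizable())  # False
```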

How can we recognize whether a matrix is diagonalizable or not? To do so, we must consider how to find the diagonal matrix D and the invertible matrix P, if A is diagonalizable. We are looking for P invertible and D diagonal satisfying $A = PDP^{-1}$, or equivalently AP = PD. Write P in terms of its columns:

$$P = [p_1 \ \cdots \ p_n].$$

Notice that the column vectors form a basis of $\mathbb{R}^n$, because P is invertible. Now, AP can be written in terms of its columns as

$$AP = [Ap_1 \ \cdots \ Ap_n].$$

On the other hand, if the diagonal entries of D are $d_1, \dots, d_n$, then clearly

$$PD = [d_1p_1 \ \cdots \ d_np_n].$$

Since AP = PD, the matrices AP and PD have the same columns, so each column satisfies $Ap_k = d_kp_k$; that is, each $p_k$ is an eigenvector of A with eigenvalue $d_k$. In other words, if A is diagonalizable, then automatically the diagonal matrix D to which it is similar has eigenvalues of A as its diagonal entries, and the invertible matrix P which diagonalizes A has the corresponding eigenvectors as its columns. In fact, these statements provide a complete characterization of the diagonalizable matrices.

Diagonalization Theorem. An n x n matrix A is diagonalizable if and only if there is a basis of $\mathbb{R}^n$ consisting of eigenvectors of A. Moreover, $A = PDP^{-1}$ with P invertible and D diagonal if and only if the diagonal entries of D are eigenvalues of A and the columns of P are corresponding eigenvectors.

We have already seen that if A is diagonalizable, then D and P have the given properties, and the columns of P provide a basis of $\mathbb{R}^n$ consisting of eigenvectors of A. On the other hand, suppose $\mathbb{R}^n$ has such a basis, and we form the corresponding matrix P with the basis vectors as its columns. Then AP = PD as above, where D has the corresponding eigenvalues as its diagonal entries. Since the columns of P form a basis, P is invertible, so $A = PDP^{-1}$ and A is diagonalizable.

Examples
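
The examples from the lecture are not reproduced here. As one illustrative sketch (the matrix is my own choice), the theorem can be checked numerically: NumPy returns the eigenvalues and a matrix whose columns are eigenvectors, and these play the roles of D and P.

```python
import numpy as np

# Illustrative matrix (my own choice) with distinct eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
D = np.diag(eigenvalues)            # eigenvalues on the diagonal, in the matching order

# Since the two eigenvectors form a basis of R^2, the theorem gives A = P D P^{-1}.
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```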

How can we determine, given an n x n matrix A, whether we can or cannot find a basis of $\mathbb{R}^n$ consisting of eigenvectors? We already know an algorithm for finding the eigenvalues and eigenvectors, and we know that the number of linearly independent eigenvectors associated with an eigenvalue is at least one and at most the multiplicity of that eigenvalue. Moreover, since the multiplicities of all of the (real) eigenvalues add up to at most n, if there are fewer than n real eigenvalues (counted with multiplicity), or if one of the associated eigenspaces has dimension less than the multiplicity of the corresponding eigenvalue, then the total number of linearly independent eigenvectors is smaller than n, and they do not form a basis of $\mathbb{R}^n$. On the other hand, if there are n real eigenvalues (counted with multiplicity), and the dimension of each eigenspace equals the multiplicity of the corresponding eigenvalue, then we can find a total of n eigenvectors, in linearly independent subsets corresponding to the distinct eigenvalues. We will see below that the n vectors in the union of these linearly independent subsets form a linearly independent set, and hence form a basis of $\mathbb{R}^n$. This means that we know exactly when a matrix is diagonalizable.

Test for Diagonalizability
An n x n matrix A is diagonalizable if and only if both of the following conditions hold:

(a) the characteristic polynomial of A has n real roots (not necessarily distinct);
(b) the dimension of the eigenspace corresponding to each eigenvalue is equal to the multiplicity of the eigenvalue.
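
A minimal sketch of this test using exact (SymPy) arithmetic; the helper function and the two test matrices are my own illustrative choices, not part of the lecture:

```python
from sympy import Matrix

def is_diagonalizable_over_R(A: Matrix) -> bool:
    """Apply conditions (a) and (b) of the test to a square SymPy matrix."""
    n = A.rows
    # eigenvects() returns triples (eigenvalue, algebraic multiplicity, eigenspace basis).
    real_triples = [t for t in A.eigenvects() if t[0].is_real]
    roots_are_real = sum(mult for _, mult, _ in real_triples) == n                 # condition (a)
    eigenspaces_full = all(len(basis) == mult for _, mult, basis in real_triples)  # condition (b)
    return roots_are_real and eigenspaces_full

print(is_diagonalizable_over_R(Matrix([[1, 1], [0, 1]])))  # False: eigenspace dimension 1 < multiplicity 2
print(is_diagonalizable_over_R(Matrix([[4, 1], [2, 3]])))  # True: two distinct real eigenvalues
```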

Corollary
If the characteristic polynomial of an n x n matrix A has n distinct real roots, then A is diagonalizable.

(Indeed, if the n real roots are distinct, then each has multiplicity one, and we know that each eigenspace then has dimension exactly one, so both conditions of the test hold.)

Examples
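
Again the lecture's examples are not reproduced here; one illustrative instance (my own choice) is a triangular matrix, whose eigenvalues are its diagonal entries. Since those entries are distinct, the corollary guarantees diagonalizability without any further computation:

```python
import numpy as np

# Illustrative: an upper triangular matrix with distinct diagonal entries 1, 2, 3.
# Its eigenvalues are exactly those entries, so the corollary applies.
A = np.array([[1.0, 4.0, 7.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)
print(np.sort(eigenvalues))  # approximately [1. 2. 3.], i.e. three distinct real eigenvalues
print(np.allclose(A, P @ np.diag(eigenvalues) @ np.linalg.inv(P)))  # True: A is diagonalizable
```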

The only point in the proof of the diagonalizability test which we have not yet established is the statement that, if $S_1, \dots, S_k$ are linearly independent subsets of eigenspaces corresponding to distinct eigenvalues $\lambda_1, \dots, \lambda_k$ of the n x n matrix A, then their union is linearly independent. To see this, suppose that the vectors in the union were linearly dependent. Then we can find a "shortest" dependence relation (one with the fewest nonzero coefficients), and it must involve vectors from at least two of the different eigenspaces (a relation involving vectors from a single $S_i$ alone would contradict the linear independence of $S_i$), say the first and the last. Then, with the obvious notation, we have

$$c_1v_1 + \cdots + c_kv_k = 0.$$

Hence $A(c_1v_1 + \cdots + c_kv_k) = 0$, and using the fact that these vectors are eigenvectors we have

$$\lambda_1c_1v_1 + \cdots + \lambda_kc_kv_k = 0.$$

We multiply the earlier equation by $\lambda_k$ and then subtract the last equation from the result, leaving

$$(\lambda_k - \lambda_1)c_1v_1 + \cdots + (\lambda_k - \lambda_k)c_kv_k = 0.$$

The last coefficient is zero, but the first coefficient $(\lambda_k - \lambda_1)c_1$ is not (the eigenvalues are distinct and $c_1 \ne 0$), so we are left with a nontrivial dependence relation shorter than the one we had! Since we started with the shortest possible relation, the original supposition of linear dependence must be false, and the result is proved.
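
As a concrete instance of this cancellation (the numbers are chosen purely for illustration), take k = 2 with $\lambda_1 = 2$ and $\lambda_2 = 5$: if $c_1v_1 + c_2v_2 = 0$, then applying A gives $2c_1v_1 + 5c_2v_2 = 0$, while multiplying the original relation by 5 gives $5c_1v_1 + 5c_2v_2 = 0$; subtracting, $(5 - 2)c_1v_1 = 0$, so $c_1 = 0$, and then $c_2v_2 = 0$ forces $c_2 = 0$ as well.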
