We have seen that orthogonal matrices have a particularly convenient property: their inverses are their transposes (and hence are substantially easier to compute than inverses in general). We have also used inverses in an important computational context: diagonalizing a matrix, when that is possible. When can that diagonalization be achieved using an orthogonal matrix and its inverse? Conveniently, there is a simple way of recognizing when this is the case, though the proof is not simple.
Recall that a square matrix $A$ is symmetric if $A = A^T$.
Theorem. An $n \times n$ matrix $A$ is symmetric if and only if there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$. If this is the case, then there is an orthogonal matrix $Q$ and a diagonal matrix $D$ such that $A = QDQ^T$.
Indeed, if there is such an orthonormal basis of $\mathbb{R}^n$, then we already know that $A = QDQ^{-1}$ for $Q$ the matrix whose columns are the given eigenvectors, and $D$ the diagonal matrix of eigenvalues. Since $Q$ is then orthogonal by definition, it follows that $A = QDQ^T$. And then
$$A^T = (QDQ^T)^T = (DQ^T)^T Q^T = QDQ^T = A,$$
so A is automatically symmetric in this case.
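As a numerical illustration of the direction just proved, here is a minimal sketch. It assumes NumPy, which the text itself does not mention, and the random orthogonal matrix and the diagonal entries 1, 2, 5 are hypothetical choices made only for the example. It builds $A = QDQ^T$ from an orthogonal $Q$ and a diagonal $D$, then checks that $A$ is symmetric and that the columns of $Q$ are eigenvectors of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Any orthogonal Q will do; here one is taken from the QR factorization
# of a random 3x3 matrix (an illustrative choice, not from the text).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# D holds the prescribed eigenvalues on its diagonal (hypothetical values).
D = np.diag([1.0, 2.0, 5.0])

A = Q @ D @ Q.T

is_orthogonal = np.allclose(Q.T @ Q, np.eye(3))        # Q^{-1} = Q^T
is_symmetric = np.allclose(A, A.T)                     # A^T = A
columns_are_eigenvectors = np.allclose(A @ Q, Q @ D)   # A q_i = d_i q_i

print(is_orthogonal, is_symmetric, columns_are_eigenvectors)
```

All three checks hold only up to floating-point rounding, which is why the comparisons use `np.allclose` rather than exact equality.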
On the other hand, suppose instead that we start with the assumption that $A$ is symmetric. According to the theorem, we should then be able to find an orthonormal basis for $\mathbb{R}^n$ consisting entirely of eigenvectors. The main difficulty is showing that the eigenvalues of $A$ are real.
Suppose we assume this for the moment. Pick an eigenvalue $\lambda$ for $A$, and a corresponding eigenvector $u$ of length 1. Let $Q$ be any orthogonal matrix whose first column is $u$, and consider the matrix $B = Q^T A Q$. Then $B$ is symmetric, since $B^T = (Q^T A Q)^T = Q^T A^T Q = Q^T A Q = B$. Furthermore, the first column of $B$ is $Q^T A u = Q^T \lambda u = \lambda e_1$ (where $e_1$ is the first standard basis vector), because $Q^T u = e_1$. Since $B$ is symmetric, the matrix $B'$ obtained by omitting the first row and column of $B$ is also symmetric, so proceeding inductively we may suppose that we have an orthogonal matrix $Q'$ which diagonalizes $B'$. Let $R$ be the orthogonal matrix whose first row and first column are the standard basis vector $[1, 0, \ldots, 0]$ (or its transpose), and with $Q'$ providing the remaining entries. Then $R^T B R$ is a symmetric matrix whose first column is $\lambda e_1$ (and, by symmetry, whose first row is $\lambda e_1^T$), and which on deletion of the first row and column leaves the diagonal matrix $(Q')^T B' Q'$. Thus $(QR)^T A (QR) = R^T Q^T A Q R = R^T B R = D$ is diagonal, and our claim is proved.
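The inductive argument above can be followed step by step in code. The sketch below is only an illustration under stated assumptions. It uses NumPy, and the helper names `deflate_once` and `diagonalize_by_induction` are hypothetical. For convenience, the single unit eigenvector needed at each step is taken from `numpy.linalg.eigh`, which already diagonalizes the matrix on its own; the point here is to mirror the structure of the proof (build $Q$, form $B = Q^T A Q$, recurse on $B'$, assemble $R$), not to offer a practical algorithm.

```python
import numpy as np

def deflate_once(A):
    """Return (Q, B) with Q orthogonal, first column a unit eigenvector u,
    and B = Q.T @ A @ Q, whose first column is approximately lambda * e1."""
    n = A.shape[0]
    # Convenience shortcut (an assumption of this sketch): take u from eigh.
    _, vectors = np.linalg.eigh(A)
    u = vectors[:, 0]
    # QR of [u | I] yields an orthogonal Q whose first column is +u or -u.
    Q, _ = np.linalg.qr(np.column_stack([u, np.eye(n)]))
    if Q[:, 0] @ u < 0:
        Q[:, 0] = -Q[:, 0]   # flipping one column keeps Q orthogonal
    return Q, Q.T @ A @ Q

def diagonalize_by_induction(A):
    """Return an orthogonal P such that P.T @ A @ P is diagonal."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return np.eye(1)                            # a 1x1 matrix is already diagonal
    Q, B = deflate_once(A)
    Q_prime = diagonalize_by_induction(B[1:, 1:])   # diagonalizes B'
    R = np.eye(n)
    R[1:, 1:] = Q_prime                             # first row and column are e1
    return Q @ R

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
P = diagonalize_by_induction(A)
D = P.T @ A @ P
print(np.round(D, 10))
```

The off-diagonal entries of `D` vanish only up to rounding error, so any check on them should use a tolerance.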
As stated, the tricky part is showing that the eigenvalues are real. For the sake of completeness, we give that calculation, which forces us to consider complex eigenvalues and eigenvectors. In that context, we need to make use of complex conjugation.
Reality Theorem
The eigenvalues of a symmetric matrix are real numbers.
Definition.
(1) If $a + bi$ is a complex number, then its complex conjugate is the number $a - bi$, in which $i$ is replaced by $-i$. If $\lambda$ is a complex number, we denote its complex conjugate by $\lambda'$.
(2) If $u$ is a complex vector, let $u'$ denote its complex conjugate (term by term).
Note.
(a) A complex number $\lambda$ satisfies the condition $\lambda = \lambda'$ if and only if it is real.
(b) Another important relation is $(\lambda\mu)' = \lambda'\mu'$ (where $\lambda$ and $\mu$ are complex numbers).
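Now we can make the following calculation. Let $A$ be a (real) symmetric matrix, let $\lambda$ be a complex eigenvalue for $A$, and let $u$ be a corresponding complex eigenvector, so that $Au = \lambda u$ with $u \neq 0$. Taking complex conjugates of both sides, and using relation (b) together with the fact that the entries of $A$ are real, gives $Au' = \lambda' u'$. Then
$$\lambda\,(u')^T u = (u')^T (Au) = (A^T u')^T u = (Au')^T u = (\lambda' u')^T u = \lambda'\,(u')^T u.$$
Since $(u')^T u = \sum_i |u_i|^2$ is a positive real number, we may divide by it to conclude that $\lambda = \lambda'$, so $\lambda$ is real by (a).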
This proves the Reality Theorem, which completes the proof that the symmetric matrices are exactly those which are diagonalizable via orthogonal matrices.
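The step on which the Reality Theorem rests is that a real symmetric $A$ can be moved across the product: $(u')^T (Au) = (Au')^T u$ for every complex vector $u$, where $u'$ is the conjugate of $u$. The following sketch checks this numerically. It assumes NumPy, and the random symmetric matrix and vector are hypothetical data. For contrast it also shows a non-symmetric matrix, rotation by 90 degrees, whose eigenvalues are $\pm i$.

```python
import numpy as np

rng = np.random.default_rng(1)

M = rng.standard_normal((4, 4))
A = M + M.T                      # a real symmetric matrix (hypothetical data)

# An arbitrary complex vector u and its term-by-term conjugate u'.
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
u_conj = np.conj(u)

# Symmetry lets A move across the product: (u')^T (A u) = (A u')^T u.
lhs = u_conj @ (A @ u)
rhs = (A @ u_conj) @ u
identity_holds = np.isclose(lhs, rhs)

# Consequently (u')^T A u equals its own conjugate, so it is real.
value_is_real = abs(lhs.imag) < 1e-9

# (u')^T u is the sum of |u_i|^2, a positive real number.
norm_sq = u_conj @ u

# Contrast: rotation by 90 degrees is not symmetric; its eigenvalues are +i and -i.
rotation = np.array([[0.0, -1.0],
                     [1.0,  0.0]])
rotation_eigenvalues = np.linalg.eigvals(rotation)

print(identity_holds, value_is_real, rotation_eigenvalues)
```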
An application: Quadratic forms in $\mathbb{R}^2$.
Although almost all high school algebra courses refer to the fact that quadratic forms in two variables all correspond to conic sections (circles, ellipses, parabolas, and hyperbolas), they usually deal only in passing with the general quadratic form
$$ax^2 + 2bxy + cy^2 + dx + ey + f = 0.$$
They instead consider the simpler case in which there is no $xy$ term, since then a simple process of "completing the squares" allows us to put the equation into a standard form, and determine which conic section it yields and where it is located. In this simpler case, the major axis is either horizontal or vertical. But we know that the general case must correspond to the simpler case after a matrix transformation of $\mathbb{R}^2$ corresponding to an appropriate rotation, and in fact symmetric matrices allow us to consider how to find this transformation.
We consider the quadratic form associated with the above expression (that is, the purely quadratic terms in the variables $x$ and $y$): $ax^2 + 2bxy + cy^2$. If we let
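$$A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} x \\ y \end{pmatrix},$$
then the quadratic form is exactly $Av \cdot v = v \cdot Av = v^T A v$. We know that we can diagonalize the symmetric matrix $A$ as $A = QDQ^T$. The quadratic form is then
$$v^T Q D Q^T v = v^T (DQ^T)^T Q^T v = (DQ^T v)^T Q^T v = D(Q^T v) \cdot (Q^T v).$$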
If $x'$ and $y'$ stand for the coordinates of $Q^T v$, and $\lambda_1$ and $\lambda_2$ are the diagonal elements of $D$, then this expression is exactly $\lambda_1 (x')^2 + \lambda_2 (y')^2$. In other words, the new variables $x'$ and $y'$ (which are determined from $x$ and $y$ via the orthogonal transformation $Q^T$ corresponding to the orthonormal eigenvectors of $A$) are ones in which the quadratic form is in the standard form, and the major axis is parallel to one of the two new orthogonal axes. The type of conic section (circle, ellipse, parabola, or hyperbola) is determined by the numbers $\lambda_1$ and $\lambda_2$, which are just the eigenvalues of the matrix $A$.
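To make the change of coordinates concrete, the sketch below works through the form $5x^2 + 4xy + 5y^2$ (so $a = 5$, $b = 2$, $c = 5$), an example chosen here for illustration and not taken from the text. It assumes NumPy, and the function name `classify_quadratic_form` and its tolerance are hypothetical. The labels it returns describe only the type suggested by the signs of the eigenvalues; they ignore the linear and constant terms and hence any degenerate or empty cases.

```python
import numpy as np

def classify_quadratic_form(a, b, c, tol=1e-9):
    """Classify a*x^2 + 2*b*x*y + c*y^2 from the eigenvalues of [[a, b], [b, c]]."""
    A = np.array([[a, b], [b, c]], dtype=float)
    lam1, lam2 = np.linalg.eigvalsh(A)          # real, in ascending order
    scale = max(abs(lam1), abs(lam2))
    if scale <= tol:
        return "zero form"
    if min(abs(lam1), abs(lam2)) <= tol * scale:
        return "parabola"                       # one eigenvalue vanishes
    if lam1 * lam2 < 0:
        return "hyperbola"                      # eigenvalues of opposite sign
    if abs(lam1 - lam2) <= tol * scale:
        return "circle"                         # equal eigenvalues
    return "ellipse"                            # same sign, distinct values

# Worked instance: 5x^2 + 4xy + 5y^2.
A = np.array([[5.0, 2.0],
              [2.0, 5.0]])
eigenvalues, Q = np.linalg.eigh(A)              # columns of Q: orthonormal eigenvectors
v = np.array([1.0, -3.0])                       # an arbitrary test point (x, y)
x_new, y_new = Q.T @ v                          # the new coordinates x', y'

form_old = v @ A @ v                            # v^T A v in the original coordinates
form_new = eigenvalues[0] * x_new**2 + eigenvalues[1] * y_new**2

print(classify_quadratic_form(5, 2, 5), eigenvalues, form_old, form_new)
```

For this matrix the eigenvalues are 3 and 7, with eigenvectors along $(1, -1)$ and $(1, 1)$, so the new axes are the old ones rotated by 45 degrees and the form becomes $3(x')^2 + 7(y')^2$.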
This result can be adapted to any number of dimensions. The quadratic form associated with a symmetric square matrix $A$ is $Av \cdot v = v \cdot Av$ (this expression is quadratic in the components of $v$). The diagonalization via an orthogonal transformation means that, in appropriately transformed coordinates, the quadratic form is a sum of multiples of squares of the new coordinates. The transformation is determined by the orthonormal eigenvectors of the symmetric matrix, and the multipliers in the sums of squares in the new coordinates correspond to the eigenvalues. The classical geometric questions involving conic sections in three dimensions are handled in a fashion similar to the two-dimensional case. The quadratic form associated with a symmetric square matrix in four dimensions has important applications to the theory of relativity.
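The same computation works in any dimension. As a small sketch, again assuming NumPy and using a hypothetical helper name and a hypothetical $3 \times 3$ example, the function below evaluates $v \cdot Av$ directly and as $\sum_i \lambda_i w_i^2$, where $w = Q^T v$ holds the coordinates of $v$ in the orthonormal eigenvector basis.

```python
import numpy as np

def quadratic_form_two_ways(A, v):
    """Return (direct, diagonal): v.T A v computed directly and as sum(lambda_i * w_i**2)."""
    A = np.asarray(A, dtype=float)
    v = np.asarray(v, dtype=float)
    eigenvalues, Q = np.linalg.eigh(A)   # A is assumed symmetric
    w = Q.T @ v                          # coordinates in the eigenvector basis
    direct = v @ A @ v
    diagonal = np.sum(eigenvalues * w**2)
    return direct, diagonal

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
v = np.array([1.0, 2.0, 3.0])

direct, diagonal = quadratic_form_two_ways(A, v)
print(direct, diagonal)
```

The two values agree up to rounding; here $Av = (0, 0, 4)$, so both equal 12.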