Gaussian elimination is not the be-all and end-all in practice, especially for solving linear systems with a fixed coefficient matrix and various "right-hand sides." We will examine a different procedure, an alternative to repeated Gaussian elimination, and we will analyze the relative efficiency of the two approaches, as well as that of the "matrix inversion" approach.
To bring a matrix into reduced row echelon form, all three types of elementary row operations may be necessary; but to bring a matrix merely to row echelon form (not necessarily reduced), we do not need the operation "multiply a single row by a nonzero constant." We will now consider matrices for which we do not need row interchanges, either. In fact, if the matrix corresponds to a linear system, then the row interchanges correspond to reordering the equations, which could have been done before we started. As a result, though we will only consider matrices that do not require interchanges, our method can easily be modified to handle any linear system.
Thus we consider matrices $A$ which can be brought to row echelon form using only row operations of the single type "replace row $j$ with row $j + c \cdot$ row $i$," for $c$ a scalar and $i < j$. Denote the row echelon form of $A$ by the matrix $U$: then every entry below and to the left of its "diagonal" entries (that is, the entries of the form $u_{ii}$) is zero. (Such a matrix is called "upper triangular.") And we have elementary matrices $E_1, \ldots, E_k$ such that
$$E_k \cdots E_1 A = U;$$
if we set $L = E_1^{-1} \cdots E_k^{-1}$, then
$$A = LU.$$
If we examine any of the elementary matrices $E_j$, we see that it has ones on the diagonal, one nonzero entry below the diagonal, and all other entries zero. The same holds for its inverse (which has the negative of the original nonzero entry below the diagonal, in its place). It is easy to see that any product of such matrices, such as the matrix $L$ above, has all of its nonzero entries on or below the diagonal; such matrices are called lower triangular. In fact, the diagonal entries of $L$ are clearly all equal to 1, and its entries below the diagonal can be obtained from the entries of the individual matrices $E_j^{-1}$.
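For instance, in the $3 \times 3$ case, the elementary matrix $E$ performing "replace row 2 with row $2 + c \cdot$ row 1," and its inverse, are
$$E = \begin{pmatrix} 1 & 0 & 0 \\ c & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad E^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -c & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$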
Examples
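For instance (with a sample matrix chosen here for illustration), reducing
$$A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 3 & 3 \\ 8 & 7 & 9 \end{pmatrix}$$
by the operations $R_2 \to R_2 - 2R_1$, $R_3 \to R_3 - 4R_1$, and then $R_3 \to R_3 - 3R_2$ yields
$$U = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}, \qquad L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 3 & 1 \end{pmatrix},$$
and one checks directly that $LU = A$. Note that the multipliers $2$, $4$, $3$ used in the elimination reappear, with their signs restored, as the subdiagonal entries of $L$.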
The LU Decomposition of a matrix $A$
Suppose that the $m \times n$ matrix $A$ can be brought into row echelon form $U$, an upper triangular matrix, using only row operations of the form "replace row $j$ by (row $j + c \cdot$ row $i$)" (for $c$ a scalar and $i < j$). Then $A = LU$, for $L$ the matrix formed from $I_m$ using the sequence of inverses of the elementary operations reducing $A$ to $U$. $L$ is a lower triangular matrix, with ones on the diagonal.
In practice, we can keep track of these operations by operating side by side on $A$ and on $I_m$, being sure to use the inverse operation (adding instead of subtracting the corresponding row multiple, and vice versa) on $I_m$.
Examples
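As a computational illustration, here is a minimal sketch of this bookkeeping in Python with NumPy (the function name and sample matrix are ours; it assumes the square case in which every pivot encountered is nonzero, so that no row interchanges are needed):

```python
import numpy as np

def lu_no_pivot(A):
    """LU decomposition without row interchanges.

    Assumes every pivot encountered is nonzero, so that no row
    interchanges are needed.  Returns (L, U) with L unit lower
    triangular and U upper triangular, so that L @ U equals A.
    """
    U = np.array(A, dtype=float)     # will be reduced to row echelon form
    n = U.shape[0]
    L = np.eye(n)                    # start from the identity I_n
    for i in range(n):               # pivot column i
        for j in range(i + 1, n):    # rows below the pivot row
            c = U[j, i] / U[i, i]    # multiplier in "row j - c * row i"
            U[j, :] -= c * U[i, :]   # the elementary operation, applied to A
            L[j, i] = c              # the inverse operation, recorded in L
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)
```

Recording the multiplier $c$ in position $(j, i)$ of $L$ is exactly the "inverse operation on $I_m$" described above: the operation subtracted $c \cdot$ row $i$ from $A$, so its inverse adds $c \cdot$ row $i$ back.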
Now, what if the matrix $A$ with this LU decomposition came from the linear system $Ax = b$? We have $LUx = b$, and if we set $y = Ux$, this is the same as $Ly = b$. Solving $Ux = y$ is what we have referred to as "back substitution" before, and similarly, solving $Ly = b$ is called "forward substitution" (the variables are determined from first to last, rather than from last to first). The combination of these two procedures, first solving $Ly = b$ for $y$ by forward substitution and then solving $Ux = y$ for $x$ by back substitution, completes the solution of the original problem.
Examples
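For instance, here is a sketch of the two triangular solves in the same setting (function names ours; $L$ and $U$ are the factors of the sample matrix $A$ from the earlier example):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b, with L lower triangular (nonzero diagonal).
    The unknowns are determined from first to last."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve U x = y, with U upper triangular (nonzero diagonal).
    The unknowns are determined from last to first."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The factors of the sample matrix A from the earlier example:
L = np.array([[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [4.0, 3.0, 1.0]])
U = np.array([[2.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]])
A = L @ U
b = np.array([4.0, 10.0, 24.0])

y = forward_substitution(L, b)   # forward substitution: L y = b
x = back_substitution(U, y)      # back substitution:    U x = y
assert np.allclose(A @ x, b)     # x solves the original system
```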
We will analyze the complexity of three approaches to solving a system $Ax = b$ with a fixed coefficient matrix $A$, and with many different values of $b$. Consider the case in which $A$ is an invertible $n \times n$ matrix. Notice that in the decomposition $A = LU$, we then have both $L$ and $U$ invertible. We have three methods we can consider: (1) repeatedly computing the reduced row echelon form of the augmented matrices $[A \; b]$ for the different values of $b$; (2) finding the inverse $A^{-1}$ using Gaussian elimination on $[A \; I_n]$ and then computing the solutions $x = A^{-1} b$ for the different values of $b$; (3) finding the LU decomposition of $A$, and then solving $Ly = b$ and $Ux = y$ (that is, $y = L^{-1} b$ and $x = U^{-1} y$) by forward and back substitution. We can compare the relative efficiencies of these methods by counting the number of arithmetical operations (additions, subtractions, multiplications, and divisions) used by each method. These are called "flops" in computing (floating point operations), and their total is called the flop count.
The approximate flop counts, in the $n \times n$ case considered here, are as follows:
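Counting every addition, subtraction, multiplication, and division, the standard estimates are roughly:
- reducing $[A \; b]$ to reduced row echelon form: about $n^3$ flops per right-hand side (about $\tfrac{2}{3} n^3$ if we stop at row echelon form and back-substitute);
- computing $A^{-1}$ from $[A \; I_n]$: about $2 n^3$ flops, plus about $2 n^2$ flops for each product $A^{-1} b$;
- computing the decomposition $A = LU$: about $\tfrac{2}{3} n^3$ flops, plus about $2 n^2$ flops for each pair of triangular solves (about $n^2$ for the forward solve and $n^2$ for the back solve).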
If we are interested in solving the linear system for $N$ choices of $b$, notice that we have the following approximate counts:
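Combining the estimates above:
- repeated row reduction: about $n^3 N$ flops;
- matrix inversion: about $2 n^3 + 2 n^2 N$ flops;
- LU decomposition: about $\tfrac{2}{3} n^3 + 2 n^2 N$ flops.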
If $n$ and $N$ are large, clearly the second and third methods are much more efficient than the first, since the $n^3$ work is done only once rather than once per right-hand side; and the third method is much more efficient than the second when $n$ is large, since computing $A^{-1}$ costs about three times as much as computing the LU decomposition.
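This "factor once, solve repeatedly" organization is exactly how numerical libraries handle many right-hand sides. A minimal sketch using SciPy (whose routines, unlike our discussion above, also perform row interchanges for numerical stability):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, N = 200, 1000
A = rng.standard_normal((n, n))
bs = rng.standard_normal((N, n))           # N different right-hand sides

lu, piv = lu_factor(A)                     # O(n^3) work, done once
xs = [lu_solve((lu, piv), b) for b in bs]  # O(n^2) work per right-hand side

for b, x in zip(bs, xs):
    assert np.allclose(A @ x, b)
```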