A sort of diary for Math 421:02, spring 2004
Part 2: linear algebra, the first half

The first half of linear algebra
Lecture #8: 2/12/2004
Lecture #9: 2/17/2004
Lecture #10: 2/19/2004
Lecture #11: 2/26/2004
Lecture #12: 3/2/2004
Lecture #13: 3/5/2004

Thursday, March 5
Questions from long ago
We began the linear algebra part of this course on February 12. I asked some questions which students in the class could not then answer with much certainty. Today I returned to these questions with the linear algebra sophistication (!) we've gained in the last three weeks. I would like to answer each question and give a brief discussion supporting each answer. The discussion will not be detailed, but I would like it to be persuasive enough.

Old question #1
If (0,0,0,0) and (1,2,4,0) are solutions of 7 linear equations in 4 unknowns, must there be other solutions?
Discussion and answer
In matrix form, the equations are AX=Y where Y is a 7 by 1 column vector and X is a 4 by 1 column vector. A is therefore a coefficient matrix which is 7 by 4. Since (0,0,0,0) is a solution, we learn that Y must be the 7 by 1 vector of 0's, and that this system is AX=0, a homogeneous system. The collection of solutions, S, of a homogeneous system is a subspace, in this case a subspace of R4. The dimension of S gives a rough measure of how big S is. Here the dimension of S is one of the integers 0 or 1 or 2 or 3 or 4. We're also told that (1,2,4,0) is a solution, that is, it is in S. Therefore S has at least dimension 1. And since S is a subspace, any scalar constant multiplying (1,2,4,0) must also be a solution. Therefore if t is any real number, (t,2t,4t,0) is a solution. The answer to the question, "Must there be other solutions?" is therefore, yes, there are infinitely many other solutions. I note also that the coefficient matrix, A, of the homogeneous system AX=0, which we know is 7 by 4, starts out able to have rank anywhere between 0 and 4. But there is a restriction on the rank (I messed this up in class!): the rank plus the dimension of S must be 4, the number of unknowns, and since the dimension of S is at least 1, the rank is at most 3. Let's just look at the extreme values for simplicity. If rank A = 0 then A is the 0 matrix, and certainly (t,2t,4t,0) solves AX=0. If rank A = 3, then the coefficient matrix could look like this (in RREF, with the four rows of 0's at the bottom omitted):

(1 0 -1/4 0)
(0 1 -1/2 0)
(0 0   0  1)
and this has exactly the vectors (t,2t,4t,0) as its solutions (take the free variable x3 equal to 4t). So the rank of A in this case can be anything from 0 to 3, but not 4.
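
By the way, Maple will confirm the rank 3 case; here is a quick check in the style of the sessions later in this diary (nullspace may scale its answer differently, but it will be a multiple of (1,2,4,0)):

with(linalg):
A := matrix(3,4,[1,0,-1/4,0, 0,1,-1/2,0, 0,0,0,1]):  # the rank 3 RREF above, zero rows omitted
nullspace(A);   # one basis vector, a scalar multiple of (1, 2, 4, 0)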

Old question #2
Can there be exactly 5 distinct solutions to a collection of 340 linear equations in 335 unknowns?
Discussion and answer
I inserted the italicized word distinct above. I think I meant "Could there be exactly 5 distinct solutions?", not just 5 solutions, which by linguistic indefiniteness (!) could turn out to be just one solution with 5 different names.
We have another system here, which again in matrix form could be written AX=Y where Y is a 340 by 1 column vector, and X is a 335 by 1 column vector of unknowns, and A must therefore be a coefficient matrix which is 340 by 335. A contains 340 times 335 = 113,900 numbers -- more than 100,000! I claim that the answer to the question is "No." I will attempt to convince you that the answer to the question is "No" by assuming that AX=Y has five distinct (pairwise unequal!) solutions and showing that there must be many others.
So suppose X1 and X2 and X3 and X4 and X5 all satisfy AX=Y. Then Mr. Obbayi, who I believe was thinking linearly, suggested that we look at the column vector W=X1-X2. What do we know about W? Well, AW=A(X1-X2)=AX1-AX2=Y-Y=0. Therefore this W is a solution to the associated homogeneous system. But each one of the Xi's is a particular solution to AX=Y. Therefore any of them plus a solution to the homogeneous system is a solution to AX=Y. Remember again that solutions to the homogeneous system form a subspace. Therefore X3+t(X1-X2) is a solution to AX=Y for any t. Hey: there are lots and lots of values of t, and since the solutions are distinct, X1-X2 is not 0, so different values of t give different vectors. Therefore we have created infinitely many different solutions to the original system. That's more than 5!

Old question #3
Must a system of 30 homogeneous linear equations in 40 unknowns have some non-zero solution?
Discussion and answer
So we have again AX=Y. Now Y is 30 by 1, and, in addition, Y is 0 (all zero entries) since this is a homogeneous system. X is 40 by 1 and A is 30 by 40: it has 1200 entries. We need to describe some aspects of the solution space of AX=0. This collection of X's is a subspace of R40. Initially, it could have any dimension, from 0 to 40. The answer to the question will be "Yes" and to give some supporting evidence we must show that the dimension is >0. Since this is a homogeneous system, we can hope that the RREF will contain some information. And since the system is homogeneous, we don't need to "augment" the coefficient matrix. So:

        (BLOCK OF 1's AND |   JJJJJJ   UU  UU  NN  N  K K)
        (0's SORT OF AN   |     JJ     UU  UU  N N N  KK )
A~~...~~(IDENTITY MATRIX  |   JJJJ     UUUUUU  N  NN  K K)
        (------------------------------------------------)
        (          MAYBE HERE SOME ROWS OF 0'S           )
The most important fact to remember here is that A is a 30 by 40 matrix: it is "wider" than it is "high". The rank of A is the number of 1's in the left-upper corner BLOCK. How big can the rank be? The largest it can possibly be is 30, the "height" of the matrix. Now consider the width. Since that is 40, the chunk I called JUNK (pretty poetry!) must be at least 40-30=10 columns wide. That means (if you think about the equations that the rows represent) at least 10 variables can be freely specified (uhh, x31 through x40, I think, as we usually number them) in the solution space of AX=0. So the solution space has dimension at least 10. Therefore there are "some" non-zero solutions, in fact, at least a 10-dimensional space of non-zero solutions.
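
If you'd like some experimental evidence, here is a quick Maple check in the style of the sessions later in this diary (randmatrix fills a matrix with random integers; the 30 and 40 match the question):

with(linalg):
A := randmatrix(30, 40):   # a random 30 by 40 integer matrix
rank(A);                   # almost surely 30, the largest possible value
nops(nullspace(A));        # the dimension of S: 40 minus the rank, so (almost surely) 10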

I wanted to revisit those questions so I could attempt to convince you that we've done something worthwhile in the last few weeks. Certainly part of what we've done is learn vocabulary ("spanning" and "subspace" and "linear combination" and "forsooth") but I hope we've also gotten enough understanding that we could answer some questions which seem formidable.

Mr. Marchitello kindly wrote his solution to yesterday's QotD, which will be useful in today's work. The system was

2x1+3x2=y1
5x1-1x2=y2
and here is the row reduction:
(2  3|y1)~(1  (3/2)|  (1/2)y1   )~(1 0|(1/17)y1+(3/17)y2)
(5 -1|y2) (0 -(17/2)|-(5/2)y1+y2) (0 1|(5/17)y1-(2/17)y2)
What can we learn from this computation? As I remarked, there are no "compatibility conditions" to be satisfied, so there is a solution for any choices of y1 and y2. But how many solutions are there? Well, again think about "a particular solution plus any solution of the associated homogeneous system". Here the associated homogeneous system is row equivalent to the 2 by 2 identity matrix,
(1 0) 
(0 1)
where there's no "junk" in the sense of the earlier discussion. The only solution of the associated homogeneous system is the trivial solution, (0,0). So for every choice of the pair (y1,y2) there is exactly one solution (x1,x2). That solution is given by those two linear combinations, so x1=(1/17)y1+(3/17)y2 and x2=(5/17)y1-(2/17)y2.

This situation is very nice and very important. I need to discuss it in general.
DEFINITION The n by n identity matrix has 1's on the diagonal and 0's elsewhere. I will call it In. It is both a left and right multiplicative identity for matrices where the matrix product is defined. For example, where n=3

   (1 0 0)
I3=(0 1 0)
   (0 0 1)
and if
  (a b c)
  (d e f)     (p q)
A=(g h i)   B=(r s)
  (j k l)     (t u)
  (m n o)
then you should do enough of the matrix multiplication to convince yourself that AI3=A and I3B=B.

DEFINITION Suppose A is a square matrix, say n by n. Then B is an inverse of A if AB and BA are both In. B is frequently written as A-1. If I know that B is the inverse of A, and if I want to solve the matrix equation AX=Y, then X=BY, very simple.
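
For example, using the 2 by 2 matrix from Mr. Marchitello's QotD (its inverse is computed just below), and a made-up right-hand side Y (my choice of (7,1) is purely for illustration), a Maple session like the ones elsewhere in this diary would run:

with(linalg):
A := matrix(2,2,[2,3,5,-1]):
B := inverse(A):       # B = A^(-1)
Y := vector([7,1]):    # an arbitrary right-hand side, made up for this demonstration
X := evalm(B &* Y);    # X = BY = (10/17, 33/17)
evalm(A &* X);         # returns (7, 1): AX=Y, as promised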

It is possible to find inverses "by hand" for small square matrices. Look:

(2  3|1 0)~(1   (3/2)| (1/2) 0)~(1 0|(1/17)  (3/17))
(5 -1|0 1) (0 -(17/2)|-(5/2) 1) (0 1|(5/17) -(2/17))
So
((1/17)  (3/17))
((5/17) -(2/17))
is a multiplicative inverse to
(2  3)
(5 -1)
You can check this claim very easily by multiplying the matrices. Why does this method "work"? Of course, I am just duplicating what Mr. Marchitello did above. The different columns are just place-holders for the variables y1 and y2. The inverse remarkably (?) appears because we are just writing out the coefficients for the linear combinations solving the original system of linear equations.
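
The multiplication check really is easy if Maple does it:

with(linalg):
A := matrix(2,2,[2,3,5,-1]):
B := matrix(2,2,[1/17,3/17,5/17,-2/17]):
evalm(A &* B);   # the 2 by 2 identity matrix
evalm(B &* A);   # also the 2 by 2 identity matrix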

Algorithm for an inverse and some ideas
Suppose A is an n by n matrix. We can attempt to get an inverse by doing the following: write the augmented matrix (A|In) and then use row operations on A, pushed through the whole n by 2n augmented matrix, to try to get the n by n identity matrix to appear on the left-hand side. If the result is (In|B), then B is guaranteed to be the inverse of A. From this we can deduce some facts and uses:

  • If A is an n by n matrix, then A will have an inverse exactly when A has rank=n.
  • If A is an n by n matrix, then A will have an inverse exactly when its rows are all linearly independent. Then the rows will be a basis of Rn.
  • If A has an inverse B, then the equation AX=Y will have exactly one solution, and that solution X is BY. That's because B(AX)=BY, but B(AX)=(BA)X=InX=X, so X must be BY.
I can also tell you that if A has rank less than n, then the rows of A are linearly dependent, and there must be Y's for which AX=Y has no solution (because in the row reduction algorithm, there will be a row of 0's on the left-hand side, which will lead to a compatibility condition on the right). Also the associated homogeneous system AX=0 will have some "junk" and that therefore there will be non-trivial solutions so even if AX=Y has a solution, it will always have infinitely many solutions. So when the rank is less than n, things happen. You might consider them lousy things or interesting things (or boring things!) but stuff happens. When the rank is as large as it can be (and, as I've written already a few times, matrices want to have large rank) nice things happen.
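
Here, as a sketch, is the algorithm handed over to Maple for the 2 by 2 example (augment glues In onto the right of A, and rref is Maple's name for the reduction to RREF):

with(linalg):
A := matrix(2,2,[2,3,5,-1]):
AI := augment(A, diag(1,1)):   # the augmented matrix (A|I2)
rref(AI);                      # left block: I2; right block: the inverse found earlier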

A silly (?) example
I wrote the following on the board. Suppose A is the matrix

( 7  1  3 -1)
( 6  3  0  2)
(-2  2  3  1)
( 6 -4 -1 -4)
and B is the matrix
( -13   9   17  12)
(  98 -67 -128 -90)
( -38  26   50  35)
(-108  74  141  99)
I assert that B is A-1. How would you verify this rather peculiar claim? (Yeah, I will tell you how I got this example, soon.) What should we do, and how difficult would the task be? We could multiply the matrices, which is not much work on a computer, since, as I have explained, matrix multiplication and vector dot products are directly supported by both software and hardware these days, and are very fast. By hand, we could pick out a row, say, in A, and dot it with a column in B. I forget which candidates I took in class, but let me try the third row in A and the third column in B.
      A             B, the candidate 
                         for  A-1 
( 7  1  3 -1)      ( -13   9   17  12)
( 6  3  0  2)      (  98 -67 -128 -90)
(-2  2  3  1)      ( -38  26   50  35)  
( 6 -4 -1 -4)      (-108  74  141  99)
We compute: (-2)(17)+(2)(-128)+(3)(50)+(1)(141)=-34-256+150+141=-290+291=1. Amazing and just as we hoped! The diagonal entries in the product of these matrices should all be 1. What about off-diagonal entries? The dot product of such a row vector and a column vector should be 0 (the vectors should be orthogonal or perpendicular). We could try again with a random (?) entry:
  
( 7  1  3 -1)      ( -13   9   17  12)
( 6  3  0  2)      (  98 -67 -128 -90)
(-2  2  3  1)      ( -38  26   50  35)  
( 6 -4 -1 -4)      (-108  74  141  99)
Again we compute: (-2)(9)+(2)(-67)+(3)(26)+(1)(74)=-18-134+78+74=-152+152=0. More amazing! This says that the (3,2) entry in the product (third row and second column) is 0. These vectors are perpendicular. "Clearly" -- no, not at all! I think most of this stuff is almost miraculous.
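
Rather than checking all 16 dot products by hand, we could ask Maple for the whole product at once (and for the determinant, which connects to the comments below):

with(linalg):
A := matrix(4,4,[7,1,3,-1, 6,3,0,2, -2,2,3,1, 6,-4,-1,-4]):
B := matrix(4,4,[-13,9,17,12, 98,-67,-128,-90, -38,26,50,35, -108,74,141,99]):
evalm(A &* B);   # should be the 4 by 4 identity matrix
det(A);          # should be 1: A is unimodular (see the comments below)
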
Comments
No, I am not a magician. I used Maple, but I must tell you that even with Maple's assistance producing this example was not exactly easy. First, the inverse of a matrix with integer entries usually involves rational numbers which are not integers. We have already seen that, with the wonderful (?) "17" example. Well, next week we shall see that there is an approach distinctly different from this algorithm which will produce the inverse of a matrix. There's an actual formula for the inverse, and the determinant of the matrix is a very important ingredient in the formula. By the way, deciding when to use the formula (which is explicit but can be very clumsy) and when to use the algorithm is difficult. So here I found a 4 by 4 matrix with integer entries whose determinant just "happens" to be 1. If you want a really wonderful word to thrill people at your next social encounter, try unimodular: a square matrix is unimodular if its determinant is 1. O.k., even with all this and with Maple, I had to work a bit to produce the example I showed you. It wasn't a lot of work, but it was some work.

Bad matrices: no inverses
I asked for the simplest example of a matrix which does not have an inverse. A good deal of discussion followed, but Mr. Dupersoy suggested the 1 by 1 example: (0). I agreed this example was certainly smallest. I asked for a larger example, and was given

(0 0) 
(0 0)
which I said wasn't too original. Then Mr. Cohen (I believe it was he) suggested
(0 0) 
(0 1)
which has rank 1 and since 1<2, this matrix has no inverse. Also
(0 1) 
(0 1)
has no inverse. Wow. Matrices which don't have inverses are sometimes called singular. Singular n by n matrices have rank less than n. Good matrices have high rank, and those square matrices whose rank is as high as possible are invertible (have inverses).
Please note that according to the dictionary, the word "rank" used as an adjective can mean "foul-smelling, offensive, loathsome, indecent, corrupt." Therefore some sentences using the word "rank" can be understood in a humorous way.

I was asked if only square matrices have inverses. In the context of this course and this text, my answer is "Yes". In fact, there are one-sided inverses for non-square matrices which some students may meet during computations. We won't see these ideas here.

The ideas I suspect M&AE students will need concern matrices with symbolic coefficients. This is so they will be able to do computations concerned with changing coordinates (and recognizing symmetries) more easily. The next two (and last two!) examples I considered are of this type.

Strange example #1
Suppose that a and b are real numbers, and G is the matrix

(0 a a 0 0)
(1 0 0 b b)
(0 1 0 0 0)
(0 0 2 2 2)
(0 0 1 1 1)
Question For which values of a and b is this 5 by 5 matrix invertible?
Answer O.k., this was wonderful in the sense of how distracting the a and b stuff could be. People looked and looked. Finally, I think Mr. Hunt remarked that the last two columns were identical, so the column rank was less than 5. Indeed, I agreed with him, and said that there were some important ideas in his statement. First, there is the fact, not at all obvious, that the row rank and the column rank are always the same. That is, the number of linearly independent rows equals the number of linearly independent columns. Second and easier, recognizing that column 4 = column 5 means the column rank is at most 4, and can't be 5, so that this matrix can never have an inverse, for any values of a and b! I have not discussed column rank etc. because this is mostly a row-oriented text, and we just don't have time for a full treatment of all of linear algebra.
Is there a "row argument" supporting the statement that the rank is less than 5? Well, yes, far away from the distraction of the a's and b's. Row 4 and row 5 are linearly dependent (one is a scalar multiple of the other) so that in row reduction we will certainly get a row of 0's. The rank will definitely be less than 5, so there are no values of a and b which would allow this matrix to have an inverse.
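
A one-line Maple confirmation, with a and b left symbolic: because row 4 is twice row 5, the determinant vanishes identically.

with(linalg):
G := matrix(5,5,[0,a,a,0,0, 1,0,0,b,b, 0,1,0,0,0, 0,0,2,2,2, 0,0,1,1,1]):
det(G);   # 0 for every a and b, so G is never invertible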

Strange example #2
Suppose that a and b are real numbers, and H is the matrix

(0 a a 0 0)
(1 0 0 b b)
(0 1 0 0 1)
(0 0 2 1 1)
(1 0 1 0 0)
Question For which values of a and b is this 5 by 5 matrix invertible?
Answer Here it is clear that if a is 0 then the rank is at most 4. What happens if a is not 0? I will use some row operations, beginning with exchanging rows so that I'll avoid the symbolic entries for as long as possible! (Avoid thinking! Avoid effort!)
Warning The following discussion is complicated.
Here I will begin by rearranging the rows, and then I clear up the first column and then the second column.
(0 a a 0 0) (1 0 0 b b) (1 0 0  b  b) (1 0 0  b  b)
(1 0 0 b b) (0 1 0 0 1) (0 1 0  0  1) (0 1 0  0  1)
(0 1 0 0 1)~(0 0 2 1 1)~(0 0 2  1  1)~(0 0 2  1  1)
(0 0 2 1 1) (1 0 1 0 0) (0 0 1 -b -b) (0 0 1 -b -b)
(1 0 1 0 0) (0 a a 0 0) (0 a a  0  0) (0 0 a  0 -a)
Now I'll try to clear up the third column, carefully. We need to be very careful with the symbolic entries.
(1 0 0  b  b) (1 0 0   b   b) (1 0 0      b      b)
(0 1 0  0  1) (0 1 0   0   1) (0 1 0      0      1)
(0 0 2  1  1)~(0 0 1 1/2 1/2)~(0 0 1    1/2    1/2)
(0 0 1 -b -b) (0 0 1  -b  -b) (0 0 0 -b-1/2 -b-1/2)
(0 0 a  0 -a) (0 0 a   0  -a) (0 0 0   -a/2  -3a/2)
I'll begin by exchanging the fifth and fourth rows (the algebra is simpler!), and here I need to assume that a is not equal to 0.
(1 0 0      b      b) (1 0 0      b      b) (1 0 0 0  -2b)
(0 1 0      0      1) (0 1 0      0      1) (0 1 0 0    1)
(0 0 1    1/2    1/2)~(0 0 1    1/2    1/2)~(0 0 1 0   -1)
(0 0 0   -a/2  -3a/2) (0 0 0      1      3) (0 0 0 1    3)
(0 0 0 -b-1/2 -b-1/2) (0 0 0 -b-1/2 -b-1/2) (0 0 0 0 2b+1)
We are almost done. I will assume now that 2b+1 is not equal to 0.
(1 0 0 0  -2b) (1 0 0 0 -2b) (1 0 0 0 0)
(0 1 0 0    1) (0 1 0 0   1) (0 1 0 0 0)
(0 0 1 0   -1)~(0 0 1 0  -1)~(0 0 1 0 0)
(0 0 0 1    3) (0 0 0 1   3) (0 0 0 1 0)
(0 0 0 0 2b+1) (0 0 0 0   1) (0 0 0 0 1)
So the rank is 5 exactly when both a and 2b+1 are not zero.

By the way, if these numbers are not zero, then H has an inverse. What is the inverse? Well, I tried working it out using our row reduction algorithm. The reductions got complicated, and I lost my eagerness to do the work. So ...

with(linalg):
H:=matrix(5,5,[0,a,a,0,0,1,0,0,b,b,0,1,0,0,1,0,0,2,1,1,1,0,1,0,0]);
                          [0    a    a    0    0]
                          [                     ]
                          [1    0    0    b    b]
                          [                     ]
                     H := [0    1    0    0    1]
                          [                     ]
                          [0    0    2    1    1]
                          [                     ]
                          [1    0    1    0    0]
inverse(H);
         [             1                   b          2 b   ]
         [  0       -------      0    - -------     ------- ]
         [          1 + 2 b             1 + 2 b     1 + 2 b ]
         [                                                  ]
         [             1                   b            1   ]
         [ 1/a      -------      0    - -------    - -------]
         [          1 + 2 b             1 + 2 b      1 + 2 b]
         [                                                  ]
         [              1                 b            1    ]
         [  0      - -------     0     -------      ------- ]
         [           1 + 2 b           1 + 2 b      1 + 2 b ]
         [                                                  ]
         [             3                 b - 1          3   ] 
         [ 1/a      -------     -1    - -------    - -------]
         [          1 + 2 b             1 + 2 b      1 + 2 b]
         [                                                  ]
         [            -1                   b            1   ]
         [ -1/a     -------      1      -------      -------]
         [          1 + 2 b             1 + 2 b      1 + 2 b]
         [                                                  ]
Notice how the conditions we know appear implicitly in the matrix. That is, there's division by a, so a shouldn't be 0, and division by 1+2b, so that shouldn't be 0 either. Maple can find symbolic matrix inverses, and Maple rarely gets hysterical!

The QotD was the following (with instructions phrased so that Mr. Malek would not be able to just write the answer with a calculator):
Use row operations and exact rational arithmetic to find the inverse of the matrix

(3 3  3)
(3 1  0)
(3 0 -1)
I remarked that this matrix was symmetric (that is, when you "reflect" it across the diagonal of 11, 22, 33 entries you get the same matrix), and it turns out that the matrix inverse will also be symmetric. Also the final answer will only involve fractions whose bottoms are 3 (yes, I fouled up and wrote 2 at first!).
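
After you've done the row reduction by hand, a Maple session like the earlier ones will confirm your answer (I won't spoil it here):

with(linalg):
A := matrix(3,3,[3,3,3, 3,1,0, 3,0,-1]):
inverse(A);   # symmetric, and any fractions which appear have 3 on the bottom
det(A);       # not 0, which is why the inverse exists at all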

Please read the textbook and hand in 6.7:1, 5 and 6.9: 7, 13.

Tuesday, March 2
We first reviewed the previous QotD with the help of Mr. Gradziadio, who had given nice, direct answers. The QotD asked for an analysis of the solutions to the homogeneous system whose coefficient matrix is
(-15 0 -42 -6 -21 -83)
( 30 0  84  0  24  -6)
(-15 0 -63  2  -8  27)
which is row equivalent to
(1 0 0 0 2/5  5/3) 
(0 0 1 0 1/7 -2/3)
(0 0 0 1  5   -2 )
There is a free variable because of the second column. The first "equation" implied by the first row is x1=-(2/5)x5-(5/3)x6. With this in mind Mr. Gradziadio observed that the coefficient matrix implied that there were 3 equations in 6 unknowns, and that the rank of the matrix was 3. The collection of vectors (x1,x2,x3,x4,x5,x6) which are solutions to the homogeneous system is a subspace. The dimension of this subspace is 3. A basis for this subspace is (0,1,0,0,0,0) (because of the column of 0's) and (-2/5,0,-1/7,-5,1,0) (put x5=1 and x6=0) and (-5/3,0,2/3,2,0,1) (put x5=0 and x6=1). A typical vector which solves the homogeneous system is x2 multiplied by the first vector, plus x5 multiplied by the second vector, plus x6 multiplied by the third vector. The easiest way to see this is by writing out the linear system abbreviated by AX=0. We can "see" that the three vectors named are linearly independent by looking at the second, fifth, and sixth components of the vectors: stacked together these components form I3, the 3 by 3 identity matrix, so the only linear combination of the three vectors equal to 0 has all three coefficients equal to 0.

Now, again, we begin: the study of solutions of inhomogeneous systems of linear equations. I first observed that non-linear equations are not too well-behaved. For example, even in one variable, the equation x2=a has 0 (a=-1) or 1 (a=0) or 2 (a=1) real solutions. There are lots of ways of understanding that, but all of the ways take some effort. It is partially my job to convince you that the case of linear systems, even in high dimensions, is actually much easier.

I started with a very detailed analysis of an example. The example was extremely carefully and delicately (?) chosen. Here is the system of equations:

2x1+1x2+1x3+4x4+4x5=y1
2x1+2x2-4x3+4x4+6x5=y2
2x1+0x2+6x3+4x4+2x5=y3
The questions I wanted to discuss included these:
For which triples (y1,y2,y3) will there be solutions of this specific system?
If there are solutions, can we describe the solutions in some systematic fashion?

As in the homogeneous case, changing to RREF will help a great deal. In this case we need to carry along the right-hand sides of the equations, so that we can interpret our final results in terms of them. We take the coefficient matrix and write another column, and call the result the augmented matrix. The row operations will result in a system of equations with the same collection of solutions, and maybe the solutions will be easier to understand after we are done (that will be true).

( 2  1  1  4  4 | y1)
( 2  2 -4  4  6 | y2)
( 2  0  6  4  2 | y3)
It is conventional to put some sort of separator between the coefficients and the "right-hand sides". Most texts use |. I challenged the class with an announcement. The system can be solved for exactly one of these vectors: (3,2,1) and (2,3,1). Which vector, why not the other one, and what are the solutions "like"?

Row reduce the coefficient matrix:

( 2  1  1  4  4 | y1) (1 1/2 1/2 2  2 |(1/2)y1) (1 0  3 2 1 | y1-(1/2)y2)
( 2  2 -4  4  6 | y2)~(0  1   -5 0  2 | y2-y1 )~(0 1 -5 0 2 | y2-y1     )
( 2  0  6  4  2 | y3) (0 -1    5 0 -2 | y3-y1 ) (0 0  0 0 0 | y3+y2-2y1 )
Well, the typesetting isn't perfect but I hope you get the idea. First, I would like to mention that maybe the coefficient matrix which is now in RREF might be a bit familiar. It is the first example I analyzed in detail last time.

The additional detail of the last column is what we will look at now. But what the heck does the last row mean? It abbreviates a linear equation, which I will write out in detail with the +'s and the variables:

0x1+0x2+0x3+0x4+0x5=y3+y2-2y1
When we dealt with homogeneous systems, such a row had little information. But now we see it imposes a condition that the y-variables must satisfy if the system of equations is to have a solution.

Compatibility conditions
I learned to call a condition like y3+y2-2y1=0 a compatibility condition (the dictionary says "compatibility" is "able to coexist; well-suited; mutually tolerant"). Your textbook says that a selection of the variables where the compatibility condition is satisfied makes the system consistent, and otherwise the system is inconsistent.
Recall the triples (3,2,1) and (2,3,1). The equation y3+y2-2y1=0 is not satisfied if y1=3 and y2=2 and y3=1: it becomes 1+2-2(3)=-3, which is not 0. Therefore the system has no solution when (y1,y2,y3)=(3,2,1). I can't easily "see" this fact just by looking at the original system. What about (2,3,1)? The compatibility condition becomes 1+3-2(2)=0, which is true. Be careful of a possible logical trap here: this by itself doesn't guarantee that the original system has a solution for (2,3,1). It just says that (2,3,1) passes this "test" or satisfies this criterion or something.
[The instructor discussed this point of logic in terms of driving strategy: "Slower cars should be on the right, so therefore (??!!) cars on the right should/must/might be slower ..." While this is certainly a false implication, it might be helpful under certain circumstances ...]

Finding a solution
So does the system have a solution when (y1,y2,y3)=(2,3,1)? Consider the system in RREF. The third equation is satisfied. The other two are

x1      =y1-(1/2)y2-3x3-2x4-1x5
      x2=   y2-y1  +5x3+0x4-2x5
I just want to get some solution of the equations for the vector (2,3,1). Here is one way: make x3=0 and x4=0 and x5=0. Plug in y1=2 and y2=3 and y3=1 on the right-hand side. Then x1=2-(1/2)(3)=1/2 and x2=3-2=1. Indeed (!) if you just "plug in" the values (1/2,1,0,0,0) into the original system:
2x1+1x2+1x3+4x4+4x5=y1
2x1+2x2-4x3+4x4+6x5=y2
2x1+0x2+6x3+4x4+2x5=y3
you will get 2(1/2)+1(1)+0's=2 (the desired value of y1) and you will get 2(1/2)+2(1)+0's=3 (the desired value of y2) and you will get 2(1/2)+0's=1 (the desired value of y3). This isn't magic but to me it seems quite close!

Other solutions
We experimented with other solutions. The students were revolted by my request that we think about whether there are solutions when x3=Pi and x4=sqrt(2) and x5=e. There are solutions, and they were easy to write. What is the "structure" of all solutions? Here matrix notation comes in handy. Suppose I know that AX=Y and AW=Y where Y is the vector (2,3,1) in column form. Then I know that AX-AW=Y-Y so (undistributing?) A(X-W)=0, and X-W is a solution of the associated homogeneous system which we had previously analyzed in detail. So if W is any particular solution of our system, such as the vector Z=(1/2,1,0,0,0), we can get the general solution by adding on any solution of the homogeneous system, which was the span (all linear combinations) of three vectors which I called u and v and w. So all solutions are exactly Z+au+bv+cw for any choice of numbers a and b and c.
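
Maple's linsolve command will even hand back this structure automatically: a particular solution plus free parameters. A sketch, with the system and right-hand sides from above:

with(linalg):
A := matrix(3,5,[2,1,1,4,4, 2,2,-4,4,6, 2,0,6,4,2]):
linsolve(A, vector([2,3,1]));   # a 5-vector; the free unknowns appear as _t parameters
                                # (the exact parameter names depend on the Maple version)
linsolve(A, vector([3,2,1]));   # returns nothing (NULL): the compatibility condition fails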

Digression back to 244
What are all solutions of the ODE y''+y=xex? (This is not an initial value problem -- I am just asking for some useful description of all solutions.) Well, y''+y=0 has as solutions a sin(x)+b cos(x). These are the collection of solutions of the associated homogeneous equation. Now, oh my goodness, we just guess that one solution of y''+y=xex is (1/2)xex-(1/2)ex. This is a particular solution. If you don't believe me, you can check by substituting in the equation: the second derivative of (1/2)xex-(1/2)ex is (1/2)xex+(1/2)ex, and adding the function itself gives xex. Then because this is a linear ODE, the general solution of y''+y=xex is (1/2)xex-(1/2)ex+a sin(x)+b cos(x), one particular solution plus the most general solution of the associated homogeneous equation.

What might be happening
Linear systems actually keep track of the "information" using a dimension count. The linear system we just analyzed takes vectors from R5 to R3. It takes 5 "chunks" of information in the 5-dimensional vector, and then it throws out 3 of them (the three chunks corresponding to the solutions of the homogeneous system). Then the output is a vector in R3 subject to the restriction y3+y2-2y1=0. This is a plane in R3, and it is a 2-dimensional subspace with basis (1/2,1,0) (with y2=1 and y3=0) and (1/2,0,1) (with y2=0 and y3=1).
We could almost draw a picture: input 5 chunks, throw out 3, output 2 chunks. The complication is that the coordinate systems are really forced by the structure of the matrix A, and don't have to correspond to what we might initially "like". In this case, the collection of allowable outputs in R3 is rather thin, and must satisfy the compatibility condition. I also should remember to mention Mr. Elnaggar for some meritorious (?) suggestion he made in connection with this example.

Another example, with a square coefficient matrix
I rapidly analyzed another example. This was a 4 by 4 system: 4 equations in 4 unknowns. This was the original system:

3x1+0x2-1x3+1x4=y1
0x1+1x2+0x3+1x4=y2
2x1+0x2+1x3+1x4=y3
0x1+2x2+0x3+1x4=y4
I was actually going to compute the row reduction but the clamor of the class prevented it.
clamor: n. 
1. loud or vehement shouting or noise.
2. a protest or complaint; an appeal or demand.
Instead I presented this result:
(3 0 -1 1 | y1)    (1 0 0 0 | (1/5)y1-(4/5)y2+(1/5)y3+(2/5)y4)
(0 1  0 1 | y2)~~~~(0 1 0 0 |              -y2+y4            )
(2 0  1 1 | y3)    (0 0 1 0 |-(2/5)y1-(2/5)y2+(3/5)y3+(1/5)y4)
(0 2  0 1 | y4)    (0 0 0 1 |              2y2-y4            )
It is only four "pivots" and not that much work!
Here is a general fact, which is difficult to state precisely but sure is true (trust me!): matrices want to be of maximal rank. So if we write a random matrix (as I did here, subject to my desire that the row reduction not be horrible) then it is likely that the rank will be 4. So here the rank is 4, and we verified it (or I did, the class declining to cooperate!).

What do we know? There is no compatibility condition. Any 4-tuple of y's has a solution: just plug in the y-coordinates on the right of the RREF and you'll get the corresponding x-coordinates. How many solutions are there? Well, that's the same as asking what the solutions of the associated homogeneous equation are. In this case, if all of the y's are 0, then all of the x's are 0. The subspace of homogeneous solutions is just the zero vector, {0}, and its dimension is 0. The only solution of the associated homogeneous system is the trivial solution. So this is a very nice case: all outputs in R4 are possible, and each output corresponds to exactly one input. This special case is very important, and we will learn an algebraic tool (the determinant) which may help us recognize when this occurs.
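
To double-check the arithmetic in that display (assuming I've typed the system in correctly), note that the four linear combinations on the right are exactly the rows of the inverse matrix:

with(linalg):
a := matrix(4,4,[3,0,-1,1, 0,1,0,1, 2,0,1,1, 0,2,0,1]):
inverse(a);   # rows (1/5,-4/5,1/5,2/5), (0,-1,0,1), (-2/5,-2/5,3/5,1/5), (0,2,0,-1)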

The QotD asked students to do row reduction of a 2 by 2 system:

2x1+3x2=y1
5x1-1x2=y2
The answer is
x1  =(1/17)y1+(3/17)y2
  x2=(5/17)y1-(2/17)y2
There was enormous complaining because the number 17 was seen. I regret this. (It is intentionally not clear whether I regret giving an example with 17 or whether I regret the complaints!)

I checked my answer with Maple:

with(linalg):
a:=matrix(2,2,[2,3,5,-1]);
                                 [2     3]
                            a := [       ]
                                 [5    -1]
inverse(a);
                            [1/17    3/17]
                            [            ]
                            [5/17   -2/17]
Isn't this EASY?

Please continue reading chapter 6.

Thursday, February 26
Here's an example of a homogeneous system of linear equations:
5a-6b+3c=0
This is a fairly puny system, especially compared to those you will encounter in both school and the real world. It has the essential aspects, though: some variables (a and b and c) entering only to the first power and being combined linearly with coefficients (5 and -6 and 3) and the linear combination is set equal to 0.

A more general system might involve N variables, x1, x2, ..., and xN. There might be a collection of M equations: the ith equation is ai1x1+ai2x2+...+aiNxN=0. Here the coefficients are the aij, and there are M times N of them. The equations all describe linear combinations of the variables, and we are asked to see what we can say when all of the linear combinations are zero: that's what makes this a homogeneous linear system.

We can rewrite this more compactly in matrix notation: AX=0. Here A is the M by N matrix whose ijth entry is aij and X is the N by 1 column matrix or a column vector with entry in row i equal to xi. Finally, 0 is the M by 1 column matrix whose entries are all zeros.
Minor comment about names: a (something by 1) matrix is frequently called a column matrix or a column vector, while a (1 by something) matrix might be called, of course, a row matrix or a row vector. In our text, the "unknowns" are almost always written in column matrix notation. Warning: other sources (textbooks, program descriptions, etc.) might make the unknowns row vectors, because everybody always has a new and better and different way to write and understand things.

Suppose S is the collection of all vectors in RN which are solutions of AX=0. What can one say about S? What is the "structure" of S?

In all cases, S has at least the vector of all 0's in it. That's because this is a homogeneous system. This is usually called the trivial solution of a homogeneous system. The interesting question to ask is whether there are solutions which are not the trivial solution. In applications, systems might represent long-term or steady-state responses of some circuit or springs or something, so we might want to know whether the mathematical model allows the possibility of other than a "neutral" (trivial?) steady-state.

O.k., what can one say about the solution set? Let's briefly look again at our puny example (my online dictionary defines puny as "undersized, weak"):
5a-6b+3c=0
Suppose we have a solution, a and b and c, of this system. (No, I don't care what the numbers are here -- my goal is the general features or structure right now).

  1. What about ta and tb and tc? Well, look:
    t·0=t(5a-6b+3c)=5(ta)-6(tb)+3(tc)
    So ta and tb and tc are again solutions: a scalar multiple of a solution is a solution.
  2. Suppose a1 and b1 and c1 are solutions, and also a2 and b2 and c2 are solutions. Then:
    5a1-6b1+3c1=0
    5a2-6b2+3c2=0
    ADD EQUATIONS
    (5a1-6b1+3c1)+(5a2-6b2+3c2)=0+0
    5(a1+a2)-6(b1+b2)+3(c1+c2)=0
    So the vector sum of solutions is a solution.
This should all look a bit familiar. The sum of vectors in S is in S, and scalar multiples of vectors in S are in S. Therefore (drum roll!) the collection of solutions of a homogeneous system of linear equations is a subspace. Well, again this could just be mathematical jargon. But it helps a little bit. For the general system we wrote above in matrix notation as AX=0 where A is an M by N matrix, S is inside RN, and we could really compute with it if we knew: what's the dimension of S, and what's a basis of S? The dimension is a number which gives some idea of the size of S, and, with a basis, we would have some way of "addressing" each vector in S in a unique way.

A friendly dialog on the street
As we walk down the street, we meet a stranger who asks various questions.
Question #1 "Yo! [My effort to be contemporary.] I gotta homogeneous system of 5 linear equations in 6 unknowns. Does this system gotta solution?"
Our answer "Esteemed stranger, surely you must know that such a system is solved by (0,0,0,0,0,0)."
Question #2 "Hey, hey, yeah, yea, I know. But ... what if (1,2,3,0,0,-1) is a solution? Hey, do ya know any others, and not that cheap, crummy solution?"
Our answer "Forsooth, friend met by chance: I know that (-1,-2,-3,0,0,1) is a solution. And, even, let us add, there are many, many other solutions, such as (5,10,15,0,0,-15), oh, so many that each star in the sky could have a separate one!"
(Dictionary says forsooth is old-fashioned and means "truly; in truth; no doubt.")
Question #3 "O.k., o.k., no stars, just now for some down to earth bling-bling which I'll give ya if you can tell me: suppose I know both (1,2,3,0,0,-1) and (5,0,2,2,1,1) are solutions of this system. Can you tell me if there is a solution which doesn't happen to be a scalar multiple of any solution I've told you?"
Our answer "Noble new acquaintance, our lives are enriched by your inquiries. Know that (6,2,5,2,1,0) is also a solution, for it is a vector sum of two known solutions of this homogeneous system. And (6,2,5,2,1,0) is not a multiple of (1,2,3,0,0,-1) or else it would have a 0 in the fourth coordinate. And also it cannot be a multiple of (5,0,2,2,1,1) for then it would have a 0 in the second coordinate. And, certainly, it isn't a multiple of (0,0,0,0,0,0). So we have produced what you requested."
Question #4 "Well, you've done well so far, here in this dark, quiet street. I have just one more question, just one more. Here in this book I have a list of all of the 98,000 solutions to a homogeneous system of 35,000 equations in 2,340 unknowns. What do you say to that?"
Our answer "No, that can't be so. Once there are more solutions than the trivial solution, there must be infinitely many solutions, and your book must therefore be an infinite sink of knowledge ... and even more, by your gradually improving grammar and the increasing subtlety of your questions, we know that you are the evil magician, The Linear Eliminator, whom all must loathe and hold in contempt!"
With a gigantic flash, the stranger disappears!
[This drama can be licensed at your local high school. Inquire about fees and rights.]

Actually, elimination (which means changing a matrix by row operations to reduced row echelon form [RREF]) will be the nicest thing. And I should specify that what the last question/answer revealed is a general dichotomy (dictionary: a division into two, esp. a sharply defined one.) worth remarking on: a homogeneous linear system either has only the trivial solution, OR it has lots and lots (infinitely many!) solutions.

I tried to give a really explicit example. I started writing a 3 by 5 matrix, and then remarked that we would eventually want to change it to RREF, so why don't we just start with the RREF. Here's our A:

( 1 0  3 2 1)
( 0 1 -5 0 2)
( 0 0  0 0 0)
I checked to make sure this matrix was in RREF, and it was. Now we proceed to

Analysis of the homogeneous system determined by the coefficient matrix, A
1. How many unknowns in each equation? 5 unknowns
2. How many equations does this represent? 3 equations
Comment Yeah, the third equation is rather silly. It is 0=0, or, more formally, 0x1+0x2+0x3+0x4+0x5=0. I would like to be organized here, and not ignore anything.
3. What is the rank of this matrix? The rank is 2
4. Find a solution of this system. (0,0,0,0,0).
Comment This solution, suggested by Mr. Ivanov, is so distinctive that I propose to call it the Ivanov vector. In fact, the Name-a-vector-registry will maintain a list of the vectors I designate forever, or maybe for just a few minutes.
5. Find another solution (a non-trivial solution!) of the system.
(-3,5,1,0,0)
Comment This is the Seale vector following its suggestion by Mr. Seale. I asked why this vector satisfies all of these equations. One way is to just "plug in" the suggested values, and, sure enough, they will satisfy the equations. But is there some "structure" underlying his example?
6. What is the dimension of the subspace S of R5 of solutions of this system? What is a basis of this subspace?

The core of the example with coefficient matrix A
The collection of equations is

1x1+0x2+3x3+2x4+1x5=0.
0x1+1x2-5x3+0x4+2x5=0.
0x1+0x2+0x3+0x4+0x5=0.
Therefore
x1      =-3x3-2x4-1x5 
      x2= 5x3+0x4-2x5
So if
(x1)
(x2)
(x3)
(x4)
(x5)
is a vector in S, we know the following:
(x1)   (-3x3-2x4-1x5)     (-3)      (-2)      (-1)
(x2)   ( 5x3+0x4-2x5)     ( 5)      ( 0)      (-2)
(x3) = (     1x3    ) = x3( 1) + x4 ( 0) + x5 ( 0)
(x4)   (     1x4    )     ( 0)      ( 1)      ( 0)
(x5)   (     1x5    )     ( 0)      ( 0)      ( 1)
I will call the first vector which appears on the right-hand side, u, and the second vector, v, and the third, w. There are several important observations to make about these vectors.
SPAN Every vector in S can be written as a linear combination of these three vectors.
LINEAR INDEPENDENCE These three vectors are linearly independent. Look at the last three coordinates. If some linear combination is equal to 0, then because of the 0-1 structure of the vectors, we see that x3 and x4 and x5 would all have to be 0. The only way a linear combination can be 0 is for all of the coefficients to be 0. Generally it might be a bit irritating to decide ("by hand") if three vectors in R5 were linearly independent.
VECTORS IN THE SUBSPACE HAVE COORDINATES Each vector in S has a unique description written as a sum of the three vectors. In effect, the numbers x3 and x4 and x5 serve as a unique address for any vector in S once we know u and v and w. These vectors form a basis for S and the dimension of S is 3.
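
Maple's nullspace command reports a basis of S directly; it should agree with u and v and w, possibly after some reordering and rescaling:

with(linalg):
A := matrix(3,5,[1,0,3,2,1, 0,1,-5,0,2, 0,0,0,0,0]):
nullspace(A);   # three basis vectors for S, matching u, v, w up to scaling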

Another example
Consider this coefficient matrix for a homogeneous system of linear equations:

(1 0 0 0 0 3 5  7 -1)
(0 1 0 0 0 2 1  0  0)
(0 0 0 1 0 2 5 -1  1)
(0 0 0 0 1 5 0  2  1)
There's an additional little "wrinkle" here: the third column of 0's. Here S will again mean the collection of all solutions of the system of equations. We want to understand S well, so if needed we can compute with it.
1. How many unknowns in each equation? 9 unknowns
2. How many equations does this represent? 4 equations
3. What's the rank of this system? The rank is 4.
4. What additional complication does that "all 0's" column represent? The vector (0,0,1,0,0,0,0,0,0) is in S.
Comment This vector was suggested by Ms. Kohut and will therefore be called the Kohut vector. Any multiple of the vector solves AX=0. It represents one dimension in S. What is or what should be the dimension of S? At first, since S is inside R9, we see that the dimension is some integer from 0 to 9. The Kohut vector (!) certainly shows that 0 isn't correct. And we can see that 9 isn't correct if we can find at least one vector which isn't in S, such as (1,0,0,0,0,0,0,0,0). But ...
5. What's a basis of S? Think about it ...
Comment This is supposed to be the payoff of the whole lecture. You are supposed to see that every vector in S can be written as
x3(0,0,1,0,0,0,0,0,0)+ x6(-3,-2,0,-2,-5,1,0,0,0)+ x7(-5,-1,0,-5,0,0,1,0,0)+ x8(-7,0,0,1,-2,0,0,1,0)+ x9(1,0,0,-1,-1,0,0,0,1)
This needs some thinking about and I strongly urge you to think about it. The collection of these 5 vectors in R9 does give a basis of S.
6. What's the dimension of S? Its dimension is 5.
You maybe should notice that the matrix is 4 by 9, its rank is 4, and the dimension of S is 5. In our previous example, the matrix was 3 by 5, its rank was 2, and the dimension of S was 3.
Advertisement: 4+5=9 and 3+2=5.

The QotD was to analyze similarly the collection S of solutions of the linear homogeneous system given by the matrix

(-15 0 -42 -6 -21 -83)
( 30 0  84  0  24  -6)
(-15 0 -63  2  -8  27)
There was some distress voiced almost immediately about this. In order to make people happier, I declared that I would respond to one computational request from the students. I heard from Mr. Seale, "What is that in RREF?" So I responded
(1 0 0 0 2/5  5/3) 
(0 0 1 0 1/7 -2/3)
(0 0 0 1  5   -2 )
(Secret: I started with this and did row operations to get the one I first wrote on the board!)

Please continue reading chapter 6. Please hand in 6.5: 1, 5, 13 on Tuesday.

The first exam results
I returned the graded exams and answers. Here is a version of the exam, and here are some answers. A discussion and some statistical analysis of the grading is here.

As I remarked, the grades were incredibly bimodal, with a standard deviation of 22.78, very large in my experience (wide dispersion from the mean on this scale). In general, math courses are intensely cumulative, and students who do poorly on the first exam tend to do poorly in the whole course. This course has three components. There certainly are connections (for example, problem 6 on the first exam, and other connections of linear algebra with Fourier series which will become apparent) but the connections are not overwhelming. Therefore I believe students are more likely than in most math courses to be able to have much improved achievement on the other exams in this course. Since I don't believe that students who have gone through technical majors and persisted into the junior and senior years are likely to be weak, I respectfully suggest that allocation of time and effort are likely to be related to very poor performance. So I recommend practice.

Tuesday, February 24
We had a wonderful time with the exam. More information to follow.

I can't stay so late next time. (A few of us stayed until almost 10!). I keep five giant mutant rottweilers (200 pounds each) and they got so hungry that they ate the refrigerator.

On Wednesday, I went to a nice presentation about Schlumberger. They are looking for mechanical engineering undergrads who want to be field engineers. This is a real employment opportunity for such people.

     Exam in 1 week, on Tuesday, February 24

     Please consider the following material.
 

Office hours/review for the exam

I will be in my office (Hill 542, on the fifth floor of Hill Center) and available for 421 questions on Monday, February 23, from 5:30 to 7:30. If more than a few students show up, I'll find a classroom and put a note on my office door.

It's also likely that I will be in my office and available on Tuesday afternoon but please confirm this then with e-mail or a 'phone call.

Thursday, February 19
The instructor urged students to prepare for the exam. (Actually the instructor nearly threw a tantrum, but this is undignified to report.) Also the instructor or I or me remarked that I am mostly interested in the ideas of linear algebra here, because almost always (I hope) we will have computers to do the arithmetic.

How can we solve systems of linear equations? This is a difficult question which is central to many applications. I'll break it up into three parts.
Numerical solution of linear equations
Google lists 407,000 (four hundred and seven thousand!) responses to the query "numerical linear algebra". There are steps in the solution of linear equations (division!) where numerical things (such as accuracy) can get really tricky. There are even examples of 2 by 2 and 3 by 3 matrices where computing the inverse is unstable numerically. I certainly won't worry about this here (my examples will mainly be to illustrate the logic and will almost always involve "small" rational numbers). But you will almost certainly have to worry about numerical stuff. The simplest approach is to get a good collection of programs (for example, Matlab) and hope that your problems stay inside what the programs can handle.
Symbolic computation in linear algebra
Some students may have been witnesses in calc 3 to the computation of the Jacobian (volume distortion factor) for spherical coordinates. This is a symbolic 3 by 3 determinant. I'll discuss the difficulties of symbolic computation a bit more when we get to determinants and Cramer's rule, but such computations are almost notoriously difficult. There may be no good way to do them.
And for right now, here?
We will deal with small systems. We will use a variant of Gaussian elimination and won't worry too much about efficiency. The main point of using this will be to illustrate the concepts of linear algebra, and to try to give students some intuition about what to expect.

I began by writing three vectors in R4 and asking if they were linearly independent. Let's see: I took them at random, with small integer entries. Maybe v1=(1,2,-2,0) and v2=(2,3,0,3) and v3=(3,0,2,4). These vectors are linearly independent if the only solution to the vector equation av1+bv2+cv3=0 is a=0 and b=0 and c=0. Because that solution always works, it is sometimes called the trivial solution. We want to know if there is a non-trivial solution, where the equation is correct with a and b and c not all 0. Then the vectors will be linearly dependent. The vector equation translates into a system of 4 scalar equations:

 1a+2b+3c=0
 2a+3b+0c=0
-2a+0b+2c=0
 0a+3b+4c=0
It is tremendously appealing to just jump (?) into this collection of equations and "do things". Instead, I would like to promote the use of a methodical approach. This takes some of the "flavor" away, but it does work. I'd like to change those 4 equations into others, and be sure that the solution set of the equations does not change. Sometimes people call this changing the system of linear equations into an equivalent system. It turns out that there are some things we can do that are all reversible, and then (almost) clearly the solution set won't change. These are:
  1. Multiply an equation by a non-zero constant. (Reversible because we can multiply by 1/constant, of course.)
  2. Add/subtract one equation from another. (Reversible because we can subtract/add.)
  3. Interchange two equations.
So we will do these things, in an effort to help us understand when equations have solutions, and what the solutions look like. But we really don't need to carry along the variables (a,b,c above) or even the "=0"s. We just really need to manipulate the coefficient matrix. In this case that matrix is
( 1 2 3)
( 2 3 0)
(-2 0 2)
( 0 3 4)
The equation operations above have a different name in the matrix context.

DEFINITION The elementary row operations

  1. Multiply a row by a non-zero constant.
  2. Add/subtract one row from another.
  3. Exchange rows.
For example, if we had the 3 by 2 matrix
( 2  3)
(-1  0)
( 5  2)
then a variety of elementary row operations will yield these matrices:
  (-1  0)   ( 4  6)   ( 1  3)   ( 2  3)     
A=( 2  3) B=(-1  0) C=(-1  0) D=(-1  0)     
  ( 5  2)   ( 5  2)   ( 5  2)   (35 14)
See if you can "guess" the row operations which created A, B, C, and D. Here are the answers. If M and N are matrices which can be linked by a series of row operations, then I will say that M and N are row-equivalent, and I will frequently write M~N (that's a "tilde", a horizontal wiggle, which is used in many places to denote row equivalence). If a matrix is non-zero, then infinitely many other matrices can be created which are row equivalent. But some of the row-equivalent matrices are easier to understand than others.

DEFINITION Reduced row echelon form (p.264 of the text).
A matrix is in reduced row echelon form if:

  1. The leading entry of any nonzero row is 1.
  2. If any row has its leading entry in column j, then all other entries of column j are zero.
  3. If row i is a nonzero row and row k is a zero row, then i<k.
  4. If the leading entry of row r1 is in column c1, and the leading entry of row r2 is in column c2, and if r1<r2, then c1<c2.

I remarked that this interesting definition certainly illustrates the following remark of Goethe:
Mathematicians are like a certain type of Frenchman: when you talk to them they translate it into their own language, and then it soon turns into something completely different.
Can I "translate" the definition so it will be more understandable?

One of the tricks I use when I see a complicated mathematical definition is to try to break it. That is, I test (?) each of the conditions in order and see what it means by finding a sort of counterexample. And I try to find the very simplest and smallest counterexample. So:

  1. The leading entry of any nonzero row is 1.
    First, I guess that "leading entry" means the first (moving from left to right) non-zero entry of a row. So if (0 0 2 1 3) is a row, the leading entry would be 2. What is the smallest example of a matrix which has a non-zero row whose leading entry is not 1? The 1 by 1 matrix (2) was suggested.
  2. If any row has its leading entry in column j, then all other entries of column j are zero.
    Can I find a very small example of a matrix which does satisfy 1 but does not satisfy 2? The suggestion here was something like
    (1)
    (1)
  3. If row i is a nonzero row and row k is a zero row, then i<k.
    How about an example satisfying 1 and 2 but not this? Here we needed to ponder a bit longer. Zero rows are supposed to be at the bottom. Here's a small example:
    (0)
    (1)
  4. If the leading entry of row r1 is in column c1, and the leading entry of row r2 is in column c2, and if r1<r2, then c1<c2.
    Certainly this is the most mysterious specification, and seems most menacing with the letters and subscripts. This example will need more than one column, because c1 and c2 shouldn't be equal. Here is an example which satisfies 1 and 2 and 3, but not rule 4:
    (0 1)
    (1 0)
    What the heck does rule 4 say? The leading 1's have to move down and to the right.
Here is a matrix in reduced row echelon form:
(0 0 1 0 0 2 -3 0)
(0 0 0 0 1 0  0 0)
(0 0 0 0 0 0  0 0)
For those of you who haven't met identity matrices: these are square matrices which have 1's on the diagonal and 0's off the diagonal. I will call the n by n identity matrix In. These matrices are nice because they are identities for matrix multiplication. If the dimensions are in agreement so that the multiplication is defined, then AIn=A and InB=B. So In's are nice. The reduced row echelon form is sort of
(In     JUNK)
(ZEROS ZEROS)
But of course the example above shows that we don't exactly need an identity matrix -- the 1's can be further indented or pushed in.

BIG IMPORTANT FACT Every matrix is row equivalent to exactly one reduced row echelon form matrix.

And we can find this unique reduced row echelon form matrix with a simple algorithm. An algorithm is a computational rule which is definite (you don't need to guess anywhere!) and is finite: it will always terminate.

Here is an example. Let's look at the matrix

( 1 2 3)
( 2 3 0)
(-2 0 2)
( 0 3 4)
which we started with earlier. We look for the first non-zero column and the highest non-zero entry in that column. Multiply that row by a constant so the row begins with 1. Here we don't need to do anything, since the row already begins with 1. Then take multiples of the first row and subtract them in turn from the other rows so that the entries below become 0.
( 1 2 3)
( 2 3 0)
(-2 0 2)
( 0 3 4)
~
( 1  2  3) 
( 0 -1 -6)
(-2  0  2)
( 0  3  4)
~
( 1  2  3) 
( 0 -1 -6)
( 0  4  8)
( 0  3  4)
Nothing necessary; row already begins with 1.   Multiply first row by 2 and subtract from second row.   Multiply first row by 2 and add to third row.
Now we are done with the first column. Consider now the second column, and look for the first (top to bottom) non-zero entry after the first row. Make that 1, and use it to clean out all of the other entries in that column.
( 1  2  3) 
( 0 -1 -6)
( 0  4  8)
( 0  3  4)
~
( 1  2  3) 
( 0  1  6)
( 0  4  8)
( 0  3  4)
~
( 1  0 -9) 
( 0  1  6)
( 0  4  8)
( 0  3  4)
~
( 1  0  -9) 
( 0  1   6)
( 0  0 -16)
( 0  3   4)
~
( 1  0  -9) 
( 0  1   6)
( 0  0 -16)
( 0  0 -14)
Identify non-zero entry.   Divide second row by -1.   Multiply second row by 2 and subtract from first row.   Multiply second row by 4 and subtract from third row.   Multiply second row by 3 and subtract from fourth row.
Now we are done with the second column. Consider now the third column, and look for the first (top to bottom) non-zero entry after the second row. Make that 1, and use it to clean out all of the other entries in that column.
( 1  0  -9) 
( 0  1   6)
( 0  0 -16)
( 0  0 -14)
~
( 1  0  -9) 
( 0  1   6)
( 0  0   1)
( 0  0 -14)
~
( 1  0   0) 
( 0  1   6)
( 0  0   1)
( 0  0 -14)
~
( 1  0   0) 
( 0  1   0)
( 0  0   1)
( 0  0 -14)
~
( 1  0  0) 
( 0  1  0)
( 0  0  1)
( 0  0  0)
Steps: identify the non-zero entry; divide the third row by -16; multiply the third row by 9 and add to the first row; multiply the third row by 6 and subtract from the second row; multiply the third row by 14 and add to the fourth row.
We now have our reduced row echelon form matrix. Now if we want to return to the original problem (the vector equation whose analysis will tell us about linear independence) we see that the system of four linear equations is equivalent to the system of equations
a=0
b=0
c=0
0=0
and now it is very clear that the only solution is a=0 and b=0 and c=0, so the original collection of three vectors is linearly independent.

An interesting exercise is writing a correct implementation of this algorithm as a computer program. I am not the world's leading programmer but I think I could do this. But it is not an easy exercise. In fact, just describing the algorithm to get a reduced row echelon form matrix precisely is not easy: move from left to right, and clean up the columns in order. A rough attempt appears below.
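Here is one way the algorithm might look in Maple (a sketch only, assuming the classic linalg package; swaprow, mulrow, and addrow are its standard row operations, and the built-in gaussjord command does all of this in one call):

with(linalg):
rrefsketch := proc(B)
  local A, m, n, r, c, i, piv;
  A := copy(B);
  m := rowdim(A); n := coldim(A);
  r := 1;                                 # row where the next leading 1 should go
  for c from 1 to n while r <= m do
    piv := 0;                             # look for a non-zero entry in column c, at row r or below
    for i from r to m do
      if A[i,c] <> 0 then piv := i; break; fi;
    od;
    if piv <> 0 then
      A := swaprow(A, r, piv);            # bring the pivot row up
      A := mulrow(A, r, 1/A[r,c]);        # make the leading entry 1
      for i from 1 to m do
        if i <> r and A[i,c] <> 0 then
          A := addrow(A, r, i, -A[i,c]);  # clear out the rest of column c
        fi;
      od;
      r := r + 1;
    fi;
  od;
  eval(A);
end:

B := matrix(4,3,[1,2,3, 2,3,0, -2,0,2, 0,3,4]);
rrefsketch(B);   # should reproduce the reduced row echelon form computed by hand above

After that, I made one more definition.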

DEFINITION Rank The rank of a matrix is the number of leading 1's in its reduced row echelon form (equivalently, the number of non-zero rows of the reduced row echelon form).

The matrix above has rank 3. A matrix has rank 0 if it is the zero matrix, a matrix whose entries are all 0. Generally, a p by q matrix could have rank any non-negative integer between 0 and the minimum of p and q. So a 7 by 4 matrix could have rank 0 or 1 or 2 or 3 or 4. Here are examples of each:
Example of rank 0:
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)

Example of rank 1:
(0 1 2 3)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)

Example of rank 2:
(1 0 0 5)
(0 0 1 7)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)

Example of rank 3:
(1 0 0 0)
(0 0 1 0)
(0 0 0 1)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)

Example of rank 4:
(1 0 0 0)
(0 1 0 0)
(0 0 1 0)
(0 0 0 1)
(0 0 0 0)
(0 0 0 0)
(0 0 0 0)
(Notice that a rank 4 reduced row echelon form with only 4 columns must start with the 4 by 4 identity matrix: the four leading 1's use up all four columns.)

A homogeneous system with fewer equations than unknowns
What can we say about this homogeneous system of equations:

2a+3b+1c+3d+4e+1f=0
3a+4b+2c+1d+2e+2f=0
4a+1b+1c+5d+6e+4f=0
So let's take the coefficient matrix and use row operations to change it to reduced row echelon form.
(2 3 1 3 4 1)
(3 4 2 1 2 2)
(4 1 1 5 6 4)
~
(1 3/2 1/2 3/2 2 1/2)
(3  4   2   1  2  2 )
(4  1   1   5  6  4 )
~
(1  3/2 1/2  3/2  2 1/2)
(0 -1/2 1/2 -7/2 -4 1/2)
(0  -5  -1   -1  -2  2 )
~
(1 3/2 1/2 3/2  2 1/2)
(0  1  -1   7   8  -1)
(0 -5  -1  -1  -2   2)
~
(1 0  2 -9 -10  2)
(0 1 -1  7   8 -1)
(0 0 -6 34  38 -3)
~
(1 0  2    -9   -10    2 )
(0 1 -1     7     8   -1 )
(0 0  1 -17/3 -19/3  1/2 )
~
(1 0 0   7/3   8/3    1 )
(0 1 0   4/3   5/3  -1/2)
(0 0 1 -17/3 -19/3  1/2 )
 
In each case, I identified a coefficient to pivot around and used it to clear out the corresponding column. I fouled up a similar example in class. This is very tedious when done by hand. And almost always the rational numbers get "larger" -- denominators grow and grow. I will do a minimal amount of such computation in public. Also, I will try to construct exams so that there won't be very much of such computation.
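This is also exactly the sort of computation a machine does well. Here is a check (a sketch, assuming Maple's classic linalg package):

with(linalg):
A := matrix(3,6,[2,3,1,3,4,1, 3,4,2,1,2,2, 4,1,1,5,6,4]);
gaussjord(A);   # the reduced row echelon form displayed above
rank(A);        # returns 3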

The rank of this matrix is 3. To me, especially because this is a random example (honestly!), this is no surprise. Random matrices naturally "want" to have maximum rank. This can be made precise, but I would need to explain the word "random" carefully and I don't want to enter that swamp.

The original system of equations was

2a+3b+1c+3d+4e+1f=0
3a+4b+2c+1d+2e+2f=0
4a+1b+1c+5d+6e+4f=0
which we now know has the same solutions as
a       + [ (7/3)d  +(8/3)e    +1f]=0
  b     + [ (4/3)d  +(5/3)e-(1/2)f]=0
    c   + [(-17/3)d-(19/3)e+(1/2)f]=0
The 3 by 3 identity matrix sits in the first three columns; the bracketed terms are the JUNK.
This actually has some significance. Does this system of homogeneous equations have any non-trivial solutions? Yes, it does. We can specify any numbers at all for d and e and f (a three-parameter family of solutions) and then solve for a and b and c so that the equations are correct. So we have learned that this system of equations actually has a three-dimensional subspace of solutions. More about this next time, but, generally, you can see that any homogeneous system of 3 equations in 6 unknowns will always have non-trivial solutions.
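Maple can also produce a basis for this solution subspace directly (a sketch, continuing the session above; in classic linalg, nullspace returns a basis for the solutions of AX=0):

nullspace(A);   # a set of three vectors: the solution subspace is 3-dimensional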

The QotD was first incorrectly stated, and then rewritten. Here's the first version: suppose I have a homogeneous system of 340 linear equations in 315 unknowns. What are the possible ranks of the coefficient matrix? Explain why there must be non-trivial solutions to this system.
O.k., the first part isn't hard. The coefficient matrix is 340 by 315, so the rank could be any integer from 0 to 315. The second statement asks students to justify a false statement, which is quite idiotic of the instructor. Why is it false? Here is an example of a system of equations. If j is an integer between 1 and 315 (including the end values) suppose that the jth equation is xj=0. I don't care what the other 25 equations are, these 315 equations already guarantee that there will be only the trivial solution.

Corrected QotD: suppose I have a homogeneous system of 315 linear equations in 340 unknowns. What are the possible ranks of the coefficient matrix? Explain why there must be non-trivial solutions to this system.
The rank again could be any integer from 0 to 315. But now consider the coefficient matrix in reduced row echelon form. It is a rectangle which has more columns than rows (25 more). That means no matter what the rank (0, 1, 2, ..., up to 315) there must be JUNK columns in the matrix: in fact, at least 340-315=25 of them. That in turn implies that we can assign any values to those (at least 25) free variables, and can then choose, if necessary, values for the leading variables making the equations true. There must be non-trivial solutions, and in fact there is a subspace of dimension at least 25 in R340 which is composed of solutions to this homogeneous system of equations.

I must and will go over this more in the next class.

Please hand in 6.3 #5 and 6.4: 5, 13 on Tuesday, February 24. Continue reading chapter 6, and, yes, please do study for the exam.

The row operations which were used in the early example (each operation applied to the Start matrix):

  1. Exchange rows 1 and 2.
  2. Multiply row 1 by 2.
  3. Add row 2 to row 1.
  4. Multiply row 3 by 7.

      ( 2 3)   (-1 0)   ( 4 6)   ( 1 3)   ( 2  3)
Start=(-1 0) A=( 2 3) B=(-1 0) C=(-1 0) D=(-1  0)
      ( 5 2)   ( 5 2)   ( 5 2)   ( 5 2)   (35 14)

Tuesday,
February 17
Why study this stuff?
I defined a subspace of Rn last time. Why should we be interested in this? Many math processes create subspaces. For example, when we analyze y''+y=0 we see the solutions are A sin(t)+B cos(t): that's the linear span of sine and cosine, the collection of all linear combinations of sine and cosine. It is a subspace of functions.

Here is another way subspaces of functions can arise. If we consider the more complicated expression y''-3y'+2y (the left-hand side of another second order linear ODE) we could ask what the possible outputs are if we feed in linear combinations of functions like sin(t), cos(t), t sin(t), t cos(t), e^t: what outputs should we expect? Here the question is more subtle, since I threw in e^t, which is actually a solution of the associated homogeneous equation. The result will be a subspace, but will be a bit more difficult to describe. The general problem to be discussed: How can we describe subspaces efficiently? I'll go back to Rn for this.

DEFINITION Linear combinations
Suppose v1, v2, ..., vk are vectors in Rn. Then a linear combination is a sum a1v1+a2v2+...+akvk where a1, a2, ..., and ak are scalars.

I'd like to describe subspaces as linear combinations of vectors.
DEFINITION Span; spanning set
If the collection of all linear combinations of v1, v2, ..., vk is a subspace S, then S is said to be the span of v1, v2, ..., vk. Or, the other way around: v1, v2, ..., vk is a spanning set for S.

I tried a modest example as a beginning: S was the collection of vectors in R4 which can be described as (x,y,0,0) with x and y any real numbers. If e1=(1,0,0,0) and e2=(0,1,0,0), then {e1,e2} is a spanning set for this S (xe1+ye2 is the linear combination). We tried various examples of spanning sets. One was {e1,e1+e2}. I ran through the logic which went something like this:


Why is the span of {e1,e2} the same as the span of {e1,e1+e2}?
Every vector in the first group is a linear combination of the vectors in the second group: e1 is in the second group already, and e2=(e1+e2)-e1. Therefore any linear combination of the first group of vectors is a linear combination of the second group of vectors. The converse logic is analogous: each of the vectors {e1,e1+e2} can be written as a linear combination of the vectors {e1,e2}, so linear combinations of the second set are linear combinations of the first.

I went through this logic in detail, although rapidly, because it is at the heart of numerous arguments in linear algebra. Another way to understand the logic is to mumble, "Linear combinations of linear combinations are linear combinations" very rapidly.

Goldilocks and the three bears (a wonderful link!)
Let's see: Just {e1} alone is not enough to be a spanning set for this S because we couldn't get, say, (0,17,0,0). The set {e1} is too small to be a spanning set.
THIS PORRIDGE IS TOO HOT!
The set {e1,e2,e1+e2} definitely is a spanning set for S, but, as Mr. Cohen remarked, the last vector in the list is a redundant vector: it isn't necessary, and including it doesn't change the span. So this set is too large to describe S efficiently.
THIS PORRIDGE IS TOO COLD!
What's going on? There are many descriptions of S, but what descriptions are efficient, minimal, etc.? How can we tell? This is a core question of linear algebra. If we can write one vector as a sum of others, then ("linear combinations of linear combinations are linear combinations") we don't need that vector. The trouble is that we may not know which vectors are unneeded. An unbiased way of finding out that there are unneeded or redundant vectors uses the following definition.

DEFINITION Linear independence and linear dependence
Vectors v1, v2, ..., vk are said to be linearly independent if, whenever a linear combination a1v1+a2v2+...+akvk equals 0, all of the coefficients a1, a2, ..., ak must be 0. (Intuitively, the vectors point in genuinely different directions, so the only combination of them adding up to 0 is the one with every coefficient equal to 0.)
On the other hand (?), vectors v1, v2, ..., vk are said to be linearly dependent if there is some linear combination a1v1+a2v2+...+akvk which equals 0 and in which not all of the scalar coefficients are 0. (This means that one of the vectors can be written as a linear combination of the others, so [at least!] one of the vectors is redundant in describing the span of v1, v2, ..., vk.)

Example in R4
This was a lot, so I thought it was time to do an example. Look at the vectors v1=(1,2,0,3) and v2=(0,3,1,2) and v3=(1,2,3,1). I tried to choose an example with small integer entries, but one where I couldn't "look at" the vectors and guess the answers to questions I would ask. Here I first wanted to know if these vectors were linearly independent. So I need to know about the solutions to the vector equation
av1+bv2+cv3=0.
If the only solutions to this vector equation occur when a=0 and b=0 and c=0 then the vectors are linearly independent. This is a vector equation in R4 (the 4 is most important here!) so the one vector equation translates to a system of 4 scalar equations:
1st component equation 1a+0b+1c=0
2nd component equation 2a+3b+2c=0
3rd component equation 0a+1b+3c=0
4th component equation 3a+2b+1c=0
For the few of you who don't know, there are well-known algorithms for analyzing such systems. Right now (this lecture!) I'll just try to look at the equations "by hand" -- this is a bit inefficient and sometimes irritating, but, just for today. Please try to keep your attention focused on the logic as much as possible. The first equation tells me that a=-c and the third equation tells me that b=-3c. Using these in, say, the fourth equation gives me 3(-c)+2(-3c)+1c=0 which is -8c=0 so c must be equal to 0! The other equations which relate c to a and b now tell me that a and b must be 0 also. So these vectors in R4 must be linearly independent. Therefore if I want to define a subspace S as the span of these three vectors, all of the vectors are needed: none of them are redundant. In fact, each vector in S has a unique description as a linear combination of v1 and v2 and v3. Why is this?
Let me show you with an example. Suppose w is in S, and w=4v1-9v2+33v3 and also w=83v1+22v2+48v3: two different descriptions as linear combinations of the vectors. Then the two linear combinations must be equal since they are both w. Do some vector algebra and get (83-4)v1+(22+9)v2+(48-33)v3=0. Hey! This contradicts the linear independence of v1 and v2 and v3 which we just verified! So each w in S can have only one description as a linear combination of those three vectors.
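A machine check of the linear independence (a sketch, assuming classic linalg; the columns of the matrix below are v1, v2, and v3, and the rank counts the "non-redundant" columns):

with(linalg):
A := matrix(4,3,[1,0,1, 2,3,2, 0,1,3, 3,2,1]);   # columns are v1, v2, v3
rank(A);   # returns 3: the three vectors are linearly independent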

DEFINITION Basis and dimension
Suppose S is a subspace. Then v1, v2, ..., vk is a basis of S if any of the following three equivalent conditions is satisfied:

  • The vectors v1, v2, ..., vk span S and they form a minimal spanning set for S: if we delete any of them, the linear combinations no longer "fill up" S.
  • The vectors v1, v2, ..., vk are linearly independent and they are a maximal linearly independent set in S: if you include any other vector in S, there's a way of writing that vector as a linear combination of these.
  • Every vector in S can be described as a unique linear combination of v1, v2, ..., vk. So if w is in S, there is exactly one way to write w as a1v1+a2v2+...+akvk. The scalars a1, a2, ..., ak are usually called the coordinates of w with respect to this basis.
All three of these conditions are important, and often occur computationally, so it is really important to somehow internalize (make a part of your mental picture of the world!) that they are the same. By the way, the integer k which occurs in each of the three statements (the number of vectors in a basis) is called the dimension of S. This number is the same for every basis of the subspace.

Now back to the example
So I have this subspace S of R4 which has v1=(1,2,0,3) and v2=(0,3,1,2) and v3=(1,2,3,1) as a basis. Even with our low-brow "hand" technology, we can still answer questions like this: is w=(2,2,2,2) in S? I picked this vector at random, and felt very comfortable betting with myself that the answer was "No." Why is that? Let me try to share my mental picture of the situation. In fact, let me try, as I did in class, to draw a picture of the situation. This picture has got to be deceptive. It is a two-dimensional picture of a three-dimensional analogy of a four-dimensional situation. (Now that's a sentence!) So my impression is that S is a flat "thing" going through the origin, 0. It is flat because all multiples of any vector in S are in S: so S is made up of straight lines which pass through the origin (+ and - multiples of vectors are allowed). Because we can also add vectors in S and get another vector in S, we know that the straight lines are "joined" by flatness. A random point in R4 is probably not in S because S is down a whole dimension from the "ambient" (surrounding) space. Well, let's see what happens. Can this w be in S? If w=av1+bv2+cv3, then this is one vector equation which results (look again at the coordinates) in four scalar equations which again I will analyze in a naive manner. Here are the four scalar equations:
1st component equation 1a+0b+1c=2
2nd component equation 2a+3b+2c=2
3rd component equation 0a+1b+3c=2
4th component equation 3a+2b+1c=2
Then (first equation) a=2-c and (third equation) b=2-3c, so inserting these in, say, the fourth equation, we get 3(2-c)+2(2-3c)+1c=2. This becomes, after a bit of cleaning, -8c=-8, so if the equations can be satisfied then c should be 1. We used only the first, third, and fourth equations. What about the second? If c=1, then a=2-1=1 and b=2-3(1)=-1. The equation 2a+3b+2c=2 becomes 2(1)+3(-1)+2(1)=2, that is, 1=2, which is false. So the equations have no common solution. Therefore w is not in S.
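The same conclusion by machine (a sketch, assuming classic linalg; linsolve returns NULL when a system has no solution):

with(linalg):
A := matrix(4,3,[1,0,1, 2,3,2, 0,1,3, 3,2,1]);   # columns are v1, v2, v3
w := vector([2,2,2,2]);
linsolve(A, w);   # returns NULL: w is not in the span of v1, v2, v3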

Another example
Of course I didn't want to repeat totally an example with similar qualitative aspects as the previous one. With that in mind, let us analyze the situation in R4 when v1=(-2,4,2,-4) and v2=(1,-5,-1,4) and v3=(-2,1,2,-2). (I screwed up one of these vectors in class, and copied a vector with a sign change onto the board!) If S is the span of these vectors, what is, and what could be, the dimension of S? S sits inside R4. The dimension could be 4 or 3 or 2 or 1 or even 0. (What about 0? Well, if S consisted just of the zero vector, this does satisfy the rules for a subspace, and the usual understanding is that the dimension of this rather small subspace is 0.) How about here? Since S is spanned by 3 vectors, I don't think S will be all of R4, which needs 4 vectors to span, so we can throw 4 out. Also, S won't have dimension 1, because "clearly" the three vectors spanning S do not consist of multiples of one vector. Since, for example, v1 is not a scalar multiple of v2, S needs at least two vectors to span it. Therefore S has either dimension 2 or 3. We need to decide if these 3 vectors are linearly independent. So we look at the vector equation av1+bv2+cv3=0,
which translates into the system of 4 scalar equations:
1st component equation -2a+1b-2c=0
2nd component equation 4a-5b+1c=0
3rd component equation 2a-1b+2c=0
4th component equation -4a+4b-2c=0
Again, only elementary techniques will be used. (The majority of the class, who know about, say, row reduction to row echelon form, was getting fairly restless.) This did not work in class because of the misquoted minus sign. It will work here! The first and third equations are logically the same (one is the other multiplied by -1), so we can forget one of them. Double the first equation and add it to the second. The result is -3b-3c=0, so that c=-b. Add the second and fourth equations. The result is -b-c=0 so that (again!) c=-b. What is going on? Apparently if we choose b and c to satisfy c=-b, and then deduce a value of a, we will get solutions to all of the equations. Let me choose b=1. Then c=-1, and, say, the first equation becomes -2a+1(1)-2(-1)=0 so that a should be 3/2. In fact the values a=3/2 and b=1 and c=-1 satisfy all of the equations. (You can check that the vector equation 3v1+2v2-2v3=0 is correct, which says the same thing after multiplying through by 2.) Therefore v1, v2, and v3 are not linearly independent. In this case, we can get a basis for S by omitting any one of them, and the dimension of the subspace S is 2.
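Checking this by machine (a sketch, assuming classic linalg; again the columns of the matrix are v1, v2, and v3):

with(linalg):
A := matrix(4,3,[-2,1,-2, 4,-5,1, 2,-1,2, -4,4,-2]);
rank(A);        # returns 2: the span S is 2-dimensional
nullspace(A);   # one basis vector, proportional to (3, 2, -2)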

A system of abbreviation: matrices
We will be writing lots and lots and lots of linear equations. It may be handy to have abbreviations for them. For example, the two linear equations in four unknowns
3x-2y+6z+5t=9
5x-4y-7z+t=11
are conventionally abbreviated using certain rectangular arrays called matrices. The three matrices which would be used for this system are

   ( 3 -2  6  5)     (x)      ( 9)
A= (           )  B= (y)   C= (  )
   ( 5 -4 -7  1)     (z)      (11)
                     (t)   
and the matrix equation which abbreviates the linear system above is AB=C. Now we should talk a little bit about matrix algebra. Coverage of this is going at jet speed and I strongly suggest that students who are weak on this material should review it in the text.

Matrix algebra
A rectangular array with p rows and q columns is called a p by q matrix. (Yeah, there are higher dimensional matrices, and some of you may come into contact with them as representations of things like stress tensors, etc.) If A is a p by q matrix, then the entry in the ith row and jth column will frequently be called Ai,j.
DEFINITION Matrix addition If A and B are matrices which have the same size (both p by q) then the p by q matrix C=A+B has (i,j)th entry Ci,j=Ai,j+Bi,j.
DEFINITION Scalar multiplication of a matrix If A is a p by q matrix, and t is a scalar, then the p by q matrix C=tA has (i,j)th entry Ci,j=tAi,j.
DEFINITION Matrix multiplication The product AB of the matrices A and B is only defined under certain conditions. A must be p by q and B must be q by r. So for AB to be defined, the "inner dimensions" of A and B must be the same: the number of columns of A must equal the number of rows of B. If C=AB then C will be a p by r matrix, and the entries of C are gotten in a weird manner: Ci,k is the inner product (ith row of A)·(kth column of B); in symbols, Ci,k=SUM(j=1 to q) Ai,jBj,k. This is very very weird if you have never seen it before. This computation organizes "linear combinations of linear combinations". Do examples, please!

An inadequate collection of examples of matrix algebra
Here are some small matrices we can practice on.

  A=       B=        C=    D= 
( 2 3 2)  (3 5 -1)  (-2)  (2 0 0)
(-1 5 3)  (4 4  2)  ( 2)  (0 2 0)
                    ( 3)  (0 0 2)
Here A and B are the same size (both 2 by 3) so we can compute A+B. No other pair of matrices given here has the same size. We could compute 3C and 5A and -30D (scalar multiplication) since there aren't size restrictions for those. What about products? A and B are 2 by 3, C is 3 by 1, and D is 3 by 3. What products are defined? I think just these and no others: AC, AD, BC, BD, DC, and DD. Look at just AC. It is the matrix product of a 2 by 3 matrix and a 3 by 1 matrix so AC must be a 2 by 1 matrix. The top entry is the dot product of the first row of A with the single column of C: (2,3,2)·(-2,2,3), which is -4+6+6=8. The other entry is the value of (-1,5,3)·(-2,2,3), which is 2+10+9=21. So AC would be a 2 by 1 matrix whose entries are
( 8)
(21).
It would be useful if students computed A+B, 3C, 5A, -30D, AD, BC, BD, DC, and DD. Some of the products are sort of interesting.
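Here is the AC computation by machine (a sketch, assuming classic linalg, where &* is the matrix product and evalm evaluates it):

with(linalg):
A := matrix(2,3,[2,3,2, -1,5,3]);
C := matrix(3,1,[-2,2,3]);
evalm(A &* C);   # the 2 by 1 matrix with entries 8 and 21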

The QotD was: suppose A is the 2 by 2 matrix

(2 1)
(3 4)
Compute A4. After a clever suggestion from the class (who did suggest that?) I agreed that one could do one third less work by first computing A2 and then squaring the result, rather than computing A2 and then A3 and then A4. Of course I am sitting here in luxury with a Maple window, so I type:
with(linalg):
A:=matrix(2,2,[2, 1, 3,4]);
                                 [2    1]
                            A := [      ]
                                 [3    4]
evalm(A^4);
                             [157    156]
                             [          ]
                             [468    469]
and that's the answer.

Please keep reading the book: now chapter 6.

     Exam in 1 week, on Tuesday, February 24

     Please consider the following material.
 

Office hours/review for the exam

I will be in my office (Hill 542, on the fifth floor of Hill Center) and available for 421 questions on Monday, February 23, from 5:30 to 7:30. If more than a few students show up, I'll find a classroom and put a note on my office door.

It's also likely that I will be in my office and available on Tuesday afternoon but please confirm this then with e-mail or a 'phone call.

Thursday,
February 12
Please see the newly expanded syllabus and list of homework problems. Today we begin a section of the course devoted to linear algebra. The Math Department gives several courses devoted to linear algebra: 250, an introduction to the definitions and simple algorithms; 350, a theoretical approach; 550, a course covering both theory and more advanced algorithms, given for several engineering graduate programs. In addition, parts of some numerical analysis courses cover questions of numerical linear algebra.

Here are my goals for this part of the course.

First, enough understanding of the "structure" of the collection of solutions to a set of linear equations so that someone who uses a powerful tool such as Matlab will see what the answers should "look like". Matlab, by the way, was originally designed to do linear algebra computations for engineers and scientists.

Questions

  1. If (0,0,0,0) and (1,2,4,0) are solutions of 7 linear equations in 4 unknowns, must there be other solutions?
  2. Can there be exactly 5 solutions to a collection of 340 linear equations in 335 unknowns?
  3. Must a system of 30 homogeneous linear equations in 40 unknowns have some non-zero solution?
My request to the class for answers to these questions was not met by certainty. We decided that the answers should be "Yes" and "No" and "Yes". I will consider the questions again later.

Computational aspects

  1. Solution by "elimination" of a system of linear equations (conversion to row echelon form)
  2. Decision that a square matrix is invertible (algebraically) and the computation of an inverse; connection with Cramer's rule for solution of some systems.
  3. Know that, due to its "structure", there are ways to compute rapidly the 10th power of this matrix:
    (0 3 2 1)
    (3 7 4 2)
    (2 4 0 1)
    (1 2 1 1)

Vocabulary
When discussing linear algebra, special vocabulary is used. Students should have some idea what these terms mean, and be able to check if suggested examples are valid.
homogeneous, inhomogeneous, basis, linear independence, linear combination, spanning, subspace, dimension, rank, eigenvalue, eigenvector, matrix addition and matrix multiplication, symmetric, diagonalization.

The coefficients of the systems of linear equations
I remarked that our arithmetic will be limited to addition, subtraction, multiplication, and division. The examples I'll do in class will be rather small, because I am human. Usually the coefficients and the answers will involve rational numbers. But other collections are possible for the coefficients. Standard examples include the real numbers and the complex numbers. But other, less standard, examples can occur to engineers. The QotD given a week ago is converted by Laplace transform into a system of linear equations for X(s) and Y(s) whose coefficients are rational functions of s (that is, quotients of polynomials). Rational functions can be added, subtracted, multiplied, and divided. Suppose we wanted to analyze a really big system of "masses" and "springs" (I've shown part of one to the right) and wanted to learn how the system reacted if we "kicked" it in one place. The Laplace transform would give us a big system of linear equations with rational function coefficients. The same ideas and algorithms that we will discuss here can be used. (Application: material science, study of crystals, etc.) Another example, more relevant to electrical and computer engineering, occurs when the coefficients of the linear equations are integers mod a prime (simplest example: the prime is 2, and the coefficients are just 0 and 1). This sort of arithmetic arises, for example, in cryptographic applications (secure transmission and storage of information). Again, many of the same ideas and computations can be done with such coefficients.

The chief example for 421
This will be Rn which is the collection of n-tuples of real numbers. Each n-tuple will be called a vector. So we have v=(a1,a2,...,an). I'll try (I won't guarantee!) to write vectors in italics, just as I will try to write them on the blackboard with happy little half-arrows on top. (Where did that notation come from?) If w=(b1,b2,...,bn) then the vector sum v+w is the n-tuple (a1+b1,a2+b2,...,an+bn). I will now use the correct words, and I hope you will agree that they are precise: vector addition is commutative (the order in which vectors are added doesn't matter) and associative (the grouping of the additions doesn't matter) and there's an additive identity (0=(0,0,...,0)) and additive inverses. Etc. We also have scalar multiplication: tv=(ta1,ta2,...,tan), which satisfies various distributive rules, etc. Everything for this stuff is similar to what works in R2 and R3. Here I will be chiefly interested in what happens for other, bigger, n's.

Length/norm and inner product
I'll do it for Rn but I have in mind constantly a more complicated example we'll be doing later: Fourier series, where things will be the same but not the same (stay in the course, and you will see what I mean). So the length or norm of a vector v is sqrt(a1^2+a2^2+...+an^2). This is, of course, an "echo" of the Pythagorean theorem used to measure distances between two points. I will remark right now that your intuition may be weak. For example, if you consider the n-dimensional unit cube (all coordinates between 0 and 1) then most of the corners are very, very far away from the origin. Why? Well, just look at the farthest corner, (1,1,...,1). In R560 its distance from the origin is sqrt(560), already fairly large! I then defined the inner product of two vectors, v=(a1,a2,...,an) and w=(b1,b2,...,bn). This is v·w=SUM(j=1 to n) aj·bj. So the inner product (or dot product) gives a real number for two vectors. Basic inner product properties should all look familiar:
v·w=w·v (commutativity), (v1+v2)·w=v1·w+v2·w (linearity in the first variable, and because of commutativity, the same in the other variable), and (tv)·w=t(v·w) (scalar multiplication "comes out" of dot product).
As I mentioned above, sometimes "intuition" (principally the result of computing lots and lots of examples) can't be trusted, but something wonderful does happen.

Cauchy-Schwarz inequality
For v and w in Rn, |v·w|<=||v|| ||w||. I tried to explain why this is true (or see p.224 of the text). I looked at (v+tw)·(v+tw). Since this is the norm squared of the vector v+tw, the quantity must be non-negative. "Expand it out", using linearity in both factors and commutativity of the dot product. The result is v·v+(2v·w)t+(w·w)t^2. If you look carefully at this, you can see several things. If the vectors are "fixed" and t varies, this is a quadratic function in t: At^2+Bt+C (with peculiar coefficients, though: here A=w·w, B=2v·w, and C=v·v). I remarked that the graph was a parabola. Since A is non-negative (A is the square of a norm) the parabola opens up. Since the whole quadratic function is also a norm squared, it can either stay in the upper half plane or, at worst, just touch the t-axis. Now the roots of a quadratic are [-B+/-sqrt(B^2-4AC)]/(2A). The quadratic should not have two distinct real roots (otherwise it would be negative for some values of t, which is impossible). So the discriminant must be non-positive: B^2-4AC<=0. When I replaced A and B and C with the coefficients above, and cleared up the algebra, I got the Cauchy-Schwarz inequality. I went through this in detail because I will need a similar computation later when we do Fourier series.

Consequences of Cauchy-Schwarz
Well, we could prove the triangle inequality: ||v+w||<=||v||+||w|| but I don't feel like it. It is true, though (square both sides, "expand" (v+w)·(v+w), and do algebra). So the distance in R5 from (1,2,3,4,5) to (3,4,5,6,7) is less than or equal to the distance from (1,2,3,4,5) to (-2,2,-2,2,13) plus the distance from (-2,2,-2,2,13) to (3,4,5,6,7). (Do not check this, but accept my assurance!) What's more interesting is that now we can define angles. Why? In earlier courses, you should have seen the law of cosines used to get an alternate description of v·w, as equal to ||v|| ||w||cos(theta), where theta was the angle between the vectors. Cauchy-Schwarz says that the quotient defining the cosine must be between -1 and 1. This is not obvious. Could the dot product of two vectors in R33 be 444 while the lengths of the vectors are 3 and 12? Cauchy-Schwarz says this can't happen: 444 is much bigger than 3·12=36.
I think I finally did an example, something like (1,2,3,0,2) and (2,-3,0,7,-1) in R5. Here one length is sqrt(18) and the other is sqrt(63), while the dot product is -6. So the angle between these vectors is arccos(-6/(sqrt(18)sqrt(63))). The minus sign indicates that the angle is between Pi/2 and Pi. Two vectors are orthogonal or perpendicular if their dot product is 0.
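The same computation can be done by machine (a sketch, assuming classic linalg: dotprod is the dot product and norm(v,2) is the Euclidean norm):

with(linalg):
v := vector([1,2,3,0,2]);
w := vector([2,-3,0,7,-1]);
dotprod(v,w);           # returns -6
norm(v,2); norm(w,2);   # sqrt(18) and sqrt(63)
evalf(arccos(dotprod(v,w)/(norm(v,2)*norm(w,2))));   # about 1.75 radians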

Computational note
Most "higher level" computer languages recognize things like vector addition and dot product, and usually compile the instructions to take advantage of "vector" or parallel processing which is built into the computer chips. So these things can be done really fast. Therefore it is useful computationally to see the vector operations in your work.

DEFINITION Subspace
A collection of vectors S in Rn is a subspace if

  1. 0 is in S.
  2. The sum of vectors in S is in S.
  3. The scalar multiple of any vector in S is in S.
I tried to give some examples of S's and asked if these things were subspaces. Let's see if I can reconstruct some of them. They were in R5:
1. First I gave an S which was just two vectors: S={(1,2,3,0,0),(0,1,2,0,1)}. This is not a subspace because, for example, 0 is not in S.
2. I amended S a bit. I put (0,0,0,0,0) into the previous S. Now people said this S still wasn't a subspace because the sum of (1,2,3,0,0) and (0,1,2,0,1) wasn't in S, so that rule #2 isn't satisfied. And then we discussed more: we could sort of add vectors etc. Things still wouldn't work, because of rule #3.
3. Here's a different sort of example. I define S by a descriptive phrase which will be much more common in this course: S was the collection of vectors (a+b,2a,3a-b,3b,a) where a and b are any real numbers. In this case, I worked hard at proving that all of the rules (#1, #2, #3) were satisfied. If we look at the underlying "structure", this S consists of a(1,2,3,0,1)+b(1,0,-1,3,0). We will call this a linear combination of the two vectors. I sort of tried to draw a picture (a five-dimensional picture?) of this S. It seems to be a two-dimensional thing (a plane) which passes through 0. We will have to make everything more precise.
4. Here is another way where an example will be presented systematically. Suppose u=(1,-1,2,3,4) and w=(3,1,-2,0,5). My candidate for an S will be the collection of vectors v in R5 so that v·u=0 and v·w=0. That is, the collection of all v's which are perpendicular to both u and w. I tried, not too successfully, to draw a picture of the situation, and persuade the class that this was indeed a subspace. How about an algebraic view? Well, consider rule #2 of the subspace stuff. If v1·u=0 and v2·u=0 then v1·u+v2·u=0 and (linearity!) (v1+v2)·u=0. The same thing would happen with the "·w" requirement. And also with scalar multiplication, etc. This is a subspace. It is not immediately clear to me what the "dimension" of S is. It will turn out that the dimension is 3 (see the sketch just after this list), but this will take a bit of effort.
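Here is how example 4 might be checked by machine (a sketch, assuming classic linalg; the rows of M are u and w, so the solutions of MX=0 are exactly the vectors perpendicular to both):

with(linalg):
M := matrix(2,5,[1,-1,2,3,4, 3,1,-2,0,5]);
nullspace(M);   # returns three basis vectors: this S has dimension 3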

The QotD was: consider the vectors in R4 with the following description: (a+b,a,3b-a,c) where a and b are any real numbers, and where c is any integer, positive or negative or 0. Is this a subspace? Well, for example, a=0 and b=0 and c=56 gives us the vector (0,0,0,56) in our candidate for a subspace. If we try to multiply this by 1/17 we will get something not in S (56/17 is not an integer), contradicting rule #3. So this S is not a subspace. Logically, we need to show a specific example to be sure that the rule is really broken, not just that it might be broken. Oh well, this is a delicate point of logic.

Read chapter 5. Review the first few sections if you need to. We will be going over material at jet speed and you will be responsible for it. Please hand in these problems on Tuesday: 3.7: 8 (a remnant of Laplace!), 5.4: 9, 15, 5.5: 5, 17, 19. Read these sections and learn how to do the problems!


Maintained by greenfie@math.rutgers.edu and last modified 1/20/2004.