### Diary for Math 311, spring 2003

In reverse chronological order.

Old diary entries: 1/22/2003 to 3/3/2003
Old diary entries: 3/5/2003 to 4/14/2003

Date What happened
5/5/2003
THE LAST DAY!!!

Let's analyze and prove one version of the Fundamental Theorem of Calculus (FTC). This is discussed in section 7.3 of the text.

This version of FTC has to do with how the integral behaves as a function of its upper parameter. Perhaps an example will make the difficulties clearer. Let's look at a function defined on [0,3] by the "piecewise" formula f(0)=2, f(x)=1 for 0<x<=1, and f(x)=0 for 1<x<=3. We first observed that f is indeed Riemann integrable on [0,3]. We could use partitions such as {0,B,H,C,3} where B is slightly bigger than 0, H is slightly less than 1, and C is slightly larger than 1. The difference between the resulting upper and lower sums is "very small", and consideration of such sums allows one to actually compute the Riemann integral.

Of course f is also Riemann integrable on subintervals of [0,3], so we can define F by F(x)=int_0^x f for x in [0,3]. We can actually "compute" F. First, F(0)=int_0^0 f must be 0, because the width of the "subintervals" in any Riemann sum must be 0. Now if x is between 0 and 1 (or actually equal to 1), we can use the partition {0,B,x} of [0,x] with B close to 0 to see that int_0^x f is x (really, the width x-B is close to x and the height, the sup of f on [B,x], is 1). Now if we have x>1 and x<=3, we can use a partition {0,B,H,C,x} as before to see that for such x's, F(x) is 1.

What about the derivative of F? Everyone who has gone through a calc 1 class can tell that F is differentiable for x in [0,1) with derivative 1, and it is also differentiable in (1,3] with derivative 0. To the left below is a graph of F, and to the right below is a graph of F'. I would like you to compare F' and f. The Riemann integral, by the way, doesn't even "notice" the discontinuity of f at 0. The Riemann integral locally averages the behavior of f and reports that local average to F.
[Graphs here: F on the left, its derivative F' on the right.]
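The computation of F above can be checked numerically. This Python sketch (my addition, not part of the original notes) approximates F by Riemann sums for the piecewise f:

```python
# Approximate F(x) = integral of f on [0, x] for the piecewise f above,
# using left-endpoint Riemann sums on a fine uniform grid.

def f(x):
    if x == 0:
        return 2.0
    if 0 < x <= 1:
        return 1.0
    return 0.0  # for 1 < x <= 3

def F(x, n=100000):
    # left-endpoint Riemann sum for int_0^x f with n subintervals
    width = x / n
    return sum(f(i * width) for i in range(n)) * width

# F should be x on [0, 1] and 1 on (1, 3]; the single point f(0) = 2
# does not affect the integral.
```

The single "bad" value f(0)=2 contributes only 2·(width of one subinterval) to any such sum, which vanishes as the partition refines, matching the remark that the integral doesn't notice it.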

In what follows, we will need to "recall" some facts about the Riemann integral.

1. If f is Riemann integrable in an interval, then it is Riemann integrable in any subinterval.
Proof: Discussed in the writeup for the lecture of 4/28/2003.
2. If f is Riemann integrable in an interval, then it is bounded, say by M (so |f(x)|<=M for all x in the interval). Also the absolute value of the integral of f on the interval is less than or equal to M multiplied by the length of the interval.
Proof: Discussed in the writeup for the lecture of 4/24/2003.
Theorem (continuity of the integrated function) Suppose f is Riemann integrable on [a,b]. Then f is Riemann integrable on [a,c] for all c in [a,b], and if F(x)=int_a^x f for x in [a,b], then F is continuous on [a,b]. Indeed, if f is bounded by M, F satisfies a Lipschitz condition with Lipschitz constant M: for all x and y in [a,b], |F(x)-F(y)|<=M|x-y|.
Proof: If x<y, then F(y)=int_a^y f = int_a^x f + int_x^y f by additivity on intervals (4/28 lecture). Since F(x)=int_a^x f, we see that F(y)-F(x)=int_x^y f. If we know that -M<=f(t)<=M for all t, then -M(y-x)<=F(y)-F(x)<=M(y-x), so that |F(y)-F(x)|<=M|x-y|.

Comment: the function g(x) which is 0 if x<0 and is sqrt(x) if x>=0 does not satisfy a Lipschitz condition on any interval which includes the "right side" of 0, because sqrt(x) doesn't satisfy the Lipschitz property on such an interval (see the lecture of 4/14, please). So this g can't be an F corresponding to any Riemann integrable f. The same is true for other functions which don't satisfy Lipschitz conditions. Generally, people expect that "integrating" makes functions "smoother" and better behaved: here we go from Riemann integrable to Lipschitz. Along this line is the next result, which says we go from continuity to differentiability.

Theorem (a version of FTC) Suppose f is Riemann integrable on [a,b]. Then f is Riemann integrable on [a,c] for all c in [a,b], and if F(x)=int_a^x f for x in [a,b] and f is continuous at c, then F is differentiable at c and F'(c)=f(c).
Proof: Recall that F'(c) is the limit as h-->0 of (F(c+h)-F(c))/h. This proof naturally separates into two cases, h>0 and h<0. We'll do just the case h>0 (as in class).
Now F(c+h)=int_a^(c+h) f = int_a^c f + int_c^(c+h) f = F(c)+int_c^(c+h) f, so that F(c+h)-F(c)=int_c^(c+h) f. Since f is continuous at c, given epsilon>0 we can find delta>0 so that when |x-c|<delta, |f(x)-f(c)| must be less than epsilon. We can "unroll" this inequality to get -epsilon<f(x)-f(c)<epsilon, therefore f(c)-epsilon<f(x)<f(c)+epsilon. Now suppose 0<h<delta, so that every x in [c,c+h] satisfies |x-c|<delta. Both "ends" of the inequality are constants, so if we integrate on the interval [c,c+h], the result for the ends is just multiplication by the length of the interval, which is h. The result of integrating the central term is F(c+h)-F(c). So h(f(c)-epsilon)<=F(c+h)-F(c)<=h(f(c)+epsilon). Since h is positive here, we know that f(c)-epsilon<=(F(c+h)-F(c))/h<=f(c)+epsilon (division by a positive number does not change the direction of the inequalities). Now subtract f(c) to get -epsilon<=[(F(c+h)-F(c))/h]-f(c)<=epsilon, which means |[(F(c+h)-F(c))/h]-f(c)|<=epsilon. This certainly holds whenever 0<h<delta.
A similar inequality can be proved for h<0. But this is precisely what is meant by declaring that the limit as h-->0 of (F(c+h)-F(c))/h exists and equals f(c). So we are done.
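As a numerical sanity check on the theorem (my illustration, not in the notes), the difference quotient of an integral of a continuous function can be watched directly; here f(x)=x^2, so the quotient at c=1 should approach f(1)=1:

```python
# Watch (F(c+h) - F(c))/h approach f(c) for a continuous integrand.

def f(x):
    return x * x

def F(x, n=20000):
    # midpoint Riemann sum approximating int_0^x f
    width = x / n
    return sum(f((i + 0.5) * width) for i in range(n)) * width

c = 1.0
quotients = [(F(c + h) - F(c)) / h for h in (0.1, 0.01, 0.001)]
# the quotients approach f(c) = 1.0 as h shrinks
```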

Although this result is almost always used where f is continuous on the whole interval, so that F is differentiable on the whole interval with F'=f, we actually don't "need" continuity of f. Here's a fairly simple example of discontinuities in f not "noticed" by F. Suppose f(x) is 1 if x=1/n for some n in N and f(x)=0 otherwise. Then f is Riemann integrable on any interval (!) and int_a^b f is 0 for any a and b. Therefore the candidate F is always 0 and is differentiable everywhere, and F'=f for x not equal to any 1/n. F doesn't notice f's values on a thin set like {1/n}.

Of course if we make f non-zero on a "thicker" set then we may run into trouble. We have already seen examples where f is not Riemann integrable as a result (f(x)=1 if x is rational, or even f(x)=x if x is rational).

If we had also verified the Mean Value Theorem, then we would know that two differentiable functions defined on the same interval with the same derivative differ by a constant. That, combined with our version of FTC, would be enough to prove that if G'=f on [a,b] and f is Riemann integrable, then int_a^b f=G(b)-G(a). This is the version of FTC which is used everywhere in calculus and associated subjects.

Please look at the review material for the final.

5/1/2003 The final is scheduled for

### Tuesday, May 13, 12:00-3:00 PM in SEC 205

and by popular demand (!?) I will have a
Review session on Saturday, May 10, at 1 PM in Hill 525
I will also have office hours in Hill 542 on Monday, May 12, from 1 PM to 5 PM. Almost surely I'll be in my office most days next week, and I will also respond to e-mail. I will try to produce some review material to hand out on Monday. We will cover a version of the Fundamental Theorem of Calculus in the last class.

Professor Cohen has created even more notes on Riemann sums, a total of 21 pages now. Please take a look.

Material related to what I discuss today is covered in the textbook in section 5.6 (see 5.6.1 through 5.6.4) and in section 7.2 (see 7.2.7).

In probability one builds models of "chance". The cdf (cumulative distribution function) of a random variable X, which is defined by f(x)=probability{X<=x} contains most of the useful probability information. Quantities such as the mean (expectation) and the variance can be computed from it (usually involving various integrals). The function I'd like to study today has the essential properties of a cdf which are listed below. I won't discuss a probability "model" that this function might come from.

1. f's values are in [0,1].
2. The limit of f(x) as x-->infinity is 1.
3. The limit of f(x) as x-->-infinity is 0.
4. If x<y, then f(x)<=f(y).
5. If a is in R, then the limit of f(x) as x-->a- exists. In the language of 311, this is the limit of f(x) as x-->a when the domain of f is (-infinity,a).
6. If a is in R, then the limit of f(x) as x-->a+ exists and equals f(a). In the language of 311, this is the limit of f(x) as x-->a when the domain of f is (a,infinity).
Since this is Math 311, I should prove something. So I will show that if property 4 holds (if x<y, then f(x)<=f(y)), then property 5 is true (if a is in R, then the limit of f(x) as x-->a- exists).
Theorem Suppose for all x, y in R, if x<y, then f(x)<=f(y). Then the limit of f(x) as x-->a- exists and is equal to sup{f(x):x<a}.
Proof: Let's call the set {f(x) : x< a}, W, and call its sup, S. Why should S exist? We will use the completeness axiom. W is not empty (since, for example, f(a-33) is in W). W is bounded above, and one upper bound is f(a) (this uses the increasing assumption, of course). Therefore S exists using the completeness axiom on the set W, which is non-empty and bounded above.

I claim that the stated limit exists. This is a one-sided limit, so the following implication must be verified: if epsilon>0 is given, then there is a delta>0 so that if a-delta<x<a, then S-epsilon<f(x)<=S. Since S is sup{f(x) : x< a}, given epsilon>0, there will be w in W so that S-epsilon<w<=S. But w is a value of f, so that there is v<a with f(v)=w. Take delta to be a-v. Then if a-delta<x<a, we know that a-(a-v)<x<a, so v<x<a. Since f is increasing, we may "apply" f to this inequality and get f(v)<=f(x)<=f(a). This means that w<=f(x)<=f(a). But w>S-epsilon, and since x<a, f(x)<=S because S is the sup of W. Now we have S-epsilon<f(x)<=S, which is what we wanted and the proof is done.

Of course a similar statement is true about limits "on the right" with sup replaced by inf.
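A tiny illustration of the theorem (mine, not from the notes): for the increasing step function floor(x), the sup of the values to the left of a=1 is the left-hand limit, while f(1) itself jumps above it.

```python
# For increasing f, the left-hand limit at a is sup{f(x) : x < a}.

import math

f = math.floor          # increasing, with a jump at each integer
a = 1.0

# sample values of f just to the left of a
left_values = [f(a - 1.0 / k) for k in range(1, 100)]
sup_left = max(left_values)   # = 0, the left-hand limit at a
# f(a) = 1, so the jump of f at a is f(a) - sup_left = 1
```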

Now I'll begin creating a weird example. We know that Q is a countably infinite set (go back and look at the first day or two of this course). In fact, Q intersected with any interval of positive length is countably infinite. Countably infinite means that there is a bijection (a pairing, a function which is 1-to-1 and onto) between N, the positive integers, and the set. So there is a bijection B:N-->{elements of Q, the rational numbers, in the open interval (0,1)}. Now remember that sum_{n=1}^infinity 1/2^n is 1. So what's f(x), finally?

f(x) = the sum of 1/2^n over those n's which have B(n)<=x.

This is a weird definition. Since the range of B, the bijection, is only the rationals in (0,1), if x is less than or equal to 0, there are no B(n)'s less than or equal to x. Therefore the sum is "empty", and the legal (?!) interpretation of an empty sum is 0. Thus f(x)=0 for x<=0. Now if x>=1, all of the rationals in the open interval (0,1) are below x, so f(x) must be 1. Notice also that f's values are some sort of "subsum" of the complete sum of 1/2^n for n in N, and therefore f(x) must be in [0,1] for all x. So we have verified requirements 1, 2, and 3 for cdf's above.
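Here is a Python sketch of this construction (the notes use Maple below; the enumeration of the rationals by increasing denominator is my own arbitrary choice of B, and the sum is truncated at N=30 terms, which changes f by at most 2^-30):

```python
# A truncated version of the weird f, with one concrete choice of the
# bijection B: list the rationals in (0,1) in increasing order of
# denominator.

from math import gcd

def rationals(N):
    # B(1), B(2), ..., B(N): the first N rationals in (0,1),
    # ordered by denominator
    out = []
    q = 2
    while len(out) < N:
        for p in range(1, q):
            if gcd(p, q) == 1:
                out.append(p / q)
                if len(out) == N:
                    break
        q += 1
    return out

N = 30
B = rationals(N)

def f(x):
    # sum of 1/2**n over those n with B(n) <= x
    return sum(2.0 ** -(n + 1) for n in range(N) if B[n] <= x)

# f(x) = 0 for x <= 0, f(1) = 1 - 2**-30, and f is increasing.
```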

Things will get even more interesting when we look at #4. If x<y and x and y are in the unit interval, then the interval (x,y) has infinitely many rational numbers between 0 and 1 in it. Therefore f(x) must be strictly less than f(y). Thus, on [0,1], f is a strictly increasing function: f(x)<f(y) if x<y. This is more than #4 requires.

We saw that #4 implies that the left- and right-hand limits exist. So all we need to do is investigate where f is continuous. Well, since lim_{x-->a-} f = sup{f(x):x<a} = LEFT and lim_{x-->a+} f = inf{f(x):x>a} = RIGHT, we just need to think about where f(a) fits. Certainly since f is increasing, we know that LEFT<=RIGHT. We'll say that f has a jump at a if LEFT<RIGHT, and the amount of the jump is RIGHT-LEFT. Can f have a jump of, say, 33? That is, can RIGHT-LEFT be 33? Since f's values are in [0,1], this is not possible. Can f have a jump of 1/33? It could have such a jump, but it couldn't have too many of them: certainly no more than 33, because f can't jump down, only up, since f is increasing. In fact the total size of all the jumps of f can be at most 1, since f(large negative) is 0 and f(large positive) is 1. Now where do the jumps take place?

Let's imagine an example where, say, B(17)=3/7. If x<3/7, the sum for f(x) does not have the 1/2^17 term in it. As x increases "towards" 3/7, f(x) increases towards sum_{B(n)<3/7} 1/2^n, which is sup{f(x):x<3/7}. If x>=3/7, the sum does have the term 1/2^17. And inf{f(x):x>3/7} is exactly 1/2^17 larger than sup{f(x):x<3/7}, and is equal to f(3/7).

In fact, f has a jump of 1/2^n at B(n): f has a jump at every rational number in (0,1). The total of the jumps at the rationals is 1/2+1/4+1/8+...=1. There are no other jumps, since any additional jump would mean that the total increase of f exceeds 1, and we already know this is not possible.

In [0,1], this f is continuous at every irrational number and at 0 and 1, and it fails to be continuous at each rational number in (0,1). Is f a cdf of a continuous distribution? Is it a cdf of a discrete distribution? Well, maybe f shares aspects of both kinds of distributions.

Is f Riemann integrable on [0,1]? Of course this is the same as asking if, given epsilon>0, we can find a partition P of [0,1] so that US(f,P)-LS(f,P)<epsilon. For example, suppose we take epsilon=1/10. Is there some partition which clearly satisfies the requirement? Remember that f is increasing on [0,1], and f(0)=0 and f(1)=1. If we partition [0,1] into n equal subintervals of width 1/n, then the difference US(f,P)-LS(f,P) must actually equal the total increase (1) multiplied by the width of the subintervals (1/n), so the difference is 1/n. And if n>10, we have a satisfactory partition. And by the Archimedean property, we can always find n so 1/n<epsilon. In fact, any increasing function must be Riemann integrable. Any decreasing function must be Riemann integrable. There are problems with integrability when there is much combined "wiggling" up and down.
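The telescoping computation above can be seen in code. This sketch (my addition) computes US(f,P)-LS(f,P) for an increasing f on a uniform partition, using the fact that for an increasing function the sup and inf on each subinterval are the endpoint values:

```python
# For increasing f on a uniform partition of [a, b] into n pieces,
# US - LS = sum of (f(right) - f(left)) * width, which telescopes to
# (f(b) - f(a)) * (b - a) / n.

def upper_lower_gap(f, a, b, n):
    width = (b - a) / n
    gap = 0.0
    for j in range(n):
        left, right = a + j * width, a + (j + 1) * width
        gap += (f(right) - f(left)) * width   # (sup - inf) * width
    return gap

f = lambda x: x ** 2          # increasing on [0, 1]
# the gap should be (f(1) - f(0)) * (1/n) = 1/n, so any epsilon can be
# beaten by taking n large (Archimedean property)
```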

What is the Riemann integral of f on [0,1]? Some thought should convince you that we know approximately what the integral is: it must be between 0 and 1. But I don't know more than that. In fact, until after this lecture I really didn't think much about what f looks like. So here is what I did: I asked Maple to "draw" an approximation of the graph of an f. I listed 30 rational numbers between 0 and 1, and had Maple draw the approximation to the graph of f by just using these thirty rational numbers, in order, as B(1), B(2), ..., and B(30). The total "weight" that's left over sums to at most 1/2^30, which is less than 10^-8, a very small number. So the graph drawn is quite close (probably beyond screen resolution) to a "true" graph of f. Here are the Maple procedures I used.

`gen:=rand(1..99999);`
This asks Maple to create a "random" integer between 0 and 99,999.
`A:=[seq((gen()/100000),j=1..30)];n:=30;`
This asks Maple to create a sequence of "random" rationals (the integers divided by 100,000) in the open unit interval. It assigns this sequence to the name A. The next statement creates the variable n with value 30.
```
bin := proc(x)
  # value of the approximating f at x: add the weight 1/2^j
  # for each listed rational A[j] that lies below x
  local y, inc, j;
  global A, n;
  y := 0;
  inc := .5;
  for j to n do
    if A[j] < x then y := y + inc fi;
    inc := .5*inc
  od;
  RETURN(y)
end;
```
The procedure bin uses the global variables A and n to get values of the function f(x) depending on the specifications of A and n.
```
area := proc()
  # integral of the approximating f on [0,1]: the jump of size 1/2^j
  # at A[j] contributes (1 - A[j])*(1/2^j) to the area
  local y, j, inc;
  global A, n;
  y := 0;
  inc := .5;
  for j to n do
    y := y + (1. - A[j])*inc;
    inc := .5*inc
  od;
  RETURN(y)
end;
```
This Maple procedure gets the area of the approximation to the function. I then plotted the approximation with the command
`plot(bin,0..1,thickness=3,color=black);`
One value of A is this:
```
A := [8193/50000, 79919/100000, 3341/6250, 63119/100000, 38091/50000,
      1436/3125, 18757/100000, 8339/20000, 69373/100000, 4197/10000,
      12881/50000, 29181/100000, 23023/50000, 41/1000, 89573/100000,
      79983/100000, 287/800, 4481/12500, 12149/100000, 2917/100000,
      51331/100000, 13887/100000, 327/20000, 23963/25000, 9167/12500,
      1109/100000, 29977/100000, 94989/100000, 13559/50000, 9011/20000];
```
The associated graph is shown and it has approximate area .57585733. One surprising aspect of the graph to me was the enormous flatness of most of it: but of course the graph is not "flat" anywhere (any interval has infinitely many rationals, so the graph must increase). The amount of increase is mostly very very very small. Maple also displays the vertical jumps with vertical line segments.

I don't know or understand very much about the possible values of int_0^1 f. It is between 0 and 1, but it can be made very close to either end: if I select the first 30 "random" rationals close to 0, then I get the first graph shown below and the area is .97827779, and if I select the first 30 "random" rationals close to 1, then I get the second graph shown below and the area is .05210699.
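The dependence of the area on where the rationals cluster is easy to reproduce. This Python sketch (mine) uses the same formula as the Maple `area` procedure: a jump of size 1/2^n located at A[n] contributes (1-A[n])·(1/2^n) to int_0^1 f. The particular lists of points near 0 and near 1 are my own arbitrary choices:

```python
# area of the truncated f: each jump of size 1/2**n at the point A[n]
# adds (1 - A[n]) * 2**-n to the integral over [0, 1]

def area(A):
    return sum((1.0 - a) * 2.0 ** -(n + 1) for n, a in enumerate(A))

near_zero = [0.001 * (k + 1) for k in range(30)]       # points near 0
near_one  = [1 - 0.001 * (k + 1) for k in range(30)]   # points near 1

# area(near_zero) is close to 1; area(near_one) is close to 0
```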

Problem Is there a bijection B which has int_0^1 f=1/2? I don't know. In fact I don't know any specific value (or non-value!) of int_0^1 f. Certainly fairly easy reasoning (moving around the big blocks) shows that the values of int_0^1 f are dense in (0,1), but I really don't know the answer to the question just asked. I suspect it is "yes".

4/30/2003 Tomorrow I will give out student evaluation forms. Also tomorrow I will request information on when I can usefully be available before the final exam. The final is scheduled for

### Tuesday, May 13, 12:00-3:00 PM in SEC 205

I began by discussing the material marked in color in yesterday's diary entry. We went through the proof in detail, and I think at least some students understood and verified it.

Students did not want to discuss the other material marked in color in yesterday's diary entry (additivity of the integral over intervals). This material will be used in the proof of a version of the Fundamental Theorem of Calculus, which I hope to give at the last meeting of the class on Monday.

Historically probability has been the inspiration of many of the more intricate results about integration. I now take a small detour to present a complicated example of a function. First I tried to give some background on probability.

Probability originated in the 1600's in an effort to predict gambling odds. Here's the basic idea, as it is now understood. One plays a game "many" times and observes the outcomes. A quotient called the "relative frequency" is computed: this is the (number of outcomes of a desired type) divided by (total number of times the game has been played). Of course relative frequency is a number between 0 and 1. Now the idea or hope or desire is that as the (total number of times the game has been played) gets large (approaches infinity?), this relative frequency should somehow "stabilize" or approach a limit. This limit is called the probability of the outcome of the desired type. Since the limit of a sequence of numbers in [0,1] is also a number in [0,1], the probability of a collection of outcomes is always in [0,1]. Of course this is all a model of reality, and building these models can be difficult. And certainly the relevance to "reality" of what's deduced using these models can also be debated. But that's the basic idea. Now I'll introduce some vocabulary and illustrate it with a few examples.

Vocabulary and (approximate) definitions:

- Outcome: one of a list of possible results of "the game".
- Sample space: the collection of all possible outcomes.
- Event: a collection of certain specified outcomes; an event is a subset of the sample space.
- Probability: an assignment of a number in [0,1] to an event; a measurement of how "likely" the event is.
- Random variable: a real-valued function defined on the sample space. Maybe think of such a function as the amount of "winnings" (in \$?) that each outcome generates.

Example: a fair die. The outcomes are classified by the number of dots on the face showing. The sample space could be labeled {1,2,3,4,5,6}. One event could be the collection of "odd" outcomes: {1,3,5}. There are 2^6=64 different events. The probability of an event is measured by the number of distinct outcomes in it divided by 6, so pr({1,3,5})=3/6. One random variable could be the number of dots showing: an integer from 1 to 6.

Example: a fair coin, flipped until heads shows. The game ends when the first head shows. One outcome could be H. Another could be TTTH, which can be abbreviated as T^3H. Another could be a sequence of all T's, abbreviated as T^infinity. So for each n in N we have the outcome T^(n-1)H, and we also have T^infinity. One event could be all outcomes with at most 50 flips: {T^(n-1)H with 1<=n<=50}. A fair coin flipped independently would likely get the following: pr(T^(n-1)H)=1/2^n, and pr(T^infinity)=0. See Note 1 below. A random variable could be the number of flips needed. This random variable is defined and is a real number on all of the outcomes T^(n-1)H. It isn't defined on T^infinity, but that doesn't matter since the probability of that happening is 0.

Example: pick a number at random from [0,1], all numbers being equally likely. 1/3 is an outcome, and so is Pi/7 and sqrt(2)-1. The sample space is [0,1]. Well, one event could be A={x with 0=

Note 1 Here is one unpleasant consequence of this model. Since the probabilities of all the outcomes in the sample space should add up to 1, there is no positive number "left over" to assign as the probability of T^infinity (since the sum of 1/2^n as n goes from 1 to infinity is 1), so its probability must be 0! So here is a conceivable event which happens hardly ever, according to this model.

Note 2 This "game" is called "choosing a number from the unit interval uniformly at random". Clearly (?) the correct model should assign to each "interval event" a probability equal to the length of the interval. Since the events are subintervals of [0,1], the lengths are correctly weighted so that the probability of the whole sample space is 1, as it should be. But then the probability of, say, the event which is the interval (1/3-1/n,1/3+1/n) is 2/n for all positive integers n. Then since pr({1/3})<=pr((1/3-1/n,1/3+1/n)) should be true (smaller events should have smaller probabilities!), we see that pr({1/3})<=2/n for all n in N. Thus (Archimedean property!) pr({1/3})=0 in this model. What is more unsettling: the probability of any one-number event is 0! So the chance of picking any particular number, according to this model, is 0, but we've got to pick some number! Both Note 1 and Note 2 show the paradoxes of trying to model infinite "games": a series of reasonable rules leads to weirdness, and these weirdnesses seem to be unavoidable.

The most famous results of probability deal with repeated experiments and the tendency of random variables to have nice "asymptotic" properties. One such result is the Central Limit Theorem, which essentially states that sums of many repeated independent experiments are governed by the normal curve. Here are two applets simulating the CLT, one with dice and one with a sort of pachinko-like "game". Such results are usually understood and investigated using the cumulative distribution function, cdf, of a random variable X. So cdf's are extremely important in probability.

If X is a random variable, the cumulative distribution function, f of X is defined by this:
f(x)=the probability that X is less than or equal to x. That is, f(x)=pr(X<=x).

Some effort is needed to be acquainted with this definition. Let's look at our three random variable examples, and graph their cdf's.

Tossing a fair die Here there are jumps of 1/6 at 1 and 2 and 3 and 4 and 5 and 6. A graph of the cdf follows. There is a solid dot where the value of the function is, and an empty circle where it "isn't".

The values of the cdf are always in [0,1], since it is a probability. The cdf is always increasing but may not be strictly increasing. That is, if x<y, then f(x)<=f(y).

Flipping until a head occurs Here there are jumps of 1/2 and 1/4 and 1/8 and ... at 1 and 2 and 3 and ... This graph takes some thinking about.

Notice that this cdf never "reaches" 1, but its limit as x-->infinity is 1. We could easily get cdf's which are never 0, but whose limit as x-->-infinity is 0.

Squaring a uniformly distributed number from [0,1] What is the probability that the square is less than or equal to 1/2? This is the same as asking for the length of the interval of x's in [0,1] for which x^2<=1/2. That interval is [0,1/sqrt(2)], so its length is 1/sqrt(2). Therefore the graph of the cdf is sqrt(x) for x in [0,1], 0 for x<0, and 1 for x>1.
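This can be checked by simulation. A small Monte Carlo sketch (my addition; the sample size and seed are arbitrary):

```python
# If X is uniform on [0,1], the cdf of X**2 at y is
# pr(X**2 <= y) = pr(X <= sqrt(y)) = sqrt(y).

import random

random.seed(0)
samples = [random.random() ** 2 for _ in range(200000)]

def empirical_cdf(y):
    # fraction of simulated values of X**2 that are <= y
    return sum(1 for s in samples if s <= y) / len(samples)

# empirical_cdf(0.25) should be near sqrt(0.25) = 0.5
```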

The first two random variables are examples of discrete random variables, and the third is a continuous random variable. In many manipulations concerning random variables, we build the model (as described above), but then once the cdf of a random variable is known, almost all other information is discarded, and work is done with the cdf alone. We can generalize some facts about cdf's from these examples:
Properties of cdf's
Suppose f is a cdf. Then:

1. f's values are in [0,1].
2. The limit of f(x) as x-->infinity is 1.
3. The limit of f(x) as x-->-infinity is 0.
4. If x<y, then f(x)<=f(y).
5. If a is in R, then the limit of f(x) as x-->a- exists. In the language of 311, this is the limit of f(x) as x-->a when the domain of f is (-infinity,a).
6. If a is in R, then the limit of f(x) as x-->a+ exists and equals f(a). In the language of 311, this is the limit of f(x) as x-->a when the domain of f is (a,infinity).

Sometimes people say that properties 5 and 6 mean that the cdf is a cadlag function (!). This is an acronym for the French phrase "continue à droite, limite à gauche": the function is continuous from the right, and has left-hand limits. We will discuss and further verify these properties tomorrow, and also try to count the number of jumps that any cdf can have. And we will look at a remarkable cdf.

In the case of continuous random variables, another function is sometimes studied, the density function. This turns out to be the derivative of the cdf, and its utility for discrete random variables is not immediately clear. (What should the derivative of a mostly horizontal function be?) So I will just look at cdf's here, today and tomorrow.

4/28/2003 Again we are going through the technicalities on integral and order, integral and linearity, and additivity of the integral over intervals. This takes effort and discipline, but Math 311 is the course whose total object is constructing calculus (also called "analysis") with all the interconnections showing. So let's move on and finish up these technicalities.

Proposition (negating integrands) Suppose f is Riemann integrable on [a,b]. Then the function h defined by h(x)=-f(x) is also Riemann integrable on [a,b], and int_a^b h=-int_a^b f.
Proof: If A is a bounded subset of the reals, define B to be the set consisting of -a for a in A. Then -sup{x in A}=inf{x in B}. This is an exercise in sups and infs, suitable for earlier in the course. For example, if A=(-13,5], then B is [-5,13), and sup A is 5 while inf B is -5. We will apply this repeatedly.
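For finite sets, where sup is max and inf is min, the remark can be checked directly (my toy illustration, not part of the notes):

```python
# If B = {-a : a in A}, then inf B = -sup A (and sup B = -inf A).
# For finite sets, sup is max and inf is min.

A = [-13.0, -2.5, 0.0, 3.0, 5.0]
B = [-a for a in A]

assert min(B) == -max(A)      # inf B = -sup A
assert max(B) == -min(A)      # sup B = -inf A
```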

Initially I want to show that h is Riemann integrable. I will use the necessary and sufficient condition with epsilon. What do I mean? Given epsilon>0, since f is Riemann integrable there is a partition P of [a,b] so that US(f,P)-LS(f,P)<epsilon. Since the upper sums and lower sums of h involve infs and sups of negated values, we can apply the previous remark on each subinterval. We get US(h,P)=-LS(f,P) and LS(h,P)=-US(f,P). Therefore US(h,P)-LS(h,P)=-LS(f,P)-(-US(f,P))=US(f,P)-LS(f,P)<epsilon. We now know that h is Riemann integrable.

We need to show that int_a^b h=-int_a^b f. Let's look at UI(h,[a,b]). This is the inf of the upper sums of h. But each upper sum of h is minus a lower sum of f. Therefore the inf of h's upper sums is (again by the remark above!) equal to minus the sup of the lower sums of f. But this is -LI(f,[a,b]). Therefore we have shown that UI(h,[a,b])=-LI(f,[a,b]). Since f and h are Riemann integrable, the upper and lower integrals of each are equal to the integral of each. That is, we have verified int_a^b h=-int_a^b f, as desired.
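The identity US(h,P)=-LS(f,P) used above can be watched numerically. This sketch (mine) approximates upper and lower sums by dense sampling; the negation identity holds exactly because the max of the negated sample values is minus their min:

```python
# On any partition, the upper sum of h = -f is minus the lower sum of f,
# since sup(-f) = -inf(f) on each subinterval.

def sums(f, points, samples=1000):
    # approximate US and LS by sampling each subinterval densely
    US = LS = 0.0
    for left, right in zip(points, points[1:]):
        vals = [f(left + (right - left) * k / samples)
                for k in range(samples + 1)]
        US += max(vals) * (right - left)
        LS += min(vals) * (right - left)
    return US, LS

f = lambda x: x * (1 - x)
P = [0.0, 0.3, 0.7, 1.0]

US_f, LS_f = sums(f, P)
US_h, LS_h = sums(lambda x: -f(x), P)
# US_h = -LS_f and LS_h = -US_f, exactly
```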

Note This differs slightly from the presentation made in class. I believe it is more systematic, and perhaps better. I am not sure, though.

Proposition (addition of functions) Suppose f and g are Riemann integrable functions on [a,b]. If the function h is defined by h(x)=f(x)+g(x), then h is Riemann integrable on [a,b], and int_a^b h=int_a^b f+int_a^b g.
Proof: Again we must show that h is Riemann integrable first. Given epsilon>0, we can find a partition P of [a,b] (a common refinement of one partition for f and one for g) so that US(f,P)-LS(f,P)<epsilon/2 and US(g,P)-LS(g,P)<epsilon/2. On each subinterval, sup(f+g)<=sup f+sup g and inf(f+g)>=inf f+inf g, so US(h,P)<=US(f,P)+US(g,P) and LS(h,P)>=LS(f,P)+LS(g,P). Now we subtract, reversing the inequalities in the proper fashion: US(h,P)-LS(h,P)<=[US(f,P)-LS(f,P)]+[US(g,P)-LS(g,P)]<epsilon/2+epsilon/2=epsilon. We now know that h is Riemann integrable.
For the value of the integral, the same subinterval inequalities give UI(h,[a,b])<=UI(f,[a,b])+UI(g,[a,b]) and LI(h,[a,b])>=LI(f,[a,b])+LI(g,[a,b]). But f and g and f+g are all Riemann integrable, so UI=LI=int_a^b for each of them, and the two inequalities squeeze in opposite directions. Thus int_a^b h=int_a^b f+int_a^b g.
Note I think that I made a mistake in this proof in class. What I've written above is, I think, correct. But it is different from what I stated in class. Please look at your notes and decide. Let me know, please.
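The key subinterval inequality sup(f+g)<=sup f+sup g can be seen numerically (my sketch; the choice of sin, cos and of the partition is arbitrary). The inequality is strict here because the sups of f and g occur at different points:

```python
# Check US(f+g, P) <= US(f, P) + US(g, P): on each subinterval the sup
# of a sum is at most the sum of the sups.

import math

def upper_sum(f, points, samples=500):
    total = 0.0
    for left, right in zip(points, points[1:]):
        vals = [f(left + (right - left) * k / samples)
                for k in range(samples + 1)]
        total += max(vals) * (right - left)
    return total

f = math.sin
g = math.cos
P = [0.0, 1.0, 2.0, 3.0]

lhs = upper_sum(lambda x: f(x) + g(x), P)
rhs = upper_sum(f, P) + upper_sum(g, P)
# lhs <= rhs; strictly less here, since sin and cos peak at
# different points of each subinterval
```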

The last three propositions can be abbreviated by writing that:
The collection of Riemann integrable functions on [a,b] is a real vector space, and the mapping from these functions to the real numbers defined by integration is a linear transformation.

Finally here's the last technical result we need. Again, all of these should be familiar from calculus days!
Theorem (Additivity over intervals) Suppose a<b<c. I) If f is Riemann integrable on [a,c], then f is Riemann integrable on [a,b] and on [b,c]. II) If f is Riemann integrable on [a,b] and on [b,c], then f is Riemann integrable on [a,c]. In either case, int_a^c f = int_a^b f + int_b^c f.
Proof of I): Given epsilon>0, take a partition P of [a,c] with US(f,P)-LS(f,P)<epsilon; we may assume b is one of the points of P, since inserting b can only shrink the difference. Then P splits into a partition P1 of [a,b] and a partition P2 of [b,c], and US(f,P)=US(f,P1)+US(f,P2) while LS(f,P)=LS(f,P1)+LS(f,P2). Each of US(f,P1)-LS(f,P1) and US(f,P2)-LS(f,P2) is non-negative, and their sum is less than epsilon, so each is less than epsilon. Therefore f is Riemann integrable on [a,b] and on [b,c]. Considering all such partitions also shows that UI(f,[a,c])=UI(f,[a,b])+UI(f,[b,c]) and LI(f,[a,c])=LI(f,[a,b])+LI(f,[b,c]). Since we already know these functions are Riemann integrable, UI=LI=the integral on each interval, and we have verified that int_a^c f=int_a^b f + int_b^c f.
Note There is much repetition of ideas in all of these proofs!
Proof of II): Since f is Riemann integrable on [a,b] and on [b,c], given epsilon>0 there is a partition Q1 of [a,b] and a partition Q2 of [b,c] so that US(f,Q1)-LS(f,Q1)<epsilon/2 and US(f,Q2)-LS(f,Q2)<epsilon/2. Then Q=Q1 union Q2 is a partition of [a,c] with US(f,Q)-LS(f,Q)=[US(f,Q1)-LS(f,Q1)]+[US(f,Q2)-LS(f,Q2)]<epsilon, so f is Riemann integrable on [a,c], and the integrals add as before.

Possible question of the day Suppose g is Riemann integrable on [0,5], and you are told that int_0^5 g=13, and that |g(x)|<=3 for all x in [0,2]. What over- and underestimates can you make about int_2^5 g, and why?
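One way to reason about it, sketched in code (my addition): fact 2 above bounds int_0^2 g, and additivity over intervals does the rest.

```python
# |g| <= 3 on [0,2] gives |int_0^2 g| <= 3 * (2 - 0) = 6, and by
# additivity int_2^5 g = int_0^5 g - int_0^2 g = 13 - int_0^2 g.

bound = 3 * (2 - 0)              # |int_0^2 g| <= 6
low, high = 13 - bound, 13 + bound
# so 7 <= int_2^5 g <= 19
```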

First was MEGO. The web site www.acronymfinder.com reports that this means "My Eyes Glaze Over (during a boring speech or briefing)".

I asked for the source of the quotation "sup of the evening, beautiful sup" -- this was a misspelling of the word "soup", and the phrase comes from chapter 10 of Lewis Carroll's "Alice's Adventures in Wonderland", where it is the first line of a song that the Mock Turtle sings. John Tenniel's historic illustration is shown.

The Mock Turtle also discusses its education, and remarks that it studied `the different branches of Arithmetic-- Ambition, Distraction, Uglification, and Derision.'

Lewis Carroll was actually an academic mathematician at Oxford University named Charles Lutwidge Dodgson. Biographical information is abundant.

4/24/2003 The Question of the Day
Suppose f:R-->R is defined by f(x)=5 when x=3 and f(x)=-9 when x=6, while f(x)=0 for all other x's. Is f Riemann integrable on [2,7], and, if it is, what is the Riemann integral of f on that interval?

I began by observing that the special arguments last time actually proved more than I stated.
Theorem (Integrability of Lipschitz functions) Suppose f satisfies a Lipschitz condition on [a,b]: that is, there is a constant K>0 so that for all x,y in [a,b], |f(x)-f(y)|<=K|x-y|. Then f is Riemann integrable on [a,b].

Comment The method of proof actually also provides the beginning of an algorithm to approximate definite integrals, so the work is not totally wasted, even though the conclusions of the theorem to be stated below apply to many more functions than this one.

Corollary Suppose f is differentiable on [a,b], and there is K>0 so that |f'(x)|<=K for all x in [a,b]. Then f is Lipschitz and therefore Riemann integrable.

We reconsidered an example discussed on 4/14/2003 (see the material on boxes and butterflies): the function sqrt(x) on the interval [0,1]. Since f'(x)=1/(2 sqrt(x)) for x>0, the derivative is not bounded on (0,1]. And we actually saw that this f does not satisfy a Lipschitz condition on [0,1]. But everyone who has been through a calculus course knows that sqrt(x) for x between 0 and 1 does have an area, and this area is even easily computable with the Fundamental Theorem of Calculus. So how can we verify this function is Riemann integrable on [0,1]? The following result is a major success of the course.

Theorem (continuous functions are Riemann integrable) Suppose f is continuous on [a,b]. Then f is Riemann integrable on [a,b].
Proof: The key to this proof is to use uniform continuity to "control" the amount of variation of the function on an interval. So we know: given eta>0, there is alpha>0 so that if x and y are in [a,b] and if |x-y|<alpha, then |f(x)-f(y)|<eta. I'll try to relate alpha and eta to what we need to verify Riemann integrability.

We will use a simple partition again. So P will break up [a,b] into n equal intervals, each of length (b-a)/n. The number of boxes is n. In each subinterval, the inf and sup of f on [xj-1,xj] is actually assumed: there are numbers mj and Mj so that sup of f on [xj-1,xj] is Mj=f(pj) and inf of f on [xj-1,xj] is mj=f(qj). This is a consequence of continuity. The difference between the upper and lower sums will be bounded by:
(the number of boxes)·(the width of the boxes)·(the max height of the boxes). If we can show that this will be less than some given epsilon>0, we will be done. But the number of boxes is n, and the width is always (b-a)/n. Therefore if the height of the boxes is always less than epsilon/(b-a), we will be done. But the height is f(pj)-f(qj) with |pj-qj|<(b-a)/n. So now we use uniform continuity. We want eta to be epsilon/(b-a). Uniform continuity tells us there is some alpha>0 which guarantees that if |pj-qj|<alpha then f(pj)-f(qj)<eta. Now take n large enough so that (b-a)/n<alpha, always possible by the Archimedean property. And we are done.
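A computational aside (not from the lecture): for an increasing continuous function the sup and inf on each subinterval occur at the endpoints, so the difference between upper and lower sums over n equal subintervals telescopes to exactly (f(b)-f(a))·(b-a)/n. This Python sketch watches that gap shrink for f=sqrt on [0,1], the motivating example above.

```python
import math

def gap(f, a, b, n):
    """US(f,P)-LS(f,P) for an increasing f and the equal partition with n pieces."""
    w = (b - a) / n
    xs = [a + j * w for j in range(n + 1)]
    upper = sum(f(xs[j + 1]) * w for j in range(n))
    lower = sum(f(xs[j]) * w for j in range(n))
    return upper - lower

# For increasing f the gap telescopes to (f(b)-f(a))*(b-a)/n, so it --> 0,
# which is the "epsilon condition" for Riemann integrability.
for n in (10, 100, 1000):
    print(n, gap(math.sqrt, 0, 1, n))
```

Note that sqrt is handled here even though it is not Lipschitz on [0,1]: monotonicity (or, in the theorem, uniform continuity) does the work instead.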

An example or two
I will assume the standard properties of sine for these examples (we could instead get functions whose graphs would be polygons with similar properties, but would it be worth the trouble to define them?).

Here is a picture of the function f(x)=x·sin(1/x) for x not 0, and f(0)=0. This function is continuous in [0,1]. If (xj) is any sequence in (0,1] with 0 as a limit, then the sequence (f(xj)) is squeezed by the "x" and the limit will be 0. I had Maple draw in the "squeezing lines", +/-x, as well as x·sin(1/x) in this picture. This function has an infinite number of wiggles up and down in [0,1], but it is still Riemann integrable.

Now consider the function sin(1/x) on (0,1]. A first observation is that there's no way to define this function at 0 so it will be continuous there. That's because it is possible to find sequences (xj) in (0,1] whose limits are 0 for which (f(xj)) could be either a sequence without a limit, or sequences with different limits (it isn't hard to get explicit sequences converging to 0 or 1 or -1, for example). So there's no "natural" f(0). For simplicity, let's define f(0)=0. I claim that this f is indeed Riemann integrable on [0,1]. Here is a verification of this claim.

Given epsilon>0, we need a partition P of [0,1] so that US(f,P)-LS(f,P) is less than epsilon. The maximum height of any box in that difference is 2 since the range of sine is [-1,1]. So let's "waste" epsilon/2 on a first box: take 0 and x1 so that x1<epsilon/4. Then the upper-lower on that subinterval must be less than (epsilon/4)·(maximum height)=epsilon/2. Now consider sin(1/x) on the interval [x1,1]: there the function is continuous, hence Riemann integrable, and there is a partition Q of [x1,1] so that US(f,Q)-LS(f,Q)<epsilon/2. Now take P to be the points in Q together with 0, and the discrepancy will be at most the sum of the two, so therefore we have the desired partition whose difference between upper and lower sums is less than epsilon. A picture may help understanding. The large vertical box all the way on the left of the picture contains infinitely much wiggling of sin(1/x). The finite amount of wiggling not contained in that box is "captured" inside a finite sequence of other boxes.
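A computational aside (not from the lecture): the "waste a box" argument can be imitated numerically. In this Python sketch the sups and infs on [x1,1] are only approximated by sampling, so the computed gap is an estimate, but it stays comfortably inside the epsilon budget.

```python
import math

def f(x):
    return math.sin(1.0 / x) if x != 0 else 0.0

def approx_gap(a, b, n, samples=50):
    """Estimate US-LS on [a,b] with n equal subintervals, approximating
    the sup and inf on each subinterval by sampling (a slight underestimate)."""
    w = (b - a) / n
    total = 0.0
    for j in range(n):
        lo = a + j * w
        vals = [f(lo + k * w / samples) for k in range(samples + 1)]
        total += (max(vals) - min(vals)) * w
    return total

eps = 0.1
x1 = eps / 4                       # "waste" a first box: height <= 2, width eps/4
first_box = 2 * x1                 # worst-case contribution of [0, x1]: eps/2
rest = approx_gap(x1, 1.0, 4000)   # fine partition of [x1, 1], where f is continuous
print(first_box + rest)            # total gap: below eps
```

The large first box absorbs the infinitely many wiggles near 0, exactly as in the picture described above; the finitely many remaining wiggles are captured by the fine partition.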

This web page has links to notes of Professor Cohen which we are (approximately) following. Please look down the page and find "Riemann Integral, Section 1".

 Not mentioned in class I really should have mentioned that the combination of the Lipschitz idea and the analysis above allows the creation of an algorithm to approximate the definite integral of a continuously differentiable function as accurately as desired. If |f'|<=K on [a,b] and we use m equal subintervals, the difference between the upper and lower sums is at most K(b-a)2/m. So if we want to approximate within 1/n we could take any Riemann sum with more than nK(b-a)2 subintervals. Any Riemann sum will be between the upper and lower sums, and those will both be within 1/n of the true value of the integral. Naturally, "real" numerical analysis considers algorithms which converge more rapidly, but what is described here is the beginning.
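Here is a Python sketch of that algorithm (my own illustration; the a priori bound K(b-a)2/m is the one coming from the Lipschitz analysis). Any Riemann sum with m subintervals lands within that bound of the true integral, which for sin on [0,2] we know from FTC.

```python
import math

def midpoint_sum(f, a, b, m):
    """A Riemann sum with m equal subintervals and midpoint sample points."""
    w = (b - a) / m
    return sum(f(a + (j + 0.5) * w) * w for j in range(m))

# f(x) = sin(x) on [0,2] has |f'| <= K = 1, so any Riemann sum with m
# subintervals is within K*(b-a)**2/m = 4/m of the true integral.
a, b, K = 0.0, 2.0, 1.0
true_value = 1 - math.cos(2)      # from the Fundamental Theorem of Calculus
for m in (10, 100, 1000):
    err = abs(midpoint_sum(math.sin, a, b, m) - true_value)
    print(m, err, K * (b - a) ** 2 / m)
```

The midpoint choice actually converges much faster than the guarantee, which is the "real numerical analysis" point made above.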

Plans ...
We need to prove "technical" results about order, linearity of the integral, and additivity over intervals to help present a version of the Fundamental Theorem of Calculus. I'll also go through a short excursion on probability in order to show the class some interesting and almost unbelievable examples.

Theorem (order and integral) Suppose f and g are Riemann integrable on [a,b], and that for all x in [a,b], f(x)<=g(x). Then intabf<=intabg.
Proof: This result should be relatively easy to verify. Let's see: since f(x)<=g(x), the sup of g on an interval will be at least as large as the sup of f on the same interval. Therefore US(f,P)<=US(g,P) for all partitions, P: each of f's upper sums are less than or equal to the corresponding upper sums of g. Now what can we say about the inf of f's upper sums (what we called the upper integral of f, UI(f,[a,b])? Since UI(f,[a,b]) is a lower bound for all of f's upper sums, it is also a lower bound for all of g's upper sums. Therefore UI(f,[a,b])<=the greatest lower bound for all of g's upper sums, so UI(f,[a,b])<= UI(g,[a,b]). Since both f and g are Riemann integrable, the UI's are equal to the Riemann integrals, so that intabf<=intabg.

Corollary Suppose that there are real numbers m and M so that for all x in [a,b], m<=f(x)<=M and f is Riemann integrable in [a,b]. Then m(b-a)<=intabf<=M(b-a).
Of course this can also be proved if we just compute US(f,P) and LS(f,P) when P is the "trivial" partition, p={a,b}.

Linearity of the integral here will mean that intab(qf+g)=q·intabf+intabg when f and g are Riemann integrable on [a,b] and q is a constant. I will divide this into three parts, allowing me to concentrate on smaller steps:
i) (Positive homogeneity) intabq·f=q·intabf when q>0
ii) (Multiplication by -1) intab-f=-intabf
iii) (Additivity) intab(f+g)=intabf+intabg

Proposition (i) Positive homogeneity) Suppose q is a positive constant and f is Riemann integrable on [a,b]. If g is a function defined by g(x)=q·f(x), then g is Riemann integrable on [a,b] and intabg=q·intabf.
Proof: If S is any subset of R and we define qS to be the set of numbers y=qx where x is in S, then we know the following results from long ago:
If S is bounded above (or below), then qS is bounded above (or below). The converse is also true.
Why? If t is an upper bound of S, then t>=x for all x in S, so that qt>=qx always. Lower bounds work the same way. The converse is verified by "multiplying" qS by the positive number 1/q.
If S is bounded above, then sup(qS)=q sup(S).
Why? If the sup of S is v, then qv is an upper bound of qS. Also, if w is any upper bound of qS, (1/q)w is an upper bound of S, so that v<=(1/q)w, so that qv<=w, and we've verified that qv=sup(qS) as desired.

Now apply these observations to the collection of upper sums of g and the upper sums of f. Note that US(g,P)=qUS(f,P) because of the ideas above. So we have verified that UI(g,[a,b])=qUI(f,[a,b]). We can similarly verify that LI(g,[a,b])=qLI(f,[a,b]). Since f is Riemann integrable, UI(f,[a,b])=LI(f,[a,b]), implying that UI(g,[a,b])=LI(g,[a,b]), and therefore, g is also Riemann integrable. Whew! And we also know that the upper and lower integrals both multiply by q, so that intabg=q·intabf.

4/23/2003 We used the lemma proved last time to verify the following
Theorem (upper sums dominate lower sums) If S and T are any partitions of [a,b], then US(f,S)>=LS(f,T).
Proof: Here's the proof, which is very witty. Last time we proved that more points in the partition may make the upper sum decrease, but can't make it increase. A similar result (reversing directions, though!) is true for lower sums. Therefore, if P is the union of the partitions S and T, we have the following sequence of inequalities:
LS(f,T)<=LS(f,P)<=US(f,P)<=US(f,S).
The central inequality (between LS and US for P) is true because sups are bigger than infs, always.

Comment I can't imagine a totally convincing picture of the situation addressed in this result -- it seems really complicated. Temporarily, I defined: A=the set of all lower sums of f. That is, x is in A if x=LS(f,P) for some partition P of [a,b].
B=the set of all upper sums of f. That is, x is in B if x=US(f,P) for some partition P of [a,b].

The theorem just stated presents us with a situation which should be familiar from earlier work in the course (a month or more ago). The sets A and B have the following properties: if a is in A, then a is a lower bound of B, and if b is in B, then b is an upper bound of A. It is natural to look at the sup of A and the inf of B. Here we will use special phrases:
The sup of A is called the lower Riemann integral of f on [a,b], and is denoted LI(f,[a,b]).
The inf of B is called the upper Riemann integral of f on [a,b], and is denoted UI(f,[a,b]). It is always true that LI(f,[a,b])<=UI(f,[a,b]). If these numbers are equal, then we say that f is Riemann integrable on [a,b], and the common value is called the Riemann integral of f on [a,b], intab f, or, more commonly, intab f(x) dx. Note that the "x" in the integration is a "local" variable, and therefore the value of intab f(x) dx is the same as intab f(w) dw which is the same as intab f(t) dt, etc.

If f is example 3 of the last lecture (0 on the irrationals, and 1 on the rationals, and the interval is [0,1]) then A={0} and B={1}, not very big sets, and not very complicated!

How can we tell if the Riemann integral exists, and how can we get interesting examples? We will begin with this theorem:
Theorem f on [a,b] is Riemann integrable if and only if for every epsilon>0, there is a partition P of [a,b] so that US(f,P)-LS(f,P)<epsilon.
Proof: Let's first assume that f is Riemann integrable. Then LI(f,[a,b])=UI(f,[a,b]). Now since LI(f,[a,b]) is a sup (the sup of the set A), given epsilon/2, we can find a partition S so that LI(f,[a,b])-epsilon/2<LS(f,S)<=LI(f,[a,b]). Since UI(f,[a,b]) is an inf (the inf of the set B), given epsilon/2, we can find a partition T so that UI(f,[a,b])<=US(f,T)<UI(f,[a,b])+epsilon/2. Now let P be the partition which is the union of S and T (what is called classically the common refinement of the two partitions). We also know that LI(f,[a,b])=UI(f,[a,b])=intab f. Now we can package all this into a wonderful chain of inequalities:
intab f-epsilon/2<LS(f,S)<=LS(f,P)<=US(f,P)<=US(f,T)<intab f+epsilon/2
So the numbers LS(f,P) and US(f,P) are both "sandwiched" into an interval centered around intab f, and this interval has length at most epsilon. So we are done with this part of the proof.

Now we need to verify that the "epsilon condition" implies Riemann integrability. Remember what the sets A and B are. Since every element of B is an upper bound of A, and every element of A is a lower bound of B, we already know that sup A<=inf B. Why is this true? Well, if sup A>inf B, then take epsilon=sup A-inf B>0. We can find (sup characterization) a in A so that sup A>=a>sup A-epsilon. a is then greater than inf B=sup A-epsilon. But we could then (inf characterization) find b in B with a>b>=inf B, which contradicts the known fact that a<=b for all choices of a in A and b in B. How can we prove Riemann integrability? This condition is exactly the same as proving sup A=inf B. Since we already know sup A<=inf B, let us see what happens when sup A<inf B. Then take epsilon=inf B-sup A>0. The assumption in the statement of the theorem says we can find a in A and b in B with b-a<"this" epsilon. So b-a<inf B-sup A. But certainly a<=sup A and inf B<=b, yielding (since -sup A<=-a) inf B-sup A<=b-a. This is a contradiction! Whew. (The logic in all this is a bit intricate, but is very similar to lots of proofs we did a month or two ago.)

Even better is the following result:
Theorem f is Riemann integrable on [a,b] if and only if there is a sequence of partitions (Pn) so that US(f,Pn)-LS(f,Pn)<(1/n). If such a sequence of partitions exists, then the two sequences of numbers, (US(f,Pn)) and (LS(f,Pn)), both converge to a common limit, and that limit is intab f.

With this result (whose proof I postponed until next time) we will actually be able to effectively recognize some Riemann integrable functions and maybe compute some integrals.

Example 1 A step function: Select numbers a<c<d<b. Define the function f(x) by f(x)=1 if c<x<d, and f(x)=0 otherwise. This is the simplest example of what the text (and other sources) call a step function. With some effort, we decided to look at the following partition: P={a,c,c+(1/3n),d-(1/3n),d,b}.
The lower sum is exactly 1·((d-(1/3n))-(c+(1/3n)))=d-c-(2/3n). Be careful with endpoints -- only one rectangle is non-zero! The upper sum is 1·((d-(1/3n))-(c+(1/3n)))+1·(d-(d-(1/3n)))+1·((c+(1/3n))-c)=d-c (no n's intervene at all!). The difference is (2/3n) which is certainly less than 1/n, so the hypotheses of the preceding theorem are satisfied. The limits of the sequences (d-c-(2/3n)) and (d-c) are both d-c, which therefore must be the integral of f over [a,b].
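A computational aside (not from the lecture): the step-function bookkeeping is easy to mechanize. This Python sketch uses exact rational arithmetic so the formulas d-c and d-c-(2/3n) come out on the nose.

```python
from fractions import Fraction as F

def step_sums(a, c, d, b, n):
    """Upper and lower sums of the step function (1 on (c,d), 0 elsewhere)
    for the partition {a, c, c+1/(3n), d-1/(3n), d, b}, computed exactly."""
    third = F(1, 3 * n)
    pts = [a, c, c + third, d - third, d, b]
    # The inf is 1 only on the middle subinterval [c+1/(3n), d-1/(3n)];
    # the sup is 1 there and also on the two short flanking subintervals.
    lower = pts[3] - pts[2]
    upper = (pts[2] - pts[1]) + (pts[3] - pts[2]) + (pts[4] - pts[3])
    return upper, lower

a, c, d, b = F(0), F(1), F(2), F(3)
for n in (1, 10, 100):
    U, L = step_sums(a, c, d, b, n)
    print(n, U, L, U - L)   # U = d-c, L = d-c-2/(3n), gap = 2/(3n)
```

Using Fraction rather than floats is just a convenience here: it makes the claimed identities exact equalities, not approximations.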

Example 2 f(x)=(1/5)x2+arctan(cos(x2)).
This is revision of history. In class I actually analyzed f(x)=5x7+arctan(cos(x17)) on the interval [3,11]. This was, of course, partly a joke, but also partly a serious effort to show that we could "handle" something this complicated. I am switching to the function written above because I had Maple graph it, and this one wiggled a lot, whereas the function I did in class has very small scale wiggles (the seventh power really dominates the arctan term!) so it doesn't look as weird.
So here f(x)=(1/5)x2+arctan(cos(x2)). We will analyze this function with the help of some tools from calc 1. Therefore f'(x)=(2/5)x+(1/(1+(cos(x2))2))·(-sin(x2))·(2x). And even more is true if we try to get a very rough upper bound on |f'(x)|. The triangle inequality tells us that this is <=|(2/5)x|+|(1/(1+(cos(x2))2))·(-sin(x2))·(2x)|. The first term is (2/5)x which on [3,11] is less than 22/5<5. Much of the second term can be overestimated by 1 (the fraction, the sine) so we just have left |2x| which is less than 22, so that always |f'(x)|<27.
The Mean Value Theorem of calculus says that for x and y in [3,11], |f(x)-f(y)|<=27|x-y| (because the quotient (f(x)-f(y))/(x-y) will always be f'(c) for some c between x and y, and this in absolute value is less than 27). So this f is a Lipschitz function with Lipschitz constant less than 27. (We encountered such functions before [butterfly functions!] in the lecture of 4/14). Now let us think about a partition of [3,11] into J (an integer) equal pieces. The mesh of the partition will be (11-3)/J=8/J, and that will be the length of each subinterval. If x and y are in the same subinterval, therefore, |x-y|<=8/J. And from the Lipschitz inequality, |f(x)-f(y)|<=(27)(8/J). Notice that f is continuous, so that the sup and the inf over each subinterval are values of f on the subinterval (f(x) and f(y) in the above). Thus the difference between the upper sum and the lower sum will be at most: (# of subintervals)·(maximum variation in the subinterval)·(length of the subinterval). This is J·(27)(8/J)·(8/J). A simplification gives (27)(8)(8)/J. And when J gets large, this gets small, so we have indeed verified the criterion for this function to be Riemann integrable on [3,11]. Wow! More to come next time.
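A computational aside (not from the lecture): this Python sketch estimates US-LS for the wiggly function by sampling within each subinterval (so the sups and infs, and hence the gap, are only approximated) and compares the estimate with the Lipschitz bound (27)(8)(8)/J=1728/J.

```python
import math

def f(x):
    return 0.2 * x * x + math.atan(math.cos(x * x))

def approx_gap(a, b, J, samples=30):
    """Estimate US-LS for J equal subintervals; sups and infs are found by
    sampling, so this slightly underestimates the true gap."""
    w = (b - a) / J
    total = 0.0
    for j in range(J):
        lo = a + j * w
        vals = [f(lo + k * w / samples) for k in range(samples + 1)]
        total += (max(vals) - min(vals)) * w
    return total

# Compare the estimated gap with the a priori Lipschitz bound 1728/J.
for J in (100, 1000):
    print(J, approx_gap(3.0, 11.0, J), 1728 / J)
```

The estimated gap sits well under the bound, which makes sense: 27 was a deliberately rough overestimate of |f'|.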

4/21/2003 The problem of area
What is area?

This is a serious geometric question and difficult to answer. Usually we would like "area" to be something satisfying the following rules:

1. The area of a piece of R2 should be a non-negative number.
2. The area of bigger pieces should be a bigger number.
3. The area of congruent pieces should be the same.
4. If two pieces don't overlap or overlap only at the boundary, then the area of the union of the two pieces should be the sum of the areas of the pieces. More generally, if you split up pieces of the plane into subpieces, the areas of the subpieces should add up to the area of the original piece.
5. (Normalization) The area of the unit square should be 1.
These are the rules that area should seem to follow. Of course, some of the words need explaining, and some of them need lots of explanation. One book that I've read recently which discusses the classical Euclidean approach to this problem and others is Hartshorne's Geometry: Euclid and Beyond which I would recommend for those who want to study the axiomatics of classical geometry, now thousands of years old. The book isn't easy, but it has a great deal of content. The idea of a "piece" of the plane is certainly imprecise. "Bigger" in the case of numbers is >=. In the case of pieces of the plane, it probably should mean "is a superset of", so larger sets have larger areas. The word "congruent" probably means, as was said in class, the same shape and size: this means that if we translate or rotate or flip sets, the results should have areas equal to the sets we started with. In the case of sets being "split up" they should not have overlapping interiors: only the boundaries are allowed to overlap. Whew!

Just an initial statement of these properties is awesome. What is more distressing is the following statement, whose verification needs more time than this course has to run: there is no way to assign area to every subset of the plane in a way that obeys all of the rules above. This is irritating. In R3 the situation is even worse. It turns out that some "obvious" facts about decomposition of polyhedra into pieces with equal area are also not true! There is more information about this in Hartshorne's book. Here we do something much more pedestrian. We will try to assign "area" (actually, the definite integral) to regions in the plane bounded by the x-axis, x=a, x=b (here a<b) and y=f(x). Even this seemingly more modest goal will turn out to be more difficult than one might think, and the examples we will consider will be intricate and irritating.

We will follow the lead of Cauchy and Riemann in this. Bressoud's book (referenced in the general background to the course) discusses some of Cauchy's ideas and shows that some of what Cauchy wrote was just wrong! This stuff can be difficult. The basic idea is exactly described by the picture to the right, which is almost surely rather familiar to every student who has been through a basic calculus course. We need to label and define and investigate every aspect of this picture. I note that we are investigating what is called the Riemann integral (Google has over 72,000 responses to "Riemann integral"). Another candidate for integration is called the "Lebesgue integral" (Google has over 29,700 responses to "Lebesgue integral").

We will start with a function f defined on [a,b]. We need to split up the interval. The word "partition" is used both as a verb and as a noun in this subject. As a verb, partition means to break up the interval into subintervals. As a noun, currently more important, "partition" P will mean a finite subset of [a,b] which contains at least both a and b. So a partition could be as small as just {a,b} (I'll assume here that a<b, so a and b are distinct). Or a partition could have 10100 points. The points in P will usually be written here as {a=x0<x1<x2<...<xn-1<xn=b}. In this case the partition has n+1 elements, and it has divided the interval into n subintervals (although frequently the subintervals have equal length, this is not required). The mesh of the partition P will be written ||P|| and it means the maximum of xj-xj-1 for j running from 1 to n. Since P is a finite set, the word "maximum" can be used, and will equal one of the elements of the set. Additionally, we will need to specify "tags" (word used in the text) or "sample points" (phrase I am more used to). So this is a selection of tj in each subinterval [xj-1,xj] as j runs from 1 to n. The tj will be used to create the height of each subrectangle. Then the Riemann sum of f on [a,b] with partition P and sample points T is sumj=1nf(tj)(xj-xj-1). The idea is that as ||P||-->0, this sum should tend to some sort of limit, and this will be the area or the definite integral. We'll call this RS(f,P,T): it is a complicated creature. I will return to this general sum later, but right now I will try something which may be a bit easier to handle: upper and lower sums.
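A computational aside (not from the lecture): the definitions of mesh and Riemann sum translate directly into a few lines of Python. The partition and tags below are arbitrary choices, just for illustration.

```python
def mesh(P):
    """||P||: the largest subinterval length in the partition P (a sorted list)."""
    return max(P[j] - P[j - 1] for j in range(1, len(P)))

def riemann_sum(f, P, T):
    """RS(f,P,T) = sum over j of f(t_j)*(x_j - x_{j-1});
    T holds one tag (sample point) per subinterval of P."""
    return sum(f(T[j]) * (P[j + 1] - P[j]) for j in range(len(T)))

P = [0.0, 0.25, 0.5, 0.75, 1.0]   # a partition of [0,1] (equal widths not required)
T = [0.1, 0.3, 0.6, 0.9]          # one tag in each of the four subintervals
print(mesh(P), riemann_sum(lambda x: x * x, P, T))
```

As ||P||-->0 such sums should tend to the definite integral; the sketch just evaluates one of them.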

The Upper Sum of f on [a,b] with partition P, US(f,P), is sumj=1n(sup of f on [xj-1,xj])(xj-xj-1).
Several observations should be made about this, and about the corresponding definition, to be made below, of "lower sum". First, sup and inf need to be used, and these sums don't need to be Riemann sums. The reason for this is that functions don't need to attain their sups and infs (example: f(x)=x, x in [0,1) and f(1)=0). If, however, the function f is continuous then it will attain its sup and inf in closed bounded intervals, so that the upper and lower sums will be Riemann sums. Second, for us to know that sup and inf exist and are real numbers, the function f should be bounded in [a,b]: there is a positive real number M so that |f(x)|<=M for all x in [a,b]. The function f(x)=1/x for x>0 and f(0)=0 is not bounded in [0,1], and therefore our theory will not apply to it (such phenomena need to be analyzed as improper integrals -- we are considering only proper definite integrals here). Whew! Now for the lower sum:
The Lower Sum of f on [a,b] with partition P, LS(f,P), is sumj=1n(inf of f on [xj-1,xj])(xj-xj-1).
Notice that Riemann sums (with tags or sample points) will always be caught between the upper and lower sums associated with their partitions.
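A computational aside (not from the lecture): here is a Python check of the sandwich LS<=RS<=US. To keep the sup and inf computable exactly, the sketch uses an increasing function, for which they occur at the right and left endpoints of each subinterval.

```python
def upper_sum(f, P):
    """US(f,P) for an increasing f: the sup on each subinterval is its right endpoint value."""
    return sum(f(P[j + 1]) * (P[j + 1] - P[j]) for j in range(len(P) - 1))

def lower_sum(f, P):
    """LS(f,P) for an increasing f: the inf on each subinterval is its left endpoint value."""
    return sum(f(P[j]) * (P[j + 1] - P[j]) for j in range(len(P) - 1))

f = lambda x: x              # increasing on [0,1]
P = [0.0, 0.2, 0.5, 1.0]     # an uneven partition
T = [0.1, 0.4, 0.7]          # any choice of tags, one per subinterval

rs = sum(f(T[j]) * (P[j + 1] - P[j]) for j in range(3))
print(lower_sum(f, P), rs, lower_sum(f, P) <= rs <= upper_sum(f, P))
```

Whatever tags are chosen, the Riemann sum is caught between the lower and upper sums for the same partition.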

Example 1 f(x)=x, [0,1]. Here I looked at the partition P which was {0,1/n,2/n,3/n,...,(n-1)/n,1}, an evenly spaced partition dividing [0,1] into n equal subintervals. The difference between the upper and lower sums can be exactly computed in this case (most unusual, because almost always we will need to estimate such things). Just "shove over" the boxes so they line up and have height 1 and width 1/n, so that the difference between the upper and lower sums is 1/n. As n gets large, this discrepancy -->0.

Example 2 f(x)=1 if x=1/2 and f(x)=0 otherwise. On the interval [0,1] use the partition {0,1/2-1/n,1/2+1/n,1}. This partition has four points and three subintervals. The lower sum is 0 because the infs on any of the three subintervals is 0. The upper sum has three parts. The left and right parts are 0 because the sup is 0 there, but the inside part, with width 2/n, has sup=1. Hence the upper sum is 2/n. The difference between the upper and lower sums here also-->0 as n gets large.

Example 3 Consider the function f which is 0 on all of the irrationals and 1 on all the rationals. Then since there are rationals and irrationals in all intervals of positive length (the "density" of the rationals and irrationals) all of the upper sums on the interval [0,1] are 1 and all of the lower sums on the interval [0,1] are 0. There is always a discrepancy of 1 between the upper and lower sums.

Definition The Upper Riemann integral is the inf of all of the upper sums of f. The Lower Riemann integral is the sup of all of the lower sums of f. We will say that f is Riemann integrable if the upper and lower Riemann integrals are equal. The common value will be called the Riemann integral of f.

So the functions of examples 1 and 2 are Riemann integrable, with integrals of value 1/2 and 0 respectively. The function in example 3 is not Riemann integrable.

The following result plays an important part in any development of the Riemann integral.
Lemma Suppose P is a partition of [a,b] and q is a point which is not in P. Let Q be the partition obtained by taking P union {q}. Then:
US(f,P)>=US(f,Q) and LS(f,P)<=LS(f,Q).
The idea is that the approximations should get closer when we throw in more points. We will use this lemma many times.
Proof: I will only look at the upper sums. Let me suppose that q is between xj-1 and xj. Then all of the terms in the upper sums for P and Q are exactly the same except for one term in the P sum: (sup of f on [xj-1,xj])(xj-xj-1), which is replaced by these terms in the Q sum: (sup of f on [xj-1,q])(q-xj-1)+(sup of f on [q,xj])(xj-q).
There are now several observations:
1. (xj-xj-1)=(xj-q)+(q-xj-1).
2. (sup of f on [xj-1,xj])>=(sup of f on [xj-1,q])
3. (sup of f on [xj-1,xj])>=(sup of f on [q,xj])
2 and 3 are true because sup's on bigger sets may be larger but cannot be smaller. These observations combine to prove the result stated: multiply inequality 2 by q-xj-1; multiply inequality 3 by xj-q; add the results and use the equation in 1. This shows that the Q sum is overestimated by the P sum.
The result for lower sums is similar.

On Wednesday I hope to give out some notes written by Professor Cohen which will outline this material.

4/16/2003 The instructor addressed the question, "Where do we go from here?" and then answered even more inquiries in preparation for the exam.

I brought in the text (a standard calculus book) used for Math 151 and compared what's in Chapter 6 of the Math 311 textbook with that text. The approach, and even many of the pictures, are the same. What happens?

1. The definition of derivative; a differentiable function is continuous.
2. If a function has a local extremum and is differentiable there, then the derivative is 0 at that point.
3. Rolle's Theorem (a special case of the MVT): a function which is differentiable in [a,b] with f(a)=f(b)=0 must have at least one c in (a,b) with f'(c)=0.
4. Mean Value Theorem (tilted form of Rolle's Theorem): if f is differentiable in [a,b], there is at least one c in (a,b) with f'(c)=(f(b)-f(a))/(b-a).
5. f' positive in an interval implies f is increasing there; f' negative in an interval implies f is decreasing there.
6. More topics: l'Hopital's rule; Taylor's Theorem.
Of course there are examples in the 311 text which have more interest due to our developing knowledge than what's in the calculus text. But I would like to temporarily skip this material and discuss the integral, which is almost always given less attention than the derivative in calculus courses. The examples illustrating features of the integral are to me more interesting than corresponding examples for the derivative. The examples tie into a number of applications in other areas, such as probability. Therefore on Monday, April 21, we will begin material in chapter 7. The treatment in class will not be the same as in the textbook.

In the balance of time remaining for the class I tried to answer more questions in preparation for the exam. Maybe the only interesting comment was my remembering a "workshop" problem from calculus. The problem was something like this:

Aliens change and permute 10,000 values of the function F(x)=x2. That is, they might change (2,4) and (5,25) to (2,25) and (5,4), except that this is done to 10,000 points. What is the value of limx-->aF(x) and why?

The answer is that limx-->aF(x) exists for all a's, and its value is always a2. This is rather surprising, but once one is "close enough" to a (and not equal to a) there is no difference between the original function and the altered function.

I worked on some other textbook and workshop problems, just as I had in the review session the night before.