Tuesday, October 2 | (Lecture #9) |
---|
Exam!!!
The first big, real exam will be given in the normal class time and
place on Friday, October 12. It will cover material up to and
including today's lecture (section 3.7, the Chain Rule). The lecturer
will have a review session on Thursday evening, October 11. Please do
the suggested homework and the workshop problems.
The topic for today is the Chain Rule, which is probably the most important of the differentiation "rules". I want to introduce the subject matter in a way that I think is indirect and natural. So I presented the following economic information.
Chipco data
I then put the chart below on display and we
discussed it for a while. Here's a
pdf copy which you can print out separately if you like.
CHIPCO INVESTMENT DOLLARS & PRODUCTION

Capital invested ($ in millions) | Chips produced (1,000's of units) | Marginal chips produced (1,000's of units per million $) |
---|---|---|
200 | 3,000 | .23 |
300 | 3,040 | .28 |
400 | 3,070 | .42 |
500 | 3,100 | .78 |
600 | 3,190 | .31 |

CHIPCO SALES & PROFITS

Chips marketed (1,000's of units) | Profit gained ($ in millions) | Marginal profit (millions of $ per 1,000 units) |
---|---|---|
3,000 | 1.2 | .03 |
3,050 | 2.8 | .02 |
3,100 | 3.6 | .05 |
3,150 | 4.9 | -.01 |
3,200 | 5.1 | .02 |
The first table
I don't think many people were familiar with the economic terms used
above. In particular, the word "marginal" is used in a fairly
technical sense. In the first table above, it refers to the
approximate amount chip production would increase for each million dollars of increase
in capital investment. For example, if $302 million were
invested, then (according to this model) chip production would be
3,040,000 (chip production at the $300 million level) plus
.28(2)(1,000) chips, or 3,040,560 chips. The 2 comes from the additional millions of
dollars of capital. The 1,000 comes from the units I use for chip
production. The .28 is this "marginal" quantity. In the first table,
the marginal quantity is therefore the approximate ratio
ΔC/ΔM, relating the change in chip
production to the change in money invested. It is sort of a
slope, or, more likely, sort of a derivative: indeed, the use of
"marginal" in economics usually means a derivative. In this model, if
$297 million were invested, the approximate expected chip production
would be 3,040,000 (again, chip production at the $300 million level)
plus .28(-3)(1,000) chips, or 3,039,160 chips. The novelty here is the use of the
minus sign, since here we are decreasing the capital investment rather
than increasing it.
Comment
The ratio written as ΔC/ΔM is very "classical" and sometimes
can be misunderstood. I don't mean here that Δ has an independent meaning as a
variable, so you can't just cancel the Δ's top and bottom. The notation
is an abbreviation of a complicated idea: a change in chip production
compared to a change in money invested.
The second table
The second table describes a similar phenomenon, here connecting the
chip amount, C, with the profit derived from the marketing and sale
of these chips. For example, the profit derived from the sale of
3,000,000 chips (the first line of the second table) is $1.2
million. If we now look at the third column, the model predicts a
marginal profit of .03 (in the given units). Using this, if 3,010,000
chips are marketed (that's 10 more units of 1,000 chips) the
additional profit would be .03(10) million dollars, or $300,000. And
if only 2,970,000 chips were marketed, then the profit would be 1.2
million+(.03)(-30) million dollars, or $300,000. (No extra factor of 1,000 is needed here, since the marginal profit is already given in millions of dollars per 1,000 chips.) The third column gives ΔP/ΔC for various amounts
of chip marketing: the change in profits compared to the change in
chips marketed. Of course the validity of such models can certainly be
criticized, but I really wanted to show this to explain what's in the
next paragraph.
Linking the tables
The two tables linked together describe a complicated
phenomenon. First we "input" capital, M (M is for money), which
produces C, a certain number of chips. Then the chips are marketed
(and sold, hopefully!) to obtain a certain amount of profit, P. Here
we have a composition of functions. For example, suppose we were asked
how much profit there is if we put in M=$500 million. From the first
table we read off C=3,100,000 chips, and from the second table we can
then see that P will be 3.6 million dollars.
I hoped that this was all fairly clear. Now I asked what I thought was
a difficult question. Suppose we increase M from 500 million dollars
to, say, 503 million dollars. What does the model predict the profit
will be? We can trace this if we are sufficiently alert. The first
marginal quantity we need to consider is ΔC/ΔM. For M=$500 million,
this is .78. So the new chip production is old chip production +
increase in chip production, and this will be 3,100,000+(.78)(3)(1,000)
chips. Now let us consider the chip/profit table. With C=3,100,000,
we see that profit is supposed to be 3.6 million. But we are changing
C by adding on the (relatively small) amount of .78(3) (in units of
1,000's of chips). The relevant marginal quantity here is ΔP/ΔC on the row where C is 3,100(,000). The marginal
amount here is .05, so that the new profit will be the old profit (3.6
million) plus (.05)(.78)(3) million dollars. The 3 comes
from perturbing the capital investment. The really interesting stuff is (.05)(.78): indeed,
this represents the marginal profit as capital invested changes, when
the capital investment is 500 million dollars. Symbolically, it might
make sense written this way:
ΔP/ΔM=(ΔP/ΔC)·(ΔC/ΔM).
So the ΔC's just seem to cancel out. Of
course, this is more complicated than just multiplying fractions,
since the fractions (the marginal stuff, the derivatives) need to be
"evaluated" on the appropriate rows of the tables. So I hope that my
discussion has supported:
The Chain Rule
Suppose that f and g are differentiable functions. Then
F(x)=fog(x)=f(g(x)) is differentiable, and
F´(x)=f´(g(x))·g´(x).
Here o is supposed to be a little circle, and the little circle indicates composition. I will more frequently write F(x)=f(g(x)). The tables above sort of indicated that chip production was a function of capital investment, and then that profit was a function of the chips marketed, so that profit as a function of capital investment was a composition of the two functions.
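The Chipco bookkeeping can be traced in a few lines of Python. This is only a minimal sketch: the dictionary layout and the name `estimated_profit` are assumptions made for illustration, and the helper only works for capital rows whose chip count also appears as a row of the second table (such as M=500).

```python
# Chipco tables from the lecture, keyed by the row values.
# capital M ($ millions) -> (chips C in 1,000's, marginal chips per $ million)
production = {200: (3000, 0.23), 300: (3040, 0.28), 400: (3070, 0.42),
              500: (3100, 0.78), 600: (3190, 0.31)}
# chips C (1,000's) -> (profit P in $ millions, marginal profit per 1,000 chips)
profit = {3000: (1.2, 0.03), 3050: (2.8, 0.02), 3100: (3.6, 0.05),
          3150: (4.9, -0.01), 3200: (5.1, 0.02)}

def estimated_profit(m_row, delta_m):
    """Linear estimate of profit ($ millions) when capital moves delta_m
    away from a table row m_row (hypothetical helper, not from the lecture)."""
    chips, dc_dm = production[m_row]      # first table, evaluated at M
    base_profit, dp_dc = profit[chips]    # second table, evaluated at C(M)
    dp_dm = dp_dc * dc_dm                 # the two marginal quantities multiply
    return base_profit + dp_dm * delta_m

# 3.6 + (.05)(.78)(3), the computation described in the text
print(estimated_profit(500, 3))
```

The point of the sketch is that each marginal quantity is read from the row matching the current value of its own input: .78 from the M=500 row, and .05 from the C=3,100 row.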
Function diagrams
I tried to make another supporting argument
for the Chain Rule, using function diagrams. If x is the input to a
differentiable function, g, then the output is g(x). If we perturb or
change the input to g a bit, the output, g(x+h), can be thought of in
several parts: g(x)+g´(x)h+(Err_g)h. Here g(x) is the
old output, g´(x)h is a change in output which is directly
proportional to h, and there's other, higher order stuff, which -->0
faster than h, so Err_g-->0 as h-->0.
Now we could take another differentiable function, f, with input w and input perturbation k. The output will be f(w+k), and we can think of it as f(w)+f´(w)k+(Err_f)k. If we wire up the output of g to the input of f, then we can try to think this way:
x+h --> g(x)+g´(x)h+(Err_g)h, where w=g(x) and g´(x)h+(Err_g)h IS k WHEN CONSIDERING f's INPUT
w+k --> f(w)+f´(w)k+(Err_f)k = f(g(x))+f´(g(x))g´(x)h+(STUFF)h
Now the STUFF turns out to be f´(w)(Err_g)+(Err_f)g´(x)+(Err_g)(Err_f), a whole bunch of things you don't need to remember but all of which -->0. This means that the only first order term that comes out of g followed by f is the change, h, multiplied by f´(g(x))g´(x). I hope you recognize that this is the same sort of result we got from the microchip numbers. The amplification factors multiply, but you needed to evaluate them at the correct values of their "arguments".
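The claim that the leftover STUFF goes to 0 can be watched numerically. In the sketch below, the particular functions g(x)=x^2+7 and f(w)=sin(w) are assumptions picked only for illustration; any differentiable pair would do.

```python
import math

def g(x):           # hypothetical inner function
    return x * x + 7.0

def dg(x):
    return 2.0 * x

def f(w):           # hypothetical outer function
    return math.sin(w)

def df(w):
    return math.cos(w)

def stuff(x, h):
    """Leftover in f(g(x+h)) = f(g(x)) + f'(g(x))g'(x)h + (STUFF)h."""
    first_order = df(g(x)) * dg(x) * h
    return (f(g(x + h)) - f(g(x)) - first_order) / h

x = 1.0
for h in (0.1, 0.01, 0.001, 0.0001):
    # the second number printed should shrink along with h
    print(h, stuff(x, h))
```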
Why do this? The more ways I try to explain this, the more ways you may have to understand it. At the basement level, what you must have is the ability to compute with the Chain Rule correctly in straightforward, formulaic circumstances. Examples of this are shown below. But I do think you will be more appropriately equipped to face the demands of a technical career if you understand how this formula works, and what the numbers mean. I hope that the varied approaches will let more of you into the first floor, the second floor, etc., even into the penthouse (sigh, bad metaphor) of understanding the Chain Rule.
Basement example #1
If F(x)=(x^2+7)^300, what will F´(x) be? I
don't need the Chain Rule, not really (?), to compute
F´(x) because, after all, F(x) is "just" a polynomial (although
here F(x) is a polynomial of degree 600, and this polynomial is not
presented in standard fashion). Success (rapid, accurate computation)
here probably will result from recognizing that the Chain Rule
applies.
If F(x)=f(g(x)), then g(x) is x^2+7 so
g´(x)=2x, and f(x) is x^300 so
f´(x)=300x^299. Thus
F´(x)=f´(g(x))g´(x)=f´(x^2+7)(2x)=300(x^2+7)^299(2x).
Whew!
But now comes the realistic comment. Hardly ever does anyone bother writing down all of these intermediate steps. That is, in practice very few f's and g's are actually identified. What happens is that people see and differentiate the outermost function (f above), put the inner function (g) into that derivative, and then multiply by g´. For example, consider sin(e^x+x^2). What is its derivative? The outside function is sine, whose derivative is cosine. So I begin by writing cos(what's inside)·the derivative of what's inside. The result is cos(e^x+x^2)·(e^x+2x). This expression is a formula for the derivative of sin(e^x+x^2). Again, I urge you to consider the significance and necessity (!) of appropriate parentheses in these expressions. The "argument" of cosine is e^x+x^2 and the cosine expression is then multiplied by the expression (e^x+2x).
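A formula obtained this way can be sanity-checked against a difference quotient. The sketch below does this for sin(e^x+x^2); the test point x0=0.5, the step size, and the helper name `central_difference` are arbitrary choices made for illustration.

```python
import math

def F(x):
    return math.sin(math.exp(x) + x**2)

def F_prime(x):
    # cos(what's inside) times the derivative of what's inside
    return math.cos(math.exp(x) + x**2) * (math.exp(x) + 2 * x)

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.5  # arbitrary test point
# the two printed numbers should agree to many decimal places
print(F_prime(x0), central_difference(F, x0))
```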
#2
I think I did another example, something like computing the derivative
of F(x)=e^(5x^2+7sqrt(sin(x))). The outermost
function is e-to-the- (?) whose derivative is e-to-the-. Therefore the
derivative, F´(x), will begin with
e^(5x^2+7sqrt(sin(x))) and continue by multiplying
by the derivative of 5x^2+7sqrt(sin(x)).
That's a sum, so its derivative is 10x+7·(the derivative of
sqrt(sin(x))). What's the derivative of sqrt(sin(x))? That itself is a
composition, with the "outside" function sqrt or
thing-raised-to-the-half-power. The outside derivative is
(1/2)(thing-raised-to-the-minus-half-power). So the derivative of
sqrt(sin(x)) is (1/2)(sin(x))^(-1/2)(cos(x)). Now put it all
together:
if F(x)=e^(5x^2+7sqrt(sin(x))) then
F´(x)=(e^(5x^2+7sqrt(sin(x))))(10x+7((1/2)(sin(x))^(-1/2)(cos(x)))).
I strongly recommend that you use lots of parentheses when applying the Chain Rule.
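Nested compositions like this one are easy to get wrong, so a numerical check is worthwhile. The sketch below compares the chained derivative of e^(5x^2+7sqrt(sin(x))), with inner derivative 10x+7(1/2)(sin(x))^(-1/2)cos(x), against a difference quotient. The test point x0=1.0 is an arbitrary choice where sin(x) is positive, and the comparison is relative because the values are large.

```python
import math

def F(x):
    return math.exp(5 * x**2 + 7 * math.sqrt(math.sin(x)))

def F_prime(x):
    # derivative of the exponent: 10x + 7 * (1/2) * sin(x)^(-1/2) * cos(x)
    inner_derivative = 10 * x + 7 * (0.5 * math.sin(x) ** -0.5 * math.cos(x))
    # the outermost exponential reproduces itself, times the inner derivative
    return F(x) * inner_derivative

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0  # arbitrary test point with sin(x0) > 0
relative_gap = abs(F_prime(x0) - central_difference(F, x0)) / abs(F_prime(x0))
print(relative_gap)  # should be tiny
```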
Finding a low point -- left over from earlier
In lecture #3, we considered the graph of
f(x)=5e^(-2x)+3e^(4x). We decided that it probably was
shaped like a big U, with one lowest point. We can now accurately and
exactly find this lowest point. Look:
f´(x)=5e^(-2x)(-2)+3e^(4x)(4). Here the Chain Rule
is being used twice, with the exponential function as the outside
function each time, and with multiplication by -2 (respectively by 4)
as the inside function.
The lowest point is where f´(x)=0. Here's the algebra:
5e^(-2x)(-2)+3e^(4x)(4)=0
3e^(4x)·4=5e^(-2x)·2
12e^(4x)=10e^(-2x)
e^(6x)=10/12
6x=ln(10/12)
x=(1/6)ln(10/12)
By the way, this is about -0.03038692614, which is sort of close to
what we guessed way back near the beginning of the course.
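The algebra for the low point of 5e^(-2x)+3e^(4x) can be double-checked by plugging the answer back in. A minimal sketch; the names `f_prime` and `x_low` are chosen here for illustration.

```python
import math

def f(x):
    return 5 * math.exp(-2 * x) + 3 * math.exp(4 * x)

def f_prime(x):
    # Chain Rule on each exponential: factors of -2 and 4 come from the insides
    return 5 * math.exp(-2 * x) * (-2) + 3 * math.exp(4 * x) * 4

x_low = math.log(10 / 12) / 6   # the solution x=(1/6)ln(10/12)

print(x_low)            # about -0.0304
print(f_prime(x_low))   # should be essentially 0
# nearby points on either side should give larger values of f
print(f(x_low - 0.01) > f(x_low), f(x_low + 0.01) > f(x_low))
```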
QotD
I gave the following information: F(x)=f(3x-x^2), and
that
f(1)=4; f´(1)=8; f´´(1)=16.
f(2)=3; f´(2)=5; f´´(2)=7.
I asked students to compute F(1) and F´(1) and F´´(1).
F(1)
Since F(x)=f(3x-x^2), I know that
F(1)=f(3·1-1^2). Here the parentheses mean function
evaluation, and F(1)=f(3-1)=f(2)=3.
F´(1)
Since F(x)=f(3x-x^2), this is a composition of two
functions. The inside function is 3x-x^2 and the outside
function is f. Therefore the Chain Rule tells us that the derivative
is F´(x)=f´(3x-x^2)(3-2x). There is no simpler
expression which is valid for all x. We have no information which
allows us to write anything else for f´. To get F´(1) we
plug x=1 into the equation F´(x)=f´(3x-x^2)(3-2x)
and we get
F´(1)=f´(3·1-1^2)(3-2·1)=f´(2)(3-2)=5·1=5.
We need to use the formula for F(x) to get a formula for F´(x).
F´´(1)
We need to begin with information about F´(x) and differentiate
that. Knowing F´(1) is not enough: we need to know how
F´(x) changes.
We do know F´(x)=f´(3x-x^2)(3-2x). Let me rewrite
this with some emphasis.
F´(x)=(f´(3x-x^2))(3-2x).
If you look carefully at the logic of the right-hand side of this
equation, I hope that you will see it is a product of
f´(3x-x^2) and (3-2x). I will use the Product Rule. The
first factor, f´(3x-x^2), is a composition. The outside
function is f´ and the inside function is 3x-x^2. I
will use the Chain Rule on this.
F´´(x)=(f´´(3x-x^2)(3-2x))(3-2x)+(f´(3x-x^2))(-2).
Now insert x=1. Here is the result:
F´´(1)=(f´´(3·1-1^2)(3-2·1))(3-2·1)+(f´(3·1-1^2))(-2)=(f´´(2)(1))(1)+(f´(2))(-2)=7·1+5·(-2)=7-10=-3.
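The QotD answers can be tested on a concrete example. The function f below is hypothetical: it is simply one polynomial built to have f(2)=3, f´(2)=5, f´´(2)=7, since the question gives no formula for f. With that assumption, the Chain Rule and Product Rule formulas can be compared with difference quotients of F(x)=f(3x-x^2).

```python
def f(t):    # hypothetical f with f(2)=3, f'(2)=5, f''(2)=7
    return 3 + 5 * (t - 2) + 3.5 * (t - 2) ** 2

def df(t):
    return 5 + 7 * (t - 2)

def d2f(t):
    return 7.0

def inner(x):
    return 3 * x - x**2

def F(x):
    return f(inner(x))

def dF(x):   # Chain Rule
    return df(inner(x)) * (3 - 2 * x)

def d2F(x):  # Product Rule, with the Chain Rule on the first factor
    return d2f(inner(x)) * (3 - 2 * x) * (3 - 2 * x) + df(inner(x)) * (-2)

h = 1e-3
second_difference = (F(1 + h) - 2 * F(1) + F(1 - h)) / h**2

print(F(1), dF(1), d2F(1))   # the lecture's answers are 3, 5, -3
print(second_difference)     # numerical estimate of F''(1)
```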
Friday, September 28 | (Lecture #8) |
---|
Certainly, if x is very very large positive, the fraction
f(x)=(1x+2)/(3x^2+4) will become small. (The limit as
x-->+∞ of f(x) is 0, we will later say.) A similar thing is
also true as x-->-∞. Both of these limits or asymptotic
behaviors or whatever you want to say occur because the bottom is a
degree 2 polynomial, and the top is a degree 1 polynomial, and the
high degree polynomial "dominates". I'll be more precise about this in
the future.
But what happens in between? What sorts of numbers can we get out of f? Well, f(1)=3/7, so 3/7 is in the range. And f(2)=4/16=1/4, so 1/4 (sigh!) is in the range. And ... and ... I think just listing numbers is not even simple fun, and merely is pointless. One nice suggestion for an output, though, was 0. The only x which gives 0 is -2. This is good because now I know that the only sign change is at -2, and (Intermediate Value Theorem) this is where f's outputs change from positive to negative or negative to positive.
How can we systematically learn about f? Well, I do know (Quotient
Rule) that
f´(x)=[(1)(3x^2+4)-(1x+2)(6x)]/(3x^2+4)^2.
When can this be 0? Only when the top is 0, so let me "simplify" the
top:
(1)(3x^2+4)-(1x+2)(6x)=3x^2+4-6x^2-12x=-3x^2-12x+4.
So the top is 0 when -3x^2-12x+4=0. We need to use the
quadratic formula (hey, very few "random" degree 2 polynomials with
integer coefficients actually can be written as a product of two
degree 1 polynomials with integer coefficients!). The roots are
-2-(4/3)sqrt(3) and -2+(4/3)sqrt(3). (I wrote something much more
horrible in class, and this is what results after some arithmetic.)
What do I know about the SIGN of f´? This derivative is a
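quotient, and the bottom is (3x^2+4)^2. The bottom
is always positive. So the SIGN of f´ in this case is determined
by the sign of the top. But the top is -3x^2-12x+4. This is
a parabola, and because the coefficient of x^2 is -3, which is negative,
the parabola opens down. From this and from knowing the two
roots of the top, we see: f´ is negative to the left of the smaller root, rL=-2-(4/3)sqrt(3), positive between the two roots, and negative to the right of the larger root, rR=-2+(4/3)sqrt(3). So f decreases, then increases, then decreases again.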
So what will the range of f be, exactly? It seems "clear" looking even at the approximate graph shown to the right that the range will be all numbers between f's values at rL and rR. That is, the range is the interval [f(rL),f(rR)]. I can be more precise, of course. The range is [f(-2-(4/3)sqrt(3)),f(-2+(4/3)sqrt(3))], and this is (after some work which I would not inflict on people in class!) [(1/4)-(1/6)sqrt(3),(1/4)+(1/6)sqrt(3)].
What this is, and what this is not
I do not claim that the computations just done are wonderful. Maybe
they are not even interesting. But if you absolutely need to know the
answer to questions of that type, and you need to know with great
accuracy, and great certainty, then the analysis we've done is
probably the best way. It really allows us to get the correct numbers,
and the procedure is understandable. It is also inherently more
precise than using a graphing device, although I probably would try to
get an approximate answer using a graph first.
By the way, to the right is, of course, a graph (in a weird window, check the axes!) of y=f(x). I hope you can "see" the range. Incidentally, [(1/4)-(1/6)sqrt(3),(1/4)+(1/6)sqrt(3)] is approximately [-0.039,0.539].
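The endpoints of the range can be confirmed numerically. A minimal sketch, assuming nothing beyond the formulas above; the variable names are chosen for illustration.

```python
import math

def f(x):
    return (x + 2) / (3 * x**2 + 4)

def top(x):
    """Numerator of f'(x) after simplification."""
    return -3 * x**2 - 12 * x + 4

r_left = -2 - (4 / 3) * math.sqrt(3)
r_right = -2 + (4 / 3) * math.sqrt(3)

low, high = f(r_left), f(r_right)
print(top(r_left), top(r_right))   # both should be essentially 0
print(low, high)                   # compare with 1/4 -/+ (1/6)sqrt(3)
print(0.25 - math.sqrt(3) / 6, 0.25 + math.sqrt(3) / 6)
```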
Building a BIG cylinder inside a sphere
One of the recent workshop problems asked students to analyze the
problem of a cylinder "inscribed" inside a sphere. So the cylinder
touches the sphere as much as possible. The task is to understand how
to write a formula for the volume of the cylinder as a function of the
cylinder height, and to also tell what the domain of that formula is,
considering the origin of the problem. Here we will look at the
formula obtained when the radius of the sphere is 3. The height of the
cylinder is related to the cylinder's radius using Pythagoras, if we
look at a cross-section of the cylinder. If x is the height of the
cylinder, then V(x)=Pi(9-x^2/4)x. The domain of V(x), when
considering this problem, is 0<x<6. I don't think a cylinder can
have a negative height, and I don't think a cylinder inside a sphere
of radius 3 can have a height bigger than the diameter of the sphere,
which is 6.
We can
graph V as a function of the height, x. When x is close to 0, the
cylinder is short and wide, but the shortness (a factor of x) makes
V(x) quite small. When x is near 6, the cylinder is tall, but the
radius is very small (36/4 is 9!) so the V(x) is also quite small.
How can we find the cylinder of largest volume inside this sphere?
That will be at the "top" of the graph, and I will locate the top by
computing V´(x). So this is (Product Rule)
Pi(-2x/4)x+Pi(9-x^2/4). This will be 0 when (divide by Pi,
collect x^2's): 9-(3/4)x^2=0, so x=2sqrt(3) (the
negative root is not in the domain for this problem!). Indeed, if you
look at the graph, the top of the graph seems to be at about 3.4. This
specifies the cylinder. The radius and the volume can then be
computed.
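Here is a small sketch of that last computation, together with a crude grid search as an independent check on x=2sqrt(3). The grid spacing of 0.001 is an arbitrary choice for illustration.

```python
import math

def volume(x):
    """Volume of the inscribed cylinder of height x in a sphere of radius 3."""
    return math.pi * (9 - x**2 / 4) * x

x_best = 2 * math.sqrt(3)                    # critical point from V'(x)=0
radius_best = math.sqrt(9 - x_best**2 / 4)   # radius^2 = 9 - 12/4 = 6
volume_best = volume(x_best)

# crude check: largest volume over a grid of heights in the domain 0 < x < 6
grid = [i * 0.001 for i in range(1, 6000)]
x_grid = max(grid, key=volume)

print(x_best, radius_best, volume_best)
print(x_grid)   # should be within one grid step of x_best
```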
What this is, and what this is not
Again: this is not a profound problem! But it does show a
systematic way of solving such problems. There may be computational
details which can be irritating, but at least we have some method to
work with. And I assure you that we will go into great detail about
the general method later in this course.
Problem 26 of section 3.4
Section 3.4 has many interesting problems, and I strongly urge
students to read it. Here is a lovely problem. When I first looked at
it, I thought that it was very complicated, and that we didn't have
the necessary technical base to do it. So here's the problem: look at
a clock face at 3 o'clock. The minute hand points up, at 12, and the
hour hand points to the right, at 3. Fine: a right angle. Here's the
question: how fast is the angle changing at that time? Or, phrased
more technically, what is the angular velocity (?!) of the angle at 3?
The picture to the right may be lovely, but it is "frozen" and the
right angle only occurs at 3 o'clock. We can't study the rate of
change of the angle by considering that picture any more than we could
deduce the speed of a car by looking at a picture of it racing.
Well, the way I finally thought of doing this is to "decouple" (?) or separate the actions of the two "hands" of the clock.
Let's call θ1
the angle between the segment connecting the center of the clock with
"12" and the minute hand. I know that θ1 varies from 0 to 2Pi in one hour, and that the
minute hand moves steadily. Hey: I bet that dθ1/dt is constant, and
that it is 2Pi (in radians per hour).
On the other hand (little joke, sorry) we can analyze the motion of
the hour hand. Call θ2 the angle between the segment connecting the
center of the clock with "12" and the hour hand. The hour hand travels
completely around the clock in 12 hours, and it also moves
steadily. Therefore its angular velocity, in radians per hour, is
2Pi/12, and that's dθ2/dt.
Let me call the angle between the minute hand and the hour hand
θ. Then surely (at least, as
time progresses after 3 o'clock) I know that θ=θ2-θ1. Now we can figure out how θ changes:
dθ/dt=dθ2/dt-dθ1/dt=2Pi/12-2Pi=-11Pi/6.
To me, as I confessed in class, this is all not "intuitively" clear,
and I do need to think about it. The angle decreases (hey: the minute
hand does move faster than the hour hand!) and the rate of decrease is
steady. Interesting: not profound, but we needed to think a bit, and I
believe derivatives help.
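The decoupling of the two hands can be sketched directly. In this illustration (the function names and the choice of measuring t in hours after 3 o'clock are assumptions), each hand's angle from "12" is a steady function of time, and the rate of change of the angle between them is estimated with a difference quotient.

```python
import math

def minute_angle(t):
    """Angle of the minute hand from "12", t hours after 3 o'clock."""
    return 2 * math.pi * t

def hour_angle(t):
    """Angle of the hour hand from "12": a right angle at t=0, then 2Pi/12 per hour."""
    return math.pi / 2 + (2 * math.pi / 12) * t

def between(t):
    return hour_angle(t) - minute_angle(t)

h = 1e-6
rate = (between(h) - between(0.0)) / h

print(between(0.0))               # the right angle, Pi/2
print(rate, -11 * math.pi / 6)    # both about -5.76 radians per hour
```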
Leibniz notation
The derivative of f at x is f´(x). But historically there has been another system used. The difference quotient defining the derivative is the slope of a secant line for the curve y=f(x). It is a difference in y's (second coordinates of the points) divided by a difference in x's (first coordinates of the points). Traditional notation for the quotient looks like this: (Δy)/(Δx). And we want lim_{Δx-->0}(Δy)/(Δx). People like remembering the definition, because, as you will see, it helps intuition and applications. The notation that is used is dy/dx for "the derivative of y with respect to x". Use notation that helps you and that doesn't mislead you. Mostly I think Leibniz notation is helpful.
The derivative of sine: a guess
I began by (trying to) draw an accurate picture of sine and then
discussing the slopes of the tangent lines to this curve. The derivative of sine, is, of course, exactly those slopes.
Where would the derivative of sine be 0? Well, where the tangent lines are horizontal. That should be at the tops and bottoms of the sine curve. These occur at Pi/2, 3Pi/2, -Pi/2, etc.: lots of places because sine repeats every 2Pi.
Let's look at sine between, say, x=-Pi/2 and x=Pi/2. The derivative, the slope of the tangent line, would start out at 0 when x=-Pi/2, then it would increase (as the tangent line began to tilt up). Then it would tilt up more (the slope would be more positive) until it would begin to tilt "down": here the language gets complicated. I am not asserting that the derivative is negative, but I am merely asserting that the slope, which stays positive, begins to decrease. Eventually the slope becomes 0 again when x=Pi/2.
What should happen between x=Pi/2 and x=3Pi/2? In that interval the sine curve is sort of a downwards reflection of the behavior in the interval [-Pi/2,Pi/2]. The derivative starts at 0, then becomes negative. It gets more negative, then gets less negative, and ends up at 0. The shape of the derivative exactly reflects the shape in the earlier interval, since the sine curve's shape is a flip of the earlier behavior.
The derivative repeats every 2Pi, since y=sin(x) repeats every 2Pi. I drew sort of what is shown below. The derivative of sine qualitatively looks like cosine, except maybe we don't know the scaling factor: how high the curve is. Things are wonderful: the scaling factor is 1.
The derivative of sine: via limits, algebra, etc.
In a calculus textbook, something like the following is done when
f(x)=sin(x):
[sin(x+h)-sin(x)]/h = [sin(x)cos(h)+sin(h)cos(x)-sin(x)]/h = PIECE #1 + PIECE #2
where
PIECE #1 = cos(x)·[sin(h)/h]
and as h-->0 this --> cos(x)·1 because we arranged it this way when we decided to use radian measure! Also,
PIECE #2 = sin(x)·[(cos(h)-1)/h]
If we multiply this top and bottom by cos(h)+1, the result on top is [cos(h)]^2-1 which is -[sin(h)]^2. Then
PIECE #2 = -sin(x)·[sin(h)]^2/(h[cos(h)+1]) = -sin(x)·[sin(h)/h]·sin(h)·(1/[cos(h)+1])
Now as h-->0, I claim:
sin(x)-->sin(x) (nothing happens!); sin(h)/h-->1; sin(h)-->0; 1/[cos(h)+1]-->1/2.
So the result is -sin(x)·1·0·(1/2)=0. PIECE #2 contributes nothing in the limit, and the derivative of sin(x) is cos(x).
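The two limits used in this argument, sin(h)/h-->1 and (cos(h)-1)/h-->0, can be watched numerically, along with the difference quotient for sine itself. A minimal sketch; the function names and the sample point x0=1.0 are arbitrary choices, and angles are in radians.

```python
import math

def sin_ratio(h):
    return math.sin(h) / h

def cos_ratio(h):
    return (math.cos(h) - 1) / h

def sine_difference_quotient(x, h):
    return (math.sin(x + h) - math.sin(x)) / h

x0 = 1.0  # arbitrary sample point
for h in (0.1, 0.01, 0.001, 0.0001):
    # columns: h, sin(h)/h, (cos(h)-1)/h, difference quotient of sine, cos(x0)
    print(h, sin_ratio(h), cos_ratio(h),
          sine_difference_quotient(x0, h), math.cos(x0))
```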
The derivative of cosine
Look again at the shape of the graphs drawn just above. If I move the
coordinate axes Pi/2 to the right, the graph of sine becomes the graph
of cosine. The candidate graph for the derivative needs to be
recognized. It is actually minus the graph of sine. This minus
sign is slightly annoying, and sometimes I screw up and forget it or I
put it where it shouldn't be. The derivative of cos(x) is
-sin(x). (There is a shift of Pi/2 in both graphs.)
The derivative of tangent
A specific example of the quotient rule is to compute the derivative
of f(x)=sin(x)/cos(x). Here top=sin(x) and bottom=cos(x), and the
derivative will be
(top'·bottom-bottom'·top)/(bottom)^2.
Notice that because of the minus sign in the quotient rule and the
minus sign in the derivative of cosine, there is an asymmetry in the
result, and this can lead to errors.
f´(x)=[cos(x)·cos(x)-(-sin(x))·sin(x)]/[cos(x)]^2.
Here is an answer which demands simplification. The top is
[cos(x)]^2+[sin(x)]^2 which is 1. The whole result
is therefore 1/[cos(x)]^2 or [sec(x)]^2. Of
course, this f(x) is tan(x), so we now know that the derivative of
tan(x) is [sec(x)]^2. Please note the parentheses. I almost
always use lots of parentheses because I make mistakes.
There are 6 trig functions. I know the derivatives of sine and cosine and tangent, as I just showed you. Also sometimes useful is the derivative of secant. Notice: since sec=1/cos, sec´=-(-sin)/(cos)^2=(1/cos)(sin/cos)=(sec)(tan). I don't offhand know the derivatives of csc and cot. Sigh. Now, the updated table:
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
f(x)·g(x) | f´(x)·g(x)+f(x)·g´(x) |
CONSTANT(f(x)) | CONSTANT(f´(x)) |
1/f(x) | -f´(x)/[f(x)]^2 |
f(x)/g(x) | [f´(x)g(x)-g´(x)f(x)]/[g(x)]^2 |
sin(x) | cos(x) |
cos(x) | -sin(x) |
tan(x) | [sec(x)]^2=1/[cos(x)]^2 |
sec(x) | sec(x)tan(x) |
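The four trig rows of the table can be spot-checked with difference quotients. A minimal sketch; the test point x0=0.7 and the helper names are arbitrary choices for illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

def sec(x):
    return 1 / math.cos(x)

x0 = 0.7  # arbitrary test point (in radians) where cos(x0) is not 0

# function -> (the function itself, the table's derivative evaluated at x0)
checks = {
    "sin": (math.sin, math.cos(x0)),
    "cos": (math.cos, -math.sin(x0)),
    "tan": (math.tan, sec(x0) ** 2),
    "sec": (sec, sec(x0) * math.tan(x0)),
}

for name, (func, table_value) in checks.items():
    # the two numbers on each line should agree closely
    print(name, central_difference(func, x0), table_value)
```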
In mathematics you don't understand things. You just get used to them. |
QotD
Here is the question, and here is its solution, as done by my silicon pal, Si:
> time();
0.004
> f:=(3*exp(x)-4*cos(x))*(x^4-x^2+1)/((5/x^3)+7*sqrt(x));
f := (3 exp(x) - 4 cos(x)) (x^4 - x^2 + 1) / (5/x^3 + 7 x^(1/2))
> time();
0.004
> diff(f,x);
(3 exp(x) + 4 sin(x)) (x^4 - x^2 + 1) / (5/x^3 + 7 x^(1/2)) + (3 exp(x) - 4 cos(x)) (4 x^3 - 2 x) / (5/x^3 + 7 x^(1/2)) - (3 exp(x) - 4 cos(x)) (x^4 - x^2 + 1) (-15/x^4 + 7/(2 x^(1/2))) / (5/x^3 + 7 x^(1/2))^2
> time();
0.004
The computation took less than .001 seconds. Sigh. Different "invocations" of the program do take different amounts of time, however, even for identical computations. The program is very large, and when it is started, it may be stored in different chunks of memory in various ways, and this can increase the running time for some computations.
Exam warning!
The first exam of the course will be given two weeks from this
lecture, on Friday, October 12. More information about the exam,
including review material, sections to be covered, etc., will be
available soon.
Tuesday, September 25 | (Lecture #7) |
---|
Function | Derivative |
---|---|
x^n | nx^(n-1) |
I actually proved this in the last lecture when n is a positive integer. It is, in fact, true for any constant n. So examples would be:
Constants
I want f´(x)=lim_{h-->0}[f(x+h)-f(x)]/h. What if f is a
CONSTANT function, so its values are all the same?
Well then the top of the difference quotient, f(x+h)-f(x), will be
CONSTANT-CONSTANT, and it will be 0.
So the derivative will be 0.
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x
Let's consider the derivative of an exponential function, say
a^x, where a is a constant. Then the difference quotient,
[f(x+h)-f(x)]/h becomes [a^(x+h)-a^x]/h. As I
mentioned in class, just plugging in h=0 yields 0/0, and this doesn't
help. We can try the algebra that's available:
[a^(x+h)-a^x]/h=[a^x·a^h-a^x]/h=a^x((a^h-1)/h).
So we need to consider (a^h-1)/h as h-->0.
We actually analyzed this limit graphically and numerically in Lecture #3 for several values of a. When a=2, it seems that the limit exists and equals about .693, while if a=3, about 1.099 was the approximate value of the limit. Since f´(x) will be equal to a^x multiplied by whatever the value of lim_{h-->0}(a^h-1)/h is, I want to choose a value of a so that the formulas are as simple as possible. So maybe we can get a value of a so the limit of (a^h-1)/h as h-->0 is 1. This can be done. In fact, here is a fairly irritating way of thinking about this special number, which people call e. We would like (a^h-1)/h to be approximately 1 when h is small. Well, look at the following sequence of ideas: then a^h-1 is approximately h, so a^h is approximately 1+h, and a is approximately (1+h)^(1/h). If h=1/n with n large, this says that a is approximately (1+{1/n})^n.
n | (1+{1/n})^n |
---|---|
1 | 2.000000000 |
2 | 2.250000000 |
3 | 2.370370369 |
4 | 2.441406250 |
5 | 2.488320000 |
10 | 2.593742460 |
100 | 2.704813829 |
1,000 | 2.716923932 |
10,000 | 2.718145926 |
100,000 | 2.718268237 |
1,000,000 | 2.718280469 |
Well, the numerical value of (1+{1/100})^100 is about 2.704813829. Above is a table of values of (1+{1/n})^n. This method of "computing" or approximating e is actually very very slow. As previously written, here are several million digits of e, if you need them. As I mentioned in class, it is certainly possible to prove that the formula converges as n gets large. This can be done with "only" high school algebra. It is quite tedious, and, to me at least, has little redeeming social value. (I may have the wrong attitude here, so please forgive me.)
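The table of (1+{1/n})^n is easy to regenerate, and the same few lines can estimate lim_{h-->0}(a^h-1)/h for several bases. A minimal sketch; the function names and the step h=1e-6 are choices made for illustration.

```python
import math

def compound(n):
    """The quantity (1 + 1/n)^n from the table."""
    return (1 + 1 / n) ** n

def exp_quotient(a, h=1e-6):
    """(a^h - 1)/h for a small h, an estimate of the limit as h-->0."""
    return (a ** h - 1) / h

for n in (1, 2, 3, 4, 5, 10, 100, 1000, 10000, 100000, 1000000):
    print(n, compound(n))

# the quotient should be close to 1 when a is e
for a in (2, 3, math.e):
    print(a, exp_quotient(a))
```

The slow convergence is visible in the output: a million-fold increase in n buys only about six correct digits of e.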
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
What does that limit statement mean?
The definition f´(x)=lim_{h-->0}[f(x+h)-f(x)]/h is
sometimes a bit difficult to understand. What if I just throw out
"lim_{h-->0}"? Well, certainly f´(x) is NOT the same as [f(x+h)-f(x)]/h (hey: one of
them has an h and the other doesn't even mention h!). So it really
means f´(x)=[f(x+h)-f(x)]/h+ERR, where ERR (stands for "ERROR",
of course) is some "mess", and all I know about it (and care about it,
at this time!) is that it goes to 0 as h goes to 0. I don't like
division, so let me multiply by h. Here's the result:
f´(x)h=f(x+h)-f(x)+ERR·h.
I don't like subtraction, so let me add f(x).
f(x)+f´(x)h=f(x+h)+ERR·h.
People usually put the "ERR" term on the other side, so let me do
that. It is irritating, but I won't change the sign on the ERR term,
because right now I am interested in the qualitative aspect. So what I
have is:
f(x+h)=f(x)+f´(x)h+Err·h |
---|
So what the heck do we have? If we think of the function as taking an input variable, x, and smooshing (?!) it around to get an output value, f(x), then f(x+h) is what results if we "kick" the input value by a little bit, h. If f is differentiable, then the output seems to split up into several pieces.
I have a silicon friend, and my silicon friend knows a function
(really!) named LambertW. (I am sorry, that is actually the
name of the function -- I will tell you later in the course where
it comes from). This function can be computed, and to the right are
some values of LambertW.
LambertW(3+h)=LambertW(3.)+LambertW´(3.)(h)+smaller stuff
If I use the numbers in the first row of the table, this becomes
1.051613975=1.049908895+LambertW´(3.)(.01)+smaller stuff
Then I can approximate LambertW´(3.) by just dropping (what I hope is) the "smaller stuff" term and "solving" for LambertW´(3.). The result is the approximate value 0.1705080000. The second row gives me 0.1709417000. The third row gives me 0.1707030000 and the fourth row gives me 0.1707460000. I bet that LambertW is differentiable, and its derivative at x=3 is about .1707. Indeed (!) when I ask my silicon pal for the value of LambertW's derivative at 3, the answer I get is 0.1707244807. My silicon pal knows lots of derivatives, many more derivatives than I do.
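The experiment can be repeated without a computer algebra system. The sketch below makes several assumptions that are not in the lecture: it computes LambertW on its principal branch by Newton's method applied to w·e^w=x (the function `lambert_w` is a hand-rolled illustration, not a library routine), and it uses the perturbations h=.01, -.01, .001, -.001, which are a guess at the kind of values in the missing table.

```python
import math

def lambert_w(x, tol=1e-14):
    """Principal-branch Lambert W for x > 0: the w solving w*exp(w) = x.
    Hand-rolled Newton iteration, for illustration only."""
    w = math.log(1 + x)                      # starting guess
    for _ in range(100):
        e = math.exp(w)
        step = (w * e - x) / (e * (w + 1))   # Newton step for w*e^w - x
        w -= step
        if abs(step) < tol:
            break
    return w

def quotient(h, x=3.0):
    """Approximate W'(x) by dropping the 'smaller stuff' and solving."""
    return (lambert_w(x + h) - lambert_w(x)) / h

for h in (0.01, -0.01, 0.001, -0.001):   # assumed perturbations
    print(h, quotient(h))

# Differentiating w*e^w = x implicitly gives W'(x) = W(x)/(x(1+W(x))).
w3 = lambert_w(3.0)
print(w3, w3 / (3.0 * (1 + w3)))
```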
Comment I really didn't select some special silly function for
this demonstration. I wanted to do an "experiment" so I took a
function that I don't know very well and thought you would be unlikely
to know. I wanted you to see how the numbers worked. Here's
more than you are likely to want to know about this function.
|
Sums
Suppose f and g are differentiable functions. Then I know:
f(x+h)=f(x)+f´(x)h+Err_f·h
g(x+h)=g(x)+g´(x)h+Err_g·h
I used different notation for the error terms for f and g to keep track of stuff
better than I did in class.
I can add these equations. Here's the result:
f(x+h)+g(x+h)=f(x)+f´(x)h+Err_f·h+g(x)+g´(x)h+Err_g·h
but this is an awfully silly way to write the result. I should write
it in such a way that the "structure" of the equation is shown. Here:
(f(x+h)+g(x+h))=(f(x)+g(x))+(f´(x)+g´(x))h+([Err_f+Err_g]h)
The left-hand side is the function f+g at the input value x+h.
The first piece, (f(x)+g(x)), is the unperturbed value of the function f+g
at x. The second piece,
(f´(x)+g´(x))h, is a multiplier not involving h,
multiplied by h. The third piece is h multiplied by some stuff:
[Err_f+Err_g], and all this stuff-->0 as
h-->0. Hey, this is the higher order vanishing. So I know the next
line in the table.
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
Products
This is harder.
Suppose I know
f(x+h)=f(x)+f´(x)h+Err_f·h
g(x+h)=g(x)+g´(x)h+Err_g·h
and now I multiply the equations. Well, the left-hand side isn't
bad, but there are three terms on the right-hand side of each
equation, so there will be NINE terms if I distribute out the
product/sums. I am really considering the product function of f and
g. Here is how to organize the result. In class we thought a bit about
how to organize this. In "print" such discussion is more difficult to
write.
Please: I don't want all this stuff to be memorized. I haven't
memorized it. But I do know the general idea, and that's what I'd like
you to get used to. Here is the left-hand side:
(f(x+h)·g(x+h)). This
is the function f·g's value at x+h.
Here are the pieces of the product of the two
right-hand sides:
(f(x)·g(x)). This
is the function f·g's value at x: the "old", unperturbed value
of f·g.
(f´(x)·g(x)+f(x)·g´(x))h. This is the first-order term, stuff (no h's!)
multiplied by one h.
Here is all the rest of the stuff. There are (good
grief!) six different terms. But I can "pull out" one h from all of
these terms, and what is left in all six of these terms are things
that -->0 as h-->0. Wow!
(f(x)·Err_g+f´(x)·g´(x)h+f´(x)·Err_g·h+g(x)·Err_f+Err_f·g´(x)h+Err_f·Err_g·h)h.
If you see this, sort of, then you can see the next line of the
table.
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
f(x)·g(x) | f´(x)·g(x)+f(x)·g´(x) |
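A quick numerical sanity check of the new table line, as a sketch only: I picked f(x)=x^3 and g(x)=e^x (my choice, not an example from class), subtracted the unperturbed value and the first-order term from (f·g)(x+h), and divided what is left by h. If the bookkeeping above is right, that quotient should shrink as h shrinks.

```python
import math

def f(x):
    return x ** 3

def df(x):
    return 3 * x ** 2

def g(x):
    return math.exp(x)

def dg(x):
    return math.exp(x)

def leftover(x, h):
    """(f*g)(x+h) minus the old value and minus the first-order term."""
    first_order = (df(x) * g(x) + f(x) * dg(x)) * h
    return f(x + h) * g(x + h) - f(x) * g(x) - first_order

x = 1.0
for h in (0.1, 0.01, 0.001):
    # leftover/h plays the role of the six "rest of the stuff" terms
    print(h, leftover(x, h) / h)
```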
A quote from von Neumann
John
von Neumann (1903-1957) was a mathematician who was raised in
Hungary and spent most of his career in the United States. He worked
in many areas of pure and applied mathematics. His ideas were
influential in quantum mechanics, the development of nuclear weapons,
game theory, and the theory and construction of digital computers.
Another way of looking at products
Well, first let's reconsider addition. We could imagine f and g being
functions that somehow model intervals which are growing in length. At
a certain time, the intervals have some length (the lengths labeled
just f and g to the right). If we increase time ("+h") then each
interval grows. I used the Greek capital Delta (Δ) to indicate how much
each would grow. If we combine the functions with addition, then the
growth just adds. To me this is sort of straightforward. I hope it is
to you. Since the growths add, the average growths add, and the
"instantaneous growths" (the derivatives!) also just add.
A simple physical (?) model of multiplication in this setting would be to just make the intervals into sides of a rectangle. Then the area of the rectangle will be the product function, f·g. When we allow the intervals to grow, the area grows in a more complicated manner. Look at the picture: the increase of the area has three "chunks". One of them is an increase in f multiplied by g. Another is f, multiplied by the increase in g. Then there's the corner piece, which is increase in f multiplied by increase in g. When the increments get small, the corner piece's decrease is much faster than the two other pieces' decreases. (Sorry: my use of language is not very good here.) So I think that the instantaneous increase in f·g will be given by f´(x)g(x)+f(x)g´(x): these are the first order terms of the increase. Of course, this is the product rule.
Examples
The derivative of 37x^88: well, this is a product. It is
37·x^88. Here f(x)=37 and g(x)=x^88. So
f´(x)g(x)+f(x)g´(x) becomes
0·x^88+37·88x^87. Most people think
that multiplication by a constant deserves an entry of its own. So
back to the table:
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
f(x)·g(x) | f´(x)·g(x)+f(x)·g´(x) |
CONSTANT(f(x)) | CONSTANT(f´(x)) |
Reciprocal
Now I want to consider the reciprocal ("one-over"). This is not
the inverse function, which is much more complicated. Here I just
want the multiplicative inverse of the result of the function. For
example, if f(x)=3x^2+5, then f(2)=3·4+5=17. If
g(x)=1/f(x), g(2) would be 1/17.
We could for a second pretend that the graph of f looks roughly ("qualitatively") like what is shown to the right. I haven't included much information, but as we travel from left to right on the curve y=f(x), the graph goes up. The slope of the tangent line is a geometric representation of the derivative. I bet that the slopes of the tangent lines to y=f(x) are positive. Now look at y=1/f(x). Since the values of f got bigger (yeah, I made it simple, everything is positive) the values of y=1/f(x) get smaller. That graph is decreasing and the tangent lines have negative slope. So I bet there should be some negative sign appearing somewhere in some formula. Let's get the formula.
Let me assume that g(x)=1/f(x), and that g(x) is differentiable. Then
f(x)·g(x)=1. We could differentiate this equation. As I did in
class, I will be very adventurous (?) and differentiate the right-hand
side first. That gives me 0. I will use the product rule on the
left-hand side and then this equation results:
f´(x)·g(x)+f(x)·g´(x)=0.
But I want a "formula" for the derivative of g(x), so I will "solve"
for the derivative of g. Here is the result:
g´(x)=–f´(x)·g(x)/f(x)=–f´(x)/[f(x)]^2 (because g(x) is 1/f(x) and I substituted).
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
f(x)·g(x) | f´(x)·g(x)+f(x)·g´(x) |
CONSTANT(f(x)) | CONSTANT(f´(x)) |
1/f(x) | –f´(x)/[f(x)]^2 |
just one crummy Example
We did get our minus sign. The derivative of 1/(3x^2+5) must
be -(6x)/(3x^2+5)^2.
Quotient rule
Finally, I will state the Quotient Rule. This is a formula which
allows you to differentiate f(x)/g(x). If you think of this as
f(x)·[1/g(x)], you could differentiate by using the Product
Rule and the Reciprocal Rule (no one calls it that!) in
succession. But people rarely do that. They just remember another
formula. Here it is, the last line in the table for today.
Function | Derivative |
---|---|
x^n | nx^(n-1) |
CONSTANT | 0 |
e^x | e^x |
f(x)+g(x) | f´(x)+g´(x) |
f(x)·g(x) | f´(x)·g(x)+f(x)·g´(x) |
CONSTANT(f(x)) | CONSTANT(f´(x)) |
1/f(x) | -f´(x)/[f(x)]^2 |
f(x)/g(x) | [f´(x)g(x)-g´(x)f(x)]/[g(x)]^2 |
The quote from von Neumann: "In mathematics you don't understand things. You just get used to them."
Actually, von Neumann was the sort of person who liked to say shocking things. I believe you and I should try to understand things (he certainly did, in spite of the quote), in order to increase the amount of structure in what we know. Such structure makes the incredible accumulation of facts more tolerable.
The derivative of (7e^x+9x-11)/(x^5-sqrt(x)) is
[(7e^x+9)(x^5-sqrt(x))-(7e^x+9x-11)(5x^4-1/{2sqrt(x)})]/(x^5-sqrt(x))^2.
What a mess. Remember that sqrt(x)=x^(1/2), whose derivative
is (1/2)x^(-1/2) or, if you wish, 1/{2sqrt(x)}. Also we now
can officially differentiate rational functions, which are quotients
of polynomials.
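With a mess like that, a check is worth a few seconds. The sketch below (my own helper names, nothing from the text) compares the Quotient Rule formula for this example against a symmetric difference quotient at x=2. I chose x=2 because the bottom, x^5-sqrt(x), is 0 at x=1 and I wanted to stay away from that.

```python
import math

def top(x):
    return 7 * math.exp(x) + 9 * x - 11

def dtop(x):
    return 7 * math.exp(x) + 9

def bottom(x):
    return x ** 5 - math.sqrt(x)

def dbottom(x):
    return 5 * x ** 4 - 1 / (2 * math.sqrt(x))

def quotient(x):
    return top(x) / bottom(x)

def quotient_rule(x):
    """[f'(x)g(x) - g'(x)f(x)] / [g(x)]^2 with f = top, g = bottom."""
    return (dtop(x) * bottom(x) - dbottom(x) * top(x)) / bottom(x) ** 2

def central_difference(func, x, h=1e-5):
    """Numerical slope estimate; h = 1e-5 is an arbitrary small step."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 2.0
print(quotient_rule(x), central_difference(quotient, x))
```

The two printed numbers should agree to several decimal places.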
What grows faster?
I wanted to compare x^10 and e^x. When x=2, the
first function is 2^10=1,024, and the second function is
e^2 which is certainly less than 9 (e is less than 3). In
fact, e^2 is about 7.389. Which one grows faster? I asked
people to graph this, say from 0 to 5. A graph is shown to the
right. My "tool" has lots more computing power and pixels than the
typical hand-held device, but please notice the differing scales on
the vertical and horizontal axes. The red curve is y=x^10
since it sort of exits the window at the top. The lazy (?),
apparently more slowly increasing green curve is y=e^x.
Comparing the asymptotics as x gets large of these functions turns out
to be extremely important in practice. For example, similar
discrepancies or differences occur in descriptions of different types
of chemical reactions, so knowing what's fast and what's slow is
valuable. Consider the function F(x)=x^10/e^x. The
Quotient Rule tells me that its derivative is
F´(x)=(10x^9·e^x-x^10·e^x)/[e^x]^2. Some algebra (cancel e^x top and bottom, factor the top) gives me
F´(x)=x^9·(10-x)/e^x.
Let me concentrate on x>0. The first factor, x^9, is
positive. The second factor, 10-x, is positive for x between 0 and
10. It is 0 at 10, and afterwards, for all x's bigger than 10, it is
negative. So:
x^10/e^x begins with the top increasing faster
than the bottom. Things sort of are even at 10, but forever
afterwards, the bottom is increasing faster than the top.
The graph shown is very deceptive. To the right is a graph of y=x^10/e^x more truthfully displayed. The window should be examined carefully. The x's go from 0 to 30. The y's go from 0 to (don't miss the little number in the top left!) 5·10^5: that's 500,000, five hundred thousand. So it looks like x^10 is winning, winning, winning ...until after x=10, it loses, loses, loses. To give you some idea of how badly it loses, 83^10 is about 1.55·10^19, big, but e^83 is about 1.11·10^36, much, much, much bigger. Some chemical reactions occur in nanoseconds. Others just loaf along.
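If you don't trust the pictures, a few lines of Python can tabulate the race. This is only a sketch: F and F_prime are my names for x^10/e^x and the simplified derivative found above, and the sample x's are arbitrary.

```python
import math

def F(x):
    return x ** 10 / math.exp(x)

def F_prime(x):
    return x ** 9 * (10 - x) / math.exp(x)

# Before x=10 the derivative is positive (top winning); after, negative.
for x in (2, 5, 10, 20, 30, 83):
    print(x, F(x), F_prime(x))

# Among the whole numbers 1..30, where is F largest?
best = max(range(1, 31), key=F)
print("largest value at x =", best)
```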
QotD
Suppose
f(x)=(1x^2+2)/(3x^2+4).
Find an equation for the line tangent to y=f(x) when x=1.
A solution For this we need a POINT and a
SLOPE.
POINT Since f(1)=(1+2)/(3+4)=3/7, the tangent line
passes through (1,3/7).
SLOPE Since (Quotient Rule) f´(x)=
[(2x)(3x^2+4)-(6x)(x^2+2)]/(3x^2+4)^2,
f´(1)=[2·(7)-6·3]/(3+4)^2=-4/49.
An equation for the line is therefore (y-3/7)=-(4/49)(x-1). I am lazy,
and unless I am told that the form I just wrote is unacceptable or
unless I needed another form to work more with the equation, I would
leave the equation exactly that way. (Please!)
Another (?) answer: y=-(4/49)x+(25/49).
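The arithmetic in this QotD can be double-checked with exact fractions. A small sketch, assuming nothing beyond the formulas above (the variable names are mine):

```python
from fractions import Fraction

def f(x):
    return (x ** 2 + 2) / (3 * x ** 2 + 4)

def f_prime(x):
    # Quotient Rule: (top' * bottom - bottom' * top) / bottom^2
    return ((2 * x) * (3 * x ** 2 + 4) - (6 * x) * (x ** 2 + 2)) / (3 * x ** 2 + 4) ** 2

x0 = Fraction(1)
point = (x0, f(x0))                  # the POINT
slope = f_prime(x0)                  # the SLOPE
intercept = point[1] - slope * x0    # for the y = (slope)x + (intercept) form
print(point, slope, intercept)
```

Using Fraction instead of decimals keeps 3/7 and -4/49 exact, so the two forms of the answer can be compared without rounding.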
Friday, September 21 | (Lecture #6) |
---|
We discussed various curves which could represent the position of Francine on the parkway in terms of miles from the start of the parkway at time t, in terms of hours elapsed from 7 AM. I tried to show that our everyday intuition led to the graph being increasing (as you travel from left to right, the points on the graph go up). The graph can have level spots, where Francine pulls over for a rest stop. Legally Francine isn't supposed to drive backwards, though.
If we believe that motion is continuous (so Francine does not have a Star Trek transporter or other device) then the graph of Francine's position goes from (7 AM, 0 miles) to (10 AM, 172 miles) and therefore the graph must have on it at least one point with coordinate description (*,135). All of this, by the way, rests on some complicated assumptions, some of them philosophical (why should motion be continuous?). Today, though, I believe that motion is continuous, and therefore at some time Francine must be at Mile 135. This is the whole idea behind the following important result.
Intermediate Value Theorem
Suppose that the function f is
defined and continuous on the interval [a,b]. Then the equation f(x)=y
has at least one solution for every y which is between f(a) and f(b).
In mathematics, the word theorem is applied to results that are deduced from basic principles, and usually the term is used for more important conclusions in the subject. In this case, the Intermediate Value Theorem follows from basic principles governing the real numbers. A particular basic principle which is used in the proof of the theorem is the "least upper bound" property of the real numbers. This essentially declares that there are "no holes" in the real numbers. A precise statement is fairly delicate, and this property essentially shows that the reals and the rationals are distinct. Several upper-level math courses spend quite a bit of time exploring the statement. You can read about it in Wikipedia but I do add that detailed knowledge of such foundational material is not needed for success in Math 151 (or, for that matter, for successful careers in almost all of science and engineering!).
Problem #5 in section 2.7
Here's the problem:
Show that cos(x)=x has a solution in the
interval [0,1].
Since I am a picture person, I frequently try to draw a graph or two
to understand what the problem is about. In this case, certainly the graph of y=x is easy to imagine on [0,1]. What
about y=cos(x)? Well, it does help, if you don't have access to a
graphing device, to know that cos(0)=1. I know that cosine drops down
as x increases from 0. What do we know about cos(1)? As one student
declared, cos(1) is both less than 1 and greater than 0. That's
because Pi/2 is about 1.57 and cosine decreases between 0 and Pi/2,
and does not reach 0 until Pi/2. So the graph of
y=cos(x) decreases from (0,1) to (1,cos(1)), and the end point
is somewhere above the x-axis. The mental picture I have built is
shown to the right. I deliberately did not have a graphing
device create an "accurate" display of the situation -- I wanted to
show what we should be able to do inside our own heads.
The picture now encourages me to believe the assertion of the
problem. The text supplies this hint:
Show that f(x)=x-cos(x) has a zero in
[0,1].
The phrase "has a zero in [0,1]" means that there is some root of
f(x)=0 in the interval [0,1]. Well, f(0)=0-cos(0)=-1<0 and
f(1)=1-cos(1). Since cos(1) is between 0 and 1, 1-cos(1) will be
between 1 and 0. So we know that f(1)>0. Since the values at the
endpoints have opposite signs (one negative, one positive), I know that 0 is between the
endpoint values. The Intermediate Value Theorem then applies to show
that f(x)=0 has a solution between the endpoints, which are 0 and 1.
Can we locate this root more precisely?
So far what we know is that there is a root of cos(x)=x in [0,1]. Such
equations occur quite frequently in applications. It is rare that
solutions to such equations can be written exactly in terms of simple
operations (roots, logs, etc.) and classical constants, such as e and
Pi. But it may be (usually is!) important to know them accurately. Can
we get better information?
If f(x)=x-cos(x), we know f(0)<0 and f(1)>0. I've indicated this with the + and - labels on the ends of the unit interval to the right.
f(.5)=–.3775..., so we now know there is a root in the interval [.5,1].
f(.75)=+.0183..., so we now know there is a root in the interval [.5,.75].
f(.625)=–.1859..., so we now know there is a root in the interval [.625,.75].
Etc. By this I mean we can continue chopping the interval, looking for the sign of f's value at the center, making the length of the interval where a root is located as small as we like. This is the key idea of the Bisection Algorithm. The weird entry condition below, f(a)·f(b)<0, means that f has different signs at the two ends of the interval.
The Bisection Algorithm
Entry conditions
A continuous function f(x) defined on an interval [a,b], with
f(a)·f(b)<0;
a positive tolerance E for the error.
Output
An interval [c,d] so that d-c<E and f(c)·f(d)<0.
This identifies an interval of length less than E
which must contain a root of f(x)=0.
"Loop" structure
Given [p,q] with f(p)·f(q)<0: let m=(p+q)/2. Compute f(m).
If f(m)·f(p)<=0, then replace q by m else
replace p by m.
Exit check If q-p<E then return p and q as c
and d in the output else go to loop.
The very alert student might
notice a slight difference from what I described in class. Here I
wrote f(m)·f(p)<=0 inside the loop. In class I
omitted the = possibility thinking that the program would just work
anyway. Mr. Sloane asked me about this
and I dismissed his worry. I was wrong. If you "test" the program on a
function like f(x)=2x-1 on the interval [0,1], whose only root is at
the midpoint, 1/2, then without the =, the program goes from [0,1] to
[.5,1] to [.75,1], and loses the root entirely! With this change, the
algorithm properly goes from [0,1] to [0,.5] to [.25,.5] to [.375,.5],
etc. This is probably the way it should perform. My programming and logical skills have suffered another crushing insult.
A simple program implementing the bisection method
    bisection := proc (f, a, b, E)
      local p, q, m;
      p := a; q := b;
      while E <= q-p do
        print(p, q);
        m := (1/2)*p+(1/2)*q;
        if f(p)*f(m) <= 0 then q := m else p := m end if
      end do
    end proc;
The function f has been defined by the formula f(x)=x-cos(x) in another statement: f:=x->x-cos(x);. This bisection program prints out the intermediate stages, so you can see the program focusing on the interval in which the root sits.
    > bisection(f, 0., 1., 0.001);
    0., 1.
    0.5000000000, 1.
    0.5000000000, 0.7500000000
    0.6250000000, 0.7500000000
    0.6875000000, 0.7500000000
    0.7187500000, 0.7500000000
    0.7343750000, 0.7500000000
    0.7343750000, 0.7421875000
    0.7382812500, 0.7421875000
    0.7382812500, 0.7402343750
    0.7392578125
When the program ends, it reports the last variable's value, the middle of the subinterval. We will get more sophisticated root-finding methods, but this one is simple to understand and to use.
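For readers without Maple, here is roughly the same loop as a Python sketch. It is a translation under my own choices, not the program used in class: it returns the final interval [c,d] (as the "Output" description asks) instead of printing each stage.

```python
import math

def bisection(f, a, b, tol):
    """Shrink [a, b], where f(a)*f(b) < 0, until its length is below tol."""
    p, q = a, b
    while q - p >= tol:
        m = (p + q) / 2
        if f(p) * f(m) <= 0:   # the "=" keeps a root that sits exactly at m
            q = m
        else:
            p = m
    return p, q

# The cos(x)=x problem, rewritten as a zero of x - cos(x) on [0,1].
c, d = bisection(lambda x: x - math.cos(x), 0.0, 1.0, 0.001)
print(c, d)

# The troublesome test case: the only root of 2x-1 is the first midpoint.
print(bisection(lambda x: 2 * x - 1, 0.0, 1.0, 0.01))
```

With the <= test the second call keeps 1/2 as the right endpoint all the way down, which matches the corrected behavior described above.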
Algorithm?
The modern meaning for algorithm is quite similar to that of recipe, process, method, technique, procedure, routine, except that the word "algorithm" connotes something just a little different. Besides merely being a finite set of rules which gives a sequence of operations for solving a specific type of problem, an algorithm has five important features: [finiteness, definiteness, input, output, effectiveness].
Knuth continues on the same page to contrast his definition of algorithm with what could be found in a cookbook: Let us try to compare the concept of an algorithm with that of a cookbook recipe: A recipe presumably has the qualities of finiteness (although it is said that a watched pot never boils), input (eggs, flour, etc.) and output (TV dinner, etc.) but notoriously lacks definiteness. There are frequently cases in which the definiteness is missing, e.g., "Add a dash of salt." A "dash" is defined as "less than 1/8 teaspoon"; salt is perhaps well enough defined; but where should the salt be added (on top, side, etc.)?
Average and Instantaneous Rates of Change
A few lectures ago I tried to analyze a number of real phenomena. I
hope that background should help you accept the following definitions:
The average rate of change of f in the interval
[x_0,x_1] is
(f(x_1)-f(x_0))/(x_1-x_0).
Geometrically, this is the slope of a secant line connecting the two
points (x_0,f(x_0)) and
(x_1,f(x_1)) on the graph of y=f(x). The
instantaneous rate of change of f at x_2 is a
stranger thing, that probably can't be physically measured in most
cases. It is the slope of the tangent line at
(x_2,f(x_2)). The instantaneous rate of change of
f at x_2 is better and better approximated by the average
rate of change if the numbers x_0 and x_1 are
close to x_2.
The definition
Consider lim_{h-->0}(f(x+h)-f(x))/h. If this limit exists,
then the value of the limit is f´(x), the derivative of f at
x, and we say that f is differentiable at x.
Most of this course will be devoted to studying the derivative of a function and its uses. Actually, the next few lectures will show that, for familiar functions defined by formulas, the derivative can be computed fairly easily. So computation of derivatives, while both necessary and useful, is not the ultimate aim of the course (hey, such computation can be described carefully enough so that [capable] programmers can create differentiation programs!). We will spend most of the time investigating how to use derivatives.
Example
A traditional first example is f(x)=x^2. Then we need to
consider (f(x+h)-f(x))/h. Let's look at the top of the fraction.
f(x+h)-f(x)=(x+h)^2-x^2=x^2+2xh+h^2-x^2=2xh+h^2=h(2x+h).
In the last step, I factored out an h because I was thinking ahead:
(f(x+h)-f(x))/h=(h(2x+h))/h=2x+h.
Now
lim_{h-->0}(2x+h)=2x. We're done.
Conclusion The function f(x)=x^2 is differentiable at
all x's, and its derivative is given by f´(x)=2x.
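To watch the limit happen, here is a tiny sketch that evaluates the difference quotient for f(x)=x^2 at x=3 with exact fractions (my choice of x and of the h's, just for illustration). The algebra above says the quotient should be exactly 2x+h.

```python
from fractions import Fraction

def difference_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

def square(x):
    return x * x

x = Fraction(3)
for h in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    # each value should equal 2*x + h exactly
    print(h, difference_quotient(square, x, h))
```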
A tangent line
What is an equation of a line tangent to y=x^2 at x=3? Here
we need a POINT and a SLOPE.
POINT (3,9). If x=3, y=3^2=9.
SLOPE f´(3)=2·3=6. The
derivative's value at x=3 is the slope of the tangent line at x=3.
An equation for the tangent line at x=3 is therefore y-9=6(x-3). I'd
probably leave the equation this way unless there was a reason to
change it (if I were requested to provide the answer in a different
form, or if I needed to compute with it more).
Is it correct?
Here I wanted to consider the formula for the derivative,
f´(x)=2x and consider if the answer were reasonable. The
graph of y=x^2 to the left of the y-axis is decreasing, and
the tangent lines should be tilted "down". Their slopes should be
negative. And the algebraic candidate we have for the slopes of
tangent lines, 2x, is negative when x<0. On the other side of the
y-axis, the tangent lines tilt "up", and their slopes seem to be
positive. Of course, for x>0, 2x>0.
In this case, considering the answer and seeing that it is reasonable
and consistent with other information is easy. Certainly in more
complicated situations, such checks are more difficult. But if at all
possible, within the limits of time and effort, please try to make
such a check. Everyone makes mistakes: humans, computers, humans using
computers, etc. A few seconds "thought" can catch errors that can be
very irritating later.
And now for x^n
Everyone knows the answer. O.k., but why? (Why is "the answer"
actually the answer, not why does everyone know it.) If
f(x)=x^n where n is a positive integer, then:
f(x+h)-f(x)=(x+h)^n-x^n.
We need to consider (x+h)^n. It is possible, using the Binomial
Theorem, to write an explicit exact expanded form of this
object. I don't need that. I need much less information. So:
(x+h)^n=(x+h)(x+h)···(n times)···(x+h).
There are lots of ways to multiply things out here: you need to choose
in each factor either the left (x) term or the right (h) term. But how
many ways are there which would get only x's? For this, we would need
to make only the x choice each time. There is exactly one way to get
all x's, so x^n would only come out one time. How many ways
would result in one h and all the rest (n-1 of them) x's? Well, we
could choose the h from the first term and choose all x's from the
other terms. Or we could choose an h from the second term and all the
others (the first term and the terms after the second term)
x's. Etc. Here by "Etc." I mean that we could take an h from exactly
one of n factors, with all the other choices being x's. So since there
are n factors, there are n ways to get a product with one h and all
the rest x's. So in the result, there's nhx^(n-1). What about
the rest? We took care of the "no h" term and the "one h" terms. So
all the rest has at least two factors of h. So, actually, we now know:
(x+h)^n=x^n+nhx^(n-1)+h^2·JUNK.
In this expression "JUNK" is not something bad. It represents terms I
don't need to care about at this time. In fact, later in the course,
we will try to understand some aspects of JUNK and how they can be
useful. Anyway, let's continue:
(x+h)^n-x^n=x^n+nhx^(n-1)+h^2·JUNK-x^n=nhx^(n-1)+h^2·JUNK=h(nx^(n-1)+h·JUNK)
where again I factored the h out because I was thinking ahead. Now the limit:
lim_{h-->0}(f(x+h)-f(x))/h=lim_{h-->0}(h(nx^(n-1)+h·JUNK))/h=lim_{h-->0}[nx^(n-1)+h·JUNK]=nx^(n-1).
Therefore, f(x)=x^n is differentiable, and its derivative,
f´(x), is nx^(n-1).
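The counting argument (choose x or h from each of the n factors) can be imitated directly. The sketch below is my own bookkeeping, not anything from class: it multiplies out (x+h)^n one factor at a time, recording for each product x^i·h^j how many ways it arises.

```python
def expand_power(n):
    """Expand (x+h)^n; returns {(power of x, power of h): number of ways}."""
    terms = {(0, 0): 1}
    for _ in range(n):
        new_terms = {}
        for (i, j), count in terms.items():
            # from this factor, choose the x term or choose the h term
            for key in ((i + 1, j), (i, j + 1)):
                new_terms[key] = new_terms.get(key, 0) + count
        terms = new_terms
    return terms

n = 7
terms = expand_power(n)
print("ways to get all x's:", terms[(n, 0)])
print("ways to get one h:", terms[(n - 1, 1)])
junk = {key: c for key, c in terms.items() if key[1] >= 2}
print("everything else has at least two h's:", junk)
```

The "no h" count should be 1 and the "one h" count should be n, and every remaining term carries h^2 or worse: that is the JUNK.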
QotD
If f(x)=1/x, use the definition of derivative to find f´(x).
So f(x+h)-f(x)=1/(x+h)-1/x=(x-(x+h))/((x+h)x)=-h/((x+h)x) and
lim_{h-->0}(f(x+h)-f(x))/h=lim_{h-->0}[-h/((x+h)x)]/h=lim_{h-->0}[-1/((x+h)x)]=-1/(x·x)=-1/x^2.
Interesting aspects of this computation: if f(x)=1/x, then
f(x+h)=1/(x+h). You must understand the grammar (?) of functions to do
this. And combining the fractions and converting to a simple fraction:
you must know how to do this sort of algebra.
Limit test!!!
Please see here and be prepared
for a short exam about limits next Wednesday.
Tuesday, September 18 | (Lecture #5) |
---|
lim_{x-->2} [1/(3x+4) - 1/(x^2+6)]/(x-2)
As I remarked in class, I can't imagine a situation where this specific limit would occur. But let me try to evaluate it anyway.
"Plugging in" x=2 gets 0 on the bottom, and on top, if we substitute correctly, 1/(3x+4) becomes 1/10 and 1/(x^2+6) becomes 1/10. So the top is (1/10)-(1/10) which is 0. Surprise (not!): this is a 0/0 situation. We use some algebra. My feeling is great dislike for compound fractions, that is, fractions within fractions. I find them difficult to understand and difficult to manipulate. My advice is to change them into "simple" fractions, where only one division sign will appear. But we need to do this carefully.
The top is 1/(3x+4) - 1/(x^2+6) which is
[(x^2+6)-(3x+4)]/[(x^2+6)(3x+4)]
Notice that this fraction is sitting "on top" of (x-2). We have
something like this:
(A/B)/C = [(A/B)·(1/C)]/[C·(1/C)] = [A/(B·C)]/1 = A/(B·C)
Thank you, Ms. Fung, for asking about this. So the result in our case is that the compound fraction becomes the following simple fraction:
[(x^2+6)-(3x+4)]/[(x^2+6)(3x+4)(x-2)]
Limits and algebra
If lim_{x-->a}f(x)=L_1 and
lim_{x-->a}g(x)=L_2 then
• lim_{x-->a}[f(x)+g(x)] exists and equals L_1+L_2;
• lim_{x-->a}[f(x)·g(x)] exists and equals L_1·L_2;
• (when L_2 is not 0) lim_{x-->a}[f(x)/g(x)] exists and equals L_1/L_2.
QotD (definitely less absurd)
(Please accept my guarantee that this is not absurd, at least
for a little while [2 lectures].) What is
lim_{x-->w} [1/x^2 - 1/w^2]/(x-w) ?
This is a compound fraction. I will try to convert it to a simple fraction. Let's look at the top:
(w^2-x^2)/(x^2w^2(x-w)) = (w-x)(w+x)/(x^2w^2(x-w)) = -(w+x)/(x^2w^2)
An easy mistake to make here is somehow losing (?) the minus sign. Please don't. Then:
A weird but not absurd limit
f(x)=|x+5|/(2x+10). So what happens as x-->7, for example? The
x+5-->7+5=12. The absolute value for x's near 7 gives |x+5| near 12. The
results are all positive, so |x+5|-->12. The bottom, 2x+10, approaches
2·7+10, which is 24. The resulting limit value is 12/24.
There's trouble near -5, though. So what about lim_{x-->-5}|x+5|/(2x+10)?
Here there's a real difficulty with absolute value. Recall that |w| is
w if w>0 and |w| is -w if w<0. So |x+5| is x+5 if x+5>0, and
|x+5| is -(x+5) if x+5<0. But x+5>0 is the same as x>-5, and
x+5<0 is the same as x<-5. So put all this together:
suppose f(x)=|x+5|/(2x+10).
If x>-5, f(x)=(x+5)/(2x+10)=1/2;
if x<-5, f(x)=-(x+5)/(2x+10)=-1/2.
To the right is a graph of f. I don't "know" what the value of f(-5) should be (computer programs handle this differently, but many report that trying to evaluate the function at -5 is an error).
Such functions occur when trying to model real situations (hey: "hit" a plate -- that will be a shock! Or drop a chunk of salt into a container of water -- that's a shock to the salt concentration!). So getting some language to describe the behavior is useful.
As x gets close to -5 from the left, the values of f(x) are all -1/2. So people say that lim_{x-->-5^-}f(x), the left-hand limit of f at -5, exists, and they say that the value of this limit is -1/2.
As x gets close to -5 from the right, the values of f(x) are all 1/2. So people say that lim_{x-->-5^+}f(x), the right-hand limit of f at -5, exists, and they say that the value of this limit is 1/2.
You may be confused at first with the superscript/exponent + and -. The "+" up there always means "from the right" and the "-" up there will always mean "from the left". In the expression x-->-5^+ the minus before the 5 means look 5 units to the left of 0. The plus sign in the exponent means see what happens to the right of that number (that is, -5), but look close to it.
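A short sketch in Python shows the two one-sided behaviors numerically. The step sizes are arbitrary choices of mine; and Python happens to be one of the programs that reports an error when asked for f(-5).

```python
def f(x):
    return abs(x + 5) / (2 * x + 10)

for step in (0.1, 0.001, 0.00001):
    # first value: approaching -5 from the left; second: from the right
    print(step, f(-5 - step), f(-5 + step))

try:
    f(-5)
except ZeroDivisionError as err:
    print("f(-5) is not defined:", err)
```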
A piecewise linear graph and some limits
I drew a graph similar to what's shown to the right. The idea was to use the geometric information on the graph to find out what we know about function values and limits. I also wanted to urge students to understand the possibilities of our algebraic "language". So to the right is a graph of y=f(x). Some questions:
Lots of oscillations, I
I asked people to consider the function f(x)=sin(1/x). The range of
this function is certainly easy enough to understand. The outputs are
outputs from sine, and therefore they are numbers from -1 to 1. The
domain is all x's except 0 (division by 0 is bad).
What makes this function interesting is its wild behavior near 0. When x is very small, 1/x is very large. So a short interval of x's near 0 gets changed to a long interval of numbers away from 0, and then these numbers are "fed into" sine. For example, look at the interval of x's between 1/(5Pi) and 1/(4Pi). These numbers work out to 0.064 and 0.079. If x is between 1/(5Pi) and 1/(4Pi), then 1/x is between 4Pi and 5Pi. The sine curve oscillates once, from 0 to 1 and back to 0, between 4Pi and 5Pi. Therefore the function f(x)=sin(1/x) goes from 0 to 1 to 0 in the interval [0.064,0.079]. Hey! Look at the x's between, say, 1/(100Pi) and 1/(101Pi). Now 1/x goes from 100Pi to 101Pi. And, again sine's values go from 0 to 1 to 0. So there is another "blip" in the graph. Of course I've been looking at stuff from even multiples of Pi to odd multiples of Pi. If I had looked from odd multiples to evens, I would have seen blips down, from 0 to -1 to 0. Hey, the sin(1/x) curve oscillates infinitely often from 0 to 1 to 0 to -1 and back to 0 in every small interval centered around 0. Certainly lim_{x-->0}sin(1/x) does not exist, and neither do the left- and right-handed limits.
Computing machines (calculators and other graphical devices) have a finite number of pixels and a finite ability to consider numbers. They usually can't show too much detail of the curve y=sin(1/x), and sometimes very strange results occur. You should check what your calculator does. Here is a graph drawn by Maple on [-1,1]. Maybe it is slightly better than a calculator-drawn graph, but it still can't cope with all the oscillations.
Lots of oscillations, II Here I looked at f(x)=x·sin(1/x). This function takes the previous result and multiplies it by x. So the heights will vary. The sine outputs are between -1 and +1, but then we multiply by x. If x is very small, then the function result is very small. Actually, since -1<=sin(1/x)<=+1, we get -|x|<=x·sin(1/x)<=+|x| (for x>0 this says -x<=x·sin(1/x)<=+x). This graph will always be between the lines y=x and y=-x. It will touch the lines each time sin(1/x) is +1 or -1, and this happens (look at the preceding analysis) infinitely often. So what happens here to the limit as x-->0? The heights are being squeezed down to 0. I claim that the outputs all get close to 0, and that lim_{x-->0}f(x) exists, and the value of the limit is 0.
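The squeezing can be seen in numbers too. A minimal sketch (sample points chosen by me) that evaluates x·sin(1/x) at x's marching toward 0 from both sides and checks that each height stays within |x|:

```python
import math

def f(x):
    return x * math.sin(1 / x)

for k in range(1, 7):
    for x in (10.0 ** (-k), -(10.0 ** (-k))):
        # third column: is the height trapped between -|x| and +|x|?
        print(x, f(x), abs(f(x)) <= abs(x))
```

A handful of sample points proves nothing, of course; the inequality is what does the work.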
Now another function ...
Here is one of the very important limits of calculus. It will be built
into many of the methods we will use. We should understand it. So:
What can we say about lim_{x-->0}[sin(x)]/x? Please be very
careful. Here the division is outside of the function evaluation.
If we try to "plug in" we again get the equivocal 0/0, and that is not
helpful. I suggested that we plot f(x)=[sin(x)]/x.
Here is a plot of y=[sin(x)]/x. It doesn't look too strange. Certainly, this seems to suggest that lim_{x-->0}[sin(x)]/x=1. In fact, this is true, when x is measured in radians.
If you insist on using degrees, a calculator, with
window properly adjusted, will show you something like the graph to
the right for a plot of y=[sin(x)]/x near 0. This graph seems almost
flat, and it seems to show that IF YOU MEASURE ANGLES WITH
DEGREES then lim_{x-->0}[sin(x)]/x=.017, which is also
true.
Please note that the vertical and horizontal scales are not the same in this graph.
Degrees or radians?
Choose between the numbers .017 and 1: which is better to compute
with? Everyone I know likes 1 better. Therefore everyone analyzing
periodicity with calculus will use the trig functions with radian
measure.
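Both limits can be checked without a graphing window. In this sketch (function names are mine) Python's sin always expects radians, so the "degrees" version converts first with math.radians; the .017 above is Pi/180, rounded.

```python
import math

def ratio_radians(x):
    return math.sin(x) / x

def ratio_degrees(x):
    # x is an angle in degrees; convert before handing it to sin
    return math.sin(math.radians(x)) / x

for x in (1.0, 0.1, 0.001):
    print(x, ratio_radians(x), ratio_degrees(x))

print("Pi/180 =", math.pi / 180)
```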
What is the limit?
The limit of [sin(x)]/x as x-->0 is 1. Better, because it is so important:
lim_{x-->0} sin(x)/x = 1
The text discusses this result on pages 100 and 101. Or you can look
here for the result as I
discussed it last year in Math 151.
Hey, engineers, it is sometimes
handy to know that if |x| is very small, then sin(x) and x are about
the same. For example:
The differences get very, very small. In fact, we can predict how
small the differences get using calculus --later in the course.
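A small Python sketch can tabulate how close sin(x) and x are for small |x|. The sample inputs and the helper name `sin_degrees` are my own choices for illustration, not values from the lecture.

```python
import math

# Compare sin(x) with x for small x, with x measured in radians.
for x in [0.5, 0.1, 0.01, 0.001]:
    diff = x - math.sin(x)
    print(f"x={x:<6} sin(x)={math.sin(x):.10f}  x-sin(x)={diff:.3e}")

def sin_degrees(d):
    """Sine of an angle given in degrees (hypothetical helper)."""
    return math.sin(math.radians(d))

# With degrees, the quotient sin(x)/x settles near Pi/180, about .01745.
ratio_deg = sin_degrees(0.001) / 0.001
print(ratio_deg, math.pi / 180)
```

The printed differences shrink much faster than x itself does, which matches the remark that sin(x) and x are "about the same" near 0.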
|
The Squeeze Theorem
The Squeeze Theorem is a result which is occasionally useful. Here are
the hypotheses:
IF f(x)<=g(x)<=h(x) and
IF lim_{x-->a} f(x) and lim_{x-->a} h(x) both exist and
IF the two limits are equal,
THEN lim_{x-->a} g(x) exists, and equals the common value of the other two limits.
This is also called the Sandwich Theorem in some texts. The picture is supposed to give some credibility to the result. The green, lowest curve, represents y=f(x). The blue, highest curve, represents y=h(x). In between, frantically trying to "escape", is y=g(x), but it is trapped, more and more narrowly as x-->a, between the other curves.
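As a hedged numerical illustration of the Squeeze Theorem, the sketch below traps g(x)=x·sin(1/x) between the bounds -|x| and |x| (absolute values are used so the inequality also holds for negative x). The sample points are arbitrary choices of mine, and a finite check like this is evidence, not a proof.

```python
import math

def g(x):
    """The trapped function x*sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)

# Sample inputs approaching 0 from both sides: +-0.1, +-0.01, ...
samples = [s * 10.0 ** (-k) for k in range(1, 8) for s in (1, -1)]

# Check the squeeze inequality -|x| <= g(x) <= |x| at every sample.
trapped = all(-abs(x) <= g(x) <= abs(x) for x in samples)

# The outputs are forced toward 0 along with the bounds.
largest_near_zero = max(abs(g(x)) for x in samples if abs(x) <= 1e-5)
print(trapped, largest_near_zero)
```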
Now for this course ...
I mentioned the following ideas.
COMPETITION The course is not a struggle
between students. The grades are given essentially on an absolute
scale, so one student getting a high grade does not imply another
student will get a lower grade. Please help each other. This is a good
thing.
COMBAT This is also not a struggle between
students and the instructor. It is my job and strong desire to help
you do well. This means learning calculus and learning how to write
about calculus. My comments and grades are part of my effort to do
this. Please take advantage of the interaction given by the course. It
isn't as much as we would ideally like, due to economic factors, but
try to take advantage of what there is.
I returned workshops and QotD's to students who attended.
Friday, September 14 | (Lecture #4) |
---|
The idea of limit
Today we begin an official discussion of limit. This is a very useful
idea. There are lots of computations and manipulations which are
related to limits. But I must be very careful about how class time is
used, and I must rely on you to read the textbook and do
homework problems. I will choose to emphasize ideas in the lectures. I
will try to do a sufficient number of examples to illustrate these
ideas, but your own efforts will be key to getting enough experience
and sufficient familiarity.
Tree growth
Catalogs sell trees with short descriptions which frequently include
such information as "The eventual height of this tree is 40 ft and its
growth rate is 2 ft/yr." What do these numbers mean? We discussed this
for a while.
First approach
If one took the 2 ft/yr literally, then in a century the tree would be
200 ft high, and a bit later (well, quite a bit later!) the tree would
knock down the moon. This is probably not realistic.
Another attempt to understand ...
We made a second attempt to understand. The tree would grow at 2
ft/yr, and then, after 20 years (40/2=20) the tree would stop
growing. Well, if we plotted the growth curve, it might look like what
is shown to the right. To me, a simple piecewise linear curve is
really neat and simple. But I don't think this represents behavior of
complex organic objects. For example, I don't think a really tiny tree
would actually grow two feet taller in its first year. I think it
would begin by growing rather slowly. I also don't think that trees
grow at a steady rate, and then tree growth would suddenly and totally
stop at some specific height. That's probably not what happens.
Maybe more real ...
Reality is sloppier. I bet that the tree begins its growth fairly
slowly. Then the growth gets more rapid and steadier, and persists for
quite a while until the tree gets near its "ultimate height". The tree
growth slows then, but I would expect more growth, at a slower rate,
for a while. I think that one more reasonable possibility for a growth
curve is shown to the right.
In fact, I think what's shown to the right is, itself, only an imitation of real tree growth. Growth will depend on climate and weather (moisture, temperature), and nourishment (how much of what kind of minerals, etc.), and competition (in the shadow of something else?), and such factors as where is the lumber harvester. Each tree will likely be somewhat different, and what's presented and described is an average of many observations.
How could tree growth around, say, year 13, be calculated? Well, let H(t) be the height of the tree in feet, at time t years after the tree seed germinates. Then we could measure the height of the tree once a year. So we'd have values for H(12) and H(13) and H(14), etc. The growth rate in feet per year for the year [13,14] would be (H(14)-H(13))/(14-13). This number would represent the average growth rate of the tree during that year. Of course, we could want more precise information. I could imagine someone measuring the tree height every 3 months, a quarter of a year. The interval [13,13.25] would be closer to 13. And I would think that the quantity (H(13.25)-H(13))/(13.25-13) would be a better idea of what might be the tree growth rate "at" 13. Of course, this fraction is what's called the average growth rate in the interval [13,13.25]. We could imagine an enthusiast desiring better information. This person could want 10,000 measurements of the tree height during the year, etc. This information could be used to give more precise information about tree growth rate. Now think about real life. I don't believe that 10,000 tree measurements are likely. I think that approximate average rates of growth are all we can expect, and the idea of some totally precise "growth rate" is fictional. But, even though it is fictional, such an ideal growth rate may be useful.
Diary entry in progress!
Rock dropping
I used feet and pounds, an antique system of measurements in this
discussion. The units confused people. I am sorry.
Suppose we drop a rock. Then the rock will fall approximately
s(t)=16t^2 ft after t seconds. If we wanted to know the
average velocity that the rock fell in the one second after t=3, we
could compute:
\[ \frac{s(4)-s(3)}{4-3} = \frac{16(4^2)-16(3^2)}{1} = 16(16-9) = 112 \]
The units in this answer are ft/sec. So, on average, the rock fell 112 ft/sec during the time interval [3,4]. Suppose I wanted to get a better estimate of the average velocity near time 3. I could measure the distance at, say, t=3.007. Then since (3.007)^2=9.042049, the average velocity of the rock during the time interval [3,3.007] would be
\[ \frac{s(3.007)-s(3)}{3.007-3} = \frac{16(9.042049)-16(9)}{.007} = \frac{16(.042049)}{.007} = 96.112 \]
So this average velocity, which, with some effort, I could actually imagine observing and measuring and computing, is 96.112: that is, on average during the time interval [3,3.007], the rock fell 96.112 ft/sec.
Let's now use algebra. Algebra allows almost everyone to be clever. So we can ask: what is the average velocity of the rock during the time interval [3,3+h] if h is a small positive number? The computation would be something like this:
\[ \frac{s(3+h)-s(3)}{h} = \frac{16(3+h)^2-16(3^2)}{h} = \frac{16(9+6h+h^2)-16\cdot 9}{h} = \frac{16(6h+h^2)}{h} = 16(6+h) \]
So the average velocity of the rock during the interval [3,3+h] seconds is 16(6+h) ft/sec.
This is actually a rather remarkable computation, and just because you may have seen such things before is no reason to ignore the wonder: you must be like a child .... Let's check the formula. If h=.007, then 16(6+h) is 16(6.007)=96.112 (that's what we got before). To me, the most remarkable part of the computation is that we have a quotient, a tiny distance change on top, and a tiny time change on the bottom. Amazing, somehow the "tiny"'s sort of cancel, and we are left with an average velocity, 16(6+h), which has a sort of stability property as h gets smaller. In fact, if you don't think too hard, as h gets smaller ("h-->0") then 16(6+h) gets close to 16(6)=96 ("16(6+h)-->96").
So the average velocity over the interval [3,3+h] approaches 96 as the length of the time interval goes to 0. People abbreviate this by declaring that 96 is the instantaneous velocity of the rock at time 3. As far as I know, human beings don't actually measure or observe instantaneous velocities. These instantaneous things are lies, o.k., we shouldn't call them lies, they are useful fictions or mental constructions. Most of this course will be about various useful fictions.
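The rock computation can be repeated numerically. This is a minimal sketch using the model s(t)=16t^2 from the discussion; the function name `average_velocity` and the list of h values are assumptions made for illustration.

```python
def s(t):
    """Distance fallen in feet after t seconds (the model used above)."""
    return 16 * t ** 2

def average_velocity(t, h):
    """Average velocity in ft/sec over the time interval [t, t+h]."""
    return (s(t + h) - s(t)) / h

# Shrinking intervals starting at t=3.
for h in [1, 0.007, 0.0001, 0.000001]:
    print(h, average_velocity(3, h))
```

The printed values stay close to 16(6+h), so they drift toward 96 as h shrinks, up to small floating point rounding.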
Local linearity
Last time we considered the curve y=3^x
and zoomed in on it around the point (0,1). We observed that the curve
seemed more and more to appear like a straight line. Hey: I even gave
this phenomenon a name: local linearity. The line goes through the
point (0,1), and therefore complete information about it only depends
on knowing its slope. Well, if we think that the line is about the
same as the curve, on a very very small scale, then maybe another
point on the line would approximately be (h,3^h) if h is a
small positive number. The slope is then the difference in the second
coordinates divided by the difference in the first coordinates:
(3^h-1)/h (on the bottom, h-0 is just h). In this case,
unlike the low-degree polynomial investigated with the rock, the h's
don't obviously somehow fade away. We guessed using numerical evidence
that as h-->0, this quotient also "stabilizes" near the value
1.1. This behavior is not obvious to me algebraically, and, indeed, if
I wanted to verify it for you right now, showing the limiting behavior
would take a great deal of effort.
A function which isn't locally linear
Look at the graph to the right, and specifically look "near"
(0,0). Here I am deliberately omitting the scale marks on the axes,
because they don't matter for the purposes of this discussion. If we
zoom in around (0,0), we get something which looks much the same. This
is a self-similar object and scale doesn't matter. (Self-similarity
is a key part of what's called fractals, one natural
(?) appearance of which is here).
There's no magnification which will make the graph to the right look like a straight line through the point (0,0). So this is a graph of a function which is not locally linear. In this course, most of the functions we will consider will be locally linear, so you should see that some rather simple graphs will just be thrown out of consideration.
Please realize that the graph to the right is associated with |x|, and is not an abstract, weird invention.
Observations and ideals
We, humans, can observe average rates of change. We can measure how
balls roll down inclined planes. We can, with considerably more
ingenuity and equipment (go over to the Chem Department!) make
observations of chemical reactions on a near-nanosecond scale. As far
as I know, we look at average rates of change. A nice invention
is the instantaneous rate of change, which will be a principal
object of study in this course. As far as I know, we do not observe
instantaneous rates of change, we can only approximate them by real
measurements. This course studies the useful fictions related to
instantaneous rates of change.
What is a limit?
The idea of a limit took several hundred years to evolve and
be understood. We try to recapitulate ("summarize briefly") this
development today in about 70 minutes!
What is a limit?
"Real" functions are not very exact. A chemical engineer might have
(I'm simplifying hugely) some process which produces the correct kind
of plastic if, say, a certain amount of benzene is used in the
mix. So a certain percent of benzene, say 3%, may produce a
plastic which has the desired amount of, say, translucence. But in
reality, even very precise measurements may not create an input with
exactly 3% benzene. Maybe some days we get 3.5, or 4.2. And maybe the
desired output measurement is not necessary -- we can deal with some
error in the output. Real measurements imply that we need to
contemplate error, not as a moral defect (at least in this case), but
as part of our mathematical model which we must deal with.
Output tolerance as controlled by input tolerance
So maybe we need to understand the following idea: we want a
certain output tolerance: that is what we can "live with" in the
material created. Is there an input tolerance describing inputs in an
interval around the ideal input which makes the corresponding outputs
close enough to the desired output? The existence of such a
relationship is what's meant by limit.
lim_{x-->a} f(x)=c means that if some output tolerance is
specified, so we want |f(x)-c|<some acceptable output error, then
there is some input error or wiggle that will force that acceptable
output error.
Very precisely, here is what is meant by lim_{x-->a} f(x)=c: Given some positive OTE (Output Tolerance Error), there is a positive number AIE (Acceptable Input Error) so that: If 0<|x-a|<AIE then |f(x)-c|<OTE. Maybe this is too darn abstract. Let me consider a numerical example, with a very simple function.
x^2 near x=3
Change the output tolerance to .2
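Here is a minimal Python sketch of the OTE/AIE idea for f(x)=x^2 near x=3 with output tolerance .2. The particular choice AIE=min(1, OTE/7) is my assumption (one valid choice, not the largest possible and not necessarily the one used in class): if |x-3|<1 then |x+3|<7, so |x^2-9|=|x-3||x+3|<7|x-3|.

```python
def f(x):
    return x ** 2

def acceptable_input_error(ote):
    """One valid AIE for f(x)=x^2 near a=3 (hypothetical helper).

    If |x-3| < 1 then |x+3| < 7, so |x^2-9| < 7|x-3| <= ote.
    """
    return min(1.0, ote / 7.0)

ote = 0.2
aie = acceptable_input_error(ote)

# Spot-check inputs strictly inside the interval (3-aie, 3+aie).
steps = 1000
inputs = [3 - aie + 2 * aie * k / steps for k in range(1, steps)]
all_within = all(abs(f(x) - 9) < ote for x in inputs)
print(aie, all_within)
```

Changing `ote` to a smaller number produces a smaller `aie`, which is exactly the "input tolerance controls output tolerance" relationship.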
I do think that a few of the students in Math 151 may need to get this more intimate (?) view of limits, but not more than a few of the students. Most students will need to understand how to compute limits for familiar functions and familiar combinations of functions. There are some standard techniques. The first one is almost ludicrous, but I recommend it strongly.
Plug in
Suppose you want to understand lim_{x-->a} f(x). If the
function is defined by some formula you understand, plug
in. That is, evaluate f(a).
Example What's lim_{x-->3} x^2? I think it is
3^2=9.
This "trick" is important enough so that there is a label which goes
with it. A function f is continuous at x=a if the domain of f
includes an interval surrounding a, and if lim_{x-->a} f(x)
exists, and if the value of that limit is f(a).
In other words, the function is continuous at x=a if the limit of the
function as x "approaches" a is gotten just by plugging in a to f:
computing f(a). Due to many facts about limits that we will mention
next time, most familiar functions are continuous in their domains.
Many of the most interesting limits we need to deal with (those involving rates of change, for example) can't be evaluated by just plugging in. Here are two examples: lim_{x-->0} (3^x-1)/x and lim_{x-->0} (sin(x))/x. In both cases, plugging in gives 0/0. In fact, both limits do exist. The value of the first one is ln(3) and the value of the second one is 1. Neither fact is obvious, and some effort is needed. But sometimes limits can be seen by (relatively!) simple algebraic manipulation. We will try to change them into forms which can be evaluated by "plugging in".
Algebraic transformation, with the goal being ...
I think we looked at something like
lim_{x-->2} (x^2-3x+2)/(x-2). First plug in. The
top becomes 2^2-3·2+2=4-6+2=0. The bottom is also
0. I don't know what 0/0 means. But we can realize that the top can be
factored:
x^2-3x+2=(x-2)(x-1). Therefore, if x is not equal to 2,
(x^2-3x+2)/(x-2)=[(x-2)(x-1)]/(x-2)=x-1.
We are dividing the top and bottom of the fraction by x-2. This would
not be valid if x were equal to 2.
Two comments: first, I'll rarely be so careful as to mention
restrictions like "x not equal to 2" again. I'm just not that pedantic
("marked by a narrow focus on or display of learning especially its
trivial aspects"). Second, the "x not equal to 2" is actually part of
the precise definition of limit, if you look either in the text or
what's written above. We don't need to know what happens exactly at
x=2. We want to know what happens when x is close to 2. Back to the limit:
Therefore lim_{x-->2} (x^2-3x+2)/(x-2)=lim_{x-->2} (x-1)=1 (just by plugging in).
Another example
Graphical example
Suppose we have a piecewise-defined function:
\[ f(x) = \begin{cases} 1-x^2 & \text{for } x<2 \\ \text{I don't care (!)} & \text{for } x=2 \\ 2x+1 & \text{for } x>2 \end{cases} \]
Part of the graph of this function is shown to the right. The formula y=1-x^2 defines a parabola whose top is at (0,1), and which opens down. We take the piece of this curve which is to the left of x=2. The formula y=2x+1 is a straight line of slope 2 which passes through (3,7) for example. We take the piece of this curve which is to the right of x=2.
What can we say about lim_{x-->2} f(x)? If we first restrict
attention to x<2, we can use one formula:
For x<2 (so x approaches 2 from the left),
lim_{x-->2^-} f(x)=lim_{x-->2^-} (1-x^2)=1-2^2=-3. This
limit is just "plugging in".
Now if we consider x>2, use the other formula:
For x>2 (so x approaches 2 from the right),
lim_{x-->2^+} f(x)=lim_{x-->2^+} (2x+1)=2·2+1=5. Again,
the answer uses "plugging in".
I hope that if you look at the picture you can see these two results. But what about lim_{x-->2} f(x)? There is no single number which the outputs approach when x gets close to (but not equal to) 2. The outputs get close to -3 (if inputs are from the left) and they get close to 5 (if the inputs are from the right). So we say that "this limit does not exist". If a limit exists (without any further description, so without the qualifications of left or right) then there should only be one value.
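The two one-sided behaviors can be seen numerically. This is a small sketch of the piecewise function above; raising an error at x=2 is my way of encoding "I don't care", not something specified in the lecture.

```python
def f(x):
    """The piecewise function from the text; f(2) is deliberately left undefined."""
    if x < 2:
        return 1 - x ** 2
    if x > 2:
        return 2 * x + 1
    raise ValueError("f(2) is not specified")

# Approach 2 from the left (2-h) and from the right (2+h).
for k in range(1, 7):
    h = 10.0 ** (-k)
    print(f"h={h:g}  f(2-h)={f(2 - h):.6f}  f(2+h)={f(2 + h):.6f}")
```

The left column heads toward -3 and the right column toward 5, so no single number serves as the limit.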
One further algebraic example
I think we looked at the more complicated
lim_{x-->w} (x-w)/(sqrt(x)-sqrt(w)).
Here plugging x=w in just gets you 0/0, which I don't understand. I
can see two ways of analyzing the fraction inside the limit.
Method 1 Multiply by the conjugate. Here this means:
\[ \frac{x-w}{\sqrt{x}-\sqrt{w}} = \frac{(x-w)(\sqrt{x}+\sqrt{w})}{(\sqrt{x}-\sqrt{w})(\sqrt{x}+\sqrt{w})} = \frac{(x-w)(\sqrt{x}+\sqrt{w})}{x-w} = \sqrt{x}+\sqrt{w} \]
Therefore lim_{x-->w} (x-w)/(sqrt(x)-sqrt(w))=lim_{x-->w} (sqrt(x)+sqrt(w)) which we can handle by plugging in x=w, so the answer is sqrt(w)+sqrt(w)=2sqrt(w).
Method 2 Another way of handling the quotient which some people prefer is this:
\[ \frac{x-w}{\sqrt{x}-\sqrt{w}} = \frac{(\sqrt{x}-\sqrt{w})(\sqrt{x}+\sqrt{w})}{\sqrt{x}-\sqrt{w}} = \sqrt{x}+\sqrt{w} \]
Then again lim_{x-->w} (x-w)/(sqrt(x)-sqrt(w))=lim_{x-->w} (sqrt(x)+sqrt(w))=2sqrt(w).
Use any valid method that works for you, please. Here's another view of conjugation
QotD
What is lim_{x-->3} (x^2-4x+3)/(x^2-x-6)?
Plugging in gets
(3^2-4·3+3)/(3^2-3-6)=(9-12+3)/(9-3-6)=0/0
so that doesn't work. But:
\[ \frac{x^2-4x+3}{x^2-x-6} = \frac{(x-3)(x-1)}{(x-3)(x+2)} = \frac{x-1}{x+2} \]
and lim_{x-->3} (x-1)/(x+2)=(3-1)/(3+2)=2/5 by "plugging in".
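A quick numerical sanity check of the QotD answer, as a sketch: evaluate the original quotient at inputs near (but not equal to) 3. The function name `q` and the step sizes are my own choices.

```python
def q(x):
    """The QotD quotient; undefined at x=3 (and at x=-2)."""
    return (x ** 2 - 4 * x + 3) / (x ** 2 - x - 6)

# Inputs just to the left and just to the right of 3.
for h in [0.1, 0.001, 0.00001]:
    print(h, q(3 - h), q(3 + h))
```

Both columns settle near .4, which agrees with 2/5.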
Tuesday, September 11 | (Lecture #3) |
---|
3^6 means
3·3·3·3·3·3. That is, 3's repeated
and multiplied 6 times. Therefore
RULE
3^(positive integer) means 3's multiplied together that
"positive integer" number of times.
Then, surely, people noticed things like 3^(5+2) is
3^7, seven 3's multiplied, and this is the same
(associativity) as 3^5·3^2: just line
things up and multiply them.
RULE If n and m are
positive integers, then
3^(n+m)=3^n·3^m.
Well, then, this "rule" is neat and easy to remember. But consider the
following "equation" (quotes because until you decide what some of the
symbols mean, the equation is silly):
3^6=3^(6+0)=3^6·3^0. If
this equation is valid, then 3^0 should mean 1. Making this
equation correct would mean that the simple template just above would be
true for n and m non-negative integers, and then I (and other
people) would have less to remember.
RULE 3^0=1.
If we want to extend the n+m equation more, consider the following
"equation" (quotes again because we are exploring how the symbols
work):
3^0=3^(7-7)=3^(7+(-7))=3^7·3^(-7).
Then since we want 3^0 to be 1,
1=3^7·3^(-7) so that
3^(-7)=1/3^7. Well, then, here's another
definition:
RULE If n is a positive
integer, then 3^(-n)=1/3^n.
So, for example, 3^(-2) must be 1/9. Finally, what happens if
we repeat exponentiation? Let's consider (3^4)^2. What
is this? Well, 3^4 is 3·3·3·3 (four 3's
multiplied). Then squaring this gets us
(3·3·3·3)·(3·3·3·3). If
you count this is eight 3's multiplied: 3^8. So therefore
(3^4)^2=3^8. Therefore we get
another:
RULE If n and m are positive integers, then (3^n)^m=3^(n·m).
But what about, say, 3^(7/2)? If we believe in the previous
result, or, better, we want to use it as a simple template for
computation, then consider (3^(1/5))^5. This "should" be
3^((1/5)·5)=3^1=3. So therefore
3^(1/5) should mean the positive number which is the fifth
root of 3, the unique positive number which solves x^5=3.
RULE If n is a positive
integer, then 3^(1/n) is the nth root of 3, that
is, the positive solution of x^n=3.
People have adapted these rules so that exponentiation is easy to handle.
If b is a positive number, then:
|
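The RULEs worked out above for base 3 carry over to any positive base b. As a hedged spot-check (a numerical test at sample values, not a proof), the sketch below restates them for b and verifies each one; the values of b, n, and m are arbitrary choices of mine.

```python
import math

b, n, m = 3.0, 5, 2  # sample values; any positive b and positive integers n, m

checks = {
    "b^(n+m) = b^n * b^m": math.isclose(b ** (n + m), b ** n * b ** m),
    "b^0 = 1": b ** 0 == 1,
    "b^(-n) = 1/b^n": math.isclose(b ** (-n), 1 / b ** n),
    "(b^n)^m = b^(n*m)": math.isclose((b ** n) ** m, b ** (n * m)),
    "(b^(1/n))^n = b": math.isclose((b ** (1 / n)) ** n, b),
}
for rule, ok in checks.items():
    print(rule, ok)
```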
The definition (?) of 3^sqrt(2)
I mentioned this in class, although I suspect it seemed weird and
totally useless to most students. Very briefly, the problem is how to
define and compute such things as 3^sqrt(2). The previous
discussion showed how to compute, say, 3^(77/17). First find
the 17th root of 3 and then take the 77th power
of that root. But sqrt(2) is not rational. It can't be written as a
quotient of two integers. Numbers like 3^sqrt(2) will occur
in calculus, and we may want to get some approximation of them. So
what's done? First, get an approximation of sqrt(2): 1.414213562, for
example (!). This is rational (you may not like it, but it is
1414213562/1000000000). Then compute 3^1.414213562 which is
about 4.728804386. And this is what is done.
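The recipe just described can be sketched in Python: first a rational power computed as "root, then power", then powers of 3 with better and better rational approximations of sqrt(2). The digit counts in the loop are my own choices for illustration.

```python
import math
from fractions import Fraction

# A rational exponent: take the 17th root of 3, then the 77th power.
root17 = 3 ** (1 / 17)
rational_power = root17 ** 77

# Rational approximations of sqrt(2) with more and more decimal digits.
for digits in [1, 3, 6, 9]:
    r = Fraction(round(math.sqrt(2) * 10 ** digits), 10 ** digits)
    print(r, 3 ** float(r))

# The nine-digit approximation quoted in the text.
approx = 3 ** 1.414213562
print(approx)
```

The printed powers stabilize as the rational exponents close in on sqrt(2), which is the sense in which 3^sqrt(2) is "defined".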
The curve y=3^x investigated
I asked people to graph y=3^x on their graphing
calculators. Below are some results.
To the right is a graph of y=3^x in the window -1<=x<=1, 0<=y<=2. I think the curve looks sort of the way everyone thinks it should. This window has the point (0,3^0), which is (0,1), in the middle of it.
Please keep track of the dimensions of the window. Here I've "zoomed in". The window is -.25<=x<=.25, .75<=y<=1.25. This is again a square window. The zoom is a factor of 4. The bend in the curve is still visible to me, but the amount of curviness is less.
The zoom is higher. Now the square window has these
dimensions: -.01<=x<=.01, .99<=y<=1.01. The curve is, to
my weak human eyes, just about flat. The phrase "locally linear" is
used for functions like 3^x. This phrase means that if the
graph of the function is zoomed in on enough, the result looks very much
like a straight line. In fact, the functions of interest in this
course will be exactly the locally linear functions. The straight line, if you look at it carefully (some students did) has slope approximately equal to 1.1. (Hey, (3^.05-3^0)/(.05-0) is about 1.13).
We then went through the same sequence of graphs for y=2^x. I'll skip to the third graph. So what's shown to the left is the third result, which is y=2^x in the window -.01<=x<=.01, .99<=y<=1.01. Again, this curve is locally linear, and the window is sufficiently small so that the piece of graph visible looks to me like a segment of a straight line. The slope of this line is about .69. (A computed value of (2^.05-2^0)/(.05-0) is about .705.)
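The slope estimates for these zoomed graphs can be reproduced with a few lines of Python. This is a sketch; the helper name `slope_near_zero` and the step sizes are my assumptions. The comparison values ln(3) and ln(2) come from `math.log`, and they are where the 1.1 and .69 come from.

```python
import math

def slope_near_zero(b, h):
    """Slope of the line through (0, 1) and (h, b**h) on the curve y = b**x."""
    return (b ** h - 1) / h

for b in (2, 3):
    for h in (0.05, 0.001, 0.00001):
        print(b, h, round(slope_near_zero(b, h), 5))
    print("compare with ln(b):", round(math.log(b), 5))
```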
Exponential functions are very commonly used to model such phenomena
as radioactive decay and bacterial growth and lots of other
things. The local linear approximations are things we will work
with. I don't find 1.1 ("slope" of 3^x near (0,1)) and .69
("slope" of 2^x near (0,1)) particularly nice numbers to
work with. I would rather work with 1. We
can adjust the local slope of b^x by varying b. There is a
unique number b (between 2 and 3) so that the local slope of
b^x at (0,1) is 1.
This number is called e. To the right is a graph of 2^x in blue and 3^x in green and e^x in red. The graph does show
e^x "sandwiched" between the other two exponentials. It
doesn't clearly show that e^x has the desired local slope of
1, but this is actually true. We will later see how to compute e with
as much accuracy as desired. e^x is widely known as
the exponential function, and frequently referred to as exp(x).
e is approximately 2.71828, and I sort of doubt that more digits will be needed by almost anyone in the class for practical computation. e is not rational, and its decimal expansion does not repeat. If you wish, here are about the first 2 million decimal places of e. People have computed billions of decimal places of both e and Pi. What else is there to do?
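One hedged way to see that the special base lies between 2 and 3 is to search for it: approximate the local slope of b^x at (0,1) by a difference quotient with a tiny h, then bisect on b until that slope is 1. The step h=1e-6 and the 40 bisection rounds are my assumptions, and this is only an illustration, not how e is computed in practice.

```python
def slope_near_zero(b, h=1e-6):
    """Approximate local slope of y = b**x at (0, 1) using a small step h."""
    return (b ** h - 1) / h

lo, hi = 2.0, 3.0  # the slope is below 1 at b=2 and above 1 at b=3
for _ in range(40):
    mid = (lo + hi) / 2
    if slope_near_zero(mid) < 1:
        lo = mid   # slope too small: the special base is larger
    else:
        hi = mid   # slope too large: the special base is smaller

b_estimate = (lo + hi) / 2
print(b_estimate)
```

The estimate agrees with 2.71828 to several decimal places; the small remaining error comes from using a difference quotient instead of a true limit.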
Graphing 5e^(-2x)+3e^(4x)
I asked people what the graph of f(x)=5e^(-2x)+3e^(4x)
would look like. I admire "technology" (that is in this case, graphing
devices) but to some extent one should learn what to expect. So here
is sort of a thinking process, a way we go through the graphing
step-by-step. The pictures in the process won't be totally accurate,
but I will try to draw them as I see them. I hope this helps.
Start with y=e^x. We know what this looks like. The
domain is all reals and the range is all positives. The
point (0,1) is on the curve. The curve goes down to 0 as x-->-infinity
and goes up to "+infinity" as x-->+infinity.
| |
Now modify to y=e^(4x). Multiplying the x by 4 inside
the exponential function makes the function increase faster as x gets
large positive, and also makes it shrink toward 0 faster as x gets large
negative.
Another way to think of this is by the formula e^(4x)=(e^x)^4. The fourth power means that numbers bigger than 1 get bigger (the outputs of exp to the right of 0) and numbers between 0 and 1 get smaller (the outputs of exp to the left of 0). So a point whose second coordinate is 2, say, gets changed to one whose second coordinate is 2^4=16, and one whose second coordinate is 7 gets changed to a point with second coordinate 7^4=2401. The point (0,1) has height 1, and since 1^4=1, (0,1) is still on this curve.
Or if you wish, think this way: e^(4x)=(e^4)^x.
Since e^4 is about 54.598, this curve is like
y=54.598^x, and the exp curve is about y=2.718^x.
Notice that 54.598^0=1, so (0,1) is on the curve.
| |
And change it to y=3e^(4x). This stretches stuff by a factor
of 3 in the output. That means that vertical distances get multiplied
by 3. A point with second coordinate 2 gets changed to one with second
coordinate 6, and a point with second coordinate 7 gets changed to one
with second coordinate 21. Please note that this is very
different from the previous change, which was not a uniform change of
length. The point (0,1) becomes (0,3).
| |
Now let's try the first piece of the function. What is e^(-x)?
Here the minus sign inside reverses x, so the graph gets flipped over
the y axis. The domain is still all reals, the range is still all
positive numbers, but the function is decreasing. (0,1) is on the curve.
| |
Here we have e^(-2x). As you travel left on this curve, the
curve moves up more. As you travel right, the curve gets closer to the
x-axis.
If you wish, think of e^(-2x)=(e^(-x))^2. Squaring makes numbers bigger than 1 bigger, and makes numbers between 0 and 1 smaller. (0,1) is on the curve since 1^2=1.
Or, you could think this way: e^(-2x)=(e^(-2))^x, and this is about (.135)^x, so
certainly things shrink as x increases. And since .135 is smaller than
1 (while e, about 2.718, is bigger than 1) you get the behavior shown. (0,1) is on
the curve since .135^0=1.
| |
We finish preparing this piece by looking at the graph of
y=5e^(-2x). This is a constant factor stretch of 5 in the vertical
direction of the plane, and the result is shown. The point (0,1) has
become (0,5).
| |
Now we take both of the previous red curves and add them up. This picture is intentionally drawn a bit screwy. The constants involved (5, -2, 3, and 4) were also intentionally chosen to have no simple relationship or obvious symmetry. The result, the adding of the two curves, shouldn't be symmetric. I just wanted to get a rough idea of what was going on. This will allow me to check what a machine does. (Machines can make mistakes, and so can human beings, as the human beings use machines.) |
Here is a Maple graph of f(x)=5e^(-2x)+3e^(4x). Please consider the ridiculous difference in the vertical and horizontal scales. The total width of the bottom is 6 units, and the height is a bit more than 2,000 units. The exponential function gets really big really quickly.
I also
asked people where the coordinates of the "bottom" of the curve
were. Several people responded, who were quick with their
calculators. If we plot the curve with a different range of x's, we
see what is to the right. Be careful again with the scale of the
vertical and horizontal axes. A bit more work shows that the
bottom is fairly near (-.03,7.97). We will learn how to compute the
coordinates of such points.
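Until the calculus tools arrive, the "bottom" can be located by brute force. This is a rough sketch that samples f(x)=5e^(-2x)+3e^(4x) on a fine grid; the interval [-1,1] and the step .0001 are my assumptions, chosen because the plots above show the low point near x=0.

```python
import math

def f(x):
    return 5 * math.exp(-2 * x) + 3 * math.exp(4 * x)

# Scan x in [-1, 1] in steps of 0.0001 and keep the lowest point found.
xs = [k / 10000 for k in range(-10000, 10001)]
x_min = min(xs, key=f)
y_min = f(x_min)
print(x_min, y_min)
```

A grid search only gives the bottom to about the grid spacing, but that is enough to confirm the calculator answer.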
Note The first graph I put here was wrong. I screwed up. Sorry. Human beings can make mistakes, and here I am an average representative of the species. Mr. Sloane spotted the error, and I thank him. |
Inverting exp to get ln
The exponential function has domain all real numbers and range all
positive numbers. It is increasing and 1-to-1. It has an inverse
called ln for "natural log". I may sometimes slip and write log
instead of ln. Almost the only logarithm function I use is ln. Since
the point (10,e^10) is on the graph of exp, we know that
(e^10,10) is on the graph of ln. Therefore
ln(e^10)=10. So, in words, ln(a number) is the number
of powers of e "in" the number.
The picture to the right has a part of exp in red and a part of
ln in blue. The dashed line in green is the "main diagonal" y=x. The
graphs of the curves are obtained by flipping each over the main
diagonal.
The exp graph increases very rapidly and the ln graph increases very slowly. Again, we will discuss this later in detail.
Logarithm "laws"
Some other logs are used in practice (base 10 for pH and decibels
[sound intensity] and Richter scale [earthquakes] and base 2 for some
computer science/engineering applications). But in this course almost
surely we'll only refer to ln. The algebraic equations which we
constructed (?) for exponentiation have counterparts for ln:
How many digits does 2^40 have? As one amusing exercise, I mentioned that ln(2) is about .7 and ln(10) is about 2.3. I asked how to use these values to see how many decimal digits the number 2^40 has.
Here is how this problem can be done. If 10^x=2^40, we can "ln" both sides. Now ln(10^x)=x ln(10) and ln(2^40)=40 ln(2). Therefore x is 40 times ln(2)/ln(10). This is about 40 times .7/(2.3), so a bit more than 12. That puts 2^40 between 10^12 and 10^13, and any such number has 13 decimal digits. Indeed, 2^40 is exactly 1099511627776. Better: it is 109 95116 27776 (I put spaces between groups of 5 digits). So there are 13 decimal digits in 2^40.
I don't think this is a vital computation, but I do think that applied scientists and engineers should be able to get approximate sizes of reasonable numbers without exact computation.
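The digit count can be checked both ways in Python, as a small sketch: once through logarithms and once by writing the number out exactly. The variable names are mine.

```python
import math

x = 40 * math.log(2) / math.log(10)   # solves 10**x == 2**40
estimated_digits = math.floor(x) + 1  # a number between 10**12 and 10**13 has 13 digits
exact_digits = len(str(2 ** 40))      # count the digits of the exact integer

print(x, estimated_digits, exact_digits)
```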
QotD: solve exactly -10e^(-2x)+12e^(4x)=0
Here I expected something like this:
12e^(4x)=10e^(-2x)
e^(4x)e^(2x)=(10/12)e^(-2x)e^(2x)
because I want to use 1/e^(-2x)=e^(2x)
e^(6x)=10/12 since e^a e^b=e^(a+b) and e^0=1.
6x=ln(10/12) since ln is the inverse of exp.
x=(1/6)ln(10/12).
All of these answers are also correct:
ln({10/12}^(1/6))
-(1/6)ln(12/10)
(ln(10)-ln(12))/6
and others (!) because they are all different names for the same
number (properties of logs!).
As I mentioned to students who worked on this, I would like you to make mistakes while doing the QotD, and get rid of them now, rather than on exams, when such errors will be more painful and costly. By the way, the approximate numerical value of the answer is -.03 which is not a coincidence. (1/6)ln(10/12) turns out to be the first coordinate of the "bottom" of the curve sketched earlier in the lecture. Stick with the course and I will explain how this all works.
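A short sketch can confirm that the different-looking answers to the QotD are the same number and that each one really solves the equation. The function name `lhs` is an illustrative choice.

```python
import math

def lhs(x):
    """Left side of the QotD equation -10e^(-2x) + 12e^(4x) = 0."""
    return -10 * math.exp(-2 * x) + 12 * math.exp(4 * x)

# Four names for the same number.
answers = [
    math.log(10 / 12) / 6,
    math.log((10 / 12) ** (1 / 6)),
    -math.log(12 / 10) / 6,
    (math.log(10) - math.log(12)) / 6,
]
for a in answers:
    print(a, lhs(a))
```

Each printed value of `lhs` is 0 up to floating point rounding, and each answer is about -.03.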
Friday, September 7 | (Lecture #2) |
---|
I gave the information in the table to the right. The table declares values
for certain functions (A, B, and C) at certain inputs (0, 1, and
2). For example, C(2)=4. I had two volunteers, Mr. Ferdinand and Ms. Kravitz who computed the following: C(B(1))=C(0)=-2 and B(C(1))=B(2)=5. These results are not equal. C(1)B(1)=2·0=0 and B(1)C(1)=0·2=0. These results are equal. If f(x)=B(x^2+x), then f(0)=B(0^2+0)=B(0)=-1 and f(1)=B(1^2+1)=B(2)=5. Here I wanted to stress that tabular information (the collection of "data points") was enough information to do these problems. Notice, please, that while multiplication is commutative, composition of functions is not. Composition is also written with a little circle. So B(C(1)) can be written BoC(1).
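Tabular functions translate directly into lookup tables. The sketch below uses Python dictionaries holding only the table entries quoted in the paragraph above; the values of A are not stated in the text, so A is left out.

```python
# Only the entries quoted in the text; A's values are not given there.
B = {0: -1, 1: 0, 2: 5}
C = {0: -2, 1: 2, 2: 4}

c_of_b = C[B[1]]   # C(B(1)) = C(0)
b_of_c = B[C[1]]   # B(C(1)) = B(2)

def f(x):
    """f(x) = B(x^2 + x), evaluated by table lookup."""
    return B[x ** 2 + x]

print(c_of_b, b_of_c)        # composition depends on the order
print(C[1] * B[1], B[1] * C[1])  # multiplication does not
print(f(0), f(1))
```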
|
The second pre-class problem
Again, some happy volunteers, Mr. N. Patel and Ms. Rana, worked on this. The function f was
defined by the formula f(x)=sqrt(x)+sqrt(3-2x). I first asked what the
domain of this function was. Since sqrt(x) occurs, I know that
x>=0. But sqrt(3-2x) is also present, so 3-2x>=0, which is
equivalent to 3>=2x which is equivalent to 3/2>=x. Both of the
restrictions must be satisfied, so that the domain is the closed
interval [0,3/2] (both endpoints are included).
I also asked that a graph of the function be given. Here I
expected and wanted people to use a graphing calculator!
I don't know the exact range of this function. We'll come back to this
question later.
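In place of a graphing calculator, a rough numerical look at this function is possible in Python. This is a sketch under assumptions: the grid size `n` is arbitrary, and sampling can only suggest the range, not prove it.

```python
import math

def f(x):
    # f(x) = sqrt(x) + sqrt(3 - 2x), defined only on the domain [0, 3/2]
    return math.sqrt(x) + math.sqrt(3 - 2 * x)

# Sample the closed interval [0, 3/2] on an evenly spaced grid.
n = 15000
xs = [1.5 * k / n for k in range(n + 1)]
ys = [f(x) for x in xs]

lowest = min(ys)
highest = max(ys)
print("smallest sampled value:", lowest, "at x =", xs[ys.index(lowest)])
print("largest sampled value:", highest, "at x =", xs[ys.index(highest)])
```

Inputs outside [0, 3/2] make `math.sqrt` raise a `ValueError`, which matches the domain restriction found above.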
Why?
I discussed these pre-class problems because the answers to the first
QotD were not correct. People gave graphs which were not graphs of a
function, and did not seem familiar with the idea of composing
functions. I hope these examples help.
The hierarchy of trig
There are several ways of looking at the trig functions. I tried to
give a rapid review of the different conceptual levels and uses.
Level 0: measurement of sides of triangles
Several thousand years ago some smart people noticed that trigonometry
can be used to make indirect measurements. For example, if you had a
very tall tree whose height might be difficult to get directly, then
you could more easily measure the shadow and then estimate the angle
from the tip of the shadow to the tree top.
The idea is to notice that two right triangles which have one acute angle equal then have their other acute angle equal, and so they are similar. But ratios of corresponding sides of similar triangles are equal. So in the second diagram, we know that ?/!=??/!!. It may be that ?, for example, is difficult to measure directly. If we know the ratio ??/!! for the angle shown, and if we can measure ! directly, then the similarity equation will allow us to get ?. This is actually a very clever idea. People who used this idea actually collected data about the ratio of sizes of right triangles (they began to build a "knowledge base").
Ratios and special numbers
There are 3 sides to a right triangle. If you choose an acute angle,
the standard English names for the sides are Adjacent (ADJ) and
Opposite (OPP) and Hypotenuse (HYP). These are shown to the
right. Then there are 6 possible ratios to study. I'll refer almost
always to three of these.
sine of the angle=OPP/HYP;
cosine of the angle=ADJ/HYP;
tangent of the angle=OPP/ADJ.
I'll rarely mention or use the other three ratios (secant, cosecant, and
cotangent).
Fairly early in history some special right triangles were "discovered" and used frequently.
The triangle with sides labeled | Sine of the acute angle | Cosine of the acute angle | Tangent of the acute angle |
---|---|---|---|
[diagram not shown] | 1/2 | sqrt(3)/2 | 1/sqrt(3) |
[diagram not shown] | 1/sqrt(2) | 1/sqrt(2) | 1 |
[diagram not shown] | sqrt(3)/2 | 1/2 | sqrt(3) |
Level 1: triangles in a circle
Consider the unit circle. In Math 151, "the unit circle" will
mean a circle of radius 1 centered at (0,0). Take a point (x,y) on the
circle. In the drawing to the right, the point is somewhere in the
first quadrant. Draw a right triangle with one corner at the origin
and one corner at (x,y), and the right angle sitting on the
x-axis. Since the hypotenuse has length 1, the ratios defining the
sine and cosine of the acute angle with vertex at (0,0) get simpler:
the bottoms are both 1. The legs of the triangle are x and y long, so
that the x and y coordinates are themselves equal to cosine and sine
of that angle. The text and most other sources usually will call the
acute angle at the origin θ, and I will try to do that also. Then
the coordinates of the point on the unit circle are cos(θ) and
sin(θ). In fact, we could define those coordinates to be
cos(θ) and sin(θ), and then we'd be able to get more angles
and more values of sine and cosine. For example, if θ=0, then the
corresponding point on the unit circle is at (1,0), so that cos(0)
should be 1 and sin(0) should be 0. And if θ=Pi/2 (a right angle),
the point would be "at" (0,1), and then cos(Pi/2)=0 and
sin(Pi/2)=1. Certainly tan(0)=0 and tan(Pi/2) is not defined.
Level 2: a moving point
Very few folks in this course are likely to use trig often to find
heights of trees. But the functions of trigonometry will be used to
model periodic phenomena (body temperatures varying during the day,
seasonal changes of weather, electrical and magnetic fields,
etc.). These applications need a slightly different view. Consider a
particle moving around the unit circle. The slightly obnoxious picture
to the right is supposed to show such motion. A particle "starts"
moving from (1,0) at time 0, and moves with unit speed around
the circle in the positive (always, counterclockwise) direction. The
coordinates of the particle dynamically change. You can almost read
them off from the particle motion. The first, x-coordinate, is the
left/right position of the point. It goes from 1 to 0 to -1 to 0 to
... This is cosine of the time it takes the particle to get to that
position. The other coordinate, in one circular movement, starts at 0,
goes up to 1, down to 0, then down to -1, and back up to 0. In fact,
if we really think of the particle as moving for all time, the
positions constantly cycle back and forth. The simplest way to
measure angles, if we are going to consistently use this
dynamic view, is with the time label. Of course, this measurement is
what is called radian measure. The preceding table then can be
augmented as follows:
Angle measurement | The triangle with sides labeled | Sine of the angle | Cosine of the angle | Tangent of the angle |
---|---|---|---|---|
0, and 0° | | 0 | 1 | 0 |
Pi/6, and 30° | [diagram not shown] | 1/2 | sqrt(3)/2 | 1/sqrt(3) |
Pi/4, and 45° | [diagram not shown] | 1/sqrt(2) | 1/sqrt(2) | 1 |
Pi/3, and 60° | [diagram not shown] | sqrt(3)/2 | 1/2 | sqrt(3) |
Pi/2, and 90° | | 1 | 0 | Not defined |
It is not true that Babylonians had 360 fingers. Not everything you hear in class is totally correct.
Dynamical view.
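The special values in the table can be compared with what a computer reports. The sketch below uses Python's standard `math` module, which works in radians; the names `special_angles` and `trig_row` are invented for this illustration.

```python
import math

# Angles from the table, in radians (the "time label" on the unit circle).
special_angles = {
    "0": 0.0,
    "Pi/6": math.pi / 6,
    "Pi/4": math.pi / 4,
    "Pi/3": math.pi / 3,
}

def trig_row(theta):
    """Return (degrees, sine, cosine, tangent) for an angle in radians."""
    return (math.degrees(theta), math.sin(theta), math.cos(theta), math.tan(theta))

for name, theta in special_angles.items():
    degrees, s, c, t = trig_row(theta)
    print(f"{name:5s} {degrees:5.1f} deg  sin={s:.4f}  cos={c:.4f}  tan={t:.4f}")
```

The angle Pi/2 is left out on purpose. In floating point, `math.pi / 2` is only an approximation to Pi/2, so the computed cosine is tiny but not exactly 0 and the computed tangent is a very large number instead of "Not defined".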
Graphs of trig functions
Now we can look at the graphs of the trig functions. Sine and
cosine are geometrically the same, only starting at different "times"
(after all, the circle is round and looks the same left/right as it
does up/down).
The graph of sine | |
---|---|
To the right is part of the graph of y=sin(x). The curve is what
happens to the second coordinate of the imaginary particle.
The sine function has domain all real numbers. The range is just the closed unit interval, [-1,1]. Since the imaginary particle described above travels around the circle forever, the sine function is periodic, with period equal to 2Pi. That is, for all numbers x, sin(x+2Pi)=sin(x). | |
The graph of cosine | |
To the right is part of the graph of y=cos(x). The curve is what
happens to the first coordinate of the imaginary particle.
The sine and cosine curves are congruent -- you can move one exactly onto the other. Sine begins at height 0 and goes up to 1. Cosine begins at height 1 and goes down to 0. The cosine function has domain all real numbers. The range is just the closed unit interval, [-1,1]. Since the imaginary particle described above travels around the circle forever, the cosine function is periodic, with period equal to 2Pi. That is, for all numbers x, cos(x+2Pi)=cos(x). | |
The graph of tangent | |
To the right is part of the graph of y=tan(x). It is very different
from the graphs of sine and cosine. In one way, it is simpler. It
consists of repeated "chunks" of horizontal width Pi, and each chunk
when considered by itself just goes up: it is increasing. But the
graph also has breaks (where the cosine function is 0).
The function has domain all real numbers except odd multiples of Pi/2 (such as -3Pi/2, -Pi/2, Pi/2, 3Pi/2, etc.). The range is all real numbers. Amusingly, tangent is periodic but its minimal (smallest) period is Pi, so that tan(x+Pi)=tan(x) (if x is in the domain of tangent). |
It is neat to notice that most people (include me here!) usually draw sine and cosine incorrectly. One total wiggle of each of them occurs in an interval of length 2Pi, about 6.28 or six and a quarter. But the up/down is -1 to +1. So the darn curves aren't really as bumpy as many people think.
Addition formulas
The only one I could remember is
$\sin(\theta_1+\theta_2)=\sin(\theta_1)\cos(\theta_2)+\cos(\theta_1)\sin(\theta_2)$
and this was embarrassing. You should look at the cosine formula,
which is about the same order of complexity, and try not to look at
the tangent formula, because it is slightly ferocious.
Amplitude and frequency
Let me show some more graphs. I should have drawn more in class, because this comes up in many applications.
Comment
This, along with other reasons, is why I am not an engineer. Radios I'd design would pop and hiss, power lines would shake themselves to fragments, etc. |
Functions and their inverses
Consider the function $f(x)=x^3+7$. A graph of this function
is shown to the right. I hope you recognize this as $x^3$
moved up 7 units. Certainly $f(2)=2^3+7=15$: the output
corresponding to the input 2 is 15. Since f is a function, 15 is the
unique, the only output corresponding to the input 2. But the function
f has an interesting property. Suppose f(x)=15. This means
$x^3+7=15$ so $x^3=15-7=8$. And $x=8^{1/3}=2$. The
only input corresponding to the output 15 is the input 2. The
geometric interpretation of this is gotten by considering the
horizontal line y=15. This line intersects the curve $y=x^3+7$
at exactly one point, (2,15). In fact something more general is
happening.
The specific function f considered here is one-to-one (1-1). This means there is exactly one distinct input for each output. How can we verify this? If you believe the graph, the fact means that any horizontal line will intersect the curve at most one time. Algebraically, we can do this: $y=x^3+7$ implies that $x=(y-7)^{1/3}$. If y is the output, we have just "constructed" the input, x, corresponding to the output, y.
One-to-one
If we have a 1-1 function, we can think about reversing the process, and the reversed process is the inverse function to the original function. Right now the idea is what I'd like to make sure of. There are certainly implementational issues to worry about, and these aren't easy. But I think the idea should precede computations in this case. The ludicrous diagram to the right is supposed to make clear (?) what's going on. If there is exactly one output value corresponding to each input value, then we could think about reversing this association, and creating an inverse function.
In the case of $f(x)=x^3+7$ I know that f is 1-1 because I can solve f(x)=y and get exactly one value of x for each y. The function $g(y)=(y-7)^{1/3}$ is a function defined by the algebraic formula I got while solving. It is a formula for the function inverse to f.
If the point (x,y) which is (f's input, f's output) is on the graph of f, then (y,x) will be on the graph of the inverse to f. The switch (x,y) to (y,x) is geometrically done with a flip across the "main diagonal", y=x.
Part of the graph of the function inverse to this $f(x)=x^3+7$ is shown to the right. It is the cubic curve flipped over the main diagonal.
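The "undoing" can be tried out numerically. In the Python sketch below, `f` and `g` follow the formulas in the text. One implementation detail is an assumption of this sketch and not part of the lecture: Python's `**` operator does not return a real cube root for a negative base, so the cube root is built from `abs` and `math.copysign`.

```python
import math

def f(x):
    # f(x) = x^3 + 7
    return x**3 + 7

def g(y):
    # g(y) = (y - 7)^(1/3), the real cube root, valid for negative y - 7 too
    t = y - 7
    return math.copysign(abs(t) ** (1 / 3), t)

# g undoes f, and f undoes g (up to rounding error).
for x in [-3.0, -1.0, 0.0, 2.0, 5.5]:
    print(x, f(x), g(f(x)))

# The flip across the main diagonal: (x, y) on the graph of f
# becomes (y, x) on the graph of g.
point_on_f = (2, f(2))
point_on_g = (point_on_f[1], point_on_f[0])
print(point_on_f, point_on_g)
```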
Another example, not so easy
Piecewise linear functions are used in many computational
applications. They look rather simple, but because they can wiggle a
lot and they can be computed easily, they are very useful. I first
"defined" the function with its graph. The function had domain
(-∞,3]. Let's start at the top, at 3. The graph was a straight
line segment connecting (3,2) to (1,-2). Then there was a line segment
connecting (1,-2) to (0,1). Finally, the graph went left as a
half-line, starting at (0,1), passing through (-1,0) and just
traveling on left.
Some people feel uneasy about pictures, so I requested that the class provide the relevant formulas. Only a few moments later (with the instructor roaming the class and urging on student efforts, like a wild slug), we got:
$$f(x)=\begin{cases} x+1 & \text{for } x\le 0 \\ -3x+1 & \text{for } 0<x<1 \\ 2x-4 & \text{for } 1\le x\le 3 \end{cases}$$
I hope that's correct. I actually don't remember the function we used in class but I think it was something like this. As several people observed, there were alternate specifications for this function. We could have put the points x=0 and x=1 on either formula. It wouldn't matter.
This function is not 1-1. For example, f(-1)=0 and f(1/3)=0 (I solved -3x+1=0) and f(2)=0 (I solved 2x-4=0). Geometrically the horizontal line y=0 (the x-axis) hits the "curve" three distinct times.
Therefore if we flipped the graph over the main diagonal, we would not get the graph of a function. Well, what can we do? If we insist on trying to undo (?) this function f, we could somehow break the function into pieces (as indicated by the colors on the flipped graph).
The left-most piece of f
Consider f with domain restricted to x<=0. There f(x)=x+1, and, if y=x+1, then x=y-1. The function g1(y)=y-1 is inverse to f.
Please notice how the domain and range are interchanged for the function f (suitably restricted) and its inverse, g1. That is the way it should be, since the inputs and outputs are being interchanged.
The middle piece of f
Again notice how the domain and range are interchanged for the
function f (suitably restricted) and its inverse, g2. That
is the way it should be, since the inputs and outputs are being
interchanged.
The right-most piece of f
And in this final piece also, the domain and range are interchanged
for the function f (suitably restricted) and its inverse,
g3: inputs and outputs are being interchanged.
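The three pieces and their inverses can be sketched in Python. Some of this is an assumption of the sketch: the formulas for `g2` and `g3` are not written out in these notes and are obtained here by solving y=-3x+1 and y=2x-4 for x, and the right endpoint 3 is taken from the description of the graph (the segment ending at (3,2)).

```python
def f(x):
    # Piecewise linear function described by the graph above.
    if x <= 0:
        return x + 1
    if x < 1:
        return -3 * x + 1
    if x <= 3:
        return 2 * x - 4
    raise ValueError("x is outside the domain (-infinity, 3]")

def g1(y):
    # Inverse of the left-most piece (x <= 0); inputs y <= 1.
    return y - 1

def g2(y):
    # Inverse of the middle piece (0 < x < 1); inputs -2 < y < 1.
    return (1 - y) / 3

def g3(y):
    # Inverse of the right-most piece (1 <= x <= 3); inputs -2 <= y <= 2.
    return (y + 4) / 2

# f is not 1-1: three different inputs give the output 0.
print(f(-1), f(1 / 3), f(2))

# Each restricted piece is undone by its own inverse.
print(g1(f(-4)), g2(f(0.5)), g3(f(2.5)))

# The three inverses send the single output 0 back to three different inputs.
print(g1(0), g2(0), g3(0))
```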
Squaring
I believe you could, with some seriousness, declare that the previous
rather elaborate example was silly. It was technical, and overly
caring about "logic". I would like to argue against this. My argument
would consist first of discussion of a rather obvious function.
Consider squaring: $f(x)=x^2$. We know about this function. It is perhaps the most obvious (once the negative numbers are learned) example of a function which is generally (except for 0) not 1-1. Look: $f(2)=2^2=4$ and $f(-2)=(-2)^2=4$ also. Take the graph, a parabola whose vertex is at (0,0), and flip it over the main diagonal. The result is a curve which is not the graph of a function. "Everyone" knows this. Yet at the same time everyone would like to indicate square roots, and would like to tell that the square root of 4 is ... well, it could be -2 or +2. If a unique output is desired, the unique output which people have chosen is the non-negative answer (a classical word for this was "branch", really, apparently because the flipped over curve might look like it has several slanting tree things). So sqrt(x) (really, better written with that strange almost-division sign) means the non-negative number whose square is x. So there has been a choice, in terms of the graph shown to the right. "Everyone" has decided to take the top, green, piece of the graph, and if ever the bottom part has to be used, it will be designated -sqrt(x).
Choosing an inverse
In this case the choice of a part of the function's domain and the
inverse has been done in our culture years and years ago, and I think
it is likely we have all grown up with it so nicely that we haven't
thought about it!
The trig functions and their inverses
The six trig functions are all periodic. Therefore the difficulties of
squaring actually become much more obvious. If sin(θ)=1/2, well, then, actually, θ could be Pi/6 or 5Pi/6 or Pi/6-2Pi
or 5Pi/6-2Pi or Pi/6+2Pi or 5Pi/6+2Pi or ... actually infinitely many
other possibilities. Yet if you are interested in numbers,
solving equations like sin(θ)=some number for θ
might be important (hey, I mentioned in class and meant the following:
construct a robot arm, and "watch" its motion -- you'd better be able
to understand how the values of the input and output of sine and
cosine work!).
arcsine
People have agreed to "restrict" the domain of sine to [-Pi/2,Pi/2]. Then the function is 1-1, changing the inputs [-Pi/2,Pi/2] to outputs [-1,1] (sine is increasing in that domain, and we will develop techniques to verify such claims easily). So there is an inverse function. The text calls it $\sin^{-1}$ and I will try not to use that notation since a -1 in the exponent makes me think of reciprocal. The notation I will use is arcsin. Also I should mention that most of the computer languages I know use the designation arcsin to mean what I am describing here (since superscripts are difficult to type!). The graph to the right is arcsin. The endpoints are (-1,-Pi/2) and (1,Pi/2). I labeled these wrong in class and I am sorry. Everyone agrees that the domain of arcsin is [-1,1] and the range is [-Pi/2,Pi/2]. |
arctan
Tangent, from the inverse function point of view, is in some silly
sense twice as bad as sine! This is
because sine is 2Pi periodic and tangent is Pi periodic. The tangent
domain restriction that people use almost always (99.999% of the
time!) is (-Pi/2,Pi/2). The open interval results from the fact that
tangent is not defined at odd multiples of Pi/2 (hey, cosine is 0
there). So flip the graph of tangent. The result is arctan (or, in the
text, $\tan^{-1}$).
The domain of arctan is all real numbers. The graph is increasing, and when x gets very large positive, then arctan(x) gets close to Pi/2. On the other side, when x gets very large negative, then arctan(x) gets close to -Pi/2. There are two (horizontal) asymptotes, y=Pi/2 and y=-Pi/2. As I mentioned, arctan is used to take big positive and negative numbers and output numbers which are more controlled, between -Pi/2 and Pi/2. arctan has been more used by me than tan. It is a neat function.
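The effect of these domain restrictions can be seen in a short Python sketch. Python's `math` module happens to name these functions `asin` and `atan`; the variable names below are invented for this illustration.

```python
import math

# sin(theta) = 1/2 has many solutions; 5Pi/6 is one of them.
theta = 5 * math.pi / 6
print(math.sin(theta))

# arcsin returns the one solution in [-Pi/2, Pi/2], which is Pi/6, not 5Pi/6.
recovered = math.asin(math.sin(theta))
print(recovered, math.pi / 6)

# arctan squeezes very large inputs into the open interval (-Pi/2, Pi/2).
big_inputs = [-1e9, -1000.0, 0.0, 1000.0, 1e9]
squeezed = [math.atan(x) for x in big_inputs]
print(squeezed)
print(math.pi / 2)
```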
QotD
To the right is a graph of a function, f. It is piecewise linear, and
has period 5. Values of f can be read from the graph. For example,
f(0)=1 because (0,1) is on the graph, and f(3)=0 because (3,0) is on
the graph.
So:
Suppose g(x)=3f(2x)+4.
Compute g(0), g(1), and g(2). Graph g(x). What is the range of g?
g(0)=3f(2·0)+4=3f(0)+4=3·1+4=7.
g(1)=3f(2·1)+4=3f(2)+4=3·1+4=7.
g(2)=3f(2·2)+4=3f(4)+4=3·0+4=4.
We can do the graph of g in "stages".
This graph: period 2.5 and range [0,1]. | |
This graph: period 2.5 and range [0,3]. | |
This graph: period 2.5 and range [4,7]. Please note that the three values found for g are verified on this graph. |
Tuesday, September 4 | (Lecture #1) |
---|
Introduction
What's calculus about? The simple answer, which we will use as initial
motivation, is to learn how to find and compute lines tangent to
curves and areas of regions with curved boundaries. These problems are
easy to state and most people like pictures. But truthfully very few
of you will have any reason to compute tangent lines and areas after
calculus courses. So here's the goal of the course:
We want to teach you how to model, analyze, and, to the extent possible, solve problems involving all sorts of rates of change and accumulation.
It turns out that modeling such problems is an extremely important and useful skill. I write the word "model" to mean an activity which could be rather sophisticated.
• For example, stretch a rubber band. Hooke's Law, which
describes some of what happens, declares that the amount of stretching
is directly proportional to the force. Well, this means if one ounce
stretches the rubber band, say, 1/3 of an inch, then two
ounces should stretch the band 2/3 of an inch. That's
easy. But I don't think that putting 10 tons on the rubber band would
make it many miles long. The simple model has restricted validity.
• Later in the course I will show you a picture of the
solubility of sodium sulfate in water as the temperature varies. The
result is quite (unexpectedly to me!) complicated. What model will
describe this?
• Disease spread (for example, influenza) has been
extensively studied and is quite important (how much vaccine should be
made, and what types, etc.). The models are fairly accurate, but
... sometimes are complicated.
• Enzyme-catalyzed chemical reactions can be fairly
accurately modeled using techniques of this course. We may see some
simple examples later.
A simple toy model
Equal-sized squares are cut from the corners of an 8 inch by 11 inch rectangular piece of paper. The flaps in the resulting piece of paper are folded up. What is the resulting volume which the paper object encloses?
I hope that the accompanying illustrations are helpful. I'll call the edge length x. Then the "solid" brick will have volume, V, which will be the product of the area of the base multiplied by the height. The height is x, and the base has sides with lengths 8-2x and 11-2x. So V=x(11-2x)(8-2x). This formula is "just" a polynomial of degree 3.
Although this is a rather simple problem, already some interesting features are available. Notice that x=-50 or x=200 makes no sense in this problem. Also notice that V (in units of cubic inches) could not be -11 or 206,358 (I think!).
New
Math 151 has a new textbook this semester. The text will be used
through the three semester sequence (151-152-251). It is cheaper and
slightly shorter than the previous text, and perhaps it offers better
exposition. Look out for errors, and please report them to
me. Although the book was extravagantly proofread, almost all first
editions have errors.
Is the course important?
This course teaches the basic "language" used in virtually all fields
of engineering, technology, and science. It is likely that the grade
in the first college calculus course is a very good indicator of
eventual success in such majors.
Who teaches what? How is student work assessed and how are grades determined? Please see here.
A rough comparison
In New Jersey, a high school calculus course has 50 minute periods,
nominally 180 student days: let's say 160. So the time in-class is
about 8,000 minutes.
Math 151 has two lectures each week. There are 14 weeks, and the
lectures are 80 minutes long. Two lectures are lost to exams. So 13
times 80 is about 1,040, then doubled is 2,080. I will add about half
the recitation time (more about this below): that's another
14·40=560 minutes. So 151 has 2,640 educational minutes. That's
about one-third of the high school regimen!
Therefore YOU will be your most
important teacher! Working in groups is good. Asking questions is
good. Taking advantage of the LRC and the MSLC is good.
Modes of instruction
Two meetings each week are likely to be classical lectures. There's
usually not much interaction in the lectures, and (1600<8,000!)
there is not
much time for interaction. The additional "recitation" meeting will
have more varied activities. Part of the time will be question/answer
about course material and textbook homework problems. Maybe there will
be a quiz. Much of the time will be spent doing workshops. Students will discuss somewhat
non-routine problems in groups. Students will be required to write up a detailed solution of a selected
problem and hand this in the next week. Why?
• Essentially all engineering students will participate in
a group project in one or more upper-level courses, and extended
written solutions will be required. Practice now, when the stakes are
lower.
• Some of the workshop problems will be intentionally (!)
ill-defined, and the methods needed may not be obvious. Real-world
problems frequently are not given with the methods of solution, and
sometimes what a solution is may not be clear!
• Many job environments have people working in teams. So
get used to it now, when the stakes are lower. And maybe use these
workshops to begin study sessions outside of class.
Let's really begin ...
Solutions to two textbook homework problems should be handed in
tomorrow (1.1: 18 and 68). Consider numbers:
I'll use this background color if I write things
which I don't have time to cover in class. Or I just forgot!
Some real numbers have two valid decimal expansions. This doesn't bother me, since, say, a house on a corner of a grid system of streets could also possibly have two valid addresses (if it is on the corner of Cedar Avenue and Third Street, the house could have the address 12 Cedar Avenue and also 42 Third Street). So, for example, the real number with decimal address 23.45999... (with the 9's repeating) also has the decimal address 23.46, a terminating expansion (there is a string of 0's which is usually not written). |
Geometry
The real numbers are usually thought of as corresponding to a specific
geometric object, the real line. I usually think of this line
as horizontal with 0 sitting "in the middle". 1 is to the right of
0. And this geometric picture brings up the idea of order. In addition
to the algebraic structure, there is order: a<b. To me this means
(in my picture of the line) that a is to the left of b. Negative
numbers are to the left of 0 and positive numbers, including all the
positive integers, are to the right of 0. Here are some interesting
aspects to note.
Distance including a discussion of | |
The distance between two points is a non-negative real number whose
size expresses how far apart the numbers are. This will be important
when we study approximation schemes. We'd like to know that the
approximation gets "close" to the correct answer, and the closeness
will be measured by the distance. Algebraically, if the points
correspond to the real numbers a and b, the distance between them is
|a-b| and this is the same as |b-a|, so that distance has some
symmetry. But I just used absolute value, and here is the
piecewise definition of absolute value:
|x|=x if x≥0 and |x|=-x if x<0
Therefore absolute value is always a non-negative number. The absolute value of a number is 0 only when the number itself is 0. And absolute value of a product is the product of the absolute values (this actually is not totally obvious, and needs a bit of thought, I believe). |
Intervals
Suppose we want to discover what numbers x are closer to 9 than a
distance of 4. Algebraically this requirement translates to
|x-9|<4. We can sort of "unroll" the inequality. The absolute value
will be less than 4 if the number itself is both less than 4 and
greater than -4. The two inequalities can be compactly written as
follows:
-4<x-9<4 which implies 5<x<13.
This is an interval and an interval which does not contain
either endpoint is called an open interval. The notation for
this interval is (5,13). Intervals which
contain both endpoints are called closed. An example of such
an interval is [-4,6], which means the numbers x satisfying
-4≤x≤6. There are also
half-open intervals, unbounded intervals (with notation using +∞ or
-∞), etc. Please see the textbook.
Warning! If you wanted to "solve" (better: understand!) the inequality |x-9|>4 you can't just "unroll" it to -4>x-9>4. This inequality has no solutions. There is no number which is simultaneously less than -4 and greater than 4. You can't write this so compactly and using such implications represents an invalid (wrong!) method of solution.
A valid method of solution would involve separately solving
the inequalities x-9>4 and x-9<-4, which give x>13 or x<5. |
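Both the "unrolling" of |x-9|<4 and the warning about |x-9|>4 can be tested on sample numbers. The Python sketch below is an illustration only: the function names and the sample grid are made up, and checking samples is evidence, not a proof.

```python
def within_4_of_9(x):
    # The original requirement |x - 9| < 4
    return abs(x - 9) < 4

def in_open_interval(x):
    # The unrolled form -4 < x - 9 < 4, that is, 5 < x < 13
    return 5 < x < 13

def farther_than_4_from_9(x):
    # The reversed requirement |x - 9| > 4
    return abs(x - 9) > 4

def on_one_of_two_rays(x):
    # Two separate inequalities: x - 9 < -4 or x - 9 > 4
    return x < 5 or x > 13

def invalid_unrolling(x):
    # The wrong "compact" form -4 > x - 9 > 4, which no number satisfies
    return -4 > x - 9 > 4

# Sample numbers from -10 to 30 in steps of 0.1.
samples = [k / 10 for k in range(-100, 301)]

print(all(within_4_of_9(x) == in_open_interval(x) for x in samples))
print(all(farther_than_4_from_9(x) == on_one_of_two_rays(x) for x in samples))
print(any(invalid_unrolling(x) for x in samples))
```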
The plane
The conventional way to describe the plane algebraically is to drop
down two lines perpendicular to each other: coordinate axes. A point
in the plane will then be described by an ordered pair of real
numbers. The first coordinate will usually be called the x-coordinate
and the second, the y-coordinate. This pair describes ordered
distances from the horizontal (the x-axis) and vertical (the y-axis)
lines. Please see the text for more about this.
The embarrassment of all this, especially with "new" students, is that (3,8) could describe both a point in $R^2$, the plane, and could also describe an open interval (with [missing!] endpoints 3 and 8). The context is supposed to help, but still the notational confusion is possible, and this is lousy.
A (non-vertical) line and its algebraic description
Suppose we wanted an algebraic description of points with coordinates
(x,y) which lie on the straight line which goes through (4,3) and
(8,13). If (x,y) is such a point, then look at the picture: two right
triangles indicated are similar, so the corresponding sides have the
same ratios:
$$\frac{13-y}{8-x}=\frac{13-3}{8-4}$$
and the point (x,y) is on the line exactly when y-13=([13-3]/[8-4])(x-8). The quantity ([13-3]/[8-4]) is called the slope, and multiplies changes in x to give changes in y. It is frequently designated with the letter m.
Distance in the plane, $R^2$
Look, please, at the diagram to the right. In the plane, points correspond to ordered pairs of numbers. So a point p might correspond to an ordered pair, $(x_1,y_1)$, and q might correspond to $(x_2,y_2)$. Then the point $(x_1,y_2)$ is the vertex of a right triangle whose hypotenuse is a line segment connecting p and q. One leg of the right triangle is on a line where all the first coordinates are $x_1$, and the length of that leg is given by the one dimensional formula, $|y_1-y_2|$. The other leg is on the line where all the second coordinates are $y_2$, and the length of that leg is $|x_1-x_2|$. Then by Pythagoras, the hypotenuse has length $\sqrt{|x_1-x_2|^2+|y_1-y_2|^2}$. And usually the absolute values are discarded since we are squaring the quantities. Therefore we officially define: $\mathrm{dist}(p,q)=\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$ if p has coordinates $(x_1,y_1)$ and q has coordinates $(x_2,y_2)$. |
Example
The distance between (3,-2) and (6,4) is
$\sqrt{[3-6]^2+[4-(-2)]^2}=\sqrt{3^2+6^2}=\sqrt{45}=3\sqrt{5}$.
A circle and its algebraic description
Suppose we wanted to describe algebraically the collection of all points which have distance sqrt(13) from (6,4). This is, of course, a circle of radius sqrt(13) and center (6,4). Well, if (x,y) is such a point, then $\sqrt{[x-6]^2+[y-4]^2}=\sqrt{13}$. This does describe the circle. Of course, if you must, all sorts of algebraic things could be done to the equation. But I am the laziest person in the room, and therefore ... |
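The distance formula and the circle description can be tried out in Python. This is a sketch: the function names `dist` and `on_circle` and the sample points are chosen for this illustration, and a small tolerance is used because floating-point square roots are not exact.

```python
import math

def dist(p, q):
    # Distance in the plane: sqrt((x1 - x2)^2 + (y1 - y2)^2)
    return math.hypot(p[0] - q[0], p[1] - q[1])

center = (6, 4)
radius = math.sqrt(13)

def on_circle(point, tolerance=1e-12):
    # A point is on the circle when its distance from the center is the radius.
    return abs(dist(point, center) - radius) < tolerance

# Points whose coordinates differ from the center by 2 and 3 (in some order)
# are at distance sqrt(2^2 + 3^2) = sqrt(13) from the center.
candidates = [(8, 7), (3, 2), (9, 6), (4, 1), (6, 4), (10, 4)]
for point in candidates:
    print(point, dist(point, center), on_circle(point))
```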
Functions
The word function is used in a technical sense in calculus, and
is one of the most important vocabulary words. It is the logical
setting for how things are transformed to other things. In the case of
Math 151, the "things" are numbers. So functions will change numbers
to numbers. A function is a rule changing numbers to numbers, with the
important restriction that each "input" number is assigned a unique
output number. You could think of a function as a machine with an
input and a unique output associated to each input. The collection of
all valid inputs to the machine, those inputs which don't cause the
machine to break, is called the domain. The collection of all
of the outputs for these valid inputs is called the range.
One example, from the toy model, already tricky!
In the toy model we got the equation V=x(11-2x)(8-2x). Well, the
formula on the right side of the equation defines a function. Usually
we write V(x) to show the dependence of V on the input value, x. So
V(x)=x(11-2x)(8-2x). What is the domain of this function, V? If we
just consider the formula, we'll get one answer: the polynomial has a
value for all x's. The "natural domain" of a function defined by the
formula x(11-2x)(8-2x) is all x. But if the formula is being used to
model the physical situation, then there are definite restrictions on
x. Certainly x should not be negative (cut out a square with
negative sides?). And x can't be bigger than half of the
smaller side length (remember that the paper's original size was 8 by
11 inches). So the domain is (0,4). Some people (many people,
actually) argue that it makes more sense for the domain in this model
to be [0,4]. I won't continue this discussion now, but we will come
back to this. But certainly the physically reasonable domain for V in
this problem is not the same as the "natural domain".
The range of V is interesting. If we allow x=0, the range
"begins" at 0. How big does it get? I don't know now. So the
range of V is [0, I don't know now].
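Until better tools are available, a computer can at least give a rough idea of how big V gets. The Python sketch below samples V on the closed interval [0,4]. The grid size and the variable names are assumptions of this sketch, and a sampled maximum is only an estimate.

```python
def V(x):
    # Volume of the folded box: height x, base (11 - 2x) by (8 - 2x)
    return x * (11 - 2 * x) * (8 - 2 * x)

# Sample the physically reasonable domain [0, 4] on an evenly spaced grid.
n = 4000
xs = [4 * k / n for k in range(n + 1)]

best_x = max(xs, key=V)
print("largest sampled volume:", V(best_x), "cubic inches at x =", best_x)

# The formula still produces numbers outside the physical domain,
# but they do not describe any box.
print(V(-50), V(200))
```

On this grid the largest sampled volume is a little more than 60 cubic inches, near x=1.5, which suggests where the top of the range is.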
Student-specified domain -- even weirder!
Before class Mr. Green "gave" me the
interval (0,5) and Mr. Sloane "gave"
me the interval [7,8]. Below is a picture of the union of these
intervals, with the open interval indicated with a small circle at the
end, and the closed interval shown with a large dot at each end.
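I would like to "construct" a formula whose natural domain of
definition is these two intervals. Familiar restrictions include
division (we're not supposed to divide by 0) and square roots (in this
course we deal with real numbers, so we should only take square
roots of non-negative numbers). One formula which works is
f(x)=sqrt(8-x)+1/sqrt(x)+sqrt((x-7)/(x-5)): the first piece needs x<=8,
the second needs x>0, and the third needs x<5 or x>=7.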
If you don't believe me, the graph to the right of the function f is produced by Maple, a program available to all Rutgers students (on Eden, other systems, and in computer labs). The instruction I used is below. The discont=true simply advised the program to expect "discontinuities", and not to try to connect the dots (it is a program, and will try to connect the dots!). I hope that you could get something similar on a graphing calculator. Notice that there is a graph exactly over the intervals that were named, and the boundary "behavior" over (0,5) sort of indicates that the endpoints are not in the domain.
plot(sqrt(8-x)+1/sqrt(x)+sqrt((x-7)/(x-5)),x=-1..9,y=0..10,thickness=2,color=black,discont=true)
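The same formula can be spot-checked numerically. This is a sketch in Python rather than Maple, and the helper name f_or_none is hypothetical. It returns None wherever the formula would need the square root of a negative number or a division by 0, so the sample points show which x's are in the natural domain.

```python
import math

def f_or_none(x):
    """Evaluate sqrt(8-x) + 1/sqrt(x) + sqrt((x-7)/(x-5)) over the reals.

    Returns None when the formula is undefined at x.
    """
    try:
        return math.sqrt(8 - x) + 1 / math.sqrt(x) + math.sqrt((x - 7) / (x - 5))
    except (ValueError, ZeroDivisionError):
        # ValueError: square root of a negative number.
        # ZeroDivisionError: division by 0 (at x = 0 or x = 5).
        return None

for x in [-1, 0, 1, 4.9, 5, 6, 7, 7.5, 8, 9]:
    status = "defined" if f_or_none(x) is not None else "undefined"
    print(x, status)
```

Checking a handful of points is only evidence, of course. The real argument is the list of restrictions on each piece of the formula.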
The graph of a function
The collection of all points in the plane which correspond to the
ordered pairs (x,f(x)), when x is in the domain of the function, is
called the graph of the function. I like pictures, so I like
graphs. Substantial evidence exists (allocation of neural resources --
brain power!) suggesting that humans can process lots of visual cues
efficiently, much more so, say, than many numbers or algebraic
formulas. It is possible to specify a function with its graph. So
that's what I did.
An example
I drew a curve much like what is displayed to the right. What was
drawn? It was a nice ("continuous", technical word to be defined
later) curve which "interpolated" (was drawn to connect) the five
"data points" (-2,4), (-1,0), (0,-2), (1,-3), and (2,-1).
This is the graph of a function, since it passes the vertical line test: every vertical line intersects the graph at most one time. This is logically the same as requiring that every input yields exactly one output.
What is the domain of this function? That's the set of x's for which there is a point (x,something) on the graph. Look at the collection of vertical lines which touch this graph. The x's whose vertical lines touch or cross this graph are all x's in the closed interval [-2,2].
What is the range of this function? That's the set of y's for which the point (something,y) is on the graph. You can find these by considering all of the horizontal lines which touch or cross the graph. Notice, please, that there are horizontal lines which cross the graph more than once. This is definitely permitted by the definition of function. There can be distinct inputs which have the same outputs. The range is [-3,4].
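The three ideas above (vertical line test, domain, range) can be tried out on just the five data points. This is a minimal sketch in Python, and the names points and passes_vertical_line_test are invented for it. A finite list of points can only hint at the domain and range of the whole curve, since the curve between the points matters too. For the curve drawn here the extents of the data points happen to agree with [-2,2] and [-3,4].

```python
# The five data points the curve was drawn through.
points = [(-2, 4), (-1, 0), (0, -2), (1, -3), (2, -1)]

def passes_vertical_line_test(pts):
    """True if no x-value is paired with two different y-values."""
    seen = {}
    for x, y in pts:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

xs = [x for x, _ in points]
ys = [y for _, y in points]

# Smallest and largest x and y among the data points only.
x_extent = (min(xs), max(xs))
y_extent = (min(ys), max(ys))

print(passes_vertical_line_test(points), x_extent, y_extent)
```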
Creating some new functions
We know that D(x)=f(|x|). Look below, please.
D(-2)=f(2)=-1, D(-1)=f(1)=-3, D(0)=f(0)=-2, D(1)=f(1)=-3, D(2)=f(2)=-1.
These data points are shown on the graph to the right.
You've got to get D's values very carefully.
The absolute value is on the "inside" of the function. The values of
the function are values of f, and these may be (actually, they
are!) negative! With these data points, I hoped that students
could indeed sketch a graph of D. Such a graph is to the right.
The domain of D is [-2,2]. The range of D is [-3,-1]. The graph of D has several interesting qualitative aspects. It has a corner at (0,-2). And it is symmetric with respect to the y-axis, because x and -x have the same absolute value, so D(x)=D(-x).
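The bookkeeping for D's data points can be sketched in a few lines of Python. The dictionary f_values simply records the five data points of f given earlier, and the names are illustrative only. The absolute value is applied first, and then f is looked up.

```python
# Values of f at the five data points of the drawn curve.
f_values = {-2: 4, -1: 0, 0: -2, 1: -3, 2: -1}

def D(x):
    """D(x) = f(|x|): take the absolute value first, then apply f."""
    return f_values[abs(x)]

# Data points of D at the same five inputs.
D_points = [(x, D(x)) for x in sorted(f_values)]
print(D_points)
```

Only f's values at 0, 1 and 2 are ever used, which is why the values 4 and 0 from the left side of f's graph never appear in D.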
About the QotD
QotD=the Question of the Day. I will try (unless I get tired or busy or bored ...) to ask a question at every lecture. Every student who hands in an answer gets full credit, whatever the answer is! Why do I do this? Here are some possible reasons:
Further questions
Some answers
Maintained by greenfie@math.rutgers.edu and last modified 9/3/2007.