An Intuitive Introduction To Limits
Ack! We missed what happened at 4:00. Even so, what's your prediction for
the ball's position?
Easy. Just grab the neighboring instants (3:59 and 4:01) and predict the ball to
be somewhere in-between.
And it works! Real-world objects don't teleport; they move through
intermediate positions along their path from A to B. Our prediction is "At 4:00,
the ball was between its position at 3:59 and 4:01." Not bad.
With a slow-motion camera, we might even say "At 4:00, the ball was between
its positions at 3:59.999 and 4:00.001."
Our prediction is feeling solid. Can we articulate why?
The predictions agree at increasing zoom levels. Imagine the 3:59-4:01
range was 9.9-10.1 meters, but after zooming into 3:59.999-4:00.001, the
range widened to 9-12 meters. Uh oh! Zooming
should narrow our estimate, not make it worse! Not every zoom level
needs to be accurate (imagine seeing the game every 5 minutes), but to
feel confident, there must be some threshold where subsequent zooms
only strengthen our range estimate.
The before-and-after agree. Imagine at 3:59 the ball was at 10
meters, rolling right, and at 4:01 it was at 50 meters, rolling left. What
happened? We had a sudden jump (a camera change?) and now we can't
pin down the ball's position. Which one had the ball at 4:00? This
ambiguity shatters our ability to make a confident prediction.
With these requirements in place, we might say "At 4:00, the ball was at 10
meters." This estimate is confirmed by our initial zoom (3:59-4:01, which
estimates 9.9 to 10.1 meters) and the following one (3:59.999-4:00.001, which
estimates 9.999 to 10.001 meters).
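Here's a minimal sketch of the two requirements in code. It assumes a made-up position function, `position(t)`, with t in minutes relative to 4:00 (the function and the zoom levels are illustrations, not measurements from the game). It checks that the range narrows with each zoom and keeps surrounding the same value.

```python
def position(t):
    # Hypothetical ball position in meters; t is minutes relative to 4:00.
    # Chosen so the ball sits at 10 m at t = 0. Any smooth function would do.
    return 10 + 0.1 * t

def predicted_range(zoom):
    # Grab the instants just before and just after the moment we missed.
    before, after = position(-zoom), position(zoom)
    return min(before, after), max(before, after)

zooms = [1.0, 0.1, 0.01, 0.001]   # minutes on either side of 4:00
ranges = [predicted_range(z) for z in zooms]
widths = [hi - lo for lo, hi in ranges]

for z, (lo, hi) in zip(zooms, ranges):
    print(f"+/- {z} min: between {lo:.4f} and {hi:.4f} meters")
```

If a later zoom ever produced a wider range, or the before and after values pulled apart, we'd lose confidence in the prediction.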
Limits are a strategy for making confident predictions.
Limits help answer this conundrum: predict your speed when traveling to a
neighboring instant. Then ask the "impossible question": what's your predicted
speed when the gap to the neighboring instant is zero?
Note: The limit isn't a magic cure-all. We can't assume one exists, and there
may not be an answer to every question. For example: Is the number of
integers even or odd? The quantity is infinite, and neither the "even" nor "odd"
prediction stays accurate as we count higher. No well-supported prediction
exists.
For pi, e, and the foundations of calculus, smart minds did the proofs to
determine that "Yes, our predicted values get more accurate the closer we
look." Now I see why limits are so important: they're a stamp of approval on
our predictions.
Math English | Human English
$\lim_{x \to c} f(x) = L$ means | When we "strongly predict" that f(c) = L, we mean
for all real $\epsilon > 0$ | for any error margin we want (+/- .1 meters)
there exists a real $\delta > 0$ | there is a zoom level (+/- .1 seconds)
such that for all x with $0 < |x - c| < \delta$, we have $|f(x) - L| < \epsilon$ | where the prediction stays accurate to within the error margin
Yes, we can get cute and ask for the left hand limit (prediction from before
the event) and the right hand limit (prediction from after the event), but we
only have a real limit when they agree.
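The definition reads a bit like a test we can run. Below is a rough sketch, assuming a hypothetical helper `stays_within` and a made-up example function (neither is from the article). It samples points inside the zoom level and checks that the prediction stays inside the error margin. Sampling can only suggest that a zoom level works; a proof needs the algebra.

```python
def stays_within(f, c, L, eps, delta, samples=1000):
    # Sample x values with 0 < |x - c| < delta and check |f(x) - L| < eps.
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # strictly between 0 and delta
        for x in (c - offset, c + offset):   # approach from the left and the right
            if abs(f(x) - L) >= eps:
                return False
    return True

def f(x):
    # Hypothetical example function; the predicted value at c = 1 is 4.
    return 3 * x + 1

print(stays_within(f, c=1, L=4, eps=0.1, delta=0.03))  # zoom is tight enough
print(stays_within(f, c=1, L=4, eps=0.1, delta=0.5))   # zoom is too loose
```

Note that x = c itself is never sampled: the prediction is made entirely from the neighbors.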
A function is continuous when it always matches the predicted value (and
discontinuous if not):
The first check: do we even need a limit? Unfortunately, we do: just plugging in
x=2 means we have a division by zero. Drats.
But intuitively, we see the same "zero" (x - 2) could be cancelled from the top
and bottom. Here's how to dance this dangerous tango:
Assume x is anywhere except 2 (It must be! We're making a prediction
from the outside.)
We can then cancel (x - 2) from the top and bottom, since it isn't zero.
We're left with f(x) = 2x + 1. This function can be used outside the "black
hole".
What does this simpler function predict? That f(2) = 2*2 + 1 = 5.
So f(2) = 5 is our prediction. But did you see the sneakiness? We pretended x
wasn't 2 [to divide out (x - 2)], then plugged in 2 after that troublesome item
was gone! Think of it this way: we used the simple behavior from outside the
event to predict the gnarly behavior at the event.
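We can watch the same thing happen numerically. This sketch assumes the function in question is f(x) = (2x + 1)(x - 2) / (x - 2), which is what the cancellation steps imply; the sample points are arbitrary.

```python
def f(x):
    # Assumed form of the function, inferred from the cancellation steps.
    return (2 * x + 1) * (x - 2) / (x - 2)

def simplified(x):
    # Valid only for x != 2, after cancelling (x - 2).
    return 2 * x + 1

try:
    f(2)
except ZeroDivisionError:
    print("f(2) is undefined: division by zero")

# Outside the black hole, both versions agree and close in on 5.
for x in (1.9, 1.99, 1.999, 2.001, 2.01, 2.1):
    print(x, f(x), simplified(x))

print("prediction at x = 2:", simplified(2))
```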
We can prove these shenanigans give a solid prediction, and that f(2) = 5 is
infinitely accurate.
For any accuracy threshold ($\epsilon$), we need to find the zoom range ($\delta$) where we
stay within the given accuracy. For example, can we keep the estimate
between +/- 1.0?
Sure. We need to find out where
$|f(x) - 5| < 1.0$
so
$|2x + 1 - 5| < 1.0$
$|2x - 4| < 1.0$
$2|x - 2| < 1.0$
$|x - 2| < 0.5$
In other words, x must stay within 0.5 of 2 to maintain the initial accuracy
requirement of 1.0. Indeed, when x is between 1.5 and 2.5, f(x) goes from
f(1.5) = 4 to f(2.5) = 6, staying +/- 1.0 from our predicted value of 5.
We can generalize to any error tolerance ($\epsilon$) by plugging it in for 1.0 above. We
get:
$|x - 2| < 0.5 \cdot \epsilon$
If our zoom level is $\delta = 0.5 \cdot \epsilon$, we'll stay within the original error. If our error
is 1.0 we need to zoom to .5; if it's 0.1, we need to zoom to 0.05.
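As a sanity check on that rule of thumb, here's a small sketch that picks a few error margins, zooms to half of each, and measures the worst miss it can find. The helper name `worst_error` and the sample count are my own choices for illustration.

```python
def f(x):
    return 2 * x + 1   # the simplified function, valid away from x = 2

def worst_error(eps, samples=1000):
    delta = 0.5 * eps                        # zoom level suggested by the algebra
    worst = 0.0
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # 0 < offset < delta
        for x in (2 - offset, 2 + offset):
            worst = max(worst, abs(f(x) - 5))
    return worst

for eps in (1.0, 0.1, 0.001):
    print(eps, worst_error(eps) < eps)
```

Every margin passes, which is what the algebra promised: the miss is always twice the distance from 2, so half the margin is enough zoom.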
This simple function was a convenient example. The idea is to start with the
initial constraint ($|f(x) - L| < \epsilon$), plug in f(x) and L, and solve for the distance
away from the black-hole point ($|x - c| < ?$). It's often an exercise in algebra.
Sometimes you're asked to simply find the limit (plug in 2 and get f(2) = 5),
other times you're asked to prove a limit exists, i.e. crank through the epsilon-delta algebra.
as
You can get sneaky and define y = 1/x, replace items in your formula, and then
use $\lim_{y \to 0^+}$
so it looks like a normal problem again! (Note from Tim in the comments: the
limit is coming from the right, since x was going to positive infinity). I prefer
this arrangement, because I can see the location we're narrowing in on (we're
always running out of paper when charting the infinite version).
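To see the substitution at work, here's a small sketch with a made-up function, f(x) = (2x + 1) / x, chosen only because it visibly settles toward 2 as x grows. The same values show up whether we send x off to infinity or send y = 1/x toward 0 from the right.

```python
def f(x):
    return (2 * x + 1) / x      # hypothetical example; tends to 2 as x grows

def g(y):
    return f(1 / y)             # the same function after substituting x = 1/y

for x in (10, 1_000, 100_000):   # the "infinite version": x keeps growing
    print("x =", x, "f(x) =", f(x))

for y in (0.1, 0.001, 0.00001):  # the "normal" version: y approaches 0 from the right
    print("y =", y, "g(y) =", g(y))
```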
So many math courses jump into limits, infinitesimals and Very Small Numbers (TM)
without any context. But why do we care?
Math helps us model the world. We can break a complex idea (a wiggly curve) into
simpler parts (rectangles):
But, we want an accurate model. The thinner the rectangles, the more accurate the
model. The simpler model, built from rectangles, is easier to analyze than dealing
with the complex, amorphous blob directly.
The tricky part is making a decent model. Limits and infinitesimals help us create
models that are simple to use, yet share the same properties as the original item
(length, area, etc.).
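A rough sketch of the rectangle model, assuming a stand-in curve (y = x^2 between 0 and 1, whose exact area is 1/3) and a hypothetical helper `rectangle_area`. The thinner the rectangles, the smaller the gap between the simple model and the real area.

```python
def rectangle_area(f, a, b, n):
    # Approximate the area under f on [a, b] with n rectangles,
    # each as tall as the curve at the middle of its slice.
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))

def curve(x):
    return x ** 2    # stand-in for the "wiggly curve"; exact area on [0, 1] is 1/3

errors = []
for n in (4, 40, 400):
    approx = rectangle_area(curve, 0, 1, n)
    errors.append(abs(approx - 1 / 3))
    print(n, approx, errors[-1])
```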
The Paradox of Zero
Breaking a curve into rectangles has a problem: How do we get slices so thin we
don't notice them, but large enough to "exist"?
If the slices are too small to notice (zero width), then the model appears identical to
the original shape (we don't see any rectangles!). Now there's no benefit: the
"simple" model is just as complex as the original! Additionally, adding up zero-width
slices won't get us anywhere.
If the slices are tiny but measurable, the illusion vanishes. We see that our model is a
jagged approximation, and won't be accurate. What's a mathematician to do?
We want the best of both: slices so thin we can't see them (for an accurate model)
and slices thick enough to create a simpler, easier-to-analyze model. A dilemma is at
hand!
The Solution: Zero is Relative
These approaches bridge the gap between "zero to us" and "nonzero at a greater
level of accuracy".
Overview of Limits & Infinitesimals
Let's see how each approach would break a curve into rectangles:
Limits: "Give me your error margin (I know you have one, you limited,
imperfect human!), and I'll draw you a curve. What's the smallest unit on your
ruler? Inches? Fine, I'll draw you a staircasey curve at the millimeter level and
you'll never know. Oh, you have a millimeter ruler, do you? I'll draw the curve
in nanometers. Whatever your accuracy, I'm better. You'll never see the
staircase."
Limits stay in our dimension, but with just enough accuracy to maintain the illusion
of a perfect model. Infinitesimals build the model in another dimension, and it looks
perfectly accurate in ours.
The trick to both approaches is that the simpler model was built beyond our level of
accuracy. We might know the model is jagged, but we can't tell the difference: any
test we do shows the model and the real item as the same.
That trick doesn't work, does it?
Oh, but it does. We're tricked by "imperfect but useful" models all the time:
Audio files don't contain all the information of the original signal. But can you
tell the difference between a high-quality mp3 and a person talking in the other
room?
Computer printouts are made from individual dots too small to see. Can you tell
a handwritten note from a high-quality printout of the same?
Video shows still images 24 times per second. This "imperfect" model is fast
enough to trick our brain into seeing fluid motion.
On and on it goes. We resist because of our artificial need for precision. But audio and
video engineers know they don't need a perfect reproduction, just quality good
enough to trick us into thinking it's the original.
Calculus lets us make these technically imperfect but "accurate enough" models in
math.
Working In Another Dimension
We need to be careful when reasoning with the simplified model. We need to "do our
work" at the level of higher accuracy, and bring the final result back to our world.
We'll lose information if we don't.
Suppose an imaginary number (i) visits the real number line. Everyone thinks he's
zero: after all, Re(i) = 0. But i does a trick! "Square me!" he says, and they do: i * i =
-1 and the other numbers are astonished.
To the real numbers, it appeared that 0 * 0 = -1, a giant paradox.
But their confusion arose from their perspective: they only thought it was 0 * 0 =
-1. Yes, Re(i) * Re(i) = 0, but that wasn't the operation! We want Re(i * i), which is
different entirely! We square i in its own dimension, and bring that result back to
ours. We need to square i, the imaginary number, and not 0, our idea of what i was.
Beware similar mistakes in calculus: we deal with tiny numbers that look like zero to
us, but we can't do math assuming they are (just like treating i like 0). No, we need to
do the math in the other dimension and convert the results back.
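The order of operations is easy to check with Python's built-in complex numbers. This is just an illustration of the analogy; the variable names are mine.

```python
i = 1j                               # Python's imaginary unit

seen_from_reals = i.real * i.real    # flatten i to "0" first, then multiply
done_properly = (i * i).real         # multiply in i's own dimension, then bring it back

print(seen_from_reals)   # 0.0
print(done_properly)     # -1.0
```

Same ingredients, different answers: the conversion back to "our world" has to come last.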
Limits and infinitesimals have different perspectives on how this conversion is done:
Nobody ever told me: Calculus lets you work at a better level of accuracy, with a
simpler model, and bring the results back to our world.
A Real Example: sin(x) / x
Let's try a conceptual example. Suppose we want to know what happens to sin(x) / x
at zero. Now, if we just plug in x = 0 we get a nonsensical result: sin(0) = 0, so we
get 0 / 0 which could be anything.
Let's step back: what does "x = 0" mean in our world? Well, if we're allowing the
existence of a greater level of accuracy, we know this:
Things that appear to be zero may be nonzero in a different dimension (just like i might
appear to be 0 to us, but isn't)
We're going to say that x can be really, really close to zero at this greater level of
accuracy, but not "true zero". Intuitively, you can think of x as 0.0000...00001, where
the "..." is enough zeros for you to no longer detect the number.
(In limit terms, we say x = 0 + d (delta, a small change that keeps us within our error
margin) and in infinitesimal terms, we say x = 0 + h, where h is a tiny hyperreal
number, known as an infinitesimal)
Ok, we have x at "zero to us, but not really". Now we need a simpler model of sin(x).
Why? Well, sine is a crazy repeating curve, and it's hard to know what's happening.
But it turns out that a straight line is a darn good model of a curve over short
distances:
Just like we can break a filled shape into tiny rectangles to make it simpler, we can
dissect a curve into a series of line segments. Around 0, sin(x) looks like the line "x".
So, we switch sin(x) with the line "x". What's the new ratio?
Well, x/x is 1. Remember, we aren't really dividing by zero because in this
super-accurate world, x is tiny but non-zero (0 + d, or 0 + h). When we "take the limit" or
"take the standard part" it means we do the math (x / x = 1) and then find the
closest number in our world (1 goes to 1).
So, 1 is what we get for sin(x) / x as x approaches zero: that is, we make x as small as
possible so it becomes 0 to us. If x became pure, true zero, then the ratio would be
undefined (and it is at the infinitesimal level!). But we're never sure if we're at
perfect zero: something like 0.00000001 looks like zero to us.
So, sin(x)/x looks like x/x = 1 as far as we can tell. Intuitively, the result makes
sense once we read about radians.
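Here's a quick numerical look at the same claim, using Python's standard `math` module. The sample values of x are arbitrary; the point is that sin(x) and the line x become harder and harder to tell apart, so their ratio settles toward 1.

```python
import math

def ratio(x):
    return math.sin(x) / x      # undefined at x = 0 itself

for x in (0.1, 0.01, 0.001, 0.0001):
    gap = abs(math.sin(x) - x)  # how far sin(x) strays from the line "x"
    print(x, ratio(x), gap)
```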
Visualizing The Process
Today's goal isn't to solve limit problems, it's to understand the process of solving
them. To solve this example:
Realize x=0 is not reachable from our accuracy; a small but nonzero x is always
available at a greater level of accuracy
Replace sin(x) by a straight line as a simpler model
Do the math with the simpler model (x / x = 1)
Bring the result (1) back into our accuracy (stays 1)
In later articles, we'll learn the details of setting up and solving the models.
Caveats: The Trick Doesn't Always Work
Some functions are really "jumpy" and they might differ on an infinitesimal-by-infinitesimal level. That means we can't reliably bring them back to our world. It looks
like the function is unstable at microscopic level and doesn't behave "smoothly".
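For a feel of what "jumpy" looks like, here's a sketch using sin(1/x), a standard textbook example of a function with no limit at 0 (my choice of example, not one the article names). However far we zoom in on zero, the values keep swinging across nearly the whole range from -1 to 1, so no prediction firms up.

```python
import math

def jumpy(x):
    return math.sin(1 / x)     # oscillates faster and faster as x nears 0

def spread(zoom, samples=10_000):
    # Range of values seen within (0, zoom]. A good prediction needs this to shrink.
    values = [jumpy(zoom * k / samples) for k in range(1, samples + 1)]
    return max(values) - min(values)

for zoom in (0.1, 0.01, 0.001):
    print(zoom, spread(zoom))
```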
The rigorous part of limits is figuring out which functions behave well enough that
simple yet accurate models can be made. Fortunately, most of the natural functions
in the world (x, x^2, sin, e^x) behave nicely and can be modeled with calculus.
Limits Or Infinitesimals?
Logically, both approaches solve the problem of "zero and nonzero". I like
infinitesimals because they allow "another dimension" which seems a cleaner
separation than "always just outside your reach". Infinitesimals were the foundation
of the intuition of calculus, and appear inside physics and other subjects that use it.
This isn't an analysis class, but the math robots can be assured that infinitesimals
have a rigorous foundation. I use them because they "click" for me.
Summary
Phew! Some of these ideas are tricky, and I feel like I'm talking from both sides of my
mouth: we want to be simpler, yet still perfectly accurate?
This dilemma about being zero sometimes, and non-zero others, is a famous
critique of calculus. It was mostly ignored since the results worked out, but in the
1800s limits were introduced to really resolve the dilemma. We learn limits today, but
without understanding the nature of the problem they were trying to solve!
Here are the key concepts:
Zero is relative: something can be zero to us, and non-zero somewhere else
Infinitesimals (another dimension) and limits (beyond our accuracy) resolve the
dilemma of zero and nonzero
We create simpler models in the more accurate dimension, do the math, and bring the
result to our world
The final result is perfectly accurate for us
My goal isn't to do math, it's to understand it. And a huge part of grokking calculus is
realizing that simple models created beyond our accuracy can look "just fine" in our
dimension. Later on we'll learn the rules to build and use these models. Happy math.
Other Posts In This Series
1. A Gentle Introduction To Learning Calculus
2. How To Understand Derivatives: The Product, Power & Chain Rules
3. How To Understand Derivatives: The Quotient Rule, Exponents, and Logarithms
4. An Intuitive Introduction To Limits
5. Why Do We Need Limits and Infinitesimals?
6. Learning Calculus: Overcoming Our Artificial Need for Precision
7. Prehistoric Calculus: Discovering Pi
8. A Calculus Analogy: Integrals as Multiplication
9. Calculus: Building Intuition for the Derivative
10. Understanding Calculus With A Bank Account Metaphor
11. A Friendly Chat About Whether 0.999... = 1
36 comments
I think of x as the x-axis in the plane that was demonstrated. It could also
be just the variable in the equation. But neither makes sense. I know x/x
is 1, but how come sin(x) is x?
thanks! and more power!
10.
mcmlxxxvi says:
Hello, Kalid,
Very well-written and descriptive. Thank you for giving me a good and
pleasant read on things past and nearly forgotten!
I could only wish that more people like you were teaching in high schools
and universities. Around here, the tutors are often skilled in their field,
but regularly and gravely fail to convey the meaning behind the
definitions, theorems and proofs they teach: only the items themselves;
and the educational process plummets.
Arbie:
I believe this means the line y = x. Thus y_1 = sin(x), y_2 = x and y_1
~= y_2 for x -> 0.
11.
Kalid says:
@mcmlxxxvi: Glad you enjoyed it, and thanks for the comment. I too
wish there was more emphasis on true understanding vs. the "let's learn
enough to pass the next test" mentality. Learning the intuition may take
a bit longer than memorizing in the short term, but in the long run it
gives you a more flexible set of knowledge, and not to mention it's way
more fun. I sometimes see grades as a curse because rather than being
an indication of knowledge, they become an end in itself vs. the learning
it should represent. It's very hard to test intuition; it's a gut check you
need to ask yourself. But with no grades there's no incentive (carrot or
stick). I don't know the answer, but I too wish there was another way.
13.
Anonymous says:
Kalid says:
@Anonymous: Thank you, I've seen the essay and really like it :).
15.
asdf says:
In intuitionistic math, the law of excluded middle is rejected (i.e. not not
A doesn't imply A) so you must provide an algorithm for constructing all
your objects.
There is no general procedure for detecting whether or not 2 objects are
equal. You must explicitly provide an algorithm for showing 2 objects are
equal.
The trichotomy law (a < b, a > b, a = b) doesn't hold in general.
All functions are continuous. Piecewise functions are nonsensical.
In other words, the continuum is unbreakable into points. Functions
transform the continuum onto the continuum.
With this as our basis, Smooth Infinitesimal Analysis introduces an object
called epsilon.
There is no algorithm to tell whether or not epsilon != 0 or epsilon = 0.
This avoids the first problem entirely.
epsilon^2 = 0 though which gives us a way to get rid of them from our
formulas.
So I view infinitesimals as the glue that makes the continuum
unbreakable and there is no algorithm to decide if the expression
epsilon = 0 or epsilon != 0 is true (see why we have to reject the law of
excluded middle to make this work?).
16.
Kalid says:
Dave says:
Kalid says:
werterber says:
Hello, i have silly question. How intuitively explain that cos x/x is
undefind?
There is graf> http://www.wolframalpha.com/input/?i=Plot%5B{cos%5Bx
%5D%2C+x}%2C+{x%2C+-1.0%2C+1.0}%5D
thx
20.
Kalid says:
@werterber: Not a silly question at all! In my head, it's saying "what's the
ratio of width [cos(x)] to distance traveled (x)?"
As our distance traveled goes to 0 (we aren't moving from the starting
point), cos(x) tends towards 1: we're pretty much at the same width. So
it becomes 1 / 0 in my head.
21.
Kostya says:
Anonymous says:
Hey, Kalid You hold a marvelous scape valve from the montains of
unintuitive theorems and corolaries contained in every text-book.Outside,
our memory rests in peace, and the big picture awakes our deep passions
about math.Oh, precious and full of insight scape valve.
23.
kalid says:
Thanks Kalid,
Your articles did help me a lot.
By the way, what software do you use to illustrate examples in your
articles (like this one)? Thanks
25.
kalid says:
skrsccrfrk says:
andy521 says:
@kalid
@ kalid
Hi kalid plz i hav a doubt ?? My question is can we think about wat
number would be infinitesimal ?? According to me we humans hav a
limited horizon of thinking and so we can just think of finit numbers.
So even if we assign any value to infinitesimal it would be some fiite
value and a value smaller than it will still exist.. So is the limit which
we are talking about is the limit of our brains to comprehend such small
amounts ??? Plz help ??
29.
very nice, i loved the way, you taught us. Very interesting!
30.
Eric V says:
@Dave
post 17
Regarding your question, "If you learn calculus via the use of
infinitesimals, is it possible to then make the leap over to using limits?", I
suppose it is possible for I have (in a way) done it, though I never knew I
was learning infinitesimals.
I must admit that prior to reading this post I have never even heard
about the defined mathematic concept of infinitesimal. I also never took
a formal Calculus course. I originally learned Calc in my AP physics class
in high school. Our teacher (one of the few who truly loved the craft of
teaching and had a passion for what she did) had both the constraint of
putting her Physics class on hold to teach Calc to those who have never
seen it, and also the freedom that brevity provided; she was free to teach
the idea of calculus without the strict procedural rigor that a formal class
drags its pupil through. We learned the basic idea of the integral before
the derivative, heresy in Calc101. Here it is 21 years later and I can still
hear her voice saying Taking the integral just means add up a whole
bunch of things, and taking a differential element of just means cut the
thing into really teenie weenie chunks. We learned the idea of a
derivative as slope of a function without being given 2 points, just one
point and an interval to the next. After seeing what happened as the
interval got smaller we finally visualized slope at a point. Only afterward
were we shown the official formula with a limit in it. I saw it as a
perfectly nice piece of legal-eez that made the rest of the world happy for
me to have learned the right way, and I was enormously grateful our
teacher taught us the intuitive way.
31.
Eric V says:
was. But then what about this new element, how can we know it is the
smallest thing?
A revelation came when I realized that in order to be an element we
don't really need it to be true that you can't break it apart, it just means
that if you do break it down further then it is no longer the same stuff.
Thus the element is really just the smallest possible piece of a thing
WHICH can still be the same thing. E.g. an element of water (H2O) can
be broken down, but it is no longer water, just hydrogen and oxygen
atoms. An atom can be broken down into protons, neutrons, and
electrons, but it is no longer the same stuff as the original atom. A little
chunk of matter (a superstring exhibiting one class of vibration in 10
dimensions) can indeed be broken down, it is just no longer matter. It is
also not exactly energy, but when the stuff comes back together in a
different pattern (the superstring having the same vibration just in a
different dimension) it appears to us as a little chunk of energy.
It seems natural to me to take a cue from the physical world to
comprehend numbers. When we look at an element and it appears we've
hit the limit in terms of breaking it up, but we can go further, it just
means we have to view it in a different dimension. Why then could we
not do the same with numbers? Here's a rational number: you can only
break it apart but to a certain extent and no smaller. I know you may
object and say take that number and divide by 2, it is smaller and still
rational. But take notice of the irrationals, like sqrt(2). It does exist,
sitting there staring us in the face. It is in between rationals. So how does
there exist any space between rationals? How can the rationals be
broken down finer than it is possible to break them down? Imagine
thinking you understand that atoms are elementary particles, then this
clown Rutherford comes along and experimentally identifies this object
(nucleus) in the middle of an atom.
I say the best way forward is to take as true those things that must be
true and re-evaluate our preconceived notions that have pigeon-holed us
into an apparent paradox. It is difficult and un-nerving. You can be
guaranteed you'll get it wrong a few times before you make some
progress, but some progress is far better than the certainty of smaller
minds.
32.
kalid says:
so sparse they'll never complete it! There must be another way to get to
those in-between numbers, and it isn't by dividing the ones we have into
smaller bits.
33.
Kalid
Leibnitz and Newton originated calculus in the 17th century, long before
imaginary numbers were around. Can't we just say that limits are
paradoxical but they work and leave it at that?
Integrals are often described as finding the area under a curve. This
description is too narrow: it's like saying multiplication exists to find the area of
rectangles. Finding area is a useful application, but not the purpose. Integrals
help us combine numbers when multiplication can't.
I wish I had a minute with myself in high school calculus:
"Psst! Integrals let us 'multiply' changing numbers. We're used to "3 x 4 = 12",
but what if one quantity is changing? We can't multiply changing numbers, so
we integrate.
You'll hear a lot of talk about area -- area is just one way to visualize
multiplication. The key isn't the area, it's the idea of combining quantities into
a new result. We can integrate ("multiply") length and width to get plain old
area, sure. But we can integrate speed and time to get distance, or length,
width and height to get volume.
When we want to use regular multiplication, but can't, we bring out the big
guns and integrate. Area is just a visualization technique, don't get too caught
up in it. Now go learn calculus!"
That's my aha moment: integration is a "better multiplication" that works on
things that change. Let's learn to see integrals in this light.
Understanding Multiplication
Area is a nuanced topic. For today, let's see area as a visual representation
of multiplication:
With each count on a different axis, we can "apply them" (3 applied to 4) and
get a result (12 square units). The properties of each input (length and length)
were transferred to the result (square units).
Simple, right? Well, it gets tricky. Multiplication can result in "negative area" (3
x (-4) = -12), which doesn't exist.
We understand the graph is a representation of multiplication, and use the
analogy as it serves us. If everyone were blind and we had no diagrams, we
could still multiply just fine. Area is just an interpretation.
Multiplication Piece By Piece
What's happening? Well, 4.5 isn't a count, but we can use a "piece by piece"
operation. If 3x4 = 3 + 3 + 3 + 3, then
3 x 4.5 = 3 + 3 + 3 + 3 + 3x0.5 = 3 + 3 + 3 + 3 + 1.5 = 13.5
We're taking 3 (the value) 4.5 times. That is, we combined 3 with 4 whole
segments (3 x 4 = 12) and one partial segment (3 x 0.5 = 1.5).
We're so used to multiplication that we forget how well it works. We can break
a number into units (whole and partial), multiply each piece, and add up the
results. Notice how we dealt with a fractional part? This is the beginning of
integration.
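The piece-by-piece idea fits in a few lines of code. This is a minimal sketch, assuming a hypothetical helper `multiply_piecewise` that only handles a non-negative count; it splits the count into whole segments plus one partial segment, multiplies each piece, and adds up the results.

```python
def multiply_piecewise(value, count):
    # Split `count` into whole segments plus one partial segment,
    # multiply each piece by `value`, and add up the results.
    whole = int(count)
    partial = count - whole
    pieces = [value * 1 for _ in range(whole)]
    if partial:
        pieces.append(value * partial)
    return pieces, sum(pieces)

pieces, total = multiply_piecewise(3, 4.5)
print(pieces)   # [3, 3, 3, 3, 1.5]
print(total)    # 13.5
```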
The Problem With Numbers
Numbers don't always stay still for us to tally up. Scenarios like "You drive
30mph for 3 hours" are for convenience, not realism.
Formulas like "distance = speed * time" just mask the problem; we still need to
plug in static numbers and multiply. So how do we find the distance we went
when our speed is changing over time?
Describing Change
Our first challenge is describing a changing number. We can't just say "My
speed changed from 0 to 30mph". It's not specific enough: how fast is it
changing? Is it smooth?
Now let's get specific: every second, I'm going twice that in mph. At 1 second,
I'm going 2mph. At 2 seconds, 4mph. 3 seconds is 6mph, and so on:
distance = speed(t) * t
where speed(t) is the speed at any instant. In our case, speed(t) = 2t, so we
write:
distance = 2t * t
But this equation still looks weird! "t" still looks like a single instant we need to
pick (such as t=3 seconds), which means speed(t) will take on a single value
(6mph). That's no good.
With regular multiplication, we can take one speed and assume it holds for the
entire rectangle. But a changing speed requires us to combine speed and time
piece-by-piece (second-by-second). After all, each instant could be different.
This is a big perspective shift:
Regular multiplication (rectangular): Take the amount of distance moved
in one second, assume it's the same for all seconds, and "scale it up".
Integration (piece-by-piece): See time as a series of instants, each with
its own speed. Add up the distance moved on a second-by-second basis.
We see that regular multiplication is a special case of integration, when the
quantities aren't changing.
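Here's a rough sketch of both perspectives, using the speed(t) = 2t example over 3 seconds. The helper `combine` and the piece counts are my own illustration, and units are treated as loosely as in the text. The exact piece-by-piece total works out to 9, while the rectangular shortcut (pretending the final speed held the whole time) gives 18.

```python
def speed(t):
    return 2 * t            # the running example: speed after t seconds

def steady(t):
    return 30               # a speed that never changes

def combine(speed_at, total_time, pieces):
    # Piece-by-piece "multiplication": the speed at each piece times the piece's size.
    dt = total_time / pieces
    return sum(speed_at(i * dt) * dt for i in range(pieces))

rectangular = speed(3) * 3  # one speed assumed for the whole rectangle
for pieces in (3, 30, 3000):
    print(pieces, combine(speed, 3, pieces))
print("rectangular:", rectangular)

# When nothing changes, piece-by-piece agrees with regular multiplication.
print(combine(steady, 3, 1000), "vs", 30 * 3)
```

The more pieces we use, the closer the total gets to 9; with a steady speed, the pieces add up to the ordinary product.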
How large is a "piece"?
We have a decent idea of "piecewise multiplication" but can't really express it.
"Distance = speed(t) * t" still looks like a regular equation, where t and
speed(t) take on a single value.
In calculus, we write the relationship like this:
$\text{distance} = \int \text{speed}(t) \, dt$
The integral sign (s-shaped curve) means we're multiplying things piece-by-piece and adding them together.
dt represents the particular "piece" of time we're considering. This is
called "delta t", and is not "d times t".
t represents the position of dt (if dt is the span from 3.0-4.0, t is 3.0).
speed(t) represents the value we're multiplying by (speed(3.0) = 6.0)
I have a few gripes with this notation:
The way the letters are used is confusing. "dt" looks like "d times t" in
contrast with every equation you've seen previously.
We write speed(t) * dt, instead of speed(t_dt) * dt. The latter makes it
clear we are examining "t" at our particular piece "dt", and not some
global "t"
You'll often see $\int \text{speed}(t)$, with an implicit dt. This makes it easy to forget
we're doing a piece-by-piece multiplication of two elements.
It's too late to change how integrals are written. Just remember the higher-level concept of 'multiplying' something that changes.
Reading In Your Head
When I see $\text{distance} = \int \text{speed}(t) \, dt$
I think "Distance equals speed times time" (reading the left-hand side first) or
"combine speed and time to get distance" (reading the right-hand side first).
I mentally translate "speed(t)" into speed and "dt" into time and it becomes a
multiplication, remembering that speed is allowed to change. Abstracting
integration like this helps me focus on what's happening ("We're combining
speed and time to get distance!") instead of the details of the operation.
Bonus: Follow-up Ideas
Integrals are a deep idea, just like multiplication. You might have some follow-up questions based on this analogy:
If integrals multiply changing quantities, is there something to divide
them? (Yes -- derivatives)
And do integrals (multiplication) and derivatives (division) cancel? (Yes,
with some caveats).
Can we re-arrange equations from "distance = speed * time" to "speed =
distance/time"? (Yes.)
Can we combine several things that change? (Yes -- it's called multiple
integration)
Does the order we combine several things matter? (Usually not)
Once you see integrals as "better multiplication", you're on the lookout for
concepts like "better division", "repeated integration" and so on. Sticking with
"area under the curve" makes these topics seem disconnected. (To the math
nerds, seeing "area under the curve" and "slope" as inverses asks a lot of a
student).
Reading integrals
Integrals have many uses. One is to explain that two things are "multiplied"
together to produce a result.
Here's how to express the area of a circle:
$\text{Area} = \int 2 \pi r \, dr$
We'd love to take the area of a circle with multiplication. But we can't -- the
height changes as we go along. If we "unroll" the circle, we can see the area
contributed by each portion of radius is "(portion of radius) * circumference". We can write
this relationship using the integral above. (See the introduction to calculus for
more details).
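The "unrolling" can be sketched numerically: slice the circle into thin rings, multiply each ring's circumference by its thickness, and add them up. The helper name and the ring count are assumptions for illustration.

```python
import math

def circle_area_by_rings(radius, rings):
    # Each thin ring contributes circumference * thickness: (2 * pi * r) * dr,
    # with r measured at the middle of the ring.
    dr = radius / rings
    return sum(2 * math.pi * ((i + 0.5) * dr) * dr for i in range(rings))

print(circle_area_by_rings(1.0, 1000))
print(math.pi * 1.0 ** 2)   # the familiar pi * r^2, for comparison
```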
And here's the integral expressing the idea "mass = density * volume":
$\text{mass} = \int_V \rho(r) \, dv$
What's it saying? Rho ($\rho$) is the density function -- telling us how dense a material
is at a certain position, r. dv is the bit of volume we're looking at. So we
multiply a little piece of volume (dv) by the density at that position, $\rho(r)$,
and add
them all up to get mass.
We'd love to multiply density and volume, but if density changes, we need to
integrate. The subscript V is a shortcut for "volume integral", which is
really a triple integral for length, width, and height! The integral involves four
"multiplications": 3 to find volume, and another to multiply by density.
We might not solve these equations, but we can understand what they're
expressing.
Onward and upward
Today's goal isn't to rigorously understand calculus. It's to expand our mental
model, and realize there's another way to combine things: we can add,
subtract, multiply, divide... and integrate.
See integrals as a better way to multiply: calculus will become easier, and
you'll anticipate concepts like multiple integrals and the derivative. Happy
math.
1. Matt says:
@Frank:
It might make more sense if you imagine dividing up the area between
the x-axis and the function y=x into many vertical rectangles, and adding
up their areas. The more rectangles you use, the better the
approximation of the area. The idea behind integration is that if I divide
up the area into infinitely many rectangles with infinitely small width, no
matter how far you zoom in, you'll never see the difference between
the real shape (which is triangular) and my approximated shape
(which is composed of many rectangles). So it's reasonable to say that
the area is in fact the same. Now how exactly do we add up infinitely
many infinitely small things to get a real number? Uh... Kalid?!?
2. Kalid says:
@Peter: Cool, I'll check it out!
@Matt: Thanks for the comment! One of the hardest parts is getting my
head around the idea of "accurate enough".
Here's how I think about it. In real life, we hit this all the time: A screen
image is a grid of pixels, yet we can see perfectly smooth shapes like
curves, circles, faces, etc. Similarly, inkjet printers spray a matrix of dots
on a paper, but to us it looks like a smooth unbroken image or line.
The key is realizing that the approximation is only an approximation at
that higher level of accuracy -- at the level that we work at, it appears
indistinguishable from the real thing. Calculus helps formalize some of
these ideas with limits (informally, two numbers that have a difference
less than our error margin appear the same to us).
Unfortunately, we don't really talk about this much, and we sometimes
say numbers are equal, and sometimes say they aren't. There's a notion
of infinitely small numbers which makes this clearer, and is used in
physics. That is, you can talk about how infinitely small numbers interact
with each other, and with infinity, to give numbers we can detect. A poor
analogy but it may work: A caveman could probably not conceive of an
individual atom, or the gargantuan Avogadro's number (6 x 10^23), but
when this tiny particle and huge number combine we can get something
we can detect.
The key is writing this idea down in the language of math: numbers that
are too small and too large for us to detect can interact to give us
numbers we can work with.
3. ram says:
Hi Kalid,
your explanations of the underlying concepts of mathematics do bring
the subject at a democratic level, a level on which people communicate,
collaborate and work towards making the subject useful for greater
number of people.
Now coming to the subject, cud i say dat, differentiation is inverse of
integration?
and going by dat if i have to apply differentiation, let's say on the
example of circle, all i have to do is to run a playback, i.e. to peel off all
those tiny rings (or, in other words) thus divide the circle into the tiniest
possible rings.
Once i m done peeling I would measure this ring, to see the result of
differentiation application, which should be 2*pi*r.
BTW, would not I b applying multiplication again, to measure that tiniest
ring, i.e. finding the area of that ring, pi*r^2?
4. Arbie Samong says:
One way I understood the basic integral notation is with my crude
understanding of sets and functional programming. Using the given
example above (speed and time):
There's an implied set of values of time, and we take a piece of it or a
member of that set. That becomes the slice of time. We then apply it to a
function of time that is speed. This results in another set whose members
are results from each function result using the said function given a slice
(or element of implied set of time) as input. Finally, we apply the
integrate operator, or probably a map to the integrate function; or, to
put it simply, use the integrate function on all members of the resulting
set to return the integrated value.
or something like:
map(integrate, (getSpeed(t) | t <= time_slices))
5. bill says:
note to extend idea by Kalid above:
circumference of a circle (a 1d distance) = 2 pi r
area of a circle (a 2d area) = pi r ^ 2
(note the integral /derivative of each other.)
surface area of a sphere = 4 pi r ^ 2
volume of a sphere = 4/3 pi r ^3
try this with squares and cubes -- hint, base it on the shortest distance
from the centre to a side.
how cool is that!
How do you wish the derivative was explained to you? Here's my take.
Psst! The derivative is the heart of calculus, buried inside this definition:
f'(x) = lim (dx -> 0) [f(x + dx) - f(x)] / dx
It's strange, but you can see 10/5 as "I need to travel 10 'infinities' in 5
segments of time. To do this, I travel 2 'infinities' for each unit of time".
Analogy: See division as a rate of motion through a continuum of
points
What's after zero?
We're nearing the chewy, slightly tangy center of the derivative. We need
before-and-after measurements to detect change, but our measurements
could be flawed.
Imagine a shirtless Santa on a treadmill (go on, I'll wait). We're going to
measure his heart rate in a stress test: we attach dozens of heavy, cold
electrodes and get him jogging.
Santa huffs, he puffs, and his heart rate shoots to 190 beats per minute. That
must be his "under stress" heart rate, correct?
Nope. See, the very presence of stern scientists and cold electrodes increased
his heart rate! We measured 190bpm, but who knows what we'd see if the
electrodes weren't there! Of course, if the electrodes weren't there, we
wouldn't have a measurement.
What to do? Well, look at the system:
measurement = actual amount + measurement effect
Ah. After lots of studies, we may find "Oh, each electrode adds 10bpm to the
heartrate". We make the measurement (imperfect guess of 190) and remove
the effect of electrodes ("perfect estimate").
Analogy: Remove the "electrode effect" after making your
measurement
By the way, the "electrode effect" shows up everywhere. Research studies
have the Hawthorne Effect where people change their behavior because they
are being studied. Gee, it seems everyone we scrutinize sticks to their diet!
Understanding the derivative
Armed with these insights, we can see how the derivative models change:
3. We don't know exactly how small "dx" is, and we don't care: get the rate
of motion through the continuum: [f(x + dx) - f(x)] / dx
4. This rate, however small, has some error (our cameras are too slow!).
Predict what happens if the measurement were perfect, if dx wasn't
there.
The magic's in the final step: how do we remove the electrodes? We have two
approaches:
Limits: what happens when dx shrinks to nothingness, beyond any error
margin?
Infinitesimals: What if dx is a tiny number, undetectable in our number
system?
Both are ways to formalize the notion of "How do we throw away dx when it's
not needed?".
My pet peeve: Limits are a modern formalism, they didn't exist in Newton's
time. They help make dx disappear "cleanly". But teaching them before the
derivative is like showing a steering wheel without a car! It's a tool to help the
derivative work, not something to be studied in a vacuum.
An Example: f(x) = x^2
Let's shake loose the cobwebs with an example. How does the function f(x) =
x^2 change as we move through the continuum?
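One way to watch this happen is to measure the rate [f(x + dx) - f(x)] / dx with smaller and smaller dx. Here's a rough Python sketch; the helper name rate and the sample point x = 22 are assumptions picked for illustration, not something from the article's own worked example.

```python
def f(x):
    return x ** 2

def rate(func, x, dx):
    """Rate of motion through the continuum: [f(x + dx) - f(x)] / dx."""
    return (func(x + dx) - func(x)) / dx

x = 22.0
for dx in (1.0, 0.1, 0.001):
    # Algebraically, the measured rate for x^2 is 2x + dx
    print(dx, rate(f, x, dx))
```

For x^2 the measured rate works out to 2x + dx: the 2x is the real behavior, and the trailing dx is the "electrode effect" we throw away once the measurement is made.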
I see the integral as better multiplication, where you can apply a changing
quantity to another.
The derivative is "better division", where you get the speed through the
continuum at every instant. Something like 10/5 = 2 says "you have a constant
speed of 2 through the continuum".
When your speed changes as you go, you need to describe your speed at each
instant. That's the derivative.
If you apply this changing speed to each instant (take the integral of the
derivative), you recreate the original behavior, just like applying the daily stock
market changes to recreate the full price history. But this is a big topic for
another day.
Gotcha: The Many meanings of "Derivative"
"The derivative of x^2 is 2x" means "At every point, we are changing by
a speed of 2x (twice the current x-position)". (General formula for
change)
"The derivative is 44" means "At our current location, our rate of change
is 44." When f(x) = x^2, at x=22 we're changing at 44 (Specific rate of
change).
"The derivative is dx" may refer to the tiny, hypothetical jump to the next
position. Technically, dx is the "differential" but the terms get mixed up.
Sometimes people will say "derivative of x" and mean dx.
Gotcha: Our models may not be perfect
Math is a language, and I want to "read" calculus (not "recite" calculus, i.e. like
we can recite medieval German hymns). I need the message behind the
definitions.
My biggest aha! was realizing the transient role of dx: it makes a
measurement, and is removed to make a perfect model. Limits/infinitesimals
are a formalism, we can't get caught up in them. Newton seemed to do ok
without them.
Armed with these analogies, other math questions become interesting:
How do we measure different sizes of infinity? (In some sense they're all
"infinite", in other senses the range (0,1) is smaller than (0,2))
What are the real rules about making "dx go away"? (How do
infinitesimals and limits really work?)
How do we describe numbers without writing them down? "The next
number after 0" is the beginnings of analysis (which I want to learn).
AK says:
I just wanted to let you know that I really appreciate the effort you put
into this. I only discovered this website a few days ago, and I've been
having a blast reading all those intuitive approaches!!
You should consider writing an elementary and highschool book of
mathematics, as well as teaching on khansacademy
===
Josh: I think a better analogy is this:
Integration is piling all the shards on a scale and reading the total.
Antidifferentiation is putting the shards carefully back together in
exactly the right order and recognizing the plate.
Ogbuka chukwuma says:
I don't understand cumulative frequency so well. Pls help
BASSMAN says:
i need more details on how to solve the partial fractions and integrations.
Asmaul Hoque says:
It is really interesting. I have enjoyed it... and learned a lot. Today I understood
what a derivative is. Actually I was searching for this and you gave it to us. Thank you so
much. Please keep giving us this opportunity to learn math.
John Jordan says:
Khalid,
Long-time lurker, first-time poster. Firstly, just wanted to say congrats on
all your work here, really impressive. This is my favourite maths site on
the web; I see the seeds of an educational revolution here. Reminds me
of the time I got a weighty book "Applying Maths in the Chemical and
Biological Sciences"... I was hoping for an interesting novel, what I got was
almost pure grammar, i.e. I was looking for semantics but all I got was
syntax. Your articles explain the meaning, i.e. utility, of these abstract
notions. Your complex numbers article helped solve the riddle of how
imaginary numbers could be used in the real world, so thanks!
Like the (modified!) analogy for the distinction of integral and antiderivative, which was yet another one of those esoteric relationships that
was never explored in high school; are you going to amend the original
article?
Regards,
John
kalid says:
@Bassman, Ogbuka: I'll take those as suggestions for future topics,
thanks.
@Asmaul: Glad it was helpful!
@John: Thanks for the note, really appreciate it! I hear you, so many
math explanations just focus on the grammar, like the lifeless language
classes that nobody ever seems to learn from (contrasted with learning a
language by actually being immersed in it and speaking it, vs. trying to
crunch through the rules like a computer).
I'm going to update the article right now with the new integral/antiderivative analogy. Thanks again for posting!
Anonymous says:
This came at a pretty good time for me since its publication coincided
with my own autodidactic journey through math! I was fresh into
calc/derivatives when this came and I skimmed through, initially getting
about half of it. Then while walking my dogs today I got deep into
thinking about really understanding derivatives after a few plug and
chug sessions, and I began recalling what you had written (especially
regarding the actual rate+error part) and the superman analogy.
In retrospect it was a good thing I was walking in the barren woods
because the unconscious "OOOOOOOOOOHHH!" of my aha moment was
so loud. My dogs didn't seem to care though, they were busy pooping
and such.
Thank you, thank you, thank you!
Kalid says:
@Anonymous: Awesome, I'm glad the aha! came :). I'm planning on
making some changes to the site to help share and discuss the individual
aha! moments, really appreciate the note!
Sebastian Marquez says:
Khalid,
This is great! Derivatives were always out of focus to me but this is
helping clear things up.
Sebastian
kalid says:
@Sebastian: Thanks, glad it helped :).
just a kid says:
Hey Kalid, another great article!
But I noticed something; couldn't you just, instead of even doing all the
other math, just take the exponent of the original number, multiply the
number in front of it and then minus one from the exponent? if you didn't
get that, here's what I mean: the derivative of x^2=2*1(x)^(2-1), which
equates to 2x. It also works in the reverse of finding the original number
-Wm
kalid says:
@wm: Thanks for the pointer! I'll check it out.
STILL LEARNING says:
Really great stuff. Mathematics is the foundation of all science and
science is the compass to help us navigate the universe. Keep up the
good work. Very much appreciated.
kalid says:
@Still learning: Thanks, really appreciate the encouragement!
Sudar says:
Hi Khalid,
Great article. I have always been fascinated by calculus and always
wanted to decipher the true meaning of derivative. Your article gives me
a great insight. However I would beg you to clarify the following
confusion that has arisen.
We all know that derivative of Y = X^2 is 2x. when you calculate values
of y for x=2 and 3, you get y = 4 and 9 respectively. The change in y
here is 9-4 = 5. However if I substitute x= 2 in the derivative function
dy/dx it gives me 2x = 4. you showed us why this difference exists. It is
because of the dx factor (Shoddy instrument). But the reality is that y
changed by 5 units when x changed from 2 to 3. Are you saying that
dy/dx or derivative is not here to calculate rate of change for such large
changes and if you use it for large changes results are inaccurate. Does
that mean that dy/dx can only be used to calculate very small changes.
Earlier I thought if you want to find how a function f(x) is changing w.r.t x
between 2 values without substituting the values, just calculate the
derivative and substitute x but it seems I was wrong?
Also I didn't understand when you say
"The derivative is 44" means "At our current location, our rate of change is
44."
Change is a relative term. How can there be a change at a current
location. It has always got to be between two locations.
here says:
Heya, I just hopped over to your web-site through StumbleUpon. Not
something I would typically browse, but I enjoyed your thoughts none the
less. Thank you for making some thing worth reading through.
wm tanksley says:
Sudar, I understand your confusion.
introduce limits for their own sake. But this has nothing to do with the
topic of understanding the derivative.
-Wm
Nikhil Panikkar says:
You said that I mentioned infinity and the continuum. I didn't mention
either; the only place I can find those concepts is in the original post.
Yes I was referring to the original post. I just stumbled upon this article
while doing a google search. and assumed that you were its author. Now I
have explored the site, and discovered it was Kalid.
I was thinking about your comment "The derivative is sufficiently
understood as the slope of the line tangent to a curve at a point." I have
some doubts regarding this definition. But I will first wait for your
comment on the application of the derivative to arbitrary curves and how
the limit restricts this applicability.
wm tanksley says:
(Note: I hope the LaTeX below works. I wish there were a preview
mode)
I claimed that using limits and infinitesimals to define the derivative led
to restricting ourselves to functions, while using the geometric definition
of the derivative allowed arbitrary curves rather than only functions.
(There are other advantages; for example, using the geometric definition
allows you to reason about derivatives of curves over arbitrary fields
rather than only the continuum.)
Recall that the geometric definition of the derivative is the slope of the
line tangent to the curve at any point on the curve. First let me
distinguish a function from a curve. Every function is a curve, but a
function has at most one value per input, while a curve can have any
number of values. We can consider the subset of general curves called
the algebraic curves, consisting of the Cartesian graphs of the
polynomials of the appropriate number of variables for the dimension
we're examining; analytic curves are also amenable to this analysis, or
curves on other coordinate systems.
And a simple example of that is the classical unit circle. In order to find
the derivative of the unit circle using limits, one has to split the circle into
upper and lower halves. If one uses the geometric definition, however,
there is only one curve, and computing a formula for its tangent line is
simple algebra. The result is a formula for the tangent line to the circle at
every point on the plane (sometimes called the first order
semiderivative), and it's easy to see how to extract the slope of that
line.
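The algebra one performs in order to extract this is to evaluate the curve
at [expression missing], where r and s are variables representing arbitrary
[term missing]. Evaluating at [expression missing], we get the translated
[expression missing], which expands [expression missing]. The Taylor expansion
is [expression missing].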
there are proofs that the geometric definition yields both exact solutions
and a simple method for deriving approximations.
I also explained one obvious way in which the geometric definition is
superior, in that it allows derivatives of curves that aren't simple
functions. But I didn't explain in what aspects the infinitesimal definition
of the derivative is inadequate. Notice that I'm not trying to say that it's
bad or wrong, or that it's ALWAYS inadequate; rather, I'm pointing out
some specific problems that hinder certain uses. Also notice that I'm not
complaining about limits; I'm talking specifically about the use of
infinitesimals in the definition of the derivative. Limits may still be useful
(for example, I mentioned piecewise smooth functions, whose derivatives
require limits).
The most interesting problem is that infinitesimals require the use of the
continuum, and not all numbers are embedded in a continuum. The
rationals are very useful for most purposes; and floating point
computation is a use of a special type of rational number. There are other
infinite fields as well, and obviously the finite fields cannot be
approached with limits at all (but are quite easily approached with
geometry). And yes, the definition of algebraic curve applies over any
field, finite or infinite, so this method will find its derivative. Complex
numbers are reachable as well -- in fact, you can probably see that the
equation I derived for the tangent line has values over the entire plane,
not just on the unit circle, and in fact those values are geometrically
meaningful.
There are more interesting results as well. The tangent line is interesting
and useful, but there are also tangent conics, cubics, and so on.
-Wm
Nikhil Panikkar says:
Thank you Tanksley, for your explanation. But I have to admit, there is a
lot in the above explanation that I am not familiar with (like the first order
semiderivative), so I'll have to go through it step by step. I hope you'll
stay on the site to clarify my doubts!
In the mean time, can we discuss your earlier comment "The derivative is
sufficiently understood as the slope of the line tangent to a curve at a
point."?
Let's say we want to draw a tangent to a curve. This raises the question
"what is a tangent?"
1. Let's say the tangent at a point is a line that best approximates the curve
at the point. This raises the question "what is meant by best
approximation?"
2. A simplified answer to this question would be that it should have the
same value at the point as the curve.
This is what one would expect given the traditional definition of the
tangent, i.e. the tangent line to a plane curve at a given point is the
straight line that just touches the curve at that point.
But if you look at the equation to the tangent line derived in my last post,
y1 = 3a^2x + a^3,
at x = a, y = 4 a ^ 3. The point (a, 4 a ^ 3), does not lie on the curve y =
x^3.
So is there something wrong with the definition, or is there something
I've missed?
wm tanksley says:
I'm really sorry, but I'm just not able to get the time to reply this
weekend. You're on the right track in general (in fact, I'm quite
impressed, given the tiny bit of explanation I've been able to give); but
there's more to do.
If you don't mind, I'm going to point you to a YouTube video where a fairly
complex curve is analyzed according to these rules.
http://www.youtube.com/watch?v=i9o0OfvQYmA
Unfortunately, he uses some unusual terms while doing this -- for
example, he denotes the curve using a polynumber, which he writes as
an array of integers. You may be able to figure how a polynumber is like a
polynomial without explicitly written variables; if you need a better
explanation the previous videos in his series will explain completely. See
the entire playlist at:
http://www.youtube.com/playlist?list=PL5A714C94D40392AB&feature=plcp
-Wm
Nikhil Panikkar says:
Ok Tanksley, I'll check out the videos, and then I'll post what I've
understood. But since I am unfamiliar with a lot of what is being
discussed here, I'll need your confirmation to be sure what I understood
is correct. I'll wait for your comment.
And thank you for pointing me to these videos. It's a new approach for
me -- integrating algebra, geometry and calculus. The only hindrance is
my own less rigorous math background. So I'll have to go through it step
by step. I hope you will stay on the site to comment on my progress.
wm tanksley says:
No problem, I'll be here.
feeds/email. I might find a way to use MathJax on the website and fall
back to the images on the other places.
@Gaurav: Awesome, glad you're enjoying the site :).
ansh choubey says:
Aha moment came near reading your fab articles must write a high
school book soon I wanna read more and more. ,. Plzzz do write on
physics too.
kalid says:
Hi Ansh, glad you enjoyed it!
Boom
Ademilson says:
Congratulations!!! It's really a very intuitive explanation!
Analogy is the key! As above so below
Namaste!
Math enthusiast: Northwestern Student says:
This was absolutely great. Well done. What an excellent explanation of
calculus -- also I would like to add this.
I was doing a lot of research and thinking and came to the conclusion
that like you mentioned the integral in some respects is not directly
related to differentiation. More precisely, the definite integral is unrelated
to differentiation, and anti-differentiation is the imperfect
reversal (opposite operation) of differentiation (very intuitive). The reason
why is simple: the definite integral computes the signed area under a
curve and the change in position of the original function (i.e. (dx/dt) times
(dt) equals dx) which is completely useless if you are trying to find the
original function -- the antiderivative, however, is useful for that as long
as the constant is defined. The indefinite integral is virtually the same
as an anti-derivative except its syntax actually means nothing in
nature -- have you ever wondered why there is a dx (or the appropriate
differential) at the end of the integrand even though there are no bounds
of integration??? dx would represent an infinitesimally small width but
since there are no bounds of integration the dx means nothing -- it's a
dummy variable as some would say
great stuff
Adrienne says:
I LOVE THIS. Helps me appreciate math so much more. You're awesome
Kalid.
elisen says:
thanks
elisen says:
Dear Khalid,
I just wanted to say thank you again and the fact that you are very clever
and how your dream lies in helping other people understand is great.
I haven't been listening to my maths classes recently and therefore I
need to do a lot of work.
Your website had made me more confident.
indeed the idea of a rate of change at a point is very confusing.
I want to ask whether your passion for maths or any learning stems from
your curiousity, whether you have read history on your maths.
and i hope you check this out.
this guy is quite smart and uses analogies too.
i want to be able to make analogies myself so i can understand and
relate ideas so i can apply them to life and make use of them. because all
learning is precious.
because i am a person who needs to understand,
therefore your ahas help (my mother on the other hand is one of those who
remember and don't question haha, but indeed there are different people,
and their way of learning and how their brain functions, their behaviour,
their attitude to learning approaches is different)
http://www.scotthyoung.com/blog/2007/03/25/how-to-ace-your-finals-without-studying/
i want to thank you again and how you have left a Question part shows
your dedication.
Yours Sincerely
Elisen
mahendra says:
Voila outstanding ,kudos what a lucid explanation ,keep the great work
flowing
cheers
kalid says:
@mahendra: Thanks!
@Elisen: Really appreciate the note, thank you. (Scott is a friend and I
really like how he breaks down his methods!).
My passion for math (or learning in general) came when I realized how
much simpler an idea could be if we looked at it the right way. Something
which was once confusion becomes simple with the right approach (think
about how difficult multiplication is with Roman numerals, but how easy
it is with decimal numbers). I had this belief that any idea could be made
simple, and it's what keeps me going. If something seems difficult, it's ok
-- it just means I haven't found the simple version of it yet.
Really glad the site has been helping :).
The jumble of rules for taking derivatives never truly clicked for me. The
addition rule, product rule, quotient rule -- how do they fit together? What are
we even trying to do?
Here's my take on derivatives:
The default calculus explanation writes f(x) = x^2 and shoves a graph in
your face. Does this really help our intuition?
Not for me. Graphs squash input and output into a single curve, and hide the
machinery that turns one into the other. But the derivative rules are about the
machinery, so let's see it!
I visualize a function as the process input(x) => f => output(y).
It's not just me. Check out this incredible, mechanical targeting computer
(beginning of YouTube series).
The machine computes functions like addition and multiplication with gears --
you can see the mechanics unfolding!
The
df vs df/dx
Sometimes we use df, other times df/dx -- what gives? (This confused me for a
while)
df is a general notion of "however much f changed"
df/dx is a specific notion of "however much f changed, in terms of how
much x changed"
The generic df helps us see the overall behavior.
An analogy: Imagine you're driving cross-country and want to measure the fuel
efficiency of your car. You'd measure the distance traveled, check your tank to
see how much gas you used, and finally do the division to compute miles per
gallon. You measured distance and gasoline separately -- you didn't jump into
the gas tank to get the rate on the go!
In calculus, sometimes we want to think about the actual change, not the ratio.
Working at the "df" level gives us room to think about how the function
wiggles overall. We can eventually scale it down in terms of a specific input.
And we'll do that now. The addition rule above can be written, on a "per dx"
basis, as:
d(f + g)/dx = df/dx + dg/dx
Next puzzle: suppose our system multiplies parts f and g. How does it
behave?
Hrm, tricky -- the parts are interacting more closely. But the strategy is the
same: see how each part contributes from its own point of view, and combine
them:
total change in h = f's contribution (from f's point of view) + g's
contribution (from g's point of view)
Check out this diagram:
Now, like our miles per gallon example, we divide by dx to write this in terms
of how much x changed:
dh/dx = f * dg/dx + g * df/dx
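To see the two contributions separately, here's a small Python sketch of the product rule, dh = f * dg + g * df, checked against the actual change in h = f * g. The sample functions f(x) = x^2 and g(x) = 3x + 1 and the helper name product_wiggle are assumptions chosen for illustration.

```python
def f(x):
    return x ** 2

def g(x):
    return 3 * x + 1

def product_wiggle(x, dx):
    """Compare the real change in h = f * g with each part's contribution."""
    df = f(x + dx) - f(x)                  # how much f wiggled
    dg = g(x + dx) - g(x)                  # how much g wiggled
    actual = f(x + dx) * g(x + dx) - f(x) * g(x)
    predicted = f(x) * dg + g(x) * df      # g's contribution + f's contribution
    leftover = df * dg                     # tiny piece where both wiggles overlap
    return actual, predicted, leftover

actual, predicted, leftover = product_wiggle(3.0, 0.01)
print(actual, predicted, leftover)
```

The actual change equals the prediction plus the leftover df * dg term. That leftover shrinks much faster than dx does, which is why it drops out when we write things on a "per dx" basis.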
The chain rule lets us "zoom into" a function and see how an initial change (x)
can affect the final result down the line (g).
Interpretation 1: Convert the rates
A common interpretation is to multiply the rates:
dg/dx = dg/df * df/dx
If your "miles per second" rate changes, multiply by the conversion factor to
get the new "miles per hour". The second doesn't know about the hour directly
-- it goes through the second => minute conversion.
Similarly, g doesn't know about x directly, only f. Function g knows it should
scale its input by dg/df to get the output. The initial rate (df/dx) gets modified
as it moves up the chain.
Interpretation 2: Convert the wiggle
I prefer to see the chain rule on the per-wiggle basis:
x wiggles by dx, so
f wiggles by df, so
g wiggles by dg
Cool. But how are they actually related? Oh yeah, the derivative! (It's the
output wiggle per input wiggle):
df = (df/dx) * dx
Remember, the derivative of f (df/dx) is how much to scale the initial wiggle.
And the same happens to g:
dg = (dg/df) * df
It will scale whatever wiggle comes along its input lever (f) by dg/df. If we write
the df wiggle in terms of dx:
dg = (dg/df) * (df/dx) * dx
We have another version of the chain rule: dx starts the chain, which results in
some final result dg. If we want the final wiggle in terms of dx, divide both
sides by dx:
dg/dx = (dg/df) * (df/dx)
The chain rule isn't just factor-label unit cancellation -- it's the propagation of
a wiggle, which gets adjusted at each step.
The chain rule works for several variables (a depends on b depends on c), just
propagate the wiggle as you go.
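Here's a rough Python sketch of a wiggle moving up the chain x => f => g. The two "machines" (f squares its input, g cubes whatever f hands it) are assumptions made up for this example, not functions from the article.

```python
def f(x):
    return x ** 2            # inner machine: only sees x

def g(f_value):
    return f_value ** 3      # outer machine: only sees f, never x

x = 2.0
dx = 1e-6                    # the initial wiggle

df = f(x + dx) - f(x)        # x wiggles by dx, so f wiggles by df
dg = g(f(x + dx)) - g(f(x))  # f wiggles by df, so g wiggles by dg

df_dx = df / dx              # how f scales its input wiggle (about 2x)
dg_df = dg / df              # how g scales its input wiggle (about 3f^2)
dg_dx = dg / dx              # the whole chain, measured end to end

print(df_dx, dg_df, dg_dx, dg_df * df_dx)
```

The end-to-end rate dg/dx matches the product of the two scaling factors, which is the chain rule seen as a wiggle being adjusted at each step.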
Try to imagine "zooming into" different variables' point of view. Starting from
dx and looking up, you see the entire chain of transformations needed before
the impulse reaches g.
Chain Rule: Example Time
What's the derivative of x^4? 4x^3? Great. You brought down the exponent
and subtracted one. Now explain why!
Hrm. There's a few approaches, but here's my new favorite: x^4 is really x * x
* x * x. It's the multiplication of 4 "independent" variables. Each x doesn't
know about the others, it might as well be x * u * v * w.
Now think about the first x's point of view:
It changes from x to x + dx
The change in the overall function is [(x + dx) - x][u * v * w] = dx[u * v *
w]
The change on a per dx basis is [u * v * w]
Similarly,
From u's point of view, it changes by du. It contributes (du/dx)*[x * v * w]
on a per dx basis
v contributes (dv/dx) * [x * u * w]
w contributes (dw/dx) * [x * u * v]
The curtain is unveiled: x, u, v, and w are the same! The "point of view"
conversion factor is 1 (du/dx = dv/dx = dw/dx = dx/dx = 1), and the total
change is
[x * x * x] + [x * x * x] + [x * x * x] + [x * x * x] = 4x^3
In a sentence: the derivative of x^4 is 4x^3 because x^4 has four identical
points of view which are being combined. Booyeah!
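As a quick sanity check on the "four points of view" idea, here's a Python sketch that wiggles each of x, u, v, w separately and adds up the contributions. The helper names product and contribution, and the sample value x = 2, are assumptions for illustration only.

```python
def product(x, u, v, w):
    return x * u * v * w

def contribution(args, index, dx):
    """Change in the product, per dx, when only one input wiggles."""
    wiggled = list(args)
    wiggled[index] += dx
    return (product(*wiggled) - product(*args)) / dx

x = 2.0
dx = 1e-6
args = (x, x, x, x)      # x, u, v, w all happen to be the same value

# One contribution per point of view
parts = [contribution(args, i, dx) for i in range(4)]
total = sum(parts)

print(parts)             # each part should be close to x^3
print(total, 4 * x ** 3)
```

Each point of view contributes roughly x^3 on its own, and the four of them together give 4x^3.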
Take A Breather
I hope you're seeing the derivative in a new light: we have a system of parts,
we wiggle our input and see how the whole thing moves. It's about combining
perspectives: what does each part add to the whole?
In the follow-up article, we'll look at even more powerful rules (exponents,
quotients, and friends). Happy math.
1. Gourav says:
Awesome article. The description of the product rule really changed how I
think about them.
Out of curiosity, how do you think your idea of the power rule extends to
negative, fractional and irrational powers? It's a bit harder to think about
since you can't just split them into linear parts.
2. kalid says:
Hi Gourav, thanks for the note. Great question about the negative,
fractional and irrational powers. To follow the analogy, we could use the
chain rule; suppose we have f(x) = x^-3. See x^-3 as shorthand for
1/x^3. We can do:
d/dx x^-3 = d/dx 1/x^3 = d/dx 1/u = -1/u^2 * du/dx
du/dx can be understood intuitively (3x^2), and we divide it by (x^3)^2.
Writing n for the exponent, we can see the powers of x fight it out as
(n - 1) - 2n = -n - 1 [The (n - 1) power is from du/dx, and -2n is from
1/u^2. With n = 3, we get -3 - 1 = -4 as the power]. Notice how we still
brought down the 3 (which was in du/dx). Hope this part made sense.
find out the change as 2:4 or 100:200 finally. What is the rule of change
in calculus ? If not, will calculus be able to find an accurate answer every
time ?
16. kalid says:
@Matthias: Glad it helped.
@JJ: Awesome, glad you're getting a head start! You got it, the math isn't
much more than algebra, it's just seeing how to put the variables
together.
@Vishwas: Calculus is made for instantaneous rates of change, i.e. the
rate of change at a certain moment in time. As you move away from that
moment, the rate of change varies and is no longer accurate [and you
use integration to add-up these constantly-shifting moments].
17. Finne Gillan says:
Still does not make any sense unfortunately including the mechanical
computer video and wiggles. Perhaps it's hopeless and I will never
understand calculus despite wanting to. I was lost earlier. If f(x) = x^2
and input is 10. Wiggle of 0.1 gives wiggle of 2.01 and wiggle of 0.01
gives 0.2001. How do these relate to the derivative?
18. Silrak says:
Hi, am working through the tutorials from naught, to gather an
understanding of calculus and within 4 weeks when my course will start.
Is there anyone who might help me excel more than is possible on my
own via skype or phone? Based in Melb, 24.01.14.
19. kalid says:
Hi Silrak, there's a full series on calculus
here: http://betterexplained.com/calculus/ which might help.
20. raju says:
If I haven't confused you yet, maybe this will throw you off guard (I jest, I
really do want you to understand)!
Here's another reason that little square ( df * dg ) vanishes but the slivers
remain. Let's use an analogy that all of us understand so well on an
intuitive level: Boolean Algebra! (Bang head against wall now)
+ means OR
* means AND
P(a) means probability event a happens
P(a) + P(b) means probability of a or b happening
P(a) * P(b) means probability of a happening AND THEN b happening
That little square is df * dg; it's kind of like "take a little chunk of f, then a
little chunk of g". It's like licking the pepperoni, AND THEN a little flea
jumps on your tongue and licks a little grease from your tongue. You
don't notice it compared to the large little bit of grease you got, which
itself is small compared to the pie. Just df on its own is you licking the
pepperoni. Just dg on its own is the flea licking your tongue. The sum df
+ dg is either you licking the pepperoni or the flea licking your tongue.
But df * dg is licking the pepperoni AND THEN getting licked by the flea; it
means comparing the flea's very little bit of grease to the whole pie.
Hope this helps, or at least that you got the chance to laugh at Calculus
for a little while.
Excelsior,
Eric V
22. Eric V says:
Sorry if the formatting is hard to follow. Can't use tab so I used a bunch of
spaces instead to keep my = lined up under each other. The Posting
Gnome ate my spaces. He also messed up my comment markers ** at
the end of lines.
23. Melissa says:
AAAHHH LIGHTBULB!!!
24.
This video helped me get the Chain Rule by working out d/dx(sin(x^2))
compared to d/dx(sin(x)) in a really visual and intuitive
way. http://youtu.be/bcGOZLL1v4Y
25. Bonnie says:
This is definitely written for people who have already taken calculus and
not understood it, versus someone with almost no exposure who is trying
to learn. I don't even think reading over this would be helpful. You
assume I already know all this stuff! Do you know of any good place to
START learning calculus? Maybe I can come back to this site after I
memorize all those rules, if I'm still confused.
26. kalid says:
Hi Bonnie, yep, this lesson is definitely geared for someone in the tail end
of a calculus class. If you're just starting out, check out:
http://betterexplained.com/calculus/lesson-1
Hope that helps!
Ah, the quotient rule: the one nobody remembers. Oh, maybe you
memorized it with a song like "Low dee high, high dee low", but that's not
understanding!
It's time to visualize the division rule (who says "quotient" in real life?). The
key is to see division as a type of multiplication:
f / g = f * (1 / g)
We have a rectangle, we have area, but the sides are f and 1/g. Input x
changes off on the side (by dx), so f and g change (by df and dg), but how
does 1/g behave?
Chain rule to the rescue! We can wrap up 1/g into a nice, clean variable and
then zoom in to see that yes, it has a division inside.
So let's pretend 1/g is a separate function, m. Inside function m is a division,
but ignore that for a minute. We just want to combine two perspectives:
f changes by df, contributing area df * m = df * (1 / g)
m changes by dm, contributing area dm * f = ?
We turned m into 1/g easily. Fine. But what is dm (how much 1/g changed) in
terms of dg (how much g changed)?
We want the difference between neighboring values of 1/g: 1/g and 1/(g + dg).
For example:
What's the difference between 1/4 and 1/3? 1/12
How about 1/5 and 1/4? 1/20
How about 1/6 and 1/5? 1/30
How does this work? We get the common denominator: for 1/3 and 1/4, it's
12. And the difference between neighbors (like 1/3 and 1/4) will be 1 /
common denominator, aka 1 / (x * (x + 1)). See if you can work out why!
(This is useful as a general fact: the change from 1/100 to 1/101 is about one
ten-thousandth.)
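If you'd like to check the neighbor pattern without doing the fractions by hand, here's a small sketch using exact fractions. The function name neighbor_gap is just a label for this example.

```python
from fractions import Fraction

def neighbor_gap(x):
    """Exact difference between 1/x and 1/(x + 1)."""
    return Fraction(1, x) - Fraction(1, x + 1)

for x in (3, 4, 5, 100):
    gap = neighbor_gap(x)
    # The gap always matches 1 / (x * (x + 1)).
    assert gap == Fraction(1, x * (x + 1))
    print(f"1/{x} - 1/{x + 1} = {gap}")
```

For x = 100 the exact gap is 1/10100, which is why "one ten-thousandth" is a close estimate rather than an exact value.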
The difference is negative, because the new value (1/4) is smaller than the
original (1/3). So what's the actual change?
g changes by dg, so 1/g becomes 1/(g + dg)
The instant rate of change is -1/g^2 [as we saw earlier]
The total change = dg * rate, or dg * (-1/g^2)
A few gut checks:
Why is the derivative negative? As dg increases, the denominator gets
larger, the total value gets smaller, so we're actually shrinking (1/3 to 1/4
is a shrink of 1/12).
And get:
(f/g)' = (g * df/dx - f * dg/dx) / g^2
and it burns! It burns! This simplification hides how the division rule is just a
variation of the product rule. Remember, there are still two slivers of area to
combine:
The f (numerator) sliver grows as expected
It's the chain rule again: we want to zoom into u, get to x, and see how a
wiggle of dx changes the whole system:
x changes by dx
u changes by du/dx, or d(x^2)/dx = 2x
How does e^u change?
Now remember, e^u doesn't know we want changes from x's point of view. e
only knows its derivative is 100% of the current amount, as seen from the
exponent u:
d(e^u)/du = e^u
Any exponent (a^b) is really just e in different clothing: [e^ln(a)]^b. We're just
asking for the derivative of e^foo, where foo = ln(x) * x.
But wait! Since we want the derivative in terms of x, not foo, we need to
jump into x's point of view and multiply by d(foo)/dx:
d/dx e^foo = e^foo * d(foo)/dx = e^[ln(x)*x] * (ln(x) + 1) = x^x * (ln(x) + 1)
We wrote e^[ln(x)*x] in its original notation, x^x. Yay! The intuition was
"rewrite in terms of e and follow the chain rule".
And from v's point of view, u is just some static base (if u=5, we have 5^v).
We rewrite into base e, and we get
d/dv u^v = d/dv e^[ln(u)*v] = ln(u) * u^v
And the reveal: u = v = x! There's no conversion factor for this new viewpoint
(du/dx = dv/dx = dx/dx = 1), and we have:
d/dx x^x = x * x^(x-1) + ln(x) * x^x = x^x * (ln(x) + 1)
It's the same as before! I was pretty excited to approach x^x from a few
different angles.
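Working through the chain rule for x^x gives x^x * (ln(x) + 1). Here's a quick, hedged sanity check that compares that formula against a plain numeric slope. The helper names are invented for this sketch, and the sample points are arbitrary.

```python
import math

def x_to_the_x(x):
    return x ** x

def numeric_derivative(f, x, dx=1e-6):
    """Central-difference estimate of df/dx (an approximation, not exact)."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

def claimed_derivative(x):
    """The chain-rule result: x^x * (ln(x) + 1)."""
    return x ** x * (math.log(x) + 1)

for x in (0.5, 1.0, 2.0, 3.0):
    # The two columns should agree to several decimal places.
    print(x, numeric_derivative(x_to_the_x, x), claimed_derivative(x))
```

This only works for x > 0, since ln(x) isn't defined otherwise.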
By the way, use Wolfram Alpha (like so) to check your work on derivatives
(click show steps).
Question: If u were more complex, where would we use du/dx?
Imagine u was a more complex function like u=x^2 + 3: where would we
multiply by du/dx?
Let's think about it: du/dx only comes into play from u's point of view (when v
is changing, u is a static value, and it doesn't matter that u can be further
broken down in terms of x). u's contribution is
v * u^(v-1) * du/dx
We're multiplying by the du/dx conversion factor to get things from x's point
of view. Similarly, if v were more complex, we'd have a dv/dx term when
computing v's point of view.
Look what happened: we figured out the generic d/du and converted it into a
more specific d/dx when needed.
It's Easier With Infinitesimals
Separating dy from dx in dy/dx is "against the rules" of limits, but works great
with infinitesimals. You can figure out the derivative rules really quickly:
Product rule:
(f + df)(g + dg) - fg = f * dg + g * df + df * dg
We set df * dg to zero when jumping out of the infinitesimal world and back
to our regular number system.
Think in terms of "How much did g change? How much did f change?" and
derivatives snap into place much easier. Divide through by dx at the end.
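To see why dropping df * dg is reasonable, here's a small numeric sketch. The functions f and g below are arbitrary examples picked for illustration, and pieces is a made-up helper. It splits the change in f * g into the two slivers and the corner, then divides each by dx.

```python
def f(x):
    return x ** 2

def g(x):
    return 3 * x + 1

def pieces(x, dx):
    """Split the change in f*g into the two slivers and the corner square."""
    df = f(x + dx) - f(x)
    dg = g(x + dx) - g(x)
    slivers = f(x) * dg + g(x) * df
    corner = df * dg
    total = f(x + dx) * g(x + dx) - f(x) * g(x)
    return total, slivers, corner

for dx in (0.1, 0.01, 0.001):
    total, slivers, corner = pieces(2.0, dx)
    # Per-dx view: the slivers settle near 40, the corner heads toward 0.
    print(dx, total / dx, slivers / dx, corner / dx)
```

On a "per dx" basis the slivers approach f * dg/dx + g * df/dx (40 at x = 2 for these examples), while the corner's share keeps shrinking along with dx.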
Summary: See the Machine
The derivative of f can be seen from x's point of view (how does f change with
x?) or y's point of view (how does f change with y?). It's the same idea: we
have two "independent" perspectives that we combine for the overall behavior
(it's like combining the points of view of two Solipsists, who think they're the
only "real" people in the universe).
If x and y depend on the same variable (like t, time), we can write the
following:
df/dt = (df/dx) * (dx/dt) + (df/dy) * (dy/dt)
It's a bit of the chain rule: we're combining two perspectives, and for each
perspective, we dive into its root cause (time).
If x and y are otherwise independent, we represent the derivative along each
axis in a vector:
(df/dx, df/dy)
This is the gradient, a way to represent "From this point, if you travel in the x
or y direction, here's how you'll change". We combined our 1-dimensional
points of view to get an understanding of the entire 2d system. Whoa.
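Here's a hedged sketch of both ideas with numbers. Everything in it is an assumption made for illustration: f(x, y) = x * y, the paths x(t) = cos(t) and y(t) = t^2, and the helper names. It estimates the gradient one axis at a time, then checks that combining the two perspectives matches the direct change over time.

```python
import math

def f(x, y):
    return x * y  # hypothetical example function

def partial(func, point, index, d=1e-6):
    """Central-difference rate of change along one axis, other inputs held fixed."""
    lo, hi = list(point), list(point)
    lo[index] -= d
    hi[index] += d
    return (func(*hi) - func(*lo)) / (2 * d)

def gradient(func, point):
    """One rate of change per axis, collected into a tuple."""
    return tuple(partial(func, point, i) for i in range(len(point)))

def x_of_t(t):
    return math.cos(t)

def y_of_t(t):
    return t ** 2

def slope(func, t, d=1e-6):
    return (func(t + d) - func(t - d)) / (2 * d)

t = 1.0
point = (x_of_t(t), y_of_t(t))
df_dx, df_dy = gradient(f, point)

# Combine the two perspectives, each traced back to its root cause (time).
chain = df_dx * slope(x_of_t, t) + df_dy * slope(y_of_t, t)
direct = slope(lambda s: f(x_of_t(s), y_of_t(s)), t)

print((df_dx, df_dy))
print(chain, direct)  # these two should agree closely
```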
1. Joe says:
When will Math curriculums begin combining concepts in meaningful
ways like this? Calculus classes like to split Power Rule, Quotient Rule,
and Chain Rule into discrete sections, when really they're consequences
of the same basic idea. Perhaps it's less labor-intensive teaching distinct
formulas to be memorized, but it's just another reason people hear
"Calculus" and immediately glaze over.
And while I'm lamenting, your mention of infinitesimals brings up another
sore spot of mine. A Calc TA told me how separating dy/dx is "against
the rules", as you say, and I took it to heart. Imagine poor, confused me a
couple semesters later in DiffEq: "I thought this was against the rules!"
The limit-based approach to teaching Calculus needs some serious
revision, particularly for non-mathematicians moving into practical fields.
2. kalid says:
@Joe: I hear you; we slice and dice concepts and miss the cohesive
whole. All the calculus rules are just examples of how different subparts
can contribute to the whole, but I'm only seeing that now, 10+ years
after high school. Ugh.
And yeah, there's so much "don't do this, I don't know why, but don't!"
in math. Why is it against the rules? What are the rules? Limits are a
seatbelt introduced to address theoretical concerns many, many years
after Calculus was put into use. Learning about seatbelts is fine, but don't
dive into them before you explain what a car [i.e., calculus] is!
1. Jackson says:
Thank you for the time you've put into these articles; they've helped me a
lot and I'm glad to know there are people who care about intuition and
share it, but I'm confused about your intuition of the natural log. Why is
the derivative always predicting the next increment by one? Why not .5?
Shouldn't it be infinitesimally small because it is using the input of a
naturally growing function?
2. kalid says:
Thanks Jackson, great question. Intuitively, think about taking a single
step forward, which is 1*dx. Another way of seeing it: when taking the
derivative, we split our continuous function into discrete steps (a single
dx wide at each step) and see our rate of change when we increment by
the next dx.
An analogy: we represent a photo with individual pixels (dx) and step
through one pixel at a time. The pixels are chosen at an infinitely small
retina resolution where we don't notice them at the macro scale.
(There's more on limits later in this series.)
3. Jackson says:
Sorry for the inconvenience, but I'm confused how you got 1/x. Wouldn't
the derivative be dx/x because dx would be the change and x would be
the current value as dx approaches 0? I'm just confused why 1=dx instead
of approaching 0.
4. kalid says:
No worries, great question, I realize it can be unclear. I start with
scenarios where dx = 1 (which is a GIANT step) to estimate results in
my head. Then, I can set dx = 0 (taking the limit) to get an exact
prediction.
Let's say I want the derivative of x^2. I imagine going from 10^2 to
11^2 (we jumped from x=10 to x=11, so dx=1). The difference is 21, or
2x + dx (20 + 1). I can then set dx = 0 and get the exact answer of 2x.
(If there was no gap between x and the next value, the derivative would
be 2x.)
The natural log is harder to compute: it's the time e^x needs to grow
from 1 to x. How does it change?
Imagine going from 10 to 11 (again, dx=1). Here, we're at 10 and we
grow exponentially up to 11. Since e^x assumes we're growing at 100%
of our current value, it takes 1/10 of a unit time to get to 11. (10 +
(1/10)*10 = 11).
Now, this isn't *quite* accurate because as we're going to 11, we're
getting faster. I.e., when we're at 10.5 we're growing at 10.5 units per
unit time, not the 10 we expected. Removing the imaginary dx fixes this
(we assume there is no midpoint between x=10 and the next value, so it
really is a perfect 1/x amount of time we wait).
Calculus examples are boring. "Hey kids! Ever wonder about the distance,
velocity, and acceleration of a moving particle? No? Well you're locked in here
for 50 minutes!"
I love physics, but it's not the best lead-in. It makes us wait till science class
(9th grade?) and worse, it implies calculus is "math for science class". Couldn't
we introduce the themes to 5th graders, and relate it to everyday life?
I think so. So here's the goal:
Use money, not physics, to introduce calculus concepts
Explore how patterns relate (bank account to salary; salary to raises)
Use our intuition to explore potential issues (can we keep drilling into
patterns?)
Strap on your math helmet, time to dive in.
Money money money
Ack. Clearly, not much happened -- Joe isn't earning anything. And what if you
see this?
Easy enough: Joe's making some money. And how much? With a quick
subtraction, we can figure out his weekly paycheck. Turns out Joe is making a
steady $100/week.
Key idea: If I know your bank account, I know your salary
The bank account is dependent on the salary -- it changes because of the
weekly salary.
Raise the roof
Let's go deeper: knowing the salary, what else can we figure out? Well, the
salary is another pattern to analyze -- we can see if it changes! That is, we can
tell if Joe's salary is changing week by week (is he getting a raise?).
The process:
Look at Joe's weekly bank account
Take the difference in bank account to get the weekly salary
Take the difference in salary to get the weekly raise (if any)
In the first example ($100/week), it's clear there's no raise (sorry, Joe). The
main idea is to "take the difference" to analyze the first pattern (bank account
to salary) and "take the difference again" to find yet another pattern (salary to
raise).
Working backwards
We just went "down", from bank account to salary. Does it work the other way:
knowing the salary, can I predict the bank account?
You're hesitating, I can tell. Yes, knowing Joe gets $100/week is nice. But...
don't we need to know the starting account balance?
Yes! The changes to his account (salary) are not enough -- where did it start? For
simplicity (i.e., what you see in homework problems) we often assume Joe
starts with $0. But, if you are actually making a prediction, you want to know
the initial conditions (the "+ C").
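Here's a minimal sketch of "working backwards", assuming we're handed a list of weekly salaries. The function name bank_account and the $500 starting balance are made up for illustration. Notice how the same salary history gives different accounts depending on where Joe started (the "+ C").

```python
from itertools import accumulate

def bank_account(weekly_salaries, starting_balance=0):
    """Running balance after each week: starting balance plus all salaries so far."""
    running = accumulate(weekly_salaries, initial=starting_balance)
    return list(running)[1:]  # drop the week-0 entry, keep weeks 1..n

salaries = [100, 100, 100, 100]
print(bank_account(salaries))                        # [100, 200, 300, 400]
print(bank_account(salaries, starting_balance=500))  # [600, 700, 800, 900]
```

(The `initial` argument needs Python 3.8 or later.)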
A More Complex Pattern
Let's say Joe's account grows like this: 100, 300, 600, 1000, 1500...
Interesting -- Joe's income is changing each week. We do another week-by-week difference and get this:
And yep, Joe's getting a steady raise of $100/week. Let's get wild and chart
them on the same graph:
One way to think about it: Joe gets a raise each week, which changes his
salary, which changes his bank account. As the raises continue to appear, his
salary continues to increase and his bank account rises. You can almost think
of the raise "pushing up" the salary, which "pushes up" the bank account.
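The "take the difference, then take it again" process can be sketched in a few lines. This assumes the account and the salary both start from 0 before week 1 (the function name differences and that starting value are choices made for this example).

```python
def differences(values, start=0):
    """Week-by-week change; the first week is compared against `start`."""
    previous = [start] + list(values[:-1])
    return [now - before for now, before in zip(values, previous)]

account = [100, 300, 600, 1000, 1500]
salary = differences(account)   # bank account -> salary
raises = differences(salary)    # salary -> raise

print(salary)  # [100, 200, 300, 400, 500]
print(raises)  # [100, 100, 100, 100, 100]
```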
So... Where's the Calculus?
What's the formula for Joe's bank account for any week? Well, it's the sum of
his salaries up to that point:
100 + 200 + 300 + 400... = 100 * n * (n + 1)/2
The formula for adding up a series of numbers (1 + 2 + 3 + 4...) is very close
to n^2/2, and gets closer as the number of steps increases.
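A quick check of both claims, as a sketch: add up the salaries the slow way, compare against 100 * n * (n + 1)/2, and see how the ratio to the smooth 100 * n^2/2 behaves as n grows. The name account_after is invented for this example.

```python
def account_after(n, raise_per_week=100):
    """Add up the weekly salaries 100, 200, ..., 100*n the slow way."""
    return sum(raise_per_week * week for week in range(1, n + 1))

for n in (5, 10, 100, 1000):
    exact = account_after(n)
    formula = 100 * n * (n + 1) // 2
    smooth = 100 * n ** 2 / 2
    assert exact == formula
    # The ratio works out to (n + 1) / n, so it drifts toward 1.
    print(n, exact, exact / smooth)
```

The discrete sum is always a bit bigger than the smooth version, but the gap matters less and less as the number of weeks increases.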
This is our first "calculus" relationship:
A constant raise ($100/week) leads to a...
Linear increase in salary (100, 200, 300, 400) which leads to a...
Quadratic (something * n^2) increase in bank account (100, 300, 600,
1000... you see it curve!)
Now, why is it roughly 1/2 * n^2 and not n^2? One intuition: The linear increase in
salary (100, 200, 300) gives us a triangle. The area of the triangle represents
all the payments so far, and the area is 1/2 * base * height. The base is n (the
number of weeks) and the height (income) is 100 * n.
Geometric arguments get more difficult in higher dimensions -- just because
we can work out 2*100 with addition doesn't mean it's the easiest way.
Calculus gives us the rules to jump between patterns (taking derivatives and
integrals).
Points to Explore
The calculus formulas you typically see (integral of x = 1/2 * x^2) are different
from the "discrete" formulas (sum of 1 to n = 1/2 * n * (n + 1)) because the
discrete case is using "chunky" intervals.
Key Takeaways
Why do I care about the analogy used? The traditional "distance, velocity,
acceleration" doesn't lead to the right questions. What's the next derivative of
acceleration? (It's called "jerk", and it's rarely used). Such a literal example is
like having kids think multiplication is only for finding area, and only works on
two numbers at a time.
Here are the key points:
Calculus helps us find related patterns (bank account, to salary, to raises)
The "derivative" is going "down" (finding week-by-week changes to get
your salary)
The "integral" is going "up" (adding up your salary to get your bank
account)
We can figure out a formula for a pattern (given my bank account, predict
my salary) or get a specific value (what's my salary at week 3?)
Calculus is useful outside the hard sciences. If you have a pattern or
formula (production rate, size of a population, GDP of a country) and
want to examine its behavior, calculus is the tool for you.
Textbook calculus involves memorizing the rules to derive and integrate
formulas. Learn the basics (x^n, e, ln, sin, cos) and leave the rest to
machines. Our brainpower is better spent learning how to translate our
thoughts into the language of math.
In my fantasy world, derivatives and integrals are just two everyday concepts.
They're "what you can do" to formulas, just like addition and subtraction are
"what you can do" to numbers.
"Hey kids, we find the total mass using addition (Mass1 + Mass2 = Mass3).
And to find out how our position changes, we use the derivative".
"Duh -- addition is how you combine stuff. And yeah, you take the derivative to
see how your position is changing. What else would you do?"
One can always dream. Happy math.
1. Johann says:
Hi Kalid,
The thing about physics is that it's more appropriate for
describing continuous variations. Money is a discrete process and thus
a case of non-continuous variations.
But I agree it is a nice view to explain it, and maybe link it to the boring
use made in physics (probably because it's always presented in the same
way).
Again, great reading ! Keep up the good work
Bests,
Johann
2. Prudhvi says:
ingenious!
it's true that a lot of people get bored by the direct physics application of
calculus if it's introduced too soon
money seems like a natural way to understand it
of course it might not be able to so easily give an intuitive
understanding of the more grainy aspects of calculus, but whatever,
thanks!
3. Kalid says:
@Johann: Thanks for dropping by, and great point about continuous vs.
discrete. The funny thing is that many physicists treat the formulas as
discrete (i.e. using infinitesimal dx, dy, dz quantities to make a 3d
cube, for example) and then let it disappear to make it continuous
again. The neat thing is that using discrete quantities really shows how
the error margin is there (the difference between the actual sum of
squares and 1/2 * x^2) and how limits / Riemann sum help us shrink this.
I agree though, that physics would be cool if it were shown to be an
example of these general principles (and not the definition as is often
seen).
@Prudhvi: Yep, there are always details that you can't get to when you
make analogies. But you have to start somewhere :).
4. MJ says:
If someone had outright told me at any point in Calc I or Calc II that the
+ C can be thought of as an initial condition, I might have actually
remembered to tack it to the end of integrals, instead of considering it an
arbitrary annoyance that has little context.
That makes so much sense (and yet is so, so, so, painfully obvious), that
it's not even funny.
5. Kalid says:
@MJ: Thanks for the comment. Yeah, it's *way* too easy to think of the
+C as some mathematical details to keep track of, instead of something
_needed_ to figure out how to make your model work.
They are. But most of us learn these formulas independently. Calculus lets us
start with "circumference = 2 * pi * r" and figure out the others; the Greeks
would have appreciated this.
Unfortunately, calculus can epitomize what's wrong with math
education. Most lessons feature contrived examples, arcane proofs, and
memorization that body slam our intuition & enthusiasm.
It really shouldn't be this way.
Math, art, and ideas
I've learned something from school: Math isn't the hard part of math;
motivation is. Specifically, staying encouraged despite:
Teachers focused more on publishing/perishing than teaching
Self-fulfilling prophecies that math is difficult, boring, unpopular or not
your subject
Textbooks and curriculums more concerned with profits and test results
than insight
A Mathematician's Lament [pdf] is an excellent essay on this issue
that resonated with many people:
Feisty, are we? Well, here's what I won't do: recreate the existing textbooks. If
you need answers right away for that big test, there are plenty of websites, class
videos and 20-minute sprints to help you out.
Instead, let's share the core insights of calculus. Equations aren't
enough; I want the "aha!" moments that make everything click.
Formal mathematical language is just one way to communicate. Diagrams,
animations, and just plain talkin' can often provide more insight than a page
full of proofs.
But calculus is hard!
I think anyone can appreciate the core ideas of calculus. We don't need to be
writers to enjoy Shakespeare.
It's within your reach if you know algebra and have a general interest in math.
Not long ago, reading and writing were the work of trained scribes. Yet today
that can be handled by a 10-year old. Why?
Some define calculus as "the branch of mathematics that deals with limits and
the differentiation and integration of functions of one or more variables". It's
correct, but not helpful for beginners.
Here's my take: Calculus does to algebra what algebra did to arithmetic.
Arithmetic is about manipulating numbers (addition, multiplication,
etc.).
Algebra finds patterns between numbers: a^2 + b^2 = c^2 is a
famous relationship, describing the sides of a right triangle. Algebra finds
entire sets of numbers: if you know a and b, you can find c.
Calculus finds patterns between equations: you can see how one
equation (circumference = 2 * pi * r) relates to a similar one (area = pi *
r^2).
Using calculus, we can ask all sorts of questions:
How does an equation grow and shrink? Accumulate over time?
When does it reach its highest/lowest point?
How do we use variables that are constantly changing? (Heat, motion,
populations, ...).
And much, much more!
Algebra & calculus are a problem-solving duo: calculus finds new equations,
and algebra solves them. Like evolution, calculus expands your
understanding of how Nature works.
An Example, Please
Let's walk the walk. Suppose we know the equation for circumference (2 * pi *
r) and want to find area. What to do?
Realize that a filled-in disc is like a set of Russian dolls.
Now here's where things get funky. Let's unroll those rings and line them
up. What happens?
Many calculus examples are based on physics. That's great, but it can be hard
to relate: honestly, how often do you know the equation for velocity for an
object? Less than once a week, if that.
I prefer starting with physical, visual examples because it's how our minds
work. That ring/circle thing we made? You could build it out of several pipe
cleaners, separate them, and straighten them into a crude triangle to see if
the math really works. That's just not happening with your velocity equation.
A note on rigor (for the math geeks)
I can feel the math pedants firing up their keyboards. Just a few words on
rigor.
Did you know we don't learn calculus the way Newton and Leibniz discovered
it? They used intuitive ideas of "fluxions" and "infinitesimals" which were
replaced with limits because "Sure, it works in practice. But does it work
in theory?".
Great article! I love the insights. I'm currently taking up calculus class and I
find it hard to learn its essence just by taking it up in school. I mean,
depending upon the style of one's professor, I think math is a subject one can
get by without much thinking by just knowing its procedure (except integral
calculus, I think). But I find myself being reluctant to score that way (and also
find integral calculus challenging), so I surfed the Internet to seek for a website
that would make me understand what calculus really means, and your website
turns out to be exactly what I'm looking for!
I can really relate to you, Kalid. I also feel that our math education system
today is being head over the clouds and must be more down-to-earth to
beginners. Not understanding the essence of mathematics makes the majority
of people not appreciate it. To give an analogy, it's like they're seeing music in
written form and calling it music without even listening to it. In order to
understand what an abstract word really means, one must get a hold first of its
manifestations in the concrete world, and then how the abstract thereafter
relates to the concrete. I think whenever people say they hate the beautiful
subject math, they just don't really understand what it means.
I've read some of the others' comments regarding evolution. I feel moved to
share some facts, inferences and insights regarding its validity.
Our scientific formulae are so predictive only because each scientific formula
represents a scientific generalisation that has been based on factual
observations. It's because we have observed a set of phenomena to be
consistent that we classify them together and make a scientific generalisation
out of them, taking advantage of their consistency to make predictions for
future purposes. We keep on observing sets of phenomena in this way.
However, that does not explain how they can be consistent. Therefore one is
left with two general categories to explain the consistency of each of them: (1)
occurrences ensue; (2) otherwise, they're being controlled. What do we call
these certainties in the universe? Physical laws, which are certain, can't be just
some chance events, which are random and uncertain. Our lack of knowledge
permits us to believe that some things just happen by chance when we don't
know what caused it, but that's not the attitude of a scientist; science
attempts to explain causes or it won't have a cause. If the universe wouldn't
follow physical laws, we wouldn't be able to classify anything (e.g. atoms), let
alone observe any consistency. What intuition do you think drove us to call
physical laws laws? Laws are commands. Nothing comes from nothing. The law
of conservation of energy signifies this. If one wants to believe that something
can arise by itself, it shouldn't be the universe, because the universe is under
the law of conservation of energy. This therefore makes us conclude that the
universe has always existed from eternity past. However, the universe began.
Our universe is characterised by cosmic expansion. The second law of
thermodynamics indicates that the longer time has elapsed, the greater the
overall entropy of the universe shall be. Given that the universe is currently
not at a state of maximum entropy, the first and second laws of
thermodynamics indicate that the universe must not have always existed from
eternity past. Matter, energy, space and time, which constitute the universe,
have not always existed. Therefore, because the universe began to exist,
either some Being or something must have caused it. This cause of the
universe must be immaterial, because the cause of the universe cannot be the
universe itself, which is the totality of all material things, as nothing can cause
itself that has not arisen from nothing. In other words, something causing itself
is like saying that it appeared out of nowhere. Something arising out of nothing
can only be true if that thing is not under the law of conservation of energy, or,
if some Being xor some other thing caused it that, being able to create energy,
is above the law of conservation of energy. Because of laws such as the laws of
thermodynamics, only the Creator can and will create the universe from
nothing. Being transcendent, the Creator of the universe must possess a
unique nature distinct from the universe or from anything in it as much as the
Creator of the universe hasn't caused the universe or anything in it to bear
resemblance to the Creator's nature. This nature then doesn't necessarily have
to be tangible nor visible to our eyes.
The theory of evolution holds that millions and millions of years ago, fish
began evolving by means of little cumulative changes over long periods of
time. Over approximately 170000000 years, fish managed to evolve to
amphibians. Over approximately 60000000 years, amphibians evolved to
reptiles. Some of these reptiles evolved to nonmonkey mammals, still over a
long period of time, which then evolved to monkeys: simply put, our
ancestors. Of course, fish came all the way from a common ancestor. This is
what Darwin has proposed. After the discovery of DNA, however, the theory of
evolution itself evolved to include nonliving chemicals that happened to live by
time and incredible luck.
There is no substantial evidence, however, to support this. It doesn't follow
that similarities in DNA should indicate a common descent. The assertion that
genus evolves to another genus over a very long period of time is contrary to
science (genome is the total of all the genetic possibilities for a given species,
and should not be confused for genotype). I understand that, in order to
appear as though it was falsifiable, and thus be convincing, this assertion
depends on natural selection. But it's not the other way around; it is not
requisite for this unobservable assertion to be true in order for natural
selection to be true, or for natural things to serve some purpose. One purpose
We've been able to describe our step-by-step process with analogies (X-Rays, Timelapses, and rings) and diagrams:
However, this is a very elaborate way to communicate. Here are the Official Math
terms:
Intuitive Concept            Formal Name                     Symbol
X-Ray (split apart)          Derivative                      d/dr
Time-lapse (glue together)   Integral                        ∫
Arrow direction              "With respect to" r             dr implies moving along r
Arrow start/stop             Bounds (range) of integration   ∫ from start to end
Slice                        Integrand                       Equation, such as 2 * pi * r
The Derivative
The derivative is splitting a shape into sections as we move along a path (i.e., XRaying it). Now here's the trick: although the derivative generates the entire sequence
of sections (the black line), we can also extract a single one.
Think about a function like $f(x) = x^2$. It's a curve that describes a giant list of
possibilities (1, 4, 9, 16, 25, etc.). We can graph the entire curve, sure, or examine the
value of f(x) at a specific value, like x=3.
The derivative is similar. Officially, it's the entire pattern of sections, but we can zero
into a specific one by asking for the derivative at a certain value. (The derivative is a
function, just like $f(x) = x^2$; if not otherwise specified, we're describing the entire
function.)
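To make the "whole function vs. single value" distinction concrete, here is a minimal Python sketch. The helper name `derivative` and the step size `h` are illustrative choices, not something from the text, and the sketch only approximates the derivative numerically (a central difference) rather than computing it exactly.

```python
def f(x):
    """The example curve f(x) = x^2."""
    return x ** 2


def derivative(func, h=1e-6):
    """Return a new function that approximates the derivative of func.

    Hypothetical helper: uses a central difference with a small step h.
    """
    def slope_at(x):
        return (func(x + h) - func(x - h)) / (2 * h)
    return slope_at


# The derivative is itself a function (the entire pattern of sections)...
df = derivative(f)

# ...and we can zero in on one value, just like evaluating f at x = 3.
print(f(3))   # value of the curve at x = 3
print(df(3))  # approximate derivative at x = 3
```

Calling `derivative(f)` hands back the whole pattern; calling `df(3)` asks for a single section of it.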
What do we need to find the derivative? The shape to split apart, and the path to follow
as we cut it up (the orange arrow). For example:
The derivative of a circle with respect to the radius creates rings
The derivative of a circle with respect to the perimeter creates slices
The derivative of a circle with respect to the x-axis creates boards
I agree that "with respect to" sounds formal: "Honorable Grand Poombah radius, it is
with respect to you that we derive." Math is a gentleman's game, I suppose.
Taking the derivative is also called "differentiating", because we are finding the
difference between successive positions as a shape grows. (As we grow the radius of a
circle, the difference between the current disc and the next size up is that outer ring.)
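As a rough check of the "difference between successive discs" idea, the sketch below grows a disc by a small amount and compares the extra area with a ring of circumference times thickness. The function names and the sample values for `r` and `dr` are assumptions made for illustration; the disc area formula is the standard one.

```python
import math


def disc_area(r):
    """Area of a disc of radius r."""
    return math.pi * r ** 2


def ring_between(r, dr):
    """Area gained when the radius grows from r to r + dr (the outer ring)."""
    return disc_area(r + dr) - disc_area(r)


r, dr = 2.0, 0.001                  # sample radius and a small growth step
ring = ring_between(r, dr)          # difference between successive discs
estimate = 2 * math.pi * r * dr     # circumference times thickness

print(ring, estimate)               # the two values should be close
```

The smaller `dr` gets, the closer the two numbers become, which is why the ring can be treated as a thin strip.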
The Integral, Arrows, and Slices
The integral is gluing together (time-lapsing) a group of sections and measuring the
final result. For example, we glued together the rings (into a "ring triangle") and saw it
accumulated to $\pi r^2$, aka the area of a circle.
Here's what we need to find the integral:
Which direction are we gluing the steps together? Along the orange line
(the radius, in this case)
When do we start and stop? At the start and end of the arrow (we start at 0,
no radius, and move to r, the full radius)
How big is each step? Well, each item is a "ring". Isn't that enough?
Nope! We need to be specific. We've been saying we cut a circle into "rings" or "pizza
slices" or "boards". But that's not specific enough; it's like a BBQ recipe that says "Cook
meat. Flavor to taste."
Maybe an expert knows what to do, but we need more specifics. How large, exactly, is
each step (technically called the "integrand")?
variables involved, like radius and perimeter, you need to clarify which step we're
using: dr or dp?)
Last, remember that r (the radius) changes as we time-lapse, starting at 0 and
eventually reaching its final value. When we see r in the context of a step, it means "the
size of the radius at the current step" and not the final value it may ultimately have.
These issues are extremely confusing. I'd prefer we use $r_{dr}$ to indicate an intermediate
"r at the current step" instead of a general-purpose "r" that's easily confused with the
max value of the radius. I can't change the symbols at this point, unfortunately.
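The three ingredients (direction, start/stop, and step size) can be sketched as a plain loop. This is only an illustration under assumed names: `glue_rings`, `steps`, and `r_current` are invented here, and `r_current` is kept separate from `final_radius` to highlight the "r at the current step" versus "max value of r" distinction.

```python
import math


def glue_rings(final_radius, steps=10_000):
    """Add up rings of size 2*pi*r*dr as r moves from 0 to final_radius."""
    dr = final_radius / steps            # width of each step along the radius
    total = 0.0
    for i in range(steps):
        r_current = (i + 0.5) * dr       # radius at the current step (midpoint)
        total += 2 * math.pi * r_current * dr
    return total


final_radius = 3.0
print(glue_rings(final_radius))          # glued-together rings
print(math.pi * final_radius ** 2)       # area of the full circle, for comparison
```

Inside the loop, `r_current` changes at every step while `final_radius` stays fixed, which is exactly the ambiguity the single symbol r hides.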
Practicing The Lingo
Let's learn to talk like calculus natives. Here's how we can describe our X-Ray
strategies:
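Intuitive | Formal description | Symbol
X-Ray into rings | Take the derivative of a circle with respect to the radius | $\frac{d}{dr} \text{Area}$
X-Ray into slices | Take the derivative of a circle with respect to the perimeter | $\frac{d}{dp} \text{Area}$
X-Ray into boards | Take the derivative of a circle with respect to the x-axis | $\frac{d}{dx} \text{Area}$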
Remember, the derivative just splits the shape into (hopefully) easy-to-measure steps,
such as rings of size $2\pi r \, dr$. We broke apart our Lego set and have pieces scattered on
the floor. We still need an integral to glue the parts together and measure the new size.
The two commands are a tag team:
The derivative says: "Ok, I split the shape apart for you. It looks like a bunch of
pieces $2\pi r$ tall and $dr$ wide."
The integral says: "Oh, those pieces resemble a triangle -- I can measure that!
The total area of that triangle is $\frac{1}{2} \cdot \text{base} \cdot \text{height}$, which works out to $\pi r^2$ in this
case."
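The triangle arithmetic can be spelled out as a short worked equation, assuming the ring triangle has base $r$ (the full radius) and height $2\pi r$ (the circumference of the largest ring):

```latex
\[
\text{Area}
  = \tfrac{1}{2} \cdot \text{base} \cdot \text{height}
  = \tfrac{1}{2} \cdot r \cdot 2\pi r
  = \pi r^2
\]
```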
Here's how we'd write the integrals to measure the steps we've made:
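Formal description | Symbol | Measures Total Size Of
integrate 2 * pi * r * dr from r=0 to r=r | $\int_0^r 2\pi r \, dr$ | rings
integrate (pizza slice) * dp from p=min to p=max | $\int_{p=\min}^{p=\max} (\text{pizza slice}) \, dp$ | pizza slices
integrate board * dx from x=min to x=max | $\int_{x=\min}^{x=\max} \text{board} \, dx$ | boards
A few notes: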
1) Can you think of another activity which is made simpler by shortcuts and notation,
vs. written English?
2) Interested in performance? Let's drive the calculus car, even if you can't build it yet.
Question 1: How would you write the integrals that cover half of a circle?
(Answer for the first half and the second half. This links to Wolfram Alpha, an online
calculator, and we'll learn to use it later on.)
Question 2: Can you find the complete way to describe our "pizza-slice" approach?
Remember that each slice is basically a triangle (so what's the area?). The slices move
around the perimeter (where does it start and stop?). Have a guess for the command?
Here it is, the slice-by-slice description.
Question 3: Can you figure out how to move from volume to surface area?
Assume we know the volume of a sphere is 4/3 * pi * r^3. Think about how
to separate that volume into a sequence of shells. Which variable are we moving
through?