Topology of Numbers
Chapter 0: A Preview
Pythagorean Triples
As an introduction to the sorts of questions that we will be studying, let us con-
sider right triangles whose sides all have integer lengths. The most familiar example
is the (3, 4, 5) right triangle, but there are many others as well, such as the (5, 12, 13)
right triangle. Thus we are looking for triples (a, b, c) of positive integers such that
$a^2 + b^2 = c^2$. Such triples are called Pythagorean triples because of the connection
with the Pythagorean Theorem. Our goal will be a formula that gives them all. The
ancient Greeks knew this formula, and even before the Greeks the ancient Babylonians
must have known a lot about Pythagorean triples because one of their clay tablets from
nearly 4000 years ago has been found which gives a list of 15 different Pythagorean
triples, the largest of which is (12709, 13500, 18541) . (Actually the tablet only gives
the numbers a and c from each triple (a, b, c) for some unknown reason, but it is
easy to compute b from a and c .)
There is an easy way to create infinitely many Pythagorean triples from a given
one just by multiplying each of its three numbers by an arbitrary number n . For
example, from (3, 4, 5) we get (6, 8, 10) , (9, 12, 15) , (12, 16, 20) , and so on. This
process produces right triangles that are all similar to each other, so in a sense they
are not essentially different triples. In our search for Pythagorean triples there is
thus no harm in restricting our attention to triples (a, b, c) whose three numbers
have no common factor. Such triples are called primitive. The large Babylonian triple
mentioned above is primitive, since the prime factorization of 13500 is $2^2 \cdot 3^3 \cdot 5^3$ but
the other two numbers in the triple are not divisible by 2, 3, or 5.
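The two claims about the Babylonian triple can be checked by machine. The following Python sketch is an illustration only; the function names is_pythagorean and is_primitive are our own and do not come from the text.

```python
from math import gcd

def is_pythagorean(a, b, c):
    """Return True if a, b, c are positive integers with a^2 + b^2 = c^2."""
    return a > 0 and b > 0 and c > 0 and a * a + b * b == c * c

def is_primitive(a, b, c):
    """Return True if a, b, c have no common factor greater than 1."""
    return gcd(gcd(a, b), c) == 1

babylonian = (12709, 13500, 18541)
print(is_pythagorean(*babylonian), is_primitive(*babylonian))
```

Multiples such as (6, 8, 10) pass the first test but fail the second.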
A fact worth noting in passing is that if two of the three numbers in a Pythagorean
triple (a, b, c) have a common factor n , then n is also a factor of the third number.
This follows easily from the equation $a^2 + b^2 = c^2$, since for example if n divides a
and b then $n^2$ divides $a^2$ and $b^2$, so $n^2$ divides their sum $c^2$, hence n divides c.
Pythagorean triple (a, b, c) . We can assume this triple is primitive by canceling any
common factor of a, b, and c, and this doesn't change the point $(a/c, b/c)$. The two
fractions $a/c$ and $b/c$ must then be in lowest terms since we observed earlier that if two
of a , b , c have a common factor, then all three have a common factor.
From the preceding observations we can conclude that the problem of finding
all Pythagorean triples is equivalent to finding all rational points on the unit circle
$x^2 + y^2 = 1$. More specifically, there is an exact one-to-one correspondence between
primitive Pythagorean triples and rational points on the unit circle that lie in the
interior of the first quadrant (since we want all of a, b, c, x, y to be positive).
In order to find all the rational points on the circle $x^2 + y^2 = 1$ we will use
a construction that starts with one rational point and creates many more rational
points from this one starting point. There are four obvious rational points on the
circle we could use to start, the intersections of the circle with the coordinate axes,
the points $(\pm 1, 0)$ and $(0, \pm 1)$. It doesn't really matter which one we choose, so let's
choose (0, 1). Now consider a line which intersects the circle in this point (0, 1) and
some other point P, as in the figure at the right. If the line has slope m, its equation
will be $y = mx + 1$. If we denote the point where the line intersects the x-axis
by (r, 0), then $m = -1/r$ so the equation for the line can be rewritten as $y = 1 - x/r$.
[Figure: the unit circle with the line through (0, 1) and (r, 0) meeting the circle again at P.]
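To find the coordinates of the point P in terms of r we substitute $y = 1 - x/r$ into the
equation $x^2 + y^2 = 1$ and solve for x:
$$x^2 + \left(1 - \frac{x}{r}\right)^2 = 1$$
$$x^2 + 1 - \frac{2x}{r} + \frac{x^2}{r^2} = 1$$
$$\left(1 + \frac{1}{r^2}\right)x^2 - \frac{2x}{r} = 0$$
$$\left(\frac{r^2+1}{r^2}\right)x^2 = \frac{2x}{r}$$
$$x = \frac{2r}{r^2+1} \quad\text{or}\quad x = 0$$
Plugging $x = \frac{2r}{r^2+1}$ into the formula $y = 1 - x/r$, we get
$$y = 1 - \frac{x}{r} = -\frac{1}{r}\cdot\frac{2r}{r^2+1} + 1 = \frac{-2}{r^2+1} + 1 = \frac{r^2-1}{r^2+1}$$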
Summarizing, we have found that the point P has coordinates
$$(x, y) = \left(\frac{2r}{r^2+1},\ \frac{r^2-1}{r^2+1}\right)$$
Note that when x = 0 there are two points $(0, \pm 1)$ on the circle. The point $(0, -1)$
comes from the value r = 0, while if we let r approach $\pm\infty$ then the point P approaches
(0, 1), as we can see either from the picture or from the formula for (x, y).
If r is a rational number, then the formula for (x, y) shows that both x and y
are rational, so we have a rational point on the circle. Conversely, if both coordinates
x and y of the point P on the circle are rational, then the slope m of the line must
be rational, hence r must also be rational since $r = -1/m$. We could also solve the
equation $y = 1 - x/r$ for r to get $r = x/(1 - y)$, showing again that r will be rational
if x and y are rational. The conclusion of all this is that, starting from the initial
rational point (0, 1) we have found formulas that give all the other rational points on
the circle.
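As a sanity check on the formulas just derived, here is a short Python sketch using exact rational arithmetic. The names point_from_r and r_from_point are ours, introduced only for this illustration.

```python
from fractions import Fraction

def point_from_r(r):
    """Second intersection of the unit circle with the line through (0, 1) and (r, 0)."""
    r = Fraction(r)
    return (2 * r / (r * r + 1), (r * r - 1) / (r * r + 1))

def r_from_point(x, y):
    """Recover r = x / (1 - y); not defined at the starting point (0, 1)."""
    return Fraction(x) / (1 - Fraction(y))

for r in [Fraction(2), Fraction(3, 2), Fraction(-5, 7)]:
    x, y = point_from_r(r)
    assert x * x + y * y == 1        # the point lies on the circle
    assert r_from_point(x, y) == r   # and r is recovered from (x, y)
    print(r, "->", (x, y))
```

For instance r = 2 gives the point (4/5, 3/5), which corresponds to the (4, 3, 5) triangle.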
Since there are infinitely many choices for the rational number r , there are in-
finitely many rational points on the circle. But we can say something much stronger
than this: Every arc of the circle, no matter how small, contains infinitely many rational
points. This is because every arc on the circle corresponds to an interval of r -values
on the x -axis, and every interval in the x -axis contains infinitely many rational num-
bers. Since every arc on the circle contains infinitely many rational points, we can say
that the rational points are dense in the circle, meaning that for every point on the
circle there is an infinite sequence of rational points approaching the given point.
Now we can go back and find formulas for Pythagorean triples. If we set the
rational number r equal to p/q with p and q integers having no common factor,
then the formulas for x and y become
$$x = \frac{2\,\frac{p}{q}}{\frac{p^2}{q^2}+1} = \frac{2pq}{p^2+q^2}$$
$$y = \frac{\frac{p^2}{q^2}-1}{\frac{p^2}{q^2}+1} = \frac{p^2-q^2}{p^2+q^2}$$
$$(a, b, c) = (2pq,\ p^2-q^2,\ p^2+q^2)$$
The starred entries are the ones with nonprimitive Pythagorean triples. Notice that
this occurs only when p and q are both odd, so that not only is 2pq even, but also
both $p^2 - q^2$ and $p^2 + q^2$ are even, so all three of a, b, and c are divisible by 2. The
primitive versions of the nonprimitive entries in the table occur higher in the table,
but with a and b switched. This is a general phenomenon, as we will see in the course
of proving the following basic result:
Case 1: Suppose p and q have opposite parity. If all three of $2pq$, $p^2 - q^2$, and
$p^2 + q^2$ have a common divisor d > 1 then d would have to be odd since $p^2 - q^2$ and
$p^2 + q^2$ are odd when p and q have opposite parity. Furthermore, since d is a divisor
of $p^2 - q^2$ and $p^2 + q^2$ it must divide their sum $(p^2 + q^2) + (p^2 - q^2) = 2p^2$ and
also their difference $(p^2 + q^2) - (p^2 - q^2) = 2q^2$. However, since d is odd it would
then have to divide $p^2$ and $q^2$, forcing p and q to have a common factor (since any
prime factor of d would have to divide p and q). This contradicts the assumption
that p and q had no common factors, so we conclude that $(2pq, p^2 - q^2, p^2 + q^2)$ is
primitive if p and q have opposite parity.
Case 2: Suppose p and q have the same parity, hence they are both odd since if
they were both even they would have the common factor of 2. Because p and q are
both odd, their sum and difference are both even and we can write $p + q = 2P$ and
$p - q = 2Q$ for some integers P and Q. Any common factor of P and Q would have
to divide $P + Q = \frac{p+q}{2} + \frac{p-q}{2} = p$ and $P - Q = \frac{p+q}{2} - \frac{p-q}{2} = q$, so P and Q have no
common factors. In terms of P and Q our Pythagorean triple becomes
$$\begin{aligned}
(a, b, c) &= (2pq,\ p^2 - q^2,\ p^2 + q^2) \\
&= (2(P+Q)(P-Q),\ (P+Q)^2 - (P-Q)^2,\ (P+Q)^2 + (P-Q)^2) \\
&= (2(P^2 - Q^2),\ 4PQ,\ 2(P^2 + Q^2)) \\
&= 2(P^2 - Q^2,\ 2PQ,\ P^2 + Q^2)
\end{aligned}$$
After canceling the factor of 2 we get a new Pythagorean triple, with the first two
coordinates switched, and this one is primitive by Case 1 since P and Q can't both be
odd, because if they were, then $p = P + Q$ and $q = P - Q$ would both be even, which
is impossible since they have no common factor.
From Cases 1 and 2 we can conclude that if we allow ourselves to switch the first
two coordinates, then we get all primitive Pythagorean triples from the formula by
restricting p and q to be of opposite parity and to have no common factors.
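The two cases can be tried out numerically. The Python sketch below is only an illustration: the helper names triple and primitive_triples are ours, and the check covers a small range of p rather than proving anything.

```python
from math import gcd

def triple(p, q):
    """The triple (2pq, p^2 - q^2, p^2 + q^2) given by the formula, for p > q > 0."""
    return (2 * p * q, p * p - q * q, p * p + q * q)

def primitive_triples(max_p):
    """Triples from coprime p > q > 0 of opposite parity with p <= max_p."""
    return [triple(p, q)
            for p in range(2, max_p + 1)
            for q in range(1, p)
            if gcd(p, q) == 1 and (p - q) % 2 == 1]

# Case 1: opposite parity and no common factor gives a primitive triple.
for a, b, c in primitive_triples(7):
    assert a * a + b * b == c * c
    assert gcd(gcd(a, b), c) == 1

# Case 2: p and q both odd gives twice a primitive triple with a and b switched.
a, b, c = triple(3, 1)                           # (6, 8, 10)
assert (b // 2, a // 2, c // 2) == triple(2, 1)  # here P = 2 and Q = 1
print(primitive_triples(4))
```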
The same technique we used to find the rational points on the circle $x^2 + y^2 = 1$
can also be used to find all the rational points on other quadratic curves
$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$ with integer or rational coefficients A, B, C, D, E, F, provided
that we can find a single rational point $(x_0, y_0)$ on the curve to start the process. For
example, the circle $x^2 + y^2 = 2$ contains the rational points $(\pm 1, \pm 1)$ and we can use
one of these as an initial point. Taking the point (1, 1),
we would consider lines $y - 1 = m(x - 1)$ of slope m
passing through this point. Solving this equation for
y and plugging into the equation $x^2 + y^2 = 2$ would
produce a quadratic equation $ax^2 + bx + c = 0$ whose
coefficients are polynomials in the variable m, so these
coefficients would be rational whenever m is rational.
From the quadratic formula $x = \left(-b \pm \sqrt{b^2 - 4ac}\,\right)/2a$ we see that the sum of the two
roots is $-b/a$, a rational number if m is rational, so if one root is rational then the
other root will be rational as well. The initial point (1, 1) on the curve $x^2 + y^2 = 2$ gives
x = 1 as one rational root of the equation $ax^2 + bx + c = 0$, so for each rational value
of m the other root x will be rational as well. Then the equation $y - 1 = m(x - 1)$
implies that y will also be rational, and hence we obtain a rational point (x, y) on
the curve for each rational value of m. Conversely, if x and y are both rational then
obviously $m = (y - 1)/(x - 1)$ will be rational. Thus one obtains a dense set of
rational points on the circle $x^2 + y^2 = 2$, since m can be any rational number. An
exercise at the end of this chapter is to work out the formulas explicitly.
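The sum-of-roots argument can be followed step by step in a few lines of Python. This is a sketch under our own naming (point_on_circle_2 is not from the text); the coefficients a and b are those obtained by substituting y = 1 + m(x - 1) into the equation of the circle.

```python
from fractions import Fraction

def point_on_circle_2(m):
    """Second intersection of x^2 + y^2 = 2 with the line y - 1 = m(x - 1)."""
    m = Fraction(m)
    # Substituting y = 1 + m(x - 1) gives a*x^2 + b*x + c = 0 with
    a = 1 + m * m
    b = 2 * m - 2 * m * m
    # One root is x = 1 and the two roots sum to -b/a, so the other root is:
    x = -b / a - 1
    y = 1 + m * (x - 1)
    return (x, y)

for m in [Fraction(2), Fraction(1, 3), Fraction(-3, 4)]:
    x, y = point_on_circle_2(m)
    assert x * x + y * y == 2
    print(m, "->", (x, y))
```

When m = -1 the line is tangent to the circle at (1, 1), and the function returns (1, 1) itself.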
Suppose we look at the vertical plane containing the triangle ONQ. From our earlier
analysis of rational points on a circle of radius 1 we know that if the segment OQ
has length $|OQ| = r$, then $|OP'| = \frac{2r}{r^2+1}$ and $|PP'| = \frac{r^2-1}{r^2+1}$. From the right triangle
OBQ we see that $u^2 + v^2 = r^2$ since $u = |OB|$ and $v = |BQ|$. The triangle OBQ is
similar to the triangle OAP'. Since the length of OP' is $\frac{2}{r^2+1}$ times the length of OQ
we conclude from similar triangles that
$$x = |OA| = \frac{2}{r^2+1}\,|OB| = \frac{2}{r^2+1}\,u = \frac{2u}{u^2+v^2+1}$$
and
$$y = |AP'| = \frac{2}{r^2+1}\,|BQ| = \frac{2}{r^2+1}\,v = \frac{2v}{u^2+v^2+1}$$
Also we have
$$z = |PP'| = \frac{r^2-1}{r^2+1} = \frac{u^2+v^2-1}{u^2+v^2+1}$$
Summarizing, we have expressed x, y, and z in terms of u and v by the formulas
$$x = \frac{2u}{u^2+v^2+1} \qquad y = \frac{2v}{u^2+v^2+1} \qquad z = \frac{u^2+v^2-1}{u^2+v^2+1}$$
These formulas imply that we get a rational point (x, y, z) on the sphere $x^2 + y^2 + z^2 = 1$
for each pair of rational numbers (u, v). We get all rational points on the sphere in
this way (except for the north pole (0, 0, 1), of course) since it is possible to express
u and v in terms of x, y, and z by the formulas
$$u = \frac{x}{1-z} \qquad v = \frac{y}{1-z}$$
which one can easily verify by substituting into the previous formulas.
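The substitution can also be verified numerically. In the following Python sketch the function names sphere_point and plane_point are our own choices, and exact fractions are used so that the checks are not affected by rounding.

```python
from fractions import Fraction

def sphere_point(u, v):
    """Point (x, y, z) on the unit sphere corresponding to the plane point (u, v)."""
    u, v = Fraction(u), Fraction(v)
    d = u * u + v * v + 1
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1) / d)

def plane_point(x, y, z):
    """Inverse formulas u = x/(1 - z), v = y/(1 - z); not defined at the north pole."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    return (x / (1 - z), y / (1 - z))

for u, v in [(1, 1), (2, 3), (Fraction(1, 2), Fraction(-4, 3))]:
    x, y, z = sphere_point(u, v)
    assert x * x + y * y + z * z == 1       # the point is on the sphere
    assert plane_point(x, y, z) == (u, v)   # and (u, v) is recovered
    print((u, v), "->", (x, y, z))
```

For example (u, v) = (2, 3) gives (2/7, 3/7, 6/7), corresponding to the integer solution 2^2 + 3^2 + 6^2 = 7^2.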
Here is a short table giving a few rational points on the sphere and the corresponding
integer solutions of the equation $a^2 + b^2 + c^2 = d^2$:
For example, the vector (1, 1, 1) has length $\sqrt{3}$ so the corresponding unit vector is
$(1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$. It is rare that this process produces unit vectors having rational
coordinates, but we now have a method for creating as many rational unit vectors as
we like.
Incidentally, there is a name for the correspondence we have described between
points (x, y, z) on the unit sphere and points (u, v) in the plane: it is called stereographic
projection. One can think of the sphere and the plane as being made of clear
glass, and one puts one's eye at the north pole of the sphere and looks downward
and outward in all directions to see points on the sphere projected onto points in
the plane, and vice versa. The north pole itself does not project onto any point in the
plane, but points approaching the north pole project to points approaching infinity in the
plane, so one can think of the north pole as corresponding to an imaginary "infinitely
distant point" in the plane. This geometric viewpoint somehow makes infinity less
of a mystery, as it just corresponds to a point on the sphere, and points on a sphere
are not very mysterious. (Though in the early days of polar exploration the north pole
may have seemed very mysterious and infinitely distant!)
where p and q have no common factor and are not both odd. Determining whether
a given number can be expressed in the form $2pq$, $p^2 - q^2$, or $p^2 + q^2$ is a special
case of the general question of deciding when an equation $Ap^2 + Bpq + Cq^2 = n$ has
an integer solution p, q, for given integers A, B, C, and n. Expressions of the form
$Ax^2 + Bxy + Cy^2$ are called quadratic forms. These will be the main topic studied
in Chapter 2, where we will develop some general theory addressing the question of
what values a quadratic form takes on when all the numbers involved are integers.
For now, let us just look at the special cases at hand.
First let us consider which numbers occur as a or b in Pythagorean triples
(a, b, c). We certainly can't realize the number 1 since this would say $a^2 + 1 = c^2$ or
$1 + b^2 = c^2$ but 1 is not the difference between the squares of any two positive integers.
For numbers bigger than 1, if we look at the earlier table of Pythagorean triples
we see that all the numbers up to 15 can be realized as a or b in primitive triples
except for 2, 6, 10, and 14. This might lead us to guess that the numbers realizable
as a or b in primitive triples are the numbers not congruent to 2 modulo 4. This is
indeed true, and can be proved as follows. First note that $2pq$ is even and $p^2 - q^2$ is
odd (otherwise both a and b would be even, violating primitivity). Every odd number
bigger than 1 is expressible in the form $p^2 - q^2$ since $2k + 1 = (k + 1)^2 - k^2$, so in
fact every odd number is the difference between two consecutive squares. Note that
taking p = k + 1 and q = k does yield a primitive triple since k and k + 1 always have
opposite parity and no common factors. This takes care of realizing odd numbers.
For even numbers, they would have to be of the form $2pq$, and by taking q = 1 we
realize any even number 2p. However, to have a primitive triple we have to have
p even since p must have opposite parity from q which is 1. Thus we realize the
numbers a = 4k by primitive triples but not the numbers a = 4k + 2. This is what
we claimed was true. To finish the story for a and b, note that a number a = 4k + 2
which can't be realized by a primitive triple can be realized by a nonprimitive triple,
at least if $k \ge 1$, since we know we can realize the odd number 2k + 1 if $k \ge 1$, and
by doubling this we realize 4k + 2. Summarizing this discussion, all numbers greater
than 2 can be realized as a or b in Pythagorean triples (a, b, c).
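The argument just given is constructive, so it can be turned directly into a small procedure. The Python sketch below follows the three cases of the discussion; the function name leg_triple is ours and is introduced only for illustration.

```python
def leg_triple(n):
    """Return a Pythagorean triple (a, b, c) in which n occurs as a or b, for n >= 3."""
    if n < 3:
        raise ValueError("only numbers greater than 2 occur as a or b")
    if n % 2 == 1:
        # n = 2k + 1 = (k + 1)^2 - k^2, so take p = k + 1 and q = k
        k = n // 2
        p, q = k + 1, k
    elif n % 4 == 0:
        # n = 4k = 2pq with p = 2k and q = 1
        p, q = n // 2, 1
    else:
        # n = 4k + 2: double a triple that realizes the odd number 2k + 1
        return tuple(2 * t for t in leg_triple(n // 2))
    return (2 * p * q, p * p - q * q, p * p + q * q)

for n in range(3, 16):
    a, b, c = leg_triple(n)
    assert n in (a, b) and a * a + b * b == c * c
    print(n, (a, b, c))
```

The triples returned in the first two cases are primitive, while those in the last case are not.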
Now let us ask which numbers c can occur in Pythagorean triples (a, b, c) , so we
are trying to find a solution of $p^2 + q^2 = c$ for a given number c. Pythagorean triples
(p, q, r) give solutions when c is equal to a square $r^2$, but we are asking now about
arbitrary numbers c. It suffices to figure out which numbers c occur in primitive
triples (a, b, c), since by multiplying the numbers c in primitive triples by arbitrary
numbers we get the numbers c in arbitrary triples. A look at the earlier table shows
that the numbers c that can be realized by primitive triples (a, b, c) seem to be fairly
rare: only 5, 13, 17, 25, 29, 37, 41, 53, 61, 65, and 85 occur in the table. These
are all odd, and in fact they are all congruent to 1 modulo 4. This always has to
be true because p and q are of opposite parity, so one of $p^2$ and $q^2$ is congruent
to 0 modulo 4 while the other is congruent to 1, hence $p^2 + q^2$ is congruent to 1
modulo 4. More interesting is the fact that most of the numbers on the list are prime
numbers, and the ones that aren't prime are products of earlier primes in the list:
$25 = 5 \cdot 5$, $65 = 5 \cdot 13$, $85 = 5 \cdot 17$. From this somewhat slim evidence one might
conjecture that the numbers c occurring in primitive Pythagorean triples are exactly
the numbers that are products of primes congruent to 1 modulo 4. The first prime
satisfying this condition that isn't on the original list is 73, and this is realized as
$p^2 + q^2 = 8^2 + 3^2$, in the triple (48, 55, 73). The next two primes congruent to 1
modulo 4 are $89 = 8^2 + 5^2$ and $97 = 9^2 + 4^2$, so the conjecture continues to look
good. Proving the general conjecture is not easy, however, and we will take up this
question in Chapter 2 when we fully answer the question of which numbers can be
expressed as the sum of two squares.
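The conjecture can be tested a little further by computer. The following Python sketch compares the two sets of numbers up to 100; it is a finite check and not a proof, and the function names primitive_hypotenuses and only_primes_1_mod_4 are ours.

```python
from math import gcd

def primitive_hypotenuses(limit):
    """Values c = p^2 + q^2 <= limit with p > q > 0 coprime and of opposite parity."""
    found = set()
    p = 2
    while p * p + 1 <= limit:
        for q in range(1, p):
            c = p * p + q * q
            if c <= limit and gcd(p, q) == 1 and (p - q) % 2 == 1:
                found.add(c)
        p += 1
    return found

def only_primes_1_mod_4(n):
    """True if n > 1 and every prime factor of n is congruent to 1 modulo 4."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        while n % d == 0:
            if d % 4 != 1:
                return False
            n //= d
        d += 1
    return n == 1 or n % 4 == 1

limit = 100
conjectured = {n for n in range(2, limit + 1) if only_primes_1_mod_4(n)}
print(sorted(primitive_hypotenuses(limit)))
print(primitive_hypotenuses(limit) == conjectured)
```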
Another question one can ask about Pythagorean triples is, how many are there
where two of the three numbers differ by only 1 ? In the earlier table there are
several: (3, 4, 5) , (5, 12, 13) , (7, 24, 25) , (20, 21, 29) , (9, 40, 41) , (11, 60, 61) , and
(13, 84, 85) . As the pairs of numbers that are adjacent get larger, the correspond-
ing right triangles are either approximately 45-45-90 right triangles as with the triple
(20, 21, 29) , or long thin triangles as with (13, 84, 85) . To analyze the possibilities,
note first that if two of the numbers in a triple (a, b, c) differ by 1 then the triple has
to be primitive, so we can use our formula $(a, b, c) = (2pq, p^2 - q^2, p^2 + q^2)$. If b and
c differ by 1 then we would have $(p^2 + q^2) - (p^2 - q^2) = 2q^2 = 1$ which is impossible.
If a and c differ by 1 then we have $p^2 + q^2 - 2pq = (p - q)^2 = 1$ so $p - q = \pm 1$, and
in fact $p - q = +1$ since we have to have p > q in order for $b = p^2 - q^2$ to be positive.
Thus we get the infinite sequence of solutions $(p, q) = (2, 1), (3, 2), (4, 3), \ldots$
with corresponding triples $(4, 3, 5), (12, 5, 13), (24, 7, 25), \ldots$. Note that these are the
same triples we obtained earlier that realize all the odd values $b = 3, 5, 7, \ldots$.
The remaining case is that a and b differ by 1 . Thus we have the equation
$p^2 - 2pq - q^2 = \pm 1$. The left side doesn't factor using integer coefficients, so it's not
so easy to find integer solutions this time. In the table there are only the two triples
(4, 3, 5) and (20, 21, 29), with (p, q) = (2, 1) and (5, 2). After some trial and error one
could find the next solution (p, q) = (12, 5) which gives the triple (120, 119, 169). Is
there a pattern in the solutions (2, 1), (5, 2), (12, 5)? One has the numbers 1, 2, 5, 12,
and perhaps it isn't too much of a stretch to notice that the third number is twice the
second plus the first, while the fourth number is twice the third plus the second. If
this pattern continued, the next number would be $29 = 2 \cdot 12 + 5$, giving (p, q) =
(29, 12), and this does indeed satisfy $p^2 - 2pq - q^2 = \pm 1$, yielding the Pythagorean
triple (696, 697, 985). These numbers are increasing rather rapidly, and the next case
(p, q) = (70, 29) yields an even bigger Pythagorean triple (4060, 4059, 5741). Could
there be other solutions of $p^2 - 2pq - q^2 = \pm 1$ with smaller numbers that we missed?
We will develop tools in Chapter 2 to find all the integer solutions, and it will turn out
that the sequence we have just discovered gives them all.
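For small numbers the question can be settled by a direct search. The Python sketch below generates pairs from the pattern observed above and compares them with a brute-force search over p up to 100; the names form and recurrence_pairs are ours, and the search says nothing about larger values.

```python
def form(p, q):
    """Value of the quadratic form p^2 - 2pq - q^2."""
    return p * p - 2 * p * q - q * q

def recurrence_pairs(count):
    """First `count` pairs (p, q) of consecutive terms of the sequence 1, 2, 5, 12, 29, ..."""
    pairs, q, p = [], 1, 2
    for _ in range(count):
        pairs.append((p, q))
        p, q = 2 * p + q, p   # next term is twice the current one plus the previous one
    return pairs

pairs = recurrence_pairs(5)
for p, q in pairs:
    assert form(p, q) in (1, -1)
    print((p, q), (2 * p * q, p * p - q * q, p * p + q * q))

# Brute-force search over all p > q > 0 with p <= 100.
brute = [(p, q) for p in range(2, 101) for q in range(1, p) if abs(form(p, q)) == 1]
assert brute == pairs
```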
Although the quadratic form $p^2 - 2pq - q^2$ does not factor using integer coefficients,
it can be simplified slightly by rewriting it as $(p - q)^2 - 2q^2$. Then if we change
variables by setting
$$x = p - q$$
$$y = q$$
As far back as the eleventh and twelfth centuries mathematicians in India knew how to
find this solution. It was rediscovered in the seventeenth century by Fermat in France,
who also gave the smallest solution of $x^2 - 109y^2 = 1$, the even larger pair
The way that the size of the smallest solution of $x^2 - Dy^2 = 1$ depends upon D is
very erratic and is still not well understood today.
can be factored as $(a + bi)(a - bi)$ where $i = \sqrt{-1}$. If we rewrite the equation
$a^2 + b^2 = c^2$ as $(a + bi)(a - bi) = c^2$ then since the right side of the equation is a
square, we might wonder whether each term on the left side would have to be a square
too. For example, in the case of the triple (3, 4, 5) we have $(3 + 4i)(3 - 4i) = 5^2$ with
$3 + 4i = (2 + i)^2$ and $3 - 4i = (2 - i)^2$. So let us ask optimistically whether the equation
$(a + bi)(a - bi) = c^2$ can be rewritten as $(p + qi)^2 (p - qi)^2 = c^2$ with $a + bi = (p + qi)^2$
and $a - bi = (p - qi)^2$. We might hope also that the equation $(p + qi)^2 (p - qi)^2 = c^2$
was obtained by simply squaring the equation $(p + qi)(p - qi) = c$. Let us see what
happens when we multiply these various products out:
$$a + bi = (p + qi)^2 = (p^2 - q^2) + (2pq)i$$
hence $a = p^2 - q^2$ and $b = 2pq$
$$a - bi = (p - qi)^2 = (p^2 - q^2) - (2pq)i$$
hence again $a = p^2 - q^2$ and $b = 2pq$
$$c = (p + qi)(p - qi) = p^2 + q^2$$
Thus we have miraculously recovered the formulas for Pythagorean triples that we
obtained earlier by geometric means (with a and b switched, which doesn't really
matter):
$$a = p^2 - q^2 \qquad b = 2pq \qquad c = p^2 + q^2$$
Of course, our derivation of these formulas just now depended on several assumptions
that we haven't justified, but it does suggest that looking at complex numbers of
the form a + bi where a and b are integers might be a good idea. There is a name for
complex numbers of this form a + bi with a and b integers. They are called Gaussian
integers, since the great mathematician and physicist C. F. Gauss made a thorough
algebraic study of them some 200 years ago. We will develop the basic properties
of Gaussian integers in Chapter 3, in particular explaining why the derivation of the
formulas above is valid.
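The multiplications above are easy to reproduce with integer arithmetic. In this Python sketch a Gaussian integer p + qi is represented simply by the pair (p, q); the function names gaussian_square and norm are ours, and the code only illustrates the algebra rather than justifying the assumptions behind it.

```python
def gaussian_square(p, q):
    """Return (a, b) with a + bi = (p + qi)^2, using integers only."""
    return (p * p - q * q, 2 * p * q)

def norm(p, q):
    """Return (p + qi)(p - qi) = p^2 + q^2."""
    return p * p + q * q

for p, q in [(2, 1), (3, 2), (4, 1), (5, 2)]:
    a, b = gaussian_square(p, q)
    c = norm(p, q)
    assert a * a + b * b == c * c
    # Cross-check against Python's built-in complex multiplication.
    assert complex(p, q) * complex(p, q) == complex(a, b)
    print((p, q), "->", (a, b, c))
```

With (p, q) = (2, 1) this recovers the triple (3, 4, 5) from the example in the text.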
Diophantine Equations
Equations like $x^2 + y^2 = z^2$ or $x^2 - Dy^2 = 1$ that involve polynomials with integer
coefficients, and where the solutions sought are required to be integers, are called
Diophantine equations after the Greek mathematician Diophantus (ca. 250 A.D.) who
wrote a book about these equations that was very influential when European mathe-
maticians started to consider this topic much later in the 1600s. Usually Diophantine
equations are very hard to solve because of the restriction to integer solutions. The
first really interesting case is quadratic Diophantine equations. By the year 1800 there
was quite a lot known about the quadratic case, and we will be focusing on this case
in this book.
Diophantine equations of higher degree than quadratic are much more challenging
to understand. Probably the most famous one is $x^n + y^n = z^n$ where n is a fixed
integer greater than 2. When the French mathematician Fermat in the 1600s was reading
about Pythagorean triples in his copy of Diophantus' book he made a marginal note
that, in contrast with the equation $x^2 + y^2 = z^2$, the equation $x^n + y^n = z^n$ has no
solutions with positive integers x, y, z when n > 2 and that he had a marvelous proof
which unfortunately the margin was too narrow to contain. This is one of many state-
ments that he claimed were true but never wrote proofs of for public distribution, nor
have proofs been found among his manuscripts. Over the next century other math-
ematicians discovered proofs for all his other statements, but this one was far more
difficult to verify. The issue is clouded by the fact that he only wrote this statement
down the one time, whereas all his other important results were stated numerous
times in his correspondence with other mathematicians of the time. So perhaps he
only briefly believed he had a proof. In any case, the statement has become known
as Fermat's Last Theorem. It was finally proved in the 1990s by Andrew Wiles, using
some very deep mathematics developed over the preceding couple of decades.
Just as finding integer solutions of $x^2 + y^2 = z^2$ is equivalent to finding rational
points on the circle $x^2 + y^2 = 1$, so finding integer solutions of $x^n + y^n = z^n$ is
equivalent to finding rational points on the curve $x^n + y^n = 1$. For even values of
n > 2 this curve looks like a flattened out circle while for odd n it has a rather different
shape, extending out to infinity in the second and fourth quadrants, asymptotic to the
line $y = -x$:
Fermat's Last Theorem is equivalent to the statement that these curves have no rational
points except their intersections with the coordinate axes, where either x or
y is 0. It is curious that these curves only contain a finite number of rational points
(either two points or four points, depending on whether n is odd or even) whereas
quadratic curves like $x^2 + y^2 = n$ either contain no rational points or an infinite
dense set of rational points.
Exercises
1. (a) Make a list of the 16 primitive Pythagorean triples (a, b, c) with $c \le 100$,
regarding (a, b, c) and (b, a, c) as the same triple.
(b) How many more would there be if we allowed nonprimitive triples?
(c) How many triples (primitive or not) are there with c = 65 ?
2. Show that there are no Pythagorean triples (a, b, c) with a being a positive integer
multiple of b, or vice versa. ("Show" means "Prove", that is, give a logical argument
why the statement is true.)
3. (a) Find all the positive integer solutions of $x^2 - y^2 = 512$ by factoring $x^2 - y^2$ as
$(x + y)(x - y)$ and considering the possible factorizations of 512.
(b) Show that the equation $x^2 - y^2 = n$ has only a finite number of integer solutions
for each value of n.
(c) Find a value of n for which the equation $x^2 - y^2 = n$ has at least 100 different
positive integer solutions.
4. Show that there are only a finite number of Pythagorean triples (a, b, c) with a , b ,
or c equal to a given number n . (Part of the previous problem may be useful.)
5. Find an infinite sequence of Pythagorean triples where two of the numbers in each
triple differ by 2 .
6. Find a right triangle whose sides have integer lengths and whose acute angles are
close to 30 and 60 degrees by first finding the irrational value of r that corresponds to
a right triangle with acute angles exactly 30 and 60 degrees, then choosing a rational
number close to this irrational value of r .
7. Find a right triangle whose sides have integer lengths and where one of the nonhy-
potenuse sides is approximately twice as long as the other, using a method like the
one in the preceding problem. (One possible answer might be the (8, 15, 17) triangle,
or a triangle similar to this, but you should do better than this.)
8. Find a rational point on the sphere $x^2 + y^2 + z^2 = 1$ whose x, y, and z coordinates
are nearly equal. (You can decide what "nearly equal" means, but a point like $(2/3, 2/3, 1/3)$
doesn't qualify.)
9. (a) Derive formulas that give all the rational points on the circle $x^2 + y^2 = 2$ in
terms of a rational parameter m, the slope of the line through the point (1, 1) on the
circle. The calculations may be a little messy, but they work out fairly nicely in the
end to give
$$x = \frac{m^2 - 2m - 1}{m^2 + 1}, \qquad y = \frac{-m^2 - 2m + 1}{m^2 + 1}$$
(b) Using these formulas, find five different rational points on the circle in the first
quadrant, and hence five solutions of $a^2 + b^2 = 2c^2$ with positive integers a, b, c.
10. (a) Find formulas that give all the rational points on the upper branch of the
hyperbola $y^2 - x^2 = 1$.
(b) Can you find any relationship between these rational points and Pythagorean
triples?
11. (a) For integers x, what are the possible values of $x^2$ modulo 8?
(b) Show that the equation $x^2 - 2y^2 = \pm 3$ has no integer solutions by considering
this equation modulo 8.
(c) Show that there are no Pythagorean triples (a, b, c) with a and b differing by 3 .
12. Show that for every Pythagorean triple (a, b, c) the product abc must be divisible
by 60 . (It suffices to show that abc is divisible by 3 , 4 , and 5 .)
Chapter 1: The Farey Diagram
[Figure: the Farey diagram, a circle divided into curvilinear triangles whose vertices on the boundary circle are labeled with fractions, including 0/1, 1/1, 1/0, 1/2, 2/1, 1/3, 2/3, 3/2, and 3/1.]
What is shown here is not the whole diagram but only a finite part of it. The actual
diagram has infinitely many curvilinear triangles, getting smaller and smaller out near
the boundary circle. The diagram can be constructed by first inscribing the two big
triangles in the circle, then adding the four triangles that share an edge with the two
big triangles, then the eight triangles sharing an edge with these four, then sixteen
more triangles, and so on forever. With a little practice one can draw the diagram
without lifting one's pencil from the paper: First draw the outer circle starting at the
left or right side, then the diameter, then make the two large triangles, then the four
next-largest triangles, etc.
The vertices of all the triangles are labeled with fractions a/b , including the
fraction 1/0 for $\infty$, according to the following scheme. In the upper half of the
diagram first label the vertices of the big triangles 0/1, 1/1, and 1/0 as shown. Then
by induction, if the labels at the two ends of the long edge
of a triangle are a/b and c/d, the label on the third vertex
of the triangle is $\frac{a+c}{b+d}$. This fraction is called the mediant
of a/b and c/d.
The labels in the lower half of the diagram follow the
same scheme, starting with the labels $-0/1$, $-1/1$, and
$-1/0$ on the large triangle. Using $-1/0$ instead of 1/0
as the label of the vertex at the far left means that we are regarding $+\infty$ and $-\infty$ as
the same. The labels in the lower half of the diagram are the negatives of those in the
upper half, and the labels in the left half are the reciprocals of those in the right half.
The labels occur in their proper order around the circle, increasing from $-\infty$ to
$+\infty$ as one goes around the circle in the counterclockwise direction. To see why this is
so, it suffices to look at the upper half of the diagram where all numbers are positive.
What we want to show is that the mediant $\frac{a+c}{b+d}$ is always a number between $\frac{a}{b}$ and $\frac{c}{d}$
(hence the term "mediant"). Thus we want to see that if $\frac{a}{b} > \frac{c}{d}$ then $\frac{a}{b} > \frac{a+c}{b+d} > \frac{c}{d}$.
Since we are dealing with positive numbers, the inequality $\frac{a}{b} > \frac{c}{d}$ is equivalent to
$ad > bc$, and $\frac{a}{b} > \frac{a+c}{b+d}$ is equivalent to $ab + ad > ab + bc$ which follows from
$ad > bc$. Similarly, $\frac{a+c}{b+d} > \frac{c}{d}$ is equivalent to $ad + cd > bc + cd$ which also follows
from $ad > bc$.
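The inequality can be spot-checked with a few lines of Python. In this sketch fractions are stored as (numerator, denominator) pairs so that the mediant is formed from the given representatives; the names mediant and is_between are ours, and the check assumes positive numerators and denominators and two fractions of different value.

```python
from fractions import Fraction

def mediant(ab, cd):
    """Mediant (a + c)/(b + d) of two fractions given as (numerator, denominator) pairs."""
    (a, b), (c, d) = ab, cd
    return (a + c, b + d)

def is_between(ab, cd):
    """Check that the mediant of two unequal positive fractions lies strictly between them."""
    m = Fraction(*mediant(ab, cd))
    lo, hi = sorted([Fraction(*ab), Fraction(*cd)])
    return lo < m < hi

print(mediant((1, 2), (2, 3)))   # the pair (3, 5), that is, 3/5
for ab, cd in [((1, 2), (2, 3)), ((3, 1), (5, 2)), ((2, 7), (1, 3))]:
    assert is_between(ab, cd)
```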
We will show in the next section that the mediant rule for labeling vertices in the
diagram automatically produces labels that are fractions in lowest terms. It is not
immediately apparent why this should be so. For example, the mediant of 1/3 and
2/3 is 3/6 , which is not in lowest terms, and the mediant of 2/7 and 3/8 is 5/15 ,
again not in lowest terms. Somehow cases like this don't occur in the diagram.
Another non-obvious fact about the diagram is that all rational numbers occur
eventually as labels of vertices. This will be shown in the next section as well.
Farey Series
We can build the set of rational numbers by starting with the integers and then
inserting in succession all the halves, thirds, fourths, fifths, sixths, and so on. Let us
look at what happens if we restrict to rational numbers between 0 and 1 . Starting
with 0 and 1 we first insert 1/2 , then 1/3 and 2/3 , then 1/4 and 3/4 , skipping 2/4
which we already have, then inserting 1/5 , 2/5 , 3/5 , and 4/5 , then 1/6 and 5/6 , etc.
This process can be pictured as in the following diagram:
[Figure: the fractions between 0 and 1 with denominators 1 through 7, arranged in rows by denominator.]
Each time a new number is inserted, it forms the third vertex of a triangle whose
other two vertices are its two nearest neighbors among the numbers already listed,
and if these two neighbors are a/b and c/d then the new vertex is exactly the
mediant $\frac{a+c}{b+d}$.
The discovery of this curious phenomenon in the early 1800s was initially attributed
to a geologist and amateur mathematician named Farey, although it turned out that
he was not the first person to have noticed it. In spite of this confusion, the sequence
of fractions a/b between 0 and 1 with denominator less than or equal to a given
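number $n$ is usually called the $n$th Farey series $F_n$. For example, here is $F_7$:
$$\frac{0}{1}\ \ \frac{1}{7}\ \ \frac{1}{6}\ \ \frac{1}{5}\ \ \frac{1}{4}\ \ \frac{2}{7}\ \ \frac{1}{3}\ \ \frac{2}{5}\ \ \frac{3}{7}\ \ \frac{1}{2}\ \ \frac{4}{7}\ \ \frac{3}{5}\ \ \frac{2}{3}\ \ \frac{5}{7}\ \ \frac{3}{4}\ \ \frac{4}{5}\ \ \frac{5}{6}\ \ \frac{6}{7}\ \ \frac{1}{1}$$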
These numbers trace out the up-and-down path across the bottom of the figure above.
For the next Farey series F8 we would insert 1/8 between 0/1 and 1/7 , 3/8 between
1/3 and 2/5 , 5/8 between 3/5 and 2/3 , and finally 7/8 between 6/7 and 1/1 .
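The insertion process can be sketched in Python as follows. This is a minimal sketch under
stated assumptions rather than anything taken from the text: the function name `farey` and
the representation of fractions as (numerator, denominator) pairs are illustrative, and the
sketch assumes that a new fraction with denominator $n$ appears between neighbors $a/b$ and
$c/d$ exactly when $b + d = n$.

```python
def farey(n):
    """Return F_n as a list of (numerator, denominator) pairs.

    Built in stages: at stage `bound`, the mediant (a+c)/(b+d) is inserted
    between each pair of neighbors a/b, c/d with b + d == bound.
    """
    series = [(0, 1), (1, 1)]
    for bound in range(2, n + 1):
        nxt = [series[0]]
        for (a, b), (c, d) in zip(series, series[1:]):
            if b + d == bound:
                nxt.append((a + c, b + d))  # the new vertex is the mediant
            nxt.append((c, d))
        series = nxt
    return series

print(farey(7))
print(sorted(set(farey(8)) - set(farey(7))))  # the fractions new in F_8
```

No greatest common divisors are computed anywhere in the sketch; the entries nevertheless
come out in lowest terms for the values of $n$ tried here.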
There is a cleaner way to draw the preceding diagram using straight lines in a
square:
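[Figure: the square diagram drawn with straight lines, with the vertices along the bottom edge
labeled 0/1, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 1/1.]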
One can construct this diagram in stages, as indicated in the sequence of figures
below. Start with a square together with its diagonals and a vertical line from their
intersection point down to the bottom edge of the square. Next, connect the resulting
midpoint of the lower edge of the square to the two upper corners of the square and
drop vertical lines down from the two new intersection points this produces. Now add
a W-shaped zigzag and drop verticals again. It should then be clear how to continue.
A nice feature of this construction is that if we start with a square whose sides have
length 1 and place this square so that its bottom edge lies along the x -axis with the
lower left corner of the square at the origin, then the construction assigns labels to
the vertices along the bottom edge of the square that are exactly the x coordinates of
these points. Thus the vertex labeled 1/2 really is at the midpoint of the bottom edge
of the square, and the vertices labeled 1/3 and 2/3 really are 1/3 and 2/3 of the way
along this edge, and so forth. In order to verify this fact the key observation is the
following: For a vertical line segment in the diagram whose lower endpoint is at the
point $\left(\frac{a}{b}, 0\right)$ on the $x$-axis, the upper endpoint is at the point $\left(\frac{a}{b}, \frac{1}{b}\right)$. This is
obviously true at the first stage of the construction, and it continues to hold at each
successive stage since for a quadrilateral whose four vertices have coordinates
$\left(\frac{a}{b}, 0\right)$, $\left(\frac{c}{d}, 0\right)$, $\left(\frac{a}{b}, \frac{1}{b}\right)$, $\left(\frac{c}{d}, \frac{1}{d}\right)$, as shown in the figure at the right, the two diagonals
intersect at the point $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$. For example, to verify that $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ is on the
line from $\left(\frac{a}{b}, 0\right)$ to $\left(\frac{c}{d}, \frac{1}{d}\right)$ it suffices to show that the line segments from $\left(\frac{a}{b}, 0\right)$
to $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ and from $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ to $\left(\frac{c}{d}, \frac{1}{d}\right)$ have the same slope. These slopes are
$$\frac{1/(b+d) - 0}{(a+c)/(b+d) - a/b} \cdot \frac{b(b+d)}{b(b+d)} = \frac{b}{b(a+c) - a(b+d)} = \frac{b}{bc - ad}$$
and
$$\frac{1/d - 1/(b+d)}{c/d - (a+c)/(b+d)} \cdot \frac{d(b+d)}{d(b+d)} = \frac{b + d - d}{c(b+d) - d(a+c)} = \frac{b}{bc - ad}$$
so they are equal. The same argument works for the other diagonal, just by interchanging
$\frac{a}{b}$ and $\frac{c}{d}$.
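The slope computation can be double-checked with exact rational arithmetic. The sketch below
is only an illustration under stated assumptions: the function name `diagonal_intersection`
is hypothetical, and the intersection is found by parametrizing the two diagonals rather
than by comparing slopes.

```python
from fractions import Fraction

def diagonal_intersection(a, b, c, d):
    """Intersect the diagonals of the quadrilateral with vertices
    (a/b, 0), (c/d, 0), (a/b, 1/b), (c/d, 1/d), assuming a/b != c/d."""
    x1, x2 = Fraction(a, b), Fraction(c, d)
    h1, h2 = Fraction(1, b), Fraction(1, d)
    # Diagonal 1 runs from (x1, 0) to (x2, h2): points (x1 + t*(x2 - x1), t*h2).
    # Diagonal 2 runs from (x2, 0) to (x1, h1): points (x2 + s*(x1 - x2), s*h1).
    # Equal x-coordinates force t + s = 1; equal heights then give t*h2 = (1 - t)*h1.
    t = h1 / (h1 + h2)
    return (x1 + t * (x2 - x1), t * h2)

print(diagonal_intersection(0, 1, 1, 1))
print(diagonal_intersection(1, 3, 1, 2))
```

For the starting square the result is the center of the square, and for 1/3 and 1/2 the
result has x-coordinate equal to the mediant 2/5.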
Going back to the square diagram, this fact that we have just shown implies that
the successive Farey series can be obtained by taking the vertices that lie above the
line $y = \frac{1}{2}$, then the vertices above $y = \frac{1}{3}$, then above $y = \frac{1}{4}$, and so on. Here we
are assuming the two properties of the Farey diagram that will be shown in the next
section, that all rational numbers occur eventually as labels on vertices, and that these
labels are always fractions in lowest terms.
In the square diagram, the most important thing for our purposes is the triangles,
not the vertical lines. We can get rid of all the vertical lines by shrinking each one to
its lower endpoint, converting each triangle into a curvilinear triangle with semicircles
as edges, as shown in the diagram below.
[Figure: the diagram redrawn with semicircular edges, with the vertices on the x-axis labeled
0/1, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 1/1.]
This looks more like a portion of the Farey diagram we started with at the beginning of
the chapter, but with the outer boundary circle straightened into a line. The advantage
of the new version is that the labels on the vertices are exactly in their correct places
along the $x$-axis, so the vertex labeled $\frac{a}{b}$ is exactly at the point $\frac{a}{b}$ on the $x$-axis.
This diagram can be enlarged so as to include similar diagrams for fractions be-
tween all pairs of adjacent integers, not just 0 and 1 , all along the x -axis:
[Figure: the same diagram extended along the x-axis to the fractions between all pairs of
adjacent integers.]
We can also put in vertical lines at the integer points, extending upward to infinity.
These correspond to the edges having one endpoint at the vertex 1/0 in the original
Farey diagram.
All these diagrams are variants of the Farey diagram we started with at the begin-
ning of the chapter. Let us call the diagram we have just drawn the standard Farey
diagram and the one at the beginning of the chapter the circular Farey diagram. We
could also form a variant of the Farey diagram from copies of the square:
[Figure: a row of copies of the square, with the integers from -3/1 to 3/1 labeling the bottom
vertices and the label 1/0 on each of the top vertices.]
Next we describe a variant of the circular Farey diagram that is closely related
to Pythagorean triples. Recall from Chapter 0 that rational points (x, y) on the unit
circle correspond to rational points p/q on the x -axis by means of lines through the
point $(0, 1)$ on the circle. In formulas, $(x, y) = \left(\frac{2pq}{p^2 + q^2}, \frac{p^2 - q^2}{p^2 + q^2}\right)$. Using this
correspondence, we can label the rational points on the circle by the corresponding rational
points on the x -axis and then construct a new Farey diagram in the circle by filling in
triangles by the mediant rule just as before.
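A small Python sketch of this correspondence, using exact fractions. The function name
`circle_point` is an assumption made for illustration; the formula is the one just stated,
with $q = 0$ allowed so that the label 1/0 can be handled as well.

```python
from fractions import Fraction

def circle_point(p, q):
    """Point (x, y) on the unit circle labeled by p/q (q = 0 is allowed for 1/0)."""
    n = p * p + q * q
    return (Fraction(2 * p * q, n), Fraction(p * p - q * q, n))

for p, q in [(1, 0), (0, 1), (1, 1), (1, 2), (2, 3)]:
    x, y = circle_point(p, q)
    print(f"{p}/{q} -> ({x}, {y}), x^2 + y^2 = {x * x + y * y}")
```

The label 1/0 lands on the point (0, 1) at the top of the circle, and 1/2 lands on
(4/5, -3/5), which comes from the (3, 4, 5) triangle.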
The result is a version of the circular Farey diagram that is rotated by 90 degrees
to put 1/0 at the top of the circle, and there are also some perturbations of the
positions of the other vertices and the shapes of the triangles. The next figure shows
an enlargement of the new part of the diagram, with the vertices labeled by both the
fraction p/q and the coordinates (x, y) of the vertex:
The construction we have described for the Farey diagram involves an inductive
process, where more and more triangles are added in succession. With a construction
like this it is not easy to tell by a simple calculation whether or not two given rational
numbers a/b and c/d are joined by an edge in the diagram. Fortunately there is such
a criterion:
Two rational numbers $a/b$ and $c/d$ are joined by an edge in the Farey diagram
exactly when the determinant $ad - bc$ of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is $\pm 1$. This applies also
when one of $a/b$ or $c/d$ is $\pm 1/0$.
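The criterion is easy to apply mechanically. The following Python sketch is an illustration
only; the function name `joined_by_edge` is hypothetical, and the fractions are assumed to be
given in lowest terms as separate numerators and denominators.

```python
def joined_by_edge(a, b, c, d):
    """True when a/b and c/d satisfy the determinant criterion ad - bc = +1 or -1."""
    return abs(a * d - b * c) == 1

print(joined_by_edge(1, 2, 2, 3))  # 1*3 - 2*2 = -1
print(joined_by_edge(1, 3, 2, 3))  # 1*3 - 3*2 = -3
print(joined_by_edge(1, 0, 5, 1))  # 1*1 - 0*5 = 1
```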
We will prove this in the next section. What it means in terms of the standard Farey
diagram is that if one were to start with the upper half of the xy -plane and insert
vertical lines through all the integer points on the x -axis, and then insert semicircles
perpendicular to the x -axis joining each pair of rational points a/b and c/d such
that $ad - bc = \pm 1$, then no two of these vertical lines or semicircles would cross, and
they would divide the upper half of the plane into non-overlapping triangles. This
is really quite remarkable when you think about it, and it does not happen for other
values of the determinant besides $\pm 1$. For example, for determinant $\pm 2$ the edges
would be the dotted lines in the figure below. Here there are three lines crossing in
each triangle of the original Farey diagram, and these lines divide each triangle of the
Farey diagram into six smaller triangles.
To compute the value of a continued fraction one starts in the lower right corner and
works one's way upward. For example in the continued fraction for $\frac{7}{16}$ one starts with
$3 + \frac{1}{2} = \frac{7}{2}$, then taking 1 over this gives $\frac{2}{7}$, and adding the 2 to this gives $\frac{16}{7}$, and
finally 1 over this gives $\frac{7}{16}$.
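The bottom-up evaluation can be sketched in Python with exact fractions. The function name
`cf_value` and its argument layout (the initial integer followed by a list of the remaining
terms) are assumptions made for illustration.

```python
from fractions import Fraction

def cf_value(a0, quotients):
    """Value of a0 + 1/(a1 + 1/(a2 + ... + 1/an)), working from the last term upward."""
    tail = Fraction(0)
    for a in reversed(quotients):
        tail = 1 / (a + tail)   # add the term, then take 1 over the result
    return a0 + tail

print(cf_value(0, [2, 3, 2]))      # the continued fraction for 7/16
print(cf_value(2, [1, 3, 1, 4]))   # the continued fraction for 67/24
```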
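Here is the general form of a continued fraction:
$$\frac{p}{q} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}}$$
To write this in more compact form on a single line one can write it as
$$\frac{p}{q} = a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$$
For example:
$$\frac{7}{16} = 1{\nearrow}2 + 1{\nearrow}3 + 1{\nearrow}2 \qquad\qquad \frac{67}{24} = 2 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}4$$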
To compute the continued fraction for a given rational number one starts in the
upper left corner and works one's way downward, as the following example shows:
If one is good at mental arithmetic and the numbers aren't too large, only the final
form of the answer needs to be written down: $\frac{67}{24} = 2 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}4$.
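Working downward amounts to repeated division with remainder, which can be sketched as
follows. The function name `cf_expand` is hypothetical, and the sketch assumes integers
$p \ge 0$ and $q > 0$.

```python
def cf_expand(p, q):
    """Terms [a0, a1, ..., an] of the continued fraction for p/q (p >= 0, q > 0)."""
    quotients = []
    while q != 0:
        a, r = divmod(p, q)   # p = a*q + r with 0 <= r < q
        quotients.append(a)
        p, q = q, r           # continue with the reciprocal of the fractional part r/q
    return quotients

print(cf_expand(67, 24))
print(cf_expand(7, 16))
```

For a fraction between 0 and 1 the first entry of the returned list is the initial integer 0.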
why the successive quotients for this example are the same as in the preceding ex-
ample.) It is easy to see from the displayed equations why 3 has to be the greatest
common divisor of 72 and 201 , since from the first equation it follows that any divi-
sor of 72 and 201 must also divide 57 , then the second equation shows it must divide
15 , the third equation then shows it must divide 12 , and the fourth equation shows
it must divide 3 , the last nonzero remainder. Conversely, if a number divides the last
nonzero remainder 3 , then the last equation shows it must also divide the 12 , and
the next-to-last equation then shows it must divide 15 , and so on until we conclude
that it divides all the numbers not in the shaded rectangle, including the original two
numbers 72 and 201 . The same reasoning applies in general.
A more obvious way to try to compute the greatest common divisor of two num-
bers would be to factor each of them into a product of primes, then look to see which
primes occurred as factors of both, and to what power. But to factor a large number
into its prime factors is a very laborious and time-consuming process. For example,
even a large computer would have a hard time factoring a number of a hundred digits
into primes, so it would not be feasible to find the greatest common divisor of a pair
of hundred-digit numbers this way. However, the computer would have no trouble at
all applying the Euclidean algorithm to find their greatest common divisor.
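The chain of divisions described above for 72 and 201 can be reproduced with a short Python
sketch. The function name `euclid_steps` and the layout of the recorded tuples are assumptions
made for illustration.

```python
def euclid_steps(m, n):
    """Run the Euclidean algorithm on m >= n > 0.

    Returns (gcd, steps), where each step (m, q, n, r) records m = q*n + r.
    """
    steps = []
    while n != 0:
        q, r = divmod(m, n)
        steps.append((m, q, n, r))
        m, n = n, r
    return m, steps

g, steps = euclid_steps(201, 72)
for m, q, n, r in steps:
    print(f"{m} = {q} * {n} + {r}")
print("gcd:", g)
```

The remainders come out as 57, 15, 12, 3, 0, so the last nonzero remainder is 3, and the
quotients 2, 1, 3, 1, 4 agree with the terms of the continued fraction for 67/24 = 201/72.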
Having seen what continued fractions are, let us now see what they have to do with
the Farey diagram. Some examples will illustrate this best, so let us first look at the
continued fraction for 7/16 again. This has 2, 3, 2 as its sequence of partial quotients.
We use these three numbers to build a strip of three large triangles subdivided into
2 , 3 , and 2 smaller triangles, from left to right:
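[Figure: the strip of triangles for $\frac{7}{16} = 1{\nearrow}2 + 1{\nearrow}3 + 1{\nearrow}2$, with vertices 1/0, 1/1, 1/2, 4/9, 7/16
along the top edge and 0/1, 1/3, 2/5, 3/7 along the bottom edge.]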
We can think of the diagram as being formed from three "fans", where the first fan is
made from the first 2 small triangles, the second fan from the next 3 small triangles,
and the third fan from the last 2 small triangles. Now we begin labeling the vertices
of this strip. On the left edge we start with the labels 1/0 and 0/1 . Then we use the
mediant rule for computing the third label of each triangle in succession as we move
from left to right in the strip. Thus we insert, in order, the labels 1/1 , 1/2 , 1/3 , 2/5 ,
3/7 , 4/9 , and finally 7/16 .
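The labeling procedure can be tried on other examples with a short Python sketch. The
function name `strip_labels`, the pair representation of fractions, and the bookkeeping with
a fixed "pivot" vertex for each fan are assumptions made for illustration. The left edge is
taken to be 1/0 and 0/1, so the sketch applies to fractions between 0 and 1.

```python
def strip_labels(quotients):
    """Labels inserted by the mediant rule, in order, for a strip whose fans
    contain quotients[0], quotients[1], ... small triangles."""
    labels = []
    previous_pivot, pivot = (1, 0), (0, 1)   # the left edge: 1/0 and 0/1
    for a in quotients:
        moving = previous_pivot
        for _ in range(a):
            # Each small triangle in the fan adds the pivot to the moving vertex.
            moving = (moving[0] + pivot[0], moving[1] + pivot[1])
            labels.append(moving)
        previous_pivot, pivot = pivot, moving   # the next fan pivots on the newest vertex
    return labels

print(strip_labels([2, 3, 2]))
```

For the numbers 2, 3, 2 this reproduces the labels 1/1, 1/2, 1/3, 2/5, 3/7, 4/9, 7/16 listed
above.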
Was it just an accident that the final label was the fraction 7/16 that we started
with, or does this always happen? Doing more examples should help us decide. Here
is a second example:
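[Figure: the strip of triangles for $\frac{9}{31} = 1{\nearrow}3 + 1{\nearrow}2 + 1{\nearrow}4$, with vertices 1/0, 1/1, 1/2, 1/3, 3/10,
5/17, 7/24, 9/31 along the top edge and 0/1, 1/4, 2/7 along the bottom edge.]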
Again the final vertex on the right has the same label as the fraction we started with.
The reader is encouraged to try more examples to make sure we are not rigging things
to get a favorable outcome by only choosing examples that work.
In fact this always works for fractions p/q between 0 and 1 . For fractions larger
than 1 the procedure works if we modify it by replacing the label 0/1 with the initial
integer $a_0/1$ in the continued fraction $a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$. This is illustrated
by the 67/24 example:
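[Figure: the strip of triangles for $\frac{67}{24} = 2 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}4$, with vertices 1/0, 3/1, 14/5
along the top edge and 2/1, 5/2, 8/3, 11/4, 25/9, 39/14, 53/19, 67/24 along the bottom edge.]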
For comparison, here is the corresponding strip for the reciprocal, 24/67 :
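[Figure: the strip of triangles for $\frac{24}{67} = 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}4$, with vertices 1/0, 1/1, 1/2,
2/5, 3/8, 4/11, 9/25, 14/39, 19/53, 24/67 along the top edge and 0/1, 1/3, 5/14 along the
bottom edge.]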
Now let us see how all this relates to the Farey diagram. Since the rule for labeling
vertices in the triangles along the horizontal strip for a fraction p/q is the mediant
rule, each of the triangles in the strip is a triangle in the Farey diagram, somewhat
distorted in shape, and the strip of triangles can be regarded as a sequence of adjacent
triangles in the diagram. Here is what this looks like for the fraction 7/16 in the
circular Farey diagram, slightly distorted for the sake of visual clarity:
[Figure: the strip of triangles for $\frac{7}{16} = 1{\nearrow}2 + 1{\nearrow}3 + 1{\nearrow}2$ drawn in the circular Farey diagram.
Convergents: 0, 1/2, 3/7, 7/16.]
In the strip of triangles for a fraction p/q there is a zigzag path from 1/0 to p/q
that we have indicated by the heavily shaded edges. The vertices that this zigzag path
passes through have a special significance. They are the fractions that occur as the
values of successively larger initial portions of the continued fraction, as illustrated
in the following example:
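[Figure: the strip of triangles for $\frac{67}{24} = 2 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}4$, with the vertices 2, 3, 11/4,
14/5, 67/24 along the zigzag path marked.]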
These fractions are called the convergents for the given fraction. Thus the convergents
for 67/24 are 2 , 3 , 11/4 , 14/5 , and 67/24 itself.
From the preceding examples one can see that each successive vertex label $p_i/q_i$
along the zigzag path for a continued fraction $\frac{p}{q} = a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$ is
computed in terms of the two preceding vertex labels according to the rule
$$\frac{p_i}{q_i} = \frac{a_i p_{i-1} + p_{i-2}}{a_i q_{i-1} + q_{i-2}}$$
This is because the mediant rule is being applied $a_i$ times, adding $p_{i-1}/q_{i-1}$ to the
previously obtained fraction each time until the next label $p_i/q_i$ is obtained.
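The rule translates directly into a short loop. In the Python sketch below the function
name `convergents` and the pair representation are assumptions made for illustration, and
the starting values $p_{-1}/q_{-1} = 1/0$ and $p_0/q_0 = a_0/1$ are the two labels on the left
edge of the strip.

```python
def convergents(quotients):
    """Given [a0, a1, ..., an], return [(p0, q0), (p1, q1), ..., (pn, qn)]."""
    p_prev, q_prev = 1, 0        # the vertex 1/0
    p, q = quotients[0], 1       # the vertex a0/1
    result = [(p, q)]
    for a in quotients[1:]:
        # p_i = a_i * p_{i-1} + p_{i-2}, and likewise for q_i
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        result.append((p, q))
    return result

print(convergents([2, 1, 3, 1, 4]))
```

For 67/24 this gives 2/1, 3/1, 11/4, 14/5, 67/24, the convergents listed above.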
[Figure: the strip of triangles with vertices 1/0, 0/1, 1/1, 1/2, 1/3, 2/5, 3/8, together with
the corresponding zigzag path to 3/8 in the standard Farey diagram.]
This example is typical of the general case, where the zigzag path for a continued
fraction $\frac{p}{q} = a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$ becomes a "pinball path" in the standard Farey
diagram, starting down the vertical line from 1/0 to $a_0/1$, then turning left across $a_1$
triangles, then right across $a_2$ triangles, then left across $a_3$ triangles, continuing to
alternate left and right turns until reaching the final vertex p/q . Two consequences
of this are:
(1) The convergents are alternately smaller than and greater than p/q .
(2) The triangles that form the strip of triangles for p/q are exactly the triangles in
the Farey diagram that lie directly above the point p/q on the x -axis.
In particular, since every positive rational number has a continued fraction ex-
pansion, we see that every positive rational number occurs eventually as the label of
some vertex in the positive half of the diagram. All negative rational numbers then
occur as labels in the negative half.
Proof of the Theorem: The continued fraction $\frac{p}{q} = a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$
determines a strip of triangles:
We will show that the label $p_n/q_n$ on the final vertex in this strip is equal to $p/q$, the
value of the continued fraction. Replacing $n$ by $i$, we conclude that this holds also
for each initial segment $a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_i$ of the continued fraction. This is
just saying that the vertices $p_i/q_i$ along the strip are the convergents to $p/q$, which
is what the theorem claims.
To prove that $p_n/q_n = p/q$ we will use $2 \times 2$ matrices. Consider the product
$$P = \begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & a_2 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}$$
We can multiply this product out starting either from the left or from the right. Suppose
first that we multiply starting at the left. The initial matrix is $\begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}$ and we
can view the two columns of this matrix as the two fractions 1/0 and $a_0/1$ labeling
the left edge of the strip of triangles. When we multiply this matrix by the next matrix
we get
$$\begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} = \begin{pmatrix} a_0 & 1 + a_0 a_1 \\ 1 & a_1 \end{pmatrix} = \begin{pmatrix} p_0 & p_1 \\ q_0 & q_1 \end{pmatrix}$$
The two columns here give the fractions at the ends of the second edge of the zigzag
path. The same thing happens for subsequent matrix multiplications, as multiplying
by the next matrix in the product takes the matrix corresponding to one edge of the
zigzag path to the matrix corresponding to the next edge:
$$\begin{pmatrix} p_{i-2} & p_{i-1} \\ q_{i-2} & q_{i-1} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix} = \begin{pmatrix} p_{i-1} & p_{i-2} + a_i p_{i-1} \\ q_{i-1} & q_{i-2} + a_i q_{i-1} \end{pmatrix} = \begin{pmatrix} p_{i-1} & p_i \\ q_{i-1} & q_i \end{pmatrix}$$
In the end, when all the matrices have been multiplied, we obtain the matrix corresponding
to the last edge in the strip from $p_{n-1}/q_{n-1}$ to $p_n/q_n$. Thus the second
column of the product $P$ is $p_n/q_n$, and what remains is to show that this equals the
value $p/q$ of the continued fraction $a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$.
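The value of the continued fraction $a_0 + 1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$ is computed by
working from right to left. If we let $r_i/s_i$ be the value of the tail $1{\nearrow}a_i + 1{\nearrow}a_{i+1} + \cdots + 1{\nearrow}a_n$
of the continued fraction, then $r_n/s_n = 1/a_n$ and we have
$$\frac{r_i}{s_i} = \frac{1}{a_i + \dfrac{r_{i+1}}{s_{i+1}}} = \frac{s_{i+1}}{a_i s_{i+1} + r_{i+1}} \qquad \text{and finally} \qquad \frac{p}{q} = a_0 + \frac{r_1}{s_1} = \frac{a_0 s_1 + r_1}{s_1}$$
In terms of matrices this implies that we have
$$\begin{pmatrix} r_n \\ s_n \end{pmatrix} = \begin{pmatrix} 1 \\ a_n \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix} \begin{pmatrix} r_{i+1} \\ s_{i+1} \end{pmatrix} = \begin{pmatrix} s_{i+1} \\ r_{i+1} + a_i s_{i+1} \end{pmatrix} = \begin{pmatrix} r_i \\ s_i \end{pmatrix}$$
$$\text{and} \qquad \begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} r_1 \\ s_1 \end{pmatrix} = \begin{pmatrix} r_1 + a_0 s_1 \\ s_1 \end{pmatrix} = \begin{pmatrix} p \\ q \end{pmatrix}$$
This means that when we multiply out the product $P$ starting from the right, then the
second columns will be successively $\begin{pmatrix} r_n \\ s_n \end{pmatrix}$, $\begin{pmatrix} r_{n-1} \\ s_{n-1} \end{pmatrix}$, $\cdots$, $\begin{pmatrix} r_1 \\ s_1 \end{pmatrix}$, and finally $\begin{pmatrix} p \\ q \end{pmatrix}$.
We already showed this second column is $\begin{pmatrix} p_n \\ q_n \end{pmatrix}$, so $p/q = p_n/q_n$ and the proof is
complete.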
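The matrix product in the proof can be multiplied out numerically as a check. The Python
sketch below is an illustration only; the helper names `matmul` and `product_P` are
hypothetical, and matrices are stored as nested lists of rows.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices given as [[row1], [row2]]."""
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]]]

def product_P(quotients):
    """Multiply out P from the left for the terms [a0, a1, ..., an]."""
    P = [[1, quotients[0]], [0, 1]]
    for a in quotients[1:]:
        P = matmul(P, [[0, 1], [1, a]])
    return P

P = product_P([2, 1, 3, 1, 4])
print(P)                                      # columns are the last edge of the strip
print(P[0][0] * P[1][1] - P[0][1] * P[1][0])  # the determinant of P
```

For the terms 2, 1, 3, 1, 4 the columns of the result are 14/5 and 67/24, the endpoints of the
last edge of the zigzag path for 67/24.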
An interesting fact that can be deduced from the preceding proof is that for a
continued fraction $1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_n$ with no initial integer $a_0$, if we reverse the
order of the numbers $a_i$, this leaves the denominator unchanged. For example
$$1{\nearrow}2 + 1{\nearrow}3 + 1{\nearrow}4 = \frac{13}{30} \qquad \text{and} \qquad 1{\nearrow}4 + 1{\nearrow}3 + 1{\nearrow}2 = \frac{7}{30}$$
To see why this must always be true we use the operation of transposing a matrix to
interchange its rows and columns. For a $2 \times 2$ matrix this just amounts to interchanging
the upper-right and lower-left entries:
$$\begin{pmatrix} a & c \\ b & d \end{pmatrix}^T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
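The reversal property is easy to test numerically. The following Python sketch is only a
spot check on a few examples, not a proof, and the function name `cf_tail_value` is an
assumption made for illustration.

```python
from fractions import Fraction

def cf_tail_value(quotients):
    """Value of 1/(a1 + 1/(a2 + ... + 1/an)) with no initial integer a0."""
    tail = Fraction(0)
    for a in reversed(quotients):
        tail = 1 / (a + tail)
    return tail

for terms in ([2, 3, 4], [1, 2, 3, 4, 5], [7, 1, 1, 9]):
    forward = cf_tail_value(terms)
    backward = cf_tail_value(terms[::-1])
    print(terms, forward, backward)
```

In each case the two values printed have the same denominator, while the numerators generally
differ.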
Theorem. In the Farey diagram, two vertices labeled $a/b$ and $c/d$ are joined by an
edge if and only if the determinant $ad - bc$ of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is equal to $\pm 1$.
Corollary. The mediant rule for labeling the vertices in the Farey diagram always
produces labels a/b that are fractions in lowest terms.
Proof: Consider an edge joining a vertex labeled $a/b$ to some other vertex labeled
$c/d$. By the preceding proposition we know that $ad - bc = \pm 1$. This equation implies
that $a$ and $b$ can have no common divisor greater than 1 since any common divisor of
$a$ and $b$ must divide the products $ad$ and $bc$, hence also the difference $ad - bc = \pm 1$,
but the only divisors of $\pm 1$ are $\pm 1$.
Now we return to proving the converse half of the theorem, which says that there
is an edge joining $a/b$ to $c/d$ whenever $ad - bc = \pm 1$. To do this we will examine how
all the edges emanating from a fixed vertex $a/b$ are related. To begin, if $a/b = 0/1$
then the matrices $\begin{pmatrix} 0 & c \\ 1 & d \end{pmatrix}$ with determinant $\pm 1$ are the matrices $\begin{pmatrix} 0 & \pm 1 \\ 1 & d \end{pmatrix}$, and
these correspond exactly to the edges in the diagram from $0/1$ to $\pm 1/d$. There is
a similar exact correspondence for the edges from 1/0 . For the other vertices a/b ,
the example a/b = 5/8 is shown in the left half of the figure below. The first edges
drawn to this vertex come from 2/3 and 3/5 , and after this all the other edges from
5/8 are drawn in turn. As one can see, they are all obtained by adding (5, 8) to (2, 3)
or (3, 5) repeatedly. If we choose any one of these edges from 5/8 , say the edge to
2/3 for example, then the edges from 5/8 have their other endpoints at the fractions
(2 + 5k)/(3 + 8k) as k ranges over all integers, with positive values of k giving the
edges on the upper side of the edge to 2/3 and negative values of k giving the edges
on the lower side of the edge to 2/3 .
The same thing happens for an arbitrary value of a/b as shown in the right half of
the figure, where a/b initially arises as the mediant of c/d and e/f . In this case if
we choose the edge to c/d as the starting edge, then the other edges go from a/b to
$(c + ka)/(d + kb)$. In particular, when $k = -1$ we get the edge to $(c - a)/(d - b) =
(a - c)/(b - d) = e/f$.
To finish the argument we need to know how the various matrices $\begin{pmatrix} a & x \\ b & y \end{pmatrix}$ of determinant
$ay - bx = \pm 1$ having the same first column are related. This can be deduced
from the following result about integer solutions of linear equations with integer co-
efficients:
Lemma. Suppose a and b are integers with no common divisor. If one solution of
$ay - bx = n$ is $(x, y) = (c, d)$, then the general solution is $(x, y) = (c + ka, d + kb)$
for k an arbitrary integer.
The proof will use the same basic argument as is used in linear algebra to show
that the general solution of a system of nonhomogeneous linear equations is obtained
from any particular solution by adding the general solution of the associated system
of homogeneous equations.
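Before proving the lemma one can test it by brute force on a small range. The Python sketch
below is a finite check under stated assumptions, not a proof: the function names
`solutions_in_box` and `lemma_family` and the search bound are hypothetical choices.

```python
def solutions_in_box(a, b, n, bound):
    """All integer solutions (x, y) of a*y - b*x = n with |x|, |y| <= bound."""
    return {(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if a * y - b * x == n}

def lemma_family(a, b, c, d, bound):
    """The points (c + k*a, d + k*b) that lie in the same box."""
    limit = bound + abs(c) + abs(d)   # enough values of k to cover the box
    points = set()
    for k in range(-limit, limit + 1):
        x, y = c + k * a, d + k * b
        if abs(x) <= bound and abs(y) <= bound:
            points.add((x, y))
    return points

# With a/b = 5/8, one solution of 5y - 8x = 1 is (x, y) = (3, 5).
print(solutions_in_box(5, 8, 1, 50) == lemma_family(5, 8, 3, 5, 50))
```

The example uses the vertex 5/8, whose neighbor 3/5 supplies the particular solution.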
Now we can easily finish the proof of the theorem. The lemma in the cases $n = \pm 1$
implies that the edges in the Farey diagram with $a/b$ at one endpoint account for all
matrices $\begin{pmatrix} a & x \\ b & y \end{pmatrix}$ of determinant $ay - bx = \pm 1$.
[Figure: an infinite strip of triangles, with vertices including 0/1, 1/2, 3/5, 8/13, 21/34,
55/89.]
Notice that these fractions after 1/0 are the successive ratios of the famous Fibonacci
sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, $\cdots$ where each number is the sum of its two
predecessors. The sequence of convergents is thus 0/1, 1/1, 1/2, 2/3, 3/5, 5/8, 8/13, $\cdots$,
the vertices along the zigzag path. The way
this zigzag path looks in the standard Farey
diagram is shown in the figure at the right.
What happens when we follow this path far-
ther and farther? The path consists of an
infinite sequence of semicircles, each one
shorter than the preceding one and sharing
a common endpoint. The left endpoints of
the semicircles form an increasing sequence
of numbers which have to be approaching a certain limiting value x . We know x has
to be finite since it is certainly less than each of the right-hand endpoints of the
semicircles, the convergents 1/1, 2/3, 5/8, $\cdots$. Similarly the right endpoints of the
semicircles form a decreasing sequence of numbers approaching a limiting value $y$ greater
than each of the left-hand endpoints 0/1, 1/2, 3/5, $\cdots$. Obviously $x \le y$. Is it
possible that $x$ is not equal to $y$? If this happened, the infinite sequence of semicircles
would be approaching the semicircle from x to y . Above this semicircle there would
then be an infinite number of semicircles, all the semicircles in the infinite sequence.
Between x and y there would have to be a rational number p/q (between any two
real numbers there is always a rational number), so above this rational number there
would be an infinite number of semicircles, hence an infinite number of triangles in
the Farey diagram. But we know that there are only finitely many triangles above any
rational number p/q , namely the triangles that appear in the strip for the continued
fraction for p/q . This contradiction shows that x has to be equal to y . Thus the
sequence of convergents along the edges of the infinite strip of triangles converges to
a unique real number x . (This is why the convergents are called convergents.)
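The squeezing of the two sequences of endpoints can be observed numerically. The Python
sketch below is an illustration only, with the hypothetical function name
`fibonacci_convergents`; it lists the convergents as ratios of consecutive Fibonacci numbers
and separates them into left and right endpoints.

```python
from fractions import Fraction

def fibonacci_convergents(count):
    """The first `count` convergents 0/1, 1/1, 1/2, 2/3, 3/5, ... as Fractions."""
    a, b = 0, 1
    out = []
    for _ in range(count):
        out.append(Fraction(a, b))
        a, b = b, a + b
    return out

c = fibonacci_convergents(12)
left, right = c[0::2], c[1::2]     # 0/1, 1/2, 3/5, ...  and  1/1, 2/3, 5/8, ...
print([str(f) for f in left])
print([str(f) for f in right])
print(float(left[-1]), float(right[-1]), right[-1] - left[-1])
```

The left endpoints increase, the right endpoints decrease, and the gap between the last pair
computed here is already 1/12816.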
This argument works for arbitrary infinite continued fractions, so we have shown
the following general result:
This limit is by definition the value of the infinite continued fraction. There is a
simple method for computing the value in the example involving Fibonacci numbers.
We begin by setting
$$x = 1{\nearrow}1 + 1{\nearrow}1 + 1{\nearrow}1 + \cdots$$
Proof : In the Farey diagram consider the vertical line L going upward from a given
irrational number x on the x -axis. The lower endpoint of L is not a vertex of the
Farey diagram since x is irrational. Thus as we move downward along L we cross a
sequence of triangles, entering each triangle by crossing its upper edge and exiting
the triangle by crossing one of its two lower edges. When we exit one triangle we
are entering another, the one just below it, so the sequence of triangles and edges
we cross must be infinite. The left and right endpoints of the edges in the sequence
must be approaching the single point x by the argument we gave in the preceding
proposition, so the edges themselves are approaching x . Thus the triangles in the
sequence form a single infinite strip consisting of an infinite sequence of fans with
their pivot vertices on alternate sides of the strip. The zigzag path along this strip
gives a continued fraction for x .
For the uniqueness, we have seen that an infinite continued fraction for x cor-
responds to a zigzag path in the infinite strip of triangles lying above x . This set
of triangles is unique so the strip is unique, and there is only one path in this strip
that starts at 1/0 and then does left and right turns alternately, starting with a left
turn. The initial turn must be to the left because the first two convergents are $a_0$ and
$a_0 + \frac{1}{a_1}$, with $a_0 + \frac{1}{a_1} > a_0$ since $a_1 > 0$. After the path traverses the first edge, no
subsequent edge of the path can go along the border of the strip since this would
entail two successive left turns or two successive right turns.
The arguments we have just given can be used to prove a fact about the standard
Farey diagram that we have been taking more or less for granted. This is the fact that
the triangles in the diagram completely cover the upper halfplane. In other words,
every point (x, y) with y > 0 lies either in the interior of some triangle or on the
common edge between two triangles. To see why, consider the vertical line L in the
upper halfplane through the given point (x, y) . If x is an integer then (x, y) is on
one of the vertical edges of the diagram. Thus we can assume x is not an integer
and hence L is not one of the vertical edges of the diagram. The line L will then be
contained in the strip of triangles corresponding to the continued fraction for x . This
is a finite strip if x is rational and an infinite strip if x is irrational. In either case
the point (x, y) , being in L , will be in one of the triangles of the strip or on an edge
separating two triangles in the strip. This proves what we wanted to prove.
did:
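$$\frac{67}{24} = 2 + \frac{19}{24} = 2 + \cfrac{1}{24/19} = 2 + \cfrac{1}{1 + 5/19} = 2 + \cfrac{1}{1 + \cfrac{1}{19/5}}$$
$$= 2 + \cfrac{1}{1 + \cfrac{1}{3 + 4/5}} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{5/4}}} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{4}}}}$$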
$$\sqrt{2} = 1 + 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots$$
We can check this calculation by finding the value of the continued fraction in the same
way that we did earlier for $1{\nearrow}1 + 1{\nearrow}1 + 1{\nearrow}1 + \cdots$. First we set $x = 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots$.
Taking reciprocals gives $1/x = 2 + 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots = 2 + x$. This leads to the
quadratic equation $x^2 + 2x - 1 = 0$, which has roots $x = -1 \pm \sqrt{2}$. Since $x$ is positive
we can discard the negative root. Thus we have $-1 + \sqrt{2} = 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots$. Adding
1 to both sides of this equation gives the formula for $\sqrt{2}$ as a continued fraction.
We can get good rational approximations to $\sqrt{2}$ by computing the convergents
in its continued fraction $1 + 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots$. It's a little easier to compute the
convergents in $2 + 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots = 1 + \sqrt{2}$ and then subtract 1 from each of
these. For $2 + 1{\nearrow}2 + 1{\nearrow}2 + 1{\nearrow}2 + \cdots$ there is a nice pattern to the convergents:
$$\frac{2}{1},\ \frac{5}{2},\ \frac{12}{5},\ \frac{29}{12},\ \frac{70}{29},\ \frac{169}{70},\ \frac{408}{169},\ \frac{985}{408},\ \cdots$$
Notice that the sequence of numbers 1, 2, 5, 12, 29, 70, 169, $\cdots$ is constructed in a way
somewhat analogous to the Fibonacci sequence, except that each number is twice the
preceding number plus the number before that. (It's easy to see why this has to be
true, because each convergent is constructed from the previous one by inverting the
fraction and adding 2.) After subtracting 1 from each of these fractions we get the
convergents to $\sqrt{2}$:
$\sqrt{2} = 1.41421356\cdots$
$1/1 = 1.00000000\cdots$
$3/2 = 1.50000000\cdots$
$7/5 = 1.40000000\cdots$
$17/12 = 1.41666666\cdots$
$41/29 = 1.41379310\cdots$
$99/70 = 1.41428571\cdots$
$239/169 = 1.41420118\cdots$
$577/408 = 1.41421568\cdots$
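The table can be regenerated from the recurrence just described. The Python sketch below is
an illustration only; the function name `sqrt2_convergents` is hypothetical, and the decimal
values are printed through floating point, so their last digits are rounded rather than
truncated.

```python
from fractions import Fraction

def sqrt2_convergents(count):
    """Convergents to sqrt(2), obtained by subtracting 1 from those of 1 + sqrt(2)."""
    prev, cur = 1, 2                # consecutive terms of 1, 2, 5, 12, 29, ...
    out = []
    for _ in range(count):
        out.append(Fraction(cur, prev) - 1)
        prev, cur = cur, 2 * cur + prev   # twice the preceding number plus the one before
    return out

for f in sqrt2_convergents(8):
    print(f"{f} = {float(f):.8f}   error in square: {float(f * f - 2):+.2e}")
```

The sign of the error alternates, reflecting the fact that the convergents lie alternately
below and above $\sqrt{2}$.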
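We can compute the continued fraction for $\sqrt{3}$ by the same method as for $\sqrt{2}$,
but something slightly different happens:
(1) $\sqrt{3} = 1 + (\sqrt{3} - 1)$ since $\sqrt{3}$ is between 1 and 2. Computing $\frac{1}{\sqrt{3} - 1}$, we have
$\frac{1}{\sqrt{3} - 1} = \frac{1}{\sqrt{3} - 1} \cdot \frac{\sqrt{3} + 1}{\sqrt{3} + 1} = \frac{\sqrt{3} + 1}{2}$.
(2) $\frac{\sqrt{3} + 1}{2} = 1 + \frac{\sqrt{3} - 1}{2}$ since the numerator $\sqrt{3} + 1$ of $\frac{\sqrt{3} + 1}{2}$ is between 2 and 3. Now
we have a remainder $r_2 = \frac{\sqrt{3} - 1}{2}$ which is different from the previous remainder
$r_1 = \sqrt{3} - 1$, so we have to compute $\frac{1}{r_2} = \frac{2}{\sqrt{3} - 1}$, namely
$\frac{2}{\sqrt{3} - 1} = \frac{2}{\sqrt{3} - 1} \cdot \frac{\sqrt{3} + 1}{\sqrt{3} + 1} = \sqrt{3} + 1$.
(3) $\sqrt{3} + 1 = 2 + (\sqrt{3} - 1)$ since $\sqrt{3} + 1$ is between 2 and 3.
Now this remainder $r_3 = \sqrt{3} - 1$ is the same as $r_1$, so instead of the same step being
repeated infinitely often, as happened for $\sqrt{2}$, the same two steps will repeat infinitely
often. This means we get the continued fraction
$$\sqrt{3} = 1 + 1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}2 + \cdots$$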
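The steps above work with exact expressions in $\sqrt{3}$. A common way to mechanize them,
sketched below in Python, keeps each intermediate quantity in the form $(\sqrt{n} + m)/d$
with integers $m$ and $d$. This integer recurrence is a standard technique and is an
assumption here, not the method used in the text; the function name `sqrt_cf` is likewise
hypothetical. The loop stops when $d = 1$, which is the point where the remainder is again
$\sqrt{n} - a_0$, the first remainder.

```python
from math import isqrt

def sqrt_cf(n):
    """Return (a0, block) where sqrt(n) = a0 + 1/(b1 + 1/(b2 + ...)) and the
    terms b1, b2, ... repeat the list `block` forever. n must not be a square."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0          # current quantity is (sqrt(n) + m) / d with integer part a
    block = []
    while True:
        m = d * a - m           # subtract the integer part ...
        d = (n - m * m) // d    # ... and invert, rationalizing the denominator
        a = (a0 + m) // d       # integer part of the new quantity
        block.append(a)
        if d == 1:              # remainder is sqrt(n) - a0 again, so the steps repeat
            return a0, block

print(sqrt_cf(2))
print(sqrt_cf(3))
```

For $n = 3$ the intermediate quantities are $(\sqrt{3} + 1)/2$ and $\sqrt{3} + 1$, matching
steps (2) and (3) above.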
Checking this takes a little more work than before. We begin by isolating the part of
the continued fraction that repeats periodically, so we set
$$x = 1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}2 + \cdots$$
To simplify the notation we will write a bar over a block of terms in a continued
fraction that repeat infinitely often, for example
$$\sqrt{2} = 1 + \overline{1{\nearrow}2} \qquad \text{and} \qquad \sqrt{3} = 1 + \overline{1{\nearrow}1 + 1{\nearrow}2}$$
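It is true in general that for every positive integer $n$ that is not a square, the
continued fraction for $\sqrt{n}$ has the form $a_0 + \overline{1{\nearrow}a_1 + 1{\nearrow}a_2 + \cdots + 1{\nearrow}a_k}$. The length of
$$\sqrt{46} = 6 + \overline{1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}6 + 1{\nearrow}2 + 1{\nearrow}1 + 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}1 + 1{\nearrow}12}$$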
This example illustrates two other curious facts about the continued fraction for an
irrational number $\sqrt{n}$:
(i) The last term of the period ( 12 in the example) is always twice the integer a0 (the
initial 6 ).
(ii) If the last term of the period is omitted, the preceding terms in the period form
a palindrome, reading the same backwards as forwards.
We will see in the next chapter why these two properties have to be true.
It is natural to ask exactly which irrational numbers have continued fractions that
are periodic, or at least eventually periodic, like for example
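$$1{\nearrow}2 + 1{\nearrow}4 + \overline{1{\nearrow}3 + 1{\nearrow}5 + 1{\nearrow}7} = 1{\nearrow}2 + 1{\nearrow}4 + 1{\nearrow}3 + 1{\nearrow}5 + 1{\nearrow}7 + 1{\nearrow}3 + 1{\nearrow}5 + 1{\nearrow}7 + 1{\nearrow}3 + 1{\nearrow}5 + 1{\nearrow}7 + \cdots$$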
The answer is given by a theorem of Lagrange from around 1766:
Lagrange's Theorem. The numbers whose continued fractions are eventually periodic
are exactly the numbers of the form $a + b\sqrt{n}$ where $a$ and $b$ are rational numbers, $b \neq 0$,
and $n$ is a positive integer that is not a square.
These numbers $a + b\sqrt{n}$ are called quadratic irrationals because they are roots
of quadratic equations with integer coefficients. The easier half of the theorem is the
statement that the value of an eventually periodic infinite continued fraction is always
a quadratic irrational. This can be proved by showing that the method we used for
finding a quadratic equation satisfied by an eventually periodic continued fraction
works in general. Rather than following this purely algebraic approach, however, we
will develop a more geometric version of the procedure in the next section, so we
will wait until then to give the argument that proves this half of Lagrange's Theorem.
The more difficult half of the theorem is the assertion that the continued fraction
expansion of every quadratic irrational is eventually periodic. It is not at all apparent
from the examples of $\sqrt{2}$ and $\sqrt{3}$ why this should be true in general, but in the next
chapter we will develop some theory that will make it clear.
What can be said about the continued fraction expansions of irrational numbers
that are not quadratic, such as $\sqrt[3]{2}$, $\pi$, or $e$, the base for natural logarithms? It
happens that e has a continued fraction whose terms have a very nice pattern, even
though they are not periodic or eventually periodic:
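$$e = 2 + \underbrace{1{\nearrow}1 + 1{\nearrow}2 + 1{\nearrow}1} + \underbrace{1{\nearrow}1 + 1{\nearrow}4 + 1{\nearrow}1} + \underbrace{1{\nearrow}1 + 1{\nearrow}6 + 1{\nearrow}1} + \cdots$$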
where the terms are grouped by threes with successive even numbers as middle de-
nominators. Even simpler are the continued fractions for certain numbers built from
e that have arithmetic progressions for their denominators:
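$$\frac{e - 1}{e + 1} = 1{\nearrow}2 + 1{\nearrow}6 + 1{\nearrow}10 + 1{\nearrow}14 + \cdots$$
$$\frac{e^2 - 1}{e^2 + 1} = 1{\nearrow}1 + 1{\nearrow}3 + 1{\nearrow}5 + 1{\nearrow}7 + \cdots$$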
The last two formulas were found by Lambert in 1770, while the expression for $e$ itself
was found by Euler in 1737.
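These patterns can be checked numerically by truncating the continued fractions. The Python
sketch below is only a floating-point comparison, not a proof; the function names
`e_quotients` and `cf_value` are hypothetical.

```python
import math
from fractions import Fraction

def e_quotients(count):
    """The first `count` terms after the initial 2: 1, 2, 1, 1, 4, 1, 1, 6, 1, ..."""
    out = []
    k = 1
    while len(out) < count:
        out.extend([1, 2 * k, 1])   # one group of three with middle term 2k
        k += 1
    return out[:count]

def cf_value(a0, quotients):
    """Value of a0 + 1/(a1 + 1/(a2 + ... + 1/an)), evaluated from the last term upward."""
    tail = Fraction(0)
    for a in reversed(quotients):
        tail = 1 / (a + tail)
    return a0 + tail

print(float(cf_value(2, e_quotients(12))), math.e)
print(float(cf_value(0, [2, 6, 10, 14, 18])), (math.e - 1) / (math.e + 1))
print(float(cf_value(0, [1, 3, 5, 7, 9, 11])), (math.e ** 2 - 1) / (math.e ** 2 + 1))
```

Each pair of printed values agrees to many decimal places even with only a few terms.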
Chapter 1 The Farey Diagram 30
3
For 2 and , however, the continued fractions have no known pattern. For
the continued fraction begins
7 + 15 + 1 + 292 +
= 3+1 1 1 1
Here the first four convergents are 3 , 22/7 , 333/106 , and 355/113 . We recognize
22/7 as the familiar approximation $3\tfrac{1}{7}$ to $\pi$. The convergent 355/113 is a particularly
good approximation to $\pi$ since its decimal expansion begins 3.14159292 whereas
$\pi = 3.1415926535\ldots$. It is no accident that the convergent 355/113 obtained by
truncating the continued fraction just before the 292 term gives a good approximation
to $\pi$ since it is a general fact that a convergent immediately preceding a large term
in the continued fraction always gives an especially good approximation, because
the next jump in the zigzag path in the Farey diagram will be rather small, and all
succeeding jumps will of course be smaller still.
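The convergents quoted above can be reproduced mechanically. Here is a minimal sketch using the standard recurrence $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$; the function name `convergents` is hypothetical, and the recurrence is a standard alternative to walking the zigzag path in the Farey diagram rather than the book's own procedure.

```python
import math
from fractions import Fraction

def convergents(terms):
    """Convergents p_n/q_n of a0 + 1/(a1 + 1/(a2 + ...)) for a finite list of terms."""
    p_prev, q_prev = 1, 0          # the formal starting fraction 1/0
    p, q = terms[0], 1
    result = [Fraction(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

pi_convergents = convergents([3, 7, 15, 1, 292])
for c in pi_convergents:
    # Show each convergent together with its distance from pi.
    print(c, abs(float(c) - math.pi))
```

The error printed for 355/113, the convergent just before the large term 292, is far smaller than the error for 333/106, which illustrates the general fact stated above.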
There are nice continued fractions for $\pi$ if one allows numerators larger than 1,
as in the following formula discovered by Euler:
$$\pi = 3 + 1^2\!\nearrow\!6 + 3^2\!\nearrow\!6 + 5^2\!\nearrow\!6 + 7^2\!\nearrow\!6 + \cdots$$
However, it is the continued fractions with numerator 1 that have the nicest proper-
ties, so we will not consider the more general sort in this book.
Our purpose in this section is to study all possible symmetries of the Farey diagram,
where we interpret the word "symmetry" in a broader sense than the familiar meaning
from Euclidean geometry. For our purposes, symmetries will be invertible transfor-
mations that take vertices to vertices and edges to edges. (It follows that triangles
are sent to triangles.) There are simple algebraic formulas for these more general
symmetries, and these formulas lead to effective means of calculation. One of the ap-
plications will be to computing the values of periodic or eventually periodic continued
fractions.
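From linear algebra one is familiar with the way in which $2 \times 2$ matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$
correspond to linear transformations of the plane $\mathbb{R}^2$, transformations of the form
$$T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$
$$T(z) = \frac{az + b}{cz + d}$$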
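Proof: We showed in the previous section that two vertices labeled $p/q$ and $r/s$ are
joined by an edge in the diagram exactly when $ps - qr = \pm 1$, or in other words
when the matrix $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ has determinant $\pm 1$. The two columns of the product matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ correspond to the two vertices $T(p/q)$ and $T(r/s)$, by the definition of
matrix multiplication:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix} = \begin{pmatrix} ap + bq & ar + bs \\ cp + dq & cr + ds \end{pmatrix}$$
The proposition can then be restated as saying that if $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ each have
determinant $\pm 1$ then so does their product $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix}$. But it is a general fact about
determinants that the determinant of a product is the product of the determinants.
(This is easy to prove by a direct calculation in the case of $2 \times 2$ matrices.) So the
product of two matrices of determinant $\pm 1$ has determinant $\pm 1$.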
As notation, we will use $LF(\mathbb{Z})$ to denote the set of all linear fractional transformations
$T(x/y) = (ax + by)/(cx + dy)$ with coefficients $a, b, c, d$ in $\mathbb{Z}$ such that
the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has determinant $\pm 1$.
Changing the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to its negative $\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$ produces the same linear fractional
transformation since $(-ax - by)/(-cx - dy) = (ax + by)/(cx + dy)$. This is
in fact the only way that different matrices can give the same linear fractional
transformation $T$, as we will see later in this section. Note that changing $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to its negative
$\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$ does not change the determinant. Thus each linear fractional transformation
in $LF(\mathbb{Z})$ has a well-defined determinant, either $+1$ or $-1$. Later in this section we
will also see how the distinction between determinant $+1$ and determinant $-1$ has a
geometric interpretation in terms of orientations.
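A useful fact about $LF(\mathbb{Z})$ is that each transformation $T$ in $LF(\mathbb{Z})$ has an inverse
$T^{-1}$ in $LF(\mathbb{Z})$ because the inverse of a $2 \times 2$ matrix is given by the formula
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
Thus if $a, b, c, d$ are integers with $ad - bc = \pm 1$ then the inverse matrix also has
integer entries and determinant $\pm 1$. The factor $\frac{1}{ad - bc}$ is $\pm 1$ so it can be ignored since
the matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $-\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ determine the same linear fractional transformation,
as we observed in the preceding paragraph.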
$e_1 = e_2$, so $T$ cannot send two different edges to the same edge, which means it is
one-to-one on edges. Also, every edge $e_1$ is the image $T(e_2)$ of some edge $e_2$ since
we can write $e_1 = T\big(T^{-1}(e_1)\big)$ and let $e_2 = T^{-1}(e_1)$. The same reasoning works with
We will now give examples illustrating seven different ways that elements of LF (Z)
can act on the Farey diagram.
(6) A different sort of behavior is exhibited by $T(x/y) = (2x + y)/(x + y)$ corresponding
to $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$. To visualize $T$ as a transformation of the Farey diagram let us
look at the infinite strip
[Figure: the infinite strip of triangles in the Farey diagram containing the triangles $\langle 1/0, 0/1, 1/1 \rangle$ and $\langle 2/1, 1/1, 3/2 \rangle$.]
We claim that $T$ translates the whole strip one unit to the right. To see this, notice
first that since $T$ takes $1/0$ to $2/1$, $0/1$ to $1/1$, and $1/1$ to $3/2$, it takes the triangle
$\langle 1/0, 0/1, 1/1 \rangle$ to the triangle $\langle 2/1, 1/1, 3/2 \rangle$. This implies that $T$ takes the triangle
just to the right of $\langle 1/0, 0/1, 1/1 \rangle$ to the triangle just to the right of $\langle 2/1, 1/1, 3/2 \rangle$,
and similarly each successive triangle is translated one unit to the right. The same
argument shows that each successive triangle to the left of the original one is also
translated one unit to the right. Thus the whole strip is translated one unit to the
right.
(7) Using the same figure as in the preceding example, consider the transformation
$T(x/y) = (x + y)/x$ corresponding to the matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$. This sends the triangle
$\langle 1/0, 0/1, 1/1 \rangle$ to $\langle 1/1, 1/0, 2/1 \rangle$ which is the next triangle to the right in the infinite
strip. Geometrically, $T$ translates the first triangle half a unit to the right and reflects
it across the horizontal axis of the strip. It follows that the whole strip is translated
half a unit to the right and reflected across the horizontal axis. Such a motion is
sometimes referred to as a glide-reflection. Notice that performing this motion twice
in succession yields a translation of the strip one unit to the right, the transformation
in the preceding example.
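The claims in examples (6) and (7) can be checked by direct matrix arithmetic. The sketch below is an illustration only: vertices $x/y$ are stored as integer pairs `(x, y)` so that $1/0$ causes no trouble, and the helper names `apply`, `matmul` and `det` are hypothetical.

```python
def apply(m, v):
    """Apply the matrix m = ((a, b), (c, d)) to the vertex x/y stored as v = (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def matmul(m, n):
    """Product of two 2x2 integer matrices given as tuples of rows."""
    (a, b), (c, d) = m
    (p, r), (q, s) = n
    return ((a * p + b * q, a * r + b * s), (c * p + d * q, c * r + d * s))

def det(m):
    (a, b), (c, d) = m
    return a * d - b * c

translation = ((2, 1), (1, 1))   # example (6)
glide = ((1, 1), (1, 0))         # example (7)
triangle = [(1, 0), (0, 1), (1, 1)]

print([apply(translation, v) for v in triangle])  # images of 1/0, 0/1, 1/1 under (6)
print([apply(glide, v) for v in triangle])        # images of 1/0, 0/1, 1/1 under (7)
print(matmul(glide, glide) == translation)        # the glide-reflection applied twice
print(det(translation), det(glide))
```

The determinants differ in sign, which matches the later discussion of orientations: the translation has determinant $+1$ and the glide-reflection has determinant $-1$.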
Thus we have seven types of symmetries of the Farey diagram: reflections across
an edge or a line perpendicular to an edge; rotations about the centerpoint of an
edge or a triangle, or about a vertex; and translations and glide-reflections of periodic
infinite strips. (Not all periodic strips have glide-reflection symmetries.) It is a true
fact, though we won't prove it here, that every element of $LF(\mathbb{Z})$ acts on the Farey
diagram in one of these seven ways, except for the identity transformation T (x/y) =
x/y of course.
fact, one can do this specifying where each individual vertex of the triangle goes.
As an example, suppose we wish to find an element $T$ of $LF(\mathbb{Z})$ that takes the
triangle $\langle 2/5, 1/3, 3/8 \rangle$ to the triangle $\langle 5/8, 7/11, 2/3 \rangle$, preserving the indicated
ordering of the vertices, so $T(2/5) = 5/8$, $T(1/3) = 7/11$, and $T(3/8) = 2/3$. For
this problem to even make sense we might want to check first that these really are
triangles in the Farey diagram. In the first case, $\langle 2/5, 1/3 \rangle$ is an edge since the matrix
$\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$ has determinant $1$, and there is a triangle joining this edge to $3/8$ since $3/8$ is
the mediant of $2/5$ and $1/3$. For the other triangle, the determinant of $\begin{pmatrix} 5 & 2 \\ 8 & 3 \end{pmatrix}$ is $-1$
and the mediant of $5/8$ and $2/3$ is $7/11$.
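As a first step toward constructing the desired transformation $T$ we will do something
slightly weaker: We construct a transformation $T$ taking the edge $\langle 2/5, 1/3 \rangle$ to
the edge $\langle 5/8, 7/11 \rangle$. This is rather easy if we first notice the general fact that the
transformation $T(x/y) = (ax + by)/(cx + dy)$ with matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ takes $1/0$ to $a/c$
and $0/1$ to $b/d$. Thus the transformation $T_1$ with matrix $\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$ takes $\langle 1/0, 0/1 \rangle$
to $\langle 2/5, 1/3 \rangle$, and the transformation $T_2$ with matrix $\begin{pmatrix} 5 & 7 \\ 8 & 11 \end{pmatrix}$ takes $\langle 1/0, 0/1 \rangle$ to
$\langle 5/8, 7/11 \rangle$. Then the product
$$T_2 T_1^{-1} = \begin{pmatrix} 5 & 7 \\ 8 & 11 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}^{-1}$$
takes $\langle 2/5, 1/3 \rangle$ first to $\langle 1/0, 0/1 \rangle$ and then to $\langle 5/8, 7/11 \rangle$. Doing the calculation, we
get
$$\begin{pmatrix} 5 & 7 \\ 8 & 11 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}^{-1} = \begin{pmatrix} 5 & 7 \\ 8 & 11 \end{pmatrix}\begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} = \begin{pmatrix} -20 & 9 \\ -31 & 14 \end{pmatrix}$$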
This takes the edge $\langle 2/5, 1/3 \rangle$ to the edge $\langle 5/8, 7/11 \rangle$, but does it do the right thing
on the third vertex of the triangle $\langle 2/5, 1/3, 3/8 \rangle$, taking it to the third vertex of
$\langle 5/8, 7/11, 2/3 \rangle$? This is not automatic since there are always two triangles containing
a given edge, and in this case the other triangle having $\langle 5/8, 7/11 \rangle$ as an edge is
$\langle 5/8, 7/11, 12/19 \rangle$ since $12/19$ is the mediant of $5/8$ and $7/11$. In fact, if we compute
what our $T$ does to $3/8$ we get
$$\begin{pmatrix} -20 & 9 \\ -31 & 14 \end{pmatrix}\begin{pmatrix} 3 \\ 8 \end{pmatrix} = \begin{pmatrix} 12 \\ 19 \end{pmatrix}$$
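$-1$ multiplies the whole matrix by $-1$ which doesn't change the associated element
of $LF(\mathbb{Z})$, as noted earlier. In the case at hand, suppose we change the sign of the first
column of $\begin{pmatrix} 5 & 7 \\ 8 & 11 \end{pmatrix}$. Then we get
$$\begin{pmatrix} -5 & 7 \\ -8 & 11 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}^{-1} = \begin{pmatrix} -5 & 7 \\ -8 & 11 \end{pmatrix}\begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} = \begin{pmatrix} -50 & 19 \\ -79 & 30 \end{pmatrix}$$
This fixes the problem since
$$\begin{pmatrix} -50 & 19 \\ -79 & 30 \end{pmatrix}\begin{pmatrix} 3 \\ 8 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$$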
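The two-step procedure of this example (map the edge, then flip the sign of a column if the third vertex lands on the wrong side) can be sketched in code. This is an illustration under stated assumptions: vertices are integer pairs `(x, y)` in lowest terms, the inputs are assumed to be genuine triangles of the Farey diagram, and the names `triangle_map`, `inverse`, `matmul`, `apply` and `same_vertex` are hypothetical.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (p, r), (q, s) = n
    return ((a * p + b * q, a * r + b * s), (c * p + d * q, c * r + d * s))

def inverse(m):
    """Inverse of an integer matrix of determinant +1 or -1 (so 1/det equals det)."""
    (a, b), (c, d) = m
    D = a * d - b * c
    assert D in (1, -1), "matrix must have determinant +1 or -1"
    return ((D * d, -D * b), (-D * c, D * a))

def apply(m, v):
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def same_vertex(v, w):
    """(x, y) and (x', y') label the same fraction exactly when x*y' - x'*y = 0."""
    return v[0] * w[1] - w[0] * v[1] == 0

def triangle_map(src, dst):
    """Matrix of a T with T(src[i]) = dst[i] for i = 0, 1, 2."""
    (p, q), (r, s), third = src
    (p2, q2), (r2, s2), target = dst
    t1 = ((p, r), (q, s))
    for sign in (1, -1):                      # second pass flips the first column of T2
        t2 = ((sign * p2, r2), (sign * q2, s2))
        t = matmul(t2, inverse(t1))
        if same_vertex(apply(t, third), target):
            return t
    raise ValueError("inputs are not triangles of the Farey diagram")

T = triangle_map([(2, 5), (1, 3), (3, 8)], [(5, 8), (7, 11), (2, 3)])
print(T)
```

For the triangles in the example the first attempt sends $3/8$ to $12/19$, so the function falls through to the sign-flipped matrix, reproducing the corrected matrix computed above.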
Proposition. (a) For any two triangles $\langle p/q, r/s, t/u \rangle$ and $\langle p'/q', r'/s', t'/u' \rangle$ in the
Farey diagram there is a unique element $T$ in $LF(\mathbb{Z})$ taking the first triangle to the
second triangle preserving the ordering of the vertices, so $T(p/q) = p'/q'$, $T(r/s) = r'/s'$,
and $T(t/u) = t'/u'$.
(b) The matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ representing a given transformation $T$ in $LF(\mathbb{Z})$ is unique except
for replacing it by $\begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$.
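Proof: As we saw in the example above, there is a composition $T_2 T_1^{-1}$ taking the edge
$\langle p/q, r/s \rangle$ to $\langle p'/q', r'/s' \rangle$, where $T_1$ has matrix $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ and $T_2$ has matrix $\begin{pmatrix} p' & r' \\ q' & s' \end{pmatrix}$.
If this composition $T_2 T_1^{-1}$ does not take $t/u$ to $t'/u'$ we modify $T_2$ by changing the
sign of one of its columns, say the first column. Thus we change $\begin{pmatrix} p' & r' \\ q' & s' \end{pmatrix}$ to $\begin{pmatrix} -p' & r' \\ -q' & s' \end{pmatrix}$,
which equals the product $\begin{pmatrix} p' & r' \\ q' & s' \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$. The matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ corresponds to the
transformation $R(x/y) = -x/y$ reflecting the Farey diagram across the edge $\langle 1/0, 0/1 \rangle$.
Thus we are replacing $T_2 T_1^{-1}$ by $T_2 R T_1^{-1}$, inserting a reflection that interchanges the
two triangles containing the edge $\langle 1/0, 0/1 \rangle$. By inserting $R$ we change where the
composition $T_2 T_1^{-1}$ sends the third vertex $t/u$ of the triangle $\langle p/q, r/s, t/u \rangle$, so we
can guarantee that $t/u$ is taken to $t'/u'$. This proves part (a).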
For part (b), note first that the transformation $T$ determines the values $T(1/0) = a/c$
and $T(0/1) = b/d$. The fractions $a/c$ and $b/d$ are in lowest terms (because
$ad - bc = \pm 1$) so this means that we know the two columns of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ up
to multiplying either or both columns by $-1$. We need to check that changing the
sign of one column without changing the sign of the other column gives a different
transformation. It doesn't matter which column we change since $\begin{pmatrix} -a & b \\ -c & d \end{pmatrix} = -\begin{pmatrix} a & -b \\ c & -d \end{pmatrix}$.
As we saw in part (a), changing the sign in the first column amounts to replacing $T$
by the composition $TR$, but this is a different transformation from $T$ since it has a
different effect on the triangles containing the edge $\langle 1/0, 0/1 \rangle$.
We would like to compute the element $T$ of $LF(\mathbb{Z})$ that gives the rightward translation
of this strip that exhibits the periodicity. A first guess is the $T$ with matrix $\begin{pmatrix} 4 & 19 \\ 9 & 43 \end{pmatrix}$
since this sends $\langle 1/0, 0/1 \rangle$ to $\langle 4/9, 19/43 \rangle$. This is actually the correct $T$ since it
sends the vertex $1/1$ just to the right of $1/0$, which is the mediant of $1/0$ and $0/1$,
to the vertex $(4 + 19)/(9 + 43)$ just to the right of $4/9$, which is the mediant of $4/9$
and $19/43$. This is a general fact since $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} a + b \\ c + d \end{pmatrix}$.
The sequence of fractions labeling the vertices along the zigzag path in the strip
moving toward the right are the convergents to $\overline{1\!\nearrow\!2 + 1\!\nearrow\!3 + 1\!\nearrow\!1 + 1\!\nearrow\!4}$. Call these
convergents $z_1, z_2, \ldots$ and their limit $z$. When we apply the translation $T$ we are taking
each convergent to a later convergent in the sequence, so both the sequence $\{z_n\}$ and
the sequence $\{T(z_n)\}$ converge to $z$. Thus we have
$$T(z) = T\big(\lim_{n \to \infty} z_n\big) = \lim_{n \to \infty} T(z_n) = z$$
where the middle equality uses the fact that $T$ is continuous. (Note that a linear
fractional transformation $T(z) = \frac{az + b}{cz + d}$ is defined for real values of $z$, not just rational
values $z = x/y$, when $T(x/y) = (ax + by)/(cx + dy) = \big(a\tfrac{x}{y} + b\big)/\big(c\tfrac{x}{y} + d\big)$.)
In summary, what we have just argued is that the value $z$ of the periodic continued
fraction satisfies the equation $T(z) = z$, or in other words, $\frac{4z + 19}{9z + 43} = z$. This can be
rewritten as $4z + 19 = 9z^2 + 43z$, which simplifies to $9z^2 + 39z - 19 = 0$. Computing
the roots of this quadratic equation, we get
$$z = \frac{-39 \pm \sqrt{39^2 + 4 \cdot 9 \cdot 19}}{18} = \frac{-39 \pm 3\sqrt{13^2 + 4 \cdot 19}}{18} = \frac{-13 \pm \sqrt{245}}{6} = \frac{-13 \pm 7\sqrt{5}}{6}$$
The positive root is the one that the right half of the infinite strip converges to, so we
have
$$\frac{-13 + 7\sqrt{5}}{6} = \overline{1\!\nearrow\!2 + 1\!\nearrow\!3 + 1\!\nearrow\!1 + 1\!\nearrow\!4}$$
Incidentally, the other root $(-13 - 7\sqrt{5})/6$ has an interpretation in terms of the
diagram as well: It is the limit of the numbers labeling the vertices of the zigzag path
moving off to the left rather than to the right. This follows by the same sort of
argument as above.
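The computation above can be packaged for any purely periodic continued fraction with leading term 0. The sketch below assumes, as in the example, that the matrix of $T$ has as its columns the last two convergents of one period, and it solves $cz^2 + (d - a)z - b = 0$ for the positive root. The names `period_matrix` and `periodic_value` are hypothetical.

```python
import math

def period_matrix(period):
    """Matrix whose columns are the last two convergents of 1/(a1 + 1/(a2 + ... + 1/an))."""
    p_prev, q_prev = 1, 0    # vertex 1/0
    p, q = 0, 1              # vertex 0/1
    for a in period:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return ((p_prev, p), (q_prev, q))

def periodic_value(period):
    """Positive solution of T(z) = z, i.e. of c z^2 + (d - a) z - b = 0."""
    (a, b), (c, d) = period_matrix(period)
    disc = (d - a) ** 2 + 4 * b * c
    return (-(d - a) + math.sqrt(disc)) / (2 * c)

print(period_matrix([2, 3, 1, 4]))
print(periodic_value([2, 3, 1, 4]), (-13 + 7 * math.sqrt(5)) / 6)
```

Because $b$ and $c$ are positive here, the two roots have opposite signs, so taking the root with the plus sign selects the positive one.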
$$\overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$$
Here the periodic strip is
The transformation $T$ with matrix $\begin{pmatrix} 2 & 7 \\ 3 & 10 \end{pmatrix}$ takes $\langle 1/0, 0/1 \rangle$ to $\langle 2/3, 7/10 \rangle$ and the
mediant $1/1$ of $1/0$ and $0/1$ to the mediant $9/13$ of $2/3$ and $7/10$ so this transformation
is a glide-reflection of the strip. The equation $T(z) = z$ becomes $\frac{2z + 7}{3z + 10} = z$, which
simplifies to $2z + 7 = 3z^2 + 10z$ and then $3z^2 + 8z - 7 = 0$, with roots $(-4 \pm \sqrt{37})/3$.
The positive root gives
$$\frac{-4 + \sqrt{37}}{3} = \overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$$
Continued fractions that are only eventually periodic can be treated in a similar
fashion. For example, consider
$$1\!\nearrow\!2 + 1\!\nearrow\!2 + \overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$$
The corresponding infinite strip is
In this case if we discard the triangles corresponding to the initial nonperiodic part of
the continued fraction, $1\!\nearrow\!2 + 1\!\nearrow\!2$, and then extend the remaining periodic part in both
directions, we obtain a periodic strip that is carried to itself by the glide-reflection $T$
taking $\langle 1/2, 2/5 \rangle$ to $\langle 8/19, 27/64 \rangle$:
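We can compute $T$ as the composition $\langle 1/2, 2/5 \rangle \to \langle 1/0, 0/1 \rangle \to \langle 8/19, 27/64 \rangle$
corresponding to the product
$$\begin{pmatrix} 8 & 27 \\ 19 & 64 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}^{-1} = \begin{pmatrix} 8 & 27 \\ 19 & 64 \end{pmatrix}\begin{pmatrix} 5 & -2 \\ -2 & 1 \end{pmatrix} = \begin{pmatrix} -14 & 11 \\ -33 & 26 \end{pmatrix}$$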
Since this transformation takes $3/7$ to the mediant $(8 + 27)/(19 + 64)$, it is the
glide-reflection we want. Now we solve $T(z) = z$. This means $\frac{-14z + 11}{-33z + 26} = z$, which reduces
to the equation $33z^2 - 40z + 11 = 0$ with roots $z = (20 \pm \sqrt{37})/33$. Both roots are
positive, and we want the smaller one, $(20 - \sqrt{37})/33$, because along the top edge of
the strip the numbers decrease as we move to the right, approaching the smaller root,
and they increase as we move to the left, approaching the larger root. Thus we have
$$\frac{20 - \sqrt{37}}{33} = 1\!\nearrow\!2 + 1\!\nearrow\!2 + \overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$$
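The same calculation can be sketched for the eventually periodic case. The code below builds the two edge matrices from convergents, forms the product $T_2 T_1^{-1}$, and solves $T(z) = z$. One shortcut is an assumption of this sketch rather than the argument in the text: the correct root is selected as the one closest to the convergent $27/64$, instead of by reasoning about the strip. All function names are hypothetical.

```python
import math

def last_two_convergents(terms):
    """Matrix whose columns are the last two convergents of 1/(a1 + 1/(a2 + ...))."""
    p_prev, q_prev, p, q = 1, 0, 0, 1
    for a in terms:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return ((p_prev, p), (q_prev, q))

def matmul(m, n):
    (a, b), (c, d) = m
    (p, r), (q, s) = n
    return ((a * p + b * q, a * r + b * s), (c * p + d * q, c * r + d * s))

def inverse(m):
    """Inverse of an integer matrix of determinant +1 or -1."""
    (a, b), (c, d) = m
    D = a * d - b * c
    assert D in (1, -1)
    return ((D * d, -D * b), (-D * c, D * a))

pre, period = [2, 2], [1, 2, 3]
m1 = last_two_convergents(pre)            # edge <1/2, 2/5>
m2 = last_two_convergents(pre + period)   # edge <8/19, 27/64>
T = matmul(m2, inverse(m1))

(a, b), (c, d) = T
disc = (d - a) ** 2 + 4 * b * c           # discriminant of c z^2 + (d - a) z - b = 0
roots = [(-(d - a) + s * math.sqrt(disc)) / (2 * c) for s in (1, -1)]
guess = m2[0][1] / m2[1][1]               # the convergent 27/64
z = min(roots, key=lambda r: abs(r - guess))
print(T, disc, z)
```

The discriminant comes out as $148 = 4 \cdot 37$, which is where the $\sqrt{37}$ in the roots originates.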
Notice that $\sqrt{37}$ occurs in both this example and the preceding one where we
computed the value of $\overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$. This is not just an accident. It had to happen
because to get from $\overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$ to $1\!\nearrow\!2 + 1\!\nearrow\!2 + \overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$ one adds 2 and inverts,
then adds 2 and inverts again, and each of these operations of adding an integer or
taking the reciprocal takes place within the field $\mathbb{Q}(\sqrt{37})$ consisting of numbers of the
form $a + b\sqrt{37}$ with $a$ and $b$ rational. More generally, this argument shows that any
eventually periodic continued fraction whose periodic part is $\overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$ has as its
value some number in the field $\mathbb{Q}(\sqrt{37})$. However, not all irrational numbers in this
field have eventually periodic continued fractions with periodic part $\overline{1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!3}$.
For example, the continued fraction for $\sqrt{37}$ itself is $6 + \overline{1\!\nearrow\!12}$, with a different periodic part.
integer $n$. We know that the real number $z$ is a root of the equation so $n$ can't be
negative, and it can't be a square since $z$ is irrational.
Thus we have an argument that proves one half of Lagrange's Theorem, the statement
that a number whose continued fraction is periodic or eventually periodic is a
quadratic irrational. There is one technical point that should be addressed, however.
Could the leading coefficient $c$ in the quadratic equation $cz^2 + (d - a)z - b = 0$ be
zero? If this were the case then we couldn't apply the quadratic formula to solve
for $z$, so we need to show that $c$ cannot be zero. We do this in the following way. If
$c$ were zero the equation would become the linear equation $(d - a)z - b = 0$. If the
coefficient of $z$ in this equation is nonzero, we have only one root, $z = b/(d - a)$, a
rational number contrary to the fact that $z$ is irrational since its continued fraction is
infinite. Thus we are left with the possibility that $c = 0$ and $a = d$, so the equation
for $z$ reduces to the equation $b = 0$. Then the transformation $T$ would have the form
$T(z) = \frac{az}{a} = z$ so it would be the identity transformation. However we know it is a
genuine translation or a glide-reflection, so it is not the identity. We conclude from
all this that $c$ cannot be zero, and the technical point is taken care of.
Orientations
Elements of $LF(\mathbb{Z})$ are represented by integer matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of determinant $\pm 1$.
The distinction between determinant $+1$ and $-1$ has a very nice geometric interpretation
in terms of orientations, which can be described in terms of triangles. A triangle
in the Farey diagram can be oriented by choosing either the clockwise or counter-
clockwise ordering of its three vertices. An element T of LF (Z) takes each triangle
to another triangle in a way that either preserves the two possible orientations or
reverses them.
For example, among the seven types of transformations we looked at earlier, only
reflections and glide-reflections reverse the orientations of triangles. Note that if a
transformation T preserves the orientation of one triangle, it has to preserve the
orientation of the three adjacent triangles, and then of the triangles adjacent to these,
and so on for all the triangles. Similarly, if the orientation of one triangle is reversed
by T , then the orientations of all triangles are reversed.
Proof: We will first prove a special case and then deduce the general case from the
special case. The special case is that $a, b, c, d$ are all positive or zero. The transformation
$T$ with matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ takes the edge $\langle 1/0, 0/1 \rangle$ in the circular Farey diagram to the
edge $\langle a/c, b/d \rangle$, and if $a, b, c, d$ are all positive or zero, this edge lies in the upper half
of the diagram. Since $T(1/1) = (a + b)/(c + d)$, the triangle $\langle 1/0, 0/1, 1/1 \rangle$ is taken to
the triangle $\langle a/c, b/d, (a + b)/(c + d) \rangle$ whose third vertex $(a + b)/(c + d)$ lies above
the edge $\langle a/c, b/d \rangle$, by the way the Farey diagram was constructed using mediants,
since we assume $a, b, c, d$ are positive or zero. We know that the edge $\langle a/c, b/d \rangle$ is
oriented to the right if $ad - bc = +1$ and to the left if $ad - bc = -1$. This means
that $T$ preserves the orientation of the triangle $\langle 1/0, 0/1, 1/1 \rangle$ if the determinant is
$+1$ and reverses the orientation if the determinant is $-1$.
Proposition. For any two edges $\langle p/q, r/s \rangle$ and $\langle p'/q', r'/s' \rangle$ of the Farey diagram
there exists a unique element $T \in LF^+(\mathbb{Z})$ taking the first edge to the second edge
Proof: We already know that there exists an element $T$ in $LF(\mathbb{Z})$ with $T(p/q) = p'/q'$
and $T(r/s) = r'/s'$, and in fact there are exactly two choices for $T$ which are distinguished
by which of the two triangles containing $\langle p'/q', r'/s' \rangle$ a triangle containing
$\langle p/q, r/s \rangle$ is sent to. One of these choices will make $T$ preserve orientation and the
other will make $T$ reverse orientation. So there is only one choice where the determinant
is $+1$.
Chapter 2: Quadratic Forms
2.1 Topographs
Finding Pythagorean triples is answering the question, When is the sum of two
squares equal to a square? More generally one can ask, Exactly which numbers are
sums of two squares? In other words, when does an equation $x^2 + y^2 = n$ have
integer solutions, and how can one find these solutions? The brute force approach of
simply plugging in values for $x$ and $y$ leads to the following list of all solutions for
$n \le 50$ (apart from interchanging $x$ and $y$):
$1 = 1^2 + 0^2$, $2 = 1^2 + 1^2$, $4 = 2^2 + 0^2$, $5 = 2^2 + 1^2$, $8 = 2^2 + 2^2$, $9 = 3^2 + 0^2$,
$10 = 3^2 + 1^2$, $13 = 3^2 + 2^2$, $16 = 4^2 + 0^2$, $17 = 4^2 + 1^2$, $18 = 3^2 + 3^2$,
$20 = 4^2 + 2^2$, $25 = 5^2 + 0^2 = 4^2 + 3^2$, $26 = 5^2 + 1^2$, $29 = 5^2 + 2^2$, $32 = 4^2 + 4^2$,
$34 = 5^2 + 3^2$, $36 = 6^2 + 0^2$, $37 = 6^2 + 1^2$, $40 = 6^2 + 2^2$, $41 = 5^2 + 4^2$,
$45 = 6^2 + 3^2$, $49 = 7^2 + 0^2$, $50 = 5^2 + 5^2 = 7^2 + 1^2$
Notice that in some cases there is more than one solution for a given value of n .
Our first goal will be to describe a more efficient way to find the integer solutions of
$x^2 + y^2 = n$ and to display them graphically in a way that sheds much light on their
structure. The technique for doing this will work not just for the function $x^2 + y^2$
but also for any function $Q(x, y) = ax^2 + bxy + cy^2$, where $a$, $b$, and $c$ are integer
constants. Such a function $Q(x, y)$ is called a quadratic form, or sometimes just a
form for short.
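The brute-force list of sums of two squares given earlier can be regenerated with a few lines of code. This is a minimal sketch of the plug-in-values approach, not the more efficient method the chapter goes on to develop; the function name `two_square_representations` is hypothetical, and each solution is reported once with $x \ge y \ge 0$.

```python
import math

def two_square_representations(n):
    """All pairs (x, y) with x >= y >= 0 and x^2 + y^2 = n."""
    reps = []
    for y in range(math.isqrt(n // 2) + 1):   # y^2 <= n/2 guarantees x >= y
        x = math.isqrt(n - y * y)
        if x * x + y * y == n:
            reps.append((x, y))
    return reps

table = {n: two_square_representations(n) for n in range(1, 51)}
for n, reps in table.items():
    if reps:
        print(n, reps)
```

Values of $n$ with more than one entry, such as 25 and 50, are exactly the cases with more than one solution.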
Solving $x^2 + y^2 = n$ amounts to representing $n$ in the form of the sum of two
squares. More generally, solving $Q(x, y) = n$ is called representing $n$ by the form
$Q(x, y)$. So the overall goal is to solve the representation problem: Which numbers $n$
are represented by a given form $Q(x, y)$, and how does one find such representations?
Before starting to describe the method for displaying the values of a quadratic
form graphically let us make a preliminary observation:
If the greatest common divisor of two integers $x$ and $y$ is $d$, then $Q(x, y) = d^2 Q\big(\tfrac{x}{d}, \tfrac{y}{d}\big)$
where the greatest common divisor of $\tfrac{x}{d}$ and $\tfrac{y}{d}$ is 1. Hence it suffices
to find the values of $Q$ on primitive pairs $(x, y)$, the pairs whose greatest
common divisor is 1, and then multiply these values by arbitrary squares $d^2$.
Thus the real problem is to find the primitive representations of a number $n$ by a
form $Q(x, y)$, or in other words, to find the primitive solutions of $Q(x, y) = n$.
Primitive pairs $(x, y)$ correspond almost exactly to fractions $x/y$ that are reduced
to lowest terms, the only ambiguity being that both $(x, y)$ and $(-x, -y)$ correspond
to the same fraction $x/y$. However, this ambiguity does not affect the value
We already have a nice graphical representation of the rational numbers x/y and
1/0 as the vertices in the Farey diagram. Here is a picture of the diagram with the
so-called dual tree superimposed:
[Figure: the Farey diagram with the dual tree superimposed.]
The dual tree has a vertex in the center of each triangle of the Farey diagram, and it
has an edge crossing each edge of the Farey diagram. The upper half of the dual tree
does actually look a bit like a real tree, with the lower half being its reflection
in still water. As with the Farey diagram, we can only draw a finite part of the dual
tree. The actual dual tree has branching that repeats infinitely often, an unending
bifurcation process with smaller and smaller twigs.
The tree divides the interior of the large circle into regions, each of which is
adjacent to one vertex of the original diagram. We can write the value Q(x, y) in
the region adjacent to the vertex x/y . This is shown in the figure below for the
quadratic form $Q(x, y) = x^2 + y^2$, where to unclutter the picture we no longer draw
the triangles of the original Farey diagram.
For example the 13 in the region adjacent to the fraction $2/3$ represents the value
$2^2 + 3^2$, and the 29 in the region adjacent to $5/2$ represents the value $5^2 + 2^2$.
For a quadratic form Q this picture showing the values Q(x, y) is called the
topograph of Q . It turns out that there is a very simple method for computing the
topograph from just a very small amount of initial data. This method is based on the
following:
We can check this in the topograph of $x^2 + y^2$ shown above. Consider for example
one of the edges separating the values 1 and 2. The values in the four regions
surrounding this edge are 1, 1, 2, 5 and the arithmetic progression is 1, 1 + 2, 5 . For
an edge separating the values 1 and 5 the arithmetic progression is 2, 1 + 5, 10 . For
an edge separating the values 5 and 13 the arithmetic progression is 2, 5 + 13, 34 .
And similarly for all the other edges.
The arithmetic progression rule implies that the values of Q in the three regions
surrounding a single vertex of the tree determine the values in all other regions, by
starting at the vertex where the three adjacent values are known and working one's
way outward in the dual tree. The easiest place to start for a quadratic form
$Q(x, y) = ax^2 + bxy + cy^2$ is with the three values $Q(1, 0) = a$, $Q(0, 1) = c$, and
$Q(1, 1) = a + b + c$ for the three fractions $1/0$, $0/1$, and $1/1$. Here are two examples:
In the first case we start with the values 1 and 2 together with the 3 just above them.
These determine the value 9 above the 2 via the arithmetic progression 1 , 2 + 3 , 9 .
Similarly the 6 above the 1 is determined by the arithmetic progression 2 , 1 + 3 ,
6 . Next one can fill in the 19 next to the 9 we just computed, using the arithmetic
progression 3 , 2 + 9 , 19 , and so on for as long as one likes.
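The filling-in procedure just described can be sketched as a short recursion. The code below is an illustration, with hypothetical names `fill` and `topograph`: it starts from the three values at $1/0$, $0/1$, $1/1$, and for each edge computes the value at the mediant as $2(q + r) - p$, where $q, r$ are the values at the ends of the edge and $p$ is the value on the far side. Only the part of the topograph between $0/1$ and $1/0$ is generated, and regions are keyed by pairs `(x, y)`.

```python
def fill(left, right, behind, values, depth):
    """Fill in the region at the mediant of the edge <left, right>, then recurse."""
    if depth == 0:
        return
    mediant = (left[0] + right[0], left[1] + right[1])
    # Arithmetic progression rule: behind, q + r, new value.
    values[mediant] = 2 * (values[left] + values[right]) - behind
    fill(left, mediant, values[right], values, depth - 1)
    fill(mediant, right, values[left], values, depth - 1)

def topograph(a, b, c, depth):
    """Values of a x^2 + b x y + c y^2 on fractions x/y >= 0, out to the given depth."""
    values = {(1, 0): a, (0, 1): c, (1, 1): a + b + c}
    fill((1, 0), (1, 1), values[(0, 1)], values, depth)
    fill((1, 1), (0, 1), values[(1, 0)], values, depth)
    return values

values = topograph(1, 0, 2, depth=4)      # the form x^2 + 2y^2
print(values[(1, 2)], values[(2, 1)], values[(1, 3)])
```

For $x^2 + 2y^2$ the three printed values are the 9, 6 and 19 found by hand above, and every entry can be compared against the formula $x^2 + 2y^2$ directly.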
The procedure for the other form $x^2 - 2y^2$ is just the same, but here there are
negative as well as positive values. The edges that separate positive values from
negative values will be important later, so we have indicated these edges by special
shading.
Perhaps the most noticeable thing in both the examples $x^2 + 2y^2$ and $x^2 - 2y^2$
is the fact that the values in the lower half of the topograph are the same as those in
the upper half. We could have predicted in advance that this would happen because
$Q(x, y) = Q(-x, y)$ whenever $Q(x, y)$ has the form $ax^2 + cy^2$, with no $xy$ term.
The topograph for $x^2 + y^2$ has even more symmetry since the values of $x^2 + y^2$ are
unchanged when $x$ and $y$ are switched, so the topograph has left-right symmetry as
well.
Here is a general observation: The three values around one vertex of the topograph
can be specified arbitrarily. For if we are given three numbers $a$, $b$, $c$ then the
quadratic form $ax^2 + (c - a - b)xy + by^2$ takes these three values for $(x, y)$ equal
to $(1, 0)$, $(0, 1)$, $(1, 1)$.
Proof of the Arithmetic Progression Rule: Let the two vertices of the Farey diagram
corresponding to the values q and r have labels x1 /y1 and x2 /y2 as in the figure
below. Then by the mediant rule for labeling vertices, the labels on the p and s regions
are the fractions shown. Note that these labels are correct even when x1 /y1 = 1/0
and x2 /y2 = 0/1 .
Similarly we have
The terms in $(\cdots)$ are the same in both cases, namely the terms involving both
subscripts 1 and 2. If we compute $p + s$ by adding the two formulas together, the terms
$(\cdots)$ will therefore cancel, leaving just $p + s = 2(q + r)$. This equation can be rewritten
as $(q + r) - p = s - (q + r)$, which just says that $p, q + r, s$ forms an arithmetic
progression.
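The two computations that the proof refers to can be written out as follows. This is a sketch under the assumption, taken from the mediant rule quoted in the proof, that the $s$ region is labeled $(x_1 + x_2)/(y_1 + y_2)$ and the $p$ region is labeled $(x_1 - x_2)/(y_1 - y_2)$, with $q = Q(x_1, y_1)$, $r = Q(x_2, y_2)$ and $Q(x, y) = ax^2 + bxy + cy^2$:

```latex
\begin{align*}
s &= a(x_1 + x_2)^2 + b(x_1 + x_2)(y_1 + y_2) + c(y_1 + y_2)^2 \\
  &= \underbrace{ax_1^2 + bx_1y_1 + cy_1^2}_{q}
   + \underbrace{ax_2^2 + bx_2y_2 + cy_2^2}_{r}
   + (2ax_1x_2 + bx_1y_2 + bx_2y_1 + 2cy_1y_2), \\
p &= a(x_1 - x_2)^2 + b(x_1 - x_2)(y_1 - y_2) + c(y_1 - y_2)^2 \\
  &= q + r - (2ax_1x_2 + bx_1y_2 + bx_2y_1 + 2cy_1y_2), \\
p + s &= 2(q + r).
\end{align*}
```

The mixed terms appear with opposite signs in the two expansions, which is why they cancel in the sum.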
and unexpected properties. For the form $x^2 - 2y^2$ there is a zigzag path of edges in
the topograph separating the positive and negative values, and if we straighten this
path out to be a line, called the separator line, what we see is the following infinitely
repeated pattern:
To construct this, one can first build the separator line starting with the three values
$Q(1, 0) = 1$, $Q(0, 1) = -2$, and $Q(1, 1) = -1$. Place these as shown in part (a) of the
figure below, with a horizontal line segment separating the positive from the negative
values.
To extend the separator line one step farther to the right, apply the arithmetic progression
rule to compute the next value $2$ using the arithmetic progression $-2, 1 - 1, 2$.
Since this value $2$ is positive, we place it above the horizontal line and insert a vertical
edge to separate this $2$ from the $1$ to the left of it, as in (b) of the figure. Now we
repeat the process with the next arithmetic progression $1, 2 - 1, 1$ and put the new $1$
above the horizontal line with a vertical edge separating it from the previous $2$, as
shown in (c). At the next step we compute the next value $-2$ and place it below the
horizontal line since it is negative, giving (d). One more step produces (e) where we see
that further repetitions will produce a pattern that repeats periodically as we move to
the right. The arithmetic progression rule also implies that it repeats periodically to
the left, so it is periodic in both directions:
Thus we have the periodic separator line. To get the rest of the topograph we can then
work our way upward and downward from the separator line, as shown in the original
figure. As one moves upward from the separator line, the values of $Q$ become larger
and larger, approaching $+\infty$ monotonically, and as one moves downward the values
approach $-\infty$ monotonically. The reason for this will become clear in the next section
when we discuss something called the Monotonicity Property.
An interesting property of this form $x^2 - 2y^2$ that is evident from its topograph
is that it takes on the same negative values as positive values. This would have been
hard to predict from the formula $x^2 - 2y^2$. Indeed, for the similar-looking quadratic
form $x^2 - 3y^2$ the negative values are quite different from the positive values, as one
can see in its straightened-out topograph:
This is a part of the dual tree of the Farey diagram. If we superimpose the triangles
of the Farey diagram corresponding to this part of the dual tree we obtain an infinite
strip of triangles:
[Figure: the infinite strip of triangles along the separator line of $x^2 - 2y^2$, on whose vertices the form takes only the values $\pm 1$ and $\pm 2$.]
Ignoring the dotted triangles to the left, the infinite strip of triangles corresponds to
the infinite continued fraction $1 + \overline{1\!\nearrow\!2}$. We could compute the value of this continued
fraction by the methods in the previous chapter, but there is an easier way using
the quadratic form $x^2 - 2y^2$. For fractions $\tfrac{x}{y}$ labeling the vertices along the infinite
strip, the corresponding values $n = x^2 - 2y^2$ are either $\pm 1$ or $\pm 2$. We can rewrite
the equation $x^2 - 2y^2 = n$ as $\big(\tfrac{x}{y}\big)^2 = 2 + \tfrac{n}{y^2}$. As we go farther and farther to the right
in the infinite strip, both $x$ and $y$ are getting larger and larger while $n$ only varies
through finitely many values, namely $\pm 1$ and $\pm 2$, so the quantity $\tfrac{n}{y^2}$ is approaching
$0$. The equation $\big(\tfrac{x}{y}\big)^2 = 2 + \tfrac{n}{y^2}$ then implies that $\big(\tfrac{x}{y}\big)^2$ is approaching $2$, so we see
that $\tfrac{x}{y}$ is approaching $\sqrt{2}$. Since the fractions $\tfrac{x}{y}$ are also approaching the value of the
infinite continued fraction $1 + \overline{1\!\nearrow\!2}$ that corresponds to the infinite strip, this implies
that the value of the continued fraction $1 + \overline{1\!\nearrow\!2}$ is $\sqrt{2}$.
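The limiting argument can be watched numerically. The sketch below follows only the convergents $1/1, 3/2, 7/5, 17/12, \ldots$ of the continued fraction, which are some but not all of the vertices along the strip. The update $(x, y) \mapsto (x + 2y, x + y)$ is a standard recurrence for these convergents and is an assumption of this sketch rather than something stated in the text.

```python
import math

x, y = 1, 1
rows = []
for _ in range(10):
    n = x * x - 2 * y * y            # value of the form at the vertex x/y
    rows.append((x, y, n, x / y))
    x, y = x + 2 * y, x + y          # next convergent

for x_, y_, n, ratio in rows:
    # (x/y)^2 = 2 + n/y^2, and n/y^2 shrinks because n stays bounded.
    print(x_, y_, n, ratio, n / y_**2)

print(math.sqrt(2))
```

Along these vertices $n$ alternates between $-1$ and $+1$ while $y$ grows, so $n/y^2$ tends to $0$ and the ratios close in on $\sqrt{2}$ from alternating sides.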
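Here is another example, for the quadratic form $x^2 - 3y^2$, showing how $\sqrt{3} = 1 + \overline{1\!\nearrow\!1 + 1\!\nearrow\!2}$.
[Figure: the infinite strip of triangles along the separator line of $x^2 - 3y^2$, with the value $1$ above the line and the values $-2$ and $-3$ below it.]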
After looking at these two examples one can see that it is not really necessary to draw
the strip of triangles, and one can just read off the continued fraction directly from
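the periodic separator line. Let us illustrate this by considering the form $x^2 - 10y^2$:
[Figure: the periodic separator line of $x^2 - 10y^2$, with the values $1, 6, 9, 10, 9, 6, 1, \ldots$ above the line and $-10, -9, -6, -1, -6, -9, -10, \ldots$ below it, starting from the regions $1/0$ and $0/1$.]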
If one moves toward the right along the horizontal line starting at a point in the edge
separating the $1/0$ region from the $0/1$ region, one first encounters 3 edges leading off
to the right (downward), then 6 edges leading off to the left (upward), then 6 edges
leading off to the right, and so on. This means that the continued fraction for $\sqrt{10}$ is
$3 + \overline{1\!\nearrow\!6}$.
Here is a more complicated example showing how to compute the continued fraction
for $\sqrt{19}$:
[Figure: the periodic separator line of $x^2 - 19y^2$.]
From this we read off that $\sqrt{19} = 4 + \overline{1\!\nearrow\!2 + 1\!\nearrow\!1 + 1\!\nearrow\!3 + 1\!\nearrow\!1 + 1\!\nearrow\!2 + 1\!\nearrow\!8}$.
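Reading the continued fraction off the separator line can be automated. The following is a minimal sketch for forms $x^2 - dy^2$ with $d$ a positive non-square: it walks to the right along the separator line using the arithmetic progression rule and records the lengths of the runs of consecutive new values on the same side of the line. The function name `sqrt_cf_from_separator` and the state variables are hypothetical.

```python
import math

def sqrt_cf_from_separator(d, n_terms):
    """First n_terms continued fraction terms of sqrt(d), read from the separator line."""
    assert d > 0 and math.isqrt(d) ** 2 != d, "d must be a positive non-square"
    # above/below: current values on either side of the line; behind: value just passed.
    above, below, behind = 1, -d, 1 - d     # Q(1,0), Q(0,1), Q(-1,1)
    terms, run_sign, run_len = [], 0, 0
    while len(terms) < n_terms:
        new = 2 * (above + below) - behind  # arithmetic progression rule
        sign = 1 if new > 0 else -1
        if run_sign in (0, sign):
            run_len += 1                    # another edge leading off to the same side
        else:
            terms.append(run_len)           # the run ended: its length is the next term
            run_len = 1
        run_sign = sign
        if new > 0:
            above, behind = new, above
        else:
            below, behind = new, below
    return terms

print(sqrt_cf_from_separator(10, 3))
print(sqrt_cf_from_separator(19, 7))
```

For $d = 10$ the first run consists of the values $-9, -6, -1$ and the second of $6, 9, 10, 9, 6, 1$, matching the 3 edges and then 6 edges counted in the text.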
In section 2 of this chapter we will prove that the topograph of the form x 2 dy 2
always has a periodic separator line whenever d is a positive integer that is not a
square. As in the examples above, this separator line always includes the edge of the
dual tree separating the vertices 1/0 and 0/1 since the form takes the positive value
+1 on 1/0 and the negative value d on 0/1 . The periodicity then implies that the
continued fraction for d has the form
p
a1 + 1
d = a0 + 1 a2 + + 1
an
with the periodic part starting immediately after the initial term a0 . In addition to
being periodic, the separator line also has mirror symmetry with respect to reflection
across the vertical line corresponding to the edge connecting 1/0 to 0/1 in the Farey
diagram. This is because the form $x^2 - dy^2$ has no $xy$ term, so replacing $x/y$ by
$-x/y$ does not change the value of the form. Once the separator line has symmetry
with respect to this vertical line, the periodicity forces it to have mirror symmetry with
respect to an infinite sequence of vertical lines, as illustrated in the following figure
for the form $x^2 - 19y^2$:
As before there is a horizontal translation giving the periodicity and there are reflec-
tional symmetries across vertical lines, but now there is an extra glide-reflection along
the strip that interchanges the positive and negative values of the form. Performing
this glide-reflection twice in succession gives the translational periodicity. Notice that
there are also 180 degree rotational symmetries about the points marked with dots
on the separator line, and these rotations account for the palindromic middle part of
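the continued fraction
$$\sqrt{13} = 3 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{6 + \cdots}}}}}$$
with the block $1, 1, 1, 1, 6$ repeating.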
The fact that the periodic part has odd length corresponds to the separator strip
having the glide-reflection symmetry. We could rewrite the continued fraction to have
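a periodic part of even length by doubling the period,
$$\sqrt{13} = 3 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{6 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{6 + \cdots}}}}}}}}}}$$
with the block $1, 1, 1, 1, 6, 1, 1, 1, 1, 6$ repeating, and this corresponds to ignoring the
glide-reflection and just considering the translational periodicity.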
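This gives
$$\sqrt{7/3} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{8 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}}}}$$
with the block $1, 1, 8, 1, 1, 2$ repeating. For the second example we use $10x^2 - 29y^2$ to
compute the continued fraction for $\sqrt{29/10}$, with the result that
$$\sqrt{29/10} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}}}$$
with the block $1, 2, 2, 1, 2$ repeating. The period of odd length here corresponds to the
existence of the glide-reflection and 180 degree rotation symmetries.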
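As one can see in these examples, the palindrome property and the relation $a_n = 2a_0$
still hold for the continued fractions for irrational numbers $\sqrt{p/q}$ assuming that
$a_0 > 0$, which is equivalent to the condition $p/q > 1$ since $a_0$ is the integer part
of $\sqrt{p/q}$. Fractions $p/q$ less than $1$ can easily be dealt with just by inverting them,
interchanging $p$ and $q$. Inverting a continued fraction
$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cdots}}}$ with repeating block $a_1, \ldots, a_n$
changes it to $\cfrac{1}{a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cdots}}}}$. For example, from the earlier
computation of $\sqrt{7/3}$ we obtain
$$\sqrt{3/7} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{8 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}}}}}$$
with the block $1, 1, 8, 1, 1, 2$ repeating after the first term $1$.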
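One might ask whether the irrational numbers $\sqrt{p/q}$ are the only numbers having
a continued fraction $a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cdots}}}$ or
$\cfrac{1}{a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cdots}}}}$, with repeating block $a_1, \ldots, a_n$,
satisfying the palindrome property and the relation $a_n = 2a_0$. The answer is yes, as
we will see later in the chapter.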
Pell's Equation
We encountered the equation $x^2 - dy^2 = 1$ briefly in Chapter 0. It is traditionally
called Pell's equation, and the similar equation $x^2 - dy^2 = -1$ is sometimes called
Pell's equation as well. If $d$ is a square then the equations are not very interesting
since in this case $d$ can be incorporated into the $y^2$ term, so one is looking at the
equations $x^2 - y^2 = 1$ and $x^2 - y^2 = -1$, which have only the trivial solutions
$(x, y) = (\pm 1, 0)$ for the first equation and $(x, y) = (0, \pm 1)$ for the second equation,
since these are the only cases when the difference between two squares is $1$. We will
therefore assume that $d$ is not a square in what follows.
As an example let us look at the equation $x^2 - 19y^2 = 1$. We drew a portion of
the periodic separator line for the form $x^2 - 19y^2$ earlier, and here it is again with
some of the fractional labels $x/y$ shown as well.
Ignoring the label 741/170 for the moment, the other fractional labels are the first
few convergents for the continued fraction for $\sqrt{19}$ that we computed before,
$$4 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{8 + \cdots}}}}}}$$
with the block $2, 1, 3, 1, 2, 8$ repeating. These fractional labels are the labels on the vertices
of the zigzag path in the infinite strip of triangles in the Farey diagram, which we can
imagine being superimposed on the separator line in the figure. The fractional label
we are most interested in is the 170/39 because this is the label on a region where
the value of the form $x^2 - 19y^2$ is $1$. This means exactly that $(x, y) = (170, 39)$ is
a solution of $x^2 - 19y^2 = 1$. In terms of continued fractions, the fraction 170/39 is
the value of the initial portion
$$4 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{2}}}}}$$
of the continued fraction for $\sqrt{19}$, with the final term of the period omitted.
Since the topograph of $x^2 - 19y^2$ is periodic along the separator line, there are
infinitely many different solutions of $x^2 - 19y^2 = 1$ along the separator line. Going
toward the left just gives the negatives $-x/y$ of the fractions $x/y$ to the right,
changing the signs of $x$ or $y$, so it suffices to see what happens toward the right. One
way to do this is to use the linear fractional transformation that gives the periodicity
translation toward the right. This transformation sends the edge $\langle 1/0, 0/1 \rangle$ of the
Farey diagram to the edge $\langle 170/39, 741/170 \rangle$. Here 741/170 is the value of the
continued fraction
$$4 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{4}}}}}}$$
obtained from the continued fraction for $\sqrt{19}$ by replacing the final number $8$ in the
period by one-half of its value, $4$.
The figure above shows why this is the right thing to do. We get an infinite sequence
of larger and larger positive solutions of $x^2 - 19y^2 = 1$ by applying the periodicity
transformation with matrix $\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix}$ to the vector $(1, 0)$. For example,
$$\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix} \begin{pmatrix} 170 \\ 39 \end{pmatrix} = \begin{pmatrix} 57799 \\ 13260 \end{pmatrix}$$
so the next solution of $x^2 - 19y^2 = 1$ after $(170, 39)$ is $(57799, 13260)$, and we could
compute more solutions if we wanted. Obviously they are getting large rather quickly.
The two 170's in the matrix $\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix}$ can hardly be just a coincidence. Notice
also that the entry 741 factors as $19 \cdot 39$, which hardly seems like it should be just a
coincidence either. Let's check that these numbers had to occur. In general, for the
form $x^2 - dy^2$ let us suppose that we have found the first solution $(x, y) = (p, q)$
after $(1, 0)$ for Pell's equation $x^2 - dy^2 = 1$, so $p^2 - dq^2 = 1$. Then based on the
previous example we suspect that the periodicity transformation is the transformation
$$\begin{pmatrix} p & dq \\ q & p \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} px + dqy \\ qx + py \end{pmatrix}$$
To check that this is correct the main thing to verify is that this transformation
preserves the values of the quadratic form. When we plug in $(px + dqy, qx + py)$ for
$(x, y)$ in $x^2 - dy^2$ we get
$$\begin{aligned}
(px + dqy)^2 - d(qx + py)^2 &= p^2x^2 + 2pdqxy + d^2q^2y^2 - dq^2x^2 - 2pdqxy - dp^2y^2 \\
&= (p^2 - dq^2)x^2 - d(p^2 - dq^2)y^2 \\
&= x^2 - dy^2 \quad \text{since } p^2 - dq^2 = 1
\end{aligned}$$
so the transformation $\begin{pmatrix} p & dq \\ q & p \end{pmatrix}$ does preserve the values of the form. Also it takes 1/0
to $p/q$, and its determinant is $p^2 - dq^2 = 1$, so it has to be the translation giving
the periodicity along the separator line. (We haven't actually proved yet that periodic
separator lines always exist for forms $x^2 - dy^2$, but we will do this in the next section.)
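The periodicity transformation can be iterated mechanically. The sketch below is illustrative only: the function name `pell_solutions` is hypothetical, and it assumes that a first solution $(p, q)$ of $x^2 - dy^2 = 1$ has already been found, for instance $(170, 39)$ when $d = 19$.

```python
def pell_solutions(p, q, d, count):
    """Apply the matrix [[p, d*q], [q, p]] repeatedly, starting from (1, 0).

    (p, q) is assumed to be a solution of x^2 - d*y^2 = 1 found beforehand.
    """
    if p * p - d * q * q != 1:
        raise ValueError("(p, q) must satisfy p^2 - d*q^2 = 1")
    x, y = 1, 0
    solutions = []
    for _ in range(count):
        # One step of the periodicity transformation (x, y) -> (px + dqy, qx + py).
        x, y = p * x + d * q * y, q * x + p * y
        solutions.append((x, y))
    return solutions

for x, y in pell_solutions(170, 39, 19, 3):
    print(x, y, x * x - 19 * y * y)
```

Python integers have arbitrary precision, so the rapid growth of the solutions is not an obstacle here.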
Are there other solutions of $x^2 - 19y^2 = 1$ besides the ones we have just described
that occur along the separator line? The answer is No because we will see in the next
section that as one moves away from the separator line in the topograph, the values
of the quadratic form change in a monotonic fashion, steadily increasing toward $+\infty$
as one moves upward above the separator line, and decreasing steadily toward $-\infty$
as one moves downward below the separator line. Thus the value $1$ occurs only along
the separator line itself. Also we see that the value $-1$ never occurs, which means
that the equation $x^2 - 19y^2 = -1$ has no integer solutions.
For an example where $x^2 - dy^2 = -1$ does have solutions, let us look again at
the earlier example of $x^2 - 13y^2$.
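To see the contrast between $d = 19$ and $d = 13$ concretely, one can tabulate the values of $x^2 - dy^2$ at the convergents $x/y$ of $\sqrt{d}$. This is a minimal sketch using the classical continued fraction recurrence, not the topograph itself; the function name `convergent_values` is hypothetical, and listing finitely many convergents only illustrates, and does not prove, that the value $-1$ is absent for $d = 19$.

```python
from math import isqrt

def convergent_values(d, count):
    """List (x, y, x^2 - d*y^2) for the first `count` convergents x/y of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0
    x_prev, y_prev = 1, 0        # the label 1/0
    x, y = a0, 1                 # first convergent a0/1
    out = [(x, y, x * x - d * y * y)]
    while len(out) < count:
        m = a * q - m
        q = (d - m * m) // q
        a = (a0 + m) // q        # next partial quotient
        x, x_prev = a * x + x_prev, x
        y, y_prev = a * y + y_prev, y
        out.append((x, y, x * x - d * y * y))
    return out

print(convergent_values(19, 6))  # one full period for d = 19
print(convergent_values(13, 5))  # one full period for d = 13
```

For $d = 13$ the fifth convergent is 18/5, and $18^2 - 13 \cdot 5^2 = -1$.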
The values of the increment h along the boundary of a region in the topograph
have the interesting property that they also form an arithmetic progression:
We will call this property the Second Arithmetic Progression Rule. To see why it is true,
start with the edge labeled h in the figure, with the adjacent regions labeled p and
q . The original Arithmetic Progression Rule then gives the value p + q + h in the next
region to the right, and another application of the rule gives the label h + 2p on the
next edge. Thus the edge label increases by 2p when we move from one edge to the
next edge to the right, so by repeated applications of this fact we see that we have an
arithmetic progression of edge labels all along the border of the region labeled p .
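The rule can be checked on the border of the 1/0 region, whose neighboring regions are the regions $n/1$. The sketch below assumes the convention of the Arithmetic Progression Rule that the label on an edge is the value in the region ahead of the edge minus the sum of the two values on either side; the function name `edge_labels_around_region` and the sample form $3x^2 + xy - 2y^2$ are chosen only for illustration.

```python
def edge_labels_around_region(a, b, c, ns):
    """Edge labels h on the border of the 1/0 region of a*x^2 + b*x*y + c*y^2.

    The edge separating the regions 1/0 and n/1 leads toward the region
    (n+1)/1, so its label is h = Q(n+1, 1) - (Q(1, 0) + Q(n, 1)).
    """
    def Q(x, y):
        return a * x * x + b * x * y + c * y * y
    return [Q(n + 1, 1) - (Q(1, 0) + Q(n, 1)) for n in ns]

labels = edge_labels_around_region(3, 1, -2, range(-3, 4))
print(labels)  # consecutive labels differ by 2p = 2 * Q(1, 0) = 6
```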
Another thing worth noting at this point is something that we will refer to as
the Monotonicity Property: If the three labels p , q , and h adjacent to an edge are
all positive, then so are the three labels for the next
two edges in front of this edge, and the new labels
are larger than the old labels. It follows that when
one continues forward out this part of the topograph,
all the labels become monotonically larger the farther
one goes. Similarly, when the original three labels are
negative, all the labels become larger and larger negative.
This is really just the same principle applied to
the negative $-Q(x, y)$ of the original form $Q(x, y)$.
Proof : For the given form $Q(x, y) = ax^2 + bxy + cy^2$, the regions 1/0 and 0/1 in the
topograph are labeled $a$ and $c$, and the edge in the topograph
separating these two regions has $h = b$ since the 1/1 region is
labeled $a + b + c$. So the statement of the proposition is correct
for this edge. For other edges we proceed by induction, moving
farther and farther out the tree. For the induction step suppose
we have two adjacent edges labeled h and k as in the figure,
Hyperbolic Forms
The most interesting of the four types of quadratic forms are the hyperbolic
forms. We will show that these all have a periodic separator line as in the examples
$x^2 - dy^2$ and $qx^2 - py^2$ that we looked at earlier.
Theorem. For a hyperbolic form Q(x, y) the edges of the topograph for which the
two adjacent regions are labeled by numbers of opposite sign form a line which is
infinite in both directions, and the topograph is periodic along this line.
Proof : Since the form is hyperbolic, all regions of the topograph have labels that are
either positive or negative, never zero. There must exist two regions of opposite sign
since Q is hyperbolic, and by moving along a path in the topograph joining these two
regions we will somewhere encounter two adjacent regions of opposite sign. Thus
there must exist edges whose two adjacent regions have opposite sign. Let us call
these edges separating edges. If we apply the discriminant formula $\Delta = h^2 - 4pq$ in the
preceding proposition to a separating edge, we see that $\Delta$ must be positive since $p$
and $q$ are nonzero and have opposite sign, so $-4pq$ is positive while $h^2$ is positive
or zero. Thus a hyperbolic form must have positive discriminant.
At an end of a separating edge the value of Q in the next region must be either
positive or negative since Q does not take the value 0 :
This implies that exactly one of the two edges at the end of the first separating edge
is also a separating edge. Repeating this argument, we see that each separating edge
is part of a line of separating edges that is infinite in both directions (and the edges
that lead off from this edge are not separating edges).
As we move off this separating line the values of Q are steadily increasing on
the positive side and steadily decreasing on the negative side, by the monotonicity
property, so there are no other separating edges that are not on this line.
It remains to prove that the topograph is periodic along the separating line. We
can assume all the edges along the line are oriented in the same direction, by changing
the signs of the $h$ values where necessary. For an edge of the line labeled $h$ with
adjacent regions labeled $p$ and $-q$, with $p, q > 0$, we know that $h^2 + 4pq$ is equal
to the discriminant $\Delta$. From the equation $\Delta = h^2 + 4pq$ we obtain the inequalities
$|h| < \sqrt{\Delta}$, $p \le \Delta/4$, and $q \le \Delta/4$. Thus there are only finitely many possible values
for h , p , and q along the separator line. Hence there are only finitely many possible
combinations of values h , p , and q for each edge on the separator line. It follows
that there must be two edges on the line that have the same values of h , p , and q .
Since the topograph is uniquely determined by the three labels h , p , q at a single
edge, the translation of the line along itself that takes one edge to another edge with
the same three labels must preserve all the labels on the line. This shows that the
separator line is periodic, including the values of Q .
Conceivably there might be just a single region on one side of the separator line,
but this doesn't actually happen: There must be edges leading away from the separating
line on both the side where the form is positive and on the side where it is
negative. For if there were just a single region on one side of the line, the second arithmetic
progression rule would say that the h labels along the line formed an infinite
arithmetic progression, and hence the h values would not be bounded, contradicting
the fact that there are only finitely many different values for h along the separator
line, as we just showed.
Here is an interesting consequence of the periodicity of the separator line:
In the previous chapter we gave an argument that showed that infinite continued
fractions that are eventually periodic always represent quadratic irrational numbers.
This is one half of Lagrange's Theorem, and now we can prove the other half, the
converse statement:
Proof : A quadratic irrational number $\alpha$ has the form $A + B\sqrt{n}$ where $A$ and $B$ are
rational numbers and $n$ is a positive integer that is not a square. Letting $\bar\alpha$ be the
conjugate $A - B\sqrt{n}$ of $\alpha$, we see that $\alpha$ and $\bar\alpha$ are roots of the quadratic equation
$(x - \alpha)(x - \bar\alpha) = x^2 - 2Ax + (A^2 - nB^2) = 0$ whose coefficients are rational numbers.
After multiplying through by a common denominator we can replace this equation
by an equation $ax^2 + bx + c = 0$ with integer coefficients having $\alpha$ and $\bar\alpha$ as roots.
The leading coefficient $a$ is nonzero since it arose from multiplying by a common
denominator.
From the quadratic equation $ax^2 + bx + c = 0$ we obtain a quadratic form
$Q(x, y) = ax^2 + bxy + cy^2$ with the same coefficients $a, b, c$. We claim that this
quadratic form is hyperbolic. It cannot take on the value $0$ at an integer pair
$(x, y) \ne (0, 0)$ since if $ax^2 + bxy + cy^2 = 0$ then we cannot have $y = 0$, otherwise the
equation would become $ax^2 = 0$ with $a \ne 0$, forcing $x$ to be $0$ as well. Since $y \ne 0$
we can divide the equation $ax^2 + bxy + cy^2 = 0$ by $y^2$ to get a quadratic equation
$a(x/y)^2 + b(x/y) + c = 0$ with a rational root $x/y$, contrary to the assumption
that the root $\alpha$, and hence also $\bar\alpha$, was irrational. Thus the quadratic form $Q(x, y)$
does not take on the value $0$. To see that $Q(x, y)$ takes on both positive and negative
values, note that $a(x/y)^2 + b(x/y) + c$ takes on both positive and negative
values at rational numbers $x/y$ since the graph of the function $ax^2 + bx + c$ is a
parabola crossing the $x$-axis at two distinct points $\alpha$ and $\bar\alpha$. Multiplying the formula
$a(x/y)^2 + b(x/y) + c$ by the positive number $y^2$, it follows that $ax^2 + bxy + cy^2$
also takes on both positive and negative values at integer pairs $(x, y)$.
Since $Q$ is hyperbolic, its topograph contains a periodic line separating the positive
and negative values. This corresponds to a strip in the Farey diagram which is
infinite in both directions. The fractions $x_n/y_n$ labeling the vertices along this strip
have both $x_n$ and $y_n$ approaching $\pm\infty$ as $n$ goes to $\pm\infty$. (The only way this could fail
for a path consisting of an infinite sequence of distinct edges in the dual tree would be
if all the edges from some point onward bordered the 1/0 or 0/1 region, which is not
the case here since periodic separator lines have only a finite number of edges bordering
a given region.) The values $Q(x_n, y_n) = ax_n^2 + bx_ny_n + cy_n^2 = k_n$ are bounded,
ranging over a finite set along the strip. Thus $a(x_n/y_n)^2 + b(x_n/y_n) + c = k_n/y_n^2 \to 0$
as $n \to \pm\infty$, so at one end of the strip we have $x_n/y_n \to \alpha$ and at the other end we
have $x_n/y_n \to \bar\alpha$. Joining either end of the strip to 1/0 in the Farey diagram then
gives infinite strips corresponding to infinite continued fractions for $\alpha$ and $\bar\alpha$ that
are eventually periodic.
Let us look at an example to illustrate the procedure in the proof of this theorem.
We will use a quadratic form to compute the continued fractions for the two quadratic
irrationals $\frac{10 \pm \sqrt{2}}{14}$. The equation $(x - \alpha)(x - \bar\alpha) = 0$ is $x^2 - \frac{10}{7}x + \frac{1}{2} = 0$, so with
integer coefficients this becomes $14x^2 - 20x + 7 = 0$. The associated quadratic form
is $14x^2 - 20xy + 7y^2$. To compute the topograph we start with the three values at
1/0, 0/1, and 1/1 and work toward the separator line:
This figure lies in the upper half of the circular Farey diagram where the fractions
$x/y$ are positive, so if we follow the separator line out to the right we approach the
smaller of the two roots of $14x^2 - 20x + 7 = 0$, which is $\frac{10 - \sqrt{2}}{14}$, and if we follow the
separator line to the left we approach the larger root, $\frac{10 + \sqrt{2}}{14}$. To get the continued
fraction for the smaller root we follow the path in the figure that starts with the edge
between 1/0 and 0/1, then zigzags up to the separator line, then goes out this line
to the right. If we straighten this path out it looks like the following:
It is not actually necessary to redraw the straightened-out path since in the original
form of the topograph we can read off the sequence of left and right side roads as
we go along the path, the sequence LRLRLLRR where L denotes a side road to the
left and R a side road to the right. This sequence determines the continued fraction.
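The partial quotients for both roots can also be computed arithmetically as a cross-check. The sketch below assumes the classical recurrence for the continued fraction of a quadratic irrational written as $(P + \sqrt{D})/Q$ with $Q$ dividing $D - P^2$, which is a different technique from reading side roads off the topograph; the function name `quadratic_cf` is hypothetical.

```python
from math import isqrt

def quadratic_cf(P, D, Q, count):
    """First `count` partial quotients of (P + sqrt(D)) / Q.

    Assumes D is a positive non-square integer and Q divides D - P*P,
    so that every step stays in integers.
    """
    if (D - P * P) % Q != 0:
        raise ValueError("Q must divide D - P^2")
    r = isqrt(D)
    terms = []
    for _ in range(count):
        if Q > 0:
            a = (P + r) // Q             # floor((P + sqrt(D)) / Q)
        else:
            a = (-P - (r + 1)) // (-Q)   # same floor, rewritten with a positive denominator
        terms.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q
    return terms

# (10 - sqrt(2)) / 14 is written as (-10 + sqrt(2)) / (-14) to fit the assumed shape.
print(quadratic_cf(-10, 2, -14, 8))
print(quadratic_cf(10, 2, 14, 6))    # (10 + sqrt(2)) / 14
```

For the smaller root the leading term is 0, reflecting that the root is less than 1, and the following terms 1, 1, 1, 1, 2, 2 match the side-road sequence LRLRLLRR.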
For the other root $\frac{10 + \sqrt{2}}{14}$ the straightened-out path has the following shape:
A natural question to ask is whether every periodic line in the dual tree of the
Farey diagram is the separator line of some hyperbolic form, and the answer is yes.
To find the form one first uses the periodic line to construct a continued fraction
that is eventually periodic, then one computes the value of this continued fraction
by finding a quadratic equation that it satisfies, and this quadratic equation gives the
desired quadratic form. As an example, let us find a quadratic form whose periodic
line looks like the following:
This provides a realization of the given periodic line as the separator line in the to-
pograph of a quadratic form. Notice that the separator line is not symmetric under
reflection across any vertical line, unlike all the separator lines we have seen up to
this point. This is the simplest example without this bilateral symmetry property
bilateral symmetry, as do the strips for continued fractions $\cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}$ if two
Elliptic Forms
An elliptic quadratic form Q(x, y) takes on either all positive or all negative
values at integer pairs $(x, y) \ne (0, 0)$. The two cases are equivalent since one can
switch from one to the other just by putting a minus sign in front of Q . Thus it
suffices to consider the case that Q takes on only positive values, and we will assume
we are in this case from now on.
Let p be the minimum value taken on by Q , and consider
a region of the topograph where Q takes the value p . All the
edges having one endpoint at this region are oriented away
from the region, by the arithmetic progression rule and the
assumption that p is the minimum value of Q . The mono-
tonicity property then implies that all edges farther away from
the p region are also oriented away from the region, and the
values of Q increase as one moves away from the region.
We know that the h -labels on the edges making up the
border of the p region form an arithmetic progression with
increment 2p . There are two possibilities for these h -labels:
(I) Some edge bordering the $p$ region has the label $h = 0$. The topograph then has
the form shown in the first figure below. An example of such a form is $px^2 + qy^2$.
We call the 0-labeled edge a source edge since all other edges are oriented away from
this edge.
(II) No edge bordering the $p$ region has label $h = 0$. Since the labels on these edges
form an arithmetic progression, there must be some vertex where the terms in the
progression change sign, and so all three edges meeting at this vertex will be oriented
away from the vertex, as in the second figure above. We call this a source vertex since
all edges in the topograph are oriented away from this vertex. The fact that the three
edges leading from a source vertex all point away from the vertex is
equivalent to the three triangle inequalities
$$p < q + r, \qquad q < p + r, \qquad r < p + q$$
for the values $p, q, r$ of $Q$ in the three regions surrounding the vertex.
Proof : In the case of a source edge with the label $h = 0$ separating regions labeled $p$
and $q$, the discriminant is $\Delta = h^2 - 4pq = -4pq$, which is negative. In the case of
a source vertex with adjacent regions labeled $p, q, r$, the edge between the $p$ and $q$
regions is labeled $h = p + q - r$ so we have
$$\begin{aligned}
\Delta = h^2 - 4pq &= (p + q - r)^2 - 4pq \\
&= p^2 + q^2 + r^2 - 2pq - 2pr - 2qr \\
&= p(p - q - r) + q(q - p - r) + r(r - p - q)
\end{aligned}$$
In the last line the three quantities in parentheses are negative by the triangle inequalities,
so $\Delta$ is negative.
A special case is $h = 0$. Then the topograph is as shown in the next figure, and the
form is parabolic with discriminant $\Delta = h^2 = 0$. Notice that the topograph is periodic
along the 0 region since it consists of the same tree pattern repeated infinitely often.
We have seen that the discriminant of a form that takes the value 0 is a square.
Here is the converse:
Proof : Suppose first that the form $Q(x, y) = ax^2 + bxy + cy^2$ has $a \ne 0$. Then
the equation $aX^2 + bX + c = 0$ has roots $X = (-b \pm \sqrt{b^2 - 4ac})/2a$. If $b^2 - 4ac$
is a square, this means the roots are rational. If $X = p/q$ is a rational root then
$a(p/q)^2 + b(p/q) + c = 0$ and hence $ap^2 + bpq + cq^2 = 0$ so $Q$ takes the value $0$ at a
pair $(p, q)$ with $q \ne 0$. There remains the case that $a = 0$, so $Q(x, y) = bxy + cy^2$,
which is $0$ at $(x, y) = (1, 0)$.
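The argument in this proof is constructive, and it can be followed step by step in code. The sketch below is illustrative only; the function name `zero_of_form` is hypothetical, and it returns one pair at which the form vanishes rather than all of them.

```python
from fractions import Fraction
from math import isqrt

def zero_of_form(a, b, c):
    """Return a pair (p, q) != (0, 0) with a*p^2 + b*p*q + c*q^2 == 0,
    or None if the discriminant b^2 - 4ac is not a square."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    s = isqrt(disc)
    if s * s != disc:
        return None
    if a == 0:
        return (1, 0)                 # Q(x, y) = b*x*y + c*y^2 vanishes at (1, 0)
    root = Fraction(-b + s, 2 * a)    # a rational root X = p/q of a*X^2 + b*X + c
    return (root.numerator, root.denominator)

print(zero_of_form(2, -7, 3))
print(zero_of_form(1, 0, -2))
```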
Equivalence of Forms
In the pictures of topographs we have drawn, we often omit the fractional labels
x/y for the regions in the topograph since the more important information is of-
ten just the values Q(x, y) of the form. This leads to the idea of considering two
quadratic forms to be equivalent if their topographs look the same when the labels
x/y are disregarded. For a precise definition, one can say that quadratic forms Q1
and Q2 are equivalent if there is a vertex v1 in the topograph of Q1 and a vertex v2
in the topograph of Q2 such that the values of Q1 in the three regions surrounding
v1 are equal to the values of Q2 in the three regions surrounding v2 . Since the three
values around a vertex determine all the other values in a topograph, this guarantees
that the topographs look the same everywhere, if the labels x/y are omitted.
With this definition, a topograph and its mirror image correspond to equivalent
forms since the mirror image topograph has the same three labels around each vertex
as in the corresponding vertex of the original topograph. For example, switching the
variables x and y reflects the circular Farey diagram across its vertical axis and hence
reflects the topograph of a form Q(x, y) to the topograph of the equivalent form
$Q(y, x)$. As another example, the forms $ax^2 + bxy + cy^2$ and $ax^2 - bxy + cy^2$ are
equivalent since they are related by changing $(x, y)$ to $(-x, y)$, reflecting the Farey
diagram across its horizontal axis, with a corresponding reflection of the topograph.
Theorem. Up to equivalence, there are just a finite number of forms with a given
discriminant, except in the special case that the discriminant is zero.
This fails to hold for forms of discriminant 0 , the parabolic forms, since multi-
plying such a form by different integers produces infinitely many inequivalent forms.
Proof : Consider first the case of forms of positive discriminant. These are either
hyperbolic or 0-hyperbolic. Hyperbolic forms have a separator line. For an edge in
the separator line labeled $h$ with adjacent regions labeled $p > 0$ and $-q < 0$ we have
$\Delta = h^2 + 4pq$, so each of the quantities $|h|$, $p$, and $q$ is bounded in size by $\Delta$. This
means that for fixed $\Delta$ there are only finitely many possibilities for $h$, $p$, and $q$ for
each edge of the separator line, hence just finitely many possible combinations of $h$,
$p$, and $q$ for each edge, so there are just finitely many possibilities for the form, up
to equivalence. The same reasoning applies also to 0-hyperbolic forms that have a
separating edge in their topograph. The only ones that do not have a separating edge
are the ones with two adjacent regions labeled 0. In this case the edge separating
these two regions has $h^2 = \Delta$, so the value of $h$ on this edge is determined by $\Delta$,
hence the form is determined by $\Delta$.
For forms of negative discriminant we can assume we are dealing with positive
elliptic forms since a form $Q$ and its negative $-Q$ have the same discriminant. If
a positive elliptic form has a source edge in its topograph, this edge has $h = 0$ so
$\Delta = -4pq$ where $p$ and $q$ are the values of $Q$ in the adjacent regions. For fixed $\Delta$
there are only finitely many choices of $p$ and $q$ satisfying $\Delta = -4pq$, so there are only
finitely many positive elliptic forms of discriminant $\Delta$ having a source edge. In the
other case of a source vertex surrounded by values $p, q, r$ of the form, we obtained
the formula $\Delta = p(p - q - r) + q(q - p - r) + r(r - p - q)$ with the three quantities in
parentheses being negative, so $p + q + r \le |\Delta|$ and hence there are only finitely many
possibilities for $p$, $q$, and $r$ for each $\Delta$.
From these topographs it is apparent that the two forms are not equivalent, and also
that the negatives of these two forms, $-x^2 + 15y^2$ and $-3x^2 + 5y^2$, give two more
inequivalent forms. To see whether there are others we use the formula
$\Delta = 60 = h^2 + 4pq$ relating the values $p$ and $-q$ along an edge labeled $h$ in the separator line,
with $p > 0$ and $q > 0$. The various possibilities are listed in the table below. Note
that the equation $60 = h^2 + 4pq$ implies that $h$ has to be even.
h    pq    (p, q)
0    15    (1, 15), (3, 5), (5, 3), (15, 1)
2    14    (1, 14), (2, 7), (7, 2), (14, 1)
4    11    (1, 11), (11, 1)
6     6    (1, 6), (2, 3), (3, 2), (6, 1)
Each combination of values for h , p , and q in the table occurs at some edge along the
separator line in one of the two topographs shown above, or the negatives of these
topographs. Hence every form of discriminant 60 is equivalent to one of these four.
If it had not been true that all the possibilities in the table occurred in the topographs
of the forms we started with, we could have used these other possibilities for h , p ,
and q to generate new topographs and hence new forms, eventually exhausting all
the finitely many possibilities.
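The table of possibilities can be generated mechanically from the equation $h^2 + 4pq = 60$. The following sketch is illustrative; the function name `separator_edge_table` is hypothetical, and like the table it lists only $h \ge 0$ and positive $p$, $q$.

```python
from math import isqrt

def separator_edge_table(disc):
    """Map each h >= 0 to the pairs (p, q), p, q > 0, with h^2 + 4pq = disc."""
    table = {}
    for h in range(isqrt(disc) + 1):
        rest = disc - h * h
        if rest <= 0 or rest % 4 != 0:
            continue                  # no positive integer product pq for this h
        pq = rest // 4
        table[h] = [(p, pq // p) for p in range(1, pq + 1) if pq % p == 0]
    return table

for h, pairs in separator_edge_table(60).items():
    print(h, pairs)
```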
For finding all the positive elliptic quadratic forms of a given discriminant, up
to equivalence, the procedure is simpler since one doesn't have to actually draw any
topographs. At a source vertex or edge in the topograph for such
a form $Q$ let the smaller two of the three adjacent values of $Q$ be
$a \le c$, with the edge between them labeled $h \ge 0$, so that the third
adjacent value of $Q$ is $a + c - h$. The form is then equivalent to the
form $ax^2 + hxy + cy^2$. Since $a$ and $c$ are the smallest values of
$Q$ we have $a \le c \le a + c - h$, and the latter inequality implies that
$h \le a$. Thus we have the inequalities $0 \le h \le a \le c$. Note that these
inequalities imply the three triangle inequalities at the source vertex
or edge: $a + c - h \le a + c$, $a < c + (a + c - h)$, and $c < a + (a + c - h)$. For the
discriminant $\Delta = -D$ we have $D = 4ac - h^2$, so we are seeking solutions of
$$4ac = h^2 + D \quad \text{with} \quad 0 \le h \le a \le c.$$
The number $h$ must have the same parity as $D$, and we can bound the choices for $h$ by
the inequalities $4h^2 \le 4a^2 \le 4ac = D + h^2$ which imply $3h^2 \le D$, or $h^2 \le D/3$. Every
positive elliptic form is equivalent to one of the forms $ax^2 + hxy + cy^2$ for triples
$(a, h, c)$ satisfying these conditions $4ac = h^2 + D$, $0 \le h \le a \le c$, and $h^2 \le D/3$.
Different choices of $(a, h, c)$ satisfying these conditions never give forms that are
equivalent since $a$ and $c$ are the labels on the two regions in the topograph where the
form takes its smallest values, and $h$ is determined by $a$, $c$, and $D$ by the formula
$4ac = h^2 + D$.
As an example, when $D = 80$ we must have $h$ even and $h^2 \le 80/3$ so $h$ must
be $0$, $2$, or $4$. The corresponding values of $a$ and $c$ that are possible can then be
computed from the equation $4ac = 80 + h^2$, keeping in mind that $h \le a \le c$. The
possibilities are shown in the following table:
h    ac    (a, c)
0    20    (1, 20), (2, 10), (4, 5)
2    21    (3, 7)
4    24    (4, 6)
Thus every positive elliptic form of discriminant $-80$ is equivalent to one of the forms
$x^2 + 20y^2$, $2x^2 + 10y^2$, $4x^2 + 5y^2$, $3x^2 + 2xy + 7y^2$, or $4x^2 + 4xy + 6y^2$, and no
two of these are equivalent to each other, as explained earlier.
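The search described above is short enough to automate. The sketch below enumerates the triples $(a, h, c)$ with $4ac = h^2 + D$, $0 \le h \le a \le c$, and $h^2 \le D/3$; the function name `reduced_forms` is hypothetical.

```python
from math import isqrt

def reduced_forms(D):
    """Triples (a, h, c) with 4ac = h^2 + D and 0 <= h <= a <= c, for D > 0.

    Each triple stands for the positive elliptic form a*x^2 + h*x*y + c*y^2
    of discriminant -D.
    """
    forms = []
    h = D % 2                     # h must have the same parity as D
    while 3 * h * h <= D:         # the bound h^2 <= D/3
        n = D + h * h
        if n % 4 == 0:
            ac = n // 4
            # a <= c is the same as a <= sqrt(ac); h <= a gives the lower end.
            for a in range(max(h, 1), isqrt(ac) + 1):
                if ac % a == 0:
                    forms.append((a, h, ac // a))
        h += 2
    return forms

print(reduced_forms(80))
```

For $D = 80$ this reproduces the five triples in the table above.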
We know there are only finitely many forms of a given nonzero discriminant, up
to equivalence, but what about the question of whether every integer occurs as the
discriminant of a form? For a form $ax^2 + bxy + cy^2$ we have $\Delta = b^2 - 4ac$, and this
is congruent to $b^2$ mod 4. A square such as $b^2$ is always congruent to 0 or 1 mod 4,
so the discriminant of a form is always congruent to 0 or 1 mod 4. Conversely, for
every integer $\Delta$ congruent to 0 or 1 mod 4 there exists a form whose discriminant is
$\Delta$. Namely, if $\Delta = 4k$ then the form $x^2 - ky^2$ has discriminant $4k$, and if $\Delta = 4k + 1$
then the form $x^2 + xy - ky^2$ has discriminant $4k + 1$. Here $k$ can be positive, negative,
or zero. The forms $x^2 - ky^2$ and $x^2 + xy - ky^2$ are called the principal quadratic
forms of these discriminants.
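As a small illustration of the converse direction, the principal form can be written down directly from the discriminant. In this sketch the function name `principal_form` is hypothetical, and Python's floor division is relied on so that negative discriminants produce the right $k$.

```python
def principal_form(disc):
    """Coefficients (a, b, c) of the principal form of discriminant disc."""
    if disc % 4 == 0:
        return (1, 0, -(disc // 4))      # x^2 - k*y^2 with disc = 4k
    if disc % 4 == 1:
        return (1, 1, -(disc // 4))      # x^2 + x*y - k*y^2 with disc = 4k + 1
    raise ValueError("a discriminant must be congruent to 0 or 1 mod 4")

for disc in (-163, -80, -3, 0, 1, 8, 60):
    a, b, c = principal_form(disc)
    print(disc, (a, b, c), b * b - 4 * a * c)
```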
It was conjectured by Gauss around 1800 that this is the complete list for negative
discriminants. It was shown in the 1930s that there is at most one more, and then in
the 1960s the possibility of an elusive tenth such discriminant was finally ruled out,
finishing the proof of the conjecture. For positive discriminant there are many more
cases where the class number is 1, but it is still unknown whether there are infinitely
many such discriminants.
In the nine cases $D = 3, 4, 7, 8, 11, 19, 43, 67, 163$ it is very easy to check that all
forms are equivalent. For example when $D = 163$ we must have $h$ odd with
$h^2 \le 163/3$ so the only possibilities are $h = 1, 3, 5, 7$. From the equation $4ac = 163 + h^2$
the corresponding values of $ac$ are 41, 43, 47, 53 which all happen to be primes, and
since $a \le c$ this forces $a$ to be 1 in each case. But since $h \le a$ this means $h$ must
be 1, and we obtain the single quadratic form $ax^2 + hxy + cy^2 = x^2 + xy + 41y^2$.
The corresponding polynomial $x^2 + x + 41$ has a curious property discovered by
Euler: For each $x = 0, 1, 2, \ldots, 39$ the value of $x^2 + x + 41$ is a prime number. Here
are these primes:
41 43 47 53 61 71 83 97 113 131 151 173 197 223 251 281 313 347 383 421
461 503 547 593 641 691 743 797 853 911 971 1033 1097 1163 1231 1301
1373 1447 1523 1601
Notice that the successive differences between these numbers are $2, 4, 6, 8, \ldots$. The
next number in the sequence would be $1681 = 41^2$, not a prime. (Write $x^2 + x + 41$
as $x(x + 1) + 41$ to see why $x = 40$ must give a nonprime value.) A similar thing
happens for the other values of $D$. The nontrivial cases are:
D     polynomial         prime values
7     $x^2 + x + 2$      2
11    $x^2 + x + 3$      3 5
19    $x^2 + x + 5$      5 7 11 17
43    $x^2 + x + 11$     11 13 17 23 31 41 53 67 83 101
67    $x^2 + x + 17$     17 19 23 29 37 47 59 73 89 107 127 149 173 199 227 257
It's interesting that these lists include all primes less than 100 except for 79.
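The runs of prime values in these lists are easy to confirm by direct computation. The sketch below uses plain trial division, which is adequate for numbers of this size; the function names `is_prime` and `prime_run` are hypothetical.

```python
from math import isqrt

def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    return all(n % k for k in range(2, isqrt(n) + 1))

def prime_run(c):
    """Number of consecutive x = 0, 1, 2, ... for which x^2 + x + c is prime."""
    x = 0
    while is_prime(x * x + x + c):
        x += 1
    return x

for c in (2, 3, 5, 11, 17, 41):
    print(c, prime_run(c))
```

The loop always stops, since $x = c - 1$ gives the value $c^2$.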
Inspecting the values here, we see that the following two statements appear to be true:
(1) The prime numbers that occur as values of $x^2 + 2y^2$ are $2$ and the primes
congruent to $1$ or $3$ modulo $8$. In the part of the topograph shown these are
$3, 11, 17, 19, 41, 43, 59, 67, 73, 83, 89, 97$. The remaining primes are congruent to
$5$ or $7$ modulo $8$ and these do not occur as values of $x^2 + 2y^2$.
(2) The values of $x^2 + 2y^2$ are exactly the numbers that can be expressed as products
$m^2 p_1 p_2 \cdots p_k$ where $m$ is an arbitrary integer and each $p_i$ is a prime value of
$x^2 + 2y^2$ as in (1).
These statements are in fact true and were also known to Fermat.
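Statement (1) can be checked numerically for small primes. Since $x^2 + 2y^2$ is positive for $(x, y) \ne (0, 0)$, all of its values below a given limit come from a finite range of $x$ and $y$, so the enumeration in this Python sketch is exhaustive below `LIMIT` (a name chosen here for illustration).

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

LIMIT = 100
# Every value below LIMIT has x^2 < LIMIT and 2*y^2 < LIMIT.
values = {x * x + 2 * y * y
          for x in range(isqrt(LIMIT) + 1)
          for y in range(isqrt(LIMIT // 2) + 1)}

primes = [p for p in range(2, LIMIT) if is_prime(p)]
represented = [p for p in primes if p in values]
predicted = [p for p in primes if p == 2 or p % 8 in (1, 3)]
print(represented)
print(represented == predicted)    # True
```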
These two examples were elliptic forms, but the same sort of behavior can occur
for hyperbolic forms, as we see in the next example, the form $x^2 - 2y^2$. The negative
values of this form happen to be just the negatives of the positive values, so we need
only show the positive values in the topograph:
Here the primes that occur are $2$ and primes congruent to $1$ or $7$ modulo $8$. We
can count the negative of a prime number as a prime as well, and then the primes
represented are $\pm 2$ and the primes congruent to $\pm 1$ modulo $8$. The nonprime values
are the products of the primes represented, together with squares times such products.
In these three examples the crucial idea was to look at prime factorizations and at
primes modulo certain numbers, the numbers $4$, $8$, and $8$ in the three cases. Notice
that these numbers are just the absolute values of the discriminants $-4$, $-8$, and $8$
in the three cases. Looking at primes modulo $|\Delta|$ turns out to be a key idea for all
quadratic forms, as we will see.
A special feature of the discriminants $-4$, $-8$, and $8$ is that all forms of each
of these discriminants are equivalent, or in other words, the class numbers are $1$ for
these discriminants. It is a general fact that whenever the class number is $1$, the
representation problem has the same sort of simple answer as in the examples above.
An example with slightly more complicated behavior is the form $x^2 - 10y^2$. Here
is a portion of its topograph showing all the positive values less than $100$:
There is no need to show any more of the negative values since these will just be the
negatives of the positive values. The prime values less than $100$ are $31, 41, 71, 79, 89$.
These are the primes congruent to $\pm 1$ or $\pm 9$ modulo $40$, the discriminant. However,
in contrast to what happened in the previous examples, there are many nonprime
values that are not products of these prime values. In fact these nonprime values
are products of the primes $2, 3, 5, 13, 37, 43$, none of which occur as a value of the
form. Rather miraculously, these prime values are realized instead by another form
$2x^2 - 5y^2$ having the same discriminant as $x^2 - 10y^2$. Here is the topograph of this
companion form $2x^2 - 5y^2$:
The prime values this form takes on are $2$ and $5$, which are the prime divisors of the
discriminant $40$, along with primes congruent to $\pm 3$ and $\pm 13$ modulo $40$, namely
$3, 13, 37, 43, 53, 67, 83$.
Apart from the primes $2$ and $5$ that divide the discriminant $40$, the possible
values of primes modulo $40$ are $\pm 1, \pm 3, \pm 7, \pm 9, \pm 11, \pm 13, \pm 17, \pm 19$ since even
numbers and multiples of $5$ are excluded. There are $16$ different congruence classes
here, and exactly half of them, $8$, are realized by one or the other of the two forms
$x^2 - 10y^2$ and $2x^2 - 5y^2$, with $4$ classes realized by each form. The other $8$
congruence classes are not realized by any form of discriminant $40$ since every form of
discriminant $40$ is equivalent to one of the two forms $x^2 - 10y^2$ or $2x^2 - 5y^2$, as is
easily checked by the methods from the previous section.
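The prime values of the two forms can be recovered by a brute-force search, sketched below in Python. The helper `prime_values` and its `box` parameter are assumptions made for this illustration: an indefinite form can take a given value at arbitrarily large $(x, y)$, so a bounded search can only confirm that values occur. It cannot by itself rule values out.

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def prime_values(a, c, limit=100, box=60):
    """Primes below `limit` among the values a*x^2 + c*y^2 with 0 <= x, y <= box.
    Only nonnegative x, y are needed since both variables enter squared."""
    found = set()
    for x in range(box + 1):
        for y in range(box + 1):
            n = a * x * x + c * y * y
            if 0 < n < limit and is_prime(n):
                found.add(n)
    return sorted(found)

q1 = prime_values(1, -10)    # x^2 - 10y^2
q2 = prime_values(2, -5)     # 2x^2 - 5y^2
print(q1, sorted({p % 40 for p in q1}))
print(q2, sorted({p % 40 for p in q2 if p not in (2, 5)}))
```

The residues printed for the first form are $1, 9, 31, 39$ and for the second $3, 13, 27, 37$, that is, $\pm 1, \pm 9$ and $\pm 3, \pm 13$ modulo $40$.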
This is in fact a general phenomenon, valid for all discriminants: If one looks at
primes that do not divide the discriminant, then the prime values of quadratic forms
of that discriminant are exactly the primes in one-half of the possible congruence
classes modulo the discriminant.
Let us mention in passing a famous theorem of Dirichlet, proved in 1837, which says
that every arithmetic progression $a, a + d, a + 2d, a + 3d, \dots$
contains infinitely many primes, provided that one rules out the obvious exceptions
where $a$ and $d$ have a common divisor, which would then be a common divisor of all
the numbers in the progression. For example, when we take $d = 40$, each of the $16$
congruence classes listed above gives an arithmetic progression containing infinitely
many primes, such as the progression $1, 41, 81, 121, 161, 201, \dots$ or the progression
$17, 57, 97, 137, 177, 217, \dots$. In fact Dirichlet proved more: If one looks at primes less
than some large number $N$ such as a million, then each of the possible congruence
classes contains approximately the same number of primes less than $N$.
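The near-equal distribution is easy to observe experimentally. The following Python sketch counts the primes below $N = 100000$ (a value chosen here only to keep the run short) in each congruence class modulo $40$, using a simple sieve. The function name `primes_below` is ours.

```python
from math import isqrt

def primes_below(n):
    """Sieve of Eratosthenes: all primes less than n."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, n, i)))
    return [i for i in range(n) if sieve[i]]

N = 100_000
counts = {}
for p in primes_below(N):
    if p not in (2, 5):            # the two primes dividing 40
        counts[p % 40] = counts.get(p % 40, 0) + 1

for r in sorted(counts):
    print(r, counts[r])
```

Exactly $16$ residues appear, and the $16$ counts come out close to one another.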
The analog of Fermat's Theorem for discriminant $40$ is the following pair of
statements:
(1) The numbers represented by one of the two quadratic forms $Q_1 = x^2 - 10y^2$ or
$Q_2 = 2x^2 - 5y^2$ of discriminant $40$ are exactly the numbers $n = \pm m^2 p_1 p_2 \cdots p_k$
where $m$ is an arbitrary integer and each $p_i$ is $2$, $5$, or a prime congruent to
$\pm 1, \pm 3, \pm 9$, or $\pm 13$ modulo $40$.
(2) If the number of factors $p_i$ in $n = \pm m^2 p_1 p_2 \cdots p_k$ that equal $2$, $5$, or $\pm 3$ or
$\pm 13$ modulo $40$ is even, then $n$ is represented by $Q_1$, and if this number is odd then
$n$ is represented by $Q_2$. In particular, the primes represented by $Q_1$ are the primes
congruent to $\pm 1$ or $\pm 9$ modulo $40$ and the primes represented by $Q_2$ are $2$, $5$,
and primes congruent to $\pm 3$ or $\pm 13$ modulo $40$.
Another case which is similar to the preceding one is discriminant $12$. Here there
are two forms up to equivalence, $x^2 - 3y^2$ and $3x^2 - y^2$, which is equivalent to
For the form $-x^2 + 3y^2$ we get the negatives of the numbers represented by $x^2 - 3y^2$.
For discriminant $12$ we have the following answer to the representation problem:
Apart from the primes $2$ and $7$ that divide the discriminant $-56$, all other primes
belong to the following 24 congruence classes modulo 56 , corresponding to odd
numbers less than 56 not divisible by 7 :
1 3 5 9 11 13 15 17 19 23 25 27 29 31 33 37 39 41 43 45 47 51 53 55
The six congruence classes whose prime elements are represented by Q1 or Q2 are
indicated by underlines, and the six congruence classes whose prime elements are
Proposition. Let two numbers $n$ and $\Delta$ be given. Then the following two statements
are equivalent: (1) There exists a form of discriminant $\Delta$ that represents $n$ primitively.
(2) $\Delta$ is congruent to a square modulo $4n$.
$nx^2 + hxy + ky^2$, which has these three labels on the $1/0$, $0/1$ edge. The discriminant
of this form has the desired value $\Delta = h^2 - 4nk$.
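The direction from (2) to (1) is constructive, and a small Python sketch makes this concrete. The function name `form_representing` is ours, and $n$ is assumed to be positive. The search looks for $h$ with $h^2 \equiv \Delta$ modulo $4n$ and returns the coefficients of $nx^2 + hxy + ky^2$, which takes the value $n$ at $(x, y) = (1, 0)$.

```python
def form_representing(n, disc):
    """Return (n, h, k) with h*h - 4*n*k == disc, or None if disc is not
    congruent to a square modulo 4n. Assumes n > 0."""
    assert n > 0
    # h and h + 2n have the same square modulo 4n, so 0 <= h <= 2n suffices.
    for h in range(2 * n + 1):
        if (h * h - disc) % (4 * n) == 0:
            return (n, h, (h * h - disc) // (4 * n))
    return None

print(form_representing(5, -4))    # (5, 4, 1): 5x^2 + 4xy + y^2
print(form_representing(3, -4))    # None
```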
Let us see what this proposition implies for small values of $n$. For $n = 1$ it says
that there is a form of discriminant $\Delta$ representing $1$ if and only if $\Delta$ is a square
modulo $4$. The squares modulo $4$ are $0$ and $1$, and we already know that discriminants
of forms are always congruent to $0$ or $1$ modulo $4$. So we conclude that for every
possible value of the discriminant there exists a form that represents $1$. This isn't
really new information, however, since the principal form $x^2 + dy^2$ or $x^2 + xy + dy^2$
represents $1$ and there is a principal form for each discriminant.
In the next case $n = 2$ we will get some new information. The possible values of
the discriminant modulo $8$ are $0, 1, 4, 5$, and the squares modulo $8$ are $0, 1, 4$ since
$0^2 = 0$, $(\pm 1)^2 = 1$, $(\pm 2)^2 = 4$, $(\pm 3)^2 \equiv 1$, and $(\pm 4)^2 \equiv 0$. Thus $2$ is not represented
by any form of discriminant congruent to $5$ modulo $8$, but for all other values of the
discriminant there is a form representing $2$. Explicit forms are:
$\Delta = 8k$:      $2x^2 - ky^2$
$\Delta = 8k + 1$:  $2x^2 + xy - ky^2$
$\Delta = 8k + 4$:  $2x^2 + 2xy - ky^2$
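Both claims in this case are quick to confirm. The Python sketch below lists the squares modulo $8$ and checks the discriminants of the three explicit forms for a few values of $k$. The helper `disc` is ours.

```python
squares_mod_8 = sorted({h * h % 8 for h in range(8)})
print(squares_mod_8)    # [0, 1, 4]

def disc(a, h, c):
    """Discriminant h^2 - 4ac of the form a x^2 + h x y + c y^2."""
    return h * h - 4 * a * c

for k in range(-3, 4):
    assert disc(2, 0, -k) == 8 * k          # 2x^2 - k y^2
    assert disc(2, 1, -k) == 8 * k + 1      # 2x^2 + xy - k y^2
    assert disc(2, 2, -k) == 8 * k + 4      # 2x^2 + 2xy - k y^2
```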
The second statement is easier to prove so we do this first. Suppose $m$ and $n$ are
represented as $m = a^2 + b^2$ and $n = c^2 + d^2$. Using complex numbers we can then
factor $m$ and $n$ as $m = (a + bi)(a - bi)$ and $n = (c + di)(c - di)$. This gives a
factorization of $mn$ as a product of four factors, and by rearranging the factors we
obtain
$$mn = [(a + bi)(c + di)][(a - bi)(c - di)] = [(ac - bd) + (ad + bc)i][(ac - bd) - (ad + bc)i] = (ac - bd)^2 + (ad + bc)^2$$
which shows that the product of two numbers that are sums of two squares is again
a sum of two squares. This identity can be checked directly without using complex
numbers, just by multiplying both sides out, but the advantage of using complex
numbers is that they show where the identity comes from.
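Here is a quick numerical check of this product rule in Python. The function name `compose` is ours. The two output numbers are the real and imaginary parts of $(a + bi)(c + di)$.

```python
def compose(a, b, c, d):
    """Write (a^2 + b^2)(c^2 + d^2) as u^2 + v^2, where
    u + vi = (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    return (a * c - b * d, a * d + b * c)

u, v = compose(1, 2, 2, 3)      # 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2
print(u, v, u * u + v * v)      # -4 7 65
```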
It remains to prove the nontrivial part of the earlier statement (1), that every prime
$p = 4k + 1$ is representable as the sum of two squares. Such a representation has to
be primitive since $p$ is prime. An equivalent statement is then that $-1$ is a square
modulo $p$, and this is what we will show by finding an explicit but rather large number
$h$ such that $h^2 \equiv -1$ modulo $p$.
Let us first illustrate how the proof will go by doing a specific example, the case
$p = 13$, which is of the form $4k + 1$. Each of the numbers from $1$ to $p - 1 = 12$ has
a multiplicative inverse modulo $13$:
$$1 \cdot 1 \equiv 1, \quad 2 \cdot 7 \equiv 1, \quad 3 \cdot 9 \equiv 1, \quad 4 \cdot 10 \equiv 1, \quad 5 \cdot 8 \equiv 1, \quad 6 \cdot 11 \equiv 1, \quad 12 \cdot 12 \equiv 1$$
The last congruence could have been written $(-1)(-1) \equiv 1$. The only cases when
a number equals its own inverse modulo $13$ are $1$ and $12$. Therefore if we consider
the product
$$12! = (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)$$
all the terms that are not equal to their inverse will cancel in pairs, leaving only the last
term $12$. Thus we have the congruence $12! \equiv 12$ modulo $13$, which we can rewrite as
$12! \equiv -1$. Now notice that modulo $13$ we have $7 \equiv -6$, $8 \equiv -5$, $9 \equiv -4$, etc., so we
have
$$-1 \equiv (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)$$
$$\equiv (1)(2)(3)(4)(5)(6)(-6)(-5)(-4)(-3)(-2)(-1)$$
$$= (6!)^2$$
This shows that $-1$ is a square modulo $13$, namely $(6!)^2$. We will generalize this by
showing that for every prime $p = 4k + 1$ the congruence $-1 \equiv \big((2k)!\big)^2$ modulo $p$
holds.
The first fact we need about congruences modulo a prime $p$ is that each of the
numbers $a = 1, 2, \dots, p - 1$ has a multiplicative inverse modulo $p$. To see why
this is true, notice that each such $a$ has no common factors with $p$, so we know from
Chapter 1 that the equation $ax + py = 1$ has an integer solution $(x, y)$. This equation
can be rewritten as $ax \equiv 1$ modulo $p$, which says that $x$ is an inverse for $a$ modulo
$p$. Note that any two choices for $x$ here are congruent modulo $p$ since if $ax \equiv 1$ and
$ax' \equiv 1$ then multiplying both sides of $ax' \equiv 1$ by $x$ gives $xax' \equiv x$, and $xa \equiv 1$
so we conclude that $x' \equiv x$.
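Solving $ax + py = 1$ is exactly what the extended Euclidean algorithm does, so inverses modulo $p$ can be computed as in the following Python sketch. The function names `extended_gcd` and `inverse_mod` are ours.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, p):
    """An x in the range 0..p-1 with a*x congruent to 1 modulo p."""
    g, x, _ = extended_gcd(a, p)
    if g != 1:
        raise ValueError("a and p have a common factor")
    return x % p

p = 13
inverses = {a: inverse_mod(a, p) for a in range(1, p)}
print(inverses)
print([a for a in inverses if inverses[a] == a])    # [1, 12]
```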
Which numbers equal their own inverse modulo $p$? If $a \cdot a \equiv 1$, then we can
rewrite this as $a^2 - 1 \equiv 0$, or in other words $(a + 1)(a - 1) \equiv 0$. This is certainly
a valid congruence if $a \equiv \pm 1$, so suppose that $a \not\equiv \pm 1$. The factor $a + 1$ is then
not congruent to $0$ modulo $p$ so it has a multiplicative inverse modulo $p$, and if we
multiply the congruence $(a + 1)(a - 1) \equiv 0$ by this inverse, we get $a - 1 \equiv 0$ so
$a \equiv 1$, contradicting the assumption that $a \not\equiv \pm 1$. This argument shows that the only
numbers among $1, 2, \dots, p - 1$ that are congruent to their inverses modulo $p$ are $1$
and $p - 1$.
Now if we consider the product $(p - 1)! = (1)(2) \cdots (p - 1)$ modulo $p$, then
each factor other than $1$ and $p - 1$ can be paired up with its multiplicative inverse
and these two terms multiply together to give $1$ modulo $p$, so the whole product
simplifies to just $(1)(p - 1)$. Thus we have a fact known as Wilson's theorem:
$(p - 1)! \equiv -1$ modulo $p$ whenever $p$ is prime.
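Wilson's theorem is simple to test for small primes. This Python sketch only checks the direction stated here, that $(p - 1)!$ leaves remainder $p - 1$ when $p$ is prime. The helper `is_prime` is ours.

```python
from math import factorial, isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

small_primes = [p for p in range(2, 200) if is_prime(p)]
# (p - 1)! is congruent to -1, i.e. to p - 1, modulo each prime p.
print(all(factorial(p - 1) % p == p - 1 for p in small_primes))    # True
```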
Now let us assume that $p$ is a prime of the form $p = 4k + 1$. In the product $(p - 1)!$
there are $p - 1 = 4k$ terms. The first $2k$ of these have product $(2k)!$ and the last $2k$, in
reverse order, are $p - 1, p - 2, \dots, p - 2k$. Modulo $p$ the latter are equivalent to
$-1, -2, \dots, -2k$, so we have
$$(p - 1)! \equiv (1)(2) \cdots (2k)(-2k) \cdots (-2)(-1)$$
The last $2k$ of these factors are the negatives of the first $2k$ factors, and $2k$ is even, so
the signs on all the negative terms cancel out and we see that $(p - 1)!$ is congruent to
$(2k)!(2k)!$ modulo $p$. Combining this with Wilson's theorem we get the desired result
that $-1$ is a square modulo $p$, namely $-1 \equiv \big((2k)!\big)^2$ modulo $p$. This finishes the
proof of Fermat's theorem answering the question of which numbers are representable
as sums of two squares.
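The explicit square root of $-1$ produced by this argument can be computed directly. In this Python sketch the function name `sqrt_of_minus_one` is ours, and the input is assumed to be a prime of the form $4k + 1$.

```python
from math import factorial

def sqrt_of_minus_one(p):
    """For a prime p = 4k + 1, return h = (2k)! mod p, which satisfies
    h*h congruent to -1 modulo p."""
    assert p % 4 == 1
    return factorial((p - 1) // 2) % p

for p in (5, 13, 17, 29, 37, 41):
    h = sqrt_of_minus_one(p)
    print(p, h, (h * h + 1) % p)    # the last column is always 0
```

For $p = 13$ this gives $h = 6! \bmod 13 = 5$, and indeed $5^2 = 25 = 2 \cdot 13 - 1$.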
Quadratic Reciprocity
We have seen that the condition for a prime $p$ to be represented by some form of
discriminant $\Delta$ is that $\Delta$ is a square modulo $4p$. (For primes there is no need to add
the condition that the representation is primitive since this is automatic for numbers
with no square factors, in particular for primes.) What we need is a way to convert
this criterion from a condition on $\Delta$ modulo $4p$ to a condition on $p$ modulo $\Delta$. The
main tool to make this conversion is something called Quadratic Reciprocity. Here is
what it says:
Quadratic Reciprocity. Let p and q be two distinct odd primes. If p and q are not
both congruent to 3 modulo 4 , then p is a square modulo q if and only if q is a
square modulo p . In the exceptional case that p and q are both congruent to 3
modulo 4 , then p is a square modulo q if and only if q is not a square modulo p .
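The statement can be tested by brute force on small primes. In this Python sketch the names `is_square_mod` and `reciprocity_holds` are ours. Squares modulo $p$ are found simply by listing $h^2$ for all residues $h$.

```python
def is_square_mod(a, p):
    """True if a is congruent to a square modulo p (brute force)."""
    return a % p in {h * h % p for h in range(p)}

def reciprocity_holds(p, q):
    """Check the reciprocity statement for distinct odd primes p and q."""
    both_3_mod_4 = p % 4 == 3 and q % 4 == 3
    same_answer = is_square_mod(p, q) == is_square_mod(q, p)
    # The two answers agree, except when p and q are both 3 modulo 4.
    return same_answer != both_3_mod_4

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(all(reciprocity_holds(p, q)
          for p in odd_primes for q in odd_primes if p != q))    # True
```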
and only if $p$ is a square modulo $13$. The squares modulo $13$ are easily listed: $0$, $1$,
$4$, $9$, $16 \equiv 3$, $25 \equiv 12$, and $36 \equiv 10$. (There is no need to go farther since $7 \equiv -6$,
$8 \equiv -5$, etc.) Thus if $p$ is represented by a form of discriminant $13$, $p$ is congruent
to one of $0, 1, 3, 4, 9, 10, 12$ modulo $13$. The only prime congruent to $0$ modulo $13$
is $13$ itself, so we can say that the primes represented by forms of discriminant $13$
must be either $13$ or primes congruent to one of $1, 3, 4, 9, 10, 12$ modulo $13$, or in
other words, $\pm 1, \pm 3, \pm 4$ modulo $13$. Each of these six congruence classes contains
infinitely many primes by Dirichlet's theorem on primes in arithmetic progressions.
The converse is also true, that every prime satisfying these conditions is actually
represented by some form of discriminant $13$. All the steps in the reasoning above
were reversible except for the step of going from $13$ being a square modulo $4p$ to
$13$ being a square modulo $p$. In fact this step is reversible too, for suppose $13$ is a
square modulo $p$, so there is a number $h$ such that $h^2 - 13$ is divisible by $p$. We can
assume $h$ is odd since if it is even we can replace $h$ by $h + p$, which is odd since $p$
is odd, and the new $h$ will still satisfy $h^2 \equiv 13$ modulo $p$. Since $h$ is odd, we have
$h^2 \equiv 1$ modulo $4$. Since $1 \equiv 13$ modulo $4$, this can be rephrased as saying $h^2 \equiv 13$
modulo $4$, which means that $h^2 - 13$ is divisible by $4$. We already knew that $h^2 - 13$ was
divisible by $p$, so $h^2 - 13$ is divisible by $4p$ since $4$ and $p$ have no common factors.
Thus we have shown that $13$ is a square modulo $4p$ if it is a square modulo $p$.
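The lifting step in this argument is short enough to write out. The Python sketch below uses a hypothetical helper `lift_to_4p`. It assumes $p$ is an odd prime and that the discriminant is congruent to $1$ modulo $4$, as $13$ is.

```python
def lift_to_4p(h, p, disc=13):
    """Given h*h congruent to disc modulo an odd prime p, with disc
    congruent to 1 modulo 4, return an odd number congruent to h modulo p.
    Its square is then congruent to disc modulo 4p."""
    assert p % 2 == 1 and disc % 4 == 1 and (h * h - disc) % p == 0
    if h % 2 == 0:
        h += p                     # h + p is odd because p is odd
    assert (h * h - disc) % (4 * p) == 0
    return h

print(lift_to_4p(4, 3))     # 7, since 7*7 - 13 = 36 is divisible by 12
print(lift_to_4p(8, 17))    # 25, since 25*25 - 13 = 612 is divisible by 68
```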
This completes the characterization of the primes that are representable by some
form of discriminant $13$, assuming that quadratic reciprocity is known. As it happens,
the class number for discriminant $13$ is one, as you can easily verify, so all forms of
discriminant $13$ are equivalent to the principal form $x^2 + xy - 3y^2$ and so we have an
exact criterion for which primes this form represents: $p = 13$ and primes congruent to
$\pm 1$, $\pm 3$, or $\pm 4$ modulo $13$. One could predict this was true by drawing a large enough
part of the topograph, but a full proof requires more than this since it is obviously
impossible to draw the whole topograph all at once and check all the infinitely many
primes that occur in it. (For one thing, there is the difficulty of knowing when a large
number is a prime.)
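A finite search can at least illustrate the criterion. The Python sketch below collects the values of $x^2 + xy - 3y^2$ on a square box of $(x, y)$ and compares the primes below $100$ among them with the predicted congruence classes. The box size `BOX` is an arbitrary assumption of this sketch, and as the text says, such a search is evidence rather than proof.

```python
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

BOX = 30    # heuristic search range for x and y
values = {x * x + x * y - 3 * y * y
          for x in range(-BOX, BOX + 1)
          for y in range(-BOX, BOX + 1)}

primes = [p for p in range(2, 100) if is_prime(p)]
represented = [p for p in primes if p in values]
predicted = [p for p in primes if p == 13 or p % 13 in (1, 3, 4, 9, 10, 12)]
print(represented)
print(represented == predicted)    # True
```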
The full answer to which numbers, not just primes, are represented by the form
$x^2 + xy - 3y^2$ is what you would now expect: The numbers represented are the