5 The Pell Equation: 5.1 Side and Diagonal Numbers


5 The Pell equation

5.1 Side and diagonal numbers


In ancient times, only rational numbers were thought of as numbers. Hence the discovery that (1) $\sqrt{2}$ is the length of a hypotenuse of a right triangle, and (2) $\sqrt{2}$ is irrational (which we proved in Section 3, but the Greeks also had another proof for) was quite perplexing. Hence the ancient Greeks studied the Diophantine equation
$$x^2 - 2y^2 = 1$$
to try to understand $\sqrt{2}$. They were able to produce a sequence of increasingly large solutions $(x_i, y_i)$. Note that we can rewrite such an equation as
$$\frac{x^2}{y^2} = 2 + \frac{1}{y^2} \implies \frac{x}{y} = \sqrt{2 + \frac{1}{y^2}} \to \sqrt{2}$$
as $y \to \infty$. Hence the solutions $(x_i, y_i)$ provide increasingly good rational approximations to $\sqrt{2}$ as $y_i$ gets large.
We will study the solutions in $\mathbb{Z}$ to the more general Pell equation
$$x^2 - ny^2 = 1$$
for any $n \in \mathbb{N}$. Note Pell's equation always has the trivial solutions $(\pm 1, 0)$. Further, the case where $n$ is a square is easy:
Exercise 5.1. If $n \in \mathbb{N}$ is a square, show the only solutions of $x^2 - ny^2 = 1$ are $(\pm 1, 0)$. (Cf. Exercises 5.1.3, 5.1.4.)
Hence, from now on, we will assume $n$ is not a square. Then we know $\sqrt{n}$ is irrational from Section 2.5.
To understand this equation thoroughly over $\mathbb{Z}$, we need to work with another number system
$$\mathbb{Z}[\sqrt{n}] = \left\{ a + b\sqrt{n} : a, b \in \mathbb{Z} \right\}.$$
This idea of working with a larger number system than $\mathbb{Z}$ to study problems about $\mathbb{Z}$ is the basis of algebraic number theory. (Note we could also think of each $\mathbb{Z}/m\mathbb{Z}$ as a smaller number system than $\mathbb{Z}$, which we used in Chapter 3 to study problems over $\mathbb{Z}$ as well. However, because $\mathbb{Z}/m\mathbb{Z}$ is smaller, it can typically only be used to limit the kinds of solutions we might have to an equation over $\mathbb{Z}$, and not actually prove the existence of solutions.) Here we take for granted the existence of $\sqrt{n}$, which, as pointed out above, was not always known (or thought) to be a number. The reason we want to work specifically with the ring$^1$ $\mathbb{Z}[\sqrt{n}]$ is of course so we can factor Pell's equation:
$$x^2 - ny^2 = (x + y\sqrt{n})(x - y\sqrt{n}) = 1.$$
If $x = a + b\sqrt{n}$, we say $a$ is the rational part and $b$ is the irrational part of $x$. (This is analogous to real and imaginary parts of complex numbers.) Note that two numbers $x, y \in \mathbb{Z}[\sqrt{n}]$ are equal if and only if their rational and irrational parts are equal. The ($\Leftarrow$) direction is obvious. To prove the ($\Rightarrow$) direction, write $x = a_1 + b_1\sqrt{n}$ and $y = a_2 + b_2\sqrt{n}$. Then
$$x = y \implies a_1 - a_2 = (b_2 - b_1)\sqrt{n}.$$
If $b_2 - b_1 \neq 0$, then $\sqrt{n} = \frac{a_1 - a_2}{b_2 - b_1} \in \mathbb{Q}$, which is a contradiction. Hence $b_2 = b_1$, and therefore $a_1 = a_2$ also.

$^1$ A ring is, roughly, a number system in which you can add, subtract and multiply, but not necessarily divide. We will give the formal definition in Chapter 10. For now, you don't need to know anything about rings; we will just get in the habit of calling some number systems rings to stress that they are somehow similar to $\mathbb{Z}$.

You might wonder about the title of this section. I won't cover it. See the text.
5.2 The equation $x^2 - 2y^2 = 1$
It is simple to determine all rational solutions of $x^2 - 2y^2 = 1$ using the rational slope (Diophantus' chord) method. However, determining the integer solutions is a different matter. First we observe by trial-and-error that the smallest non-trivial solution is $(3, 2)$.
Exercise 5.2. Check the following composition rule holds:
$$(x_1^2 - 2y_1^2)(x_2^2 - 2y_2^2) = x_3^2 - 2y_3^2$$
where
$$x_3 = x_1 x_2 + 2 y_1 y_2, \qquad y_3 = x_1 y_2 + y_1 x_2.$$
Hence if $(x_1, y_1)$ and $(x_2, y_2)$ are two solutions to $x^2 - 2y^2 = 1$, so is their composition $(x_3, y_3)$, defined as above. We denote
$$(x_3, y_3) = (x_1, y_1) \cdot (x_2, y_2).$$
We will see in the next section that the solutions to $x^2 - 2y^2 = 1$ form a group under this operation. Further, since the definition of composition is symmetric in $(x_1, y_1)$ and $(x_2, y_2)$, this will be an abelian group.
Example. $(x_1, y_1) \cdot (1, 0) = (x_1, y_1)$.
Example. $(3, 2) \cdot (3, 2) = (9 + 8, 12) = (17, 12)$.
Example. $(3, 2)^3 = (3, 2) \cdot (17, 12) = (99, 70)$.
Through composition (the powers of $(3, 2)$) we can see that we can get infinitely many solutions, each getting larger. The first three powers give the sequence of approximations
$$\frac{3}{2} = 1.5, \qquad \frac{17}{12} = 1.41\overline{6}, \qquad \frac{99}{70} = 1.4142857\ldots, \qquad \sqrt{2} = 1.4142135623\ldots$$
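The composition rule and the three approximations above can be reproduced mechanically. The following is a minimal Python sketch; the function name `compose` and the list `powers` are illustrative names chosen here, not notation from the notes.

```python
from fractions import Fraction

def compose(p, q, n=2):
    """Compose two pairs by the rule of Exercise 5.2 (with 2 replaced by n)."""
    x1, y1 = p
    x2, y2 = q
    return (x1 * x2 + n * y1 * y2, x1 * y2 + y1 * x2)

# Collect the first three powers of (3, 2), starting from the identity (1, 0).
powers = []
current = (1, 0)
for _ in range(3):
    current = compose(current, (3, 2))
    powers.append(current)

for x, y in powers:
    # Each power is again a solution of x^2 - 2y^2 = 1.
    assert x * x - 2 * y * y == 1
    # Print the pair together with the rational approximation x/y as a decimal.
    print((x, y), float(Fraction(x, y)))
```

Working with integer pairs keeps every step exact; floating point enters only when a ratio is printed.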
Exercise 5.3. Compute $(3, 2)^4$. Use this to obtain a decimal approximation for $\sqrt{2}$. To how many digits is it accurate? (Use a calculator/computer.)
5.3 The group of solutions
This section shows that the solutions of $x^2 - 2y^2 = 1$ form a group under the composition defined above, and this group is generated by $(3, 2)$ and $(-1, 0)$. However, it is subsumed in the next section, which treats $x^2 - ny^2 = 1$ using norms, so I will not treat the case $n = 2$ separately.
5.4 The general Pell equation and $\mathbb{Z}[\sqrt{n}]$
To put the ideas we will use in context, let us recall some things about complex numbers. Let $z \in \mathbb{C}$. Then we can write $z = x + yi$ where $x, y \in \mathbb{R}$. The complex conjugate of $z$ is $\bar{z} = x - yi$, and
$$z\bar{z} = (x + yi)(x - yi) = x^2 + y^2.$$
Drawing $z$ as a vector in the complex plane, we see that $z\bar{z}$ is the square of the length of this vector, i.e., $z\bar{z} = |z|^2$. Define the norm of $z$ to be
$$N(z) = z\bar{z}.$$
Since complex conjugation respects multiplication,
$$N(z_1 z_2) = z_1 z_2 \overline{z_1 z_2} = z_1 \bar{z}_1 z_2 \bar{z}_2 = N(z_1) N(z_2),$$
i.e., the norm is multiplicative.
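Similar to the complex case, define the conjugate of $\alpha = x + y\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ to be $\bar{\alpha} = x - y\sqrt{n}$, and the norm of $\alpha$ to be
$$N(\alpha) = \alpha\bar{\alpha} = (x + y\sqrt{n})(x - y\sqrt{n}) = x^2 - ny^2.$$
Note $N(\alpha) = N(\bar{\alpha})$.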
The following lemma is clear.
Lemma 5.1. Solutions of $x^2 - ny^2 = 1$ are in 1-1 correspondence with the elements in $\mathbb{Z}[\sqrt{n}]$ of norm 1. The correspondence is given by $(x, y) \leftrightarrow x + y\sqrt{n}$.
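Now we want to know a basic property of norms.
Lemma 5.2. For $\alpha, \beta \in \mathbb{Z}[\sqrt{n}]$, we have $N(\alpha\beta) = N(\alpha)N(\beta)$, i.e., $N$ is multiplicative.
Proof. Write $\alpha = x_1 + y_1\sqrt{n}$, $\beta = x_2 + y_2\sqrt{n}$. Note
$$\overline{\alpha\beta} = x_1 x_2 + n y_1 y_2 - (x_1 y_2 + y_1 x_2)\sqrt{n} = (x_1 - y_1\sqrt{n})(x_2 - y_2\sqrt{n}) = \bar{\alpha}\bar{\beta}.$$
Hence
$$N(\alpha\beta) = \alpha\beta\,\overline{\alpha\beta} = \alpha\bar{\alpha}\,\beta\bar{\beta} = N(\alpha)N(\beta).$$
Hence if $\alpha$ and $\beta$ have norm 1, so does $\alpha\beta$. In light of Lemma 5.1, this says if we have two solutions to $x^2 - ny^2 = 1$, we can compose them to construct a third. Precisely, we can say something stronger.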
Corollary 5.3. (Brahmagupta composition rule) If $(x_1, y_1)$ and $(x_2, y_2)$ are solutions to
$$x_1^2 - ny_1^2 = a, \qquad x_2^2 - ny_2^2 = b,$$
then the composition
$$(x_3, y_3) = (x_1, y_1) \cdot (x_2, y_2) := (x_1 x_2 + n y_1 y_2,\; x_1 y_2 + y_1 x_2)$$
is a solution of
$$x_3^2 - ny_3^2 = ab.$$
Proof. We simply translate the above into a statement about norms. The hypothesis says $N(x_1 + y_1\sqrt{n}) = a$ and $N(x_2 + y_2\sqrt{n}) = b$. Now observe that
$$(x_1 + y_1\sqrt{n})(x_2 + y_2\sqrt{n}) = x_1 x_2 + n y_1 y_2 + (x_1 y_2 + y_1 x_2)\sqrt{n} = x_3 + y_3\sqrt{n}.$$
Hence by the multiplicative property of the norm,
$$x_3^2 - ny_3^2 = N(x_3 + y_3\sqrt{n}) = N(x_1 + y_1\sqrt{n})\,N(x_2 + y_2\sqrt{n}) = ab.$$
Note this is much nicer than the straightforward proof given on p. 82.
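As a sanity check on the composition rule, here is a small Python sketch that tests the identity $x_3^2 - ny_3^2 = ab$ on a grid of integer pairs. The names `brahmagupta` and `norm`, and the choice of test ranges, are assumptions made for illustration only; a finite check like this is of course not a proof.

```python
def brahmagupta(p, q, n):
    """Brahmagupta composition of (x1, y1) and (x2, y2) for the form x^2 - n*y^2."""
    x1, y1 = p
    x2, y2 = q
    return (x1 * x2 + n * y1 * y2, x1 * y2 + y1 * x2)

def norm(p, n):
    """Norm of x + y*sqrt(n), i.e. x^2 - n*y^2."""
    x, y = p
    return x * x - n * y * y

def check_identity(n, bound=6):
    """Check N(p.q) == N(p) * N(q) for all integer pairs with entries in [-bound, bound]."""
    values = range(-bound, bound + 1)
    pairs = [(x, y) for x in values for y in values]
    return all(
        norm(brahmagupta(p, q, n), n) == norm(p, n) * norm(q, n)
        for p in pairs
        for q in pairs
    )

# One worked instance: for n = 3, both (4, 1) and (5, 2) have norm 13.
example = brahmagupta((4, 1), (5, 2), 3)
print(example, norm(example, 3))
```

For the worked instance, the composition is $(26, 13)$, whose norm is $169 = 13 \cdot 13$.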
Aside: this result says that if $a$ and $b$ are of the form $x^2 - ny^2$, so is $ab$. Hence if we want to ask the question of which integers are of the form $x^2 - ny^2$, we should first determine which primes are of the form $x^2 - ny^2$. We will not pursue this now; however, we will return to this idea when considering which numbers are sums of squares, or more generally, of the form $x^2 + ny^2$. (The $+$ case turns out to be simpler, but still not easy.)
Since $\mathbb{Z}[\sqrt{n}] \subseteq \mathbb{R}$, there is a natural order on $\mathbb{Z}[\sqrt{n}]$. By Lemma 5.1, this gives us a way to order the solutions to Pell's equation. As the case of $n = 2$ suggests, we want to first look for a smallest non-trivial solution and try to obtain all other solutions from that.
If $(x, y)$ is a solution to $x^2 - ny^2 = 1$, so are $(\pm x, \pm y)$. So when we say we want a smallest solution, we should make a restriction like $x, y > 0$. Thinking in terms of elements of norm 1, note that conjugates and negatives of $\alpha = x + y\sqrt{n}$ give $\pm x \pm y\sqrt{n}$. Hence we want to look for the smallest element of norm 1 such that $x, y > 0$.
Definition 5.4. The fundamental +unit$^2$ of $\mathbb{Z}[\sqrt{n}]$ is the smallest $\varepsilon = x + y\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ such that $x, y > 0$ and $N(\varepsilon) = 1$.
Lemma 5.5. The fundamental +unit of $\mathbb{Z}[\sqrt{n}]$ is well defined and always exists.
Proof. We will show in the next section that there is always some $\alpha = x + y\sqrt{n} \neq \pm 1$ in $\mathbb{Z}[\sqrt{n}]$ such that $N(\alpha) = 1$. By possibly taking the negative and/or conjugate of $\alpha$, we may assume $x, y > 0$. So at least one candidate exists. Since the set of all $a + b\sqrt{n}$ with $a, b \in \mathbb{N}$ is discrete in $\mathbb{R}$, there must be a minimal such $\alpha$ (i.e., such $\alpha$ cannot get arbitrarily close to 1).

$^2$ This is not standard terminology. One normally defines the fundamental unit, which can have norm $\pm 1$. If it has norm $+1$, this coincides with our definition; if it has norm $-1$, its square is what we are calling the fundamental +unit.

Example. The fundamental +unit of $\mathbb{Z}[\sqrt{2}]$ is $3 + 2\sqrt{2}$.
Exercise 5.4. An alternative definition of the fundamental +unit is the smallest $\varepsilon > 1$ such that $N(\varepsilon) = 1$. Prove that this is equivalent to the above definition as follows. Suppose $\varepsilon = x + y\sqrt{n} > 1$ and $N(\varepsilon) = 1$. Show (i) $0 < \bar{\varepsilon} < 1$. Then deduce (ii) $x, y > 0$.
Theorem 5.6. Let $U^+ = \{\alpha \in \mathbb{Z}[\sqrt{n}] : N(\alpha) = 1\}$. Then $U^+$ is an infinite abelian group under multiplication. Furthermore, it is generated by the fundamental +unit $\varepsilon$ of $\mathbb{Z}[\sqrt{n}]$ and $-1$.
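Proof. Clearly the identity $1 \in U^+$ and multiplication on $U^+$ is associative. Note that for any $\alpha \in U^+$, $N(\alpha) = \alpha\bar{\alpha} = 1$ implies that $\bar{\alpha} = \alpha^{-1}$. Since $N(\bar{\alpha}) = 1$ also, we have $\alpha^{-1} \in U^+$. Also by the multiplicative property of the norm, if $\alpha, \beta \in U^+$ then $N(\alpha\beta) = N(\alpha)N(\beta) = 1$, so $\alpha\beta \in U^+$. This shows $U^+$ is a group, and it is clearly abelian because multiplication in $\mathbb{Z}[\sqrt{n}]$ is commutative.
Now let $\varepsilon$ be the fundamental +unit of $\mathbb{Z}[\sqrt{n}]$. Suppose there exists $\alpha \in U^+$ such that $\alpha \neq \pm\varepsilon^m$ for any $m \in \mathbb{Z}$. By taking the negative and/or conjugate if need be, we may assume $\alpha > 1$. Since $\varepsilon$ is minimal and $\varepsilon^m \to \infty$ as $m \to \infty$, there must be some $m > 0$ such that $\varepsilon^m < \alpha < \varepsilon^{m+1}$. But then $1 < \alpha\varepsilon^{-m} < \varepsilon$ and $N(\alpha\varepsilon^{-m}) = 1$, contradicting the minimality of $\varepsilon$. Hence each $\alpha \in U^+$ is ($\pm$) a power of $\varepsilon$.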
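The descent in this proof (keep dividing by the fundamental +unit until you reach 1) can be run on actual solutions. The sketch below is a hedged illustration: the function name `exponent_of` is hypothetical, it takes the fundamental solution as an input rather than computing it, and it only accepts solutions with $x > 0$, $y \geq 0$.

```python
def exponent_of(solution, fundamental, n):
    """Return m >= 0 with x + y*sqrt(n) == (x0 + y0*sqrt(n))**m.

    Assumes `solution` satisfies x^2 - n*y^2 = 1 with x > 0, y >= 0, and that
    `fundamental` really is the fundamental solution (x0, y0) for this n.
    """
    x, y = solution
    x0, y0 = fundamental
    if x * x - n * y * y != 1 or x <= 0 or y < 0:
        raise ValueError("expected a solution of x^2 - n*y^2 = 1 with x > 0, y >= 0")
    m = 0
    while y > 0:
        # Multiply by the conjugate x0 - y0*sqrt(n), which is the inverse of the +unit.
        x, y = x * x0 - n * y * y0, y * x0 - x * y0
        m += 1
    if (x, y) != (1, 0):
        raise ValueError("descent did not end at 1; is `fundamental` correct?")
    return m

print(exponent_of((99, 70), (3, 2), 2))   # (99, 70) = (3, 2)^3 in the n = 2 example
```

Each pass through the loop replaces $\alpha$ by $\alpha\varepsilon^{-1}$, so the loop counter is exactly the exponent $m$ from the proof.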
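Remark. All $\alpha \in U^+$ are called units of $\mathbb{Z}[\sqrt{n}]$, because like $\pm 1$, they are invertible in $\mathbb{Z}[\sqrt{n}]$. The actual definition of the units of $\mathbb{Z}[\sqrt{n}]$ is the set of invertible elements, which is easy to see is precisely the set of elements of norm $\pm 1$.
Hence the solutions of Pell's equation are given by $\pm\varepsilon^m$ where $m \in \mathbb{Z}$. Since we know $\bar{\varepsilon} = \varepsilon^{-1}$, then $\varepsilon^{-m} = \overline{\varepsilon^m}$. Thus $\varepsilon^m$ and $\varepsilon^{-m}$ give essentially the same solutions.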
Corollary 5.7. Suppose $\varepsilon = x_0 + y_0\sqrt{n}$ is the fundamental +unit of $\mathbb{Z}[\sqrt{n}]$. Then all integer solutions to Pell's equation $x^2 - ny^2 = 1$ are of the form $(\pm x, \pm y)$ where $x + y\sqrt{n} = (x_0 + y_0\sqrt{n})^m$ and $m \geq 0$.
Equivalently, up to sign, all solutions to Pell's equation are given by non-negative powers (in the sense of Brahmagupta composition) of the fundamental solution $(x_0, y_0)$.
Example. Up to sign, all non-trivial solutions of $x^2 - 2y^2 = 1$ are given by $x + y\sqrt{2} = (3 + 2\sqrt{2})^m$ for $m > 0$, i.e., $x$ and $y$ are the rational and irrational parts of $(3 + 2\sqrt{2})^m$.
The book says little about how to find fundamental solutions (called smallest positive solutions in the text). By rewriting Pell's equation as
$$x^2 = ny^2 + 1$$
it becomes clear that we can find the fundamental solution (or fundamental +unit) by finding the smallest $y > 0$ such that $ny^2 + 1$ is a square. This will give the smallest $x > 0$ which solves $x^2 - ny^2 = 1$, i.e., $x$ and $y$ are simultaneously minimal for this solution, making $x + y\sqrt{n}$ minimal (with $x, y > 0$) among $U^+$.
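The search just described is easy to write down. Here is a minimal Python sketch, assuming a hypothetical function name `fundamental_solution` and an arbitrary cutoff `y_max` (neither is from the notes); it tries $y = 1, 2, 3, \ldots$ and tests whether $ny^2 + 1$ is a perfect square using exact integer arithmetic.

```python
from math import isqrt

def fundamental_solution(n, y_max=10**6):
    """Smallest (x, y) with x, y > 0 and x^2 - n*y^2 = 1, by trying y = 1, 2, ...

    Returns None if no solution with y <= y_max is found (the cutoff is arbitrary).
    """
    if isqrt(n) ** 2 == n:
        raise ValueError("n must not be a perfect square")
    for y in range(1, y_max + 1):
        t = n * y * y + 1
        x = isqrt(t)              # integer square root, exact for big integers
        if x * x == t:            # n*y^2 + 1 is a perfect square
            return (x, y)
    return None

print(fundamental_solution(2))    # the n = 2 case treated above
print(fundamental_solution(7))
```

The running time grows with the size of $y_0$, so this direct search is only sensible when the fundamental solution is small.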
Example. Since $3 \cdot 1^2 + 1$ is a square, the smallest positive (fundamental) solution to $x^2 - 3y^2 = 1$ is $(2, 1)$. Hence the fundamental +unit of $\mathbb{Z}[\sqrt{3}]$ is $2 + \sqrt{3}$. Up to sign, all solutions are powers of $(2, 1)$, e.g., $(2, 1)^2 = (7, 4)$ and $(2, 1)^3 = (26, 15)$. This provides the successive approximations $\frac{2}{1}, \frac{7}{4}, \frac{26}{15}$ for $\sqrt{3}$.
Exercise 5.5. Find the fundamental solution $(x_0, y_0)$ to $x^2 - 5y^2 = 1$. What is the fundamental +unit of $\mathbb{Z}[\sqrt{5}]$? Compute the solutions given by the square and the cube of $(x_0, y_0)$. What rational approximations to $\sqrt{5}$ (in decimal form) do they yield? To how many digits are they accurate? (Use a calculator.)
Exercise 5.6. Exercises 5.4.4, 5.4.5.
5.5 The pigeonhole argument
The simple-minded method for determining fundamental solutions above is only practical for small $n$. For instance, when $n = 61$, the fundamental solution is
$$(1766319049, 226153980)$$
(Bhaskara II, 12th century; Fermat). In general, one can, for instance, use the classical theory of continued fractions. We will not go into this here, but we will prove the existence of a non-trivial solution for all nonsquare $n$, which is due to Lagrange in 1768. However, we will give a proof due to Dirichlet (ca. 1840). It uses the
Pigeonhole principle. If $m > k$ pigeons go into $k$ boxes, at least one box must contain more than 1 pigeon (finite version). If infinitely many pigeons go into $k$ boxes, at least one box must contain infinitely many pigeons (infinite version).
Proposition 5.8. (Dirichlet's approximation theorem) For any nonsquare $n$ and integer $B > 1$, there exist $a, b \in \mathbb{Z}$ such that $0 < b < B$ and
$$|a - b\sqrt{n}| < \frac{1}{B}.$$
(This says that $\frac{a}{b}$ is close to $\sqrt{n}$.)
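Proof. Consider the $B - 1$ irrational numbers
$$\sqrt{n},\ 2\sqrt{n},\ \ldots,\ (B - 1)\sqrt{n}.$$
For each such $k\sqrt{n}$, let $a_k \in \mathbb{N}$ be such that
$$0 < a_k - k\sqrt{n} < 1.$$
Partition the interval $[0, 1]$ into $B$ subintervals of length $\frac{1}{B}$. Then, of the $B + 1$ numbers
$$0,\ a_1 - \sqrt{n},\ a_2 - 2\sqrt{n},\ \ldots,\ a_{B-1} - (B - 1)\sqrt{n},\ 1$$
in $[0, 1]$, two of them must be in the same subinterval of length $\frac{1}{B}$. Hence they are less than distance $\frac{1}{B}$ apart, i.e., their difference satisfies $|a - b\sqrt{n}| < \frac{1}{B}$. Further, their irrational parts must be distinct (the only two of these numbers with equal irrational parts are $0$ and $1$, which are at distance $1 > \frac{1}{B}$), so we have $-B < b < B$ with $b \neq 0$. If $b > 0$ we are done; if $b < 0$, simply multiply $a$ and $b$ by $-1$.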
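To see the proposition in action, the following Python sketch searches for a pair $(a, b)$ directly. It does not follow the pigeonhole proof; it is a plain exhaustive search over $0 < b < B$, and the function name `dirichlet_pair` is an illustrative assumption. To avoid floating-point error, the inequality $|a - b\sqrt{n}| < \frac{1}{B}$ is tested in the equivalent squared form $(aB - 1)^2 < n(bB)^2 < (aB + 1)^2$, valid when $aB \geq 1$.

```python
from math import isqrt

def dirichlet_pair(n, B):
    """Return the first (a, b) with 0 < b < B and |a - b*sqrt(n)| < 1/B.

    Exhaustive search; n is assumed to be a positive nonsquare integer, B > 1.
    """
    for b in range(1, B):
        floor_val = isqrt(n * b * b)             # floor(b * sqrt(n))
        for a in (floor_val, floor_val + 1):     # the only integers within 1 of b*sqrt(n)
            lower = (a * B - 1) ** 2
            middle = n * (b * B) ** 2
            upper = (a * B + 1) ** 2
            # Exact integer form of a*B - 1 < b*B*sqrt(n) < a*B + 1.
            if a * B >= 1 and lower < middle < upper:
                return (a, b)
    return None   # should not happen, by Proposition 5.8

print(dirichlet_pair(2, 10))
```

For $n = 2$ and $B = 10$ the search stops at $(7, 5)$, and indeed $|7 - 5\sqrt{2}| \approx 0.071 < 0.1$.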
Step 1. Fix $B_1 = B$. Then by the above, there exist integers $a_1, b_1$ with
$$|a_1 - b_1\sqrt{n}| < \frac{1}{B_1} < \frac{1}{b_1}.$$
Let $B_2 > B_1$ be such that $\frac{1}{B_2} < |a_1 - b_1\sqrt{n}|$. Applying Dirichlet's approximation again, we get a new pair $(a_2, b_2)$ of integers such that
$$|a_2 - b_2\sqrt{n}| < \frac{1}{B_2} < \frac{1}{b_2}.$$
Repeating this, we see there is an infinite sequence of integer pairs $(a, b)$ such that $|a - b\sqrt{n}|$ gets smaller and smaller, and
$$|a - b\sqrt{n}| < \frac{1}{b}$$
for all $(a, b)$. (This gives an infinite sequence of increasingly good approximations.)
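Step 2. Assume $(a, b)$ satisfy $|a - b\sqrt{n}| < \frac{1}{b}$. Note that
$$|a + b\sqrt{n}| \leq |a - b\sqrt{n}| + |2b\sqrt{n}| \leq 1 + 2b\sqrt{n} \leq 3b\sqrt{n}.$$
Then
$$|a^2 - nb^2| = |a + b\sqrt{n}|\,|a - b\sqrt{n}| \leq 3b\sqrt{n} \cdot \frac{1}{b} = 3\sqrt{n}.$$
Hence there are infinitely many $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ whose norm, in absolute value, is at most $3\sqrt{n}$.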
Step 3. By successive applications of the (infinite) pigeonhole principle, we have
(i) infinitely many $a - b\sqrt{n}$ with the same norm $N$, where $|N| \leq 3\sqrt{n}$ (the norm is always an integer);
(ii) infinitely many $a - b\sqrt{n}$ with norm $N$ and $a \equiv a_0 \bmod N$ for some $a_0$;
(iii) infinitely many $a - b\sqrt{n}$ with norm $N$, $a \equiv a_0 \bmod N$, $b \equiv b_0 \bmod N$ for some $b_0$.
In particular, we have two elements $a_1 - b_1\sqrt{n}$, $a_2 - b_2\sqrt{n}$ such that they both have norm $N$, $a_1 \equiv a_2 \bmod N$, $b_1 \equiv b_2 \bmod N$, and $a_1 - b_1\sqrt{n} \neq \pm(a_2 - b_2\sqrt{n})$. (It's possible that $N < 0$, and we define $\bmod\ N$ for negative $N$ to be the same as $\bmod\ |N|$. However, $N \neq 0$ because 0 is the only element of $\mathbb{Z}[\sqrt{n}]$ of norm 0.)
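Step 4. Consider
$$a + b\sqrt{n} = \frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} = \frac{(a_1 - b_1\sqrt{n})(a_2 + b_2\sqrt{n})}{a_2^2 - nb_2^2} = \frac{a_1 a_2 - n b_1 b_2}{N} + \frac{a_1 b_2 - b_1 a_2}{N}\sqrt{n}.$$
Since $a_1 - b_1\sqrt{n} \neq \pm(a_2 - b_2\sqrt{n})$, surely $a + b\sqrt{n} \neq \pm 1$. If we know $a, b \in \mathbb{Z}$, then since
$$N(a + b\sqrt{n}) = N(a_1 - b_1\sqrt{n})\,N\!\left((a_2 - b_2\sqrt{n})^{-1}\right) = N N^{-1} = 1,$$
we get that $a + b\sqrt{n}$ is an element of $\mathbb{Z}[\sqrt{n}]$ of norm 1 which is not $\pm 1$.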


To show that $a$ is an integer, observe that $N \mid a_1 a_2 - n b_1 b_2$ because
$$a_1 a_2 - n b_1 b_2 \equiv a_1 a_1 - n b_1 b_1 \equiv a_1^2 - n b_1^2 \equiv 0 \bmod N.$$
The first congruence holds because $a_1 \equiv a_2 \bmod N$ and $b_1 \equiv b_2 \bmod N$. Similarly, $b$ is an integer because
$$a_1 b_2 - b_1 a_2 \equiv a_1 b_1 - b_1 a_1 \equiv 0 \bmod N.$$
This proves
Theorem 5.9. If $n \in \mathbb{N}$ is nonsquare, then $x^2 - ny^2 = 1$ has a nontrivial solution in $\mathbb{Z}$, i.e., a solution besides $(\pm 1, 0)$.
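Steps 2 through 4 can be imitated on a computer for small $n$. The sketch below is only an illustration under stated assumptions, not the proof itself: instead of the infinite sequence from Step 1 it scans $b = 1, \ldots, b_{\max}$ (the bound `b_max` is arbitrary and hypothetical), keeps the pairs $(a, b)$ with $0 < |a^2 - nb^2| \leq 3\sqrt{n}$, sorts them into boxes by $(N, a \bmod |N|, b \bmod |N|)$, and divides two elements from the same box as in Step 4.

```python
from itertools import combinations
from math import isqrt

def pell_via_pigeonhole(n, b_max=200):
    """Find some nontrivial solution of x^2 - n*y^2 = 1 by imitating Steps 2-4.

    n is assumed nonsquare. Returns None if the finite scan up to b_max finds
    no two elements in the same box.
    """
    boxes = {}
    for b in range(1, b_max + 1):
        floor_val = isqrt(n * b * b)
        for a in (floor_val, floor_val + 1):      # integers closest to b*sqrt(n)
            N = a * a - n * b * b                  # norm of a - b*sqrt(n)
            if N == 0 or N * N > 9 * n:            # keep only 0 < |N| <= 3*sqrt(n)
                continue
            key = (N, a % abs(N), b % abs(N))      # Step 3: same norm, same residues
            boxes.setdefault(key, []).append((a, b))
    for (N, _, _), members in boxes.items():
        for (a1, b1), (a2, b2) in combinations(members, 2):
            # Step 4: the quotient has integer coordinates, so // is exact here.
            x = (a1 * a2 - n * b1 * b2) // N
            y = (a1 * b2 - b1 * a2) // N
            if y != 0:
                return (abs(x), abs(y))
    return None

print(pell_via_pigeonhole(2))
```

The solution returned need not be the fundamental one; the argument only guarantees that it is nontrivial.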
5.6 *Quadratic forms
Note: This section does not AT ALL follow what is in the text.
The ideas above can be put into a more general context. We say $Q(x, y)$ is a binary quadratic form if
$$Q(x, y) = ax^2 + bxy + cy^2.$$
The basic questions are, which numbers $k$ are of the form
$$k = Q(x, y),$$
and for such $k$, what are the solutions (or at least, how many are there)? We answered the question thoroughly for $Q(x, y) = x^2 - ny^2$ ($n > 0$) and $k = 1$: 1 is always of the form $x^2 - ny^2$, in two ways if $n$ is a square and in infinitely many ways otherwise, and we showed how to determine all solutions.
Assuming $n$ is not a square, if $k$ is of the form $x^2 - ny^2$, then there are infinitely many solutions to
$$x^2 - ny^2 = k,$$
and they are generated from a fundamental solution. The reason is that such solutions correspond to elements of $\mathbb{Z}[\sqrt{n}]$ of norm $k$. If $\alpha$ has norm $k$, then so does $\alpha\beta$ for any $\beta$ of norm 1, and we showed that there are infinitely many elements of norm 1 in Section 5.4. We will not deal with the question of which $k$ are of the form $x^2 - ny^2$ here, but it was treated in Gauss's Disquisitiones.
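For a concrete instance of this, take $n = 2$ and $k = 7$: the pair $(3, 1)$ has $3^2 - 2 \cdot 1^2 = 7$, and multiplying repeatedly by the +unit $3 + 2\sqrt{2}$ produces more solutions. The Python sketch below is illustrative only; the helper name `times_unit` and the choice of starting pair are assumptions made for the example.

```python
def times_unit(p, unit, n):
    """Multiply x + y*sqrt(n) by u + v*sqrt(n), returning the new (x, y)."""
    x, y = p
    u, v = unit
    return (x * u + n * y * v, x * v + y * u)

n, k = 2, 7
start = (3, 1)            # 3^2 - 2*1^2 = 7
unit = (3, 2)             # 3 + 2*sqrt(2) has norm 1

solutions = [start]
for _ in range(4):
    solutions.append(times_unit(solutions[-1], unit, n))

for x, y in solutions:
    # Multiplying by an element of norm 1 leaves the norm k unchanged.
    assert x * x - n * y * y == k
print(solutions)
```

The first step gives $(13, 9)$, and $13^2 - 2 \cdot 9^2 = 169 - 162 = 7$ as expected.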
The form $x^2 - ny^2$ is called an indefinite form because it takes on positive and negative values. The general theory of indefinite forms is similar, and another interesting example is the case of the form
$$Q(x, y) = x^2 + xy - y^2.$$
Here the solutions to $Q(x, y) = 1$ are given by $(F_{2n+1}, F_{2n+2})$ where $F_n$ is the $n$-th Fibonacci number (cf. Exercise 5.8.4; $F_1 = F_2 = 1$). The form $Q(x, y)$ is the norm of the element $x + y\frac{1+\sqrt{5}}{2}$ in
$$\mathbb{Z}\left[\tfrac{1+\sqrt{5}}{2}\right] := \left\{ a + b\,\tfrac{1+\sqrt{5}}{2} : a, b \in \mathbb{Z} \right\}.$$
Here, the golden ratio $\frac{1+\sqrt{5}}{2}$ is a fundamental unit for $\mathbb{Z}\left[\frac{1+\sqrt{5}}{2}\right]$, but this has norm $-1$. In fact the solutions are generated by the powers of the fundamental +unit, $1 + \frac{1+\sqrt{5}}{2} = \frac{3+\sqrt{5}}{2}$. Hence this gives an interesting way of computing the Fibonacci numbers:
$$\left(\frac{3+\sqrt{5}}{2}\right)^n = F_{2n-1} + F_{2n}\,\frac{1+\sqrt{5}}{2}.$$
In fact, proving this relation (say by induction) is an alternative way of showing Exercise 5.8.4.
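The claim that the pairs $(F_{2n+1}, F_{2n+2})$ satisfy $Q(x, y) = x^2 + xy - y^2 = 1$ is easy to spot-check numerically. The following Python sketch does so for the first ten values of $n$; the helper names `fib` and `Q` are illustrative, and a finite check is no substitute for the induction proof.

```python
def fib(k):
    """k-th Fibonacci number, with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1               # (F_0, F_1)
    for _ in range(k):
        a, b = b, a + b
    return a

def Q(x, y):
    """The indefinite form x^2 + x*y - y^2."""
    return x * x + x * y - y * y

# Evaluate Q at (F_{2n+1}, F_{2n+2}) for n = 0, 1, ..., 9.
values = [Q(fib(2 * n + 1), fib(2 * n + 2)) for n in range(10)]
print(values)
```

For example, $n = 2$ gives the pair $(5, 8)$ and $25 + 40 - 64 = 1$.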
Exercise 5.7. Check that $\left(\frac{3+\sqrt{5}}{2}\right)^n = F_{2n-1} + F_{2n}\,\frac{1+\sqrt{5}}{2}$ holds for $n = 1, 2, 3$.
Opposed to the indefinite forms, we have the definite forms. We say $Q(x, y)$ is positive definite (resp. negative definite) if $Q(x, y) \geq 0$ (resp. $Q(x, y) \leq 0$) for all $x, y \in \mathbb{Z}$. For example, $x^2 + ny^2$ for $n \in \mathbb{N}$ is a positive definite form. (The negative definite forms are just the negatives of positive definite forms, so it makes sense to study just the positive ones.)
In contrast to the indefinite case, it is clear that if $k$ is of the form
$$x^2 + ny^2 = k$$
there are only finitely many solutions for $(x, y)$. These are in 1-1 correspondence with the elements of norm $k$ in the imaginary quadratic ring $\mathbb{Z}[\sqrt{-n}]$. While the point of view of norms is similar to the indefinite case, the definite and indefinite cases have a rather different flavor (with the definite case being the easier of the two).
In the next chapter, we will study the ring of Gaussian integers $\mathbb{Z}[i]$, with the goal in mind of determining which numbers are the sum of two squares $x^2 + y^2$. Brahmagupta composition, as we remarked earlier, suggests that we can reduce the problem to the question of which primes are sums of two squares, for which the pattern becomes much more apparent.
5.7 *The map of primitive vectors
5.8 *Periodicity in the map of $x^2 - ny^2$
The material in these two optional sections is an introduction to Conway's recent (in the last 20 years or so) insights into a visual approach to binary quadratic forms. While the material is interesting, we will focus on other things in this class. If you are interested in learning about it, I recommend Conway's own (small) book, The Sensual Quadratic Form.
5.9 Discussion
Probably worth reading.