
REPRESENTING INTEGERS AS SUMS OF SQUARES

MICHAEL WONG

Abstract. We study in detail the special case of Waring's problem when the power $k = 2$. Ultimately, we prove that four is the least number of squares needed to represent any integer. To this end, we prove that some numbers cannot be represented as sums of two squares, some cannot be represented as sums of three, and all can be represented as sums of four. We also show that numbers of a certain form can be represented as sums of two squares. Though we mostly use classic methods of number theory, we venture into group theory to prove a few preliminary theorems.

Contents
1. Introduction
2. Quadratic Character of -1
3. Representing Integers as Sums of Two Squares
4. Representing Integers as Sums of Four Squares
5. Representing Integers as Sums of Three Squares
Acknowledgments

1. Introduction

The Diophantine equation
\[ N = \sum_{i=1}^{s} x_i^k \]
is of special interest in mathematics. Elegant in its simplicity, this equation can be applied to advanced topics in mathematics and physics. Beginning math students, or anyone with a curious mind, will find it an accessible introduction to number theory.

The equation itself has a rich history apart from the general theory of Diophantine equations. Such celebrated mathematicians as Euler, Lagrange, Legendre, and Gauss have contributed to our knowledge of it. Among the problems they tackled was which integers can be represented by the sum. Providing a concise statement of this problem, Edward Waring conjectured that, for each $k$, every positive integer is the sum of a fixed number $s$ of $k$th powers. The function $g(k)$, depending only on $k$, is defined as the least number of $k$th powers needed for all integers. Hilbert first proved Waring's conjecture, and mathematicians are still calculating the values of $g(k)$ for larger $k$. The first few values have been known for some time: $g(2) = 4$, $g(3) = 9$, and $g(4) = 19$.

In the following exposition, we prove that $g(2) = 4$. To do so, we must show that some numbers cannot be represented as sums of two squares, some numbers cannot be represented as sums of three squares, and all numbers can be represented as sums of four squares. Along the way, we also prove that numbers satisfying certain conditions can be represented as sums of two squares. Throughout, we assume the reader has a basic understanding of congruences, quadratic residues, fields, and groups.

Date: August 21, 2009.

2. Quadratic Character of -1

For the two- and four-square theorems, we must know whether $-1$ is a quadratic residue modulo the prime $p$. Our approach to this point involves a few basic theorems in field theory and group theory.

First, we establish the relation of congruence to finite fields. Let $\mathbb{Z}_p$ be the set of all congruence classes modulo $p$. We denote the congruence class corresponding to an integer $n$ as $[n]$. Clearly, the set consists of exactly $p$ elements. Now, we define addition and multiplication on these elements in the usual manner:
\[ [m] + [n] = [m + n], \qquad [m] \cdot [n] = [m \cdot n]. \]
Then $\{\mathbb{Z}_p, +, \cdot\}$ is a field, as is easily verified. For convenience, we represent $[0]$ by $0$ and every other congruence class by its least positive residue, allowing us to write $\mathbb{Z}_p = \{0, 1, \ldots, p - 1\}$. Let $\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{0\}$. From the fact that $\{\mathbb{Z}_p, +, \cdot\}$ is a field, we can immediately deduce that $\{\mathbb{Z}_p^*, \cdot\}$ is an abelian group. We call this group the multiplicative group of $\mathbb{Z}_p$.

Let us examine its structure in a general context. For any $x$ in a finite group $G$, we define the order of $x$ as the least positive integer $n$ such that $x^n = 1$ and denote it as $\mathrm{ord}(x) = n$. It is easily shown that the elements $x^0, x^1, \ldots, x^{n-1}$ are distinct and, combined with the operation $\{\cdot\}$, make an abelian subgroup of $G$, denoted as $\langle x \rangle$. We say that $\langle x \rangle$ is the cyclic subgroup generated by $x$. Thus, we arrive at an alternative definition of the order of $x$: the order of $\langle x \rangle$.

We recall here the definition of quadratic residue in the special case that the modulus is prime:

Definition 2.1. The element $a$ of $\mathbb{Z}_p^*$ is a quadratic residue (mod $p$) if the polynomial $x^2 - a$ has a zero over the field $\mathbb{Z}_p$. Otherwise, $a$ is a quadratic non-residue.
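The two notions just defined, the order of an element and the quadratic character of a residue class, can be explored numerically. The following is a minimal brute-force sketch in Python; the function names `order` and `is_quadratic_residue` are illustrative choices and do not come from the paper, and $p$ is assumed to be prime.

```python
def order(x, p):
    """Order of x in the multiplicative group of Z_p (p assumed prime)."""
    x %= p
    if x == 0:
        raise ValueError("0 is not an element of the multiplicative group")
    n, power = 1, x
    # Multiply by x until we reach the identity; n counts the exponent.
    while power != 1:
        power = (power * x) % p
        n += 1
    return n


def is_quadratic_residue(a, p):
    """True if the polynomial x^2 - a has a zero in Z_p^* (brute force)."""
    a %= p
    return any((x * x - a) % p == 0 for x in range(1, p))


print(order(2, 7))                  # 2, 4, 1 -> order 3
print(is_quadratic_residue(2, 7))   # 3^2 = 9 = 2 (mod 7)
print(is_quadratic_residue(-1, 7))  # 7 = 4*1 + 3
```

Both functions run in time proportional to $p$, which is adequate for experimenting with small primes but not for large ones.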
To determine the quadratic character of $-1$, we need to prove that the multiplicative group of $\mathbb{Z}_p$ is cyclic. In the trivial case $p = 2$, $-1$ is a quadratic residue. Now, assuming $p$ is odd and $\{\mathbb{Z}_p^*, \cdot\}$ is cyclic, let $x$ be a generator of the group. So $x^{p-1} = 1$, implying that $x^{\frac{1}{2}(p-1)} = p - 1 \equiv -1 \pmod{p}$, since $x^{\frac{1}{2}(p-1)}$ is a zero of $t^2 - 1$ other than $1$. The quadratic residues are exactly the even powers of $x$. Clearly, then, $-1$ is a quadratic residue if and only if $P = \frac{1}{2}(p - 1)$ is even. $P$ is even for all primes congruent to 1 and odd for those congruent to 3 (mod 4). Therefore, we have the following proposition:

Proposition 2.2. Suppose $p$ is an odd prime. Then $-1$ is a quadratic residue (mod $p$) if and only if $p$ is of the form $4k + 1$.

We proceed to the theorems necessary to prove this statement.

Lemma 2.3. Suppose $x, y$ are elements of the abelian group $\{G, \cdot\}$. Furthermore, suppose $\mathrm{ord}(x) = a$, $\mathrm{ord}(y) = b$, and $(a, b) = 1$. Then $\mathrm{ord}(xy) = ab$.

Proof. Let $\mathrm{ord}(xy) = j$. Then $j$ must divide $ab$, for $(xy)^{ab} = x^{ab} y^{ab} = 1$. Conversely, $(xy)^j = x^j y^j = 1$, implying that $x^j = (y^j)^{-1}$. Raising both sides to the $b$th power, we have $x^{bj} = (y^{bj})^{-1} = 1$, so $a \mid bj$. Because $(a, b) = 1$, it follows that $a \mid j$. A similar argument, raising both sides to the $a$th power, shows that $b \mid j$. Since $(a, b) = 1$, we conclude that $ab \mid j$. Therefore, $j = ab$.

This lemma leads to its own generalization. In the next lemma, we remove the condition that $(a, b) = 1$.

Lemma 2.4. Suppose $x, y$ are elements of the abelian group $\{G, \cdot\}$. Furthermore, suppose $\mathrm{ord}(x) = a$ and $\mathrm{ord}(y) = b$. Then there exists $z \in G$ such that $\mathrm{ord}(z) = \mathrm{lcm}(a, b)$.

Proof. For the moment, assume that there exist $m$ and $n$ such that $m \mid a$ and $n \mid b$, $(m, n) = 1$, and $mn = \mathrm{lcm}(a, b)$. We will prove afterwards that $m$ and $n$ exist. So $\mathrm{ord}(x^{a/m}) = m$ and $\mathrm{ord}(y^{b/n}) = n$. By the previous lemma, $\mathrm{ord}(x^{a/m} y^{b/n}) = mn$.

Now, we shall construct $m$ and $n$. Consider $a$ and $b$ in standard form:
\[ a = \prod_{i \in \mathbb{N}} p_i^{\alpha_i}, \qquad b = \prod_{i \in \mathbb{N}} p_i^{\beta_i}. \]
For each $i$, if $\alpha_i < \beta_i$, divide $a$ by $p_i^{\alpha_i}$; if $\alpha_i > \beta_i$, divide $b$ by $p_i^{\beta_i}$; if $\alpha_i = \beta_i$, perform either operation. Each prime $p_i$ then survives in exactly one of the two numbers, with exponent $\max(\alpha_i, \beta_i)$. In the end, we are left with two numbers satisfying the conditions for $m$ and $n$.
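The construction of $m$ and $n$ in the proof of Lemma 2.4 is algorithmic, so it can be sketched directly in code. The Python below is only an illustration under simple assumptions: the helper names `factorize` and `coprime_split` are hypothetical, factorization is done by trial division, and ties ($\alpha_i = \beta_i$) are always resolved by dividing $b$.

```python
from math import gcd


def factorize(n):
    """Prime factorization of n >= 1 as {prime: exponent}, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def coprime_split(a, b):
    """Return (m, n) with m | a, n | b, gcd(m, n) = 1 and m * n = lcm(a, b)."""
    fa, fb = factorize(a), factorize(b)
    m, n = a, b
    for p in set(fa) | set(fb):
        alpha, beta = fa.get(p, 0), fb.get(p, 0)
        if alpha < beta:
            m //= p ** alpha  # remove p from a; b keeps the larger power
        else:
            n //= p ** beta   # remove p from b (also used when alpha == beta)
    return m, n


m, n = coprime_split(120, 36)
print(m, n, gcd(m, n), m * n)  # 40 9 1 360
```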

It may be helpful to illustrate this construction of $m$ and $n$ by a numerical example.

Example 2.5. Suppose $a = 120$ and $b = 36$. In standard form,
\[ a = 120 = 5 \cdot 3 \cdot 2 \cdot 2 \cdot 2, \qquad b = 36 = 3 \cdot 3 \cdot 2 \cdot 2. \]
Completing the procedure in the previous lemma, we have
\[ m = 40 = 5 \cdot 2 \cdot 2 \cdot 2, \qquad n = 9 = 3 \cdot 3, \]
with $mn = 360 = \mathrm{lcm}(120, 36)$. The reader may easily verify that $m$ and $n$ satisfy the other conditions in Lemma 2.4.

From the previous lemma, we can immediately draw an important conclusion that will help us in the next and final theorem. Suppose $x$ and $y$ are elements of the abelian group $\{G, \cdot\}$, and suppose $\mathrm{ord}(x) = a$ and $\mathrm{ord}(y) = b$. Then $a \mid b$, $b \mid a$, or there exists $z \in G$ such that $\mathrm{ord}(z) > a, b$.

Theorem 2.6. The multiplicative group of $\mathbb{Z}_p$ is cyclic.

Proof. Let $k$ be the highest order of all elements in $\mathbb{Z}_p^*$. Consider the polynomial $x^k - 1$ over the field $\mathbb{Z}_p$. By what we just stated, the order of any element in $\mathbb{Z}_p^*$ divides $k$; otherwise, some element would have order greater than $k$. Therefore, every such element is a zero of the polynomial. Since a polynomial of degree $k$ over a field has at most $k$ zeros, this implies $k \geq p - 1$. But the order of an element of a group cannot be greater than the order of the group. So $k = p - 1$, and any element whose order is $k$ is a generator of $\{\mathbb{Z}_p^*, \cdot\}$.
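Theorem 2.6 and its consequence, Proposition 2.2, can be checked numerically for small primes. The sketch below is a brute-force illustration rather than an efficient method; the names `find_generator` and `minus_one_is_residue` are hypothetical, and every modulus passed in is assumed to be prime.

```python
def order(x, p):
    """Order of x in the multiplicative group of Z_p (p assumed prime)."""
    n, power = 1, x % p
    while power != 1:
        power = (power * x) % p
        n += 1
    return n


def find_generator(p):
    """Smallest element of Z_p^* whose order is p - 1, or None if none exists."""
    for x in range(1, p):
        if order(x, p) == p - 1:
            return x
    return None


def minus_one_is_residue(p):
    """True if z^2 + 1 = 0 (mod p) has a solution, found by brute force."""
    return any((z * z + 1) % p == 0 for z in range(1, p))


for p in [3, 5, 7, 11, 13, 17, 19, 23, 29]:
    g = find_generator(p)
    # g^((p-1)/2) should equal p - 1, i.e. -1 (mod p), as in the text.
    print(p, g, pow(g, (p - 1) // 2, p) == p - 1, minus_one_is_residue(p), p % 4 == 1)
```

In each printed row the third entry should be `True` and the last two entries should agree, which is the content of Proposition 2.2 for that prime.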


By what was discussed before Lemma 2.3, this theorem implies Proposition 2.2.

3. Representing Integers as Sums of Two Squares

To familiarize the reader with the problem, we begin with a simple geometric interpretation of the Diophantine equation
\[ N = x^2 + y^2 \tag{3.1} \]

Consider a circle with radius $\sqrt{N}$, centered at $(0, 0)$ in $\mathbb{R}^2$. Then $N$ can be represented as a sum of two integer squares if and only if there exist integer coordinates $(x, y)$ on the circle. Obviously, if the coordinates $(x, y)$ in the first quadrant satisfy Equation 3.1, then $(-x, y)$, $(x, -y)$, and $(-x, -y)$ are also solutions. Because they differ only in sign, these solutions are not essentially distinct. Clearly, not all circles of radius $\sqrt{N}$ have points of integer coordinates. Our task for this section is to develop a criterion for two-square representability.

The first step is to demonstrate a necessary condition for a number to be representable.

Theorem 3.2. Suppose $p = 4k + 3$ is prime. If a number $N$ divisible by $p$ is a sum of two squares, then the power of $p$ in the standard form of $N$ is even.

Proof. Let $N = x^2 + y^2$. Observe that $p \mid N$ implies $x^2 \equiv -y^2 \pmod{p}$. We know from Proposition 2.2 that $-1$ is a quadratic non-residue (mod $p$). If $y \not\equiv 0 \pmod{p}$, then $(x y^{-1})^2 \equiv -1 \pmod{p}$, which is impossible. Thus, the only possible solutions to the congruence are $x \equiv y \equiv 0 \pmod{p}$. Then $p \mid x$ and $p \mid y$, implying $p^2 \mid N$. Now, let $N = p^2 d$, so that $d = (x/p)^2 + (y/p)^2$ is again a sum of two squares. By the same argument, $p \mid d$ implies $p^2 \mid d$. Therefore, any divisor of $N$ divisible by $p$ is divisible by $p^2$. So the power of $p$ in the standard form of $N$ is even.

An immediate corollary of this theorem is that any number of the form $4k + 3$ is not representable. The proof is left as an exercise for the reader.

We now want to show that the conclusion of Theorem 3.2 is also a sufficient condition. Reformulating the statement accordingly, we have the following criterion:

Theorem 3.3 (Two-Squares Theorem). A number $N$ is a sum of two squares if and only if the power $a_i$ of every prime factor $p_i = 4k + 3$ in the standard form of $N$ is even.

The proof of the remaining half of this theorem is greatly simplified by an identity,
\[ (a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2 \tag{3.4} \]
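Both Equation 3.4 and the criterion of Theorem 3.3 lend themselves to a quick numerical check. The following Python sketch is an illustration only: `compose`, `is_sum_of_two_squares`, and `satisfies_criterion` are hypothetical names, and the comparison covers a finite range, so it is evidence for the theorem rather than a proof.

```python
from math import isqrt


def compose(s, t):
    """Given s = (a, b), t = (c, d), return (x, y) with
    x^2 + y^2 = (a^2 + b^2)(c^2 + d^2), following Equation 3.4."""
    a, b = s
    c, d = t
    return (a * c + b * d, a * d - b * c)


def is_sum_of_two_squares(n):
    """Brute-force search for integers x, y >= 0 with x^2 + y^2 = n."""
    for x in range(isqrt(n) + 1):
        r = n - x * x
        y = isqrt(r)
        if y * y == r:
            return True
    return False


def satisfies_criterion(n):
    """True if every prime factor of the form 4k + 3 divides n to an even power."""
    d = 2
    while d * d <= n:
        e = 0
        while n % d == 0:
            n //= d
            e += 1
        if d % 4 == 3 and e % 2 == 1:
            return False
        d += 1
    # Whatever remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3


x, y = compose((1, 2), (2, 3))  # 5 = 1^2 + 2^2, 13 = 2^2 + 3^2
print(x, y, x * x + y * y)      # 8 -1 65
print(all(is_sum_of_two_squares(n) == satisfies_criterion(n) for n in range(1, 2000)))
```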

By Equation 3.4, if each factor in the standard form of a number is representable, then the number itself is representable. So let us examine the standard form of a number $N$ satisfying the condition of the theorem. Trivially, $2 = 1^2 + 1^2$, so any power of 2 is representable. By assumption, the power $a_i$ of a prime $p_i = 4k + 3$ is even, so $p_i^{a_i} = (p_i^{a_i/2})^2 + 0^2$ is representable. All other primes are of the form $4k + 1$. Therefore, to prove Theorem 3.3, we must show that a prime $p = 4k + 1$ is a sum of two squares.

Our method to prove the last statement is Fermat's method of descent. First, we prove that a multiple of $p$ is representable. Second, we prove that the least representable multiple is $p$ itself.

Theorem 3.5. A prime $p = 4k + 1$ is a sum of two squares.
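The two steps of the descent just outlined can also be carried out computationally. The Python sketch below is one possible rendering, not the paper's own procedure: the names `centered` and `two_squares` are hypothetical, the square root of $-1$ is found by brute force, and the input is assumed to be a prime of the form $4k + 1$. Each pass of the loop replaces a representable multiple $mp$ by a strictly smaller one, using the reduction modulo $m$ and Equation 3.4 in the way the proof of Theorem 3.5 makes precise.

```python
def centered(a, m):
    """Residue of a modulo m of least absolute value, in the range (-m/2, m/2]."""
    r = a % m
    return r - m if r > m // 2 else r


def two_squares(p):
    """Write a prime p = 4k + 1 as x^2 + y^2 by descent (p is assumed prime)."""
    if p % 4 != 1:
        raise ValueError("expected a prime of the form 4k + 1")
    # Step 1: a multiple of p is representable, since z^2 + 1 = 0 (mod p) is soluble.
    z = next(z for z in range(1, p) if (z * z + 1) % p == 0)
    x, y = centered(z, p), 1
    m = (x * x + y * y) // p
    # Step 2: descend until the representable multiple is p itself.
    while m > 1:
        x2, y2 = centered(x, m), centered(y, m)
        # Both combinations below are divisible by m, so the divisions are exact.
        x, y = (x * x2 + y * y2) // m, (x * y2 - y * x2) // m
        m = (x * x + y * y) // p
    return abs(x), abs(y)


print(two_squares(13))  # (3, 2)
print(two_squares(29))  # (5, 2)
```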


Proof. By Proposition 2.2, $-1$ is a quadratic residue (mod $p$). Therefore, the congruence $z^2 + 1 \equiv 0 \pmod{p}$ is soluble. So there exists $m > 0$ such that $mp = z^2 + 1^2$. Now, consider the more general congruence, $x^2 + y^2 \equiv 0 \pmod{p}$, with $x$ and $y$ not both divisible by $p$. Let $x_1, y_1$ be residues of $x, y$ (mod $p$) such that $|x_1|, |y_1| < \frac{1}{2} p$. Then
\[ 0 < m = \frac{1}{p}(x_1^2 + y_1^2) < \frac{1}{p}\left(\frac{1}{2} p^2\right) < p \tag{3.6} \]
Let $m = m_0$ be the least value for which $mp$ is representable as a sum of two squares.
\[ m_0 p = x_1^2 + y_1^2 \tag{3.7} \]

By contradiction, assume $m_0 > 1$. Let $x_2, y_2$ be residues of $x_1, y_1$ (mod $m_0$) such that $|x_2|, |y_2| \leq \frac{1}{2} m_0$. Observe that
\[ x_2^2 + y_2^2 \equiv x_1^2 + y_1^2 \equiv 0 \pmod{m_0} \]
This implies that there exists $r$ such that
\[ m_0 r = x_2^2 + y_2^2 \tag{3.8} \]
We wish to show that $r \neq 0$. If $r = 0$, then
\[ x_2 = y_2 = 0 \;\Rightarrow\; m_0 \mid x_1,\ m_0 \mid y_1 \;\Rightarrow\; m_0^2 \mid m_0 p \;\Rightarrow\; m_0 \mid p \]
But by Equation 3.6, $m_0 < p$, and by assumption, $1 < m_0$. So $r = 0$ contradicts that $p$ is prime. Observe that
\[ r = \frac{1}{m_0}(x_2^2 + y_2^2) \leq \frac{1}{m_0}\left(\frac{1}{2} m_0^2\right) < m_0 \]
Multiplying Equation 3.7 and Equation 3.8 and applying Equation 3.4, we have
\[ m_0^2 r p = (x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - y_1 x_2)^2 \tag{3.9} \]

Note that each term being squared on the right is divisible by $m_0$, because $x_2 \equiv x_1$ and $y_2 \equiv y_1 \pmod{m_0}$:
\[ x_1 x_2 + y_1 y_2 \equiv x_1^2 + y_1^2 \equiv 0 \pmod{m_0} \]
\[ x_1 y_2 - y_1 x_2 \equiv x_1 y_1 - y_1 x_1 \equiv 0 \pmod{m_0} \]
So let $m_0 X = x_1 x_2 + y_1 y_2$ and $m_0 Y = x_1 y_2 - y_1 x_2$. Then dividing Equation 3.9 by $m_0^2$, we have
\[ rp = X^2 + Y^2 \]
Consequently, there exists $r < m_0$, $r \neq 0$, such that $rp$ is representable as a sum of two squares. This contradicts the definition of $m_0$. Therefore, $m_0 = 1$.

From the two-squares theorem, we can deduce that $g(2) > 2$.

4. Representing Integers as Sums of Four Squares

Perhaps the first question that comes to the reader's mind is why we address four squares before we address three. As will become evident, the proofs for the two- and four-square theorems are very similar, while the three-square theorem requires completely different methods. To hint at the similarity, we again touch on the geometry of the problem. Consider the Diophantine equation
\[ N = w^2 + x^2 + y^2 + z^2 \tag{4.1} \]


and imagine a sphere of radius $\sqrt{N}$, centered at $(0, 0, 0, 0)$ in $\mathbb{R}^4$. Then $N$ can be represented as a sum of four integer squares if and only if there exist integer coordinates $(w, x, y, z)$ on the sphere. With regard to sign, there are sixteen solutions that are not essentially distinct.

As mentioned in the introduction, all numbers can be represented as a sum of four squares. We formalize this statement thus:

Theorem 4.2 (Four-Squares Theorem). Every positive integer is a sum of four squares.

Our work is simplified by an analog to Equation 3.4:
\[
\begin{aligned}
(a^2 + b^2 + c^2 + d^2)(A^2 + B^2 + C^2 + D^2) ={} & (aA + bB + cC + dD)^2 + (aB - bA - cD + dC)^2 \\
& + (aC + bD - cA - dB)^2 + (aD - bC + cB - dA)^2
\end{aligned}
\tag{4.3}
\]
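Equation 4.3 is easy to misread because of its sign pattern, so a direct numerical check may be reassuring. The sketch below is purely illustrative; `compose4` and `norm4` are hypothetical helper names, and the sample tuples are arbitrary.

```python
def compose4(s, t):
    """Given s = (a, b, c, d) and t = (A, B, C, D), return the four integers
    whose squares sum to (a^2+b^2+c^2+d^2)(A^2+B^2+C^2+D^2), as in Equation 4.3."""
    a, b, c, d = s
    A, B, C, D = t
    return (
        a * A + b * B + c * C + d * D,
        a * B - b * A - c * D + d * C,
        a * C + b * D - c * A - d * B,
        a * D - b * C + c * B - d * A,
    )


def norm4(v):
    """Sum of the squares of the entries of v."""
    return sum(x * x for x in v)


s, t = (1, 1, 1, 0), (2, 1, 0, 0)   # 3 = 1 + 1 + 1 + 0 and 5 = 4 + 1 + 0 + 0
print(compose4(s, t))               # (3, -1, -2, 1)
print(norm4(compose4(s, t)))        # 15
```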

So if each factor in the standard form of a number is representable, then the number itself is representable. Once again, let us examine the standard form of a number $N$; here, however, $N$ is completely arbitrary. Trivially, $2 = 1^2 + 1^2 + 0^2 + 0^2$, so any power of 2 is representable. It immediately follows from Theorem 3.5 that any power of a prime $p = 4k + 1$ is representable. Thus, in order to prove the above theorem, we must show that any prime of the form $4k + 3$ is a sum of four squares.

We again employ the method of descent, making the structure of the proof nearly identical to that of the two-squares proof. Just as we did before, we show that a multiple of $p$ is representable from a soluble congruence: $x^2 + y^2 + 1 \equiv 0 \pmod{p}$. Because the solubility of this congruence is not immediately obvious, however, we prove it in a separate lemma.

Lemma 4.4. Suppose $p = 4k + 3$ is prime. The congruence $x^2 + y^2 + 1 \equiv 0 \pmod{p}$ is soluble.

Proof. Consider the congruence in this form: $x^2 + 1 \equiv -y^2 \pmod{p}$. By Proposition 2.2, $-1 \equiv p - 1$ is a quadratic non-residue modulo $p = 4k + 3$. Now that we have proven that a positive quadratic non-residue exists, let $z$ be the least positive non-residue (mod $p$). Trivially, $z > 1$. Then there exists a residue $w$ such that $w + 1 = z$; choose $x$ with $x^2 \equiv w \pmod{p}$. Because $-1$ and $z$ are both non-residues, their product $-z$ is a residue; that is, there exists $y$ such that $-z \equiv y^2 \pmod{p}$. Then $x^2 + 1 \equiv z \equiv -y^2 \pmod{p}$. Therefore, the congruence $x^2 + 1 \equiv -y^2 \pmod{p}$ is soluble.

We are now ready for the main proof.

Theorem 4.5. A prime $p = 4k + 3$ is a sum of four squares.

Proof. By the previous lemma, the congruence $x^2 + y^2 + 1 \equiv 0 \pmod{p}$ is soluble. So there exists $m > 0$ such that $mp = x^2 + y^2 + 1$. Now, consider the more general congruence, $x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod{p}$, with the $x_i$ not all divisible by $p$. Let $y_i$, $i \in \{1, 2, 3, 4\}$, be a residue of $x_i$ (mod $p$) such that $|y_i| < \frac{1}{2} p$. Then
\[ 0 < m = \frac{1}{p} \sum_{i=1}^{4} y_i^2 < \frac{1}{p}(p^2) = p \tag{4.6} \]
Let $m = m_0$ be the least value for which $mp$ is representable as a sum of four squares.
\[ m_0 p = y_1^2 + y_2^2 + y_3^2 + y_4^2 \tag{4.7} \]


By contradiction, assume $m_0 > 1$. Let $z_i$ be a residue of $y_i$ (mod $m_0$) such that $|z_i| \leq \frac{1}{2} m_0$. Observe that
\[ \sum_{i=1}^{4} z_i^2 \equiv \sum_{i=1}^{4} y_i^2 \equiv 0 \pmod{m_0} \]
This implies that there exists $r$ such that
\[ m_0 r = z_1^2 + z_2^2 + z_3^2 + z_4^2 \tag{4.8} \]
We wish to show that $r \neq 0$. If $r = 0$, then
\[ \forall i,\ z_i = 0 \;\Rightarrow\; m_0 \mid y_i \;\Rightarrow\; m_0^2 \mid m_0 p \;\Rightarrow\; m_0 \mid p \]
But by Equation 4.6, $m_0 < p$, and by assumption, $1 < m_0$. So $r = 0$ contradicts that $p$ is prime. Observe that
\[ r = \frac{1}{m_0} \sum_{i=1}^{4} z_i^2 \leq \frac{1}{m_0}(m_0^2) = m_0 \]
But we want to show that $r$ is strictly less than $m_0$. If $r = m_0$, then $|z_i| = \frac{1}{2} m_0$ for all $i$ and $\sum_{i=1}^{4} z_i^2 \equiv 0 \pmod{m_0^2}$. Now, $|z_i| = \frac{1}{2} m_0$ implies that $z_i \equiv \frac{1}{2} m_0 \equiv -\frac{1}{2} m_0 \pmod{m_0}$. Recall $z_i \equiv y_i \pmod{m_0}$. Therefore,
\[ m_0 \mid y_i - z_i,\ m_0 \mid y_i + z_i \;\Rightarrow\; m_0^2 \mid y_i^2 - z_i^2 \;\Rightarrow\; y_i^2 \equiv z_i^2 \pmod{m_0^2} \]
Then $\sum_{i=1}^{4} y_i^2 \equiv \sum_{i=1}^{4} z_i^2 \equiv 0 \pmod{m_0^2}$. So $m_0^2 \mid m_0 p$, which implies $m_0 \mid p$. As before, this contradicts that $p$ is prime. Now, multiplying Equation 4.7 and Equation 4.8 and applying Equation 4.3,
\[
\begin{aligned}
m_0^2 r p &= (y_1^2 + y_2^2 + y_3^2 + y_4^2)(z_1^2 + z_2^2 + z_3^2 + z_4^2) \\
&= (y_1 z_1 + y_2 z_2 + y_3 z_3 + y_4 z_4)^2 + (y_1 z_2 - y_2 z_1 - y_3 z_4 + y_4 z_3)^2 \\
&\quad + (y_1 z_3 + y_2 z_4 - y_3 z_1 - y_4 z_2)^2 + (y_1 z_4 - y_2 z_3 + y_3 z_2 - y_4 z_1)^2
\end{aligned}
\tag{4.9}
\]

Note that each term being squared on the right is divisible by $m_0$, because $z_i \equiv y_i \pmod{m_0}$:
\[ y_1 z_1 + y_2 z_2 + y_3 z_3 + y_4 z_4 \equiv y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv 0 \pmod{m_0} \]
\[ y_1 z_2 - y_2 z_1 - y_3 z_4 + y_4 z_3 \equiv y_1 y_2 - y_2 y_1 - y_3 y_4 + y_4 y_3 \equiv 0 \pmod{m_0} \]
and similarly for the other two terms. Then dividing Equation 4.9 by $m_0^2$, we have
\[ rp = X_1^2 + X_2^2 + X_3^2 + X_4^2, \qquad X_i \in \mathbb{Z} \]
So there exists $r < m_0$, $r \neq 0$, such that $rp$ is representable as a sum of four squares. This contradicts the definition of $m_0$. Therefore, $m_0 = 1$.

From the four-square theorem, we deduce that $g(2) \leq 4$.

5. Representing Integers as Sums of Three Squares

To prove finally that $g(2) = 4$, we must show that certain numbers cannot be represented as sums of three squares. To start, consider the Diophantine equation
\[ N = x^2 + y^2 + z^2 \tag{5.1} \]
We suspect that the parities of $x$, $y$, and $z$ give rise to distinct sets of representable numbers. Three variables, each with two possibilities for parity, yield eight possible sets of values for $N$. Thus, we are naturally led to consider the sum in Equation 5.1


to the modulus 8. Clearly, if $n$ is even, then $n^2$ is congruent to 0 or 4 (mod 8). It remains to show the possible congruences where $n$ is odd.

Lemma 5.2. If $n$ is odd, then $n^2 \equiv 1 \pmod{8}$.

Proof. If $n$ is odd, then $n + 1$ and $n - 1$ are even. Consider the equation
\[ n^2 - 1 = 4\left(\frac{n+1}{2}\right)\left(\frac{n-1}{2}\right) \]
Now, $\frac{n+1}{2}$ and $\frac{n-1}{2}$ are consecutive integers. So either $\frac{n+1}{2}$ or $\frac{n-1}{2}$ is even. Therefore, $n^2 - 1 \equiv 0 \pmod{8}$, implying $n^2 \equiv 1 \pmod{8}$.

By these considerations, we can directly calculate the possible congruences for a sum of three squares (up to reordering the variables):
\[
x^2 + y^2 + z^2 \equiv
\begin{cases}
0 \text{ or } 4 & \text{if } x, y, z \text{ even} \\
1 \text{ or } 5 & \text{if } x, y \text{ even and } z \text{ odd} \\
2 \text{ or } 6 & \text{if } x \text{ even and } y, z \text{ odd} \\
3 & \text{if } x, y, z \text{ odd}
\end{cases}
\pmod{8}
\tag{5.3}
\]
Because 7 is not among these residues (mod 8), no number of the form $8k + 7$ can be represented as a sum of three squares. This conclusion is sufficient to prove that $g(2) = 4$. But to show which numbers are representable, we must prove a stronger theorem.

Theorem 5.4. No number $N = 4^n(8k + 7)$, where $n \in \mathbb{N}$ and $k \in \mathbb{Z}$, is the sum of three squares.

Proof. By contradiction, assume
\[ N = 4^n(8k + 7) = x^2 + y^2 + z^2 \tag{5.5} \]
where $x$, $y$, and $z$ are integers. Let $n = n_0$ be the least power for which $N$ is representable. Note that $n_0 > 0$, for if $n_0 = 0$, then $N = 8k + 7 \equiv 7 \pmod{8}$. Now, $4 \mid N$ means $N \equiv 0$ or $4 \pmod{8}$, which by Equation 5.3 implies that $x$, $y$, and $z$ are even. So let $X = \frac{1}{2} x$, $Y = \frac{1}{2} y$, and $Z = \frac{1}{2} z$. Then dividing Equation 5.5 by 4, we have
\[ \frac{N}{4} = 4^{n_0 - 1}(8k + 7) = X^2 + Y^2 + Z^2 \]
implying $n = n_0 - 1$ is a power for which a number of this form is representable. This contradicts the definition of $n_0$. Therefore, $N$ is not representable.

We assert that all numbers not of the form $4^n(8k + 7)$ are representable. Combining this statement with the previous theorem, we can formulate a criterion for three-square representability:

Theorem 5.6 (Three-Squares Theorem). A number $N$ is a sum of three squares if and only if $N$ is not of the form $4^n(8k + 7)$.
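Although only one direction of Theorem 5.6 has been proved at this point, the full statement can be compared against a brute-force search over a finite range. The Python sketch below does this; `is_sum_of_three_squares` and `has_excluded_form` are hypothetical names, and agreement on a finite range is evidence for the theorem, not a proof of it.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute-force search for integers 0 <= x <= y and z >= 0 with x^2 + y^2 + z^2 = n."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            r = n - x * x - y * y
            z = isqrt(r)
            if z * z == r:
                return True
    return False


def has_excluded_form(n):
    """True if n = 4^a (8k + 7) for some integers a, k >= 0."""
    while n > 0 and n % 4 == 0:
        n //= 4
    return n % 8 == 7


exceptions = [n for n in range(1, 100) if not is_sum_of_three_squares(n)]
print(exceptions)
# [7, 15, 23, 28, 31, 39, 47, 55, 60, 63, 71, 79, 87, 92, 95]
print(all(has_excluded_form(n) for n in exceptions))
```

Each listed exception is either of the form $8k + 7$ or four times such a number (28, 60, 92), in line with Theorem 5.4.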
Unfortunately, there is no comparably elementary proof for the remaining half of this theorem because no identity like Equation 3.4 or Equation 4.3 exists for three squares. For a counterexample, take a number congruent to 3 and a number congruent to 5 (mod 8), such as 3 and 5 themselves. Trivially, $3 = 1^2 + 1^2 + 1^2$, and $5 = 2^2 + 1^2 + 0^2$. However, $3 \cdot 5 = 15 \equiv 7 \pmod{8}$, which we just showed is not representable. Consequently, the proof requires methods we have not discussed thus far. We simply leave it as a topic for further investigation, providing only a brief outline to serve as a starting point.

The proof we have in mind is based on the theory of quadratic forms, viewed through the lens of linear algebra. The following is a list of informal definitions of the principal terms.


A quadratic form is a homogeneous polynomial of degree 2 in $k$ variables. In the general case, we denote a quadratic form as $Q(x_1, x_2, \ldots, x_k) = \sum_{i=1}^{k} \sum_{j=1}^{k} a_{ij} x_i x_j$. One may regard the input of $Q$ as a $k$-tuple $\mathbf{x}$ in $\mathbb{R}^k$. For the proof, we assume that the coefficients $a_{ij}$ are integers.

Let $Q_1(\mathbf{x})$ and $Q_2(\mathbf{y})$ be quadratic forms. They are equivalent if, for some $k \times k$ integer matrix $C = (c_{ij})$ with determinant 1, the substitution $x_i = \sum_{j=1}^{k} c_{ij} y_j$ for all $i$ transforms $Q_1$ into $Q_2$. The reader may verify that this relation indeed constitutes an equivalence relation.

A positive definite quadratic form is one whose values are all positive; that is, for all $\mathbf{x} \neq \mathbf{0}$, $Q(\mathbf{x}) > 0$.

The discriminant of a quadratic form $Q$ is the determinant of the coefficient matrix $A = (a_{ij})$, denoted as $d(Q)$.

From these definitions, one must demonstrate certain elementary properties of binary (two-variable) and ternary (three-variable) quadratic forms, ultimately proving the following lemma:

Lemma 5.7. Every positive definite ternary quadratic form $Q$ with $d(Q) = 1$ is equivalent to a sum of three squares.

To begin proving Theorem 5.6, let us re-examine the possible congruences of a sum of three squares modulo 8. If a number $N \equiv 7 \pmod{8}$, then $N$ is not representable, so we ignore this case. Suppose now that $N = 4^n d$, where 4 does not divide $d$. Clearly, if $d$ is representable, then $N$ is representable. Therefore, it is sufficient to prove that any $N$ not congruent to 0, 4, or 7 (mod 8) is representable. Operating under the previous lemma, one must then show that such $N$ can be represented by a positive definite ternary quadratic form with discriminant 1. The proof would then be complete.

Acknowledgments. I would like to thank my graduate student mentors, Asaf Hadari and Rita Jiménez. Their boundless knowledge, patience, and enthusiasm enabled me to extract the most out of this project. I would also like to thank Professor Peter May for organizing the 2009 REU, providing the perfect opportunity to pursue extracurricular mathematics.