Mathematics Magazine Vol. 75, No. 3, June 2002


EDITORIAL POLICY

Mathematics Magazine aims to provide lively and appealing mathematical exposition. The Magazine is not a research journal, so the terse style appropriate for such a journal (lemma-theorem-proof-corollary) is not appropriate for the Magazine. Articles should include examples, applications, historical background, and illustrations, where appropriate. They should be attractive and accessible to undergraduates and would, ideally, be helpful in supplementing undergraduate courses or in stimulating student investigations. Manuscripts on history are especially welcome, as are those showing relationships among various branches of mathematics and between mathematics and other disciplines.

A more detailed statement of author guidelines appears in this Magazine, Vol. 74, pp. 75-76, and is available from the Editor or at www.maa.org/pubs/mathmag.html. Manuscripts to be submitted should not be concurrently submitted to, accepted for publication by, or published by another journal or publisher.

Submit new manuscripts to Frank A. Farris, Editor, Mathematics Magazine, Santa Clara University, 500 El Camino Real, Santa Clara, CA 95053-0373. Manuscripts should be laser printed, with wide line spacing, and prepared in a style consistent with the format of Mathematics Magazine. Authors should mail three copies and keep one copy. In addition, authors should supply the full five-symbol 2000 Mathematics Subject Classification number, as described in Mathematical Reviews.

AUTHORS

Michael Naylor is a former circus performer, who now juggles patterns in the Mathematics Department at Western Washington University. His interests include geometry, mathematics history, elementary mathematics education, and music.

Grant Cairns studied electrical engineering at the University of Queensland, Australia, before doing a doctorate in differential geometry in Montpellier, France, under the direction of Pierre Molino. He benefited from two years as an assistant at the University of Geneva, and a one-year postdoctorate at the University of Waterloo, before coming to La Trobe University, Melbourne, where he is now an associate professor. When he is not being generally enthusiastic about all matters mathematical, his time is devoted to his sons Desmond and Maxwell, and his beautiful wife Romana.

Dan Kalman received his Ph.D. from the University of Wisconsin in 1980. He joined the mathematics faculty at American University in 1993, following an eight-year stint in the aerospace industry (working on automatic differentiation, among other things) and earlier teaching positions in Wisconsin and South Dakota. Kalman has won three MAA writing awards, is on the Editorial Board of FOCUS, and served a term as Associate Executive Director of the MAA. His interests include matrix algebra, curriculum development, and interactive computer environments for exploring mathematics, especially using Mathwright software.

Cover image: A Clockwork Sunflower, by Jason Challas and Frank Farris. The disk flowers of the sunflower are drawn using the algorithm from Naylor's article in this issue, with circles of varying size instead of dots; near the outer edge, the circles give way to curves with 6-fold symmetry to mimic the disk flowers that have opened. The petals of the ray flowers around the edge show the human touch. Jason Challas lectures on computer art at Santa Clara University.
Vol. 75, No. 3, June 2002

MATHEMATICS
MAGAZINE

EDITOR
Frank A. Farris
Santa Clara University
ASSOCIATE EDITORS
Glenn D. Appleby
Santa Clara University
Arthur T. Benjamin
Harvey Mudd College
Paul J. Campbell
Beloit College
Annalisa Crannell
Franklin & Marshall College
David M. James
Howard University
Elgin H. Johnston
Iowa State University
Victor J. Katz
University of District of Columbia
Jennifer J. Quinn
Occidental College
David R. Scott
University of Puget Sound
Sanford L. Segal
University of Rochester
Harry Waldman
MAA, Washington, DC
EDITORIAL ASSISTANT
Martha L. Giannini
MATHEMATICS MAGAZINE (ISSN 0025-570X) is published by the Mathematical Association of America at 1529 Eighteenth Street, N.W., Washington, D.C. 20036 and Montpelier, VT, bimonthly except July/August.
The annual subscription price for MATHEMATICS MAGAZINE to an individual member of the Association is $131. Student and unemployed members receive a 66% dues discount; emeritus members receive a 50% discount; and new members receive a 20% dues discount for the first two years of membership.

Subscription correspondence and notice of change of address should be sent to the Membership/Subscriptions Department, Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036. Microfilmed issues may be obtained from University Microfilms International, Serials Bid Coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106.

Advertising correspondence should be addressed to Dave Riska ([email protected]), Advertising Manager, the Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036.

Copyright © by the Mathematical Association of America (Incorporated), 2002, including rights to this journal issue as a whole and, except where otherwise noted, rights to each individual contribution. Permission to make copies of individual articles, in paper or electronic form, including posting on personal and class web pages, for educational and scientific use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the following copyright notice:
Copyright the Mathematical Association of America 2002. All rights reserved.
Abstracting with credit is permitted. To copy otherwise, or to republish, requires specific permission of the MAA's Director of Publication and possibly a fee.

Periodicals postage paid at Washington, D.C. and additional mailing offices.

Postmaster: Send address changes to Membership/Subscriptions Department, Mathematical Association of America, 1529 Eighteenth Street, N.W., Washington, D.C. 20036-1385.

Printed in the United States of America



Golden, $\sqrt{2}$, and $\pi$ Flowers: A Spiral Story


MICHAEL NAYLOR
Western Washington University
Bellingham, WA 98226
[email protected]

Fibonacci numbers and the golden ratio are ubiquitous in nature. The number $(1 + \sqrt{5})/2$ seems an unlikely candidate for what is arguably the most important ratio in the natural world, yet it possesses a subtle power that drives the arrangements of leaves, seeds, and spirals in many plants from vastly different origins. This story is something like these spirals, twisting and turning in one direction and then another, crisscrossing themes and ideas over and over again. We begin with a mathematical model for making these spirals. Many spirals in nature use the golden ratio, but something beautiful happens when we replace that ratio with some other famous irrational numbers. Another twist takes us to rational approximations and continued fractions. Let us follow these spirals into the beautiful world of irrational numbers.

Seed spirals

When a plant such as a sunflower grows, it produces seeds at the center of the flower and these push the other seeds outward. Each seed settles into a location that turns out to have a specific constant angle of rotation relative to the previous seed. It is this rotating seed placement that creates the spiraling patterns in the seed pod [7, p. 176].
These spirals can be very neatly simulated as follows: Let's say there are $k$ seeds in the arrangement, and call the most recent seed 1, the previous seed 2, and so on, so that the farthest seed from the center is seed number $k$. As an approximation, if each seed has an area of 1, then the area of the circular face is $k$, and the radius is $\sqrt{k/\pi}$. The distance from the center of the flower to each seed, then, should vary proportionally to the square root of its seed number. If we call the angle $\alpha$, since the angle between any two consecutive seeds is constant, the angle of seed $k$ is simply $k\alpha$. We now have a simple way to describe the location of any seed with polar coordinates: $r = \sqrt{k}$, $\theta = k\alpha$.
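
The polar description above translates almost directly into code. The following is a minimal sketch, not the author's software: the function name `seed_positions` and the choice to measure the angle in turns (fractions of a revolution) and counterclockwise are assumptions made for illustration.

```python
import math

def seed_positions(k_max, alpha_turns):
    """Cartesian positions of seeds 1..k_max with r = sqrt(k), theta = k * alpha.

    alpha_turns is the rotation angle in revolutions (an assumed convention);
    angles are measured counterclockwise here, which only mirrors the picture.
    """
    points = []
    for k in range(1, k_max + 1):
        r = math.sqrt(k)                        # distance grows like sqrt(k)
        theta = 2 * math.pi * k * alpha_turns   # convert turns to radians
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# Example: 100 seeds with an angle of 1/8 of a revolution (45 degrees).
pts = seed_positions(100, 1 / 8)
```

Feeding the returned points to any scatter-plot routine should reproduce pictures of the kind discussed in this article.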

Figure 1 Growing a seed spiral
Figure 2 The first 9 seeds

Figure 3 The first 100 seeds

Here's an example of a spiral formed with an angle $\alpha = 45°$, or 1/8 of a complete rotation. Seed 1 is located at a distance of $\sqrt{1}$ and an angle of 45° (clockwise, in this example). The next seed is located 45° from this seed, or 2 * 45° = 90° from the zero line; its distance is $\sqrt{2}$. Seed 3 is located at 3 * 45° at a distance of $\sqrt{3}$.
Continuing in this manner, the eighth seed falls on the 0° line, the ninth seed is on the same line as seed 1, and so on (see FIGURE 2). FIGURE 3 shows what the spiral looks like with 100 seeds. It's easy to see the spiral near the center, but the pattern gets lost farther out as the eight radial arms become prominent. Notice how close together the seeds become, and how much space there is between rows of seeds; this is not a very even distribution of seeds. We can get a better distribution of seeds by choosing an angle that keeps the seeds from lining up so readily. If we try an angle of 0.15 revolutions (or 54°), the result is better, especially for the first few dozen seeds, but again we end up with radial arms, 20 this time (see FIGURE 4). Since 0.15 = 3/20, the 20th seed will be rotated 20 * 3/20 rotations, or 3 complete rotations to bring it to the 0° line. An angle of 0.48 results in 25 radial arms (see FIGURE 5), since the 25th seed will be positioned at an angle of 25 * 0.48 rotations, or 12 complete rotations, and the cycle begins anew.

Figure 4 angle = 0.15

Figure 5 angle = 0.48
Clearly, if the angle is any rational fraction of one revolution, say $a/b$, seed $b$ will fall on the 0° line, since the angle $ab/b$ is an integral number of complete rotations. Therefore the pattern will repeat, radial arms will be formed, and the distribution will be far from ideal. The best choice, then, would be an irrational angle: we are then guaranteed that no seed will fall on the same line as any other seed.
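
The rational case can be checked with exact arithmetic. This is a small sketch under stated assumptions: the names `radial_arms` and `distinct_directions` are hypothetical, and angles are given in revolutions as exact fractions or decimal strings so that no rounding intervenes.

```python
from fractions import Fraction

def radial_arms(angle_turns):
    """Number of radial arms for a rational angle: its reduced denominator."""
    return Fraction(angle_turns).denominator

def distinct_directions(angle_turns, n_seeds):
    """Count the distinct directions occupied by seeds 1..n_seeds."""
    alpha = Fraction(angle_turns)
    # Only the fractional part of k * alpha matters for the direction.
    return len({(k * alpha) % 1 for k in range(1, n_seeds + 1)})

print(radial_arms(Fraction(1, 8)))        # the 45-degree example
print(radial_arms("0.15"))                # 0.15 = 3/20
print(radial_arms("0.48"))                # 0.48 = 12/25
print(distinct_directions("0.15", 100))   # never exceeds the denominator
```

Passing the angle as a string such as `"0.15"` keeps it exact; a binary float like `0.15` would not reduce to 3/20.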

Golden flowers

The irrational angle most often observed in plants is the golden ratio, $\phi = (1 + \sqrt{5})/2$, or approximately 1.618. This angle drives the placement of leaves, stalks, and seeds in pine cones, sunflowers, artichokes, celery, hawthorns, lilies, daisies, and many, many other plants [5, pp. 155-66; 2, pp. 90-105; 1, pp. 81-113]. With this angle of rotation, each seed is rotated approximately 1.618 revolutions from the previous seed, which is the same as 0.618 revolutions, or about 61.8% of a complete turn (approximately 222.5°). For our purposes, only the fractional part of the angle is significant and the whole number portion can be ignored. FIGURE 6 shows 1000 simulated seeds plotted with this angle of rotation, an arrangement we will call a golden flower. Notice how well distributed the seeds appear; there is no clumping of seeds and very little wasted space. Even though the pattern grows quite large, the distances between neighboring seeds appear to stay nearly constant. In the natural world, many plants grow their seeds (or stalks or leaves or thorns) simply where there is the most room [5, p. 161]. The resulting golden flower is the most even distribution possible [1, pp. 84-88; 6, pp. 96-99]. (For an excellent discussion of the mechanics of the placement of seeds in a growing plant apex and the inevitability of these golden arrangements, see Mitchison [3, pp. 270-75].)

Figure 6 1000 seeds in a golden flower

Notice also the many different spiral arms. Spiral arms seem to fall into certain families. In the pattern above, you can see how a group of spirals twists in one direction, only to be taken over by another group of spirals twisting in the opposite direction. The interesting properties of spiral families form the heart of our discussion.
FIGURE 7 shows three families of spirals in the golden spiral. Each set of 300 seeds pictured is identical, but different spiral arms have been drawn on each set. The first set shown consists of 8 spiral arms, the second has 13, and the third 21, all Fibonacci numbers. You may be able to see other spirals not shown in these images, and the sizes of these groups are Fibonacci numbers as well.

Figure 7 Spiral families 8, 13, and 21

To understand why spirals on a golden flower appear in groups whose sizes are Fibonacci numbers, it helps to consider placement of individually numbered seeds. In FIGURE 8, the first 144 seeds are numbered and the Fibonacci numbers are enclosed in rectangles. The baseline at 0° has also been added.

Figure 8 Fibonacci seeds

The Fibonacci numbered seeds converge on the 0° line, alternating above and below, just as the ratios of pairs of consecutive Fibonacci numbers converge to $\phi$, alternately greater and less than $\phi$. A seed that is numbered with a Fibonacci number will fall close to the zero degree line, since its angle (a Fibonacci number times $\phi$) is approximately an integer. For example, since 55/34 is approximately $\phi$, seed 34 will be located at an angle of about 34 * (55/34), or very nearly 55 complete rotations (actually $\approx 55.013$ rotations, a slight over-rotation). The larger the Fibonacci numbers involved, the closer their ratio is to $\phi$ and therefore the closer the seeds lie to the zero degree line.
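
The alternating approach to the zero line is easy to observe numerically. The sketch below is illustrative only; the helper name `offset_from_zero_line` is an assumption, and the sign convention (positive for over-rotation) is a choice made here, not one fixed by the article.

```python
import math

PHI = (1 + math.sqrt(5)) / 2

def offset_from_zero_line(k, alpha=PHI):
    """Signed distance (in turns) of seed k from the zero line.

    Positive means a slight over-rotation, negative a slight under-rotation.
    """
    turns = k * alpha
    return turns - round(turns)

fibs = [5, 8, 13, 21, 34, 55, 89, 144]
offsets = [offset_from_zero_line(f) for f in fibs]
for f, off in zip(fibs, offsets):
    print(f, round(f * PHI, 3), round(off, 4))
```

The printed offsets shrink in size and flip sign from one Fibonacci number to the next, which is the numerical counterpart of the seeds landing alternately on either side of the baseline.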
It is for this reason that seeds in each spiral arm in a golden flower differ by multiples of a Fibonacci number. For example, seed 34 is slightly over-rotated past the 0° line, seed 68 is rotated by the same angle from seed 34, as are seeds 102, 136, 170, and every other multiple of 34. These seeds form one spiral arm in family 34. Another arm in this family is 1, 35, 69, 103, ..., and another is 2, 36, 70, 104, ..., etc. Members of an arm in family 34 are seeds with numbers $34m + n$, where $m$ and $n$ are nonnegative integers and $n$ is constant for that arm. Trace any spiral arm in the golden flower and you will find that its seed numbers are in arithmetic progression, since all share a common difference that is a Fibonacci number.

$\pi$ flowers

Why should the golden ratio be the preferred irrational number in nature? Shouldn't any irrational number work just as well? Let's take a look at a simulated seed pod generated with an angle of $\pi$ rotations, or $\pi$ * 360°. This angle is $\approx 3.14159$ revolutions, which is the same as $\approx 0.14159$ revolutions, or $\approx 50.97°$. FIGURE 9 shows the first 500 seeds; not a very even distribution at all! Seven spiral arms dominate the pattern with no new spirals apparent. With 10,000 seeds (FIGURE 10), a new set of spirals becomes visible, 113 arms in this family with so little curvature that the next set of spirals doesn't show until about a million seeds have been grown.

Figure 9 500 seeds, angle = $\pi$

Figure 10 10,000 seeds, angle = $\pi$

Why should there be 7 spiral arms so prominently displayed in the center, and 113 arms in the next set of spirals? Perhaps you recognize these numbers as denominators in well-known rational approximations of $\pi$. An excellent approximation of $\pi$ is 22/7. The decimal expansion of 1/7 is 0.142857... and the angle of rotation in a $\pi$ flower is 0.14159..., a close match! Another great approximation of $\pi$ is 355/113, accurate to 6 decimal places, and for this reason the next set of spirals has 113 arms.
The gap between these spiral families (7 and 113) in a $\pi$ flower is huge compared to that of a golden flower. No other sets of spirals are apparent between family 7 and family 113; does this mean that there are no better rational approximations of $\pi$ with denominators between 7 and 113? Plotting and numbering the seeds in a $\pi$ flower suggests an answer. In FIGURE 11, seed 7 in the first spiral arm falls near the 0° line as expected, as does seed 113. Since seed 113 is part of the second spiral arm to cross this line, there is no seed less than 113 that lies closer to the 0° line than seed 7, and thus there is no better rational approximation of $\pi$ than 22/7 with a denominator less than 113. The approximation 355/113 is so accurate that the spirals in family 113 have very little curvature and their members dominate the 0° line for generations. The nearest seed in the third arm to cross is seed 226, part of the same arm as seed 113. In fact, we need to check tens of thousands of seeds before we find one that falls closer to the 0° line than any multiple of 113, a topic we will visit again later.
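
The search for seeds that fall ever closer to the zero line can be run as a brute-force scan. This is a sketch under stated assumptions: the name `closest_seed_records` is hypothetical, closeness is measured purely by angle (in turns), and ordinary floating-point arithmetic is assumed to be accurate enough for seed numbers of this size.

```python
import math

def closest_seed_records(alpha, k_max):
    """Seed numbers that set a new record for angular closeness to the zero line."""
    records = []
    best = float("inf")
    for k in range(1, k_max + 1):
        turns = k * alpha
        dist = abs(turns - round(turns))   # angular distance from the zero line
        if dist < best:
            best = dist
            records.append(k)
    return records

records = closest_seed_records(math.pi, 40000)
print(records)
```

Measured this way, seed 106 also appears in the list: its angular offset is only marginally smaller than that of seed 7, which fits the later remark that 333/106 is itself an approximation of $\pi$ whose spirals are overshadowed by family 113. After 113 no seed sets a new record until the tens of thousands.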

Figure 11 A numbered $\pi$ flower

$\sqrt{2}$ flowers

An angle of rotation of $\sqrt{2}$ produces a very even distribution of seeds, rivaling that of the golden ratio. Five hundred seeds are shown in FIGURE 12; families of spirals are again readily apparent in this arrangement. A study of these $\sqrt{2}$ spirals is worthwhile, as their structure illuminates many properties of algebraic numbers and seed spirals in general.

Figure 12 A root-two spiral

FIGURE 13 shows the results of a brute-force analysis: the first 12 families of spirals in a $\sqrt{2}$ spiral. Family 1 is made by connecting the seeds in order, family 2 is made by connecting seeds whose numbers differ by 2, family 3 by connecting seeds whose numbers differ by 3, etc. Study these spiral families for a moment. Notice that some of the families produce very clean spiral arms, while others do not appear to be spirals at all, crossing themselves in a star-like or even scribble-like pattern.

Figure 13 Spiral families 1-12 of a $\sqrt{2}$ spiral

Figure 14 Selected families of the $\sqrt{2}$ spiral (families 17, 19, 22, 29, 41, and 46)

Families 2 and 3 start well but quickly die, that is, they cross themselves after a small number of iterations. Families 5 and 7 are smooth. Family 10 looks like smooth spirals, but on closer examination it is seen that it crosses itself immediately; since 10 is a multiple of 5, this is the same as family 5 but with alternate seeds on the arms connected. Family 12 has the best looking spirals among the first 12.
The numbers of the families that produce nice spirals look suspicious: 2, 3, 5, 7, 12, ... Could there be a Fibonacci-like relationship between spiral families in a $\sqrt{2}$ spiral as well? More spiral families are shown in FIGURE 14, but the next spiral family after 12 is not 19 as we might expect by adding 7 and 12, but rather 17, and the next family better than 17 is 29. The sequence is in fact: 1, 2, 3, 5, 7, 12, 17, 29, 41, 70, 99, ...
Before reading further, can you find the pattern in this sequence and extend it?
The numbers in this sequence are the numbers in the Columns of Pythagoras. The Columns of Pythagoras are a pair of columns of integers. The top entry in each column is 1. Given a row with numbers $A$ and $B$ in that order, the next row is generated by summing $A$ and $B$ and writing this number, $C$, in the first column underneath $A$, then summing $A$ and $C$ and writing it in the second column underneath $B$. This process generates all of the spiral families of the $\sqrt{2}$ flower (see FIGURE 15). Further, the ratio of the numbers in each row converges to $\sqrt{2}$: 1/1 = 1, 3/2 = 1.5, 7/5 = 1.4, 17/12 = 1.41666..., 41/29 = 1.41379..., etc.
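
The row rule just described is short enough to state as code. This is a minimal sketch; the function name `columns_of_pythagoras` and the tuple representation of a row are assumptions made for illustration.

```python
import math

def columns_of_pythagoras(n_rows):
    """Rows (A, B) of the Columns of Pythagoras, starting from (1, 1)."""
    a, b = 1, 1
    rows = [(a, b)]
    for _ in range(n_rows - 1):
        c = a + b        # new first-column entry: A + B
        d = a + c        # new second-column entry: A + C
        a, b = c, d
        rows.append((a, b))
    return rows

rows = columns_of_pythagoras(6)
for a, b in rows:
    print(a, b, b / a)   # the ratio B / A approaches sqrt(2)

# Reading both columns together gives the spiral-family sequence.
families = sorted({n for row in rows for n in row})
print(families)
```

Listing the entries of both columns in increasing order recovers the sequence 1, 2, 3, 5, 7, 12, 17, 29, 41, 70, 99 quoted above.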

Continued fractions

Let us follow one more twist on this spiraling journey. The golden ratio may be written
as the following continued fraction:

$$\frac{1 + \sqrt{5}}{2} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$$

Figure 15 The Columns of Pythagoras

This result is easily verified by setting the continued fraction equal to some variable, say $x$, and then recognizing that $x$ is repeated in the denominator of the fractional part, that is, $x = 1 + 1/x$. This expresses the continued fraction perfectly as one root of an easily evaluated quadratic.
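
Written out, the verification is a two-line computation. The block below is a worked sketch of that step; discarding the negative root is justified here by the observation that every term of the continued fraction is positive.

```latex
x = 1 + \frac{1}{x}
\;\Longrightarrow\; x^2 - x - 1 = 0
\;\Longrightarrow\; x = \frac{1 \pm \sqrt{5}}{2},
\qquad
x > 0 \;\Longrightarrow\; x = \frac{1 + \sqrt{5}}{2} \approx 1.618 .
```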
Partial evaluations of this continued fraction, called convergents, result in ratios of
Fibonacci numbers, that is,

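$$1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1}}} = \frac{5}{3}, \qquad 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1}}}} = \frac{8}{5}, \qquad 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1}}}}} = \frac{13}{8}.$$
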
The reader may enjoy checking that the following continued fraction gives an expression for $\sqrt{2}$:

$$\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}$$

Partial evaluations of this continued fraction yield the following ratios:

$$1 + \frac{1}{2} = \frac{3}{2}, \qquad 1 + \cfrac{1}{2 + \cfrac{1}{2}} = \frac{7}{5}, \qquad 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}} = \frac{17}{12}, \qquad 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}} = \frac{41}{29}.$$

These are the same numbers in the Columns of Pythagoras and the same ratios found in the $\sqrt{2}$ flower!
Let us examine the continued fraction for the other irrational number we have used to build flowers, $\pi$. The continued fraction begins $3 + 1/(7 + \cdots)$ and the values of numbers leading the expressions under the denominators at each level, starting with the 7, are: 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, ...

Looking at the partial evaluations yields rational approximations to $\pi$ that reflect the number of spiral arms in the $\pi$ flower:

$$\pi \approx 3 + \frac{1}{7} = \frac{22}{7},$$

$$\pi \approx 3 + \cfrac{1}{7 + \cfrac{1}{15}} = 3 + \cfrac{1}{\frac{106}{15}} = 3 + \frac{15}{106} = \frac{333}{106},$$

$$\pi \approx 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1}}} = 3 + \cfrac{1}{7 + \cfrac{1}{16}} = 3 + \cfrac{1}{\frac{113}{16}} = 3 + \frac{16}{113} = \frac{355}{113}.$$

Remember the 113 arms in the $\pi$ spiral? It would take a lot of seeds to begin to find the next series of spirals; the next partial evaluation explains why:

$$\pi \approx 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292}}}} = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \frac{292}{293}}} = 3 + \cfrac{1}{7 + \frac{293}{4687}} = 3 + \frac{4687}{33102} = \frac{103993}{33102}.$$

The next family of spirals past family 113 is family 33102. We would need 33,102 seeds just to get one seed in each spiral arm! If we plot about a million seeds we may be able to start seeing these spirals; however, there would be nearly 92 spirals packed into each degree of arc of the circular face. An illustration 10 cm in diameter would have over 1000 spiral arms in each cm of the circumference, so the illustration would appear to be nothing other than a black circle! (Note that 333/106 is also an approximation of $\pi$. However, due to the closeness of a better approximation, 355/113, the set of 106 spiral arms is immediately obscured by the set of 113 spiral arms.)
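
The partial evaluations used throughout this section can be computed exactly from the leading numbers of a continued fraction. The sketch below assumes a hypothetical helper `convergents` that takes those numbers as a list (whole-number part first) and evaluates each truncation from the bottom up with exact fractions.

```python
from fractions import Fraction

def convergents(quotients):
    """Exact partial evaluations of a continued fraction.

    quotients lists the leading numbers, e.g. [3, 7, 15, 1, 292] for pi.
    """
    result = []
    for n in range(1, len(quotients) + 1):
        # Evaluate the truncation quotients[:n] from the innermost level outward.
        value = Fraction(quotients[n - 1])
        for a in reversed(quotients[:n - 1]):
            value = a + 1 / value
        result.append(value)
    return result

print(convergents([3, 7, 15, 1, 292]))   # pi
print(convergents([1, 1, 1, 1, 1, 1]))   # golden ratio
print(convergents([1, 2, 2, 2, 2]))      # square root of 2
```

The denominators of the fractions produced for $\pi$ are the spiral-family sizes discussed above, and the other two calls reproduce the Fibonacci ratios and the rows of the Columns of Pythagoras.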
The continued fraction expansion for the golden ratio uses the smallest possible numbers in the expansion, namely 1s. Therefore, its rational approximations converge to it more slowly than those of any other number. In this sense, the golden ratio is the most irrational number and therefore gives the best possible distribution [6, pp. 96-99].

More... Given that seed spirals are easily plotted using polar coordinates, you may wish to create your own irrational flowers using mathematics software. Software (Mac OS) used to create many of the images seen here is also available to download for free from the author's web page at http://www.wwu.edu/~mnaylor.

REFERENCES
1. T. Cook, The Curves of Life, 1914. Reprint 1979, Dover Press.
2. S. Colman, Nature's Harmonic Unity, Benjamin Blom, Inc., New York, 1912.
3. G. J. Mitchison, Phyllotaxis and the Fibonacci series, Science 196 (April 1977), 270-275.
4. K. Niklas, Plant Biomechanics, University of Chicago Press, Chicago, 1992.
5. P. Stevens, Patterns in Nature, Little, Brown, and Co., New York, 1974.
6. I. Stewart, Daisy, Daisy, give me your answer, do, Scientific American (January 1995), 96-99.
7. D. Thompson, On Growth and Form, abridged version, Cambridge University Press, London, 1971.

Pillow Chess
GRANT CAIRNS
La Trobe University
Melbourne, Australia 3086

In cylindrical chess, one plays on a regular chessboard with the pieces in their standard positions, but imagines that the left and right edges of the board are identified. So, for example, when a rook travels horizontally off the right-hand edge, it reappears in the same row on the left-hand side of the board. It is easy to make a cylindrical chessboard out of paper, by taping together the left- and right-hand edges. However, to avoid the pieces falling off, the simplest way to play cylindrical chess is to use the regular (flat) chessboard and just remember the edge identification. Like 3-dimensional chess, cylindrical chess is well over 100 years old. According to Pritchard [59] it dates to the early eighteenth century. Byzantine or Round chess is played on a cylindrical 4 x 16 board, and is possibly a thousand years old [59, 77, 53].
In chess on a torus, one identifies the left and right edges and also the top and bottom edges, as in FIGURE 1a. These identifications really do produce the torus; the left and right edge identification gives a cylinder, and the top and bottom edge identification amounts to connecting up the ends of the cylinder (see Barr [3] or Stillwell [70] for drawings of this construction). Once again, for playing toroidal chess, it is best to use the regular chessboard and just remember the edge identifications. The origins of toroidal chess are not clear, but it goes back at least as far as Pólya's 1918 paper [57].
One can also play chess on the Klein bottle; the situation is similar to the torus, but the horizontal edge identifications involve a reflection (FIGURE 1b): as a rook travels up from h8, it reappears in a1. For games on the Klein bottle, there are even stronger reasons to use the regular chessboard; the Klein bottle can't be constructed in 3-dimensional Euclidean space without self-intersections [3].
The projective plane is another surface that is commonly represented by a square with edge identifications, as in FIGURE 1c. The projective plane is the space of straight lines through the origin in 3-dimensional Euclidean space. It can be obtained from the sphere by identifying antipodal points. Like the Klein bottle, it is not orientable and can't be realized in 3-dimensional Euclidean space without self-intersections. See Barr [3], Stillwell [70] or Prasolov [58] for more information about representing surfaces. An alternative projective chess is obtained from the traditional board by adding squares at infinity [56, Chapter 6.16]. This is also called projected chess [59].
The "fairy chess" games of FIGURE 1 are easy to understand; conceptually they are
really only a small variation on traditional chess since after all, the usual board can be

Figure 1 a Torus Figure 1 b K l ei n bottle Figure 1 c Projective p l ane


1 74 MATH EMAT I CS MAGAZI N E
thought of as being composed of 64 squares, with rules that tell you how the squares are
connected up. The game is quite different in practice; for starters, one wouldn't want
to commence one of these chess games with the pieces in their traditional positions,
since the kings would begin on adjacent squares! Pritchard [59] gives various possible
starting positions on the torus. Also, on the Klein bottle and the projective plane, one
doesn't have the usual black-white pattern; the black square a1 is immediately above
the black square h8. We could fix this if we were willing to use a 9 x 8 board, for
example.

The pillow board

In this paper, we introduce pillow chess, a way to play chess on a surface equivalent to
the sphere where one can play with the standard pieces in their traditional positions.
The edge identifications are shown in FIGURE 2; the left and right edges are identified
with each other, while the top and bottom edges are identified with themselves. So,
for example, when a rook travels up the "b" column and off the top edge of b8, it
reappears in g8. Notice that unlike the torus chessboard, the board depicted in FIGURE 2
isn't homogeneous; like the projective plane chessboard, it has corners. The midpoint
of the bottom edge is such a corner: when you attempt to circle that point, the total
angle traversed is only $\pi$. This also occurs at the midpoint of the top edge, at the
extreme top left point (which is identified with the extreme top right point), and at the
extreme bottom left point (which is identified with the extreme bottom right point).

Figure 2 The pillow chessboard    Figure 3 Example of moves

A little work with the Euler characteristic of surfaces such as these will show that the
existence of 4 corners is virtually forced. Indeed, for any choice of edge identifications,
the squares of the chessboard form a cell decomposition of the resulting surface S. For
each integer $i \ge 1$, let $n_i$ denote the number of zero-cells (that is, points) that are
adjacent to i two-cells (squares). Then there are $\sum_{i \ge 1} n_i$ zero-cells,
$\sum_i \tfrac{i}{2} n_i$ one-cells, and $\sum_i \tfrac{i}{4} n_i$ two-cells, and the Euler
characteristic [48] is

$$\chi(S) = \sum_i n_i - \sum_i \tfrac{i}{2}\, n_i + \sum_i \tfrac{i}{4}\, n_i = \sum_i \tfrac{4 - i}{4}\, n_i .$$

If we agree only to form boards with $n_i = 0$ for i other than 2 and 4, we get $\chi(S) = \tfrac{n_2}{2}$,
and because $\chi(S) \le 2$, this gives $n_2 = 4$, 2 or 0. The case $n_2 = 2$ is the projective
plane. In the case $n_2 = 4$, we get the sphere with 4 corners, which we call a pillow,
for obvious reasons. One can form hyperbolic boards by allowing $n_i \ne 0$ for $i \ne 2, 4$,
but then one loses the concepts of horizontal and vertical, as long as one persists with
squares that are really square. Boards paved by triangles or hexagons have been studied
[59, 33, 7], but we will stay with squares.
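
As a worked check of this count (a sketch, assuming the standard 8 x 8 pillow board with its four corners of type i = 2 and every other vertex of type i = 4), the 64 squares determine $n_4$, and the alternating sum of cells then gives the Euler characteristic of the sphere:

```latex
\begin{align*}
\tfrac{2}{4}\,n_2 + \tfrac{4}{4}\,n_4 &= 64, \qquad n_2 = 4 \;\Longrightarrow\; n_4 = 62,\\
\chi(S) &= (n_2 + n_4) - (n_2 + 2 n_4) + \bigl(\tfrac{1}{2} n_2 + n_4\bigr)
         = 66 - 128 + 64 = 2 .
\end{align*}
```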
As in the other fairy chessboards described above, the pieces in pillow chess have
a greater command of the board than in traditional chess; see FIGURE 3. For example,
the knight at g1 can reach h1 by moving two squares to the right and one square down.
The bishops travel along parallel lines: if the bishop moves from d7 through e8, it
reappears in c8, travelling along the parallel line through b7. (The bishops don't bounce
off the top edge as in the reflecting queens introduced by Klarner [37]; for more on
reflecting queens, see articles by Guy [27, Section C18] or Gardner [24, Chapter 15],
and Klarner's Mathematical Review [38] of Huff's paper.) To get used to pillow chess,
we suggest considering the pillow chess problem posed in FIGURE 4; the solution is
given at the end of the paper. When considering this problem, it may assist the reader
to imagine what the game looks like "across the edges" of the board; such an expanded
view is shown in FIGURE 5.

Figure 4 White to play. Black to mate in one!

The aim of this paper is to revisit, on the pillow board, two classical chess problems
of a mathematical nature: the knight's tour problem, and the n-queens problem. As
is the tradition, we do not restrict ourselves to 8 x 8 boards. However, in order to
retain the usual black-white pattern, we will restrict our attention to n x 2m boards.
First, we need to understand better the nature of the pillow board. The pillow is an
example of an orbifold. In the same way that a manifold is a space that looks locally
like Euclidean space, an orbifold is a space that looks locally like the quotient of
Euclidean space by a finite group action. In fact, the pillow was one of the first examples
treated in Thurston's book [72], where the term orbifold first appeared. Orbifolds were
previously defined by Satake [63], who called them V-manifolds. For a nice introductory
treatment of orbifolds in a somewhat more restrictive sense, see Stillwell [70,
Chapter 8].

Figure 5 Expanded view of the problem

The simplest example of an orbifold is the quotient of the plane by a finite group
of rotations about the origin; the quotient is a cone, which is smooth everywhere except
at a single conical singularity (the image of the origin). The pillow is an orbifold
which is smooth everywhere except at its 4 conical singularities. But the concept of an
orbifold is quite subtle. Indeed, just as the pillow is homeomorphic to the sphere $S^2$,
every 2-dimensional orbifold is homeomorphic to a 2-dimensional manifold; what
distinguishes the orbifold from the manifold is not just what it looks like topologically,
but what it is geometrically. In our case, the pillow has the flat Euclidean geometry,
and each of the 4 conical singularities has angle $\pi$.
The pillow is a special kind of orbifold. It is not just locally a quotient by a finite
group action; we will show that it is globally the quotient of the torus by an
action of the 2-element group $\mathbb{Z}_2$. First recall that the torus can be regarded as the
quotient space of the plane $\mathbb{R}^2$ under the action of the group $\mathbb{Z}^2$ by translation:
$(i, j) : (x, y) \to (x + i, y + j)$ [54]. In this way, one can play chess on the torus
by playing on the infinite plane and identifying appropriate squares; if the plane is
tiled by unit-square chessboards, centered on the vertices of the integer lattice $\mathbb{Z}^2$,
then one identifies the square at $(x, y) \in \mathbb{R}^2$ with the squares at $(x + i, y + j)$, for
all $(i, j) \in \mathbb{Z}^2$. The board centered at (0, 0) is a fundamental domain [54] for the $\mathbb{Z}^2$
action. Every chess piece on the torus board is thus represented by infinitely many
replicas, each in the corresponding position on its tile/board. (FIGURE 6 depicts the
torus with a queen and a knight.)

Figure 6 The torus as a quotient of the plane

In a manner similar to the representation of the torus as a quotient of the plane, the
pillow can be realized as a quotient of the torus. One represents the torus $\mathbb{T}^2$ as the unit
square, centered at the origin in $\mathbb{R}^2$ with opposite edges identified as in FIGURE 1a,
and one considers the rotation $a : \mathbb{T}^2 \to \mathbb{T}^2$ defined by $a(z) = -z$. The set {id, a}
defines a 2-element group isomorphic to $\mathbb{Z}_2$. The upper half of the unit square is a
fundamental domain for the action of this group, and you can convince yourself that
the identifications on the boundary of the fundamental domain are precisely those
of the pillow board. Thus, one can play chess on the pillow by playing on the torus
centered at the origin and identifying diametrically opposite squares.
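
The quotient description can be tried out directly. The following is a minimal sketch, not taken from the paper: squares are 0-indexed from the bottom left, the covering torus has twice as many rows as the pillow, and the helper names `pillow_square`, `pillow_move`, `name` and `parse` are hypothetical. It keeps the lower half of the torus as the fundamental domain, which by symmetry gives the same edge identifications as the upper half used in the text.

```python
def pillow_square(x, y, width, height):
    """Project a square of the (width x 2*height) torus onto the pillow board.

    Squares are 0-indexed with (0, 0) at the bottom left; the pillow keeps
    rows 0..height-1 of the torus.
    """
    x %= width
    y %= 2 * height
    if y >= height:
        # Half-turn z -> -z about the centre: take the diametrically
        # opposite square, which lies in the kept half.
        x, y = width - 1 - x, 2 * height - 1 - y
    return x, y


def pillow_move(square, delta, width=8, height=8):
    """Take one step `delta` from `square`, both read in board coordinates."""
    (x, y), (dx, dy) = square, delta
    return pillow_square(x + dx, y + dy, width, height)


def name(square):
    """Algebraic name such as 'g8' for a 0-indexed square on an 8-wide board."""
    return "abcdefgh"[square[0]] + str(square[1] + 1)


def parse(label):
    """Inverse of `name`: 'b8' -> (1, 7)."""
    return "abcdefgh".index(label[0]), int(label[1:]) - 1


# The examples quoted in the text for the 8 x 8 pillow board.
print(name(pillow_move(parse("b8"), (0, 1))))   # rook leaving the top of the b column
print(name(pillow_move(parse("g1"), (2, -1))))  # knight: two right, one down
print(name(pillow_move(parse("e8"), (1, 1))))   # bishop continuing d7 -> e8
```

A piece that crosses the top or bottom edge comes back travelling in the reversed direction, so a line-moving piece has to flip its step after such a crossing; the single steps above do not need that bookkeeping.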
The realization of the pillow board as a quotient of the torus is not just a curiosity;
it is a useful tool. To fix ideas, let us say that a board is a set of squares, with marked
edges, together with a rule for connecting edges in pairs. We say that two boards are
equivalent if there is a bijection between their sets of squares that respects the edge
connections. Now notice that, whereas in the above description of the pillow board as
a quotient of the torus we took the upper half of the torus as the fundamental domain,
we could equally well have taken the left-hand half as the fundamental domain. This
simple observation immediately gives:

LEMMA. The n x 2m pillow chessboard and the m x 2n pillow chessboard are equivalent.

For example, the 8 x 8 pillow is equivalent to the 4 x 16 pillow, and one can readily
experience this equivalence by playing on the 4 x 16 pillow; see FIGURE 7. From
this perspective, the 4 x 16 pillow is just the traditional Byzantine chessboard with
additional top and bottom edge identifications, and the starting chess piece positions
are also the same.

Figure 7 The 4 x 16 pillow chessboard

The knight's tour problem


The knight's tour problem is this: can a knight visit all the squares of the board exactly
once and return to its starting position? In this paper we use the word tour in the sense
of a closed or re-entrant tour. Some authors use the word tour in the more general
sense of an open tour-one not requiring the knight to return to its initial position,
and some authors refer to closed tours as circuits. As documented by Murray [52]
(see also [53] and [68]), the knight's tour problem dates back over a thousand years
to Indian chess and has numerous appearances throughout the history of the game of
chess (but not back as far as 200 BCE, as some have claimed [73, 74]). The problem
was investigated by mathematicians such as Euler [2] and Vandermonde [71]; in
modern terminology, the tour is an example of a Hamiltonian circuit [80], [5, Chapter 11].
There is a vast literature on the problem. As Kraitchik remarked [42] (in a
paper first published in 1941), "Many generalizations of the knight's problem have
been proposed. Many alterations of the size and shape of the board have already been
considered."
One recurring topic is the (open) knight's tour on the half board [52, 53]. There are
cute proofs of the impossibility of a closed tour on 4 x n boards; Honsberger gives
Pósa's proof [32, p. 145], and Gardner [22] gives a proof that he attributes to Golomb.
The connection between knight tours on the 4 x 4 board (minus one square) and the
15 puzzle was investigated in this MAGAZINE in 1993 [34]. The number of knight's
tours on the 8 x 8 board [46, 49] and the n x n board [43] have both been studied.
Heuristics for generating tours are given by Shufelt and Berliner [67]. Boards of other
shapes have also been studied [42; 47, Vol. 4]. Tours on the cylinder, Möbius band and
Klein bottle appear in the "fanciful" account of Stewart [69, Chapter 7] and are studied
in Watkins [78]. See Eggleton and Eid [17] for tours on infinite boards. For tours on
boards with hexagonal tiles, see [44].
Constructing a knight's tour on the (traditional) k x l rectangular chessboard is
classical. The following theorem is stated without proof in Kraitchik's Mathematical
Recreations [42, Chapter 11]; independent proofs were given by Cull and De
Curtins [16] (except for the k = 3 case) and Schwenk [65].

RECTANGULAR BOARD THEOREM. For $k \le l$, there is a knight's tour on the k x l
rectangular chessboard unless one or more of the following three conditions hold:
(a) k and l are both odd,
(b) k = 1, 2 or 4,
(c) k = 3 and l = 4, 6 or 8.
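
The three exclusions translate directly into a predicate. This is only a restatement of the theorem as quoted above, with a hypothetical function name; it decides existence and does not construct a tour.

```python
def has_closed_knights_tour(k, l):
    """Rectangular board theorem: is there a closed tour on the k x l board?"""
    k, l = sorted((k, l))              # the theorem is stated for k <= l
    if k % 2 == 1 and l % 2 == 1:      # (a) both sides odd
        return False
    if k in (1, 2, 4):                 # (b) shorter side is 1, 2 or 4
        return False
    if k == 3 and l in (4, 6, 8):      # (c) the small 3-row exceptions
        return False
    return True


print(has_closed_knights_tour(8, 8))   # the ordinary chessboard
print(has_closed_knights_tour(4, 9))   # a 4 x n board
```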
Another proof of the rectangular board theorem for square boards is available [15];
this also treats open tours that commence and terminate on specified squares. Open
tours on n x m boards with $\min(n, m) \ge 5$ are examined in [16]. Results for open tours
on 4 x m boards are known [41, 73], and the 3 x n case is treated elsewhere [41, 73,
75]. (We alert the reader to the typographical error in the statement of [75, Theorem 2],
in which tour should read circuit in the language of that paper.)
On the torus, Watkins and Hoenigman proved [79]:

TORAL BOARD THEOREM. For all k, l, there is a knight's tour on the k x l torus
chessboard.
We now turn to the pillow board. Note that since the n x 2m pillow board is the
quotient of the 2n x 2m toral board, it follows immediately from the previous theorem
that there is a closed knight's path on the n x 2m pillow board that visits each square
exactly twice. In fact, one has:
THEOREM 1. For all n, m, there is a knight's tour on the n x 2m pillow chessboard.

Proof. By our Lemma, we may assume that $n \le m$. By the rectangular board theorem,
it suffices to consider the cases $n \le 4$, and in the case n = 3 we need only
consider m = 3 and m = 4. Moreover, by the Lemma, the 4 x 2m board is equivalent
to the m x 8 board, so by the rectangular board theorem again, for n = 4 and $n \le m$,
we need only consider m = 4. So this leaves us with 5 cases to consider: (1) n = 1,
(2) n = 2, (3) n = m = 3, (4) n = 3, m = 4, and (5) n = m = 4. Tours on the 3 x 6
and 3 x 8 boards are shown in FIGURE 8; the 3 x 6 example was taken from Watkins
and Hoenigman's study of tours on the torus [79] (this particular tour uses only the
side edge identifications). A tour on the 4 x 8 board is shown in FIGURE 9. It remains
to deal with cases (1) and (2).

  1 16 13 10  7  4          1 16  7 22 13  4 19 10
  8  5  2 17 14 11         20 11  2 17  8 23 14  5
 15 12  9  6  3 18         15  6 21 12  3 18  9 24

Figure 8 Tours on the 3 x 6 and 3 x 8 pillow boards
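
The numbered grids of FIGURES 8 and 9 can be checked mechanically. The sketch below is an illustration under stated assumptions rather than the paper's method: the grids are transcribed with the top row first, the board is modelled as the lower half of a torus with twice as many rows, and the names `project` and `is_pillow_tour` are hypothetical.

```python
KNIGHT_MOVES = [(1, 2), (1, -2), (-1, 2), (-1, -2),
                (2, 1), (2, -1), (-2, 1), (-2, -1)]


def project(x, y, width, height):
    """Map a square of the (width x 2*height) torus to its pillow square."""
    x %= width
    y %= 2 * height
    if y >= height:                      # diametrically opposite square
        x, y = width - 1 - x, 2 * height - 1 - y
    return x, y


def is_pillow_tour(grid):
    """Check that labels 1..N in `grid` trace a closed knight's tour."""
    height, width = len(grid), len(grid[0])
    where = {}
    for row, labels in enumerate(grid):  # printed grids list the top row first
        for col, label in enumerate(labels):
            where[label] = (col, height - 1 - row)
    total = width * height
    if sorted(where) != list(range(1, total + 1)):
        return False                     # not a numbering of every square
    for label in range(1, total + 1):
        x, y = where[label]
        target = where[label % total + 1]   # label N is followed by label 1
        reachable = {project(x + dx, y + dy, width, height)
                     for dx, dy in KNIGHT_MOVES}
        if target not in reachable:
            return False
    return True


FIGURE_8_3X6 = [[1, 16, 13, 10, 7, 4],
                [8, 5, 2, 17, 14, 11],
                [15, 12, 9, 6, 3, 18]]
FIGURE_8_3X8 = [[1, 16, 7, 22, 13, 4, 19, 10],
                [20, 11, 2, 17, 8, 23, 14, 5],
                [15, 6, 21, 12, 3, 18, 9, 24]]
FIGURE_9_4X8 = [[1, 20, 11, 26, 17, 8, 23, 14],
                [24, 15, 2, 21, 12, 27, 18, 9],
                [19, 10, 25, 16, 7, 22, 13, 32],
                [30, 29, 6, 3, 28, 31, 4, 5]]

for grid in (FIGURE_8_3X6, FIGURE_8_3X8, FIGURE_9_4X8):
    print(len(grid), "x", len(grid[0]), is_pillow_tour(grid))
```

The two 3-row tours only ever wrap around the side edges, whereas the 4 x 8 tour also passes through the bottom edge (for instance from square 3 to square 4), so it exercises the half-turn identification.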

  1 20 11 26 17  8 23 14
 24 15  2 21 12 27 18  9
 19 10 25 16  7 22 13 32
 30 29  6  3 28 31  4  5

Figure 9 Tour on the 4 x 8 pillow board

Each move of a knight can be represented by a pair (i, j), with $i, j \in \{\pm 1, \pm 2\}$,
where for example (1, 2) means move 1 square to the right and 2 squares up (in the
obvious sense). For convenience, we adopt a notation similar to that of Monsky [50]; we
let A = (1, 2), B = (1, -2), C = (-1, 2), D = (-1, -2), E = (2, 1), F = (2, -1),
G = (-2, 1), H = (-2, -1), and we interpret the word $A(BC)^2$, for example, to
mean the sequence of moves A, B, C, B, C. In Case (1), a simple tour is given by the
sequence $A^{2m}$; FIGURE 10 shows the example of the 1 x 8 board. In Case (2), we start
at the top left corner and make the following sequence of moves:
for m even, m = 2k:      $B(CB)^{m-1} H (EF)^{k-1} E B (GH)^{k-1} G^2$,
for m odd, m = 2k - 1:   $B(CB)^{m-1} H (EF)^{k-1} A (HG)^{k-1} G$.

FIGURE 11 shows the example of the 2 x 6 and 2 x 8 boards. •

  1  2  3  4  5  6  7  8

Figure 10 Tour on the 1 x 8 pillow board

  1 12  3  8  5 10          1 16  3 10  5 14  7 12
  6  7  4 11  2  9          8  9  6 15  4 11  2 13

Figure 11 Tours on the 2 x 6 and 2 x 8 pillow boards
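
The move words for Cases (1) and (2) can be expanded and walked on the board. The sketch below assumes the reading of the exponents given by the letter counts (2m moves on the 1 x 2m board, 4m moves on the 2 x 2m board), applies each move in board coordinates, and uses hypothetical helper names throughout.

```python
MOVES = {"A": (1, 2), "B": (1, -2), "C": (-1, 2), "D": (-1, -2),
         "E": (2, 1), "F": (2, -1), "G": (-2, 1), "H": (-2, -1)}


def project(x, y, width, height):
    """Map a square of the (width x 2*height) torus to its pillow square."""
    x %= width
    y %= 2 * height
    if y >= height:
        x, y = width - 1 - x, 2 * height - 1 - y
    return x, y


def case1_word(m):
    """Case (1): the word A^(2m) on the 1 x 2m board."""
    return "A" * (2 * m)


def case2_word(m):
    """Case (2): the word for the 2 x 2m board, split by the parity of m."""
    if m % 2 == 0:
        k = m // 2
        return ("B" + "CB" * (m - 1) + "H" + "EF" * (k - 1)
                + "EB" + "GH" * (k - 1) + "GG")
    k = (m + 1) // 2
    return ("B" + "CB" * (m - 1) + "H" + "EF" * (k - 1)
            + "A" + "HG" * (k - 1) + "G")


def walk(word, start, width, height):
    """Squares visited when the moves of `word` are made from `start`."""
    path = [start]
    for letter in word:
        dx, dy = MOVES[letter]
        x, y = path[-1]
        path.append(project(x + dx, y + dy, width, height))
    return path


def is_closed_tour(path, width, height):
    """True if the path returns to its start after visiting every square once."""
    visited = path[:-1]
    return (path[0] == path[-1]
            and len(visited) == width * height
            and len(set(visited)) == width * height)


for m in (3, 4):
    # Case (2) starts at the top left corner, which is (0, 1) when 0-indexed.
    path = walk(case2_word(m), (0, 1), 2 * m, 2)
    print(m, case2_word(m), is_closed_tour(path, 2 * m, 2))
```

For m = 3 and m = 4 the squares are visited in the order numbered in FIGURE 11.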

The n-queens problem

The n-queens problem is this: can one place n queens on an n x n board such that no
pair is attacking each other? Such queens are said to be nonattacking or invulnerable.
This problem dates back to the middle of the 19th century [10]. The traditional
board has solutions for $n \ne 2, 3$, while the n x n torus board has solutions when n
is not divisible by 2 or 3. The problem was solved for the torus by Pólya [57] and
for the traditional board by Ahrens [1], and essentially the same underlying idea has been
used by several authors. The key construction for n relatively prime to 6 appeared
in Theorem II at the end of the 4th section of Lucas' book [47, Vol. 1], which was
published originally in 1883. For more recent presentations and formulations of this
idea, see [13, 25, 40], for the torus, and [4, 20, 31, 60, 81], for the traditional board.
(Incidentally, the n-queen problem for the torus is obviously equivalent to the n-queen
problem for the cylinder.)
The basic construction can be described as follows: start on any given square on
the torus, and make repeated knight moves (2, 1), placing a queen on each square that
one visits. Regarding the initial square as the origin in $\mathbb{Z}_n \times \mathbb{Z}_n$, this places queens on
the squares k(2, 1), where the entries are reduced modulo n. More generally, solutions
of the form k(d, 1), for some d, are said to be linear or regular. Most authors are
content to give a particular linear solution. A systematic approach is adopted by
Erbas, Tanik, and Aliyazicioglu [19] (see also [18, 45, 66]). Nonlinear solutions are
given by Bruen and Dixon [9] and by Chandra [11]. When n is not divisible by 2 or 3,
the d = 2 case gives a solution on the torus, and one can easily deduce a solution on the
n x n traditional board for $n \not\equiv 2, 3 \pmod 6$. The cases $n \equiv 2, 3 \pmod 6$ can be treated
by combining two linear parts; a succinct description is given in a Monthly article on
the subject [61]. These solutions were already considered by Lucas, who called them
semi-regular [47, Vol. 1, p. 64].
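
The linear construction is short enough to test numerically. This is a small sketch with hypothetical function names; the torus conditions used are the four listed later in the text (distinct columns, rows and both diagonal classes), taken modulo n.

```python
from math import gcd


def linear_placement(n, d=2):
    """Queens on the squares k(d, 1), k = 0..n-1, with entries reduced mod n."""
    return [(k * d % n, k % n) for k in range(n)]


def nonattacking_on_torus(queens, n):
    """No two queens share a column, a row, or a diagonal class modulo n."""
    keys = [
        lambda x, y: x,              # column
        lambda x, y: y,              # row
        lambda x, y: (y - x) % n,    # one family of diagonals
        lambda x, y: (y + x) % n,    # the other family
    ]
    return all(len({key(x, y) for x, y in queens}) == len(queens)
               for key in keys)


# The d = 2 placement should work exactly when n is not divisible by 2 or 3.
for n in range(1, 14):
    print(n, nonattacking_on_torus(linear_placement(n), n), gcd(n, 6) == 1)
```

The two printed columns agree because the columns 2k are distinct modulo n only for odd n, and the sums 3k are distinct only when 3 does not divide n.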
Many other aspects of the n-queen problem have been studied. In addition to the
question of existence, one may also investigate the number of solutions. This latter
problem is still largely open [62, 61]. When no solutions exist, one is interested in
the maximum number of nonattacking queens that can be placed on the board. This
has been completely solved for the torus [50, 28, 12, 29, 51]. Like the knight's tour
problem, the n-queens problem has a graph theoretic interpretation: the existence of a
maximal stable set [5, Chapter 4; 6, Chapter 13]. The problem has been examined in
higher dimensions [39, 55], and on infinite boards [14]. The n-queens problem has
found many echoes in computer science; according to Vardi [76], it is "a canonical
homework assignment in introductory programming classes." See the references
at the end of Section 6.4 of [30]. The queens problem has also been turned into a
game [64].
Variations on the n-queens problem have been studied by changing the possible
moves of the queen: a semiqueen is a queen that can't move on the negative diagonals
[21; 36; 47, Vol. 1, p. 84, Theorem I; 76]; an amazon is a piece that can move
like a queen and a knight [26, 11]. Incidentally, the term amazon is traditional [8,
Section 30; 33]; the term superqueen is sometimes used [23, Chapter 16; 26], but
superqueen is also used to mean a queen on a torus [11, 50, 35]. Amazons are called
nite-queens by Chandra [11].
Now we turn to the pillow board. By our Lemma, the 2m x 2m pillow board is
equivalent to the m x 4m pillow board, which has m rows. So one can place at most
m nonattacking queens on the 2m x 2m pillow board. Thus a solution requires the
placing of half as many queens as on the traditional board, but each queen commands
more squares, in fact, twice as many squares, for most positions. Investigations on
small boards show that solutions are more abundant than on the traditional board.
According to my calculations, for m = 2, 3, 4, 5, 6, the 2m x 2m pillow chessboard
has 24, 64, 768, 6464, 54656 solutions respectively. In general, one has:
THEOREM 2. For all m, the 2m x 2m pillow chessboard admits m nonattacking
queens.
Before proving this result, let us look more closely at the notion of nonattacking
queens. We give the squares coordinate labels (x, y), $x, y \in \{1, \ldots, 2m\}$, in the
obvious way, starting at (1, 1) in the bottom left-hand corner. Notice that queens on
the traditional board at distinct positions $(x_1, y_1)$, $(x_2, y_2)$ are nonattacking if and
only if the following 4 conditions hold: $x_1 \ne x_2$, $y_1 \ne y_2$, $(y_1 - x_1) \ne (y_2 - x_2)$,
$(y_1 + x_1) \ne (y_2 + x_2)$. On the torus one has similar conditions, with equality replaced
by equivalence modulo 2m. On the pillow, the conditions are: $x_1 \ne x_2$, $y_1 \ne y_2$,
$\mu(y_1 - x_1) \ne \mu(y_2 - x_2)$, $\mu(y_1 + x_1 - 1) \ne \mu(y_2 + x_2 - 1)$, where the function $\mu$ is
defined as follows. Let [i] denote the remainder of i modulo 2m. Then

$$\mu(i) = \begin{cases} [i] & \text{if } [i] \le m,\\ [-i] & \text{otherwise.} \end{cases}$$

Proof. There is an obvious solution for m = 1. For m > 1, let
$A = \{(i, 2i) : i < \tfrac{2}{3}m\} \cup \{(i, 2i + 1) : \tfrac{2}{3}m \le i < m\}$ and
$B = \{(i, 2i + 1) : i < m\}$, and place a queen at each point in the following set:

$$C = \begin{cases}
A \cup \{(m + 1, m + 1)\} & \text{if } m \equiv 0 \pmod 6,\\
B \cup \{(m, m + 1)\} & \text{if } m \equiv 1 \text{ or } 5 \pmod 6,\\
B \cup \{(m + 1, m)\} & \text{if } m \equiv 2 \text{ or } 4 \pmod 6,\\
A \cup \{(m, m)\} & \text{if } m \equiv 3 \pmod 6.
\end{cases}$$

These positions are just a variation on the classical idea; we begin near the bottom
left corner and proceed up the board using knight moves, placing a queen on each
square visited. When m is divisible by 3, there is a little skip near the major diagonal.
Finally there is a solitary rogue queen near the middle of the board. The positions for
m = 6, ..., 9 are shown in FIGURES 12 to 15.
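
The placement can be tested independently of the conditions on $\mu$ by tracing each queen's lines on the covering torus and projecting them to the pillow. The sketch below makes two assumptions about the construction that should be read as such: the index i starts at 1, and the threshold separating the two parts of A is 2m/3. All function names are hypothetical.

```python
LINES = [(1, 0), (0, 1), (1, 1), (1, -1)]   # row, column and the two diagonals


def project(x, y, width, height):
    """Map a square of the (width x 2*height) torus to its pillow square."""
    x %= width
    y %= 2 * height
    if y >= height:
        x, y = width - 1 - x, 2 * height - 1 - y
    return x, y


def attacked_squares(queen, m):
    """Squares (1-indexed) that a queen commands on the 2m x 2m pillow."""
    size = 2 * m
    x0, y0 = queen[0] - 1, queen[1] - 1
    seen = set()
    for dx, dy in LINES:
        # On the (size x 2*size) torus every line closes up within 2*size
        # steps, so one sweep covers both directions along the line.
        for t in range(1, 2 * size):
            px, py = project(x0 + t * dx, y0 + t * dy, size, size)
            seen.add((px + 1, py + 1))
    seen.discard(queen)                      # a line may pass through its origin
    return seen


def construction(m):
    """The set C, assuming i starts at 1 and the split in A is at 2m/3."""
    a = [(i, 2 * i) if 3 * i < 2 * m else (i, 2 * i + 1) for i in range(1, m)]
    b = [(i, 2 * i + 1) for i in range(1, m)]
    r = m % 6
    if r == 0:
        return a + [(m + 1, m + 1)]
    if r in (1, 5):
        return b + [(m, m + 1)]
    if r in (2, 4):
        return b + [(m + 1, m)]
    return a + [(m, m)]


def nonattacking(queens, m):
    """True if no queen stands on a square commanded by another."""
    return all(q not in attacked_squares(p, m)
               for p in queens for q in queens if p != q)


for m in range(2, 8):
    print(m, construction(m), nonattacking(construction(m), m))
```

Because a column continues through the top edge into its mirror-image column, the trace also rejects pairs such as (1, 1) and (8, 5) on the 8 x 8 pillow, which share such a doubled file.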

Figure 12    Figure 13
The proof that the m positions in C are nonattacking is perhaps best done by con­
sidering three separate cases: (a) m divisible by 6, (b) m divisible by 3 but not by 6,
and (c) m not divisible by 3. We briefly describe case (a). From the construction of C,
no two positions are in the same column. So we must show that the positions are also
in distinct rows, distinct negative diagonals, and distinct positive diagonals. This gives
three things to check:

Row Check

(1) $2i = m + 1$ is impossible for m even.
(2) $2i + 1 = m + 1 \implies i = m/2 \implies \tfrac{2}{3}m \not\le i$.
Figure 14    Figure 15

Negative Diagonals Check

(1) For $i, j < \tfrac{2}{3}m$, $\mu(i) = \mu(j) \implies i = j$.
(2) For $i < \tfrac{2}{3}m$, $\tfrac{2}{3}m \le j < m$, $\mu(i) = \mu(j + 1) \implies i = j + 1$, which is impossible
as $i < j$.
(3) For $i < \tfrac{2}{3}m$, $j = m + 1$, $\mu(i) = \mu(0) \implies i = 0$, which is impossible.
(4) For $\tfrac{2}{3}m \le i, j < m$, $\mu(i + 1) = \mu(j + 1) \implies i = j$.
(5) For $\tfrac{2}{3}m \le i < m$, $j = m + 1$, $\mu(i + 1) = \mu(0) \implies i + 1 = 0$, which is impossible.

Positive Diagonals Check

(1) For $i, j \le \tfrac{m}{3}$, $\mu(3i - 1) = \mu(3j - 1) \implies i = j$.
(2) For $i \le \tfrac{m}{3}$, $\tfrac{m}{3} < j < \tfrac{2}{3}m$, $\mu(3i - 1) = \mu(3j - 1) \implies 3i - 1 = 2m - 3j + 1$,
which is impossible for m divisible by 3.
(3) For $i \le \tfrac{m}{3}$, $\tfrac{2}{3}m \le j < m$, $\mu(3i - 1) = \mu(3j) \implies 3i - 1 = 3j - 2m$, which is
impossible for m divisible by 3.
(4) For $i \le \tfrac{m}{3}$, $j = m + 1$, $\mu(3i - 1) = \mu(2m + 1) \implies 3i - 1 = 1$, which is impossible.
(5) For $\tfrac{m}{3} < i, j < \tfrac{2}{3}m$, $\mu(3i - 1) = \mu(3j - 1) \implies i = j$.
(6) For $\tfrac{m}{3} < i < \tfrac{2}{3}m$, $\tfrac{2}{3}m \le j < m$, $\mu(3i - 1) = \mu(3j) \implies 2m - 3i + 1 = 3j - 2m$
$\implies 4m + 1 = 3(i + j)$, which is impossible for m divisible by 3.
(7) For $\tfrac{m}{3} < i < \tfrac{2}{3}m$, $j = m + 1$, $\mu(3i - 1) = \mu(2m + 1) \implies 2m - 3i + 1 = 1$,
which is impossible for $i < \tfrac{2}{3}m$.
(8) For $\tfrac{2}{3}m \le i, j < m$, $\mu(3i) = \mu(3j) \implies i = j$.
(9) For $\tfrac{2}{3}m \le i < m$, $j = m + 1$, $\mu(3i) = \mu(2m + 1) \implies 3i - 2m = 1$, which is
impossible for m divisible by 3.

This completes case (a). Cases (b) and (c) can be treated in an analogous manner;
we leave their verification to the reader. •

Final comments

We finish with two comments. First, it would be interesting to extend Theorems 1 and 2
to pillow chessboards that do not have the usual black-white pattern, as is already the
case for the results on the torus. This means considering pillow boards that are obtained
from the n x m torus, with m and/or n odd, by identifying diametrically opposite
squares, as discussed above. This entails some unusual features; when n and m are
both odd, the fundamental domain no longer consists of an integer number of squares,
so the "board" looks rather odd. Also, if m is odd and n is even, one of the squares has
an edge which is identified with itself!
Second, we remark that pillow chess is by no means the first chess game to be
played on the sphere. There is a great number of chess variants (see Pritchard [59] and
the web site www.chessvariants.com). In particular, we mention Global Chess, which
is a commercial variant played on two revolving disks representing the hemispheres
(here the "squares" at the poles are triangles), and Andrea Mori's small spherical chess,
which is obtained from a cylindrical board by adding two additional (pole) squares, one
at each end. Don Miller's spherical chess [59], as modified by Leo Nadvorney, is very
close in appearance to pillow chess (the identifications are shown in FIGURE 16), but
in fact this board is not a sphere but a Klein bottle!

Figure 16

Solution to the pillow chess problem. First notice that white can't move its rook
at h4, since this would disclose check by black's bishop at g5! As well, e2 and f1 are
covered by black's bishop at d7. Thus black is threatening mate in three ways: Rf1,
Nf1 and Nxe2. White can avoid all these attacks only by playing Re1, in which case
black mates with RxR.

Acknowledgments. My thanks go to John Bamberg for battling me in several games of pillow chess, to Andrej
Panjkov for interesting suggestions for further generalizing pillow chess, and to Andrew Gove and Danny Sleator
of Chessclub.com, whose software Chess Viewer produced the chessboard graphics in this paper.
REFERENCES
1. W. Ahrens, Mathematische Unterhaltungen und Spiele, B.G. Teubner, Leipzig, 1921.
2. W. W. R. Ball, Mathematical Recreations and Essays, Revised by H. S. M. Coxeter, Macmillan and Co.,
London, 1947.
3. S. Barr, Experiments in Topology, Dover Publications, Inc., Mineola, NY, 1989.
4. J. D. Beasley, The Mathematics of Games, Oxford Uni. Press, Oxford, 1989.
5. C. Berge, The Theory of Graphs, Methuen and Co., London, 1962.
6. ---, Graphs and Hypergraphs, North Holland Publ., Amsterdam, 1973.
7. J-P. Bode and H. Harborth, Independent chess pieces on Euclidean boards, J. Combin. Math. Combin.
Comput. 33 (2000), 209-223.
8. J. Boyer, Nouveaux Jeux d'Echecs Non Orthodoxes, J. Boyer, Paris, 1954.
9. A. Bruen and R. Dixon, The n-queens problem, Discrete Math. 12:4 (1975), 393-395.
10. P. J. Campbell, Gauss and the eight queens problem: a study in miniature of the propagation of historical
error, Historia Math. 4:4 (1977), 397-404.
11. A. K. Chandra, Independent permutations, as related to a problem of Moser and a theorem of Pólya,
J. Combinatorial Theory Ser. A 16 (1974), 111-120.
12. M.R. Chen, R. G. Sun, and J. C. Zhu, Partial n-solution to the modular n-queens problem II, Combinatorics
and graph theory, 1-4, World Sci. Publishing, River Edge, NJ, 1993.
13. D. S. Clark, A combinatorial theorem on circulant matrices, Amer. Math. Monthly 92:10 (1985), 725-729.
14. D. S. Clark and O. Shisha, Invulnerable queens on an infinite chessboard, Ann. New York Acad. Sci. 555
(1989), 133-139.
15. A. Conrad, T. Hindrichs, H. Morsy, and I. Wegener, Solution of the knight's Hamiltonian path problem on
chessboards, Discrete Appl. Math. 50 (1994), no. 2, 125-134.
16. P. Cull and J. De Curtins, Knight's tour revisited, Fibonacci Quarterly 16:3 (1978), 276-286.
17. R. B. Eggleton and A. Eid, Knight's circuits and tours, Ars Combin. 17A (1984), 145-167.
18. C. Erbas and M. M. Tanik, Generating solutions to the N-queens problem using 2-circulants, this MAGAZINE
68:5 (1995), 343-356.
19. C. Erbas, M. M. Tanik, and Z. Aliyazicioglu, Linear congruence equations for the solutions of the N-queens
problem, Inform. Process. Lett. 41:6 (1992), 301-306.
20. B-J. Falkowski and L. Schmitz, A note on the Queens' problem, Inform. Process. Lett. 23:1 (1986), 39-46.
21. N. J. Fine, Solution of problem E1699, Amer. Math. Monthly 72 (1965), 552-553.
22. M. Gardner, Problems that are built on the knight's move in chess, Scientific American 217 (1967), no. 4,
128-132.
23. ---, Further Mathematical Diversions, Penguin Books, New York, 1977.
24. ---, Fractal music, hypercards and more... Mathematical recreations from Scientific American magazine,
W. H. Freeman and Company, New York, 1992.
25. Z. Goldstein, Solution of problem E2698, Amer. Math. Monthly 86 (1979), 309-310.
26. S. W. Golomb, Sphere packing, coding metrics, and chess puzzles, Proc. Second Chapel Hill Conf. on
Combinatorial Mathematics and its Applications, Univ. North Carolina, Chapel Hill, NC, 1970, pp. 176-189.
27. R. K. Guy, Unsolved Problems in Number Theory, Springer-Verlag, New York, 1994.
28. O. Heden, On the modular n-queen problem, Discrete Math. 102:2 (1992), 155-161.
29. ---, Maximal partial spreads and the modular n-queen problem, Discrete Math. 120:1-3 (1993), 75-91;
II, ibid. 142:1-3 (1995), 97-106.
30. S. M. Hedetniemi, S. T. Hedetniemi, and R. Reynolds, Combinatorial Problems on chessboards II,
Domination in graphs, advanced topics (T. W. Haynes, S. T. Hedetniemi, P. J. Slater eds.), Marcel Dekker, Inc., New
York, 1998, pp. 133-162.
31. E. J. Hoffman, J. C. Loessi, and R. C. Moore, Constructions for the solution of the m queens problem, this
MAGAZINE 42 (1969), 66-72.
32. R. Honsberger, Mathematical Gems from Elementary Combinatorics, Number Theory, and Geometry, MAA,
Buffalo, 1973.
33. D. Hooper and K. Whyld, The Oxford Companion to Chess, Oxford Uni. Press, Oxford, New York, 1984.
34. S. P. Hurd and D. A. Trautman, The knight's tour on the 15-puzzle, this MAGAZINE 66:3 (1993), 159-166.
35. F. K. Hwang and K. W. Lih, Latin squares and superqueens, J. Combin. Theory Ser. A 34:1 (1983), 110-114.
36. E. Just, Solution of problem E2302, Amer. Math. Monthly 79:5 (1972), 522-523.
37. D. A. Klarner, The problem of reflecting queens, Amer. Math. Monthly 74 (1967), 953-955.
38. ---, Mathematical Review 48 #10849 of "On pairings of the first 2n natural numbers", by G. B. Huff,
Acta Arith. 23 (1973), 117-126.
39. ---, Queen squares, J. Recreational Math. 12:3 (1979/80), 177-178.
40. T. Kløve, The modular n-queen problem, Discrete Math. 19:3 (1977), 289-291; II, ibid. 36:1 (1981), 33-48.
41. M. Kraitchik, Le Problème du Cavalier, Gauthiers-Villars, Paris, 1927.
42. ---, Mathematical Recreations, Dover Publ., New York, 1953.
43. O. Kyek, I. Parberry, and I. Wegener, Bounds on the number of knight's tours, Discrete Appl. Math. 74:2
(1997), 171-181.
44. P. C. B. Lam, W. C. Shiu, and H. L. Cheng, Knight's tour on hexagonal nets, Congr. Numer. 141 (1999),
73-82.
45. L. C. Larson, A theorem about primes proved on a chessboard, this MAGAZINE 50:2 (1977), 69-74.
46. M. Loebbing and I. Wegener, The number of knight's tours equals 33,439,123,484,294 - counting with
binary decision diagrams, Electron. J. Combin. 3:1 (1996), #R5 and #R5 Comment 1.
47. E. Lucas, Récréations Mathématiques, Blanchard, Paris, 1960.
48. W. S. Massey, A Basic Course in Algebraic Topology, Springer-Verlag, New York, 1991.
49. B. McKay, Comments on "The number of knight's tours equals 33,439,123,484,294 - counting with binary
decision diagrams" [Electron. J. Combin. 3:1 (1996), #R5] by M. Loebbing and I. Wegener, Electron. J.
Combin. 3:1 (1996), #R5 Comment 2.
50. P. Monsky, Solution of problem E3162, Amer. Math. Monthly 96 (1989), 258-259.
51. ---, Addendum: "On the modular n-queen problem," by O. Heden, Discrete Math. 118:1-3 (1993), 293.
52. H. J. R. Murray, The knight's tour: ancient and oriental, British Chess Mag. (1903):Jan. 1-7.
53. ---, A History of Chess, Oxford Univ. Press, London, 1913.
54. V. V. Nikulin and I. R. Shafarevich, Geometries and Groups, Springer-Verlag, Berlin-New York, 1987.
55. S. P. Nudelman, The modular n-queens problem in higher dimensions, Discrete Math. 146:1-3 (1995),
159-167.
56. M. Petkovic, Mathematics and Chess, Dover Publications, Inc., Mineola, NY, 1997.
57. G. Pólya, Über die "doppelt-periodischen" Lösungen des n-Damen-Problems, George Pólya: Collected
papers Vol. IV (G-C. Rota, ed.), MIT Press, Cambridge, London, 1984, pp. 237-247.
58. V. V. Prasolov, Intuitive Topology, American Mathematical Society, Providence, RI, 1995.
59. D. B. Pritchard, The Encyclopedia of Chess Variants, Games and Puzzles Publ., Surrey, 1994.
60. M. Reichling, A simplified solution of the n queens problem, Inform. Process. Lett. 25:4 (1987), 253-255.
61. I. Rivin, I. Vardi, and P. Zimmermann, The n-queens problem, Amer. Math. Monthly 101:7 (1994), 629-639.
62. I. Rivin and R. Zabih, A dynamic programming solution to the n-queens problem, Inform. Process. Lett. 41:5
(1992), 253-256.
63. I. Satake, On a generalization of the notion of manifold, Proc. Nat. Acad. Sci. U.S.A. 42 (1956), 359-363.
64. G. Schrage, The eight queens problem as a strategy game, Internat. J. Math. Ed. Sci. Tech. 17:2 (1986),
143-148.
65. A. J. Schwenk, Which rectangular chessboards have a knight's tour?, this MAGAZINE 64:5 (1991), 325-332.
66. H. D. Shapiro, Generalized Latin squares on the torus, Discrete Math. 24:1 (1978), 63-77.
67. J. A. Shufelt and H. J. Berliner, Generating Hamiltonian circuits without backtracking from errors, Theoret.
Comput. Sci. 132:1-2 (1994), 347-375.
68. D. Singmaster, Some early sources in recreational mathematics, Mathematics from manuscript to print,
1300-1600, (C. Hay ed.), 195-208, Oxford Univ. Press, New York, 1988.
69. I. Stewart, Another Fine Math You've Got Me Into..., W. H. Freeman and Company, New York, 1992.
70. J. Stillwell, Geometry of Surfaces, Springer-Verlag, New York, 1992.
71. J. J. Tattersall, Vandermonde's contributions to the early history of combinatorial theory, Ars Combin. 25C
(1988), 195-203.
72. W. Thurston, The Geometry and Topology of 3-Manifolds, mimeographic notes, Princeton Uni., 1978-1979.
73. M. Valtorta and M. I. Zahid, Warnsdorff's tours of a knight, J. Recreational Math. 25:4 (1993), 263-275.
74. ---, Tie-breaking rules for 4 x n Warnsdorff's tours, Congr. Numer. 95 (1993), 75-86.
75. G. H. J. van Rees, Knight's tours and circuits on the 3 x n chessboard, Bull. Inst. Combin. Appl. 16 (1996),
81-86.
76. I. Vardi, Computational Recreations in Mathematica, Addison-Wesley, Redwood City, 1991.
77. G. H. Verney, Chess Eccentricities, Longmans, Green and Co., London, 1885.
78. J. Watkins, Knight's tours on cylinders and other surfaces, Congr. Numer. 143 (2000), 117-127.
79. J. Watkins and R. L. Hoenigman, Knight's tours on a torus, this MAGAZINE 70:3 (1997), 175-184.
80. R. J. Wilson, A brief history of Hamiltonian graphs, Graph Theory in Memory of G.A. Dirac, (Sandbjerg,
1985), Ann. Discrete Math., 41, North-Holland, Amsterdam-New York, 1989, pp. 487-496.
81. A. M. Yaglom and I. M. Yaglom, Challenging Mathematical Problems with Elementary Solutions Vol. I,
Holden-Day, San Francisco, 1964.
VOL. 75, NO. 3, JUNE 2002 187

Doubly Recursive Multivariate
Automatic Differentiation
DAN KALMAN
American University
Washington, D.C. 20016
kalman@american.edu

Automatic differentiation is a way to find the derivative of an expression without finding
an expression for the derivative. More specifically, in a computing environment
with automatic differentiation, you can obtain a numerical value for $f'(x)$ by entering
an expression for $f(x)$. The resulting computation is accurate to the precision
of the computer system; it does not depend on the approximation of derivatives by
difference quotients. Indeed, the computation is equivalent to evaluating a symbolic
expression for $f'(x)$, but no one has to find that expression, not even the computer
system.
That's right. The automatic differentiation system never formulates a symbolic expression
for the derivative. Automatically calling on something like Mathematica to
produce a symbolic derivative, and then plugging in a value for $x$, is the wrong image
entirely. Automatic differentiation is something completely different.
Well OK, but so what? Symbolic algebra systems are so prevalent and powerful
today, why should we be concerned with avoiding symbolic methods? There are two
answers. The first is practical. Symbolic generation of derivatives can lead to exponential
growth in the length of expressions. That causes computational problems in
real applications. Accordingly, there is a practical applied side to the subject of automatic
differentiation, as witnessed by the serious attention of computer scientists and
numerical analysts [3, 4].
The second answer is more mathematical. It is a relatively easy task to create a
single-variable automatic differentiation system capable of evaluating first derivatives. In
fact, writing in this MAGAZINE in 1986, Rall [10] gives a beautiful presentation of just
such a system. What is mathematically interesting is an amazingly elegant extension
of the one-variable/one-derivative system that handles essentially any number of variables
and derivatives. The extension is recursively defined, employing an induction on
both the number of variables and the number of derivatives, and using fundamental
definitions that are virtually identical to the ones used in Rall's system.
The purpose of this paper is to present the recursive automatic differentiation system.
To set the stage, we will begin with a brief review of Rall's one-variable/one-derivative
system, followed by an example of the recursive system in action. Then the
mathematical formulation of the recursive system will be presented. The paper will
end with a brief discussion of practical issues related to the recursive system.

Rall's system

Because automatic differentiation is a computational technique, it is best understood
in the context of a computer language. In particular, recall that in a scientific computer
language such as Basic or FORTRAN, variables correspond to memory locations. For
example, consider the statements

x = 3
f = x^2 - 5.

The first causes a value of 3 to be stored in the memory location for x, while the
second reads the value of x, squares it, subtracts 5, and stores the result in the memory
location for f. We can think of this as a procedure for evaluating the function
$f(x) = x^2 - 5$.
In Rall's system, the idea is to simultaneously evaluate both $f(x)$ and $f'(x)$. In this
system, each variable corresponds to an ordered pair of memory locations, one for the
value of a function, and one for the value of the derivative. Now the goal is for the
statements above to produce the pair $(4, 6)$, incorporating the values of both $f(3)$
and $f'(3)$.
This is accomplished as follows. First, when a variable is assigned a value in a statement
such as x = 3, the automatic differentiation system stores in the memory for x
the pair $(3, 1)$. This corresponds to the value of the identity function $I(x) = x$, and its
derivative, at $x = 3$. Second, any numerical constant that appears in an expression is
represented by a pair corresponding to the value and derivative of a constant function.
For the example above, the constant 5 is represented by $(5, 0)$: the value of the constant
function $C(x) = 5$, and its derivative. Finally, each operation appearing in the
expression is carried out in an extended sense, operating on pairs. The rule for pair
addition or subtraction is just the usual componentwise operation. The rule for pair
multiplication is

$$(a_1, a_2) \times (b_1, b_2) = (a_1 b_1,\; a_2 b_1 + a_1 b_2). \tag{1}$$

Using these definitions, we can anticipate what the automatic differentiation system
will do in response to the pair of statements

x = 3
f = x^2 - 5.

The first statement leads to the creation of the pair $(3, 1)$. The second statement translates
into a sequence of operations on pairs:

$$\begin{aligned}
f &= (3, 1) \times (3, 1) - (5, 0) \\
  &= (3 \cdot 3,\; 1 \cdot 3 + 3 \cdot 1) - (5, 0) \\
  &= (9, 6) - (5, 0) \\
  &= (4, 6).
\end{aligned}$$

As is easily verified, this result correctly represents the value of both $x^2 - 5$ and its
derivative at $x = 3$. Notice that there is no symbolic computation here. However, the
equivalents of the symbolic differentiation rules are built into the definitions of pair addition
and multiplication. Thus, the expression for $f$ is evaluated to produce both the value
of the expression and of its derivative.
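The pair arithmetic just described can be sketched in a few lines of Python. This is only an illustration of the idea, not Rall's implementation; the names `Pair`, `variable`, and `constant` are invented here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """A (value, derivative) pair in the one-variable/one-derivative system."""
    val: float
    der: float

    def __add__(self, other):
        # Componentwise, as for the sum rule.
        return Pair(self.val + other.val, self.der + other.der)

    def __sub__(self, other):
        return Pair(self.val - other.val, self.der - other.der)

    def __mul__(self, other):
        # Product rule, as in equation (1).
        return Pair(self.val * other.val,
                    self.der * other.val + self.val * other.der)


def variable(value):
    """Pair for the identity function at the given point: derivative 1."""
    return Pair(value, 1.0)


def constant(value):
    """Pair for a constant function: derivative 0."""
    return Pair(value, 0.0)


x = variable(3.0)
f = x * x - constant(5.0)
print(f)
```

Running the last three lines carries out the same sequence of pair operations as the hand computation above, ending with the value 4 and the derivative 6.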
It should be stressed that the operations on pairs can be formulated without any
reference to functions and derivatives. We adopt an abstract framework with objects
(ordered pairs) and operations. As defined above, ordered pairs can be added, subtracted,
and multiplied. In fact, extended operations for pairs can be defined for all the
usual elementary functions. For example, the sine of a pair is defined according to

$$\sin(a_1, a_2) = (\sin a_1,\; a_2 \cos a_1). \tag{2}$$

Of course, these abstract definitions are inspired by the idea that each ordered pair will
contain values of a function and its derivative. To make the connection explicit, we will
use the notation $f^{[1,1]}(x) = (f(x), f'(x))$, where the $[1, 1]$ indicates the presence of
one variable, and the inclusion of one derivative. Thus, in the original computation, we
found $f^{[1,1]}(3) = (4, 6)$. Similarly, using the sine operation for pairs, the statements

x = 3
g = sin(x^2 - 5)

result in the computation of $\sin(4, 6) = (\sin 4, 6 \cos 4)$. The elements of this ordered
pair are the correct values of $\sin(x^2 - 5)$ and its derivative at $x = 3$. That is, with $g(x)$
defined as $\sin(x^2 - 5)$, the lines above compute $g^{[1,1]}(3)$.
What makes the system work is that each operation correctly propagates derivative
values. For the arithmetic operations, that means

$$\begin{aligned}
f^{[1,1]}(3) + g^{[1,1]}(3) &= (f + g)^{[1,1]}(3) \\
f^{[1,1]}(3) - g^{[1,1]}(3) &= (f - g)^{[1,1]}(3) \\
f^{[1,1]}(3) \times g^{[1,1]}(3) &= (fg)^{[1,1]}(3).
\end{aligned} \tag{3}$$

Observe that the rules for addition, subtraction, and products of pairs are based on the
sum and product rules for derivatives. Similarly, (2) is really nothing more than the
chain rule, since the derivative of $\sin(f(x))$ is given by $\cos(f(x)) f'(x)$. With $a_1$ in
place of $f(x)$ and $a_2$ in place of $f'(x)$, this becomes $\cos(a_1) a_2$. That shows that in (2),
if $(a_1, a_2) = f^{[1,1]}(3)$, then $\sin(a_1, a_2) = \sin(f^{[1,1]}(3)) = (\sin \circ f)^{[1,1]}(3)$. In a similar
way, any differentiable function $\phi$ can be extended to pairs by the formula

$$\phi(a_1, a_2) = (\phi(a_1),\; a_2 \phi'(a_1)). \tag{4}$$

With this definition, we have

$$\phi(f^{[1,1]}(3)) = (\phi \circ f)^{[1,1]}(3). \tag{5}$$
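The chain-rule extension of an elementary function to pairs can be sketched as follows. The helper name `lift` is hypothetical, and the sketch assumes the caller supplies the derivative of the function explicitly; pairs are plain Python tuples (value, derivative).

```python
import math


def lift(phi, dphi):
    """Extend a real function phi, with derivative dphi, to (value, derivative) pairs."""
    def phi_pair(a):
        a1, a2 = a
        # Chain rule: the derivative slot is scaled by phi'(a1).
        return (phi(a1), a2 * dphi(a1))
    return phi_pair


sin_pair = lift(math.sin, math.cos)

# (4, 6) is the pair for x^2 - 5 at x = 3, so this is the pair for sin(x^2 - 5).
g = sin_pair((4.0, 6.0))
print(g)
```

The result is the pair (sin 4, 6 cos 4) discussed in the text, obtained without ever forming an expression for the derivative of sin(x^2 - 5).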

Although these examples pertain to a function of a single variable, and involve
only a single derivative, it is easy to envision extensions involving several variables
and partial derivatives of various orders. Throughout, we will restrict our attention to
functions sufficiently smooth so that order of differentiation does not matter.
In the recursive system that we will present below, the idea is to compute all of the
partial derivatives up to some specified order. In this system, evaluating a function $f$
at a point in its domain means determining an object $f^{[n,m]}$ that contains the function
value as well as the values of all partial derivatives through order $m$ with respect to
$n$ variables. These objects are referred to as derivative structures. Since $m$ defines
the maximum number of derivatives, it is called the derivative index. Similarly, $n$ is
the variable index. As in the discussion above, we can proceed abstractly by defining
derivative structures and appropriate operations without any mention of functions and
derivatives. However, given a function $f$, we do need some way to construct $f^{[n,m]}$ as
one of our abstract derivative structures, and equations analogous to (3) and (5) must
hold.

The recursive system in action

Before describing the abstract system, let's take a look at how the system operates.
Consider the function

$$f(x, y, z) = \frac{\sqrt{x + y}}{\sqrt{z - y}}$$

and suppose we wish to evaluate $f$ and all partial derivatives through second order at
the point $(4, 5, 14)$. The recursive automatic differentiation system can be given this
problem with the following commands (with slightly modified syntax for readability):

x = DS-Make-Var(3, 2, 1, 4)
y = DS-Make-Var(3, 2, 2, 5)
z = DS-Make-Var(3, 2, 3, 14)
u = DS-Sqrt(DS-Add(x, y))
v = DS-Sqrt(DS-Sub(z, y))
Print DS-Divide(u, v)

These commands involve applications of several different functions within the automatic
differentiation system. First, there are three invocations of DS-Make-Var. This
function creates the derivative structures corresponding to the independent variables
x, y, and z. For example, x = DS-Make-Var(3, 2, 1, 4) creates a derivative structure
for 3 variables, and for partial derivatives through order 2, corresponding to variable
number 1 (x), and assigning that variable a value of 4. This command is the
equivalent of x = 4 in the one-variable/one-derivative system. Similarly, the next two
statements create the derivative structures corresponding to variables y and z, assigning
values of 5 and 14, respectively. The other commands are the derivative structure
versions of standard operations; DS-Add is addition of derivative structures, DS-Sqrt
applies the square root for derivative structures, and so on. So the fourth statement
adds the derivative structures for x and y and takes the square root of the result. That
defines a new derivative structure, u. Similarly, the next line defines v by subtracting
y from z, and applying the derivative structure version of the square root. The final command
applies derivative structure division to u and v, and prints the result.
As in Rall's system, the computations above are completely numerical. For example,
the derivative structure for the variable x stores the value of x, 4, as well as
all the partial derivatives through second order with respect to x, y, and z. These values
are, of course, trivially determined. The partial derivative with respect to x is 1,
and all the other partial derivatives are 0. But the point is that the derivative structure
called x is just some sort of array with entries of 4, 1, and many zeroes. In the same
way, y and z are arrays of numbers as well. When these are combined according to the
commands listed above, the final result is printed out as

0.01235
0.11111   0.00000  -0.01235
1.00000   0.05556  -0.00309  -0.05556  -0.00309   0.00926

These are the values of $f$ and its derivatives, in the following arrangement:

$$\begin{array}{llllll}
f_{yy} \\
f_y & f_{xy} & f_{yz} \\
f & f_x & f_{xx} & f_z & f_{xz} & f_{zz}
\end{array}$$

The subscripts indicate partial differentiation: $f_x$ for $\partial f / \partial x$, $f_{xy}$ for
$\partial^2 f / \partial x \, \partial y$, and so on. The
rationale for laying out the derivatives in this way will become clear when the general
system is defined. For this example, it is enough to see how the system operates, and
to observe that all the desired partial derivatives are correctly computed.
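As a spot check that is not part of the original presentation, a few of the printed entries can be confirmed by hand. Write $f = (x + y)^{1/2} (z - y)^{-1/2}$ and note that $x + y = z - y = 9$ at the point $(4, 5, 14)$. Then

$$\begin{aligned}
f &= 9^{1/2} \cdot 9^{-1/2} = 1, \\
f_x &= \tfrac{1}{2} (x + y)^{-1/2} (z - y)^{-1/2} = \tfrac{1}{18} \approx 0.05556, \\
f_z &= -\tfrac{1}{2} (x + y)^{1/2} (z - y)^{-3/2} = -\tfrac{1}{18} \approx -0.05556, \\
f_{zz} &= \tfrac{3}{4} (x + y)^{1/2} (z - y)^{-5/2} = \tfrac{1}{108} \approx 0.00926.
\end{aligned}$$

These agree with the corresponding entries in the bottom row of the printout.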
At this point, I hope that the basic idea of the automatic differentiation system is
clear. Numerical values for a function and its derivatives are arranged in some sort of
data structure, and operations on these structures are defined according to the rules of
differentiation so that derivatives are correctly propagated. The structures for the simplest
functions, namely the constant functions (like $c(x, y, z) = 5$) and variables (like
$I_1(x, y, z) = x$), are easy to specify directly. By operating on these simple derivative
structures, we can formulate derivative structures for essentially arbitrary expressions
involving the variables and elementary functions.
Although these ideas are feasible in principle, I also hope the reader has some sense
of the difficulty of handling all the details in practice. At first glance, the idea of
defining appropriate structures to contain all the partial derivatives through second
order relative to three variables, and then specifying the proper operations of arithmetic,
as well as proper definitions for functions like sine and cosine, should seem
fairly intimidating, or at least unpleasantly tedious. Happily, and surprisingly, there
is a remarkably simple recursive formulation that is no more complicated than Rall's
one-variable/one-derivative system. Indeed, considered formally, the operations within
this recursive formulation are virtually identical to the operations in Rall's system.
With that in mind, let us turn now to the recursive development of an automatic
differentiation system.

The objects

The first step in constructing the recursive system is to define the objects, or derivative
structures, on which we will operate. Let us consider a few motivating examples.
First, for functions of a single variable, automatic calculation of $m$ derivatives
can be provided by operating on $(m + 1)$-tuples. A typical object in the system,
$a = (a_0, a_1, \ldots, a_m)$, includes the value of a function and its first $m$ derivatives. For
example, with $m = 3$, we can write

$$f^{[1,3]}(x) = (f(x), f'(x), f''(x), f'''(x)).$$

For a function of two variables, assuming equality of mixed partials, the partial
derivatives through order $m$ are conveniently arranged in a triangular array.
This is illustrated in FIGURE 1 for $m = 3$. It is important to note that the entry
in the lower left-hand corner has a special significance. In the derivative structure
$f^{[2,m]}$, the lower left-hand corner is the value of the original function $f$.

$$\begin{array}{llll}
f_{yyy} \\
f_{yy} & f_{yyx} \\
f_y & f_{yx} & f_{yxx} \\
f & f_x & f_{xx} & f_{xxx}
\end{array}$$

Figure 1 Layout of $f^{[2,3]}$

Observe that the array in FIGURE 1 can be decomposed into two parts. The bottom
row is a vector of derivatives with respect to a single variable, as described in the
preceding paragraph. That is, the bottom row is just $f^{[1,3]}$. The second part, all of the
triangle except the bottom row, is also a derivative structure, namely $f_y^{[2,2]}$; it contains
the value of $f_y$, and all of its first and second order partial derivatives with respect to
$x$ and $y$. This gives $f^{[2,3]}$ as a combination of $f^{[1,3]}$ and $f_y^{[2,2]}$.

In a similar way, we can lay out the entries of $f^{[3,3]}$, that is, the partial derivatives
through third order with respect to three variables (see FIGURE 2). The partial derivatives
are arranged in a pyramid composed of several triangular layers. Each layer has
the same form as the triangular array in FIGURE 1. As before, there is a distinguished
entry identifying the function $f$, at the lower left-hand corner of the lowest level.
Also, as before, there is a natural decomposition into two parts. The first part is the
bottom triangular array, which is recognizable as $f^{[2,3]}$. It contains all partial derivatives
through order $m = 3$ with respect to $x$ and $y$. The complementary part is the
sub-pyramid made up of levels 2, 3, and 4. This can be recognized as $f_z^{[3,2]}$. It contains
all partial derivatives relative to the three variables $x$, $y$, and $z$, through order 2 of
the function $f_z$. The decomposition gives $f^{[3,3]}$ as a combination of $f^{[2,3]}$ and $f_z^{[3,2]}$.

Figure 2 Layout of $f^{[3,3]}$

These examples suggest a hierarchy of automatic differentiation objects. For any
$n$ and $m$, we can imagine a set of objects that contain all partial derivatives through
order $m$ with respect to $n$ variables. These will be our derivative structures. Thus, for
a single variable we have derivative vectors; for two variables, derivative triangles; for
three variables, derivative pyramids; and in general, derivative structures.
The decomposition discussed in the examples above can be described in general using
the terminology of derivative structures. For each example we considered, a derivative
structure of partial derivatives through order $m$ with respect to $n$ variables was
partitioned into two smaller derivative structures. The first part had the same number
of derivatives ($m$) and one fewer variables ($n - 1$) than the original structure, while the
second part had one fewer derivatives ($m - 1$) and the same number of variables as
the original. These observations inspire the following recursive definition of derivative
structures.
DEFINITION 1. For $m, n \ge 0$, we define $DS(n, m)$, the set of derivative structures
with derivative index $m$ and variable index $n$, as follows. If $m = 0$ or $n = 0$, $DS(n, m)$
is just $\mathbb{R}$, the real numbers. Otherwise

$$DS(n, m) = DS(n - 1, m) \times DS(n, m - 1)$$

(where $\times$ denotes the Cartesian product).


It should be emphasized here that this definition makes no mention of functions
or derivatives. It abstractly defines a class of objects, built up recursively, and reducing
to real numbers at the lowest level of the recursion. In this context, a derivative
structure is understood most simply as a binary tree, with real numbers as the leaves.
An element $a \in DS(4, 7)$, for example, has two components, one in $DS(3, 7)$ and the
other in $DS(4, 6)$. Each of these components likewise has two components, as shown
in FIGURE 3. Each branch of the tree ends when one of the two indices reaches zero,
indicating that the corresponding component is a real number. For $a = f^{[n,m]}$, the real
numbers at the leaves are simply the values of partial derivatives of $f$. However, this
visualization turns out to be of limited value. Instead, the best approach is to retain the
recursive image of an element of $DS(n, m)$ as an ordered pair, each of whose components
is a lower order derivative structure.

DS(4,7)
DS(3,7)   DS(4,6)
DS(2,7)   DS(3,6)   DS(3,6)   DS(4,5)

Figure 3 Partial tree for $a \in DS(4, 7)$
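The binary-tree picture makes it easy to count how many real numbers a derivative structure holds. The following sketch (the function name `leaf_count` is chosen here for illustration) follows the recursion of Definition 1 directly and compares the count with the binomial coefficient $\binom{n+m}{n}$, which is the number of partial derivatives of order at most $m$ in $n$ variables when mixed partials are identified.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def leaf_count(n, m):
    """Number of real-number leaves in an element of DS(n, m)."""
    if n == 0 or m == 0:
        return 1                      # DS(n, m) is just the real numbers
    # One component lies in DS(n - 1, m), the other in DS(n, m - 1).
    return leaf_count(n - 1, m) + leaf_count(n, m - 1)


print(leaf_count(3, 2))                    # three variables, second order
print(leaf_count(4, 7), comb(4 + 7, 4))    # the structure drawn in FIGURE 3
```

For three variables and second-order derivatives the count is ten, which matches the ten numbers printed in the earlier example.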

The idea of a derivative structure as an ordered pair hints at the connection to Rall's
automatic differentiation system. Shortly we will see that the definitions for operations
on derivative structures make this connection into a perfect analogy. But there is one
final prerequisite needed. In terms of the triangular arrays and pyramids considered
earlier, the two components of a derivative structure are particular substructures. For
example, if $a = (a_1, a_2)$ is a derivative pyramid, then $a_1$ is a derivative triangle, and $a_2$
is a smaller derivative pyramid. We also need a third substructure, denoted $a^*$. Later
an abstract recursive definition of $a^*$ will be provided. But conceptually, think of $a^*$
as follows: If the derivative structure $a = f^{[n,m]}$, then it contains within it $f^{[n,m-1]}$, the
derivatives up to order $m - 1$. That substructure is $a^*$. Thus, in FIGURE 1, $a_1$ is the
bottom row, $a_2$ is the sub-triangle consisting of everything but the bottom row, and $a^*$
is the triangle that contains everything except the third order derivatives lying along
the hypotenuse. Notice that $a_2$ and $a^*$ have the same size and shape, but are derivative
structures for different functions. Similarly, in FIGURE 2, the triangle on the lowest
level is $a_1$, the remaining levels form the sub-pyramid $a_2$, and $a^*$ is the sub-pyramid
consisting of everything except the highest order derivatives lying on the slanting outer
face of the pyramid.
This completes the background we need to define derivative structure operations.
We know that a derivative structure $a$ is an ordered pair $(a_1, a_2)$, that the components
are derivative substructures of lower order, and that $a^*$ is another substructure with
the same size and shape as $a_2$. The operations on derivative structures are defined in
terms of these substructures.

Operations on derivative structures

To build expressions out of derivative structures, we need to be able to apply arithmetic
operations and elementary functions. By considering the reciprocal function
$r(x) = 1/x$ as one of our elementary functions, we eliminate the need to define derivative
structure division. To divide $a/b$ we simply multiply $a \times r(b)$. Accordingly, the
only arithmetic operations that we need are addition, subtraction, and multiplication.
As a convenience we will also include scalar multiplication.
The definitions of all the arithmetic operations are recursive. The case of addition,
subtraction, and scalar multiplication will make this clear.
DEFINITION 2. For $DS(0, m)$ and $DS(n, 0)$, the elements are real numbers and
addition, subtraction, and multiplication are the usual real number operations. For
$n, m > 0$, let $a = (a_1, a_2)$ and $b = (b_1, b_2)$ be elements of $DS(n, m)$, and let $r$ be a
real number. Then addition, subtraction, and scalar multiplication are defined by

$$\begin{aligned}
a + b &= (a_1 + b_1,\; a_2 + b_2) \\
a - b &= (a_1 - b_1,\; a_2 - b_2) \\
ra &= (ra_1,\; ra_2).
\end{aligned}$$
Formally, these are identical to the componentwise definitions in Rall's system. But
they have a slightly different meaning in the present context. To add $a$ and $b$ we must
add their components, which are themselves derivative structures. The computer
implementation of the addition is thus recursive. To add two elements of $DS(3, 4)$, for
example, we call the addition operation for components in $DS(3, 3)$ and in $DS(2, 4)$.
Those additions, in turn, spawn additions of more derivative structures. At each recursion,
though, one of the two indices is reduced. Eventually, an index becomes zero,
and the recursion terminates with an addition of real numbers. Subtraction and scalar
multiplication operate similarly.
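The recursion just described can be sketched in Python. As an assumption made only for this illustration, a derivative structure is represented as a nested tuple (a1, a2) whose leaves are floats; the function names `ds_add` and `ds_scale` are likewise invented here.

```python
def ds_add(a, b):
    """Add two derivative structures of the same shape (nested pairs of floats)."""
    if not isinstance(a, tuple):
        # An index has reached zero, so both arguments are real numbers.
        return a + b
    # Otherwise add the two components, each a smaller derivative structure.
    return (ds_add(a[0], b[0]), ds_add(a[1], b[1]))


def ds_scale(r, a):
    """Multiply a derivative structure by the real number r."""
    if not isinstance(a, tuple):
        return r * a
    return (ds_scale(r, a[0]), ds_scale(r, a[1]))


# Two elements of DS(1, 2): a1 is real, and a2 lies in DS(1, 1).
a = (9.0, (6.0, 2.0))
b = (1.0, (0.5, 0.0))
print(ds_add(a, b))
print(ds_scale(2.0, a))
```

Subtraction has exactly the same shape, with `a - b` at the leaves.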
The definition of multiplication is again an analog of what we saw in Rall's system.
DEFINITION 3. For $DS(0, m)$ and $DS(n, 0)$ multiplication is defined to be the
usual real number operation. For $n, m > 0$, if $a = (a_1, a_2)$ and $b = (b_1, b_2)$ are elements
of $DS(n, m)$, define

$$a \times b = (a_1 \times b_1,\; a_2 \times b^* + a^* \times b_2).$$

Formally, this is virtually identical to the one-variable/one-derivative multiplication
rule defined by (1). The only difference is that there are no asterisks in (1). Indeed,
the ordered pairs in Rall's system are elements of $DS(1, 1)$, and in that setting, $a_1$
and $a^*$ are identical. However, while there are clear formal similarities between multiplication
in Rall's system and in $DS(n, m)$, it must be remembered that in the latter
system the definition is recursive. As for the operations of addition, subtraction, and
scalar multiplication, the multiplication of derivative structures requires multiplying
their components, and hence a recursive use of multiplication. And as we saw earlier,
the recursive process keeps generating more and more multiplications, finally reaching
a point at which the derivative objects reduce to real numbers. So, while the multiplication
definition seems to have the same simplicity as in Rall's system, under the surface
there is a complex sequence of operations implicitly defined.
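To see the recursion at work in the smallest nontrivial case, here is a worked example in $DS(1, 2)$ that is not in the original text. It reads the multiplication rule as $a \times b = (a_1 \times b_1,\; a_2 \times b^* + a^* \times b_2)$, where the star denotes the substructure with the highest-order derivatives removed. Take $a = (f, (f', f''))$ and $b = (g, (g', g''))$, so that $a_2 = (f', f'')$, $a^* = (f, f')$, $b_2 = (g', g'')$, and $b^* = (g, g')$. The two products in the second component are products in $DS(1, 1)$, where the rule reduces to (1):

$$\begin{aligned}
a_2 \times b^* &= (f', f'') \times (g, g') = (f' g,\; f'' g + f' g'), \\
a^* \times b_2 &= (f, f') \times (g', g'') = (f g',\; f' g' + f g''), \\
a \times b &= \bigl( f g,\; (f' g + f g',\; f'' g + 2 f' g' + f g'') \bigr).
\end{aligned}$$

The entries are $fg$, $(fg)'$, and $(fg)''$, so the recursive rule reproduces the Leibniz formula for the second derivative of a product.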
Finally we come to the elementary functions. Given a derivative structure $a$ and an
elementary function $\phi$, we wish to define $\phi(a)$. Once again, the definition is almost
identical to what appeared in Rall's system.

DEFINITION 4. Let $\phi$ be an $m$-times differentiable function of a real variable. If
$m = 0$ or $n = 0$, $DS(n, m)$ is just $\mathbb{R}$ and $\phi$ is applied to the elements in the usual way.
For $n, m > 0$, if $a = (a_1, a_2) \in DS(n, m)$, define

$$\phi(a) = (\phi(a_1),\; a_2 \times \phi'(a^*)).$$

This definition is a direct analog of (4), to which it reduces in the case that
$n = m = 1$. As we saw with multiplication, the only formal difference is the appearance
of an asterisk in the general derivative structure definition. Here again, the
actual computation of $\phi(a)$ is recursive, and the recursion terminates when $\phi$ or one
of its derivatives is finally called upon to operate on a real number.
That's it. That is all you need to construct arbitrary elementary function expressions
involving general derivative structures. As promised, the definitions are virtually the
same as those in Rall's system, and yet they provide for the automatic generation of
partial derivatives to essentially arbitrary order with respect to an essentially arbitrary
number of variables. But the presentation is not quite complete. We still have to see
how to create the fundamental derivative structures that correspond to constants and
variables. And at some point we need to see why the definitions just given really work.

Fundamental derivative structures

So far, we have defined derivative structures and their operations abstractly, without
mention of functions and partial derivatives. To make the connection with automatic
differentiation clear, we must have a definition of $f^{[n,m]}$ as an element of $DS(n, m)$.

DEFINITION 5. Let $f$ be a function of at least $n$ variables with continuous partial
derivatives through order $m$, and let $x$ be an element of the domain of $f$. Then the
derivative structure for $f$ with derivatives through order $m$ with respect to the first $n$
variables is given at $x$ by

$$f^{[n,m]}(x) = \begin{cases}
f(x) & \text{if } n = 0 \text{ or } m = 0 \\
\bigl( f^{[n-1,m]}(x),\; (\partial_n f)^{[n,m-1]}(x) \bigr) & \text{otherwise}
\end{cases}$$

where $\partial_n$ denotes partial differentiation with respect to the $n$th variable of $f$.
This definition is a formalization of the pattern we saw in special cases, but some
caution is needed. How do we know that $f^{[n,m]}$, as defined here, really does contain
all the partial derivatives it is supposed to? For now the reader is asked to accept the
validity of the definition. We will return to the justification in the next section.
Given the preceding definition, we can construct derivative structures for constants
recursively. For example, to create the derivative structure for the constant 5, we consider
the constant function $f(x, y, z, \ldots) = 5$. Now $f^{[n,m]}$ has two components. The
first is $f^{[n-1,m]}$, and that can be constructed recursively. The second is $(\partial_n f)^{[n,m-1]}$, and
since $f$ is constant, the partial derivative is 0. But that is again a constant function.
Thus, a recursive construction algorithm can operate similarly to the operation algorithms.
To construct a constant in $DS(n, m)$, we must first construct constants in
$DS(n - 1, m)$ and $DS(n, m - 1)$. The recursion proceeds until one index becomes 0,
and at that point the value of the constant is returned. That constant is 5 just once,
corresponding to tracing the left branch all the way down the tree to a leaf. In any
path that involves a right branch, the function will be differentiated at least once, and it
will be a zero function that is finally evaluated. On some level, however, this image is
irrelevant. All that really matters is that a simple recursive construction algorithm for
constants exists in the automatic differentiation system.
To illustrate the situation for the independent variables, let's consider the function
$I_2(x, y, z) = y$. How do we construct $I_2^{[3,2]}$ at $y = 8$, for example? At the top level,
$I_2^{[3,2]}$ is an ordered pair. The first component is $I_2^{[2,2]}$, which will be constructed recursively.
The second component is $(\partial_z I_2)^{[3,1]}$, and since $\partial_z y = 0$ that is just the derivative
structure of the constant 0. It can be constructed using the algorithm for a constant.
At the next level, $I_2^{[2,2]}$ is decomposed into $I_2^{[1,2]}$ and $(\partial_y I_2)^{[2,1]}$. For the first of these,
notice that the first index is 1. This is a derivative structure that does not involve any
derivatives with respect to $y$, and for its construction we can treat $y$ as the constant 8.
For the second component, $\partial_y I_2 = \partial_y y = 1$. Again we need only construct a derivative
structure for a constant. In a similar way, the derivative structure for any of the
independent variables can be constructed recursively. Indeed, $x_j^{[n,m]} = (a_1, a_2)$ is defined
as follows: If $j < n$, then $a_1$ is defined by a recursive construction of $x_j^{[n-1,m]}$
and $a_2$ is a derivative structure for the constant 0. If $j = n$, then $a_1$ is constructed as
a constant derivative structure, with whatever value was assigned to $x_j$, and $a_2$ is the
derivative structure for the constant 1. And if $j > n$, $a_1$ is again a constant derivative
structure with the value of $x_j$, but $a_2$ is the derivative structure of the constant 0.
This is the construction used to define DS-Make-Var in the sample computation
presented earlier. In fact, if you review that computation, you will see that we have
now defined every operation that appears there. The automatic differentiation system
is complete. With algorithms for constructing derivative structures for independent
variables and constants, and definitions of derivative structure operations and elementary
functions, nothing more is needed. However, we have yet to see any verification
that the system actually works. How do we know, for example, that the arithmetic
definitions propagate derivatives correctly? How do we know that applying an elementary
function to a derivative structure as in Definition 4 produces the desired derivative
information at the end? For that matter, how do we even know that the recursive
definition for $f^{[n,m]}$ is correct? The next section will address these questions.
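To make the pieces concrete, here is a compact Python sketch of the whole system, applied to the earlier example $f = \sqrt{x+y}/\sqrt{z-y}$ at $(4, 5, 14)$. It is an illustration under stated assumptions, not the author's implementation: derivative structures are nested tuples with float leaves, all function names are invented here, an elementary function is supplied as a family of its derivatives, and `lower` encodes one reading of the starred substructure $a^*$ (drop the highest-order derivatives), which the article formalizes later as the operator $L$.

```python
import math


def is_real(a):
    return not isinstance(a, tuple)


def const(c, n, m):
    """Element of DS(n, m) for the constant function with value c."""
    if n == 0 or m == 0:
        return float(c)
    return (const(c, n - 1, m), const(0.0, n, m - 1))


def make_var(n, m, j, value):
    """Element of DS(n, m) for variable number j, set to `value`."""
    if n == 0 or m == 0:
        return float(value)
    if j < n:
        return (make_var(n - 1, m, j, value), const(0.0, n, m - 1))
    # j == n: the derivative in variable n is 1; j > n: it is 0.
    return (const(value, n - 1, m), const(1.0 if j == n else 0.0, n, m - 1))


def lower(a):
    """Assumed form of a*: the same structure without its highest-order derivatives."""
    if is_real(a):
        return a
    a1, a2 = a
    if is_real(a2):                 # derivative index m == 1
        return lower(a1)
    return (lower(a1), lower(a2))


def add(a, b):
    return a + b if is_real(a) else (add(a[0], b[0]), add(a[1], b[1]))


def sub(a, b):
    return a - b if is_real(a) else (sub(a[0], b[0]), sub(a[1], b[1]))


def mul(a, b):
    if is_real(a):
        return a * b
    return (mul(a[0], b[0]), add(mul(a[1], lower(b)), mul(lower(a), b[1])))


def apply_fn(derivs, a, k=0):
    """Apply the k-th derivative of an elementary function to a structure.
    derivs(k) must return the k-th derivative as a function of a real number."""
    if is_real(a):
        return derivs(k)(a)
    return (apply_fn(derivs, a[0], k),
            mul(a[1], apply_fn(derivs, lower(a), k + 1)))


def power(p):
    """Derivative family for t -> t**p (square root: p = 0.5, reciprocal: p = -1)."""
    def derivs(k):
        coeff = math.prod(p - i for i in range(k))
        return lambda t: coeff * t ** (p - k)
    return derivs


def partial(a, n, orders):
    """Read one partial derivative; orders[k-1] counts derivatives in variable k.
    The total order must not exceed the derivative index of `a`."""
    for k in range(n, 0, -1):
        for _ in range(orders[k - 1]):
            a = a[1]                # differentiate with respect to variable k
        if not is_real(a):
            a = a[0]                # then drop variable k
    return a


x = make_var(3, 2, 1, 4)
y = make_var(3, 2, 2, 5)
z = make_var(3, 2, 3, 14)
u = apply_fn(power(0.5), add(x, y))
v = apply_fn(power(0.5), sub(z, y))
f = mul(u, apply_fn(power(-1.0), v))    # division as multiplication by a reciprocal

for name, orders in [("f", (0, 0, 0)), ("f_x", (1, 0, 0)), ("f_y", (0, 1, 0)),
                     ("f_z", (0, 0, 1)), ("f_xx", (2, 0, 0)), ("f_xy", (1, 1, 0)),
                     ("f_yy", (0, 2, 0)), ("f_xz", (1, 0, 1)), ("f_yz", (0, 1, 1)),
                     ("f_zz", (0, 0, 2))]:
    print(f"{name:5s} {partial(f, 3, orders):9.5f}")
```

The helper `partial` is only a convenience for reading values out of the nested pairs; it is not part of the system described in the text. Supplying each elementary function as a family of derivatives is one simple way to let the recursion call on higher derivatives as it descends.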

Validation of the system

There are two aspects of the system that require validation. First, we have to verify
that the recursive definition of f [n ,m l properly represents the intuition suggested by the
triangle and pyramid examples. Second, it must be established that the definitions of
derivative structure operations correctly propagate derivative information. That is, we
must see that

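$$
\begin{aligned}
f^{[n,m]} + g^{[n,m]} &= (f + g)^{[n,m]} \\
f^{[n,m]} - g^{[n,m]} &= (f - g)^{[n,m]} \\
f^{[n,m]} \times g^{[n,m]} &= (fg)^{[n,m]}
\end{aligned}
\tag{6}
$$

and

$$\phi\bigl(f^{[n,m]}\bigr) = (\phi \circ f)^{[n,m]}. \tag{7}$$
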
For both of these ends, expressing a derivative structure $a$ as an ordered pair $(a_1, a_2)$
and referring to the components and to $a_1^*$ will be of central importance.
It simplifies the presentation to express these substructures using an operator notation.
Thus, if $a = (a_1, a_2)$ is a derivative structure, we define $V(a) = a_1$ and
$D(a) = a_2$. The names of these operators reflect the meaning of the components
in the one-variable/one-derivative system, where $a_1$ is the value of the function, and $a_2$
is the derivative. Recall that $a_1^*$ is obtained from $a$ by removing all the highest order
derivatives, so that $a_1^*$ is a lower order version of $a$. Accordingly, we use the notation
$L(a) = a_1^*$.
Although the conceptual meaning of the operators is clear, formal definitions will
be given for completeness. For L , this is particularly important as there has not yet
been given an abstract definition in terms of derivative structures.

DEFINITION 6. Let $a \in DS(n, m)$. If $n = 0$ or $m = 0$, $a$ is a real number and
$V(a)$, $D(a)$, and $L(a)$ are all defined to equal $a$. Otherwise, $a = (a_1, a_2)$. In this case,
we define $V(a) = a_1$, $D(a) = a_2$, and $L(a)$ according to

$$
L(a) = \begin{cases} L(a_1) & \text{if } m = 1 \\ (L(a_1), L(a_2)) & \text{if } m > 1. \end{cases}
$$

It may not be immediately apparent that this definition of $L$ is consistent with the
earlier explanation of $a_1^*$. The reader may wish to verify that the definition works correctly
for triangles and pyramids. However, for the arguments that will follow, it is not
logically necessary to connect the definition of $L$ with the conceptual image of $f^{[n,m]}$.
Instead, we will be content to take $L(a)$ as the definition of $a_1^*$, and show that this
definition has the properties we need for automatic differentiation.
The three operators provide the means to connect the abstract definition of $DS(n, m)$
to the ideas illustrated by the derivative vectors, triangles, and pyramids. As a first
instance of this, we have the following result.
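
A minimal Python sketch of the three operators may help fix ideas. It assumes a derivative structure is stored as a float (when n = 0 or m = 0) or as a nested pair, and it assumes the recursion for L is L(a) = L(a_1) when m = 1 and (L(a_1), L(a_2)) when m > 1; the case m = 1 is detected by the second component being a plain number. These representation choices are illustrative, not taken from [6].

```python
def is_real(a):
    """A structure in DS(n, 0) or DS(0, m) is stored as a plain number."""
    return not isinstance(a, tuple)


def V(a):
    """Value part: the first component (one fewer variable)."""
    return a if is_real(a) else a[0]


def D(a):
    """Derivative part: the second component (one fewer derivative order)."""
    return a if is_real(a) else a[1]


def L(a):
    """Lower-order version of a: drop all derivatives of the highest order."""
    if is_real(a):
        return a
    a1, a2 = a
    if is_real(a2):          # m == 1: only the function value survives
        return L(a1)
    return (L(a1), L(a2))    # m > 1: recurse into both components


# One variable, order 2: a = (f, (f', f'')) with f = 5, f' = 7, f'' = 11.
a = (5.0, (7.0, 11.0))
print(V(a), D(a), L(a))
```

For the one-variable example, L removes the second derivative and leaves the order-1 structure holding f and f'.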

THEOREM 1. Derivative structures for functions are related to the operations $V$,
$D$, and $L$ as follows:

$$
\begin{aligned}
V(f^{[n,m]}) &= f^{[n-1,m]} \\
D(f^{[n,m]}) &= (\partial_n f)^{[n,m-1]} \\
L(f^{[n,m]}) &= f^{[n,m-1]}.
\end{aligned}
$$

If the derivatives in $f^{[n,m]}$ are laid out as in the examples of triangles and pyramids,
these identities are obvious. However, it is possible to prove the identities using only
the abstract definitions of the operators and of $f^{[n,m]}$. In fact, the first two identities are
immediate consequences of the abstract definition of $f^{[n,m]}$. The third identity can be
proved by a straightforward induction argument that exploits the recursive definitions
of both $f^{[n,m]}$ and $L$. This same style of proof is effective for a number of the results
to follow, and while a detailed proof for the third identity above will not be given, a
sample proof will be given for a later theorem. In any case, it is important to note that
the induction proof uses only the abstract definitions of $L$ and $f^{[n,m]}$, and so makes no
direct use of the full image of how partial derivatives are laid out in $f^{[n,m]}$. Thus, the
fact that the third identity can be established by an abstract proof confirms that, at least
in this regard, $L$ and $f^{[n,m]}$ operate according to expectation.
Theorem 1 lends itself to a simple formal algorithm for applying $V$, $D$, or $L$
to $f^{[n,m]}$: $V$ decrements the variable index by 1; $L$ decrements the derivative index
by 1; and $D$ both decrements the derivative index and differentiates $f$ once with respect
to the $n$th variable. Using just the rules for $V$ and $D$ we can prove the next
result.

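THEOREM 2. Suppose $f^{[n,m]}$ is defined at $x$. Let $e_j$ be a nonnegative integer for
$1 \le j \le n$ with $\sum e_j \le m$. Then the partial derivative $\partial_1^{e_1} \cdots \partial_n^{e_n} f(x)$ can be obtained
from $f^{[n,m]}(x)$ as follows: If $\sum e_j = m$ then

$$\partial_1^{e_1} \cdots \partial_n^{e_n} f(x) = D^{e_1} V D^{e_2} V \cdots V D^{e_n} f^{[n,m]}(x);$$

otherwise

$$\partial_1^{e_1} \cdots \partial_n^{e_n} f(x) = V D^{e_1} V D^{e_2} V \cdots V D^{e_n} f^{[n,m]}(x).$$
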
The proof is simply a matter of applying the identities in Theorem 1. Rather than
present the details in a formal way, it will be more illuminating to work through
an example. Consider the derivative structure $f^{[3,6]}$ and suppose we want to obtain
$\partial_1^2 \partial_2 \partial_3^2 f(x)$. Since this is a fifth derivative and $m = 6$, the theorem says to compute
$V D^2 V D V D^2 f^{[3,6]}$. We can verify that the desired result is obtained by applying the
identities in Theorem 1 as follows:

$$
\begin{aligned}
V D^2 V D V D^2 f^{[3,6]}(x) &= V D^2 V D V (\partial_3^2 f)^{[3,4]}(x) \\
&= V D^2 V D (\partial_3^2 f)^{[2,4]}(x) \\
&= V D^2 V (\partial_2 \partial_3^2 f)^{[2,3]}(x) \\
&= V D^2 (\partial_2 \partial_3^2 f)^{[1,3]}(x) \\
&= V (\partial_1^2 \partial_2 \partial_3^2 f)^{[1,1]}(x) \\
&= (\partial_1^2 \partial_2 \partial_3^2 f)^{[0,1]}(x) \\
&= \partial_1^2 \partial_2 \partial_3^2 f(x).
\end{aligned}
$$

This example reveals the general nature of the algorithm for extracting a particular
derivative from $f^{[n,m]}$. Notice that the $D$ operator only performs differentiation of
$f^{[n,m]}$ with respect to $x_n$. But each time we apply $V$, we reduce the value of $n$, and
hence change the variable that $D$ differentiates. If we want a certain number of derivatives
with respect to $x_n$, we apply $D$ that many times. Then we apply $V$, in effect,
shifting the focus to $x_{n-1}$. If we want one or more derivatives with respect to $x_{n-1}$, we
apply $D$ that many times again. So we continue, alternately applying $D$ to differentiate
and $V$ to shift to a new variable, until all the desired derivatives have been applied.
For an $m$th derivative, there will be $m$ applications of $D$, reducing the derivative index
to 0, and so reducing the derivative structure to a real number. Otherwise, there will
be exactly $n$ applications of $V$. This will reduce the variable index to 0, and so again
result in a real number.
It should be stressed again that the operators $V$ and $D$ were defined completely
abstractly, with no reference to derivatives. In a computational system, a particular
derivative structure is simply an organized network of memory locations which store
real values. The algorithm above navigates through such a network to a particular
entry. Theorems 1 and 2 show that when a derivative structure is constructed according
to the abstract definition of $f^{[n,m]}$, the desired derivative values can all be located and
extracted. More specifically, visualizing the network as a binary tree, each application
of $V$ selects a left branch from a node, each application of $D$ selects a right branch, and
after either $m$ applications of $D$ or $n$ applications of $V$ a terminal node is reached. Thus,
Theorem 2 can be understood as a prescription for finding the appropriate terminal
node for a particular partial derivative.
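
The navigation described by Theorem 2 can be sketched in a few lines of Python. The sketch assumes structures are nested pairs with plain numbers at the terminal nodes; `build`, `extract`, and the test function $f = e^{2x_1 + 3x_2 + 5x_3}$ (evaluated at the origin, so that each mixed partial equals $2^{e_1} 3^{e_2} 5^{e_3}$) are hypothetical names and choices made only for illustration.

```python
def build(partial, n, m, orders):
    """Build f^{[n,m]} where partial(orders) gives the mixed partial with those orders."""
    if n == 0 or m == 0:
        return partial(orders)
    # Differentiating once more with respect to x_n bumps the n-th order by 1.
    bumped = orders[:n - 1] + (orders[n - 1] + 1,) + orders[n:]
    return (build(partial, n - 1, m, orders),   # V part: f^{[n-1,m]}
            build(partial, n, m - 1, bumped))   # D part: (d_n f)^{[n,m-1]}


def extract(a, exponents):
    """Apply D e_n times, then V, then D e_{n-1} times, and so on down to x_1."""
    for e in reversed(exponents):
        for _ in range(e):
            if not isinstance(a, tuple):
                raise ValueError("requested order exceeds the derivative index m")
            a = a[1]            # D: take the right branch
        if isinstance(a, tuple):
            a = a[0]            # V: take the left branch
    return a


def partial(orders):
    """Mixed partials of exp(2*x1 + 3*x2 + 5*x3) at the origin."""
    e1, e2, e3 = orders
    return 2 ** e1 * 3 ** e2 * 5 ** e3


a = build(partial, 3, 6, (0, 0, 0))
print(extract(a, (2, 1, 2)))
```

Because V leaves a real number unchanged, the loop does not need to distinguish the two cases of Theorem 2: when the orders sum to m the last V is simply skipped.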
To complete the validation of the system, we must see that derivative structure operations
really do succeed in constructing $f^{[n,m]}$. That is, we must verify (6) and (7).
The formal statement is given in the following theorem.

THEOREM 3. Let $f$ and $g$ be real valued functions of $n$ or more variables, with
continuous partial derivatives through order $m$, let $x$ be in the domain of $f$ and $g$,
let $r$ be a real number, and let $\phi$ be a real function $m$ times differentiable at $f(x)$.
Then the following identities hold:

$$
\begin{aligned}
f^{[n,m]}(x) + g^{[n,m]}(x) &= (f + g)^{[n,m]}(x) \\
f^{[n,m]}(x) - g^{[n,m]}(x) &= (f - g)^{[n,m]}(x) \\
r f^{[n,m]}(x) &= (rf)^{[n,m]}(x) \\
f^{[n,m]}(x) \times g^{[n,m]}(x) &= (fg)^{[n,m]}(x) \\
\phi(f^{[n,m]}(x)) &= (\phi \circ f)^{[n,m]}(x).
\end{aligned}
$$


As mentioned earlier, the recursive nature of the definitions makes induction a nat­
ural approach to proving results like these. To illustrate, here is a proof of the final
identity above. It assumes that the preceding identities have already been established.
Proof. The proof is by induction on $n + m$. If either $n$ or $m$ is zero, the conclusion
holds trivially. So assume that both $n$ and $m$ are positive, and that the conclusion
holds for $f^{[n',m']}$ whenever $n' + m' < n + m$. From the definition of $\phi$ for derivative
structures, if $f^{[n,m]}(x)$ is expressed as the pair $(a_1, a_2)$, then

$$\phi(f^{[n,m]}(x)) = (\phi(a_1), \phi'(a_1^*) \times a_2).$$

In terms of the $V$, $D$, and $L$ operators, this becomes

$$\phi(f^{[n,m]}(x)) = \bigl(\phi(V f^{[n,m]}(x)),\ \phi'(L f^{[n,m]}(x)) \times D f^{[n,m]}(x)\bigr).$$

Applying Theorem 1 we obtain

$$\phi(f^{[n,m]}(x)) = \bigl(\phi(f^{[n-1,m]}(x)),\ \phi'(f^{[n,m-1]}(x)) \times (\partial_n f)^{[n,m-1]}(x)\bigr).$$

Now we are ready to use the induction hypothesis. On the right side of the preceding
equation, the real functions $\phi$ and $\phi'$ are applied to derivative structures with lower
order than $f^{[n,m]}(x)$. By induction, we can bring $\phi$ and $\phi'$ inside their respective
parentheses, leading to

$$\phi(f^{[n,m]}(x)) = \bigl((\phi \circ f)^{[n-1,m]}(x),\ (\phi' \circ f)^{[n,m-1]}(x) \times (\partial_n f)^{[n,m-1]}(x)\bigr).$$

Similarly, the identity for derivative structure multiplication allows us to bring the
product on the right side of the equation inside the parentheses. Performing that reduction
and recognizing the normal real function chain rule then produces

$$
\begin{aligned}
\phi(f^{[n,m]}(x)) &= \bigl((\phi \circ f)^{[n-1,m]}(x),\ (\phi' \circ f \cdot \partial_n f)^{[n,m-1]}(x)\bigr) \\
&= \bigl((\phi \circ f)^{[n-1,m]}(x),\ [\partial_n(\phi \circ f)]^{[n,m-1]}(x)\bigr) \\
&= (\phi \circ f)^{[n,m]}(x).
\end{aligned}
$$

This shows that the identity holds for $f^{[n,m]}$, completing the induction argument. ∎
This concludes the validation of the recursively defined automatic differentiation
system. It has been demonstrated that the simple recursive definitions for derivative
structure operations properly propagate partial derivatives. To put it more simply, we
have seen that the recursive automatic differentiation system works. In a final section,
we discuss a few ideas connected with implementation and computational efficiency.

Implementation and efficiency

The recursive automatic differentiation system presented here can be implemented in
any computer programming language that supports recursion. A working version is
described in [6]. There, the interested reader will find LISP code for the entire system,
amounting to about 150 lines. Although the presentation in [6] is from a different point
of view than the double recursion described here, the LISP code can be considered an
implementation of either point of view. In fact, the double recursion described here
was discovered as a direct result of studying the implementation in [6]. It should also
be mentioned that the original idea for treating the number of derivatives recursively
is due to Neidinger [8]. His work provided a critical inspiration for both the approach
of [6] and the double recursion presented here.
It is beyond the scope of this paper to discuss the LISP implementation in detail.
However, there is one aspect that is worth considering. The programming for the
automatic differentiation system must include derivative structure formulations for
all the familiar elementary functions: exponential, sine, cosine, etc. Each of these is
programmed according to Definition 4. Interestingly, this definition can be implemented
quite generally, and then used to create the procedures for all the desired
elementary functions. The basic idea is to define a procedure that will combine
the original function $\phi$, the derivative $\phi'$, and the derivative structure $a$ to
compute $\phi(a)$. For the sake of discussion, let us call the procedure Compose. It will take as
arguments procedures phi and phi-prime, and a derivative structure a. If a is
actually just a real value, Compose applies phi to a and returns the result. Otherwise,
Compose uses the V, D, and L operators to compute a1, a2, and a1*,
respectively. Then it applies phi to a1, phi-prime to a1*, and returns the ordered pair
(phi(a1), phi-prime(a1*) * a2).
All of the elementary functions are defined in terms of the procedure Compose. For
example, here is what the definition of the derivative structure exponential function
might look like:

    Function DS-Exp(a)
      if a is real
        return exp(a)
      else
        return Compose(DS-Exp, DS-Exp, a)
    end

Note that DS-Exp plays the role of both phi and phi-prime in the call to
Compose. Thus, the computation of DS-Exp(a) requires evaluations of DS-Exp(a1)
and DS-Exp(a1*). This is simply a direct implementation of the recursive nature of
Definition 4. In a similar way, the reciprocal function is defined as follows:

    Function DS-Recip(a)
      if a is real
        return 1/a
      else
        return Compose(DS-Recip, DS-DRecip, a)
    end

Here, DS-DRecip is a derivative structure function that plays the role of the
derivative of the reciprocal function. That is, with $\phi(x) = 1/x$, the derivative is
$\phi'(x) = -1/x^2$. This can be defined by

    Function DS-DRecip(a)
      recip-a = DS-Recip(a)
      return -1 * recip-a * recip-a

And now that we have defined the reciprocal function, it is no problem to add the
natural logarithm.

    Function DS-Ln(a)
      if a is real
        return ln(a)
      else
        return Compose(DS-Ln, DS-Recip, a)
    end
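
For readers who would like to experiment, the procedures above can be mirrored in Python. The following is a sketch under stated assumptions rather than a transcription of the LISP code in [6]: structures are nested pairs with floats at the terminal nodes, the recursion used for L is the one assumed for Definition 6, and the product is assumed to follow the recursive product rule $(a_1 \times b_1,\ a_2 \times L(b) + L(a) \times b_2)$. All function names are hypothetical.

```python
import math


def is_real(a):
    return not isinstance(a, tuple)


def V(a):
    return a if is_real(a) else a[0]


def D(a):
    return a if is_real(a) else a[1]


def L(a):
    if is_real(a):
        return a
    a1, a2 = a
    return L(a1) if is_real(a2) else (L(a1), L(a2))


def ds_add(a, b):
    if is_real(a):
        return a + b
    return (ds_add(a[0], b[0]), ds_add(a[1], b[1]))


def ds_scale(r, a):
    if is_real(a):
        return r * a
    return (ds_scale(r, a[0]), ds_scale(r, a[1]))


def ds_mul(a, b):
    # Assumed recursive product rule for two structures of the same shape.
    if is_real(a):
        return a * b
    return (ds_mul(a[0], b[0]),
            ds_add(ds_mul(a[1], L(b)), ds_mul(L(a), b[1])))


def compose(phi, phi_prime, a):
    """Chain rule: (phi(a1), phi'(a1*) * a2), with a1 = V(a), a2 = D(a), a1* = L(a)."""
    if is_real(a):
        return phi(a)
    return (phi(V(a)), ds_mul(phi_prime(L(a)), D(a)))


def ds_exp(a):
    return math.exp(a) if is_real(a) else compose(ds_exp, ds_exp, a)


def ds_recip(a):
    return 1.0 / a if is_real(a) else compose(ds_recip, ds_drecip, a)


def ds_drecip(a):
    r = ds_recip(a)
    return ds_scale(-1.0, ds_mul(r, r))


def ds_ln(a):
    return math.log(a) if is_real(a) else compose(ds_ln, ds_recip, a)


# One variable x = 2 with derivatives through order 3: (x, (1, (0, 0))).
x = (2.0, (1.0, (0.0, 0.0)))
print(ds_ln(x))   # nested as (ln x, (1/x, (-1/x^2, 2/x^3))) at x = 2
```

Each elementary function needs only its real evaluation and a derivative structure version of its derivative; `compose` supplies the rest, just as in the pseudocode.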

As these examples suggest, the development of a complete automatic differentiation
system requires very little programming, once the derivative structure operations
are in place. For each elementary function that is included, the developer does have to
explicitly specify the derivative. However, that is a small price to pay for the automatic
generation of derivatives to essentially arbitrary order. And in any case, one cannot
reasonably hope to avoid defining derivatives altogether in a system that is supposed
to compute derivatives automatically. In comparison to other approaches to automatic
differentiation for higher derivatives [2, 7], the development presented here is remark­
ably simple.
This simplicity streamlines the task of implementing an automatic differentiation
system. How the system performs is quite another issue, and it turns out that the ele­
gance of the recursive approach is accompanied by some significant sources of inef­
ficiency. While we will not take up this issue in any significant way here, a few brief
comments are in order.
A little reflection reveals that a naive implementation of the doubly recursive ap­
proach involves widespread recomputation of previously obtained results. To illustrate
this idea, consider the third derivative of the product fg. We know by Leibniz' rule
that

$$(fg)''' = f'''g + 3f''g' + 3f'g'' + fg'''.$$


This can be derived by repeatedly applying the product rule, and then algebraically
simplifying the result. In particular, three different terms, each equal to $f''g'$, would
appear, giving rise to the single term $3f''g'$ in Leibniz' rule. The recursive automatic
differentiation system is similar to repeatedly applying the product rule without algebraic
simplification. That would entail three separate evaluations of $f''g'$.
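
The duplication can be made concrete with a small counting sketch. Here each term $f^{(i)} g^{(j)}$ is represented by the pair (i, j), and the product rule is applied three times with no simplification; the function name `product_rule` is just an illustrative choice.

```python
from collections import Counter


def product_rule(terms):
    """Differentiate a sum of terms f^(i) g^(j), each stored as the pair (i, j)."""
    out = []
    for i, j in terms:
        out.append((i + 1, j))   # differentiate the f factor
        out.append((i, j + 1))   # differentiate the g factor
    return out


terms = [(0, 0)]                 # the product fg itself
for _ in range(3):
    terms = product_rule(terms)

counts = Counter(terms)
print(len(terms), counts[(2, 1)])
```

The unsimplified expansion has eight terms, and the pair (2, 1) appears three times, which is where the coefficient 3 in Leibniz' rule comes from.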
In contrast, Neidinger [9] has developed a multivariate automatic differentiation
system that uses explicit looping and subscripting. This system avoids the recompu­
tation that can arise in the recursion process, and should be expected to outperform a
direct implementation of the design presented here.
Inspired by Neidinger's approach, there are obvious strategies for reducing some
of the recursive approach's inefficiency. In particular, a carefully optimized multipli­
cation procedure, based on Leibniz' rule rather than simple recursion, might make a
significant impact. Another attractive idea is to identify and exploit redundant calcu­
lations in the recursion process. Yet another improvement would be to take advantage
of sparseness, eliminating computations that ultimately lead to multiplication by zero.
Whether a modified version of the recursive system would be competitive with Nei­
dinger's system is a question for further study.
However, no matter what formulation is used, direct computation of all partial
derivatives of an expression is simply not the fastest approach. A more efficient al­
ternative is to use systems of univariate automatic differentiation computations and
an interpolation scheme [1]. Although this does increase memory requirements, it is
easily shown to produce huge reductions in execution for large scale systems. Thus,
for example, in a system with several hundred variables and a need for third order par­
tial derivatives, any direct computation of all partial derivatives would be much slower
than the alternative using interpolation.
On the other hand, computational speed is not always an issue. An automatic differ­
entiation system of the type described here has been used successfully in an interactive
application for analyzing systems of constraints arising in the design of satellite sys­
tems. In that context, automatic differentiation was used to perform sensitivity analyses
among dozens of variables. For this application, computation was limited by the speed
of user input, not by the speed with which the automatic differentiation system op­
erated. In that situation, the speed of the automatic differentiation system was of no
concern at all.
More generally, as computational speed continues to increase, the importance of
execution efficiency will continue to decline, particularly for problems with small
numbers of variables. In these cases, the directness and simplicity of the current de­
velopment offers an attractive paradigm for implementing an automatic differentiation
system.

Acknowledgment. This paper is based on an invited address at the January 1997 AMS/MAA meeting [5].

REFERENCES
1. C. Bischof, G. Corliss, and A. Griewank, Structured second- and higher-order derivatives through univariate
Taylor series, preprint MCS-P296-0392, Argonne National Laboratory, Argonne, Illinois, May 1992.
2. H. Flanders, Automatic differentiation of composite functions, in Automatic Differentiation of Algorithms:
Theory, Implementation, and Application, A. Griewank and G. F. Corliss, eds., SIAM, Philadelphia, 1991,
pp. 95-99.
3. A. Griewank, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, SIAM,
Philadelphia, 2000.
4. A. Griewank and G. F. Corliss, eds., Automatic Differentiation of Algorithms: Theory, Implementation, and
Application, SIAM, Philadelphia, 1991.
5. D. Kalman, Automatic differentiation: computing derivative values without derivative formulas, Invited address,
Joint Meetings of the American Mathematical Society and the Mathematical Association of America,
San Diego, January 1997.
6. D. Kalman and R. Lindell, Recursive multivariate automatic differentiation, Optimization Methods and Software
6 (1995), 161-192.
7. C. L. Lawson, Automatic differentiation of inverse functions, in Automatic Differentiation of Algorithms:
Theory, Implementation, and Application, A. Griewank and G. F. Corliss, eds., SIAM, Philadelphia, 1991,
pp. 87-94.
8. R. D. Neidinger, Automatic differentiation and APL, College Math. J. 20 (1989), 238-251.
9. R. D. Neidinger, An efficient method for the numerical evaluation of partial derivatives of arbitrary order,
ACM Transactions on Mathematical Software 18 (1992), 159-173.
10. L. B. Rall, The arithmetic of differentiation, this MAGAZINE 59 (1986), 275-282.
NOTES
Avoiding Your Spouse at a Party Leads to War
MARC A. BRODIE
The College of St. Benedict
St. Joseph, MN 56374
[email protected]

In her note in the February 2001 issue of this MAGAZINE, "Avoiding Your Spouse at
a Bridge Party," Barbara H. Margolius [4] addresses the Bridge Couples Problem, a
question originally considered in terms of dancing by James Brawner a year earlier
("Dinner, Dancing and Tennis, Anyone?" this MAGAZINE, February 2000 [2]):

THE BRIDGE COUPLES PROBLEM. Suppose n married couples (2n people) are
invited to a bridge party. Bridge partners are chosen at random, without regard to
gender. What is the probability that no one will be paired with his or her spouse?

I see no reason to avoid only my spouse. In addition to a wife, I have two children,
and we would all like to avoid each other. Presumably, there are other families of four
in this situation. Thus we will consider the following problem:

THE BRIDGE FAMILIES PROBLEM. Suppose n families of four (4n people) are
invited to a bridge party. Bridge partners are chosen at random, without regard to
gender or generation. What is the probability that no one will be paired with a member
of his or her family?

Note that the Bridge Families Problem is different from the generalization of the
Bridge Couples Problem that Margolius offered as an exercise at the end of her article.
The Bridge Families Problem can also be interpreted as a problem arising in the
playing of the card game War. War is played by thoroughly shuffling a standard deck of
52 playing cards and dealing 26 cards to each of two players. The players then compare
the top cards in their hands, with both of those cards going to the player with the higher
ranking card. Play continues in this fashion until each player has played all of his or
her 26 cards. In the case of a match (sometimes called a "battle"), that is both players
turning over cards of the same rank, additional rules exist concerning how to proceed.
However, we will not be concerned with that issue here. Neither will we consider plays
beyond the first 26, though in the traditional game of War, play generally extends well
beyond going through the deck once. In terms of War, the Bridge Families Problem
becomes:

THE PROBLEM OF WAR WITHOUT BATTLES. Suppose a well-shuffled deck consisting
of 4n cards (4 distinguishable cards of each of n linearly ordered ranks) is dealt
to two players so that each player has 2n cards. What is the probability that when the
players play through their decks and compare the cards, there are no matches?

We choose to use the language of cards over that of families because cards in a deck
come with an obvious ordering by rank. This ordering allows for many additional
interesting questions, some of which have been investigated by two of my students
in their undergraduate research projects and honors theses. Here we will answer the
following question, which is closely related to the Problem of War Without Battles:

THE ANNIHILATION PROBLEM. What is the probability that in a game of War
with a deck of 4n cards, player 1 annihilates player 2 (that is, has a higher ranking card
in all 2n plays)?

The Annihilation Problem is of particular interest to me since I was once almost
annihilated by one of my children (he won all but the last card)! As we shall later
see, the probability of being annihilated when using a standard deck of cards, and the
probability of being almost annihilated in this fashion, are both less than $\frac{1}{300{,}000{,}000}$.
Thus, I am left wondering just who shuffled the deck before that ill-fated game.

Computing the probability of k matches

We envision the deal of the cards as a $2 \times 2n$ rectangular array, with the cards of
player 1 in the first row, and those of player 2 in the second (see FIGURE 1). A match
then corresponds to two cards of the same rank in the same column. Thus in FIGURE 1
there are four matches shown, J♦ and J♣, 3♥ and 3♠, J♠ and J♥, and 8♣ and 8♠. Note
that there are two matches involving jacks; we will call such a situation a double match.
The presence of the (unmatched) 3♣ and 8♦ indicates that the 3s and 8s are single matches.

3♣ 5♠ J♦ 7♠ 3♥ Q♥ K♥ 4♣ J♠ 5♦ 6♣ 2♥ A♠ 8♣ 4♠
7♦ 8♦ J♣ Q♣ 3♠ 6♠ 5♥ 2♣ J♥ 7♣ 2♦ 9♣ 9♥ 8♠ K♦

Figure 1  A possible deal of the cards for a standard deck (n = 13)
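
As a small check on the terminology, the columns shown in FIGURE 1 can be scanned mechanically. In this sketch each card is written as a rank followed by a suit letter (C, D, H, S); the suit letters are a transcription of the figure and play no role in the count, since a match depends only on rank. The function name `matched_ranks` is a hypothetical helper.

```python
from collections import Counter

# The columns visible in FIGURE 1, top row for player 1, bottom row for player 2.
row1 = "3C 5S JD 7S 3H QH KH 4C JS 5D 6C 2H AS 8C 4S".split()
row2 = "7D 8D JC QC 3S 6S 5H 2C JH 7C 2D 9C 9H 8S KD".split()


def matched_ranks(top, bottom):
    """Ranks of the columns in which both players hold a card of the same rank."""
    return [t[:-1] for t, b in zip(top, bottom) if t[:-1] == b[:-1]]


per_rank = Counter(matched_ranks(row1, row2))
doubles = sorted(r for r, c in per_rank.items() if c == 2)
singles = sorted(r for r, c in per_rank.items() if c == 1)
print(sum(per_rank.values()), doubles, singles)
```

The scan finds four matching columns, with the jacks forming the double match and the 3s and 8s forming single matches, in agreement with the description above.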

It is the potential for double matches that makes computing the probability of $k$
matches an interesting generalization of the Bridge Couples Problem. Since there are
$4n$ distinguishable cards being placed into $4n$ positions in the array, there are $(4n)!$
possible deals. Let $k$ be a fixed integer, $1 \le k \le 2n$. We will compute the probability
of a deal with at least $k$ matches, then use the inclusion-exclusion principle to compute
the probability of a deal with exactly $k$ matches.
If we have $k$ matches, some may be doubles. We let $m$ be the number of double
matches, and $r$ be the number of single matches. Thus $2m + r = k$, and we must
consider all possible values of $m$ from $m = 0$ to $m = \lfloor k/2 \rfloor$. Note, for example, that
in a standard deck of 52 cards (where $n = 13$), if we let $k = 20$, we will need at
least seven double matches. In general, the smallest possible value of $m$ will be the
maximum of the set $\{0, k - n\}$. To count the number of deals with at least $k$ matches,
we first observe that there are $\binom{2n}{k}$ ways to specify which positions contain the matches.
We then select the $r$ ranks for single matches, choose two of each of those four cards
for the match, and choose which of those cards goes to each player. This stage results in

$$\binom{n}{r} \binom{4}{2}^{r} \cdot 2^r = \binom{n}{r} \cdot 12^r \tag{1}$$

possibilities. Next, from the remaining $n - r$ ranks we choose the $m$ ranks involved in
the double matches, split the four cards of each of those ranks into an unordered pair of
unordered pairs, and choose which card goes to which player in each of the resulting
$2m$ pairs. This next stage of our process results in

$$\binom{n-r}{m} \cdot 3^m \cdot 2^{2m} = \binom{n-r}{m} \cdot 12^m \tag{2}$$
possibilities. Finally, the $k$ pairs we have chosen can be put into the $k$ specified
positions in $k!$ ways, and the remaining $4n - 2k$ cards can be put into the array in
$(4n - 2k)!$ ways. Thus, for a fixed $m$, the number of deals that have at least $k$ matches
in specified positions is the product of $k!\,(4n - 2k)!$ and formulas (1) and (2):

$$\binom{n}{r} \binom{n-r}{m} \cdot 12^{m+r} \cdot k!\,(4n-2k)! = \frac{12^{k-m} \cdot k!\,n!\,(4n-2k)!}{m!\,(k-2m)!\,(n-k+m)!}. \tag{3}$$

Thus the probability of a deal with at least $k$ matches in specified positions is
given by

$$
\sum_{m=\max(0,k-n)}^{\lfloor k/2 \rfloor} \frac{12^{k-m} \cdot k!\,n!\,(4n-2k)!}{m!\,(k-2m)!\,(n-k+m)!\,(4n)!}
= \frac{12^{k} \cdot k!\,n!\,(4n-2k)!}{(4n)!} \sum_{m=\max(0,k-n)}^{\lfloor k/2 \rfloor} \frac{1}{12^{m} \cdot m!\,(k-2m)!\,(n-k+m)!}.
$$

We are now in a position to solve the Problem of War Without Battles. We will find
the probability of no matches, $p_{0,n}$, by computing the probability of the complement,
at least one match. Keep in mind that, for example, the event of a match in the first
position, and that of a match in the second position are not disjoint. Thus we cannot
find the probability of at least one match by simply evaluating our formula above when
$k = 1$. Instead, we use the inclusion-exclusion principle; we include all the nondisjoint
possibilities with one match, then subtract off the overlapping possibilities with two
matches, and so on. Since there are $\binom{2n}{k}$ ways to specify the $k$ positions in which there
are matches, inclusion-exclusion tells us that the probability of at least one match is

$$
p_{1^*,n} = \sum_{k=1}^{2n} \left( (-1)^{k+1} \binom{2n}{k} \frac{12^{k} \cdot k!\,n!\,(4n-2k)!}{(4n)!} \cdot \sum_{m=\max(0,k-n)}^{\lfloor k/2 \rfloor} \frac{1}{12^{m} \cdot m!\,(k-2m)!\,(n-k+m)!} \right). \tag{4}
$$

(The asterisk in $p_{1^*,n}$ signifies that this is the probability of at least one match.)
Hence, the solution to the Problem of War Without Battles is $p_{0,n} = 1 - p_{1^*,n}$.
At this point, we pause to answer the Annihilation Problem. Since the Annihilation
Problem does not consider situations involving battles, we need only determine the
probability that player 1 has a higher ranking card than player 2 does in each of the
$2n$ plays, given that there are no matches. Since all deals are assumed to be equally
likely, this probability is quickly seen to be $1/2^{2n}$, so that the probability of annihilation
is $p_{0,n}/2^{2n}$. When $n = 13$, we compute that for a standard deck of cards, the
probability of no battles is approximately 0.210214, and the probability of my annihilation
is $0.210214/2^{26} \approx 3.13243 \times 10^{-9}$. Note that under the same assumptions, the
probability that player 1 has a higher ranking card in all but the last of the $2n$ plays is
also $1/2^{2n}$. Thus the probability of my almost being annihilated in this fashion is also
approximately $3.13243 \times 10^{-9}$.
Using more general Inclusion-Exclusion formulas (see [3, Chapter IV]), we now
compute the probability of exactly $j$ matches, for any $j$:

$$
p_{j,n} = \sum_{k=j}^{2n} \left( (-1)^{k+j} \binom{k}{j} \frac{12^{k} \cdot n!\,(2n)!\,(4n-2k)!}{(2n-k)!\,(4n)!} \cdot \sum_{m=\max(0,k-n)}^{\lfloor k/2 \rfloor} \frac{1}{12^{m} \cdot m!\,(k-2m)!\,(n-k+m)!} \right). \tag{5}
$$
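
The inclusion-exclusion computation can be reproduced with exact rational arithmetic. The sketch below is an illustration, not the Mathematica code used for the tables: `prob_specified` evaluates the probability that k specified columns all hold matches (formula (3) summed over m and divided by (4n)!), and `prob_exactly` combines these with alternating signs $(-1)^{k+j}$, which is the sign convention assumed here for exactly j matches. Both function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def prob_specified(n, k):
    """Probability that k specified columns all hold matches, for a deck of 4n cards."""
    total = Fraction(0)
    for m in range(max(0, k - n), k // 2 + 1):        # m = number of double matches
        total += Fraction(
            12 ** (k - m) * factorial(k) * factorial(n) * factorial(4 * n - 2 * k),
            factorial(m) * factorial(k - 2 * m) * factorial(n - k + m) * factorial(4 * n),
        )
    return total


def prob_exactly(n, j):
    """Probability of exactly j matches, by inclusion-exclusion over k >= j columns."""
    return sum(
        ((-1) ** (k + j) * comb(k, j) * comb(2 * n, k) * prob_specified(n, k)
         for k in range(j, 2 * n + 1)),
        Fraction(0),
    )


p0 = prob_exactly(13, 0)
print(float(p0))                 # probability of no battles with a standard deck
print(float(p0) / 2 ** 26)       # probability of annihilation
print(prob_exactly(13, 25))      # 25 matches force a 26th, so this should vanish
```

Working with `Fraction` avoids the cancellation problems that the alternating sum would cause in floating point, at the cost of some very large integers.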
As a reality check, we had Mathematica use (5) to compute the probability distribution
for a standard deck of cards, some of which is shown in TABLE 1.

TABLE 1: The probability distribution for the number of matches
using a standard deck of cards.

 j    p_{j,13}          j    p_{j,13}
 0    0.210214          7    0.000595043
 1    0.334183          8    0.0000946984
 2    0.259336          9    0.0000128083
 3    0.130713          ⋮    ⋮
 4    0.048028          24   3.33819 × 10^{-25}
 5    0.0136839         25   0
 6    0.00314026        26   5.34967 × 10^{-28}

Given the relative complexity of the formula for $p_{j,13}$, it was particularly pleasing
to find that $p_{25,13} = 0$, which of course must be the case since it is impossible to have
25 matches without the two remaining cards matching also.

The asymptotic behavior of the probability of war without battles

In Margolius' article [4], it was shown that the probability that no one will be paired
with his or her spouse converges to $e^{-1/2}$. Here we investigate the convergence (as
$n \to \infty$) of the probability, $p_{0,n}$, of no battles in a game of War with a deck of $4n$ cards.
In TABLE 2, values of $p_{0,n}$, as computed by Mathematica for increasing $n$, are displayed.

TABLE 2: The probability of zero matches in a game of War with
different sized decks of cards.

 n     p_{0,n}        n      p_{0,n}
 13    0.210214       50     0.219779
 20    0.214742       100    0.221456
 30    0.217542       500    0.222795
 40    0.218941       1000   0.222963

Note that although convergence is slow, appearances certainly indicate that the se­
quence is increasing, and in this case, since the sequence is bounded above by 1 , a
limit must exist. The interested reader might wish to make a conjecture as to the limit
before reading further. (Hint: the limit involves e.)

THEOREM. The limit of the probability of no battles in a game of War with cards
of $n$ different ranks is

$$\lim_{n \to \infty} p_{0,n} = e^{-3/2} \approx 0.223130.$$

Proof. The presence of the double summation and the floor function in formula (4)
makes finding the limit of the $p_{0,n}$ directly from that formula somewhat difficult, so
we take a different approach. We will first compute the probability, $q_{1^*,n}$, of having at
least one double match. Using (3), with $k = 2$ and $m = 1$, choosing two positions to
contain the double match, and noting that we're overcounting, we find that

$$
q_{1^*,n} \le \binom{2n}{2} \cdot \frac{12 \cdot 2 \cdot n!\,(4n-4)!}{(n-1)!\,(4n)!} = \frac{6n(2n-1)}{(4n-1)(4n-2)(4n-3)}.
$$
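Therefore,

$$\lim_{n \to \infty} q_{1^*,n} = 0.$$

Since the probability of a double match goes to 0, we may evaluate the limit of the
probability of no battles in a game of War by using (4) with the inner sum evaluated
only at $m = 0$. Doing so yields

$$
\lim_{n \to \infty} p_{1^*,n} = \lim_{n \to \infty} \sum_{k=1}^{n} (-1)^{k+1} \frac{12^{k}}{k!} \cdot \frac{n!\,(2n)!\,(4n-2k)!}{(n-k)!\,(2n-k)!\,(4n)!}.
$$

A formal argument using epsilonics (similar to that used by Margolius [4]) now
shows that

$$\lim_{n \to \infty} p_{1^*,n} = 1 - e^{-3/2}.$$

However, we choose to note that one can see this result more informally by observing
that

$$
\frac{n!\,(2n)!\,(4n-2k)!}{(n-k)!\,(2n-k)!\,(4n)!} = \frac{n(n-1) \cdots (n-k+1) \cdot 2n(2n-1) \cdots (2n-k+1)}{4n(4n-1) \cdots (4n-2k+1)},
$$

so that for fixed $k$,

$$
\lim_{n \to \infty} \frac{n!\,(2n)!\,(4n-2k)!}{(n-k)!\,(2n-k)!\,(4n)!} = \frac{2^{k}}{4^{2k}} = \frac{1}{8^{k}}.
$$

Thus,

$$
\lim_{n \to \infty} p_{1^*,n} = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{12^{k}}{k!} \cdot \frac{1}{8^{k}} = -\sum_{k=1}^{\infty} \frac{(-3/2)^{k}}{k!} = 1 - e^{-3/2},
$$

and so $\lim_{n \to \infty} p_{0,n} = e^{-3/2}$. ∎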

Suggestions for further investigation In [1], Blom, Holst, and Sandell consider
(among other problems) the "matching Sing-Sing problem." This problem is equivalent
to the Bridge Couples Problem. Blom, Holst, and Sandell prove that as $n \to \infty$,
the probability distribution of the number of matches approaches a Poisson distribution
with parameter $\lambda = 1/2$. Given the result of our theorem, the obvious conjecture
is that in the Problem of War Without Battles, the probability distribution of the number
of matches approaches a Poisson distribution with parameter $\lambda = 3/2$. Is this the
case? Numerical evidence at least indicates this is plausible. For example, in TABLE 3,
we compare the probability distribution for the number of matches playing with a standard
deck of cards (from TABLE 2) with the probabilities from the Poisson distribution
$P(j; 3/2)$.

TABLE 3: A comparison of the probability distributions of the number of matches for
$n = 13$ with the Poisson distribution with parameter $\lambda = 3/2$.

 j    p_{j,13}      P(j; 3/2)        j    p_{j,13}              P(j; 3/2)
 0    0.210214      0.223130         7    0.000595043           0.000756426
 1    0.334183      0.334695         8    0.0000946984          0.000141830
 2    0.259336      0.251021         9    0.0000128083          0.0000236383
 3    0.130713      0.125511
 4    0.048028      0.0470665       24    3.33819 × 10^{-25}    6.05401 × 10^{-21}
 5    0.0136839     0.0141200       25    0                     3.63240 × 10^{-22}
 6    0.00314026    0.00352999      26    5.34967 × 10^{-28}    2.09562 × 10^{-23}
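The Poisson column of TABLE 3 is easy to regenerate. The sketch below recomputes $P(j; 3/2) = e^{-3/2}(3/2)^j/j!$ and prints it next to the first few values of $p_{j,13}$, which are copied from the table rather than computed; the names `poisson_pmf` and `war_n13` are ours.

```python
import math

def poisson_pmf(j: int, lam: float) -> float:
    """Probability that a Poisson(lam) random variable equals j."""
    return math.exp(-lam) * lam**j / math.factorial(j)

# p_{j,13} for j = 0..6, copied from TABLE 3 (not recomputed here).
war_n13 = {0: 0.210214, 1: 0.334183, 2: 0.259336, 3: 0.130713,
           4: 0.048028, 5: 0.0136839, 6: 0.00314026}

for j, p in war_n13.items():
    q = poisson_pmf(j, 1.5)
    print(f"{j}  {p:.6f}  {q:.6f}  {p - q:+.6f}")
```

The last printed column is the gap between the two distributions, which is what the conjecture says should vanish as $n$ grows.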
In the annihilation problem, we considered only the case that player 1 had a higher
ranking card than player 2 in each play. What if we allow matches, and adopt the
convention that if there is a match, the winner of the next play takes all four cards?
What if we adopt the standard convention that in the event of a match, each player's
next three cards are placed face down, and their fourth cards are compared, with the
winner taking all ten cards involved? What are the probabilities of annihilation in these
cases? Here, "annihilation" means player 2 doesn't win any cards during the course
of play. Numerous generalizations can be investigated. What if we play with a deck
consisting of six cards per rank? What about m cards per rank? What if m is odd?
Finally, what if, as often seems to be the case in my family, we're not playing with a
full deck?

Acknowledgment. The author would like to thank the two anonymous referees for their comments. This paper
was substantially improved by their suggestions.

REFERENCES
1. Gunnar Blom, Lars Holst, and Dennis Sandell, Three Sing-Sing problems, Amer. Math. Monthly 102 (1995),
887.
2. J. N. Brawner, Dinner, dancing and tennis, anyone?, this MAGAZINE 73 (2000), 29-35.
3. W. Feller, An Introduction to Probability Theory and Its Applications, 3rd ed., revised printing, John Wiley
and Sons, New York, NY, 1970.
4. Barbara H. Margolius, Avoiding your spouse at a bridge party, this MAGAZINE 74 (2001), 33-41.

Bernoulli on Arc Length

VICTOR MOLL
Tulane University
New Orleans, Louisiana 70148
vhm@math.tulane.edu

JUDITH NOWALSKY
University of New Orleans
New Orleans, Louisiana 70148
jnowalsk@math.uno.edu

GINED ROA
LEONARDO SOLANILLA
Universidad del Tolima
Ibagué, Colombia
gined@latinmail.com
solanila@bunde.tolinet.com.co

The academic life of the Bernoulli family was always surrounded by controversy. The
disputes between Johann (John) and his older brother and former teacher Jacob and
with his son Daniel are famous and well documented. An interesting discussion of
this remarkable family is found in Section 12.6 of [3]. After the death of L'Hopital,
John claimed the authorship of his classical analysis book. In the controversy between
Leibniz and Newton about the creation of calculus, he stood on Leibniz' side. His
controversial positions were not restricted to mathematics: he was even accused of
denying the possibility of the resurrection of Christ.
In the course of our study of the history of elliptic integrals, we found a paper by
Johann Bernoulli [1] which, in our opinion, both illuminates the calculation of arc
lengths of smooth curves, a topic covered in most undergraduate calculus programs
around the world, and provides an additional tool for producing new and interesting
examples of rectifiable curves. According to Bernoulli, these are curves whose arc
length can be expressed as elementary functions of their end points. The paper contains
a main theorem that is perfectly valid even today, and admits a nice interpretation in
terms of the notion of radius of curvature. Furthermore, we discovered in it a colorful
antecedent of Landen integral transformations [2].
Let $y = y(x)$ be a differentiable function defined on $[a, b]$. Then its arc length is
defined by

\[ \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\; dx. \]

In general, this integral is not trivial. The examples and exercises provided in most textbooks
look unnatural: for instance, the first example given in Thomas [4], page 395,
deals with the arc length of the curve

\[ y = \frac{4\sqrt{2}}{3}\, x^{3/2} - 1 \]

for $0 \le x \le 1$. This is an easy example in the sense that it is computable:

\[ g(1) = \int_0^1 \sqrt{1 + 8\xi}\; d\xi = \frac{13}{6}. \]
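The value $13/6$ can be confirmed numerically. The following sketch applies a composite Simpson rule to the arc length integrand of $y = \frac{4\sqrt{2}}{3}x^{3/2} - 1$; the helper `simpson` and the step count are our own choices, not anything from Thomas [4].

```python
import math

def simpson(f, a, b, steps=1000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def dy(x):
    # Derivative of y = (4*sqrt(2)/3) * x**1.5 - 1, namely 2*sqrt(2)*sqrt(x).
    return 2 * math.sqrt(2) * math.sqrt(x)

length = simpson(lambda t: math.sqrt(1 + dy(t) ** 2), 0.0, 1.0)
print(length, 13 / 6)
```

Since $(dy/dx)^2 = 8x$, the integrand is $\sqrt{1 + 8x}$, which is why the antiderivative is elementary.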
The reader can easily verify that the integral corresponding to the length of a circle
can be evaluated. However, the calculation of the arc length of an ellipse leads to the
integral

\[ L(x) = a \int_0^x \sqrt{\frac{1 - e^2 \xi^2}{1 - \xi^2}}\; d\xi, \]

where $a$ is the semimajor axis of the ellipse, and $e$ its eccentricity. This last integral is
one of the fundamental elliptic integrals and is not an elementary function. It was the
starting point of our research on Bernoulli's work.

Bernoulli's universal theorem The main goal of this section is to present Bernoulli's
result on how to produce rectifiable curves, which in this sense might also be called
rectifiable by straight lines; their arc length can be expressed as elementary functions
of their end points.
THEOREM 1. Let $y = y(x)$ be a twice differentiable function satisfying

\[ \left(\frac{dy}{dx}\right)^2 + 3x\, \frac{dy}{dx}\, \frac{d^2y}{dx^2} > 0 \quad (< 0) \]

in its interval of definition $[a, b]$. Define a new curve with coordinates

\[ X = x \left(\frac{dy}{dx}\right)^3, \qquad Y = \frac{3x}{2}\left(\frac{dy}{dx}\right)^2 - \frac{1}{2}\int_a^x \left(\frac{dy}{d\xi}\right)^2 d\xi. \]

Now let $g(x)$ and $G(x)$ be the arc lengths of $y$ and the parametric curve $(X(x), Y(x))$
starting at $x = a$. Then

\[ g(x) + (-)\, G(x) = \left[\, \xi \left(\frac{dg}{d\xi}\right)^3 \right]_{\xi=a}^{\xi=x} \]

for all $x \in [a, b]$.


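Proof. First observe that, since $dg/dx = \sqrt{1 + (dy/dx)^2}$,

\[ \frac{d}{dx}\left[ x \left(\frac{dg}{dx}\right)^3 \right] = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2} + 3x \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} \frac{dy}{dx}\, \frac{d^2y}{dx^2}. \]

Following Bernoulli's recommendation, we compute

\[ \frac{dX}{dx} = \frac{dy}{dx}\, \frac{dY}{dx}. \]

On the other hand, careful differentiation shows that

\[ G(x) = \int_a^x \sqrt{\left(\frac{dX}{d\xi}\right)^2 + \left(\frac{dY}{d\xi}\right)^2}\; d\xi
= \int_a^x \sqrt{1 + \left(\frac{dy}{d\xi}\right)^2}\; \left| \left(\frac{dy}{d\xi}\right)^2 + 3\xi\, \frac{dy}{d\xi}\, \frac{d^2y}{d\xi^2} \right| d\xi. \]

To conclude the proof, note that the integrand of $g(x) + (-)\, G(x)$ is $\frac{d}{dx}\left[ x \left(\frac{dg}{dx}\right)^3 \right]$, so the
result follows from the Fundamental Theorem of Calculus. ∎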
For example, the function $y = \ln x$, with $a = 1$, yields

\[ X = \frac{1}{x^2}, \qquad Y = \frac{2}{x} - \frac{1}{2} = 2\sqrt{X} - \frac{1}{2}. \]
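A numerical sanity check of this example may be helpful. The sketch below assumes the reading of Theorem 1 in which the right-hand side is $\xi\,[1 + (dy/d\xi)^2]^{3/2}$ evaluated between $a$ and $x$; for $y = \ln x$ the quantity $(dy/dx)^2 + 3x\,\frac{dy}{dx}\frac{d^2y}{dx^2} = -2/x^2$ is negative, so the minus sign applies. The helper `simpson` and the choice $a = 1$, $x = 2$ are ours.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, x = 1.0, 2.0

# Arc length of y = ln t on [a, x].
g = simpson(lambda t: math.sqrt(1 + 1 / t**2), a, x)

# Auxiliary curve X(t) = 1/t**2, Y(t) = 2/t - 1/2, so X' = -2/t**3 and Y' = -2/t**2.
G = simpson(lambda t: math.hypot(-2 / t**3, -2 / t**2), a, x)

def boundary(t):
    # t * (1 + y'(t)**2)**(3/2) for y = ln t.
    return t * (1 + 1 / t**2) ** 1.5

lhs = g - G
rhs = boundary(x) - boundary(a)
print(lhs, rhs)
```

Both printed numbers should agree to many digits, which is the content of the theorem for this curve.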
After struggling to get the correct constants in some assertions in Bernoulli's article,
we discovered a nice interpretation of Theorem 1. This formulation eluded Bernoulli
as he did not relate the result to the curvature of the graph $y = y(x)$. Recall that the
radius of curvature of the graph $y = y(x)$ at a point $x$ is

\[ R(x) = \frac{\left[1 + (dy/dx)^2\right]^{3/2}}{d^2y/dx^2}. \]

This is the radius of a circle whose curvature matches that of the curve at the given
point. Let us restate the previous theorem in terms of curvature.

THEOREM 2. Under the assumptions of Theorem 1,

\[ g(x) \pm G(x) = \left[\, \xi\, \frac{d^2y}{d\xi^2}\, R(\xi) \right]_{\xi=a}^{\xi=x}. \]

Geometrically, denote respectively by $C(a)$, $C(x)$ the centers of the osculating
circles at the points $A(a) = (a, y(a))$, $A(x) = (x, y(x))$ on the curve. Also, let
$\alpha(a) = \angle BAC(a)$, $\alpha(x) = \angle BAC(x)$ be the corresponding angles between the radii
of curvature $R(a) = CA(a)$, $R(x) = CA(x)$ at these points and the hypotenuses of
some right triangles $ABC(a)$, $ABC(x)$, as sketched in the figure below. The positions
of points $B(a)$ and $B(x)$ along the rays shown are determined by the angles $\alpha(a)$
and $\alpha(x)$, respectively. Then

\[ CB(a) = R(a) \tan \alpha(a), \qquad CB(x) = R(x) \tan \alpha(x). \]

Figure 1 Geometric interpretation


In order to get the rectification, set

\[ \tan \alpha(a) = a \left. \frac{d^2y}{d\xi^2} \right|_{\xi=a}, \qquad \tan \alpha(x) = x \left. \frac{d^2y}{d\xi^2} \right|_{\xi=x}. \]

In this way,

g(x) ± G (x) = CB(x) - CB(a)


gives a new meaning to Theorem 2: the sum (difference) of the arc length integrals
equals the difference of two straight segments. Bernoulli was proud to declare that this
sum (difference) could be measured on a straight line.

Parabolas In Bernoulli's language, a parabola is a curve defined by the function
$y = x^q$, for $q$ a rational number. In this section we discuss parabolas that are rectifiable
by the above method. Remember that a curve $y = y(x)$ is rectifiable if its arc
length integral admits an antiderivative in terms of elementary functions. Bernoulli
was interested in the question of rectifiable parabolas and was aware of the following
result.
THEOREM 3. Let $n$ be a nonzero integer. Then the parabola $y = x^{(2n+1)/(2n)}$ is rectifiable
on $[0, 1]$.
Proof. The arc length is

\[ g(x) = \int_0^x \sqrt{1 + \left(\frac{2n+1}{2n}\right)^2 \xi^{1/n}}\; d\xi, \]

and the substitution $u(\xi) = 1 + \left(\frac{2n+1}{2n}\right)^2 \xi^{1/n}$ yields

\[ g(x) = n \left(\frac{2n}{2n+1}\right)^{2n} \int_1^{u(x)} \sqrt{u}\, (u-1)^{n-1}\; du, \]

which can be evaluated by expanding $(u-1)^{n-1}$ using the binomial theorem. ∎
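The substitution in this proof can be tested numerically for small positive $n$. The sketch below assumes the arc length reduces to $n\left(\frac{2n}{2n+1}\right)^{2n}\int_1^{u(x)}\sqrt{u}\,(u-1)^{n-1}\,du$ with $u(x) = 1 + \left(\frac{2n+1}{2n}\right)^2 x^{1/n}$, expands $(u-1)^{n-1}$ by the binomial theorem, and compares with direct quadrature. The names `closed_form`, `numeric_length`, and `simpson` are ours.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def closed_form(n: int, x: float) -> float:
    """Arc length of y = t**((2n+1)/(2n)) on [0, x] for n >= 1, via the binomial theorem."""
    c = (2 * n + 1) / (2 * n)
    upper = 1 + c**2 * x ** (1 / n)
    total = 0.0
    for j in range(n):
        # (u-1)**(n-1) = sum_j C(n-1, j) * u**j * (-1)**(n-1-j)
        coeff = math.comb(n - 1, j) * (-1) ** (n - 1 - j)
        p = j + 1.5  # antiderivative of u**(j + 1/2) is u**p / p
        total += coeff * (upper**p - 1) / p
    return n * c ** (-2 * n) * total

def numeric_length(n: int, x: float) -> float:
    """The same arc length by direct quadrature of sqrt(1 + y'(t)**2)."""
    c = (2 * n + 1) / (2 * n)
    return simpson(lambda t: math.sqrt(1 + c**2 * t ** (1 / n)), 0.0, x)

for n in (1, 2, 3):
    print(n, closed_form(n, 1.0), numeric_length(n, 1.0))
```

For $n \ge 2$ the integrand has an unbounded derivative at 0, so the quadrature converges slowly and the two columns agree only to about five or six digits.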

The reader may recognize that this result is the source of most arc length exercises
in textbooks. Our first example corresponds to $n = 1$. Moreover, the presence of the
factor $4\sqrt{2}/3$ is not essential to the solution of the problem: it is window dressing.
We can now use Theorem 1 to assert that every parabola can be rectified by adding
the arc length of another (conveniently chosen) parabola.
THEOREM 4. Any parabola $y = x^q$, $q \ne 2/3$, can be rectified by adding (subtracting)
to its arc length the arc length of the auxiliary parabola

\[ Y = \frac{3q-2}{2q-1}\; q^{-1/(3q-2)}\; X^{(2q-1)/(3q-2)}, \]

where $X = q^3 x^{3q-2}$. In particular, the usual quadratic (Archimedean) parabola
$y = x^2$ is rectified by adding the arc length of the biquadratic-cubic parabola
$Y = \frac{4}{3}\, 2^{-1/4} X^{3/4}$.

Proof. Note that

\[ \left(\frac{dy}{dx}\right)^2 + 3x\, \frac{dy}{dx}\, \frac{d^2y}{dx^2} = (3q-2)\, q^2\, x^{2(q-1)}. \]

The rest of the proof is a straightforward calculation. ∎
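For the Archimedean parabola $y = x^2$ the calculation can be checked numerically. The sketch below assumes the auxiliary parabola has exponent $(2q-1)/(3q-2)$ and coefficient $\frac{3q-2}{2q-1}\,q^{-1/(3q-2)}$, parametrizes it by $t \mapsto (8t^4, \tfrac{16}{3}t^3)$ when $q = 2$, and compares $g(x) + G(x)$ with $x(1 + 4x^2)^{3/2}$. All function names are ours.

```python
import math
from fractions import Fraction

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def auxiliary_exponent(q: Fraction) -> Fraction:
    """Exponent (2q-1)/(3q-2) of the auxiliary parabola."""
    return (2 * q - 1) / (3 * q - 2)

def auxiliary_coefficient(q: float) -> float:
    """Coefficient (3q-2)/(2q-1) * q**(-1/(3q-2)) of the auxiliary parabola."""
    return (3 * q - 2) / (2 * q - 1) * q ** (-1 / (3 * q - 2))

x = 1.0
# Arc length of y = t**2 on [0, x].
g = simpson(lambda t: math.sqrt(1 + 4 * t**2), 0.0, x)
# Auxiliary curve for q = 2: X(t) = 8 t**4, Y(t) = (16/3) t**3.
G = simpson(lambda t: math.hypot(32 * t**3, 16 * t**2), 0.0, x)

print(auxiliary_exponent(Fraction(2)), auxiliary_coefficient(2.0))
print(g + G, x * (1 + 4 * x**2) ** 1.5)
```

Parametrizing by $t$ avoids the integrable singularity of $dY/dX$ at $X = 0$, which would trip up a naive quadrature in the variable $X$.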


Integral transformations Many interesting questions can be formulated at this
point. For instance: Under what circumstances does the degree of the auxiliary
parabola equal the degree of the original parabola? The answer is clearly given by
the fixed points of the rational transformation $b(q) = (2q-1)/(3q-2)$, $q \ne 2/3$,
namely, $q = 1, 1/3$. Since the first value gives a trivial answer, Bernoulli considered
only the second value, which corresponds to the primary cubic parabola $y = x^{1/3}$.
This case is important because it yields $Y = X^{1/3}$, and consequently

g (x) - G (x)
r � r
o 1 + 9fl/3 d� - o
1
2 L
dX
/1 +
!X �
J Y
=
J 9X 4/ 3

= _l_ 1 + / d� = x 1 +
1 3/2( '
)
27x
9� 4 3 9x 4f3
an actual arc length integral formula !
It is interesting to study the sequence defined recursively by $p_0 = q$, $p_n = b(p_{n-1})$,
$n = 1, 2, \ldots$, for a given starting value $q$. For example, if $q = 2$, then $p_1 = 3/4$ and
$p_2 = 2$ (again). This implies that if the original parabola is the usual $y = x^2$, for which
$X = 8x^4$ and $Y = \frac{4}{3}\, 2^{-1/4} X^{3/4}$, then

But applying the transformation from Theorem 1 once more to the auxiliary parabola,
we obtain $\mathcal{X} = 2^{-3/4} X^{1/4} = x$ and $\mathcal{Y} = 2^{-3/2} \sqrt{X} = x^2$. Thus

from which

Finally, we may ask these questions : How many different values can a sequence P n
take and still lead to an arc length integral formula ? Is there any relation between the
convergence of this type of sequence and new arc length integral formulas ?
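A small experiment with exact rational arithmetic may help the reader explore these questions. The sketch below iterates $b(q) = (2q-1)/(3q-2)$ and records the orbit; the names `b` and `orbit` are ours. For the values sampled, the orbit returns to its starting point after two steps, in line with the direct computation $b(b(q)) = q$ for $q \ne 2/3$.

```python
from fractions import Fraction

def b(q: Fraction) -> Fraction:
    """The rational map b(q) = (2q - 1)/(3q - 2), defined for q != 2/3."""
    return (2 * q - 1) / (3 * q - 2)

def orbit(q: Fraction, steps: int) -> list:
    """Return [p_0, p_1, ..., p_steps] with p_0 = q and p_n = b(p_{n-1})."""
    seq = [q]
    for _ in range(steps):
        seq.append(b(seq[-1]))
    return seq

print(orbit(Fraction(2), 4))      # the example q = 2 from the text
print(orbit(Fraction(1, 3), 2))   # a fixed point
print(orbit(Fraction(5, 7), 4))   # an arbitrary rational start
```

Using `Fraction` rather than floats keeps the periodicity exact instead of approximate.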

REFERENCES
1. J. Bernoulli, Theorema universale rectificationi linearum curvarum inserviens. Nova Parabolarum propietas.
Cubicalis primaria arcum mensura, etc., Opera Omnia /4, Georg Olms Verlagbuchhandlung, Hildesheim,
1968, pp. 249-25. English translation: Universal Theorem useful for the rectification of curves by J. L. Nowalsky,
G. Roa, L. Solanilla, preprint, 1999.
2. V. Moll, J. Nowalsky, and L. Solanilla, The story of Landen, the hyperbola and the ellipse, Elem. Math. 57
(2002), 1-7.
3. J. Stillwell, Mathematics and its History, Undergraduate Texts in Mathematics, Springer Verlag, 1989.
4. G. Thomas and R. Finney, Calculus and Analytic Geometry, 9th ed., Addison Wesley, 1996.

Proof Without Words:
A Line through the Incenter of a Triangle

A line passing through the incenter of a triangle bisects the perimeter if and only if it
bisects the area.

[Figure: triangle ABC cut by a line through its incenter]

\[ A_{\text{top}} = \frac{(b - b_1 + c - c_1)\, r}{2}, \qquad A_{\text{bottom}} = \frac{(a + b_1 + c_1)\, r}{2} \]

\[ A_{\text{top}} = A_{\text{bottom}} \iff a + b_1 + c_1 = \frac{a + b + c}{2} \]
-- Sidney H. Kung
University of North Florida
Jacksonville, FL 32224

Finite Groups That Have Exactly
n Elements of Order n

CARRIE E. FINCH
University of South Carolina
Columbia, SC 29208

RICHARD M. FOOTE
University of Vermont
Burlington, VT 05405

LENNY JONES
Shippensburg University
Shippensburg, PA 17257

DONALD SPICKLER, JR.
Salisbury State University
Salisbury, MD 21801

Several years ago one of the authors placed the following rather innocuous question
on a group theory exam: Can a finite group have exactly two elements of order two?
While the correct answer of "no" can be proven fairly easily by a variety of techniques,
depending on the sophistication of the solver, the authors discovered that the general­
ization from "two" to n is not as painless, nor is the answer always negative. In this
note we investigate the answer to the question: For which n do there exist finite groups
that have exactly n elements of order n ? We present the details of the solution in the
abelian case while only stating the result in the nonabelian case.

Preliminaries For $n = 1$, the question is easily answered since all finite groups contain
exactly one element of order one, namely the identity element. So assume that
$n > 1$ and write $n = \prod_{i=1}^{m} p_i^{a_i}$, where the $p_i$ are distinct primes. Suppose that $G$ has
exactly $n$ elements of order $n$.
Define an equivalence relation on the set of elements of order $n$ by saying that
elements $x$ and $y$ are related if and only if they generate the same cyclic subgroup.
The number, $k$, of equivalence classes is the number of distinct cyclic subgroups of $G$
of order $n$. Each such subgroup of $G$ contains exactly $\phi(n) = \prod_{i=1}^{m} p_i^{a_i - 1}(p_i - 1)$
elements of order $n$, where $\phi$ is Euler's totient function. This means that the number
of distinct cyclic subgroups of $G$ of order $n$ is

\[ k = \frac{n}{\phi(n)} = \prod_{i=1}^{m} \frac{p_i}{p_i - 1}. \]

Since $k$ is an integer, and each odd $p_i$ makes $p_i - 1$ even while $\prod p_i$ is divisible by 2
at most once, the prime 2 must occur and at most one $p_i$ can be odd; such an odd prime
must satisfy $(p_i - 1) \mid 2$.
Consequently, $m = 1$ or $m = 2$. If $m = 1$, then $p_1 = 2$. If $m = 2$, then $p_1 = 2$ and
$p_2 = 3$. Hence we have the following:

PROPOSITION 1. Let $n > 1$. Then a finite group $G$ has exactly $n$ elements of order
$n$ if and only if
either $n = 2^a$, and $G$ has exactly two cyclic subgroups of order $2^a$, $a \ge 1$,
or $n = 2^a 3^b$, and $G$ has exactly three cyclic subgroups of order $2^a 3^b$, $a, b \ge 1$.
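The arithmetic behind Proposition 1 is that $n/\phi(n)$ must be an integer. A brute-force sketch (our own illustration, with the hypothetical helper names `phi` and `is_2a3b`) confirms for small $n$ that this happens exactly when $n = 2^a$ or $n = 2^a 3^b$ with $a \ge 1$, and that the quotient is then 2 or 3.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's totient function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_2a3b(n: int) -> bool:
    """True when n = 2**a * 3**b with a >= 1 and b >= 0."""
    if n % 2:
        return False
    while n % 2 == 0:
        n //= 2
    while n % 3 == 0:
        n //= 3
    return n == 1

candidates = [n for n in range(2, 200) if n % phi(n) == 0]
print(candidates)
print([n // phi(n) for n in candidates])
```

The second printed list gives the number $k$ of cyclic subgroups of order $n$ that a group with exactly $n$ elements of order $n$ would need.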
Although Proposition 1 imposes severe restrictions on the structure of a group that has
exactly n elements of order n, we can, nevertheless, construct infinitely many such
groups.

PROPOSITION 2. If a finite group $G$ has exactly $n$ elements of order $n$ and $H$ is a
finite group such that its order is relatively prime to the order of $G$, then $G \times H$ also
has exactly $n$ elements of order $n$.

For example, it is easy to check that $G = Z_2 \times Z_4$ has exactly 4 elements of order 4;
and if $H$ is any finite group such that the order of $H$ is odd, then $G \times H$ also has
exactly 4 elements of order 4. Therefore, we have

COROLLARY. There exist infinitely many groups that contain exactly $n$ elements
of order $n$.

In light of Proposition 2 the following definition seems natural.

DEFINITION. Let $G$ be a finite group with exactly $n$ elements of order $n$. We call
$G$ minimal if no proper subgroup of $G$ has exactly $n$ elements of order $n$.

The abelian case Suppose that $G$ is an abelian minimal finite group with exactly $n$
elements of order $n$. From the preceding section we have that $n = 2^a 3^b$ with $a \ge 1$
and $b \ge 0$. By the Fundamental Theorem of Finitely Generated Abelian Groups, $G$ is
isomorphic to the direct product of subgroups of prime power order, namely its Sylow
$p$-subgroups, where $p$ runs over the distinct primes dividing the order of $G$ [1, Theorem 5,
Section 5.2]. Grouping all the direct factors for primes greater than 3 together
into a subgroup $G_m$, we obtain that $G \cong G_2 \times G_3 \times G_m$, where $G_2$ and $G_3$ are the Sylow
2- and 3-subgroups respectively. Since the order of an element in a direct product
is the least common multiple of the orders of the elements in its components, it follows
easily that $G_2 \times G_3$ contains all elements in $G$ of order $n$ (more generally, one can deduce
this from Exercise 18, Section 3.2 of [1], using $N = G_2 \times G_3$ and $H$ any cyclic
subgroup of order $n$). Therefore, by Proposition 2, since $G$ is minimal, we must have
$G_m = 1$, that is, $G \cong G_2 \times G_3$; furthermore, by the Fundamental Theorem, the Sylow
subgroups $G_2$ and $G_3$ decompose further into direct products of cyclic groups as:

\[ G_2 \cong \underbrace{Z_2 \times \cdots \times Z_2}_{s_1 \text{ factors}} \times \underbrace{Z_{2^2} \times \cdots \times Z_{2^2}}_{s_2 \text{ factors}} \times \cdots \times \underbrace{Z_{2^a} \times \cdots \times Z_{2^a}}_{s_a \text{ factors}} \]

and

\[ G_3 \cong \underbrace{Z_3 \times \cdots \times Z_3}_{t_1 \text{ factors}} \times \underbrace{Z_{3^2} \times \cdots \times Z_{3^2}}_{t_2 \text{ factors}} \times \cdots \times \underbrace{Z_{3^b} \times \cdots \times Z_{3^b}}_{t_b \text{ factors}} \]

with $G_3$ possibly being trivial. In the situation when $G_3$ is trivial, $n = 2^a$ and $G \cong G_2$.
When $a = 1$, every nonidentity element in $G$ has order two, and hence the number of
elements of order $2^a$ in $G$ is $2^{s_a} - 1$, which is obviously not equal to $2^a$. For $a > 1$,
one way to count the number of elements of order $2^a$ is to count the total number of
elements in $G$ and then subtract the number of elements of order less than $2^a$. Now, in
the group $Z_{2^a}$, there are $\phi(2^a) = 2^{a-1}$ elements of order $2^a$. If we think of an element
in $G$ as a vector with an element from $Z_2$ in each of the first $s_1$ positions, an element
from $Z_{2^2}$ in each of the next $s_2$ positions and so on, then an element of order $2^a$ in $G$
must have a generator of $Z_{2^a}$ in at least one of the last $s_a$ positions in the vector.
So elements of $G$ with order strictly less than $2^a$ may have any element in the first
$s_1 + s_2 + \cdots + s_{a-1}$ positions and must have an element of order strictly less than $2^a$
in each of the last $s_a$ positions in the vector. Using these facts, we find that the number
of elements of order $2^a$ in $G$ is

\[ 2^{\left(\sum_{k=1}^{a-1} k s_k\right) + (a-1) s_a} \left(2^{s_a} - 1\right). \]

Since we want this to equal $2^a$, we get that $s_a = 1$, $s_1 = 1$, and $s_k = 0$ for
$k = 2, \ldots, a-1$. Hence, $G \cong Z_2 \times Z_{2^a}$. Now assume that $G_3$ is nontrivial, so that
$b \ge 1$. Then, equating the actual count of the number of elements in $G$ of order $2^a 3^b$
to $2^a 3^b$ gives the equation

\[ 2^{\left(\sum_{k=1}^{a-1} k s_k\right) + (a-1) s_a} \left(2^{s_a} - 1\right)\, 3^{\left(\sum_{k=1}^{b-1} k t_k\right) + (b-1) t_b} \left(3^{t_b} - 1\right) = 2^a 3^b, \tag{1} \]

where we adopt the convention that the summation is zero if the upper limit is smaller
than the lower limit. From (1) we see that $2^{s_a} - 1$ must be a power of 3 and $3^{t_b} - 1$
must be a power of 2. Therefore we have to solve the Diophantine equations

\[ 2^{s_a} - 1 = 3^x \tag{2} \]

and

\[ 3^{t_b} - 1 = 2^y. \tag{3} \]
When $s_a$ is odd, $2^{s_a}$ is congruent to 2 modulo 3. So (2) is impossible modulo 3 unless
$x = 0$, and hence $s_a = 1$. When $s_a$ is even, we can factor the left side of (2) as the
difference of two squares and conclude that the only solution is $s_a = 2$. Similarly, when
$t_b$ is even we get the solution $t_b = 2$; and when $t_b$ is odd, $3^{t_b}$ is congruent to 3 modulo 4.
Reduction of (3) modulo 4 produces the one solution $y = 1$ and consequently, $t_b = 1$.
Hence, there are four cases to consider and in each case we equate exponents in (1) to
determine if a group exists. The four cases are listed below.

• $s_a = t_b = 1$
Equating exponents in (1) we get
\[ \sum_{k=1}^{a-1} k s_k = 0 \quad \text{and} \quad \sum_{k=1}^{b-1} k t_k = 1, \]
which implies that $b \ge 2$, $s_1 = s_2 = \cdots = s_{a-1} = t_2 = t_3 = \cdots = t_{b-1} = 0$ and
$t_1 = 1$. Hence,
\[ G \cong Z_{2^a} \times Z_3 \times Z_{3^b}. \]

• $s_a = 2$, $t_b = 1$
Equating exponents in (1) we get
\[ \sum_{k=1}^{a-1} k s_k = 1 - a \quad \text{and} \quad \sum_{k=1}^{b-1} k t_k = 0, \]
which implies that $a = 1$ and $t_1 = t_2 = t_3 = \cdots = t_{b-1} = 0$. Hence,
\[ G \cong Z_2 \times Z_2 \times Z_{3^b}. \]

• $s_a = 1$, $t_b = 2$
Equating exponents in (1) we get
\[ \sum_{k=1}^{a-1} k s_k = -2 \quad \text{and} \quad \sum_{k=1}^{b-1} k t_k = 2 - b, \]
which is impossible so no group exists for this case.

• $s_a = t_b = 2$
Equating exponents in (1) we get
\[ \sum_{k=1}^{a-1} k s_k = -a - 1 \quad \text{and} \quad \sum_{k=1}^{b-1} k t_k = 1 - b, \]
which again is impossible and no group exists for this case.

Therefore we have proven the following:


THEOREM 1. An abelian group $G$ is a minimal finite group with exactly $n > 1$
elements of order $n$ if and only if $n = 2^a 3^b$ and $G$ is isomorphic to one of the following
groups:
• $Z_2 \times Z_{2^a}$, $a \ge 2$, $b = 0$
• $Z_{2^a} \times Z_3 \times Z_{3^b}$, $a \ge 1$, $b \ge 2$
• $Z_2 \times Z_2 \times Z_{3^b}$, $a = 1$, $b \ge 1$
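The counting half of this theorem is easy to confirm by brute force for small parameters. The sketch below enumerates a direct product of cyclic groups $Z_{m_1} \times \cdots \times Z_{m_k}$, computes each element's order as a least common multiple, and counts the elements of order $n$. It checks only the count, not minimality, and the names `count_order` and `families` are ours.

```python
from itertools import product
from math import gcd

def count_order(moduli, n):
    """Number of elements of order exactly n in Z_{m1} x ... x Z_{mk}."""
    count = 0
    for element in product(*(range(m) for m in moduli)):
        order = 1
        for x, m in zip(element, moduli):
            component_order = m // gcd(x, m)  # order of x in Z_m
            order = order * component_order // gcd(order, component_order)
        if order == n:
            count += 1
    return count

def families(max_a=3, max_b=3):
    """Yield (moduli, n) for small members of the three families in the theorem."""
    for a in range(2, max_a + 2):
        yield [2, 2**a], 2**a
    for a in range(1, max_a + 1):
        for b in range(2, max_b + 1):
            yield [2**a, 3, 3**b], 2**a * 3**b
    for b in range(1, max_b + 1):
        yield [2, 2, 3**b], 2 * 3**b

results = [(moduli, n, count_order(moduli, n)) for moduli, n in families()]
for moduli, n, count in results:
    print(moduli, n, count)
```

By contrast, `count_order([4, 3], 12)` counts the elements of order 12 in the cyclic group of order 12, and there are only $\phi(12) = 4$ of them.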

Theorem 1 shows that there exists an abelian group with exactly $2^a 3^b$ elements of
order $2^a 3^b$ except when $a \ge 2$ and $b = 1$. We see next that these curious exceptions
are not duplicated in the nonabelian case.

The nonabelian case Of course from Proposition 2 we can easily manufacture nonabelian
groups that have exactly $n$ elements of order $n$ by letting $G$ be any of the
groups from Theorem 1 and letting $H$ be any nonabelian group such that the order
of $H$ is relatively prime to the order of $G$. This construction, however, violates our
definition of minimality. To guarantee minimality we can assume that the nonabelian
group $G$ with exactly $n$ elements of order $n$ is generated by its subgroups of order $n$.
This formulation allows us to use generators and relations to construct such groups.
Theorem 2 gives the complete classification of minimal nonabelian groups that have
exactly $n$ elements of order $n$. For the details of the proof, see [2].
THEOREM 2. A nonabelian group $G$ is a minimal finite group with exactly $n > 1$
elements of order $n$ if and only if $n = 2^a 3^b$ and $G$ is isomorphic to one of the following
groups:
• $\langle x, y \mid x^{2^a} = y^3 = 1,\ x^{-1} y x = y^{-1} \rangle \times Z_{3^b}$ with $a \ge 1$, $b \ge 1$.
• $\langle x, y \mid x^{2^a} = y^2 = 1,\ y^{-1} x y = x^{2^{a-1}+1} \rangle$ with $a \ge 3$, $b = 0$.
• $Z_{2^a} \times \langle x, y \mid x^{3^b} = y^3 = 1,\ y^{-1} x y = x^{3^{b-1}+1} \rangle$ with $a \ge 1$, $b \ge 2$.
• $Q_8 \times Z_{3^b}$ with $a = 2$, $b \ge 1$, where $Q_8$ is the quaternion group of order 8.
• $S_4 \times Z_{3^b}$ with $a = 2$, $b \ge 1$, where $S_4$ is the symmetric group on four letters.
• $GL_2(3) \times Z_{3^b}$ with $a = 3$, $b \ge 1$, where $GL_2(3)$ is the group of $2 \times 2$ invertible
matrices with entries from the field $Z_3$.


• GL;(3) $\times Z_{3^b}$ with $a = 3$, $b \ge 1$, where GL;(3) is the group of order 48, which
has generalized quaternion Sylow 2-subgroups and contains $SL_2(3)$, the group
of all $2 \times 2$ matrices of determinant 1 with entries from the field $Z_3$, as a subgroup
of index 2.

Open questions The following is a list of some unanswered questions for future
investigation that have arisen from the work in this paper.

• For a given $n$, is it possible to determine what values of $m$ are possible such that
a finite group has exactly $m$ elements of order $n$? What relationship can be found
between $m$ and $n$?
• For a given pair $m$ and $n$, is it possible to classify all minimal finite groups having
exactly $m$ elements of order $n$ and does this classification provide any insight into
the groups themselves?

Acknowledgment. The authors thank the referees for the valuable suggestions.

REFERENCES
1. D. Dummit, R. Foote, Abstract Algebra, 2nd ed., John Wiley & Sons, 1999.
2. C. Finch, R. Foote, L. Jones, The classification of minimal finite groups that have exactly n elements of order
n, preprint.

Rootless Matrices

BERTRAM YOOD
Pennsylvania State University
University Park, PA 16802

Let $A$ be an $n \times n$ matrix over the complex field, $n \ge 2$. An $r$th root of $A$ is a matrix $S$
such that $S^r = A$. For example, $S^2 = W$ where

S=
[ 01 -21 ] '
so that $S$ is a square root of $W$. It is natural to ask when a matrix does or does not
have roots. We say that $A$ is rootless if there is no matrix $S$ and no positive integer
$r \ge 2$ such that $S^r = A$.
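This study started with the rather accidental discovery that the matrix

\[ T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \]

is rootless. To see this, suppose that $S^r = T$ for some $r \ge 2$ and

\[ S = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]

The null space of $T$, that is, the set of all vectors $\begin{pmatrix} x \\ y \end{pmatrix}$ for which $T \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, is the set
of all vectors of the form $\begin{pmatrix} x \\ 0 \end{pmatrix}$. The null space of $S$ is contained in that of $T$ (which is
a rank-one matrix), and therefore the null space of $S$ has dimension zero or dimension
one. It cannot be zero-dimensional, for in that case $S$, and hence $T$, would be one-to-one,
and hence invertible. Thus, $S \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, so that $a = c = 0$. We then have

\[ \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}^r = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \]

It is readily shown by induction that

\[ \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}^r = \begin{bmatrix} 0 & b d^{r-1} \\ 0 & d^r \end{bmatrix}. \]

Therefore, $b d^{r-1} = 1$ and $d^r = 0$, which is impossible, as $r \ge 2$.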
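The induction step is simple enough to spot-check by machine. This sketch (our own, using plain nested lists rather than any matrix library) verifies that the $r$th power of the matrix with rows $(0, b)$ and $(0, d)$ has rows $(0, b d^{r-1})$ and $(0, d^r)$ for a few integer choices of $b$, $d$, and $r$.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, r):
    """A raised to the nonnegative integer power r by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        result = matmul(result, A)
    return result

def power_formula_holds(b, d, r):
    """Check [[0, b], [0, d]]**r == [[0, b*d**(r-1)], [0, d**r]]."""
    return matpow([[0, b], [0, d]], r) == [[0, b * d ** (r - 1)], [0, d**r]]

checks = [power_formula_holds(b, d, r)
          for b, d in [(1, 2), (3, -1), (5, 0), (-2, 7)]
          for r in range(2, 7)]
print(all(checks))
```

With $d = 0$ the power is the zero matrix for every $r \ge 2$, which is the contradiction used in the argument.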
It is easy to check that our matrix $T$ above satisfies $T^2 = 0$. That is, $T$ is a $2 \times 2$
nilpotent matrix such that $T \ne 0$. Recall, an $n \times n$ matrix $A$ is called nilpotent if
$A^s = 0$ for some positive integer $s \ge 2$. We sought a more general result for which
this example would be a special case:

THEOREM 1. Let $A$ be an $n \times n$ matrix over the complex field, $n \ge 2$. Then $A$ is
rootless if $A^{n-1} \ne 0$ and $A^n = 0$.
For the following clever proof we are indebted to one of the referees.

Proof. The proof is by contradiction. Suppose that $A = S^r$, for $r \ge 2$. Then $S^{rn} =
A^n = 0$ so that $S$ is an $n \times n$ nilpotent matrix. It is well known that if some power
of an $n \times n$ matrix is 0, then its $n$th power is zero (this will be proved independently
below). Therefore, $S^k = 0$ for all positive integers $k \ge n$. But we also have

\[ S^{r(n-1)} = A^{n-1} \ne 0. \]

Now $n \ge 2$ and $r \ge 2$, hence,

\[ \frac{n}{n-1} \le 2 \le r \quad \text{so that} \quad n \le r(n-1). \]

Therefore $S^{r(n-1)} = 0$, or $A^{n-1} = 0$, which is contrary to the hypotheses on $A$. Hence
$A$ is rootless. ∎
We point out that $A$ need not be rootless if $A^{k-1} \ne 0$ and $A^k = 0$ for some positive
integer $k < n$. An example with $n = 3$ and $k = 2$ is the following:

\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]

So the right-hand $3 \times 3$ matrix has a square root, yet its square vanishes.
The problem of solving the matrix equation $X^m = A$, where $A$ is a given matrix,
has been examined with care. Such a solution always exists if $A$ is a self-adjoint matrix
(that is, $A$ is equal to its conjugate transpose). This is a consequence of the spectral theorem
for self-adjoint matrices [2, Ch. 5]. But as we saw above, there do exist rootless
matrices. If $A$ is a nonsingular matrix (one with an inverse), solutions always exist. We
cite the classical reference by Wedderburn [3, pp. 96-97]. Considerable attention has
been given to special cases of $A$ where the roots are polynomials in $A$. For reference,
we cite MacDuffee [1, pp. 119-120].
The class of rootless matrices given in our theorem above is described in a top-down
manner. We would like to add to this a bottom-up characterization which gives a
more precise description of the shape and construction of the rootless matrices of the
theorem.
We say that a matrix $B$ is upper triangular if all entries of $B$ below the diagonal are
zero. That is, if the $b_{ij}$ (the entry in the $i$th row and $j$th column of $B$) satisfy $b_{ij} = 0$ if
$i > j$. A matrix $B$ is said to be strictly upper triangular if every entry of $B$ on or below
the diagonal is zero (that is, $b_{ij} = 0$ if $i \ge j$). Lastly, recall that the superdiagonal of
a matrix is the collection of entries immediately above and to the right of the diagonal
(that is, entries $b_{ij}$ such that $j = i + 1$). Then we have the following:

THEOREM 2. Let $A$ be an $n \times n$ matrix over the complex field, $n \ge 2$. Suppose
that $A$ is of the form $S^{-1} B S$ where $S$ is an invertible matrix, and $B$ is a strictly upper
triangular matrix with all nonzero entries on its superdiagonal. Then $A$ is rootless.
This theorem will be proved as soon as it is shown that the "top-down" characterization
($A^n = 0$, $A^{n-1} \ne 0$) is equivalent to the "bottom-up" hypotheses of Theorem 2. That is,
THEOREM 3. Let $A$ be an $n \times n$ matrix over the complex field, $n \ge 2$. Then $A$ is of
the form $S^{-1} B S$ where $S$ is an invertible matrix, and $B$ is a strictly upper triangular
matrix with all nonzero entries on its superdiagonal, if and only if $A$ satisfies $A^n = 0$
and $A^{n-1} \ne 0$.
We will prove Theorem 3 (and hence Theorem 2) by a series of lemmas. We first
cite the well known result:
LEMMA 1. Any $n \times n$ matrix $A$, $n \ge 2$, is similar to an upper triangular matrix.
The proof for this may be found in many standard references, such as [2, Ch. 5].
Therefore, in our study of an $n \times n$ matrix $A$, we may assume that $A$ is upper triangular.
But we can say more, since we assume as well that $A^n = 0$.
LEMMA 2. Suppose that $A$ is an upper-triangular matrix such that $A^n = 0$. Then
$A$ is strictly upper triangular.
Proof. For any complex number $\lambda \ne 0$, the matrix $\lambda I - A$ is invertible (where $I$
is the identity matrix) since its inverse is $\lambda^{-1} I + \sum_{k=1}^{n-1} \lambda^{-(k+1)} A^k$. Thus zero is the only
possible eigenvalue for $A$. As the diagonal elements of $A$ are its eigenvalues (since it is
assumed $A$ is upper triangular), the diagonal elements are all zero. Hence $A$ is strictly
upper triangular. ∎

Let $\Gamma_n$ denote the set of all $n \times n$ strictly upper triangular matrices.
LEMMA 3. Let $V$ be the product of $k$ matrices in $\Gamma_n$, $1 \le k \le n$. Then the first $k$
columns and the last $k$ rows of $V$ are zero.
Proof. We give a proof by mathematical induction. The statement is true by definition
if $k = 1$. Let $1 \le k < n$. We assume our result for any product $V = (v_{ij})$ of $k$
elements of $\Gamma_n$.
Let $W = (w_{ij})$ be the product of $k + 1$ elements of $\Gamma_n$. We can express $W$ as either
$V_1 B$ or $B V_2$ for $B = (b_{ij})$ an element of $\Gamma_n$ and $V_1$, $V_2$ each a product of $k$ elements
of $\Gamma_n$. We describe each of $V_1$ and $V_2$ in turn by $(v_{ij})$. By the inductive hypothesis
$v_{ij} = 0$ if $j \le k$ or $i > n - k$.
From $W = V_1 B$ we see that the last $k$ rows of $W$ are zero and from $W = B V_2$ we
see its first $k$ columns are zero. We now extend this to show $W$ is zero in column $k + 1$
and in row $n - k$, the $(k+1)$st row from the bottom.
From $W = V_1 B$ we have, for any row $i$,
\[ w_{i,k+1} = \sum_{j=1}^{n} v_{ij}\, b_{j,k+1} = \sum_{j>k} v_{ij}\, b_{j,k+1}. \]
But $b_{j,k+1} = 0$ if $j \ge k + 1$, so we have $w_{i,k+1} = 0$. Thus $W$ is zero in column $k + 1$,
hence in its first $k + 1$ columns.
Similarly from $W = B V_2$ we have, for any column $r$,
\[ w_{n-k,r} = \sum_{j=1}^{n} b_{n-k,j}\, v_{jr}. \]
But $v_{jr} = 0$ for $j > n - k$ and $b_{n-k,j} = 0$ for $j \le n - k$, so that $w_{n-k,r} = 0$,
and $W$ is zero in row $n - k$, hence zero in its last $k + 1$ rows. ∎

As an immediate consequence of the above lemma we consider the cases $k = n$ and
$k = n - 1$, and conclude:
LEMMA 4. For $B \in \Gamma_n$, $B^n = 0$; that is, all elements of $\Gamma_n$ are nilpotent. Further,
let $B^{n-1} = (w_{ij})$. Then every $w_{ij} = 0$ except possibly $w_{1n}$.
We next determine the value of that $w_{1n}$. Given an $n \times n$ matrix $B = (b_{ij})$ let $\pi(B)$ be
the product of the entries on the superdiagonal: $\pi(B) = \prod_{i=1}^{n-1} b_{i,i+1}$.
LEMMA 5. For $B \in \Gamma_n$ let $B^{n-1} = (w_{ij})$. Then $w_{1n} = \pi(B)$.

Proof. Again, we argue by induction on the size of the matrix. The statement of
the lemma is true for $n = 2$, since in that case

\[ B = \begin{bmatrix} 0 & b_{12} \\ 0 & 0 \end{bmatrix}. \]

Suppose that the lemma is valid for all matrices in $\Gamma_k$ where $k$ is some particular
integer, $1 \le k < n$. We then must show that it is valid for any $V \in \Gamma_{k+1}$. Suppose
$V \in \Gamma_{k+1}$, $V = (v_{ij})$. Since $V$ is a strictly upper triangular matrix, its first column and
last row are zero. Furthermore, $v_{i2} = 0$ for $i \ge 2$ and $v_{k+1,j} = 0$ for $1 \le j \le k + 1$.
Hence, if we let $Q$ be the matrix obtained from $V$ by deleting the first row and first
column we have

\[ Q = \begin{bmatrix} 0 & v_{2,3} & \cdots & v_{2,k+1} \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & v_{k,k+1} \\ 0 & \cdots & 0 & 0 \end{bmatrix}. \]

Note that $Q \in \Gamma_k$. By construction, we have

\[ V = \begin{bmatrix} 0 & \begin{matrix} v_{1,2} & \cdots & v_{1,k+1} \end{matrix} \\ 0 & Q \end{bmatrix}. \]

Since the first column of $V$ is zero, we see that

\[ V^{k-1} = \begin{bmatrix} 0 & * \\ 0 & Q^{k-1} \end{bmatrix}. \]

Let $V^{k-1} = (w_{ij})$. By Lemma 4 and our induction hypothesis the entries in $Q^{k-1}$ are all
zero except the entry in the upper right corner; that is, $w_{2,k+1} = \pi(Q) = v_{23} \cdots v_{k,k+1}$.
Note that $w_{i,k+1} = 0$ for $2 < i \le k + 1$. As the first row of $V$ is $(0, v_{12}, \ldots)$ we see that
the sole nonzero entry of $V^k = V \cdot V^{k-1}$ must be the $(1, k+1)$ entry with value $\pi(V)$.
This completes our inductive argument. ∎
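Lemmas 4 and 5 lend themselves to a direct computational check. The sketch below builds strictly upper triangular integer matrices with random entries above the diagonal, then confirms that the $(n-1)$st power has at most one nonzero entry, in the upper right corner, equal to the product of the superdiagonal, and that the $n$th power is zero. The helper names and the use of a seeded `random.Random` are our own choices.

```python
import random

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, r):
    """A raised to the nonnegative integer power r by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        result = matmul(result, A)
    return result

def lemmas_hold(n, rng):
    """Test Lemmas 4 and 5 on one random strictly upper triangular n x n matrix."""
    B = [[rng.randint(1, 5) if j > i else 0 for j in range(n)] for i in range(n)]
    P = matpow(B, n - 1)
    superdiagonal_product = 1
    for i in range(n - 1):
        superdiagonal_product *= B[i][i + 1]
    only_corner = all(P[i][j] == 0
                      for i in range(n) for j in range(n) if (i, j) != (0, n - 1))
    nth_power_zero = matmul(P, B) == [[0] * n for _ in range(n)]
    return only_corner and P[0][n - 1] == superdiagonal_product and nth_power_zero

rng = random.Random(0)
outcomes = [lemmas_hold(n, rng) for n in range(2, 8) for _ in range(5)]
print(all(outcomes))
```

Because the entries are integers, the comparison is exact and no floating-point tolerance is needed.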

With this, the proof of Theorem 3 is nearly complete. Suppose $A$ is an $n \times n$ matrix such that $A^n = 0$ and $A^{n-1} \neq 0$. By Lemmas 1 and 2 we may assume $A$ is strictly upper triangular. But then by Lemma 4 the sole entry of $A^{n-1}$ not known in advance to be zero is the $(1, n)$ entry, which must equal $\pi(A) = a_{1,2} \cdots a_{n-1,n}$ by Lemma 5, and this entry must be nonzero since $A^{n-1} \neq 0$. Therefore each $a_{i,i+1} \neq 0$ if $A^{n-1} \neq 0$. Similarly, if a matrix satisfies the hypotheses of Theorem 2, then it must also satisfy the hypotheses of Theorem 1.
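
Lemmas 4 and 5 are easy to spot-check numerically. The following is a minimal sketch, not part of the article: it builds random strictly upper triangular integer matrices (the helper names and the entry range are assumptions made for illustration) and confirms that $B^n = 0$ and that the only possibly nonzero entry of $B^{n-1}$ is the upper right corner, equal to the superdiagonal product.

```python
import random

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, e):
    """Raise a square matrix to a non-negative integer power by repeated multiplication."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, a)
    return result

def random_strictly_upper(n, rng):
    """Random integer matrix with zeros on and below the main diagonal."""
    return [[rng.randint(-5, 5) if j > i else 0 for j in range(n)]
            for i in range(n)]

def superdiagonal_product(b):
    """pi(B): the product of the entries b[i][i+1]."""
    prod = 1
    for i in range(len(b) - 1):
        prod *= b[i][i + 1]
    return prod

def check_lemmas(n, seed=0):
    """Return True if Lemma 4 and Lemma 5 hold for one random n x n sample."""
    b = random_strictly_upper(n, random.Random(seed))
    w = mat_pow(b, n - 1)
    nilpotent = mat_pow(b, n) == [[0] * n for _ in range(n)]
    only_corner = all(w[i][j] == 0 for i in range(n) for j in range(n)
                      if (i, j) != (0, n - 1))
    return nilpotent and only_corner and w[0][n - 1] == superdiagonal_product(b)
```

A call such as `check_lemmas(5, seed=3)` tests a single sample; this is a sanity check, not a substitute for the induction above.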
There are some further questions the reader might like to consider. We have shown
that nilpotent $n \times n$ matrices $A$ such that $A^{n-1} \neq 0$ are rootless. Such nilpotent matrices
are of greatest possible rank (here, rank $n - 1$). These have been termed in the
literature principal nilpotents and are part of many interesting problems in matrix theory.
All of the rootless matrices shown here are principal nilpotents, but there are
nonnilpotent rootless matrices. The reader is encouraged to work this out to show the
matrix

[b � �]
0 0 0

is rootless, but not nilpotent. Also, we saw an example of a nilpotent matrix that was
not principal and had a square root. Is this always the case? That is, are there nilpotent
matrices of less than maximal rank that are still rootless? With these interesting ques­
tions worked out, the reader should try to give a complete description of all rootless
matrices, and we hope that our remarks will help you on your way.

REFERENCES
1. C. C. MacDuffee, The Theory of Matrices, Chelsea Publ. Co., New York, 1946.
2. G. Strang, Linear Algebra and its Applications, 3rd ed., Harcourt, Brace, Jovanovich, San Diego, CA, 1988.
3. J. H. M. Wedderburn, Lectures on Matrices, Amer. Math. Soc. Colloq. Publ., vol. 17, Providence, RI, 1934.

Lots of Smiths
PATRICK COSTELLO
Eastern Kentucky University
Richmond, KY 40475
pat.costello@eku.edu

KATHY LEWIS
Somerset Community College
Somerset, KY 42501

Undoubtedly, many people had called Dr. Harold Smith at 493-7775 without thinking
much about his phone number. Dr. Smith's brother-in-law, Albert Wilansky, however,
noticed something very interesting about this phone number. When written as the single
number 4937775, it is a composite number where the sum of the digits in its prime
factorization is equal to the digit sum of the number. Adding the digits in the number
and the digits of its prime factors 3, 5, 5, and 65837 resulted in identical sums of 42.
Wilansky, a mathematics professor at Lehigh University, termed numbers having this
property to be Smith numbers [5]. It turns out that the terminology was an appropriate
choice because we will show that Smith numbers are very common, about as common
as the name Smith in most American phone books.
The number 4 is the smallest Smith number because it is composite, it has a digit
sum of 4, and the sum of the digits in its prime factorization is 2 + 2 = 4. In his
article, Wilansky provided two other slightly larger examples of Smith numbers: 9985
and 6036. He also told how many Smith numbers lie between 0 and 9999; as you can
check, there are 376 of them. Because these numbers seemed to occur fairly frequently,
Wilansky raised the question of whether there are infinitely many Smith numbers.
In 1987, Wayne McDaniel [3] succeeded in showing that infinitely many Smith
numbers do in fact exist. McDaniel's approach was through a generalization of the
problem. He defined a k-Smith number to be a composite integer where the sum of
the digits in the prime factorization is equal to k times the digit sum of the number.
In his article, McDaniel produced an infinite sequence of k-Smith numbers for each
positive integer k. Since k = 1 corresponds to Wilansky's definition of a simple Smith
number, McDaniel has shown that there are infinitely many Smith numbers. We will
give further evidence of their abundance by producing yet another infinite sequence of
Smith numbers.

Notation and basic theorems For any positive integer $n$, we let $S(n)$ denote the
sum of the digits of $n$ and $S_p(n)$ denote the sum of the digits of the prime factorization
of $n$. A composite number is a Smith number when these two quantities are equal. For example, $S(27) =
2 + 7 = 9$ and $S_p(27) = S_p(3 * 3 * 3) = 3 + 3 + 3 = 9$. Hence 27 is a Smith number.
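
The two digit-sum functions translate directly into code. Here is a minimal sketch, assuming plain trial division is fast enough for the small numbers involved; the function names `S`, `Sp`, and `is_smith` are chosen for illustration and are not from the article.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def S(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def Sp(n):
    """Sum of the digits of the primes in the factorization of n."""
    return sum(S(p) for p in prime_factors(n))

def is_smith(n):
    """A Smith number is composite and satisfies S(n) == Sp(n)."""
    return len(prime_factors(n)) > 1 and S(n) == Sp(n)

def count_smith_below(limit):
    """Number of Smith numbers strictly below limit."""
    return sum(1 for n in range(2, limit) if is_smith(n))
```

With these definitions, `is_smith(4937775)` checks Dr. Smith's phone number and `count_smith_below(10000)` reproduces Wilansky's count of 376.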
For any positive integer $n$, we let $N(n)$ denote the number of digits of $n$. For example,
$N(27) = 2$. An algebraic formula for $N(m)$ is $N(m) = [\log_{10} m] + 1$, where
$[x]$ is the greatest integer in $x$.
A repunit, denoted $R_n$, is a number consisting of a string of $n$ ones. For example,
$R_4 = 1111$. An algebraic formula for $R_n$ is $R_n = (10^n - 1)/9$.
We now relate a few of the known results about the functions $S$ and $S_p$ that are
pertinent to the construction of our infinite sequence of Smith numbers. In his paper,
McDaniel [3] stated the following theorem without proof. (A detailed proof of the
theorem is supplied in [2].)
THEOREM 1. If $t$ is a positive integer with $t < 9R_n$, then $S(t * (9R_n)) = S(9R_n) = 9n$.
This theorem gives a way to know the digit sum of certain large numbers without
having to expand the number into all its digits. For example, suppose we start with
$9R_5 = 99999$ and arbitrarily choose $t = 44599$. Since $44599 < 99999$, it follows from
Theorem 1 that $S(44599 * 99999) = 9 * 5 = 45$. We can check by expanding the product
$44599 * 99999$. The product gives 4459855401 and in fact $S(4459855401) = 45$.
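
Theorem 1 can be checked exhaustively for small $n$. The sketch below is a brute-force confirmation under the stated hypothesis $t < 9R_n$, not a proof; the function names are illustrative.

```python
def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def theorem1_holds(n):
    """Check S(t * (10**n - 1)) == 9 * n for every positive t < 10**n - 1."""
    m = 10**n - 1          # m = 9 * R_n, a string of n nines
    return all(digit_sum(t * m) == 9 * n for t in range(1, m))
```

For instance, `theorem1_holds(4)` runs through all 9998 admissible values of $t$ for $9R_4 = 9999$.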
The following theorem follows directly from the definition of the function $S_p$.

THEOREM 2. If $m$ and $n$ are positive integers, then $S_p(mn) = S_p(m) + S_p(n)$.
This theorem says that the function $S_p$ is an additive function, a fact we will use
often.
In 1983, Keith Wayland and Sham Oltikar [4] provided another useful theorem.
THEOREM 3. If $S(u) > S_p(u)$ and $S(u) \equiv S_p(u) \pmod 7$, then $10^k u$ is a Smith
number where $k = (S(u) - S_p(u))/7$.
The essence of this theorem is that padding a zero on the end of a number does not
change its digit sum, but it does increase the digit sum of its primes by 7. Each new
zero on the end (which is achieved by multiplying by another factor of 10 = 2 * 5)
adds 2 + 5 = 7 to the Sp (u) value until it equals the S(u) value. Theorem 3 was used
by Oltikar and Wayland to say that every prime whose digits are all 0 and 1 has some
multiple that qualifies as a Smith number. For example, some multiple of the prime
number 1010111111 must be a Smith number. (Try multiplying the prime by 6 and
then apply the Theorem.) Since there are lots of primes containing just 0s and 1s, this
gave further circumstantial evidence (before McDaniel's proof) that there are infinitely
many Smith numbers.
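
The zero-padding idea behind Theorem 3 can be sketched in a few lines. The helper `pad_to_smith` is a name invented for this illustration; it returns $10^k u$ when the hypotheses of Theorem 3 hold for $u$ and `None` otherwise.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def S(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def Sp(n):
    """Sum of the digits of the primes in the factorization of n."""
    return sum(S(p) for p in prime_factors(n))

def pad_to_smith(u):
    """Apply Theorem 3: append k zeros to u, where 7k = S(u) - Sp(u) > 0."""
    gap = S(u) - Sp(u)
    if gap > 0 and gap % 7 == 0:
        return 10 ** (gap // 7) * u   # each factor 10 = 2 * 5 adds 7 to Sp
    return None
```

As a small case, $297 = 3^3 * 11$ has $S(297) = 18$ and $S_p(297) = 11$, so one zero is needed and `pad_to_smith(297)` gives 2970.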
McDaniel [3] also gave an upper bound for $S_p(m)$ that does not involve the value
of specific prime factors of $m$, as follows:
THEOREM 4. If $p_1, \ldots, p_r$ are prime numbers, not necessarily distinct, and if
$m = p_1 p_2 \cdots p_r$, then $S_p(m) < 9N(m) - .54r$.
The new infinite sequence With the help of the previous theorems, we begin to
construct a new infinite sequence of Smith numbers. Start with an integer $n$ greater
than 7. Let $m = 9R_n = 10^n - 1$. By Theorem 4, $S_p(m) < 9N(m) - .54r$, where $r$ is
the number of prime factors of $m$. Since $3^2$ divides $m$ and $R_n > 1$, $m$ has at least three
primes in its factorization. This means that $r > 2$ and $.54r > 1$. Since $m$ has $n$ digits,
Theorem 4 gives us $S_p(m) < 9n - 1$. But $9n$ is actually the digit sum of $m$ and so
$S_p(m) < S(m) - 1$. This implies that $0 < S(10^n - 1) - S_p(10^n - 1)$. Then let $x$ be
the least residue of $S(10^n - 1) - S_p(10^n - 1)$ modulo 7.
We now show that if we multiply $10^n - 1$ by a power of 11 that involves the computed
least residue $x$, we get a number where the $S$ and $S_p$ values are congruent mod 7.
This will be the main ingredient used in generating the new infinite sequence of Smith
numbers.

THEOREM 5. Let $x$ be the least residue of $S(10^n - 1) - S_p(10^n - 1)$ modulo 7. Let
$j$ be the least residue of $4x$ modulo 7. Then

$$S(11^j(10^n - 1)) - S_p(11^j(10^n - 1)) \equiv 0 \pmod 7.$$

Proof. First observe that $S(11^j(10^n - 1)) - S_p(11^j(10^n - 1)) = S(10^n - 1) - S_p(11^j(10^n - 1))$ by Theorem 1 and the choice of $n \ge 8$ so that $11^j < 10^n - 1$. This
is equal to

\begin{align*}
& S(10^n - 1) - S_p(11^j) - S_p(10^n - 1) && \text{by Theorem 2} \\
&= S(10^n - 1) - 2j - S_p(10^n - 1) && \text{by Theorem 2} \\
&= (S(10^n - 1) - S_p(10^n - 1)) - 2j \\
&\equiv x - 2j \pmod 7.
\end{align*}

But since $4x \equiv j \pmod 7$, $8x \equiv 2j \pmod 7$, and the expression above is congruent to
zero. Hence $S(11^j(10^n - 1)) - S_p(11^j(10^n - 1)) \equiv 0 \pmod 7$. ∎

Using Theorems 3 and 5, we can now construct the infinite sequence of Smith
numbers. Let $n$ be an integer greater than 7. Compute $x$ as the least residue of
$S(10^n - 1) - S_p(10^n - 1)$ modulo 7. Compute $j$ to be the least residue of $4x$ mod 7.
Then $S(11^j(10^n - 1)) - S_p(11^j(10^n - 1)) \equiv 0 \pmod 7$ by Theorem 5. So let $k =
(S(11^j(10^n - 1)) - S_p(11^j(10^n - 1)))/7$. Then the number $a_n = 10^k \cdot 11^j (10^n - 1)$
is a Smith number by Theorem 3. Since each integer $n \ge 8$ gives a Smith number,
there must be infinitely many Smith numbers.
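
The recipe above can be followed mechanically. The sketch below is one possible implementation, assuming trial division suffices to factor $10^n - 1$ for the small $n$ tried here; `smith_from_sequence` is an illustrative name, and the function raises an error rather than guessing if the hypotheses of Theorem 3 are not met.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def S(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def Sp(n):
    """Sum of the digits of the primes in the factorization of n."""
    return sum(S(p) for p in prime_factors(n))

def smith_from_sequence(n):
    """Return a_n = 10**k * 11**j * (10**n - 1) as described in the text."""
    m = 10**n - 1
    x = (S(m) - Sp(m)) % 7        # least residue of S - Sp modulo 7
    j = (4 * x) % 7               # exponent of 11 from Theorem 5
    u = 11**j * m
    gap = S(u) - Sp(u)
    if gap <= 0 or gap % 7 != 0:
        raise ValueError("hypotheses of Theorem 3 are not satisfied")
    return 10 ** (gap // 7) * u   # k = gap / 7 trailing zeros
```

Calling `smith_from_sequence(8)` reproduces the number worked out by hand in Example 1.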

Examples We now show the computations needed to produce two specific Smith
numbers in our infinite sequence.

EXAMPLE 1. Let $n = 8$. Then $10^8 - 1 = 99999999$. In this case, $S(10^8 - 1) =
8 * 9 = 72$ and $S_p(10^8 - 1) = S_p(3 * 3 * 11 * 73 * 101 * 137) = 31$ so that $S(10^8 - 1) - S_p(10^8 - 1) = 72 - 31 = 41 \equiv 6 \pmod 7$. Then $x = 6$ in Theorem 5 and $4 * 6 \equiv 3 \pmod 7$, which gives us $j = 3$. We let $k = (S(11^3(10^8 - 1)) - S_p(11^3(10^8 - 1)))/7$.
Then $k = (S(133099998669) - S_p(3 * 3 * 11 * 11 * 11 * 11 * 73 * 101 * 137))/7 =
(72 - 37)/7 = 35/7 = 5$. Finally, $10^5 * 11^3(10^8 - 1) = 13309999866900000$ is the
first Smith number in our sequence.
In 1925, Lt.-Col. Allan J. C. Cunningham and H. Woodall published a small volume
of tables of the factorizations of $b^n \pm 1$ for the bases $b = 2, 3, 5, 6, 7, 10, 11, 12$ to
various powers of n. The authors left blanks in the tables where new factors could be
entered. They put question marks on numbers of unknown character. Most importantly,
they gave credit to those who had discovered notable factors in the past. All of these
techniques stimulated work on the remaining composite numbers in the tables. The
ongoing work on the Cunningham-Woodall tables has usually been referred to as the
Cunningham project.
Factorizations of $b^n \pm 1$ for the bases $b = 2, 3, 5, 6, 7, 10, 11, 12$ to various high
powers $n$ are easily available [1]. In fact, for any value of $10^n - 1$, $n \ge 8$, that has
been completely factored in the table, we can find the corresponding Smith number
belonging to our sequence.
EXAMPLE 2. Let $n = 44$. The factorization of $10^{44} - 1$ is given [1] as $10^{44} - 1 =
3^2 * 11^2 * 23 * 89 * 101 * 4093 * 8779 * 21649 * 513239 * 1052788969 * 1056689261$.
Adding up the digits in the prime factors, we get $S_p(10^{44} - 1) = 225$. Since $10^{44} - 1 = 9R_{44}$, we have that $S(10^{44} - 1) = 44 * 9 = 396$. Then $S(10^{44} - 1) - S_p(10^{44} - 1) = 396 - 225 = 171 \equiv 3 \pmod 7$. So $x = 3$ in Theorem 5 and $4 * 3 \equiv 5 \pmod 7$,
which gives us $j = 5$.
We let $k = (S(11^5(10^{44} - 1)) - S_p(11^5(10^{44} - 1)))/7 = (396 - (225 + 5 * 2))/7 =
(396 - 235)/7 = 161/7 = 23$. Thus the 73-digit number $10^{23} \cdot 11^5 (10^{44} - 1)$ is the
Smith number in our sequence corresponding to $n = 44$.
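
The arithmetic in Example 2 can be replayed from the quoted factorization without factoring anything. The sketch below takes the eleven listed factors as given (their primality is assumed from the table in [1], not verified here) and recomputes each quantity; the dictionary and function names are illustrative.

```python
# Factorization of 10**44 - 1 as quoted in Example 2 (prime -> exponent).
FACTORS = {3: 2, 11: 2, 23: 1, 89: 1, 101: 1, 4093: 1, 8779: 1, 21649: 1,
           513239: 1, 1052788969: 1, 1056689261: 1}

def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def example2():
    """Recompute the quantities of Example 2 from the quoted factorization."""
    m = 10**44 - 1
    product = 1
    for p, e in FACTORS.items():
        product *= p**e
    sp_m = sum(e * digit_sum(p) for p, e in FACTORS.items())
    x = (digit_sum(m) - sp_m) % 7
    j = (4 * x) % 7
    u = 11**j * m
    sp_u = sp_m + 2 * j                # each factor 11 contributes 1 + 1
    k = (digit_sum(u) - sp_u) // 7
    a = 10**k * u
    return {
        "product_ok": product == m,    # the factors multiply back to 10**44 - 1
        "Sp": sp_m, "x": x, "j": j, "k": k,
        "digits": len(str(a)),
        "smith": digit_sum(a) == sp_u + 7 * k,
    }
```

The returned dictionary collects the values that the example derives by hand.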
The Smith numbers that McDaniel produces in his infinite sequence have the form
$t(10^n - 1)10^m$, where $t$ is chosen from the set $\{2, 3, 4, 5, 7, 8, 15\}$. Our Smith numbers
replace the $t$ value with a power of 11 and sometimes alter the $m$ value. When $n = 8$,
McDaniel's procedure gives $8(10^8 - 1)10^5$; when $n = 44$, it gives $3(10^{44} - 1)10^{24}$. Our
slight change has produced an entirely different infinite sequence of Smith numbers.
We leave the reader with a challenge. Since there seem to be lots of Smith numbers,
can you find another infinite sequence of Smiths? (Hint: Look back at Theorem 5 and
see what role the digit sum of 1 1 played. The key is that 2 is relatively prime to 7. )

REFERENCES
1. J. Brillhart, D. H. Lehmer, J. Selfridge, B. Tuckerman, and S. Wagstaff, Jr., Factorizations of $b^n \pm 1$, $b = 2, 3, 5, 6, 7, 10, 11, 12$ Up to High Powers, 2nd ed., American Mathematical Society, Providence, Rhode Island, 1988.
2. K. Lewis, Smith Numbers: An Infinite Subset of N, M.S. thesis at Eastern Kentucky University, 1994.
3. W. McDaniel, The existence of infinitely many k-Smith numbers, Fibonacci Quarterly 25 (1987), 76-80.
4. S. Oltikar and K. Wayland, Construction of Smith numbers, this MAGAZINE 56 (1983), 36-37.
5. A. Wilansky, Smith numbers, Two-Year College Math Journal 13 (1982), 21.
PROBLEMS
ELGIN H. JOHNSTON, Editor
Iowa State University

Assistant Editors: RAZVAN GELCA, Texas Tech University; ROBERT GREGORAC, Iowa State University; GERALD HEUER, Concordia College; VANIA MASCIONI, Western Washington University; PAUL ZEITZ, The University of San Francisco

Proposals
To be considered for publication, solutions should be received by November 1, 2002.

1648. Proposed by Erwin Just (Emeritus) Bronx Community College, New York, NY.
Prove that there exist an infinite number of integers, none of which is expressible as
the sum of a prime and a perfect square.

1649. Proposed by K. R. S. Sastry, Bangalore, India.


Prove that if a right triangle has all sides of integral length, then it has at most one
angle bisector of integral length.

1650. Proposed by M. N. Deshpande, Nagpur, India.


Let $R(\theta)$ denote the rhombus with unit side and a vertex angle of $\theta$, and let
$n \ge 2$ be a positive integer. Prove that a regular $4n$-gon of unit side can be tiled with
the collection of $n(2n - 1)$ rhombi consisting of $n$ copies of $R(\frac{\pi}{2})$ and $2n$ copies of
each of $R(\frac{k\pi}{2n})$, $1 \le k \le n - 1$.

1651. Proposed by Juan-Bosco Romero Marquez, Universidad de Valladolid, Valladolid, Spain.
Prove that for $x \ge 2$,

$$\left(\frac{x}{e}\right)^{x-1} \le \Gamma(x) \le \left(\frac{x}{2}\right)^{x-1},$$

where $\Gamma$ is the gamma function.

We invite readers to submit problems believed to be new and appealing to students and teachers of advanced
undergraduate mathematics. Proposals must, in general, be accompanied by solutions and by any bibliographical
information that will assist the editors and referees. A problem submitted as a Quickie should have an unexpected,
succinct solution.
Solutions should be written in a style appropriate for this MAGAZINE. Each solution should begin on a
separate sheet.
Solutions and new proposals should be mailed to Elgin Johnston, Problems Editor, Department of
Mathematics, Iowa State University, Ames, IA 50011, or mailed electronically (ideally as a LaTeX file) to
ehjohnst@iastate.edu. All communications should include the reader's name, full address, and an e-mail
address and/or FAX number.

1652. Proposed by Razvan A. Satnoianu, City University, London, United Kingdom.
In triangle $ABC$, let $r$ denote the radius of the inscribed circle, $R$ the radius of the
circumscribed circle, and $p$ the semiperimeter. Prove the following inequalities, and
show that in each case the constant on the right is the best possible:
(a) $\dfrac{R}{p} + \dfrac{p}{R} \ge 2$.
(b) $\dfrac{r}{p} + \dfrac{p}{r} \ge \dfrac{28\sqrt{3}}{9}$.
(c) $\dfrac{r}{p} + \dfrac{p}{r} \ge \dfrac{56}{31}\left(\dfrac{R}{p} + \dfrac{p}{R}\right)$.

Quickies
Answers to the Quickies are on page 233.

Q921. Proposed by Kent Holing, Statoil Research Center, Trondheim, Norway.


Let $m$ and $n$ be relatively prime positive integers. Show that the numbers �
and $\sqrt[3]{m + n}$ are not both constructible with straightedge and compass.

Q922. Proposed by Murray S. Klamkin, University of Alberta, Edmonton, AB, Canada.
Two directly homothetic triangles are such that the incircle of one of them is the
circumcircle of the other. If the ratio of their areas is 4, prove that the triangles are
equilateral.

Solutions

Tromino Tiles June 2001


1623. Proposed by Emeric Deutsch, Polytechnic University, Brooklyn, NY.
Find the number of ways that k copies of the tromino

can be placed, with the orientation shown and without overlapping, on a $3 \times n$ rectangle.

I. Solution by Stephen Blair, Portland State University, Portland, OR.


Any configuration of $k$ trominos on a $3 \times n$ rectangle can be described as an
ordered juxtaposition of four types of column structures. These types are blank
columns, pairs of columns containing one tromino (in either of its two possible positions), and sets of three
adjacent columns containing two trominos. With each of these four column
sets we associate, as shown, a $2 \times 1$ block with zero, one, or two squares shaded:

[Figure: the four column structures and their associated $2 \times 1$ blocks.]

Note that the number of squares shaded in the $2 \times 1$ block is equal to the number
of trominos in the associated column structure. For a given configuration of $k$
trominos on a 3 x n board, we associate a 2 x (n - k) board on which k squares
are shaded. This is done by partitioning the k tromino configuration into the four
types of column structures, then replacing each column structure with its associated
2 x 1 block. For example, the 3 x 8, four-tromino configuration

is associated with the 2 x (8 - 4) board

It is not hard to see that this mapping scheme defines a bijection between the
configurations of $k$ trominos on the $3 \times n$ board and the set of $2 \times (n - k)$ boards
with $k$ cells colored. Because there are $\binom{2(n-k)}{k}$ ways to color $k$ squares on a $2 \times (n - k)$ board, there are also $\binom{2(n-k)}{k}$ ways to place $k$ trominos on a $3 \times n$ board.
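
The count $\binom{2(n-k)}{k}$ can be compared with a brute-force enumeration on small boards. The figure showing the tromino did not survive in this copy, so the sketch below assumes an L-shaped tromino in one fixed orientation, occupying cells $(r, c)$, $(r+1, c)$, and $(r+1, c+1)$; that shape and the function names are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

def tromino_cells(r, c):
    """Cells covered by the assumed L-tromino with corner at row r, column c."""
    return {(r, c), (r + 1, c), (r + 1, c + 1)}

def count_placements(n, k):
    """Count ways to place k non-overlapping trominos on a 3 x n board."""
    # The tromino spans two rows and two columns, so r is 0 or 1 and c < n - 1.
    positions = [tromino_cells(r, c) for r in range(2) for c in range(n - 1)]
    count = 0
    for chosen in combinations(positions, k):
        covered = set().union(*chosen)
        if len(covered) == 3 * k:      # no cell is covered twice
            count += 1
    return count

def formula(n, k):
    """The closed form from Solution I."""
    return comb(2 * (n - k), k)
```

For example, `count_placements(4, 2)` and `formula(4, 2)` can be compared directly; the enumeration grows quickly, so it is only practical for small $n$.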

II. Solution by the proposer.


Imagine that the uncovered squares of the $3 \times n$ rectangle are covered by $1 \times 1$
squares, so the $3 \times n$ rectangle is tiled by $3n - 3k$ single-square pieces and $k$ tromino pieces. The
only tilings that are not concatenations of tilings of smaller rectangles are

[Figure: the indecomposable tilings, labeled (*).]

Furthermore, if a tiling is not one of the tilings in $(*)$, then it is a concatenation of
a finite sequence of such tilings.
Now let $a_{n,k}$ be the number of ways to tile a $3 \times n$ rectangle using $k$ tromino
pieces and $3n - 3k$ single square pieces, and let $G(t, z) = \sum_{n,k \ge 0} a_{n,k} t^k z^n$ be the
generating function for the $a_{n,k}$, where we set $a_{0,0} = 1$. Because the generating
function for the tilings $(*)$ is

$$P(t, z) = z + 2tz^2 + t^2z^3 = z(1 + tz)^2,$$

and any tiling is a concatenation of tilings from $(*)$, it follows that
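
The display that concluded this solution is missing from this copy. A reconstruction consistent with the definition of $P(t, z)$ and with the answer in Solution I, offered as a sketch and not as the proposer's exact wording, is:

```latex
\begin{align*}
G(t, z) &= \frac{1}{1 - P(t, z)} = \sum_{m \ge 0} z^m (1 + tz)^{2m} \\
        &= \sum_{m \ge 0} \sum_{k \ge 0} \binom{2m}{k} t^k z^{m+k},
\end{align*}
```

Reading off the coefficient of $t^k z^n$ (so $m = n - k$) gives $a_{n,k} = \binom{2(n-k)}{k}$.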

Also solved by D. Bednarchak, Agnes Benedek (Argentina), Robert E. Bernstein, Jany C. Binz (Switzerland),
Tom Boerkoel, Marc Brodie, Knut Dale (Norway), Daniele Donini (Italy), Marty Getz and Dixon Jones, Jerrold
W. Grossman, Tom Jager, S. C. Locke, Reiner Martin, Carl P. McCarty and Loretta McCarty, Rob Pratt, Les Reid,
William Tressler, LeRoy Wenstrom, WMC Problems Group, Michel Woltermann, and Li Zhou. There were two
incorrect submissions.

An Ellipsoid Tangent to a Tetrahedron June 2001


1624. Proposed by Murray S. Klamkin, University of Alberta, Edmonton, AB, Canada.
An ellipsoid is tangent to each of the six edges of a tetrahedron. Prove that the three
segments joining the points of tangency of opposite edges are concurrent.
Solution by Michel Bataille, Rouen, France.
Under a suitable affine transformation, the ellipsoid becomes a sphere, and concurrency
and tangency are preserved. Thus we need only consider the case in which the
ellipsoid is a sphere that is tangent at points $R$, $S$, $T$, $U$, $V$, and $W$ to sides $BC$, $CA$,
$AB$, $DA$, $DB$, and $DC$, respectively, of tetrahedron $ABCD$. Because all segments of
tangents from a vertex to the point of tangency on the sphere have the same length,
we can set $x = AS = AT = AU$, $y = BT = BR = BV$, $z = CR = CS = CW$, and
$t = DU = DV = DW$. Denoting by $\mathbf{M}$ the vector from the origin to the point $M$, let
$I$ be the point determined by

$$m\mathbf{I} = yzt\mathbf{A} + ztx\mathbf{B} + txy\mathbf{C} + xyz\mathbf{D},$$

where $m = yzt + ztx + txy + xyz$. Then

$$m\mathbf{I} = zt(y\mathbf{A} + x\mathbf{B}) + xy(t\mathbf{C} + z\mathbf{D}) = zt(y + x)\mathbf{T} + xy(t + z)\mathbf{W}.$$

Because $zt(y + x)$ and $xy(t + z)$ are positive and sum to $m$, it follows that $I$ lies on
segment $TW$. Similarly,

$$m\mathbf{I} = yz(t\mathbf{A} + x\mathbf{D}) + tx(z\mathbf{B} + y\mathbf{C}) = yz(t + x)\mathbf{U} + tx(z + y)\mathbf{R},$$

and

$$m\mathbf{I} = ty(z\mathbf{A} + x\mathbf{C}) + zx(t\mathbf{B} + y\mathbf{D}) = ty(z + x)\mathbf{S} + zx(t + y)\mathbf{V},$$

showing that $I$ lies on segments $UR$ and $SV$ as well. Thus the three segments joining
points of tangency of opposite edges are concurrent at $I$.
Also solved by Daniele Donini (Italy), Ovidiu Furdui, Michael Golomb, Joel Schlosberg, Peter Y. Woo, and
the proposer.

A Product of Powers and a Power of Products June 2001


1625. Proposed by Mihaly Bencze, Romania.
Let $x_1, x_2, \ldots, x_n$ be positive real numbers and let $a_1, a_2, \ldots, a_n$ be positive integers.
Prove that

Solution by Robert R. Burnside, University of Paisley, Scotland.


We prove a more general version of the inequality. Let $b_k$ and $y_k$, $k = 1, 2, \ldots, n$,
be fixed positive real numbers, with $\sum_{k=1}^{n} b_k = 1$. For $s \ge 0$, define $f(s) = \prod_{k=1}^{n} (s + y_k)^{b_k}$. Then

where we have used the weighted arithmetic mean-geometric mean inequality. It follows
that $f'(s) \ge 1$ and hence that $f(x) - f(0) \ge x$ for all $x \ge 0$. Consequently,
$\prod_{k=1}^{n} (x + y_k)^{b_k} \ge x + \prod_{k=1}^{n} y_k^{b_k}$, with equality for $x$ nonzero, if and only if $y_1 = y_2 = \cdots = y_n$.
The inequality in the problem statement is obtained by taking $x = 1$, $y_k = x_k^{1/a_k}$, and
$b_k = a_k / \sum_{k=1}^{n} a_k$, with all $a_k > 0$. If all $a_k < 0$, then the inequality sign is reversed.
Also solved by Michel Bataille (France), Jean Bogaert (Belgium), Knut Dale (Norway), Minh Can, Daniele
Donini (Italy), Costas Efthimiou, Ovidiu Furdui, Tom Jager, Reiner Martin, Michael G. Neubauer, Joel Schlosberg,
Heinz-Jürgen Seiffert (Germany), Beiment Teclezghi and Tewodros Amdeberhan, Xianfu Wang (Canada),
Li Zhou, and the proposer.

A Condition Implying Additivity June 2001
1626. Proposed by Ho-joo Lee, student, Kwangwoon University, Seoul, South Korea.
Let $f, g, h : \mathbb{R} \to \mathbb{R}$ be functions such that $f(g(0)) = g(f(0)) = h(f(0)) = 0$ and

$$f(x + g(y)) = g(h(f(x))) + y \qquad (*)$$

for all $x, y \in \mathbb{R}$. Prove that $h = f$ and that $g(x + y) = g(x) + g(y)$ for all $x, y \in \mathbb{R}$.

Solution by Michael K. Kinyon, Western Michigan University, Kalamazoo, MI.


Setting $x = 0$ in $(*)$ gives

$$f(g(y)) = g(h(f(0))) + y = g(0) + y,$$

for all $y$. Setting $y = 0$ in this expression gives

$$g(0) = f(g(0)) = 0,$$

and it follows that $f(g(y)) = y$ for all $y$. Thus $f$ is surjective and is a left inverse of $g$.
Setting $y = 0$ in $(*)$ we have

$$f(x) = f(x + g(0)) = g(h(f(x))).$$

Because $f$ is surjective, it follows that $g$ is surjective and $h$ is a right inverse of $g$. The
left and right inverses of $g$ must be equal, so we have $g^{-1} = f = h$. Using this in $(*)$
we obtain

$$f(x + g(y)) = f(x) + y.$$

Substituting $x = g(u)$ in this last expression, then applying $g$ to both sides yields

$$g(u) + g(y) = g(u + y),$$

for all $u, y \in \mathbb{R}$. Note that the condition $g(f(0)) = 0$ was not used.
Also solved by Hamza Ahmad, Claudi Alsina (Spain), Geta Techanie Ayele, S. Floyd Barger, Michel Bataille
(France), Brian D. Beasley, Anthony C. Blackman and Eduard S. Belinsky (Barbados), Jean Bogaert (Belgium),
Marc Brodie, Minh Can, Ron Martin Carroll, Con Amore Problem Group (Denmark), Knut Dale (Norway),
Richard Daquila, Charles R. Diminnie, Daniele Donini (Italy), Tim Edwards, Costas Efthimiou, Ovidiu Furdui,
Michel Golomb, Kazuo Goto (Japan), Lee O. Hagglund, Tracy Dawn Hamilton and Howard B. Hamilton, Damian
J. Hammock, Brian Hogan, Joel Iiams, Tom Jager, J. Todd Lee and Paula Grafton Young, S. C. Locke, Hieu D.
Nguyen, Stephen Noltie, Perry and the Masons Solving Group (Spain), Victor Pambuccian, David R. Patten, Sam
L. Robinson and Gerald Thompson, Richard F. Ryan, Grigor Sargsyan, Joel Schlosberg, Heinz-Jürgen Seiffert,
Laishram Shanta Singh and Ritumoni Sarma (India), Shing S. So, Beiment Teclezghi and Tewodros Amdeberhan,
Nora S. Thornber, Thomas Vanden Eynden, Gregory P. Wene, LeRoy Wenstrom, Western Maryland College
Problems Group, Li Zhou, and the proposer.

A Generalization of the Arbelos June 2001


1627. Proposed by Jiro Fukata, Shinsei-cho, Gifu-ken, Japan.
Semicircle $C$ has diameter $A_0A_n$. Semicircles $C_1, C_2, \ldots, C_n$ are drawn so that
$C_k$ has diameter $A_{k-1}A_k$ on $A_0A_n$. In addition, $C_1$ is internally tangent to $C$ at $A_0$
and externally tangent to $C_2$ at $A_1$, $C_n$ is internally tangent to $C$ at $A_n$ and externally
tangent to $C_{n-1}$ at $A_{n-1}$, for $2 \le k \le n - 1$, $C_k$ is externally tangent to $C_{k-1}$ and $C_{k+1}$
at $A_{k-1}$ and $A_k$ respectively, and each $C_k$, $1 \le k \le n$, is tangent to a chord $PQ$ of $C$.
The case $n = 5$ is illustrated in the accompanying figure.

(a) Let $A_1A_1'$ and $A_{n-1}A_{n-1}'$ be perpendicular to $A_0A_n$ at $A_1$ and $A_{n-1}$, respectively.
Let circle $X$ be externally tangent to $C_2$, internally tangent to $C$ and tangent to
$A_1A_1'$ on the side opposite $C_1$, and let circle $Y$ be externally tangent to $C_{n-1}$, internally
tangent to $C$ and tangent to $A_{n-1}A_{n-1}'$ on the side opposite $C_n$. Prove that
circle $X$ is congruent to circle $Y$.
(b) Suppose $C_0$ is a semicircle with diameter on $A_0A_n$ and tangent to $PQ$. Let $D$ and
$E$ be the endpoints of its diameter. Lines $DD'$ and $EE'$ are drawn perpendicular to
$A_0A_n$. Let $Z$ be the circle tangent to each of $DD'$ and $EE'$ and internally tangent
to $C$. Show that $Z$ is tangent to the circle with radius $A_1A_{n-1}$.

Solution by Marty Getz and Dixon Jones, University of Alaska, Fairbanks, AK.
(a) We show that circles X and Y each have diameter <�o�nn-1- �����1 nn l . First observe that
0
that

$$\frac{A_jA_{j+1}}{A_{j-1}A_j} = \frac{1 - \sin\theta}{1 + \sin\theta}, \qquad j = 1, 2, \ldots, n - 1,$$

where $\theta$ is the angle determined by the extended chord $PQ$ and the extended
diameter $A_0A_n$. In particular, with $a = \frac{1 - \sin\theta}{1 + \sin\theta}$, we have

$$\frac{A_1A_n}{A_0A_{n-1}} = a.$$

Let B be the point at which circle X touches C, and let line A n B intersect
line A 1 A'1 in T . See Figure 1 . Let A0B meet X in R and A n B meet X in S.
Because LA0BA n is a right angle, it follows that RS is a diameter of X and is
parallel to A o A n . In particular, R is the point at which circle X touches A , A; .
Because L TSA 1 = 1 80° - L B R A 2 = L A 0 R A 2 and LA 1 T S = L R A 0 A z , it fol­
lows that /::,. T SA 1 !::,. A 0 R A 2 • Hence ��� = 1�1� = a. Furthermore, because
'"'"'

[Figure 1 and Figure 2]

RS TR TR 1 A o A n- t
= = =
T R + RA t 1 +a A o A n- t + A t A n
Thus, RS = <�o�nn --11 ��1�")
0 1 . A symmetric argument gives the same result for the
0
diameter of Y .
(b) Figure 2 shows circle Z with diameter RS 1, arallel to A o A n . Let F b e the inter­
section of lines P Q and A o A n . Because = A�s 1 AZ�1 ,
the intersection G of lines
$A_0R$ and $A_1S$ lies on the line through $F$ and perpendicular to line $A_0F$. By a similar
argument, the intersection $H$ of lines $A_{n-1}R$ and $A_nS$ also lies on line $GF$.
Thus $RS$ and $HS$ lie along two of the altitudes of $\triangle RGH$. Because $GS$ is concurrent
with these two lines, it lies along the third altitude of $\triangle RGH$. Thus, $SA_1$
is perpendicular to $RA_{n-1}$, and it follows that circle $Z$ is tangent to the circle with
diameter $A_1A_{n-1}$.

Also solved by Herb Bailey, Michel Bataille (France), Jany C. Binz (Switzerland), Daniele Donini (Italy), Joel
Schlosberg, and the proposer.

Answers
Solutions to the Quickies from page 228.

A921. It is known that $r =$ � is constructible if and only if $r$ is rational and that
$s = \sqrt[3]{m + n}$ is constructible if and only if $s$ is an integer. (See George E. Martin,
Geometric Constructions, Springer Verlag, 1998.) Furthermore, $r$ is a rational number
if and only if both $m$ and $n$ are cubes, and $s$ is an integer if and only if $m + n$ is a cube.
Now assume that $r$ is constructible. Then $m = p^3$ and $n = q^3$ for integers $p$ and $q$,
and $m + n = p^3 + q^3$. Thus, by Fermat's Theorem, $m + n$ cannot also be a cube.
Therefore $\sqrt[3]{m + n}$ cannot be constructed.

A922. Let the sides, area, circumradius, and inradius of the larger triangle be $a$, $b$,
$c$, $F$, $R$, and $r$, respectively, and let the corresponding sides and area of the smaller
triangle be $a'$, $b'$, $c'$, and $F'$. We then have

$$\frac{a}{a'} = \frac{b}{b'} = \frac{c}{c'} = 2, \qquad 4FR = abc, \qquad \text{and} \qquad 4F'r = a'b'c'.$$

It follows that

$$\frac{FR}{F'r} = \frac{abc}{a'b'c'} = 8,$$

and hence that $R = 2r$. However, it is known that $R \ge 2r$ with equality if and only if
the triangle is equilateral.
REVIEWS

PAUL J. CAMPBELL, Editor
Beloit College

Assistant Editor: Eric S. Rosenthal, West Orange, NJ. Articles and books are selected for this
section to call attention to interesting mathematical exposition that occurs outside the main­
stream of mathematics literature. Readers are invited to suggest items for review to the editors.

Johnson, Valen E., An A is an A is an A . . . and that's the problem, New York Times (14 April
2002), Section 4A (Education Life), 14; http://www.nytimes.com/2002/04/14/edlife/14EDVIEW.html.
Special section: The grade inflation problem, The UMAP Journal 19
(3) (1998) 279-336. Johnson, Valen E., An alternative to traditional GPA for evaluating student
performance, Statistical Science 12 (4) 251-278; ftp://ftp.isds.duke.edu/pub/WorkingPapers/96-20.ps.

Last week, a statistically sophisticated colleague in another department approached me for help
in devising a system to keep up with the current level of grade inflation, so that his students
would not be "penalized" relative to others. Author Johnson (Statistics and Decision Sciences,
Duke University) was the main proponent of a 1996 proposal to revise the calculation of grade-point
averages (GPAs) at Duke to take into account difficulty of the course, quality of students in
the course, and the instructor's grading history. Duke rejected the proposal, but Johnson returns
here with further data on ramifications, not of grade inflation per se, but of grading inequity,
which can be a consequence of grade inflation not being uniform among departments or instructors.
He examined the effect of grades on student evaluations of courses and instructors and on
student choice of courses, at Duke. The results were what you might expect: Students are much
more likely to give low evaluations in courses where they expect lower grades than usual, and
students are much more likely to enroll with instructors who grade higher. Apart from the consequences
for retention and advancement of individual faculty (which can be dire), there are
implications for departments and the institution as a whole. Differences in grading can result
in (in fact, probably already have at your institution) shifts in enrollments and allocation of resources.
If your institution is like Duke (and most others), you as a mathematics instructor have
a particular problem, since your department grades the lowest (or nearly). "Uneven grading
practices allow students to manipulate their grade point averages and honors status by selecting
certain courses, and discourage them from taking courses that would benefit them [think
mathematics courses]. By rewarding mediocrity, excellence is discouraged." The Special Section
in The UMAP Journal contains three Outstanding entries in COMAP's 1998 Mathematical
Contest in Modeling on the Grade Inflation Problem, along with a commentary by Johnson; his
article in Statistical Science details the Duke proposal. Watch for his forthcoming book College
Grading: A National Crisis in Undergraduate Education.

Primus: Problems, Resources, and Issues in Mathematics Undergraduate Studies. Special Issues.
The Undergraduate Seminar in Mathematics. Part 1: September 2001; Part 2: December
2001; 11 (3 and 4) 193-257, 289-369.

These special issues of Primus offer 11 interpretations of what a seminar in mathematics can
be about, from presenters at the New Orleans Joint Mathematics Meetings in January 2001.
The ideas and experiences vary in level (freshman to senior), focus (communication skills,
integration of mathematical ideas), faculty role, and grading, but all feature student involvement
at a fundamentally different level than in other courses. Does your department offer such a
seminar? Do you need ideas or new ideas? Take a look here for inspiration.

Chown, Marcus, Smash and grab, New Scientist (6 April 2002) 24-28. Calude, C. S., and
B. Pavlov, Coins, quantum measurements, and Turing's barrier, Quantum Information Processing
(in press); http://www.cs.auckland.ac.nz/CDMTCS/researchreports/170boris.pdf.

Will quantum computers really make a difference? Will they make a difference to mathematics?
Cristian S. Calude (University of Auckland) thinks that "quantum computing is theoretically capable
of computing uncomputable functions." Taking the halting problem as the uncomputable
function, the key ideas are to superimpose simultaneously an infinite number of quantum states
(via Hilbert space) and then to detect and measure the probability of a program halting. After
some finite amount of time, you get an answer: not an absolute yes-or-no answer but an answer
with an accompanying (tiny) probability (want a larger probability? run the quantum program
again for longer). "Because of these new computational models, the idea of 'proof' might ...
change."

Matthews, Robert, $1 million mathematical mystery "solved," NewScientist.com news service;
http://www.newscientist.com/news/news.jsp?id=ns99992143. Matthews, Robert,
British professor chases solution to $1m maths prize, Daily Telegraph (13 April 2002). Dunwoody,
Martin, A proof of the Poincaré conjecture? (revised version eight, 11 April 2002);
http://www.maths.soton.ac.uk/mjd/Poin.pdf.

Martin Dunwoody (Southampton University) claimed to have proved the famous Poincaré conjecture,
which states that if every loop on a compact 3-D manifold can be shrunk to a point, the
manifold is topologically equivalent to a sphere. This is one of the seven Clay Mathematics Institute
million-dollar mathematical questions. The press (at least in Great Britain) got excited.
Mathematicians found a gap. (Does this sequence of events sound familiar?) At this writing,
Dunwoody has added a question mark at the end of the title of his preprint, which is more of a
blueprint for a proof than a proof itself. Stay tuned, but don't hold your breath.

Flannery, Sarah, with David Flannery, In Code: A Mathematical Journey, Workman, 2001; ix +
341 pp, $24.95. ISBN 0- 1 9628- 1 23 84-8.

High-school student Sarah Flannery invented a new public-key cryptographic algorithm (which
she calls the Cayley-Purser algorithm) as her project entry in the Irish Young Scientist competi­
tion. Her algorithm uses matrices but not modular exponentiation, hence it is 20 to 30 times as
fast in practice as the RSA algorithm. This book, written in part by her mathematician father,
details how she came to enter the contest, the fame that winning it brought her, the accompa­
nying stress (due to press publicity of the potential of her becoming wealthy from selling her
idea), and (in an appendix) all the mathematics behind the algorithm. There is enough exposi­
tion in the text itself of the elementary aspects of public-key cryptography and of matrices so
that the reader gets the flavor of the subject and her work. Her tale rambles; but on the whole it
is inspiring, and the personal nature of the writing may help add to its appeal to young readers.

Weibel, Ewald R. , Symmorphosis: On Form and Function in Shaping Life, Harvard University
Press, 2000; xiii + 263 pp, $45 . ISBN 0-674-00068-4.

"This book addresses a simple question: Are animals designed economically?" The "symmorphosis"
of the title refers to sizing of parts of a system to its function, including providing some
margin of safety. Author Weibel works out "the quantitative relations between form and func­
tion" in various physiological settings: cell, muscle, lung, and circulation. Much of the modeling
is traditional, but the last page mentions the "Koch tree" as a model of the airway tree and re­
marks on its relation to the Mandelbrot set. This book may not interest you as a mathematician
directly, unless you are involved in mathematical modeling of physiological processes; but, like
its predecessor, D'Arcy Thompson's On Growth and Form, it may inspire biology students to
study mathematics with you.

THIS MAGAZINE extends its appreciation to Prof. Campbell on his completing 25 years
of service as Associate Editor and Reviews Editor. He in turn thanks Assistant Editor Eric
Rosenthal for prodigious assistance over the years.
NEWS AND LETTERS

42nd International Mathematical Olympiad

Washington, D.C., United States of America

July 9 and 10, 2001


edited by Titu Andreescu and Zuming Feng

Problems

1. Let $ABC$ be an acute-angled triangle with $O$ as its circumcenter. Let $P$ on line $BC$
be the foot of the altitude from $A$. Assume that $\angle BCA \ge \angle ABC + 30^\circ$. Prove that
$\angle CAB + \angle COP < 90^\circ$.
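2. Prove that

\[ \frac{a}{\sqrt{a^2 + 8bc}} + \frac{b}{\sqrt{b^2 + 8ca}} + \frac{c}{\sqrt{c^2 + 8ab}} \ge 1 \]

for all positive real numbers $a$, $b$, and $c$.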


3. Twenty-one girls and twenty-one boys took part in a mathematical competition. It
turned out that
(a) each contestant solved at most six problems, and
(b) for each pair of a girl and a boy, there was at least one problem that was
solved by both the girl and the boy.
Prove that there is a problem that was solved by at least three girls and at least three
boys.
4. Let $n$ be an odd integer greater than 1 and let $c_1, c_2, \ldots, c_n$ be integers. For each
permutation $a = (a_1, a_2, \ldots, a_n)$ of $\{1, 2, \ldots, n\}$, define $S(a) = \sum_{i=1}^{n} c_i a_i$. Prove
that there exist permutations $b$ and $c$, $b \ne c$, such that $n!$ divides $S(b) - S(c)$.
5. In a triangle $ABC$, let segment $AP$ bisect $\angle BAC$, with $P$ on side $BC$, and let
segment $BQ$ bisect $\angle ABC$, with $Q$ on side $CA$. It is known that $\angle BAC = 60^\circ$ and
that $AB + BP = AQ + QB$. What are the possible angles of triangle $ABC$?
6. Let $a > b > c > d$ be positive integers and suppose
\[ ac + bd = (b + d + a - c)(b + d - a + c). \]
Prove that $ab + cd$ is not prime.

Solutions

Note: For interested readers, the editors recommend the USA and International Mathematical
Olympiads 2001. There many of the problems are presented together with a
collection of remarkable solutions developed by the examination committees, contestants,
and experts, during or after the contests.

1. Let $\alpha = \angle CAB$, $\beta = \angle ABC$, and $\gamma = \angle BCA$. Let $\omega$ be the circumcircle of triangle
$ABC$, and let $R$ denote the circumradius of triangle $ABC$. Also, let $M$ be the
midpoint of side $BC$. Because $90^\circ > \gamma > \beta$, line $AP$ is closer to $C$ than to $B$.
Then $P$ is on segment $CM$. Moreover, since triangle $ABC$ is acute, $O$ is inside
triangle $ABC$ and triangles $BOP$, $COP$, $OPM$ are all nondegenerate. Because
$90^\circ > \gamma - \beta \ge 30^\circ$,

\[ \sin(\gamma - \beta) \ge \frac{1}{2}. \qquad (1) \]

Note that $\angle COB = 2\angle CAB = 2\alpha$ and $\angle PCO = \angle BCO = \angle OBC = (180^\circ - \angle COB)/2
= 90^\circ - \angle CAB = 90^\circ - \alpha$. It suffices to prove that $\angle COP < \angle PCO$,
or, equivalently, that $OP > PC$.
Because $\angle AOC = 2\beta$, we have $\angle CAO = 90^\circ - \beta$. Note that in right triangle
$APC$, $\angle CAP = 90^\circ - \gamma$, so $\angle PAO = \angle CAO - \angle CAP = \gamma - \beta$. By (1),
$\sin \angle PAO \ge 1/2$. Let $N$ be the foot of the perpendicular from $O$ to segment $AP$. Then
$MPNO$ is a rectangle, so $MP/OA = ON/OA = \sin \angle PAO \ge 1/2$, or $2MP \ge
OA = OC$. In right triangles $OCM$ and $OPM$, $OC > CM$ and $OP > MP$.
Therefore, $PC - MP = MC - 2MP \le MC - OC < MC - MC = 0$. We obtain
$OP > MP > PC$, as desired.
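As a numerical sanity check of the statement (not a substitute for the proof), the following Python sketch evaluates $\angle CAB + \angle COP$ on a grid of integer-degree triangles that satisfy the hypotheses. The coordinate placement $B = (0, 0)$, $C = (1, 0)$ and the function names are assumptions made for illustration only.

```python
import math

def angle_sum_deg(beta_deg, gamma_deg):
    """Return angle CAB + angle COP in degrees, given angle B and angle C."""
    beta = math.radians(beta_deg)
    gamma = math.radians(gamma_deg)
    alpha = math.pi - beta - gamma
    # Assumed coordinates: B = (0, 0), C = (1, 0), A above line BC.
    ab = math.sin(gamma) / math.sin(alpha)  # law of sines with BC = 1
    ax, ay = ab * math.cos(beta), ab * math.sin(beta)
    # Circumcenter O = (1/2, oy) is equidistant from B and A.
    oy = ((ax - 0.5) ** 2 + ay ** 2 - 0.25) / (2 * ay)
    # P = (ax, 0) is the foot of the altitude from A.
    ocx, ocy = 0.5, -oy          # vector from O to C
    opx, opy = ax - 0.5, -oy     # vector from O to P
    cross = ocx * opy - ocy * opx
    dot = ocx * opx + ocy * opy
    cop = abs(math.atan2(cross, dot))  # angle COP in radians
    return math.degrees(alpha + cop)

def worst_case():
    """Largest angle sum over the integer-degree grid with gamma >= beta + 30."""
    worst = 0.0
    for beta in range(1, 60):
        for gamma in range(beta + 30, 90):
            if beta + gamma > 90:  # alpha < 90, so the triangle is acute
                worst = max(worst, angle_sum_deg(beta, gamma))
    return worst
```

Calling `worst_case()` returns the largest sum found on the grid; by the result proved above it should stay below 90.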
2. By multiplying $a$, $b$, and $c$ by a suitable factor, we reduce the problem to the case
when $a + b + c = 1$. Note that the function $f(t) = 1/\sqrt{t}$ is convex for $t > 0$, as
$f''(t) = \frac{3}{4\sqrt{t^5}} > 0$. Thus, by Jensen's Inequality, we obtain

\[ \frac{a}{\sqrt{x}} + \frac{b}{\sqrt{y}} + \frac{c}{\sqrt{z}} \ge \frac{1}{\sqrt{ax + by + cz}}, \]

for all $x, y, z > 0$. Setting $x = a^2 + 8bc$, $y = b^2 + 8ca$, $z = c^2 + 8ab$ in the last
inequality, we obtain

\[ \frac{a}{\sqrt{a^2 + 8bc}} + \frac{b}{\sqrt{b^2 + 8ca}} + \frac{c}{\sqrt{c^2 + 8ab}}
\ge \frac{1}{\sqrt{a^3 + b^3 + c^3 + 24abc}} \ge \frac{1}{\sqrt{(a + b + c)^3}} = 1, \]

as

\[ (a + b + c)^3 = a^3 + b^3 + c^3 + 3 \sum_{\text{cyc}} (a^2 b + b^2 a) + 6abc
\ge a^3 + b^3 + c^3 + 18 \sqrt[6]{a^6 b^6 c^6} + 6abc = a^3 + b^3 + c^3 + 24abc. \]
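The inequality can also be spot-checked numerically. The Python sketch below evaluates the left-hand side at random positive triples; the sampling range, seed, and function names are arbitrary choices for illustration, and a random search of this kind only supports the result, it does not prove it.

```python
import math
import random

def lhs(a, b, c):
    """Left-hand side of the inequality for positive reals a, b, c."""
    return (a / math.sqrt(a * a + 8 * b * c)
            + b / math.sqrt(b * b + 8 * c * a)
            + c / math.sqrt(c * c + 8 * a * b))

def min_lhs(trials=10000, seed=0):
    """Smallest value of lhs seen over random triples (seeded for repeatability)."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        a, b, c = (rng.uniform(0.01, 10.0) for _ in range(3))
        best = min(best, lhs(a, b, c))
    return best
```

Equality holds at $a = b = c$, where each term equals $1/3$, and the expression is unchanged when $a$, $b$, $c$ are scaled by a common factor, which is what justifies the normalization $a + b + c = 1$.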


3. Assign each problem a unique letter, and also number the boys 1, 2, ..., 21 and
number the girls 1, 2, ..., 21. Construct a $21 \times 21$ matrix of letters as follows: in
the $i$th row and $j$th column, write the letter of any problem that both the $i$th girl
and the $j$th boy solved (at least one such problem exists by condition (b)). If we
consider the $i$th row, each letter in that row corresponds to a problem that the $i$th
girl solved. Since each girl solved at most six problems, each row contains at most
6 distinct letters. Similarly, each column contains at most 6 distinct letters.
We have the following key observation: In each row (resp. column), consider the letters
which appear at least three times. At least 11 squares in the row (resp. column)
contain one of these letters. Indeed, there are at most 6 different letters, and they
cannot all appear at most twice, since there are 21 > 12 letters total. So at most 5
different letters appear at most twice, giving a total of at most 10 squares containing
letters appearing at most twice. Then at least 11 other squares each contain a letter
that appears at least three times.
In the matrix, color all the squares which contain letters appearing at least three
times in the same row (resp. column) in red (resp. blue). By the above observation,
each row contains at least 11 red squares, so the total number of red squares is
at least $21 \times 11$. Similarly, each column contains at least 11 blue squares, so the
total number of blue squares is at least $21 \times 11$. Since there are only $21 \times 21 <
21 \times 11 + 21 \times 11$ total squares, some square is colored both red and blue. Because
the letter in this square appears in three different columns and three different rows,
at least three boys and three girls solved the corresponding problem. Thus, this
problem has the desired property.
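The key observation is a small finite counting fact, so it can be confirmed by exhaustive enumeration. The Python sketch below lists every way to split the 21 squares of a row among at most 6 letters and finds the fewest squares that can hold a letter appearing at least three times; the helper names are illustrative.

```python
def partitions(total, max_parts, max_part=None):
    """Yield partitions of `total` into at most `max_parts` positive parts."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    # Parts are generated in non-increasing order to avoid duplicates.
    for first in range(min(total, max_part), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest

def min_frequent_squares(cells=21, letters=6, threshold=3):
    """Fewest squares occupied by letters that occur at least `threshold` times."""
    return min(
        sum(count for count in part if count >= threshold)
        for part in partitions(cells, letters)
    )
```

The minimum is attained by a row with letter counts (11, 2, 2, 2, 2, 2), and the final step of the argument is the comparison $2 \times 21 \times 11 = 462 > 441 = 21 \times 21$.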
4. Let $\sum_a$ denote the sum over all $n!$ permutations $a = (a_1, a_2, \ldots, a_n)$. We compute
$\sum_a S(a)$ modulo $n!$ in two ways, one of which assumes that the desired conclusion
is false, and reach a contradiction.
Suppose, for the sake of contradiction, that the claim is false. Then each $S(a)$
must have a different remainder mod $n!$. Since there are exactly $n!$ such permutations
$a$, there exists exactly one permutation $a$ such that $S(a) \equiv s \pmod{n!}$ for each
$s = 1, 2, \ldots, n!$. Since $n > 1$, $n!$ is even and $n! + 1$ is odd. Hence,

\[ \sum_a S(a) \equiv \sum_{s=1}^{n!} s = \frac{n!}{2} \cdot (n! + 1) \pmod{n!}, \]

or

\[ \sum_a S(a) \equiv \frac{n!}{2} \pmod{n!}. \qquad (1) \]

On the other hand, for $i, k \in \{1, \ldots, n\}$, we have $a_i = k$ in exactly $(n - 1)!$ permutations
$a$. Thus, for $1 \le i \le n$,

\[ \sum_a a_i = (n - 1)!\,(1 + 2 + \cdots + n) = n! \cdot \frac{n + 1}{2}. \]

Hence,

\[ \sum_a S(a) = \sum_a \sum_{i=1}^{n} c_i a_i = \sum_{i=1}^{n} \sum_a c_i a_i = \sum_{i=1}^{n} c_i \Bigl( \sum_a a_i \Bigr). \]

Because $n + 1$ is even, $n!$ divides $\sum_a a_i = n! \cdot \frac{n+1}{2}$ for each $i$. It follows that $n!$
divides $\sum_a S(a)$, contradicting (1). Therefore, the initial assumption was false, and
there do exist distinct permutations $b$ and $c$ such that $n!$ is a divisor of $S(b) - S(c)$.
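For small $n$ the claim can be checked by brute force. The Python sketch below searches for two distinct permutations whose sums $S$ agree modulo $n!$; the function name `colliding_pair` is hypothetical and the search is only an illustration of the statement, not part of the proof.

```python
from itertools import permutations
from math import factorial

def colliding_pair(c):
    """Return two distinct permutations b, c' of 1..n with S(b) = S(c') mod n!, or None."""
    n = len(c)
    modulus = factorial(n)
    seen = {}  # residue of S(a) modulo n!  ->  first permutation producing it
    for perm in permutations(range(1, n + 1)):
        residue = sum(ci * ai for ci, ai in zip(c, perm)) % modulus
        if residue in seen:
            return seen[residue], perm
        seen[residue] = perm
    return None
```

For odd $n$ the function always finds a pair, as the solution guarantees. The hypothesis that $n$ is odd matters: with $n = 2$ and coefficients $(1, 0)$, the two permutations give $S = 1$ and $S = 2$, which differ modulo $2! = 2$, so `colliding_pair((1, 0))` returns `None`.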
5. Let $\angle ABC = 2x$ and $\angle BCA = y$. Then $\angle ABQ = \angle QBC = x$ and $\angle CAB +
\angle ABC + \angle BCA = 60^\circ + 2x + y = 180^\circ$, so

\[ y = 120^\circ - 2x. \qquad (1) \]

Extend segment $AB$ through $B$ to $R$ so that $BR = BP$, and construct $S$ on ray $AQ$
so that $AS = AR$.
We claim that points $B$, $P$, $S$ are collinear. Because $BR = BP$, triangle $BPR$ is
isosceles with base angles

\[ \angle BRP = \angle RPB = (180^\circ - \angle PBR)/2 = x = \angle QBP. \qquad (2) \]

Note that $AS = AR$ and $\angle RAS = \angle BAC = 60^\circ$, implying that triangle $ARS$ is
equilateral. Since line $AP$ bisects $\angle RAS$, $R$ and $S$ are symmetric with respect to
line $AP$. Thus,

\[ PR = PS \qquad (3) \]

and $\angle ARP = \angle PSA$, or $\angle BRP = \angle PSQ$. By (2), we have

\[ \angle QBP = \angle BRP = \angle PSQ. \qquad (4) \]

Because $AQ + QS = AB + BR = AB + BP = AQ + QB$, $QS = QB$. Hence,
triangle $BQS$ is isosceles with

\[ \angle BSQ = \angle QBS. \qquad (5) \]

Now, assume to the contrary that triangle $BPS$ is nondegenerate. Then either
$AC < AS$ or $AC > AS$. In either case, combining (4) and (5) gives $\angle PBS =
|\angle QBP - \angle QBS| = |\angle PSQ - \angle BSQ| = \angle PSB$, that is, triangle $PBS$ is isosceles
with $PB = PS$. By (3), it follows that $PB = PS = PR$. Hence, triangle $BPR$
is equilateral. But then $\angle ABC = 180^\circ - \angle CBR = 120^\circ$, and by (1), $y = 0^\circ$, which
is absurd. Therefore, our assumption was wrong and $B$, $P$, $S$ are collinear. Consequently,
$S = C$.
Since $S = C$, by (1) and (5), we obtain $x = y = 120^\circ - 2x$, or $x = 40^\circ$. Therefore,
$\angle ABC = 80^\circ$, $\angle BCA = 40^\circ$, and $\angle CAB = 60^\circ$.
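The answer can be checked numerically. The Python sketch below fixes $\angle BAC = 60^\circ$, computes the side lengths from the law of sines (with circumdiameter 1, an arbitrary normalization), locates $P$ and $Q$ with the angle bisector theorem, and measures how far $AB + BP$ is from $AQ + QB$. The function name `defect` is made up for this illustration.

```python
import math

def defect(angle_b_deg):
    """Return (AB + BP) - (AQ + QB) for a triangle with angle A = 60 degrees."""
    angle_a = math.radians(60.0)
    angle_b = math.radians(angle_b_deg)
    angle_c = math.pi - angle_a - angle_b
    # Law of sines with circumdiameter 1: each side equals the sine of the opposite angle.
    bc, ca, ab = math.sin(angle_a), math.sin(angle_b), math.sin(angle_c)
    # Angle bisector theorem: BP / PC = AB / AC and AQ / QC = AB / BC.
    bp = bc * ab / (ab + ca)
    aq = ca * ab / (ab + bc)
    # Law of sines in triangle ABQ, whose angles are A, B/2, and 180 - A - B/2.
    qb = ab * math.sin(angle_a) / math.sin(angle_a + angle_b / 2)
    return (ab + bp) - (aq + qb)
```

Scanning integer values of $\angle ABC$ from 1 to 119 degrees, the defect vanishes (up to rounding) only at 80 degrees, in agreement with the solution.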
6. Let $x = b + d + a - c$. It is clear that $x > 1$. We have $c \equiv a + b + d \pmod{x}$ and
$d \equiv c - a - b \pmod{x}$. These congruences, combined with the given condition,
yield

\[ 0 \equiv ac + bd \equiv a(a + b + d) + bd = (a + b)(a + d) \pmod{x} \]

and

\[ 0 \equiv ac + bd \equiv ac + b(c - a - b) = (a + b)(c - b) \pmod{x}. \]

Hence, $x \mid (a + b)(a + d)$ and $x \mid (a + b)(c - b)$.
Because $a + b > (a + b) - (c - d) = x$ and $2x = 2[a + (b - c) + d] > 2a >
a + b$, $a + b$ is not divisible by $x$. Thus, there is a prime $p$ that divides each of
$x$, $(a + d)$, and $(c - b)$: take a prime power $p^k$ that divides $x$ but not $a + b$. To finish,
we only need to prove that $p$ is a proper divisor of $ab + cd$. In fact, $ab + cd > a + d \ge p$ and

\[ p \mid (a + d)b + (c - b)d = ab + cd, \]

as desired.
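To see the statement on concrete numbers, the Python sketch below enumerates quadruples $a > b > c > d$ up to a small bound that satisfy the given condition and tests whether $ab + cd$ is prime. The bound and the helper names are arbitrary choices for illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def solutions(limit):
    """Yield all a > b > c > d >= 1 with a <= limit satisfying the given condition."""
    for a in range(4, limit + 1):
        for b in range(3, a):
            for c in range(2, b):
                for d in range(1, c):
                    if a * c + b * d == (b + d + a - c) * (b + d - a + c):
                        yield (a, b, c, d)
```

One quadruple found this way is $(21, 18, 14, 1)$: here $ac + bd = 312 = 26 \cdot 12$ and $ab + cd = 392$, which is not prime.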
CONTENTS

ARTICLES
163 Golden, $\sqrt{2}$, and $\pi$ Flowers: A Spiral Story, by Michael Naylor
173 Pillow Chess, by Grant Cairns
187 Doubly Recursive Multivariate Automatic Differentiation, by Dan Kalman

NOTES
203 Avoiding Your Spouse at a Party Leads to War, by Marc A. Brodie
209 Bernoulli on Arc Length, by Victor H. Moll, Judith L. Nowalsky, Gined Roa, and Leonardo Solanilla
214 Proof Without Words: A Line through the Incenter of a Triangle, by Sidney H. Kung
215 Finite Groups That Have Exactly n Elements of Order n, by Carrie E. Finch, Richard M. Foote, Lenny Jones, and Donald Spickler, Jr.
219 Rootless Matrices, by Bertram Yood
223 Lots of Smiths, by Patrick Costello and Kathy Lewis

PROBLEMS
227 Proposals 1648-1652
228 Quickies 921-922
228 Solutions 1623-1627
233 Answers 921-922

REVIEWS
234

NEWS AND LETTERS
236 42nd Annual International Mathematical Olympiad
