Paulin, Corwin - Introduction To Abstract Algebra (2019)


Introduction to Abstract Algebra (Math 113)

Alexander Paulin, with edits by David Corwin

FOR FALL 2019 MATH 113 002 ONLY

Contents

1 Introduction
1.1 What is Algebra?
1.2 Sets
1.3 Functions
1.4 Equivalence Relations

2 The Structure of + and × on Z
2.1 Basic Observations
2.2 Factorization and the Fundamental Theorem of Arithmetic
2.3 Congruences

3 Groups
3.1 Basic Definitions
3.1.1 Cayley Tables for Binary Operations and Groups
3.2 Subgroups, Cosets and Lagrange’s Theorem
3.3 Generating Sets for Groups
3.4 Permutation Groups and Finite Symmetric Groups
3.4.1 Active vs. Passive Notation for Permutations
3.4.2 The Symmetric Group Sym_3
3.4.3 Symmetric Groups in General
3.5 Group Actions
3.5.1 The Orbit-Stabiliser Theorem
3.5.2 Centralizers and Conjugacy Classes
3.5.3 Sylow’s Theorem
3.6 Symmetry of Sets with Extra Structure
3.7 Normal Subgroups and Isomorphism Theorems
3.8 Direct Products and Direct Sums
3.9 Finitely Generated Abelian Groups
3.10 Finite Abelian Groups
3.11 The Classification of Finite Groups (Proofs Omitted)

4 Rings, Ideals, and Homomorphisms
4.1 Basic Definitions
4.2 Ideals, Quotient Rings and the First Isomorphism Theorem for Rings
4.3 Properties of Elements of Rings
4.4 Polynomial Rings
4.5 Ring Extensions
4.6 Field of Fractions
4.7 Characteristic
4.7.1 Characteristic in a General Ring
4.7.2 Characteristic in Entire Rings
4.8 Principal, Prime and Maximal Ideals

5 Polynomials and Factorization
5.1 Factorisation in Integral Domains
5.1.1 Associated Elements
5.1.2 Irreducible and Prime Elements
5.1.3 Unique Factorization Domains
5.2 Remainder Theorem for Polynomials
5.3 PID
5.3.1 RT implies PID
5.3.2 Consequences of Being a PID
5.4 Factorization of Polynomials
5.4.1 Linear Factors of Polynomials
5.5 Ring and Field Extensions
5.5.1 Minimal Polynomials

6 Material Beyond Our Course
6.1 Toward Galois Theory
6.1.1 Degree of a Field Extension
6.1.2 Galois Theory
6.2 Algebraic Geometry
6.3 p-adic Numbers
6.4 Algebraic Number Theory
6.5 Commutative Algebra

1 Introduction

1.1 What is Algebra?

If you ask someone on the street this question, the most likely response will
be: “Something horrible to do with x, y and z”. If you’re lucky enough to
bump into a mathematician then you might get something along the lines
of: “Algebra is the abstract encapsulation of our intuition for composition”.
By composition, we mean the concept of two objects coming together to form
a new one. For example adding two numbers, multiplying two numbers, or
composing real valued single variable functions. As we shall discover, the
seemingly simple idea of composition hides vast depth.

Algebra permeates all of our mathematical intuitions. In fact the first
mathematical concepts we ever encounter are the foundation of the subject.
Let me summarize the first six to seven years of your mathematical education:

The concept of Unity. The number 1.

You probably always understood this, even as a little baby.

N := {1, 2, 3...}, the natural numbers. N comes equipped with two natural
operations + and ×.

Z := {... − 2, −1, 0, 1, 2, ...}, the integers.

We form these by using geometric intuition, thinking of N as sitting on a
line. Z also comes with + and ×. Addition on Z has particularly good
properties, e.g. additive inverses exist.

Q := {a/b | a, b ∈ Z, b ≠ 0}, the rational numbers. We form these by taking
Z and formally dividing through by non-zero integers. We can again
use geometric insight to picture Q as points on a line. The rational
numbers also come equipped with + and ×. This time, multiplication
has particularly good properties, e.g. non-zero elements have multiplicative
inverses.

We could continue by going on to form R, the real numbers and then C, the
complex numbers. The motivation for passing to R is about analysis rather
than algebra (limits rather than binary operations). But R has binary oper-
ations of addition and multiplication just like Q, so one may still study it in
the context of algebra.

Notice that at each stage the operations of + and × gain additional prop-
erties. These ideas are very simple, but also profound. We spend years
understanding how + and × behave in Q. For example

a + b = b + a for all a, b ∈ Q,
or
a × (b + c) = a × b + a × c for all a, b, c ∈ Q.
The central idea behind abstract algebra is to define a larger class of objects
(sets with extra structure), of which Z and Q are definitive members.
(Z, +) −→ Groups
(Z, +, ×) −→ Rings
(Q, +, ×) −→ Fields
In linear algebra the analogous idea is
(R^n, +, scalar multiplication) −→ Vector Spaces over R
The amazing thing is that these vague ideas mean something very precise
and have far far more depth than one could ever imagine.

1.2 Sets

A set is any collection of objects. For example six dogs, all the protons on
Earth, every thought you’ve ever had, N, Z, Q, R, C. Observe that Z and
Q are sets with extra structure coming from + and ×. In this whole course,
all we will study are sets with some carefully chosen extra structure.

Basic Logic and Set Notation

Writing mathematics is fundamentally no different than writing English. It
is a language which has certain rules which must be followed to accurately
express what we mean. Because mathematical arguments can be highly
intricate it is necessary to use simplifying notation for frequently occurring
concepts. I will try to keep these to a minimum, but it is crucial we all
understand the following:

• If P and Q are two statements, then P ⇒ Q means that if P is true
then Q is true. For example: x odd ⇒ x ≠ 2. We say that P implies
Q.

• If P ⇒ Q and Q ⇒ P then we write P ⇔ Q, which should be read as
P is true if and only if Q is true.

• The symbol ∀ should be read as “for all”.

• The symbol ∃ should be read as “there exists”. The symbol ∃! should
be read as “there exists a unique”.

Let S and T be two sets.

• If s is an object contained in S then we say that s is an element, or a
member of S. In mathematical notation we write this as s ∈ S. For
example 5 ∈ Z. Conversely s ∉ S means that s is not contained in S.
For example 1/2 ∉ Z.

• If S has finitely many elements then we say it is a finite set. We denote
its cardinality (or size) by |S|.

• The standard way of writing down a set S is using curly bracket notation.

S = { Notation for elements in S | Properties which specify being in S }.

The vertical bar should be read as “such that”. For example, if S is
the set of all even integers then

S = {x ∈ Z | 2 divides x}.
We can also use the curly bracket notation for finite sets without using
the | symbol. For example, the set S which contains only 1,2 and 3 can
be written as
S = {1, 2, 3}.

• If every object in S is also an object in T, then we say that S is
contained in T. In mathematical notation we write this as S ⊂ T, and
in English we say that S is a subset of T. Note that S ⊂ T and T ⊂ S
⇒ S = T. If S is not contained in T we write S ⊄ T.

• If S ⊂ T then T \ S := {x ∈ T | x ∉ S}. T \ S is called the complement
of S in T.

• The set of objects contained in both S and T is called the intersection of
S and T. In mathematical notation we denote this by S ∩ T.

• The collection of all objects which are in either S or T is called the union
of S and T. In mathematical notation we denote this by S ∪ T.

• S × T = {(a, b) | a ∈ S, b ∈ T}. We call this new set the (Cartesian)
product of S and T. We may naturally extend this concept to finite
collections of sets.

• The set which contains no objects is called the empty set. We denote
the empty set by ∅. We say that S and T are disjoint if S ∩ T = ∅.
The union of two disjoint sets is often written as S ⊔ T.

• A set is finite if it has finitely many elements. A more formal definition
is that S is finite if there is an integer n ∈ Z≥0 such that there is a
bijection from S to the set {1, 2, · · · , n} (note that this is the empty
set if n = 0).
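As a minimal sketch, the set notation above can be mirrored with Python's built-in set type. The particular sets S and T below are illustrative choices made for this example and do not come from the notes.

```python
from itertools import product

# Illustrative finite sets, chosen only for this example.
T = {1, 2, 3, 4, 5, 6}
S = {x for x in T if x % 2 == 0}      # set-builder notation: {x ∈ T | 2 divides x}

is_subset = S <= T                    # S ⊂ T
complement = T - S                    # T \ S
intersection = S & {2, 3}             # S ∩ {2, 3}
union = S | {1}                       # S ∪ {1}
cartesian = set(product(S, {0, 1}))   # S × {0, 1}
disjoint = S.isdisjoint(complement)   # S ∩ (T \ S) = ∅

print(sorted(S), sorted(complement), len(cartesian))
```

Note that |S × {0, 1}| = |S| · 2 here, which is the general pattern for the product of two finite sets.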

We recommend looking at https://math.berkeley.edu/~gbergman/ug.hndts/sets_etc,t=1.pdf
for a guide to basic logic and set notation. We
also highly recommend Section 6 of that article for some subtleties about
the use of English in the logic of mathematical proofs.

1.3 Functions

Definition 1.1. A map (or function) f from S to T is a rule which assigns
to each element of S a unique element of T. We express this information
using the following notation:

f : S → T
x ↦ f(x)

Here are some examples of maps of sets:

1. S = T = N,

f : N → N
a ↦ a²

2. S = Z × Z, T = Z,

f : Z × Z → Z
(a, b) ↦ a + b

This very simple looking abstract concept hides enormous depth. To illus-
trate this, observe that calculus is just the study of certain classes of functions
(continuous, differentiable or integrable) from R to R.
Definition 1.2. Let S and T be two sets, and let f : S → T be a map.

1. We say that S is the domain (also known as source) of f and T is the
codomain (also known as target) of f.

2. We say that f is the identity map if S = T and f(x) = x, ∀x ∈ S. In
this case we write f = Id_S.

Observe that if R, S, and T are sets, and f : R → S and g : S → T are
maps, then we may compose them to give a new function: g ◦ f : R → T.
Note that this is only possible if the domain of g is the same as the codomain
of f.
Remark 1.3. We say that composition of functions is associative. This
means that if R, S, T , and U are sets, and f : R → S, g : S → T , and
h : T → U are maps, then the functions

(h ◦ g) ◦ f : R → U

and
h ◦ (g ◦ f ) : R → U
are the same function.

You might be surprised to see the word “codomain” where you might be
used to seeing the word “range”. In fact, we will also talk about the range
(or “image”), of a function, but it is not the same thing as codomain:
Definition 1.4. Let S and T be two sets, and let f : S → T be a map. We
define the image (also known as range) of f to be:

Im(f ) := {y ∈ T | ∃x ∈ S such that f (x) = y}

For example, if S = T = R, and f is the function sending x to x², then
the codomain of f is R, but the image (or range) of f is only R≥0 = {x ∈
R | x ≥ 0}.
Definition 1.5. Let f : S → T, and suppose U ⊆ T. Then we define the
preimage of U under f to be

f⁻¹(U) := {s ∈ S | f(s) ∈ U}

Note that f⁻¹(T) = S always. In fact, f⁻¹(Im(f)) = S. In general,
f⁻¹(U) is a subset of S.

1. f is injective if f (x) = f (y) ⇒ x = y ∀ x, y ∈ S.

2. f is surjective if given y ∈ T , there exists x ∈ S such that f (x) = y.

3. If f is both injective and surjective we say it is bijective. Intuitively
this means f gives a perfect matching of elements in S and T.

4. If there is a bijection between S and T then we say that S and T are in
bijection.
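For maps between finite sets, the notions of image, preimage, injectivity and surjectivity can be checked by brute force. The following is a rough sketch only: it assumes a map is stored as a Python dict, and the helper names (image, preimage, is_injective, is_surjective) are ours, not notation from these notes.

```python
def image(f, S):
    """Im(f) = {f(x) | x in S} for a map f stored as a dict on the finite set S."""
    return {f[x] for x in S}

def preimage(f, S, U):
    """The preimage of U: all s in S with f(s) in U."""
    return {s for s in S if f[s] in U}

def is_injective(f, S):
    # On a finite set, f is injective exactly when no two inputs share an output.
    return len(image(f, S)) == len(S)

def is_surjective(f, S, T):
    # f is surjective exactly when the image equals the codomain.
    return image(f, S) == set(T)

# Illustrative example: squaring on a small finite window.
S = {-2, -1, 0, 1, 2}
T = {0, 1, 2, 3, 4}
square = {x: x * x for x in S}

print(image(square, S), is_injective(square, S), is_surjective(square, S, T))
```

Here squaring is neither injective (1 and −1 share an image) nor surjective (2 and 3 are not squares), mirroring the x ↦ x² example on R.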

Remark 1.6. The codomain and image of f are the same if and only if f is
surjective.
Remark 1.7. If S and T are finite, then they are in bijection if and only if
they have the same number of elements. More generally, for infinite sets, one
defines what it means to “have the same number of elements” by saying that
two sets have the same number of elements if they are in bijection.
Remark 1.8. A set is infinite if and only if it has the same number of
elements as a proper subset of itself. (In particular, if S is finite, then any
proper subset of S has strictly fewer elements than S.)
Exercise 1.1. Let S and T be two sets. Let f be a map from S to T . Show
that f is a bijection if and only if there exists a map g from T to S such that
f ◦ g = Id_T and g ◦ f = Id_S.
Definition 1.9. Let f be a map from S to T.

1. A left inverse for f is a map g : T → S such that g ◦ f = Id_S.

2. A right inverse for f is a map g : T → S such that f ◦ g = Id_T.

For example, let S = {1, 2, 3} and T = {1, 2, 3, 4, 5}, and let f (x) = x for
x ∈ S. Define g : T → S so that g(x) = x for 1 ≤ x ≤ 3, and g(4) = g(5) = 1.
Then g is a left inverse for f but not a right inverse. One can show that f
does not have a right inverse because it is not surjective.
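The example above is small enough to verify mechanically. This sketch stores f and g as dicts (a representation chosen for illustration) and checks both compositions.

```python
S = [1, 2, 3]
T = [1, 2, 3, 4, 5]

f = {x: x for x in S}                  # f : S → T, f(x) = x
g = {1: 1, 2: 2, 3: 3, 4: 1, 5: 1}     # g : T → S as defined in the example

g_after_f = {x: g[f[x]] for x in S}    # g ∘ f : S → S
f_after_g = {y: f[g[y]] for y in T}    # f ∘ g : T → T

is_left_inverse = all(g_after_f[x] == x for x in S)    # g ∘ f = Id_S ?
is_right_inverse = all(f_after_g[y] == y for y in T)   # f ∘ g = Id_T ?

print(is_left_inverse, is_right_inverse)
```

The second check fails because f(g(4)) = 1 ≠ 4.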

In this language, Exercise 1.1 is asking you to show that f is a bijection
if and only if there is a map g that is both a left and right inverse for f.

Algebra is the general study of laws of composition or binary operations.
The notion of function just defined allows us to formally define the notion of
a binary operation on a set.

Definition 1.10. Let S be a set. Then a binary operation ∗ on S is a
function
∗ : S × S → S.
We often write a ∗ b in place of ∗(a, b), for a, b ∈ S.

Example 1.11. 1. Let S be either N, Z, Q, R, C, or M_n(R). Then both
+ and × define binary operations.

2. If S is R or Z, then the function x ∗ y = x² + 2y + 1 defines a binary
operation on S. Note that, unlike the previous examples, this operation
is not associative.

3. Here’s a nonexample. If S = {−1, 0, 1, 2, 3, · · · } is the set of integers
greater than or equal to −1, then addition does NOT define a binary
operation on S. That’s because the codomain of a binary operation is
always the original set S, but −1 + (−1) is NOT in S.

Note that some books define the notion of a binary operation that is
“closed” and will say that the last example is not closed. But for us, we will
simply say that that last example is not a binary operation in the first place.
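Both claims in Example 1.11 can be probed numerically. The sketch below searches a small window of integers (an arbitrary choice) for a triple on which x ∗ y = x² + 2y + 1 fails to be associative, and tests the nonexample with a membership predicate in_S that we introduce for illustration.

```python
from itertools import product

def star(x, y):
    """The operation x ∗ y = x² + 2y + 1 from Example 1.11."""
    return x * x + 2 * y + 1

# Search a small window of integers for a failure of associativity.
sample = range(-3, 4)
counterexample = next(
    (a, b, c)
    for a, b, c in product(sample, repeat=3)
    if star(star(a, b), c) != star(a, star(b, c))
)

def in_S(n):
    """Membership test for S = {-1, 0, 1, 2, ...}."""
    return n >= -1

# Addition leaves S: -1 and -1 are in S but their sum is not.
sum_stays_in_S = in_S(-1 + -1)

print(counterexample, sum_stays_in_S)
```

A single failing triple is enough to show that ∗ is not associative; for instance (0 ∗ 0) ∗ 0 = 2 while 0 ∗ (0 ∗ 0) = 3.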

1.4 Equivalence Relations

Within a set it is sometimes natural to talk about different elements being
related in some way. For example, in Z we could say that x, y ∈ Z are related
if x − y is divisible by 2. Said another way, x and y are related if they are
both odd or both even. This idea can be formalized as something called an
equivalence relation.

Definition 1.12. A relation on a set S is a subset U ⊂ S × S. (This
is also sometimes called a homogeneous relation, such as at
https://en.wikipedia.org/wiki/Binary_relation#Definition.)

We often denote the relation by ∼. In this case, we write x ∼ y if and
only if (x, y) ∈ U.

Definition 1.13. An equivalence relation on a set S is a subset U ⊂ S × S
satisfying:

1. (x, y) ∈ U ⇔ (y, x) ∈ U. (This is called the symmetric property.)

2. ∀x ∈ S, (x, x) ∈ U. (This is called the reflexive property.)

3. Given x, y, z ∈ S, (x, y) ∈ U and (y, z) ∈ U ⇒ (x, z) ∈ U. (This is
called the transitive property.)

If U ⊂ S × S is an equivalence relation then we say that x, y ∈ S are
equivalent if and only if (x, y) ∈ U. In more convenient notation, we write
x ∼ y to mean that x and y are equivalent.

Definition 1.14. Let ∼ be an equivalence relation on the set S. Let x ∈ S.
The equivalence class containing x is the subset

[x] := {y ∈ S | y ∼ x} ⊂ S.

Remark 1.15. 1. Notice that the reflexive property implies that x ∈ [x].
Hence equivalence classes are non-empty and their union is S.

2. The symmetric and transitive properties imply that y ∈ [x] if and only
if [y] = [x]. Hence two equivalence classes are equal or disjoint. It
should also be noted that we can represent a given equivalence class
using any of its members using the [x] notation.

Definition 1.16. Let S be a set. Let {X_i} be a collection of subsets for
i ∈ I, some index set. We say that {X_i} forms a partition of S if each X_i is
non-empty, they are pairwise disjoint and their union is S.

We’ve seen that the equivalence classes of an equivalence relation naturally
form a partition of the set. Actually there is a converse: Any partition
of a set naturally gives rise to an equivalence relation whose equivalence
classes are the members of the partition. The conclusion of all this is that
an equivalence relation on a set is the same as a partition. In the example
given above, the equivalence classes are the odd integers and the even integers.
Equivalence relations and equivalence classes are incredibly
important. They will be the foundation of many concepts throughout
the course. Take time to really internalize these ideas.

If f : S → I is a surjective map, then S_i := f⁻¹({i}) for i ∈ I forms a
partition of S. Conversely, any partition of S (indexed by a set I) leads to a
surjective map from S to I.

So in fact, the following three concepts are essentially the same:

1. Equivalence relations on S
2. Partitions of S
3. Surjective maps with domain S

Specifically, an equivalence relation leads to a partition by equivalence
classes, and conversely a partition leads to an equivalence relation by saying
that x ∼ y iff x and y are in the same S_i. A partition gives rise to a surjective
map from S to the index set (so s ∈ S maps to the unique i ∈ I such that
s ∈ S_i), and a surjective map f : S → I gives rise to a partition S_i = f⁻¹({i})
as mentioned above.

In the case of the partition of Z into the even and odd numbers, these
three are

1. We say x ∼ y iff x and y have the same parity (or equivalently, if x − y
is divisible by 2)
2. We partition Z into the set S_1 of odd numbers and the set S_2 of even
numbers
3. We have a map f : Z → {0, 1} sending even numbers to 0 and odd
numbers to 1
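The three descriptions of the parity example can be compared on a finite window of Z. This is only a sketch: the window −4, ..., 4 and the helper names classes_from_relation and fibres are assumptions made for illustration.

```python
def classes_from_relation(S, related):
    """Group the elements of S into equivalence classes of `related`.

    Assumes `related` really is an equivalence relation on S, so comparing
    against one representative of each class is enough.
    """
    classes = []
    for x in S:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes

def fibres(S, f):
    """Partition S into the preimages f⁻¹({i}), indexed by the values i of f."""
    parts = {}
    for x in S:
        parts.setdefault(f(x), set()).add(x)
    return parts

window = range(-4, 5)
by_relation = classes_from_relation(window, lambda x, y: (x - y) % 2 == 0)
by_map = fibres(window, lambda x: x % 2)   # even ↦ 0, odd ↦ 1

print(by_relation)
print(by_map)
```

Both constructions produce the same two blocks, the even and the odd numbers in the window.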

Note that two surjective maps f : S → I and f′ : S → I′ define the
same equivalence relation (or equivalently, the same partition) if there is a
bijection between I and I′ that identifies f with f′. For example, whether
our index set for even and odd is {0, 1} or {even, odd}, we get the same
partition and same equivalence relation, even though we technically have
different surjections (because their codomains are different). So when we
say that these three concepts are “essentially” the same, keep in
mind that two different surjections can technically give rise to the
same partition/equivalence relation. But any two such surjections
are related in a certain way.

2 The Structure of + and × on Z

2.1 Basic Observations

We may naturally express + and × in the following set theoretic way:

+ : Z × Z → Z
(a, b) ↦ a + b

× : Z × Z → Z
(a, b) ↦ a × b

Here are 4 elementary properties that + satisfies:

• (Associativity): a + (b + c) = (a + b) + c ∀a, b, c ∈ Z

• (Existence of additive identity) a + 0 = 0 + a = a ∀a ∈ Z.

• (Existence of additive inverses) a + (−a) = (−a) + a = 0 ∀a ∈ Z

• (Commutativity) a + b = b + a ∀a, b ∈ Z.

Here are 3 elementary properties that × satisfies:

• (Associativity): a × (b × c) = (a × b) × c ∀a, b, c ∈ Z

• (Existence of multiplicative identity) a × 1 = 1 × a = a ∀a ∈ Z.

• (Commutativity) a × b = b × a ∀a, b ∈ Z.

The operations of + and × interact by the following law:

• (Distributivity) a × (b + c) = (a × b) + (a × c) ∀a, b, c ∈ Z.

From now on we’ll simplify the notation for multiplication to a × b = ab.

Remarks

1. Each of these properties is totally obvious but will form the foundations
of future definitions: groups and rings.

2. All of the above hold for + and × on Q. In this case there is an extra
property that non-zero elements have multiplicative inverses:

Given a ∈ Q\{0}, ∃ b ∈ Q such that ab = ba = 1.

This extra property will motivate the definition of a field.

3. The significance of the Associativity laws is that summing and multiplying
a finite collection of integers makes sense, i.e. is independent of
how we do it.

It is an important property of Z (and Q) that the product of two non-zero
elements is again non-zero. More precisely: a, b ∈ Z such that ab = 0 ⇒
either a = 0 or b = 0. Later this property will mean that Z is something
called an integral domain. This has the following useful consequence:

Cancellation Law: For a, b, c ∈ Z, ca = cb and c ≠ 0 ⇒ a = b.

This is proven using the distributive law together with the fact that Z is
an integral domain. I leave it as an exercise to the reader.

You might wonder why we are making such a big point about the cancellation
law; after all, you already know it from your previous experience
with the integers. The reason we emphasize it is that the cancellation law
does not always hold in Z/mZ; it holds only for certain m.
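To see the cancellation law fail, one can search Z/mZ by brute force. In this sketch residues are represented by the integers 0, ..., m − 1 (the construction of Z/mZ is given in Section 2.3), and the function name cancellation_failures is ours.

```python
def cancellation_failures(m):
    """Triples (c, a, b) of residues mod m with c ≠ 0, ca ≡ cb (mod m), but a ≠ b."""
    return [
        (c, a, b)
        for c in range(1, m)
        for a in range(m)
        for b in range(m)
        if a != b and (c * a - c * b) % m == 0
    ]

print(cancellation_failures(6)[:3])   # some failures modulo 6
print(cancellation_failures(7))       # no failures modulo 7
```

For example, 2 · 0 ≡ 2 · 3 (mod 6) although 0 and 3 are different residues, whereas the search finds nothing for m = 7.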

2.2 Factorization and the Fundamental Theorem of Arithmetic

Definition 2.1. Let a, b ∈ Z. Then a divides b ⇔ ∃c ∈ Z such that b = ca.
We denote this by a | b and say that a is a divisor (or factor) of b.

Observe that 0 is divisible by every integer. The only integers which divide
1 are 1 and −1. Any way of expressing an integer as the product of a finite
collection of integers is called a factorization.

Definition 2.2. A prime number p is an integer greater than 1 whose only
positive divisors are p and 1. An integer greater than 1 which is not prime
is called composite.

Remark 2.3. Z is generated by 1 under addition. By this I mean that every
integer can be attained by successively adding 1 (or −1) to itself. Under
multiplication the situation is much more complicated. There is clearly no
single generator of Z under multiplication in the above sense.

Definition 2.4. Let a, b ∈ Z. The highest common factor of a and b, denoted
HCF(a, b), is the largest positive integer which is a common factor of a and
b. It is also called gcd(a, b). Two non-zero integers a, b ∈ Z are said to be
coprime if HCF(a, b) = 1.

Here are some important elementary properties of divisibility dating back
to Euclid (300BC), which I’ll state without proof. We’ll actually prove them
later in far more generality.
Theorem 2.5 (Remainder Theorem). Given a, b ∈ Z, if b > 0 then ∃! q, r ∈
Z such that a = bq + r with 0 ≤ r < b.
Theorem 2.6. Given a, b ∈ Z, ∃u, v ∈ Z such that au + bv = HCF(a, b).
In particular, a and b are coprime if and only if there exist u, v ∈ Z such that
au + bv = 1.
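Both theorems are constructive in practice. The sketch below is one standard way to compute q, r and a pair u, v, namely the extended Euclidean algorithm; the notes do not say which method they have in mind, and the function names are our own.

```python
def division_with_remainder(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    # Python's divmod floors the quotient, so 0 <= r < b whenever b > 0.
    return divmod(a, b)

def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g, where g = HCF(a, b), for a, b >= 0 not both 0."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

q, r = division_with_remainder(-7, 3)   # -7 = 3*(-3) + 2
g, u, v = extended_gcd(240, 46)         # 240*u + 46*v = HCF(240, 46)
print(q, r, g, u, v)
```

The pair (u, v) is not unique; the algorithm returns one valid choice.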
Euclid’s Lemma. Let p be a prime number and a, b ∈ Z. Then

p | ab ⇒ p | a or p | b

Proof. If p ∤ a, then a and p are coprime. Therefore, we can find u, v ∈ Z
such that au + pv = 1. But then b = b(au + pv) = p(vb) + ab(u). Since p | ab,
it must also divide ab(u) and therefore p(vb) + ab(u), so p | b.
The Fundamental Theorem of Arithmetic. Every positive integer, a,
greater than 1 can be written as a product of primes:

a = p_1 p_2 ... p_r.

Such a factorization is unique up to ordering.

Proof. If there is an integer greater than 1 not expressible as a product of
primes, let c ∈ N be the least such element. The integer c is not a prime, hence
c = c_1 c_2 where c_1, c_2 ∈ N, 1 < c_1 < c and 1 < c_2 < c. By our choice of c
we know that both c_1 and c_2 are products of primes. Hence c must be expressible
as a product of primes. This is a contradiction. Hence all integers greater
than 1 can be written as a product of primes.

We must prove the uniqueness (up to ordering) of any such decomposition.
Let
a = p_1 p_2 ... p_r = q_1 q_2 ... q_s
be two factorizations of a into a product of primes. Then p_1 | q_1 q_2 ... q_s. By
Euclid’s Lemma we know that p_1 | q_i for some i. After renumbering we may
assume i = 1. However q_1 is a prime, so p_1 = q_1. Applying the cancellation
law we obtain
p_2 ... p_r = q_2 ... q_s.
Suppose that r < s (the case r > s is symmetric). We can continue this
process until we have:

1 = q_{r+1} ... q_s.

This is a contradiction as 1 is not divisible by any prime. Hence r = s and
after renumbering p_i = q_i ∀i.
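The existence half of the theorem can be turned into a procedure. Trial division, sketched below, is one simple (and slow) way to find a factorization; it is an illustration rather than the method of the proof, and the name prime_factors is ours.

```python
def prime_factors(a):
    """Return the prime factors of an integer a > 1 in non-decreasing order."""
    if a <= 1:
        raise ValueError("a must be greater than 1")
    factors = []
    p = 2
    while p * p <= a:
        # Divide out p as often as possible; any p reached here that divides a is prime,
        # because all smaller primes have already been removed.
        while a % p == 0:
            factors.append(p)
            a //= p
        p += 1
    if a > 1:
        factors.append(a)   # what is left over is itself prime
    return factors

print(prime_factors(360))   # 360 = 2 · 2 · 2 · 3 · 3 · 5
```

Uniqueness up to ordering is what makes it meaningful to return the factors as a sorted list.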

Using this we can prove the following beautiful fact:

Theorem 2.7. There are infinitely many distinct prime numbers.

Proof. Suppose that there are finitely many distinct primes p_1, p_2, ..., p_r.
Consider c = p_1 p_2 ... p_r + 1. Clearly c > 1. By the Fundamental Theorem of
Arithmetic, c is divisible by at least one prime, say p_1. Then c = p_1 d for
some d ∈ Z. Hence we have

p_1 (d − p_2 ... p_r) = c − p_1 p_2 ... p_r = 1.

This is a contradiction as no prime divides 1. Hence there are infinitely many
distinct primes.
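It can help to run the construction in the proof on a concrete list. In the sketch below the list 2, 3, 5, 7, 11, 13 is an arbitrary choice, and smallest_prime_factor is a helper we introduce; math.prod requires Python 3.8 or later.

```python
from math import prod

def smallest_prime_factor(c):
    """Return the smallest prime dividing c, for an integer c > 1."""
    p = 2
    while p * p <= c:
        if c % p == 0:
            return p
        p += 1
    return c

primes = [2, 3, 5, 7, 11, 13]      # pretend these were all the primes
c = prod(primes) + 1               # c = p_1 p_2 ... p_r + 1
p = smallest_prime_factor(c)       # a prime dividing c

print(c, p, p in primes)
```

Here c = 30031 = 59 · 509, so c itself need not be prime; the proof only needs that some prime divides c, and that prime cannot be on the list.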

The Fundamental Theorem of Arithmetic also tells us that every positive
element a ∈ Q can be written uniquely (up to reordering) in the form:

a = p_1^{α_1} · · · p_n^{α_n}; p_i distinct primes and α_i ∈ Z \ {0}

The Fundamental Theorem also tells us that two positive integers are coprime
if and only if they have no common prime divisor. This immediately shows
that every positive element a ∈ Q can be written uniquely in the form:

a = α/β, α, β ∈ N and coprime.

We have seen that both Z and Q are examples of sets with two concepts of
composition (+ and ×) which satisfy a collection of abstract conditions. We
have also seen that the structure of Z together with × is very rich. Can we
think of other examples of sets with a concept of + and × which satisfy the
same elementary properties?

2.3 Congruences

Fix m ∈ N. By the remainder theorem, if a ∈ Z, ∃! q, r ∈ Z such that
a = qm + r and 0 ≤ r < m. We call r the remainder of a modulo m. This
gives the natural equivalence relation on Z:

a ∼ b ⇔ a and b have the same remainder modulo m ⇔ m | (a − b)

Exercise 2.1. Check this really is an equivalence relation! (We did most of
this in class.)
Definition. a, b ∈ Z are congruent modulo m ⇔ m | (a − b). This can
also be written:

a ≡ b mod m.

Remark 2.8. 1. The equivalence classes of Z under this relation are indexed
by the possible remainders modulo m. These possible remainders
are the integers 0 through m − 1. Hence, there are m distinct equivalence
classes which we call residue classes. We denote the set of all
residue classes Z/mZ.
2. There is a natural surjective map
[ ] : Z → Z/mZ
a ↦ [a]          (1)

Note that this is clearly not injective as many integers have the same
remainder modulo m. Also observe that Z/mZ = {[0], [1], ..., [m − 1]}.

The following result allows us to define + and × on Z/mZ.

Proposition. Let m ∈ N. Then, ∀a, b, a′, b′ ∈ Z:

[a] = [a′] and [b] = [b′] ⇒ [a + b] = [a′ + b′] and [ab] = [a′b′].

Proof. This is a very good exercise. We went over it in class.

Definition. We define addition and multiplication on Z/mZ by

[a] + [b] = [a + b] ∀a, b ∈ Z
[a] × [b] = [a × b] ∀a, b ∈ Z

Remark 2.9. Note that there is ambiguity in the definition, because it seems
to depend on making a choice of representative of each residue class. The
proposition shows us that the resulting residue classes are independent of this
choice, hence + and × are well defined on Z/mZ. This means that addition
and multiplication mod m are well-defined.
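Well-definedness can be sanity-checked numerically: replacing a and b by other representatives of the same residue classes should not change the class of a + b or of ab. The sketch below checks this exhaustively for m = 5 over a small range of representatives (both choices are arbitrary); it is a check of examples, not a proof.

```python
def residue(a, m):
    """The remainder of a modulo m, used as the label of the class [a]."""
    return a % m

m = 5
reps = range(-2 * m, 2 * m)   # several representatives of every residue class

well_defined = all(
    residue(a + b, m) == residue(a2 + b2, m)
    and residue(a * b, m) == residue(a2 * b2, m)
    for a in reps for a2 in reps if residue(a, m) == residue(a2, m)
    for b in reps for b2 in reps if residue(b, m) == residue(b2, m)
)

print(well_defined)
```

Python's % operator returns a value in 0, ..., m − 1 for positive m even when a is negative, which matches the remainder of the Remainder Theorem.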

Our construction of + and × on Z/mZ is lifted from Z, hence they satisfy
the eight elementary properties that + and × satisfied on Z. In particular
[0] ∈ Z/mZ behaves like 0 ∈ Z:

[0] + [a] = [a] + [0] = [a], ∀[a] ∈ Z/mZ;

and [1] ∈ Z/mZ behaves like 1 ∈ Z:

[1] × [a] = [a] × [1] = [a], ∀[a] ∈ Z/mZ.

We say that [a] ∈ Z/mZ is non-zero if [a] ≠ [0]. Even though + and × on
Z/mZ share the same elementary properties with + and × on Z, they behave
quite differently in this case. As an example, notice that

[1] + [1] + [1] + · · · + [1] (m times) = [m] = [0]

Hence we can add 1 (in Z/mZ) to itself and eventually get 0 (in Z/mZ).

Also observe that if m is composite with m = rs, where r < m and s < m
then [r] and [s] are both non-zero (≠ [0]) in Z/mZ, but [r] × [s] = [rs] =
[m] = [0] ∈ Z/mZ. Hence we can have two non-zero elements multiplying
together to give zero, unlike in Z.
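A short search makes this concrete. In the sketch below residues are represented by 1, ..., m − 1, and the name zero_divisor_pairs is our own; it lists unordered pairs of non-zero residues whose product is [0].

```python
def zero_divisor_pairs(m):
    """Pairs (r, s) with 1 <= r <= s < m and r*s ≡ 0 (mod m)."""
    return [
        (r, s)
        for r in range(1, m)
        for s in range(r, m)
        if (r * s) % m == 0
    ]

print(zero_divisor_pairs(6))   # composite modulus
print(zero_divisor_pairs(7))   # prime modulus
```

For m = 6 the pair (2, 3) appears, as predicted by 6 = 2 · 3, while for m = 7 the list is empty.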

Proposition. For every m ∈ N, a ∈ Z the congruence

ax ≡ 1 mod m

has a solution (in Z) iff a and m are coprime.

Proof. This is just a restatement of the fact that a and m coprime ⇔ ∃u, v ∈
Z such that au + mv = 1.

Observe that the congruence above can be rewritten as [a] × [x] = [1] in
Z/mZ. We say that [a] ∈ Z/mZ has a multiplicative inverse if ∃[x] ∈ Z/mZ
such that [a] × [x] = [1]. Hence we deduce that the only elements of Z/mZ
with multiplicative inverse are those given by [a], where a is coprime to m.

Recall that Q had the extra property that all non-zero elements had
multiplicative inverses. When does this happen in Z/mZ? By the above we
see that this can happen ⇔ {1, 2, · · · , m − 1} are all coprime to m. This can
only happen if m is prime. We have thus proven the following:

Corollary. All non-zero elements of Z/mZ have a multiplicative inverse
⇔ m is prime.

Later this will be restated as Z/mZ is a field ⇔ m is a prime. These are
examples of things called finite fields.
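The proposition and its corollary can be explored computationally. The sketch below lists the invertible residues using math.gcd and computes inverses with Python's three-argument pow, which accepts the exponent −1 from Python 3.8 onward; the helper names units and inverse_mod are ours.

```python
from math import gcd

def units(m):
    """Residues a in {1, ..., m-1} that have a multiplicative inverse modulo m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def inverse_mod(a, m):
    """Return x in {0, ..., m-1} with a*x ≡ 1 (mod m).

    pow raises ValueError when a is not coprime to m, i.e. when no inverse exists.
    """
    return pow(a, -1, m)

print(units(12))            # only residues coprime to 12
print(units(7))             # every non-zero residue, since 7 is prime
print(inverse_mod(3, 7))    # a solution x of 3x ≡ 1 (mod 7)
```

Comparing len(units(m)) with m − 1 for a few values of m gives a quick experimental view of the corollary.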

Exercise 2.2. Show that if m is prime then the product of two non-zero
elements of Z/mZ is again non-zero.

Key Observation: There are naturally occurring sets (other than Z and Q)
which come equipped with a concept of + and ×, whose most basic properties
are the same as those of the usual addition and multiplication on Z or Q.
Don’t be fooled into thinking all other examples will come from
numbers. As we’ll see, there are many examples which are much
more exotic.

3 Groups

3.1 Basic Definitions

Definition. Let G be a set. A binary operation is a map of sets:

∗ : G × G → G.

For ease of notation we write ∗(a, b) = a ∗ b ∀a, b ∈ G. Any binary operation
on G gives a way of combining elements. As we have seen, if G = Z then +
and × are natural examples of binary operations. When we are talking about
a set G, together with a fixed binary operation ∗, we often write (G, ∗).
Fundamental Definition. A group is a set G, together with a binary
operation ∗, such that the following hold:

1. (Associativity): (a ∗ b) ∗ c = a ∗ (b ∗ c) ∀a, b, c ∈ G.
2. (Existence of identity): ∃e ∈ G such that a ∗ e = e ∗ a = a ∀a ∈ G.
3. (Existence of inverses): Given a ∈ G, ∃b ∈ G such that a ∗ b = b ∗ a = e.
Remark 3.1. 1. We have seen five different examples thus far: (Z, +),
(Q, +), (Q\{0}, ×), (Z/mZ, +), and (Z/mZ \ {[0]}, ×) if m is prime.
Another example is that of a real vector space under addition. Note
that (Z, ×) is not a group. Also note that this gives examples of groups
which are both finite and infinite. The more mathematics you learn the
more you’ll see that groups are everywhere.

2. A set with a single element admits one possible binary operation. This
makes it a group. We call this the trivial group.

3. A set with a binary operation is called a monoid if only the first two
properties hold. From this point of view, a group is a monoid in which
every element is invertible. (Z, ×) is a monoid but not a group. We
will not talk about monoids anymore, but it’s good to know that the
word exists.

4. Observe that in all of the examples given the binary operation is com-
mutative, i.e. a ∗ b = b ∗ a ∀a, b ∈ G. We give this a name:

Definition. A group (G, ∗) is called Abelian if it also satisfies

a ∗ b = b ∗ a ∀a, b ∈ G.

This is also called the commutative property.

The most basic Abelian group is (Z, +). Notice also that any vector space
is an Abelian group under its natural addition.

You might be wondering why we care about groups that are not Abelian (or
“non-abelian”). Here’s an important example of a non-abelian group that
you should have already seen in linear algebra:

GLn(R) := {M ∈ Mn(R) | det(M) ≠ 0}.

Note that a square matrix has nonzero determinant if and only if it is
invertible, i.e., has an inverse under matrix multiplication. This means that
every element of GLn(R) has an inverse under matrix multiplication. The
product of two invertible matrices is again invertible, so matrix multiplication
is a binary operation on GLn(R). Because matrix multiplication is associative
and because there is an identity matrix, (GLn(R), ×) forms a group. For
n ≥ 2, it is a non-abelian group.
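To see concretely that GL2(R) is non-abelian, here is a small sketch with two specific matrices. The matrices A and B and the helper names `matmul2` and `det2` are illustrative choices, not taken from the notes.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

AB = matmul2(A, B)
BA = matmul2(B, A)

print(det2(A), det2(B))   # both nonzero, so A and B lie in GL2(R)
print(AB)                 # ((2, 1), (1, 1))
print(BA)                 # ((1, 1), (1, 2))
print(AB == BA)           # False: A and B do not commute
```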

Notice that elements of GLn (R) can be thought of in a geometric way, as


symmetries of Rn , where the group operation is composition of symmetries
(i.e., composition of linear transformations). We will eventually encounter
many non-abelian groups related to symmetries of geometric objects.

So a group is a set with extra structure. In set theory we have the natural
concept of a map between sets (a function). The following is the analogous
concept for groups:
Fundamental Definition. Let (G, ∗) and (H, ◦) be two groups. A ho-
momorphism f , from G to H, is a map of sets f : G → H, such that
f (x ∗ y) = f (x) ◦ f (y) ∀x, y ∈ G. If G = H and f = IdG we call f the
identity homomorphism.
Remark 3.2. 1. Intuitively one should think of a homomorphism as
a map of sets which preserves the underlying group structure. It’s the
same idea as a linear map between vector spaces.

2. A homomorphism f : G → H which is bijective is called an isomor-


phism. Two groups are said to be isomorphic if there exists an
isomorphism between them. Intuitively two groups being isomorphic
means that they are the “same” group with relabelled elements.

3. A homomorphism from a group to itself (i.e. f : G → G) is called an


endomorphism. An endomorphism which is also an isomorphism is
called an automorphism.
Example 3.3. Here are some examples of homomorphisms:

1. The inclusion map from (Z, +) into (Q, +) is a homomorphism. This


is an example of an injective homomorphism that is not surjective.

2. The map from (Z, +) to (Z/mZ, +) sending a ∈ Z to [a] ∈ Z/mZ is a


surjective homomorphism that is not injective.

3. For any group G, the identity map from G to itself is an automorphism


of G.

4. Complex conjugation is an automorphism of (C, +).

5. The map from GLn (R) to (R \ {0}, ×) sending a matrix A to its de-
terminant det(A) is a homomorphism. This is because det(AB) =
det(A)det(B).

6. The exponential function x ↦ e^x is a homomorphism from (R, +) to
(R \ {0}, ×). It is injective but not surjective.

7. The complex exponential z ↦ e^z from (C, +) to (C \ {0}, ×) is a
homomorphism. In contrast with the real exponential function, it is
surjective but not injective, because e^z = e^{z+2πi}.
8. The logarithm is a homomorphism from (R>0 , ×) to (R, +). In fact, it
is an isomorphism.
9. For any group G and any group H, the map sending all elements of G
to eH ∈ H is a homomorphism. It is the trivial homomorphism from
G to H.
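The homomorphism property f(x ∗ y) = f(x) ◦ f(y) can be spot-checked numerically on a finite sample of elements. This is only a sanity check under stated assumptions, not a proof: the helper `is_homomorphism_on`, the modulus m = 6 and the sample points are all illustrative, and the real exponential is compared up to floating-point tolerance.

```python
import math

def is_homomorphism_on(samples, f, op_G, op_H, close=lambda u, v: u == v):
    """Check f(x * y) == f(x) o f(y) on every pair of sample elements."""
    return all(close(f(op_G(x, y)), op_H(f(x), f(y)))
               for x in samples for y in samples)

m = 6
add = lambda a, b: a + b
add_mod_m = lambda a, b: (a + b) % m
reduce_mod_m = lambda a: a % m            # the map a -> [a]

# (Z, +) -> (Z/mZ, +), a -> [a]
ok_reduction = is_homomorphism_on(range(-20, 21), reduce_mod_m, add, add_mod_m)

# exp: (R, +) -> (R \ {0}, ×), compared up to rounding error
ok_exp = is_homomorphism_on(
    [-2.0, -0.5, 0.0, 1.0, 3.5], math.exp, add, lambda u, v: u * v,
    close=lambda u, v: math.isclose(u, v, rel_tol=1e-12),
)

# For contrast: squaring is not a homomorphism (Z, +) -> (Z, +)
ok_square = is_homomorphism_on(range(-5, 6), lambda a: a * a, add, add)

print(ok_reduction, ok_exp, ok_square)
```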
Proposition 3.4. Let (G, ∗), (H, ◦) and (M, □) be three groups. Let f : G →
H and g : H → M be homomorphisms. Then the composition gf : G → M is
a homomorphism.

Proof. Let x, y ∈ G. Then gf(x ∗ y) = g(f(x) ◦ f(y)) = gf(x) □ gf(y).


Remark 3.5. Composition of homomorphisms gives the collection of
endomorphisms of a group the structure of a monoid. The subset of
automorphisms has the structure of a group under composition. We denote it by
Aut(G). This is analogous to the collection of n × n invertible matrices being
a group under matrix multiplication.
Proposition 3.6. Let (G, ∗) be a group. The identity element is unique.

Proof. Assume e, e′ ∈ G both behave like the identity. Then e = e ∗ e′ = e′
(the first equality because e′ is an identity, the second because e is).
Proposition 3.7. Let (G, ∗) be a group. For a ∈ G there is only one element
which behaves like the inverse of a.

Proof. Assume a ∈ G has two inverses, b, c ∈ G. Then:


(a ∗ b) = e
c ∗ (a ∗ b) = c∗e
(c ∗ a) ∗ b = c (associativity and identity)
e∗b = c
b = c

The first proposition tells us that we can write e ∈ G for the identity and
it is well-defined. Similarly the second proposition tells us that for a ∈ G
we can write a^{-1} ∈ G for the inverse in a well-defined way. The proof of
the second result gives a good example of how we prove results for abstract
groups. We use only the axioms, nothing else.

Given r ∈ Z and a ∈ G, we write

a^r := a ∗ a ∗ ··· ∗ a (r times), if r > 0,
a^r := e, if r = 0,
a^r := a^{-1} ∗ a^{-1} ∗ ··· ∗ a^{-1} (−r times), if r < 0.

Cancellation Law for Groups. Let G be a group and let a, b, c ∈ G. Then

a ∗ c = a ∗ b ⇒ c = b and c ∗ a = b ∗ a ⇒ c = b

Proof. Compose on the left or right by a^{-1} ∈ G, then apply the associativity,
inverse and identity axioms.
Remark 3.8. Here are a couple of facts that follow easily from the group
axioms:

1. For any x, y ∈ G, where (G, ∗) is a group, we have (x ∗ y)^{-1} = y^{-1} ∗ x^{-1}.
To prove this, just check that y^{-1} ∗ x^{-1} is an inverse of x ∗ y using the
associative property and the definition of x^{-1} and y^{-1}.

2. (x ∗ y)^{-1} = x^{-1} ∗ y^{-1} iff x ∗ y = y ∗ x, i.e., iff x and y commute.

3. Similarly, (x ∗ y)^2 = x^2 ∗ y^2 iff x ∗ y = y ∗ x.

Proposition 3.9. Let (G, ∗) and (H, ◦) be two groups and f : G → H a
homomorphism. Let eG ∈ G and eH ∈ H be the respective identities. Then

• f(eG) = eH.

• f(x^{-1}) = (f(x))^{-1}, ∀x ∈ G.

Proof. • f(eG) ◦ eH = f(eG) = f(eG ∗ eG) = f(eG) ◦ f(eG). By the
cancellation law we deduce that f(eG) = eH.

• Let x ∈ G. Then eH = f(eG) = f(x ∗ x^{-1}) = f(x) ◦ f(x^{-1}) and
eH = f(eG) = f(x^{-1} ∗ x) = f(x^{-1}) ◦ f(x). Hence f(x^{-1}) = (f(x))^{-1}.

3.1.1 Cayley Tables for Binary Operations and Groups

Recall that a binary operation on a set S is a map from S × S to S. We
can express the set S × S as a grid or table whose rows and columns each
correspond to the elements of S. For finite S, we can actually
draw this. For S = {a, b, c, d}, we illustrate it as follows:

  | a b c d
--+--------
a |
b |
c |
d |

Each empty cell in this table corresponds to a unique element of S × S;


for example, the pair (b, a) corresponds to the cell in the second row and first
column (so the row always comes first in the pair). A binary operation on S
is then simply a way of putting an element of S in each of the 16 cells of the
table.

Example 3.10. Here’s an example of a binary operation on S:

  | a b c d
--+--------
a | a b c d
b | a c a d
c | d b d c
d | b c a a

This is what you know from grade school as a multiplication table. Many
abstract algebra textbooks call it the Cayley table of the binary operation.

However, this binary operation does not make S into a group. To see
why, we prove the following proposition:
Proposition 3.11. In the Cayley table of a group G, every element of G
appears exactly once in every row and in every column.

Proof. To say that every row contains every element of G exactly once is to
say that if (G, ∗) is a group, then for fixed a, b ∈ G the equation
a∗x=b
has exactly one solution x ∈ G.

Similarly, to say that every column contains every element of G exactly


once is to say that the equation
x∗a=b
has exactly one solution.

Why is this true? Well, the fact that it has at most one solution is the
Cancellation Law for groups, which we already proved.

How do we know that there is at least one solution? Well, for the row
equation, we just take x = a−1 ∗ b, and for the column equation, we take
x = b ∗ a−1 .

How do we know that the table of Example 3.10 is not the Cayley table of a
group? We can note, for example, that the second row contains a twice rather
than once (we could also note that it does not contain b).
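The row-and-column criterion of Proposition 3.11 is easy to test mechanically. Below is a minimal sketch; the function name and the encoding of a table as a list of rows are assumptions made for illustration. Keep in mind that the criterion is only a necessary condition: a table can pass it and still fail associativity.

```python
def rows_and_columns_are_permutations(table, elements):
    """table[i][j] is the entry in row i, column j of a Cayley table."""
    target = sorted(elements)
    rows_ok = all(sorted(row) == target for row in table)
    cols_ok = all(sorted(col) == target for col in zip(*table))
    return rows_ok and cols_ok

S = ["a", "b", "c", "d"]

# The table of Example 3.10, one string per row.
example_3_10 = ["abcd", "acad", "dbdc", "bcaa"]

# Cayley table of (Z/4Z, +): entry (i, j) is (i + j) mod 4.
z4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]

print(rows_and_columns_are_permutations(example_3_10, S))   # False
print(rows_and_columns_are_permutations(z4, range(4)))      # True
```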

As another example, we can consider ({0, 1}, ×). The Cayley table is

  | 0 1
--+----
0 | 0 0
1 | 0 1

One can check that this is not a group because, for example, 0 appears
twice in the first row.

Next, let’s give an example of the Cayley table of a binary operation
that actually defines a group. Specifically, here is the Cayley table of (Z/4Z, +),
where we let a = [0], b = [1], c = [2], and d = [3]:

  | a b c d
--+--------
a | a b c d
b | b c d a
c | c d a b
d | d a b c

Finally, here’s another Cayley table of a group on the set S = {a, b, c, d}:

  | a b c d
--+--------
a | a b c d
b | b a d c
c | c d a b
d | d c b a

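This group is called the Klein Four-group¹. Later, in Section 3.8, we will
see that it is isomorphic to something we will call Z/2Z × Z/2Z. It is also
isomorphic to the group

\left( \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}, \times \right).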
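One way to convince yourself of the last claim is to multiply the four diagonal matrices with diagonal entries ±1 and compare with the table above. The sketch below assumes the labelling a, b, c, d in the order the matrices are listed; that labelling and the helper names are illustrative choices.

```python
# Each diagonal matrix is stored by its pair of diagonal entries.
labels = {
    "a": (1, 1),      # the identity matrix
    "b": (-1, 1),
    "c": (1, -1),
    "d": (-1, -1),
}
names = {diag: name for name, diag in labels.items()}

def times(x, y):
    """Multiply two diagonal matrices: the diagonal entries multiply."""
    (p, q), (r, s) = labels[x], labels[y]
    return names[(p * r, q * s)]

order = "abcd"
table = ["".join(times(x, y) for y in order) for x in order]

for row_label, row in zip(order, table):
    print(row_label, "|", " ".join(row))
```

The printed rows agree with the Cayley table of the Klein Four-group given above, so the labelling is an isomorphism.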

3.2 Subgroups, Cosets and Lagrange’s Theorem

In linear algebra, we can talk about subspaces of vector spaces. We have an
analogous concept in group theory.

¹ There is also a music group called the Klein Four Group, famous for https://www.youtube.com/watch?v=BipvGD-LCjU.

Intuitively, a subgroup H of (G, ∗) is just a subset H of G that is a group
under the same operation as G. Formally, we can define it as follows:
Definition. Let (G, ∗) be a group. A subgroup of G is a subset H ⊂ G
such that

1. e ∈ H

2. x, y ∈ H ⇒ x ∗ y ∈ H

3. x ∈ H ⇒ x−1 ∈ H
Remark 3.12. If H is a subgroup of G, and K is a subgroup of H (all with
the same operation), then K is a subgroup of G.
Example 3.13. 1. If G is any group, then {eG } and G are both subgroups
of G. The former is called the trivial subgroup. The latter is a non-
proper subgroup (so all subgroups not equal to the whole group are
called proper subgroups).

2. The group (Z, +) is a subgroup of (Q, +), which is a subgroup of (R, +),
which is itself a subgroup of (C, +).

3. The group (Q\{0}, ×) is a subgroup of (R\{0}, ×), which is a subgroup


of (C \ {0}, ×).

4. The set of complex numbers with absolute value 1 is a subgroup of


(C \ {0}, ×).

5. If m ∈ Z, then the subset mZ := {ma | a ∈ Z} is a subgroup of (Z, +).
Note that for m ≠ 0 it is isomorphic to (Z, +).

6. If V is a vector space over R then it is naturally an Abelian group
under addition. If W is a subspace then it is also a subgroup
under addition.

7. The set of purely imaginary numbers {ri | r ∈ R} is a subgroup of


(C, +). Note that this is a special case of the previous example.

8. The set {±1} is a subgroup of (Q \ {0}, ×).

9. The set {2^n | n ∈ Z} of integer powers of 2 is a subgroup of (Q\{0}, ×).

10. The set of matrices of the form

\left\{ \begin{pmatrix} a & b \\ 0 & c \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R},\ ac \neq 0 \right\}

is a subgroup of GL2(R).

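11. The subsets

\left\{ \begin{pmatrix} a & 0 \\ 0 & c \end{pmatrix} \;\middle|\; a, c \in \mathbb{R},\ ac \neq 0 \right\}

and

\left\{ \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \;\middle|\; b \in \mathbb{R} \right\}

are both subgroups of the group in the previous example (and therefore
also of GL2(R)).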

Proposition. H, K ⊂ G subgroups ⇒ H ∩ K ⊂ G is a subgroup.

Proof. 1. As H, K subgroups, e ∈ H and e ∈ K ⇒ e ∈ H ∩ K.

2. x, y ∈ H ∩ K ⇒ x ∗ y ∈ H and x ∗ y ∈ K ⇒ x ∗ y ∈ H ∩ K.

3. x ∈ H ∩ K ⇒ x−1 ∈ H and x−1 ∈ K ⇒ x−1 ∈ H ∩ K.

This result clearly extends to any collection of subgroups of G.

Let (G, ∗) be a group and let H ⊂ G be a subgroup. Let us define a re-


lation on G using H as follows:

Given x, y ∈ G, x ∼ y ⇔ x^{-1} ∗ y ∈ H

Proposition 3.14. This gives an equivalence relation on G.

Proof. We need to check the three properties of an equivalence relation:

1. (Reflexive) e ∈ H ⇒ x^{-1} ∗ x ∈ H ∀x ∈ G ⇒ x ∼ x

2. (Symmetric) x ∼ y ⇒ x^{-1} ∗ y ∈ H ⇒ (x^{-1} ∗ y)^{-1} ∈ H ⇒ y^{-1} ∗ x ∈ H ⇒ y ∼ x

3. (Transitive) x ∼ y, y ∼ z ⇒ x^{-1} ∗ y, y^{-1} ∗ z ∈ H ⇒ (x^{-1} ∗ y) ∗ (y^{-1} ∗ z) ∈ H ⇒ x^{-1} ∗ z ∈ H ⇒ x ∼ z

Definition. We call the equivalence classes of the above equivalence relation


left cosets of H in G.
Proposition. For x ∈ G the equivalence class (or left coset) containing x
equals

xH := {x ∗ h | h ∈ H} ⊂ G

Proof. The easiest way to show that two subsets of G are equal is to prove
containment in both directions.

x ∼ y ⇔ x^{-1} ∗ y ∈ H ⇔ x^{-1} ∗ y = h for some h ∈ H ⇒ y = x ∗ h ∈ xH.
Therefore {Equivalence class containing x} ⊂ xH.

y ∈ xH ⇒ y = x ∗ h for some h ∈ H ⇒ x^{-1} ∗ y ∈ H ⇒ x ∼ y. Therefore
xH ⊂ {Equivalence class containing x}.

This has the following very important consequence:

Corollary 3.15. For x, y ∈ G, xH = yH ⇔ x^{-1} ∗ y ∈ H.

Proof. By the above proposition we know that xH = yH ⇔ x ∼ y ⇔ x^{-1} ∗ y ∈ H.
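Here is a small sketch of cosets and of the criterion in Corollary 3.15, worked out in (Z/12Z, +). The group, the subgroup H = {[0], [4], [8]} and the function name `left_coset` are assumptions chosen for illustration; since the operation is addition, x^{-1} ∗ y becomes −x + y.

```python
n = 12
G = range(n)
H = {0, 4, 8}                       # the multiples of [4] in Z/12Z

def left_coset(x):
    """The left coset x + H inside (Z/12Z, +)."""
    return frozenset((x + h) % n for h in H)

cosets = {left_coset(x) for x in G}

# xH = yH  <=>  x^{-1} * y in H, which additively reads (-x + y) mod n in H
criterion_holds = all(
    (left_coset(x) == left_coset(y)) == ((-x + y) % n in H)
    for x in G for y in G
)

print(sorted(sorted(c) for c in cosets))
print(criterion_holds)
```

Note how the same coset shows up with several representatives: for instance the cosets of 1, 5 and 9 coincide.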

It is very important that you understand and remember this fact. An immediate
consequence is that y ∈ xH ⇒ yH = xH. Hence left cosets can in general be
written with different representatives at the front, just like an equivalence
class modulo n can be written with many different representatives.

Also observe that the equivalence class containing e ∈ G is just H. Hence
the only equivalence class which is a subgroup is H, as no other contains the
identity. If H = {e} then the left cosets are singleton sets.
Remark 3.16. Let G = R^3, thought of as a group under addition. Let H
be a two dimensional subspace. Recall this is a subgroup under addition.
Geometrically H is a plane which contains the origin. Geometrically the left
cosets of H in R^3 are the planes which are parallel to H.
Definition 3.17. Let (G, ∗) be a group and H ⊂ G a subgroup. We denote
by G/H the set of left cosets of H in G. If the size of this set is finite then
we say that H has finite index in G. In this case we write

(G : H) = |G/H|,

and call it the index of H in G.

For m ∈ N, the subgroup mZ ⊂ Z has index m. Note that Z/mZ is
naturally the set of residue classes modulo m previously introduced. The
vector space example in the above remark is not of finite index, as there are
infinitely many parallel planes in R^3.
Proposition 3.18. Let x ∈ G. The map (of sets)

φ : H −→ xH
h −→ x ∗ h

is a bijection.

Proof. We need to check that φ is both injective and surjective. For injectiv-
ity observe that for g, h ∈ H, φ(h) = φ(g) ⇒ x ∗ h = x ∗ g ⇒ h = g. Hence
φ is injective. For surjectivity observe that g ∈ xH ⇒ ∃h ∈ H such that
g = x ∗ h ⇒ g = φ(h).

Now let’s restrict to the case where G is a finite group.
Proposition. Let (G, ∗) be a finite group and H ⊂ G a subgroup. Then
∀x ∈ G , |xH| = |H|.

Proof. We know that there is a bijection between H and xH. Both must be
finite because they are contained in a finite set. A bijection exists between
two finite sets if and only if they have the same cardinality.
Lagrange’s Theorem. Let (G, ∗) be a finite group and H ⊂ G a subgroup.
Then |H| divides |G|.

Proof. We can use H to define the above equivalence relation on G. Be-


cause it is an equivalence relation, its equivalence classes cover G and are all
disjoint. Recall that this is called a partition of G.

We know that each equivalence class is of the form xH for some (clearly
non-unique in general) x ∈ G. We know that any left coset of H has size
equal to |H|. Hence we have partitioned G into subsets each of size |H|. We
conclude that |H| divides |G|.
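Lagrange’s Theorem can be observed experimentally on a small group. The sketch below enumerates every subset of (Z/12Z, +), keeps those satisfying the three subgroup conditions, and records their sizes. The choice of Z/12Z and the helper `is_subgroup` are illustrative assumptions, and a brute-force search like this is only feasible for very small groups.

```python
from itertools import combinations

n = 12
G = list(range(n))

def is_subgroup(H):
    """The three subgroup conditions for a subset H of (Z/nZ, +)."""
    return (0 in H
            and all((x + y) % n in H for x in H for y in H)
            and all((-x) % n in H for x in H))

subgroups = [set(c) for k in range(1, n + 1)
             for c in combinations(G, k) if is_subgroup(set(c))]
orders = sorted(len(H) for H in subgroups)

print(orders)                              # [1, 2, 3, 4, 6, 12]
print(all(n % k == 0 for k in orders))     # True: every order divides 12
```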

This is a powerful result. It tightly controls the behavior of subgroups of a


finite group. For example:
Corollary 3.19. Let p ∈ N be a prime number. Let (G, ∗) be a finite group
of order p. Then the only subgroups of G are G and {e}.

Proof. Let H be a subgroup of G. By Lagrange |H| divides p. But p is prime


so either |H| = 1 or |H| = p. In the first case H = {e}. In the second case
H = G.

3.3 Generating Sets for Groups

Definition. Let G be a group and X ⊂ G be a subset. We define the


subgroup generated by X to be the intersection of all subgroups of G
containing X. We denote it by gp(X) ⊂ G.

Remark 3.20. 1. gp(X) is the minimal subgroup containing X. By min-
imal we mean that if H ⊂ G is a subgroup such that X ⊂ H then
gp(X) ⊂ H.

2. A more constructive way of defining gp(X) is as all possible finite com-


positions of elements of X and their inverses. I leave it as an exercise
to check that this subset is indeed a subgroup.

3. Let us consider the group (Z, +) and X = {1} ⊂ Z. Then gp(X)=Z.


This is the precise sense in which Z is “generated” by 1 under addition.
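The constructive description in part 2 translates directly into a closure computation. The following is a minimal sketch for finite groups only (the loop would not terminate for (Z, +)); the function name `generated_subgroup` and the examples in Z/12Z are illustrative assumptions.

```python
def generated_subgroup(X, op, inverse, identity):
    """All finite compositions of elements of X and their inverses."""
    generators = set(X) | {inverse(x) for x in X}
    subgroup = {identity}
    frontier = {identity}
    while frontier:
        # multiply the newest elements by every generator, keep what is new
        new = {op(g, s) for g in frontier for s in generators} - subgroup
        subgroup |= new
        frontier = new
    return subgroup

n = 12
add = lambda a, b: (a + b) % n
neg = lambda a: (-a) % n

print(sorted(generated_subgroup({8}, add, neg, 0)))      # [0, 4, 8]
print(sorted(generated_subgroup({4, 6}, add, neg, 0)))   # [0, 2, 4, 6, 8, 10]
print(sorted(generated_subgroup({1}, add, neg, 0)))      # all of Z/12Z
```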

Definition 3.21. We say that a subset X ⊆ G generates a group (G, ∗) if
gp(X) = G.

Definition. We say a group (G, ∗) is finitely generated if it is generated


by a finite subset.

Remark 3.22. 1. Clearly all finite groups are finitely generated.

2. The group (Z, +) is also finitely generated, even though Z is infinite.

3. The fact that there are infinitely many primes implies that (Q\{0}, ×)
is not finitely generated.

Definition 3.23. A group (G, ∗) is said to be cyclic if ∃x ∈ G such that
gp({x}) = G, i.e. G can be generated by a single element. In concrete terms
this means that G = {x^n | n ∈ Z}.

By the above observations (Z, +) and (Z/mZ, +) are examples.

Proposition 3.24. Any group of prime order is cyclic.

Proof. Let G be a group of prime order p. Let x be a non-identity element of


G. Then gp({x}) ⊂ G is non-trivial and by Lagrange’s theorem must have
order p. Hence G = gp({x}).

Remark 3.25. It is important to understand that not all groups are cyclic.
We’ll see many examples throughout the course.

Let G be a group (not necessarily cyclic). For r, s ∈ Z and x ∈ G,
x^r x^s = x^{r+s} = x^{s+r} = x^s x^r. Hence gp({x}) ⊂ G is Abelian. We deduce that
all cyclic groups are Abelian.

Theorem. Let G be a cyclic group. Then

1. If G is infinite, G ≅ (Z, +)
2. If |G| = m ∈ N, then G ≅ (Z/mZ, +)

Proof. We have two cases to consider.

1. If G = gp({x}), then G = {···, x^{-2}, x^{-1}, e, x, x^2, ···}. Assume all
elements in this set are distinct; then we can define a map of sets:

φ : G → Z
x^n ↦ n

Then, ∀a, b ∈ Z, φ(x^a ∗ x^b) = φ(x^{a+b}) = a + b = φ(x^a) + φ(x^b), so φ is
a homomorphism which by assumption is bijective. Thus, (G, ∗) is
isomorphic to (Z, +).

2. Now assume ∃a, b ∈ Z, b > a such that x^a = x^b. Then x^{b−a} = e ⇒
x^{-1} = x^{b−a−1} ⇒ G = {e, ···, x^{b−a−1}}. In particular G is finite.
Choose minimal m ∈ N such that x^m = e. Then G = {e, x, ···, x^{m−1}}
and all its elements are distinct by minimality of m. Hence |G| = m.

Define the map

φ : G → Z/mZ
x^n ↦ [n] for n ∈ {0, ..., m − 1}

This is clearly a surjection, hence a bijection because |G| = |Z/mZ| = m.
Again ∀a, b ∈ {0, ..., m−1} we know φ(x^a ∗ x^b) = φ(x^{a+b}) = [a+b] = [a]+[b] =
φ(x^a) + φ(x^b), where the second equality holds even when a + b ≥ m because
x^m = e. So φ is a homomorphism. Hence (G, ∗) is isomorphic to (Z/mZ, +).
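As a concrete instance of the second case, take the group (Z/7Z \ {[0]}, ×) from Remark 3.1 and the element x = [3]. The sketch below lists the powers of x and checks that x^n ↦ [n] is a bijective homomorphism onto (Z/6Z, +). The choice of p = 7 and x = 3 is an illustrative assumption, not something taken from the notes.

```python
p = 7
x = 3
m = 6  # the order of x in (Z/7Z \ {[0]}, ×), verified below

powers = [pow(x, n, p) for n in range(m)]        # x^0, ..., x^{m-1}
phi = {pow(x, n, p): n for n in range(m)}        # the map x^n -> [n]

# the powers of x exhaust the whole group, so x generates it
is_bijection = sorted(powers) == list(range(1, p))

# phi(g * h) = phi(g) + phi(h), with addition taken mod m
is_hom = all(phi[(g * h) % p] == (phi[g] + phi[h]) % m
             for g in powers for h in powers)

print(powers)                  # [1, 3, 2, 6, 4, 5]
print(is_bijection, is_hom)    # True True
```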

Hence two finite cyclic groups of the same size are isomorphic. What are the
possible subgroups of a cyclic group?

Proposition. A subgroup of a cyclic group is cyclic.

Proof. If H is trivial we are done. Hence assume that H is non-trivial. By
the above we need to check two cases.

1. (G, ∗) ≅ (Z, +). Let H ⊂ Z be a non-trivial subgroup. Choose m ∈ N
minimal such that m ∈ H (m ≠ 0). Hence mZ = {ma | a ∈ Z} ⊆ H.
Assume ∃n ∈ H such that n ∉ mZ. By the remainder theorem, n =
qm + r with r, q ∈ Z and 0 < r < m, so r = n − qm ∈ H. This is a contradiction
by the minimality of m. Therefore mZ = H. Observe that gp({m}) = mZ ⊂
Z. Hence H is cyclic.

2. (G, ∗) ≅ (Z/mZ, +). Let H ⊂ Z/mZ be a non-trivial subgroup. Again,
choose n ∈ N minimal and positive such that [n] ∈ H. The same argument
as above shows that the containment gp({[n]}) ⊆ H is actually
equality. Hence H is cyclic.

Proposition 3.26. Let (G, ∗) be a finite cyclic group of order d. Let m ∈ N


such that m divides |G|. Then there is a unique cyclic subgroup of order m.

Proof. Because |G| = d we know that G ≅ (Z/dZ, +). Hence we need only
answer the question for this latter group. Let m be a divisor of d. Then
if n = d/m then gp({[n]}) ⊂ Z/dZ is cyclic of order m by construction. If
H ⊂ Z/dZ is a second subgroup of order m then by the above proof we
know that the minimal n ∈ N such that [n] ∈ H must be n = d/m. Hence
H = gp({[n]}).
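Proposition 3.26 can be checked directly for a specific d. The sketch below uses d = 12 (an arbitrary illustrative choice) and relies on the fact, proved above, that every subgroup of Z/dZ is of the form gp({[n]}); the helper `cyclic_subgroup` is a name invented for this sketch.

```python
d = 12

def cyclic_subgroup(n):
    """gp({[n]}) inside (Z/dZ, +): all multiples of n, reduced mod d."""
    return {(k * n) % d for k in range(d)}

divisors = [m for m in range(1, d + 1) if d % m == 0]

# For each divisor m of d, the subgroup generated by [d/m] has order m.
orders = {m: len(cyclic_subgroup(d // m)) for m in divisors}

# Uniqueness: any gp({[n]}) of order m coincides with gp({[d/m]}).
unique = all(
    cyclic_subgroup(n) == cyclic_subgroup(d // len(cyclic_subgroup(n)))
    for n in range(d)
)

print(orders)   # {1: 1, 2: 2, 3: 3, 4: 4, 6: 6, 12: 12}
print(unique)   # True
```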

Let (G, ∗) be a group (not necessarily cyclic) and x ∈ G. We call gp({x}) ⊂ G


the subgroup generated by x. By definition it is cyclic.

Definition 3.27. If |gp({x})| < ∞ we say that x is of finite order and its
order, written ord(x) equals |gp({x})|. If not we say that x is of infinite order.

Remark 3.28. 1. Observe that by the above we know that if x ∈ G is of
finite order, then

ord(x) = minimal m ∈ N such that x^m = e

2. e ∈ G is the only element of G of order 1

3. For n ∈ Z, if x^n = e, then ord(x) divides n. This essentially follows
from the second part of the theorem, because n is congruent to 0 modulo
ord(x) if and only if it’s divisible by ord(x).

4. The only element with finite order in Z is 0.

Proposition 3.29. Let (G, ∗) be a finite group and x ∈ G. Then ord(x)
divides |G| and x^{|G|} = e.

Proof. By definition ord(x) = |gp({x})|. Therefore, by Lagrange’s theorem,
ord(x) must divide |G|. Also note that by definition x^{ord(x)} = e. Hence

x^{|G|} = x^{ord(x) × (|G|/ord(x))} = e^{|G|/ord(x)} = e.
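To see Proposition 3.29 in action, the sketch below computes the order of every element of (Z/7Z \ {[0]}, ×), a group of size 6. The group and the helper `order` are illustrative assumptions; the loop assumes the element has finite order, which is automatic in a finite group.

```python
def order(x, op, identity):
    """Smallest m >= 1 with x^m = e; assumes x has finite order."""
    m, power = 1, x
    while power != identity:
        power = op(power, x)
        m += 1
    return m

p = 7
units = range(1, p)                  # (Z/7Z \ {[0]}, ×), a group of size 6
mul = lambda a, b: (a * b) % p

orders = {x: order(x, mul, 1) for x in units}

print(orders)                                           # {1: 1, 2: 3, 3: 6, 4: 3, 5: 6, 6: 2}
print(all(len(units) % k == 0 for k in orders.values()))  # ord(x) divides |G|
print(all(pow(x, len(units), p) == 1 for x in units))     # x^{|G|} = e
```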

3.4 Permutation Groups and Finite Symmetric Groups

Definition. Let S be a set. We define the group of permutations of S


to be the set of bijections from S to itself, denoted Σ(S), where the group
binary operation is composition of functions.
Remark 3.30. 1. By composition of functions we always mean on the
left, i.e. ∀f, g ∈ Σ(S) and s ∈ S (f ∗ g)(s) = f (g(s)).
2. Associativity clearly has to hold. The identity element e of this group
is the identity function on S, i.e. ∀s ∈ S, e(s) = s. Inverses exist
because any bijective map from a set to itself has an inverse map.
3. Let n ∈ N. We write Symn := Σ({1, 2, ..., n}). If S is any set of
cardinality n then Σ(S) is isomorphic to Symn , the isomorphism being
induced by writing a bijection from S to {1, 2, ..., n}. We call these
groups the finite symmetric groups.
4. Observe that given σ ∈ Σ(S) we can think about σ as “moving” S
around. In this sense the group Σ(S) naturally “acts” on S, and what
σ does is called its action. The word “action” here will be made more
precise in Section 3.5.
5. If σ, τ ∈ Σ(S), we often write στ to denote σ ◦ τ . I.e., as for groups in
general, we leave out the group operation, but it is understood that
the group operation is composition of functions.

The symmetric groups Symn give lots of good examples of finite groups
(both themselves, and, as we shall see later, some of their subgroups). They
also give a bunch of examples of non-abelian groups besides groups of matri-
ces. Let’s study them in detail.

I recommend that you also read Section 5.1 of Judson’s book.

3.4.1 Active vs. Passive Notation for Permutations

Before we go on, I need to tell you that there are two different ways to write
the same permutation, and that can lead to some confusion.

Recall that a permutation of a set S is a bijection from S to itself. So let
S = {1, 2, 3}, and let σ be a permutation for which σ(1) = 2, σ(2) = 3, and
σ(3) = 1. Thus σ ∈ Sym3 . We often represent σ by the notation (123).

We can think of σ as sending 1 to 2, sending 2 to 3, and sending 3 to 1.


We might therefore represent σ as follows:

1 −→ 2
σ : 2 −→ 3
3 −→ 1

In general, if f is a function such that f (A) = B, then we write A −→ B.


This justifies the notation. Another way to write the same permutation is

\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}

We refer to the beginning of Chapter 5 of Judson for this notation.

Writing down all of these arrows is a bit cumbersome. So it’s easier to refer
to a permutation by a sequence of numbers. Notice that from the notation
above we get the sequence 2, 3, 1. So it would seem that the sequence 2, 3, 1
is a good way to represent this permutation.

HOWEVER, using the sequence 2, 3, 1 to refer to σ is known as passive


notation.

Here’s a different way to think of it. The fact that σ(1) = 2 suggests that
we should move 1 to the place of 2. Similarly, we should move 2 to where 3
was, and move 3 to where 1 was.

This gives the sequence 3, 1, 2, representing the same permutation. This


is active notation. The distinction between active and passive is discussed in
https://en.wikiversity.org/wiki/Permutation_notation#Active_vs._passive.

We will primarily use passive notation via the unambiguous notation

\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix},

and because this is the notation in Judson.

However, one has to be extra careful either way when composing two
permutations. Let’s also consider

τ = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix},

and let’s try to understand τσ. As explained in Remark 3.30.1, this means
that we first apply σ and then apply τ.

First, let’s find τσ carefully by thinking of τ and σ as functions. We
have σ(1) = 2, σ(2) = 3, and σ(3) = 1. Thus τ(σ(1)) = τ(2) = 3,
τ(σ(2)) = τ(3) = 2, and τ(σ(3)) = τ(1) = 1. We therefore find that τσ
is the permutation

\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix},

also known as (13), that switches 1 and 3.
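Thinking of permutations as functions is also the safest way to compute with them on a computer. Here is a minimal sketch of the computation just carried out, with permutations stored as Python dictionaries; the dictionary encoding and the helper `compose` are illustrative choices.

```python
sigma = {1: 2, 2: 3, 3: 1}      # the permutation written (123)
tau = {1: 1, 2: 3, 3: 2}        # the transposition (23)

def compose(f, g):
    """Return f∘g, i.e. the map s -> f(g(s)): apply g first, then f."""
    return {s: f[g[s]] for s in g}

tau_sigma = compose(tau, sigma)
sigma_tau = compose(sigma, tau)

print(tau_sigma)    # {1: 3, 2: 2, 3: 1}, i.e. (13)
print(sigma_tau)    # {1: 2, 2: 1, 3: 3}, i.e. (12)
```

The two results differ, so the order in which the permutations are composed matters.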

Let’s see what happens when we think about σ and τ in active notation.
We apply σ to get the sequence 3, 1, 2. Then, when we wish to apply τ = (23)
to the sequence 3, 1, 2, there is some ambiguity. Do we switch the numbers
3 and 2, to get 2, 1, 3, or do we switch the numbers in the second and third
places to get 3, 2, 1? Since we know the answer should give us 3, 2, 1, it turns
out that when applying a permutation to a sequence in active notation, the
permutation (23) switches whatever is in the second and third places.

On the other hand, if we write σ in passive notation, we get 2, 3, 1, then


τ means that we switch the numbers 2 and 3, wherever they now appear.

This issue is discussed at https://gowers.wordpress.com/2011/10/16/permutations/,
which uses passive and not active notation. If you use
active notation, you have to do the opposite of what Gowers writes.

IN CONCLUSION: The best way to be careful is to think of permutations


as functions and to compose the functions. But if you want to think of
permutations as sequences, then you have to be consistent when composing
permutations.

Remark 3.31. Notice that the active notation for σ^{-1} is 2, 3, 1, and the
passive notation for σ^{-1} is 3, 1, 2. This is not an accident. In general, for
ANY permutation σ, the active notation for σ^{-1} is the passive notation for
σ, and vice versa.

One important consequence of this is that if σ has order 2 (so that
σ = σ^{-1}), then the active and passive notations for
σ are the same. This is the case for τ above.
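The relationship between the two notations can be made precise in a few lines. The sketch below assumes permutations of {1, ..., n} stored as dictionaries; the function names `passive`, `active` and `inverse` are invented for illustration.

```python
def passive(perm):
    """The bottom row of the two-row notation: perm(1), perm(2), ..."""
    return [perm[i] for i in sorted(perm)]

def active(perm):
    """Move each i to the place perm(i): position perm(i) holds i."""
    seq = [None] * len(perm)
    for i, target in perm.items():
        seq[target - 1] = i
    return seq

def inverse(perm):
    """The inverse permutation: swap the roles of inputs and outputs."""
    return {v: k for k, v in perm.items()}

sigma = {1: 2, 2: 3, 3: 1}
tau = {1: 1, 2: 3, 3: 2}

print(passive(sigma), active(sigma))                      # [2, 3, 1] [3, 1, 2]
print(active(inverse(sigma)), passive(inverse(sigma)))    # [2, 3, 1] [3, 1, 2]
print(passive(tau) == active(tau))                        # True, since tau has order 2
```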

3.4.2 The Symmetric Group Sym3

Let’s first discuss the simplest example of a non-abelian symmetric group,


which is Sym3 .

One kind of element of Sym3 is a transposition, which switches or “trans-


poses” two numbers. For example, we have the transposition σ = (12), which
switches 1 and 2:

1 −→ 2
σ : 2 −→ 1
3 −→ 3

The notation “(12)” shouldn’t be hard to understand, although we will ex-


plain it in general soon.

We similarly have the transpositions (23) and (13).

In addition to transpositions, there are what are known as 3-cycles. One


example is σ = (123)

1 −→ 2
σ : 2 −→ 3
3 −→ 1

We also have the 3-cycle (132):
1 −→ 3
σ : 2 −→ 1
3 −→ 2
Note that (321) and (213) refer to the same thing as (132). Similarly, (123) =
(231) = (312).

Thus far we have listed five elements of Sym3 , namely, (12), (23), (13), (123), (132).
We expect there to be 3! = 6 in total. The one element we have not listed
is the identity. It is the identity both in the sense that it is the identity
function from {1, 2, 3} to itself, and in that it is the identity of the group
Sym3 . We denote the identity either by e or by (1) (or (2) or (3) are equally
good notation).

Now that we’ve listed some elements of Sym3 , let’s talk about the group
operation. As described above, the group operation is just composition of
functions. Let’s give an example to make sure it’s clear.

Let’s say σ = (123) and τ = (23). Let’s find τ σ. As explained in Section


3.4.1, this is (13).

What about στ ? Applying τ , one gets 1, 3, 2, and then applying σ, one


gets 2, 1, 3, which in total is the same as applying (12). Thus στ = (12).
Notice that this is not the same as τ σ, so they do not commute.

3.4.3 Symmetric Groups in General

The most basic property of a finite group is how many elements it has (also
known as its order ). Let’s see what this is for Symn .
Proposition 3.32. For n ∈ N, |Symn | = n!.

Proof. Any permutation σ of {1, 2...n} is totally determined by a choice of


σ(1), then σ(2) and so on. At each stage the possibilities drop by one. Hence
the number of permutations is n!.
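For small n the count can be confirmed by listing all permutations. This is a quick sketch using the standard library; the helper `sym` and the tuple encoding (σ(1), ..., σ(n)) are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def sym(n):
    """All elements of Sym_n, each stored as the tuple (σ(1), ..., σ(n))."""
    return list(permutations(range(1, n + 1)))

sizes = {n: len(sym(n)) for n in range(1, 7)}

print(sizes)                                              # {1: 1, 2: 2, 3: 6, 4: 24, 5: 120, 6: 720}
print(all(sizes[n] == factorial(n) for n in sizes))       # True
```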

We need to think of a way of elegantly representing elements of Symn .
For a ∈ {1, 2...n} and σ ∈ Symn we represent the action of σ on a by a cycle.
For example, we represent a 6-cycle as:

(abc...f ) where b = σ(a), c = σ(b)...σ(f ) = a.

Note that a, b, c, d, e, f can be any elements of the set {1, 2, · · · , n}, and
they don’t have to be in order. In general, we define:
Definition 3.33. If S is a set, and a_1, a_2, ···, a_k is a sequence of distinct
elements of S, then the k-cycle
σ = (a_1 a_2 ··· a_k)
is the element of Σ(S) such that σ(a_1) = a_2, σ(a_2) = a_3, ···, σ(a_{k−1}) =
a_k, σ(a_k) = a_1, and σ(a) = a whenever a is NOT in the finite set {a_1, ···, a_k}.
Remark 3.34. 1. A k-cycle always has order k.
2. There are exactly k different ways to write a given k-cycle. For example,
(123) = (231) = (312) ∈ Sym3 .
3. A 1-cycle is the same as idS .

We are going to explain how to write every permutation as a product of


cycles in an essentially unique way, known as disjoint cycle notation. For
this, we need to explain what disjoint means, and before we explain what
disjoint means, we need a couple of definitions:
Definition 3.35. Let S be a set and σ ∈ Σ(S). Then we define
Fixed(σ) := {s ∈ S | σ(s) = s}.
Note that Fixed(σ) is a subset of S.

Notice that Fixed(σ) = S iff σ = idS.

If σ = (a_1 a_2 ··· a_k) is a k-cycle, for k ≥ 2, then Fixed(σ) = S \ {a_1, ···, a_k}.

Here is a related definition:

Definition 3.36. Let S be a set and s ∈ S. Then we define

Fix(s) := {σ ∈ Σ(S) | σ(s) = s}.

Note that Fix(s) is a subset of Σ(S). In fact,
it is a subgroup, which we will prove in Section 3.5.1. It is also called the
stabiliser subgroup of s.

Notice that s ∈ Fixed(σ) iff σ ∈ Fix(s). In this case, we say that σ fixes
s.

We can now define what it means to be disjoint:

Definition 3.37. If σ, τ ∈ Σ(S), then we say that σ and τ are disjoint if
Fixed(σ) ∪ Fixed(τ) = S. In other words, every element of S is either fixed
by σ or fixed by τ.
Proposition 3.38. If σ, τ are disjoint, then στ = τσ.

Proof. We need to show that for every s ∈ S, we have σ(τ(s)) = τ(σ(s)).

First, suppose that s ∈ Fixed(σ). Then τ(σ(s)) = τ(s). Then either
τ(s) ∈ Fixed(σ), or τ(s) ∈ Fixed(τ). In the first case, we have σ(τ(s)) =
τ(s) = τ(σ(s)), so we are done. In the second case, we have τ(τ(s)) =
τ(s). Applying τ^{-1} to both sides, we find that τ(s) = s. This implies that
σ(τ(s)) = σ(s) = s = τ(s) = τ(σ(s)), so we are done.

Next, suppose that s ∈ Fixed(τ). We can similarly reason based on
whether σ(s) ∈ Fixed(σ) or σ(s) ∈ Fixed(τ). The proof in this case is the
same.
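Proposition 3.38 can be verified exhaustively for a small symmetric group. The sketch below runs over all pairs in Sym4 (n = 4 is an arbitrary illustrative choice), using the definition of disjoint in terms of fixed points; the helper names are invented for this sketch.

```python
from itertools import permutations

n = 4
points = range(1, n + 1)
Sym_n = [dict(zip(points, image)) for image in permutations(points)]

def fixed(perm):
    """The set of points s with perm(s) = s."""
    return {s for s in perm if perm[s] == s}

def compose(f, g):
    """f∘g: apply g first, then f."""
    return {s: f[g[s]] for s in g}

def disjoint(f, g):
    """Every point is fixed by f or fixed by g."""
    return (fixed(f) | fixed(g)) == set(points)

disjoint_pairs = [(f, g) for f in Sym_n for g in Sym_n if disjoint(f, g)]
all_commute = all(compose(f, g) == compose(g, f) for f, g in disjoint_pairs)

print(len(disjoint_pairs) > 0, all_commute)    # True True
```

An exhaustive check for one value of n is of course no substitute for the proof above, but it is a useful sanity check on the definitions.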

Since Symn is finite, every element has finite order by Proposition 3.29.
Hence, given σ ∈ Symn and a ∈ {1, 2, ..., n}, repeatedly applying σ to a
eventually gets us back to a. Thus
σ eventually takes everything back to where it started. In this way every
σ ∈ Symn can be written as a product of disjoint cycles:

σ = (a_1 ... a_r)(a_{r+1} ... a_s) ... (a_{t+1} ... a_n).

This representation is unique up to internal shifts and reordering the cycles.
We will give a detailed example of this in Example 3.40.

E.g. Let n = 5 then σ = (123)(45) corresponds to

1 −→ 2
2 −→ 3
σ: 3 −→ 1
4 −→ 5
5 −→ 4

If an element is fixed by σ we omit it from the notation.

E.g. Let n = 5 then σ = (523) corresponds to

1 −→ 1
2 −→ 3
σ: 3 −→ 5
4 −→ 4
5 −→ 2

This notation makes it clear how to compose two permutations. For example,
let n = 5 and σ = (23), τ = (241), then τ σ = (241)(23) = (1234) and
στ = (23)(241) = (1324). Observe that composition is on the left when
composing permutations. This example also shows that in general Symn is
not Abelian.
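The two products above can be double-checked by turning cycles into functions, composing, and reading off the disjoint cycles again. This is a sketch under stated assumptions: the helper names `from_cycles` and `disjoint_cycles` are invented here, and cycles are multiplied right to left to match composition on the left.

```python
def from_cycles(cycles, n):
    """Permutation of {1, ..., n} given by a product of cycles.

    The rightmost cycle is applied first, matching composition on the left.
    """
    perm = {s: s for s in range(1, n + 1)}
    for cycle in reversed(cycles):
        step = {s: s for s in perm}
        for i, a in enumerate(cycle):
            step[a] = cycle[(i + 1) % len(cycle)]
        perm = {s: step[perm[s]] for s in perm}   # apply this cycle after perm
    return perm

def disjoint_cycles(perm):
    """Disjoint cycle decomposition, omitting fixed points."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen or perm[start] == start:
            continue
        cycle, s = [], start
        while s not in seen:
            seen.add(s)
            cycle.append(s)
            s = perm[s]
        cycles.append(tuple(cycle))
    return cycles

tau_sigma = from_cycles([(2, 4, 1), (2, 3)], 5)    # (241)(23)
sigma_tau = from_cycles([(2, 3), (2, 4, 1)], 5)    # (23)(241)

print(disjoint_cycles(tau_sigma))    # [(1, 2, 3, 4)]
print(disjoint_cycles(sigma_tau))    # [(1, 3, 2, 4)]
```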

Hence, given σ ∈ Symn, we naturally get a well-defined partition of n,
taking the lengths of the disjoint cycles appearing in σ. This is called the cycle
structure of σ. In other words, if we can write

σ = σ_1 σ_2 ··· σ_m,

where σ_i is a k_i-cycle, ∑_i k_i = n, and the σ_i are all disjoint, then the
cycle structure is {k_1, ···, k_m}. Note that the k_i need not be all distinct, and
we usually write them in ascending order. The number of k_i’s equal to 1 is
just the size of Fixed(σ).

Proposition 3.39. Let σ ∈ Symn have cycle structure {n1, · · · , nm}. Then ord(σ) = LCM(n1, ..., nm), where LCM denotes the lowest common multiple.

Proof. Let σ = (a1, · · · , ar)(ar+1, · · · , as) · · · (at+1, · · · , an) be a representation of σ as a product of disjoint cycles. We may assume that r = n1, etc., without any loss of generality. Observe that a cycle of length d ∈ N must have order d in Symn. Also recall that if G is a finite group then for any d ∈ N, x ∈ G, x^d = e ⇔ ord(x) | d. Also observe that for all d ∈ N, σ^d = (a1, · · · , ar)^d (ar+1, · · · , as)^d · · · (at+1, · · · , an)^d, because disjoint cycles commute by Proposition 3.38. Since these factors move disjoint sets of elements, σ^d = e if and only if every factor is the identity. Thus we know that σ^d = e ⇔ ni | d ∀i. The smallest value d can take with this property is LCM(n1, ..., nm).

Example 3.40. Let’s give an example of how you would write a permuta-
tion in disjoint cycle notation. For example, suppose we have the following
element of Sym9 :

1 −→ 5
2 −→ 1
3 −→ 7
4 −→ 3
σ: 5 −→ 2
6 −→ 6
7 −→ 9
8 −→ 8
9 −→ 4

We start by seeing where 1 goes. We get σ(1) = 5, σ(σ(1)) = 2, and finally σ^3(1) = 1. So that means that 1 travels to 5, then to 2, then back to 1, so we get the cycle (152) as one of the disjoint cycles making up σ.

Next, we see what happens to 2. But aha! We already had 2 appear in a cycle. So there’s no point in considering it, and we move on to 3.

For 3, we see what happens when we keep applying σ. We get σ(3) = 7, σ^2(3) = 9, σ^3(3) = 4, and finally σ^4(3) = 3 again. So 3 goes to 7, then to 9, then to 4, then back to 3. In other words, we have another cycle (3794).

We can ignore 4, as it appears in (3794), and we can ignore 5, as it already


appears in (152). But when we get to 6, we notice that it doesn’t appear
in any of the cycles we already wrote down. In fact, 6 is fixed by σ, so 6 is
in a 1-cycle, (6). Of course, because all 1-cycles are the identity, we don’t
actually need to write it in the disjoint cycle decomposition.

Finally, the only number left out is 8, and like with 6, we have σ(8) = 8.

Thus the disjoint cycle notation for σ is (152)(3794). Note that we could
also write (152)(3794)(6)(8), if we want to remember the 1-cycles, but we
usually leave them out. The one good thing about including the one-cycles
is that it helps you write the cycle structure: in this case, the cycle structure
is {1, 1, 3, 4}.

Let’s also see what Proposition 3.39 says in this case. The order must be
LCM (1, 1, 3, 4) = 12.
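The procedure just described (follow each element until it returns to its starting point, skipping elements already seen) is easy to automate. Here is a small Python sketch under our own conventions: the function name `disjoint_cycles` is hypothetical, the permutation is stored as a dictionary, and `math.lcm` needs Python 3.9 or later.

```python
from math import lcm  # available from Python 3.9


def disjoint_cycles(perm):
    """Return the disjoint cycles of perm (a dict element -> image), 1-cycles included."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:          # already appears in an earlier cycle
            continue
        cyc = [start]
        seen.add(start)
        nxt = perm[start]
        while nxt != start:        # keep applying the permutation until we return
            cyc.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        cycles.append(tuple(cyc))
    return cycles


# The element of Sym9 from Example 3.40.
sigma = {1: 5, 2: 1, 3: 7, 4: 3, 5: 2, 6: 6, 7: 9, 8: 8, 9: 4}

cycles = disjoint_cycles(sigma)
structure = sorted(len(c) for c in cycles)   # the cycle structure
order = lcm(*structure)                      # the order, by Proposition 3.39

print(cycles)      # [(1, 5, 2), (3, 7, 9, 4), (6,), (8,)]
print(structure)   # [1, 1, 3, 4]
print(order)       # 12
```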

As mentioned, the disjoint cycle notation for σ is unique, but only up


to internal shifts and reordering the cycles. Notice that (152) = (521) =
(215), and (3794) = (7943) = (9437) = (4379) (as mentioned above, there
are k ways to write a k-cycle). This gives (3)(4) = 12 different ways to
write the disjoint cycle notation. In fact, there are 2 different ways to order
these disjoint cycles, so we have 24 different ways to write the disjoint cycle
notation.

It’s fine to write a given permutation’s disjoint cycle notation


in whichever way you want. The important thing to understand is
that disjoint cycle notation is unique only up to internal shifts and
reordering the cycles.

This is similar to how prime factorization is “unique,” but only up to


reordering the prime factors.

Definition 3.41. A transposition is a cycle of length 2.

Observe that we can write any cycle as a product of transpositions:

(a1 a2 · · · ar) = (a1 ar)(a1 a_{r−1}) · · · (a1 a3)(a1 a2).

Hence any permutation σ ∈ Symn may be written as the (not necessarily


disjoint) product of transpositions. This representation is non-unique as the
following shows:

e.g. n=6, σ=(1 2 3)=(1 3) (1 2)=(4 5)(1 3)(4 5)(1 2)

Notice that both expressions involve an even number of transpositions.


Theorem. Let σ ∈ Symn be expressed as the product of transpositions in two potentially different ways. If the first has m transpositions and the second has k transpositions then 2 | (m − k).

Proof. First notice that a cycle of length r can be written as the product of
r − 1 transpositions by the above. Let us call σ even if there are an even
number of even length cycles (once expressed as a disjoint product); let us
call σ odd if there are an odd number of even length cycles. We also define
the sign of σ, denoted sgn(σ), to be +1 or −1 depending on whether σ is
even or odd.

Consider how sign changes when we multiply by a transposition (1 i). We


have two cases:

1. 1 and i occur in the same cycle in σ. Without loss of generality we


consider (1 2 · · · i · · · r) as being in σ.

(1 i)(1 2 · · · i · · · r)=(1 2 · · · i − 1)(i i + 1 · · · r)

If r is even then either we get two odd length cycles or two even length
cycles. If r is odd then exactly one of the cycles on the right is even
length. In either case, sgn((1 i)σ) = −sgn(σ).

2. 1 and i occur in distinct cycles. Again, without loss of generality we
may assume that (1 · · · i − 1)(i · · · r) occurs in σ. In this case

(1 i)(1 2 · · · i − 1)(i · · · r)=(1 · · · r).

In either of the cases r even or odd, we see that the number of even
length cycles must drop or go up by one. Hence sgn((1 i)σ) = −sgn(σ)
as in case 1.

We deduce that multiplying on the left by a transposition changes the sign


of our permutation. The identity must have sign 1, hence by induction we
see that the product of an odd number of transpositions has sign −1, and
the product of an even number of transpositions has sign 1.

Note that if we write any product of transpositions then we can immedi-


ately write down an inverse by reversing their order. Let us assume that we
can express σ as the product of transpositions in two different ways, one with
an odd number and one with an even number. Hence we can write down σ
as the product of evenly many transpositions and σ −1 as a product of an odd
number of transpositions. Thus we can write e = σ ∗ σ −1 as a product of an
odd number of transpositions. This is a contradiction as sgn(e) = 1.

We should observe that from the proof of the above we see that ∀σ, τ ∈
Symn , sgn(στ ) = sgn(σ)sgn(τ ). Because sgn(e) = 1 we deduce that
sgn(σ) = sgn(σ −1 ) for all σ ∈ Symn .

In particular this shows that the set of even elements of Symn contains
the identity and is closed under composition and taking inverse. Hence we
have the following:

Definition 3.42. The subgroup Altn ⊂ Symn consisting of even elements is called the alternating group of rank n.

Observe that Altn contains all 3-cycles (cycles of length 3).

Proposition 3.43. Altn is generated by 3-cycles.

Proof. By generate we mean that any element of Altn can be expressed as a product of 3-cycles. As any element of Altn can be written as a product of an even number of transpositions, we only have to do it for the product of two transpositions.
There are two cases:

1. (i j)(k l) = (k i l)(i j k).

2. (i j)(i k) = (i k j).
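Both identities can be confirmed by brute force. The sketch below is ours (the helpers `cycle` and `compose` are illustrative names, and products are read right to left as elsewhere in these notes); it checks every choice of distinct i, j, k, l in {1, ..., 5}.

```python
from itertools import permutations


def cycle(n, *elems):
    """The permutation of {1..n} given by one cycle; p[i-1] is the image of i."""
    p = list(range(1, n + 1))
    for k, a in enumerate(elems):
        p[a - 1] = elems[(k + 1) % len(elems)]
    return tuple(p)


def compose(p, q):
    """The product pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


n = 5

# Case 1: (i j)(k l) = (k i l)(i j k) for distinct i, j, k, l.
case1 = all(
    compose(cycle(n, i, j), cycle(n, k, l))
    == compose(cycle(n, k, i, l), cycle(n, i, j, k))
    for i, j, k, l in permutations(range(1, n + 1), 4)
)

# Case 2: (i j)(i k) = (i k j) for distinct i, j, k.
case2 = all(
    compose(cycle(n, i, j), cycle(n, i, k)) == cycle(n, i, k, j)
    for i, j, k in permutations(range(1, n + 1), 3)
)

print(case1, case2)
```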

Proposition 3.44. For n ≥ 2, |Altn| = n!/2.

Proof. Recall that |Symn | = n!, hence we just need to show that (Symn :
Altn ) = 2. Let σ, τ ∈ Symn . Recall that

σAltn = τ Altn ⇔ σ −1 τ ∈ Altn .

But sgn(σ −1 τ ) = sgn(σ)sgn(τ ), hence

σAltn = τ Altn ⇔ sgn(σ) = sgn(τ ).

Hence Altn has two left cosets in Symn , one containing even permutations
and one odd permutations.
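As a sanity check, one can compute the sign directly from the definition used in the proof of the theorem above (count the even-length cycles) and then count the even permutations. This is only a sketch for small n; the function names are ours.

```python
from itertools import permutations
from math import factorial


def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def sgn(p):
    """+1 if p has an even number of even-length cycles, -1 otherwise."""
    seen = set()
    even_cycles = 0
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:       # walk around the cycle containing start
            seen.add(x)
            x = p[x - 1]
            length += 1
        if length % 2 == 0:
            even_cycles += 1
    return 1 if even_cycles % 2 == 0 else -1


# sgn is multiplicative on Sym4.
sym4 = list(permutations(range(1, 5)))
multiplicative = all(
    sgn(compose(p, q)) == sgn(p) * sgn(q) for p in sym4 for q in sym4
)

# Number of even permutations in Sym_n for n = 2, ..., 6.
even_counts = {
    n: sum(1 for p in permutations(range(1, n + 1)) if sgn(p) == 1)
    for n in range(2, 7)
}

print(multiplicative)
print(even_counts)
```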

Later we shall see that the alternating groups for n ≥ 5 have a very
special property.

3.5 Group Actions

Definition. Let (G, ∗) be a group and S a set. By a group action of (G, ∗)


on S we mean a homomorphism

ϕ : G → Σ(S)

If the action of the group is understood we will write
g(s) = ϕ(g)(s) ∀ g ∈ G, s ∈ S.
Note that ϕ(g)(s) means that ϕ(g) is an element of Σ(S), and we apply that
element of Σ(S) to s ∈ S.
Remark 3.45. An action of G on S is the same as a map:

µ:G×S →S

such that

1. ∀x, y ∈ G, s ∈ S, µ(x ∗ y, s) = µ(x, µ(y, s))


2. µ(e, s) = s

We also sometimes write φ in place of µ (as was done in class).

One can write µ in terms of ϕ by setting


µ(g, s) = ϕ(g)(s)

It is an exercise for you to check that µ then satisfies the axioms listed,
and conversely that every such µ corresponds to an action ϕ.

We will often define actions in terms of µ rather than ϕ. But the definition
in terms of ϕ is more intuitive for most people.

The notation g(s) = µ(g, s) makes the axioms clearer: (1) becomes (x ∗ y)(s) = x(y(s)) ∀x, y ∈ G, s ∈ S and (2) becomes e(s) = s ∀s ∈ S.
Example 3.46. 1. Notice that there is a natural action of Σ(S) on S:
µ : Σ(S) × S → S
(f, s) → f (s)
In terms of ϕ, this is just the identity homomorphism from Σ(S) to
itself.

2. Let (G, ∗) be a group. There is a natural action of G on itself:

µ:G×G → G
(x, y) → x ∗ y

Property (1) holds as ∗ is associative. Property (2) holds because


e ∗ x = x ∀x ∈ G. This is called the left regular representation of
G.

3. We define the trivial action of G on S by

µ:G×S → S
(g, s) → s ∀s ∈ S, g ∈ G

4. There is a natural action of GLn(R) on Rn. If we represent v ∈ Rn as a column vector, and g ∈ GLn(R), then we define µ(g, v) to be the matrix multiplication gv. More generally, if G is any subgroup of GLn(R), then we get an action of G on Rn.

5. There is another natural action of G on itself:

µ:G×G → G
(x, y) → x∗ y ∗ x−1

Property (1) holds because of associativity of ∗ and that (g ∗ h)−1 =


h−1 ∗ g −1 . Property (2) is obvious. This action is called conjugation.
We will discuss it in more detail in Section 3.5.2.
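For a small group, the two axioms of Remark 3.45 can be verified exhaustively for several of the actions above. The following Python sketch does this for G = Sym3; the function `is_action` and the other names are our own illustrative choices, not notation from the notes.

```python
from itertools import permutations


def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_action(group, identity, points, mu):
    """Check axioms (1) and (2) of Remark 3.45 for the map mu : G x S -> S."""
    axiom1 = all(
        mu(compose(x, y), s) == mu(x, mu(y, s))
        for x in group for y in group for s in points
    )
    axiom2 = all(mu(identity, s) == s for s in points)
    return axiom1 and axiom2


sym3 = list(permutations((1, 2, 3)))
e = (1, 2, 3)


def natural(g, s):          # natural action of Sym3 on {1, 2, 3}
    return g[s - 1]


def left_regular(g, x):     # G acting on itself by left multiplication
    return compose(g, x)


def trivial(g, s):          # the trivial action
    return s


def conjugation(g, x):      # G acting on itself by conjugation
    return compose(compose(g, x), inverse(g))


results = {
    "natural": is_action(sym3, e, [1, 2, 3], natural),
    "left regular": is_action(sym3, e, sym3, left_regular),
    "trivial": is_action(sym3, e, [1, 2, 3], trivial),
    "conjugation": is_action(sym3, e, sym3, conjugation),
}
print(results)
```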

Definition 3.47. An action of G on S is called faithful if

ϕ : G → Σ(S)

is injective.

Notice that if G and H are two groups and f : G → H is an injective
homomorphism then we may view G as a subgroup of H by identifying it
with its image in H under f . Hence if G acts faithfully on S then G is
isomorphic to a subgroup of Σ(S).
Cayley’s Theorem. Let G be a group. Then G is isomorphic to a subgroup
of Σ(G). In particular if |G| = n ∈ N, then G is isomorphic to a subgroup of
Symn .

Proof. The result will follow if we can show that the left regular representation is faithful. Let ϕ : G → Σ(G) be the homomorphism given by the left regular representation. Hence for g, s ∈ G, ϕg(s) = g ∗ s. For h, g ∈ G, suppose ϕh = ϕg. Then h ∗ s = g ∗ s ∀s ∈ G, and cancelling s on the right gives h = g. Hence ϕ is injective.

In particular, if G has n elements, then there is a bijection between G


and {1, · · · , n} (in some set theory textbooks, this is the definition of having
n elements). It is not hard to see that if two sets S, T are in bijection, then
Σ(S) and Σ(T ) are isomorphic. Thus Σ(G) is isomorphic to Symn , so every
subgroup of Σ(G) is isomorphic to a subgroup of Symn .
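To make Cayley's Theorem concrete, here is a short Python sketch that embeds G = Sym3 (which has 6 elements) into Sym6 via the left regular representation. The labelling of the elements of G by 1, ..., 6 is an arbitrary choice of bijection, and the names `index` and `phi` are ours.

```python
from itertools import permutations


def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


G = list(permutations((1, 2, 3)))              # the six elements of Sym3
index = {g: i + 1 for i, g in enumerate(G)}    # a bijection G -> {1, ..., 6}


def phi(g):
    """The permutation s -> g*s of G, written on the labels 1..6."""
    return tuple(index[compose(g, s)] for s in G)


images = {g: phi(g) for g in G}

# Each phi(g) really is a permutation of {1, ..., 6}.
all_permutations = all(sorted(images[g]) == [1, 2, 3, 4, 5, 6] for g in G)

# phi is injective (faithfulness) and a homomorphism.
is_injective = len(set(images.values())) == len(G)
is_homomorphism = all(
    images[compose(g, h)] == compose(images[g], images[h])
    for g in G for h in G
)

print(all_permutations, is_injective, is_homomorphism)
```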

3.5.1 The Orbit-Stabiliser Theorem

Definition. Let (G, ∗) be a group, together with an action ϕ on a set S. We


can define an equivalence relation on S by

s ∼ t ⇔ ∃g ∈ G such that g(s) = t


Remark 3.48. This is an equivalence relation as a consequence of the group
axioms, together with the definition of an action. I leave it as an exercise to
check this.
Definition 3.49. Let (G, ∗) be a group, together with an action ϕ on a
set S. Under the above equivalence relation we call the equivalence classes
orbits, and we write
Orb(s) := {t ∈ S | ∃g ∈ G such that g(s) = t} ⊂ S

for the equivalence class containing s ∈ S. We call it the orbit of s.

It is important to observe that Orb(s) is a subset of S and hence is merely a


set with no extra structure.

Definition 3.50. Let (G, ∗) be a group, together with an action ϕ on a set S. We say that G acts transitively on S if there is only one orbit. Equivalently, ϕ is transitive if given s, t ∈ S, ∃g ∈ G such that g(s) = t.

An example of a transitive action is the natural action of Σ(S) on S. This


is clear because given any two points in a set S there is always a bijection
which maps one to the other. If G is not the trivial group (the group with
one element) then conjugation is never transitive. To see this observe that
under this action Orb(e) = {e}.

Definition 3.51. Let (G, ∗) be a group, together with an action ϕ on a set


S. Let s ∈ S. We define the stabiliser subgroup of s to be all elements of G
which fix s under the action. More precisely

Stab(s) = {g ∈ G | g(s) = s} ⊂ G

In the special case of Σ(S) acting on S, we called this F ix(s).

For this definition to make sense we must prove that Stab(s) is genuinely
a subgroup.

Proposition. Stab(s) is a subgroup of G.

Proof. 1. e(s) = s ⇒ e ∈ Stab(s)

2. x, y ∈ Stab(s) ⇒ (x ∗ y)(s) = x(y(s)) = x(s) = s ⇒ x ∗ y ∈ Stab(s).

3. x ∈ Stab(s) ⇒ x−1 (s) = x−1 (x(s)) = (x−1 ∗ x)(s) = e(s) = s ⇒ x−1 ∈


Stab(s)

Thus we may form the left cosets of Stab(s) in G:

G/Stab(s) := {xStab(s) | x ∈ G}.


Recall that these subsets of G are the equivalence classes for the equivalence
relation:

Given x, y ∈ G, x ∼ y ⇔ x−1 ∗ y ∈ Stab(s),

hence they partition G into disjoint subsets.

Proposition 3.52. Let x, y ∈ G then xStab(s)=yStab(s)⇔ x(s) = y(s).

Proof. Recall that x and y are in the same left coset ⇔ x^{−1} ∗ y ∈ Stab(s), i.e. ⇔ (x^{−1} ∗ y)(s) = s. Applying x to both sides and simplifying by the axioms for a group action shows that this holds ⇔ x(s) = y(s).

We deduce that there is a well defined map (of sets):

φ : G/Stab(s) −→ Orb(s)
xStab(s) −→ x(s)

Proposition. φ is a bijection.

Proof. By definition, Orb(s) := {x(s) ∈ S | x ∈ G}. Hence φ is trivially


surjective.

Assume φ(xStab(s)) = φ(yStab(s)) for some x, y ∈ G. This implies the


following:

x(s) = y(s) ⇒ x−1 (y(s)) = s
⇒ (x−1 ∗ y)(s) = s
⇒ x−1 ∗ y ∈ Stab(s)
⇒ xStab(s) = yStab(s)

Therefore φ is injective.

This immediately gives the following key result:

Orbit-Stabiliser Theorem. Let (G, ∗) be a group together with an action, ϕ, on a set S. Let s ∈ S such that the orbit of s is finite (|Orb(s)| < ∞). Then Stab(s) ⊂ G is of finite index and

(G : Stab(s)) = |Orb(s)|

Proof. Immediate from previous proposition.

We have the following corollary:

Corollary 3.53. If (G, ∗) is a finite group acting on a set S and s ∈ S then

|G| = |Stab(s)| · |Orb(s)|.

Proof. In this case (G : Stab(s)) = |G|/|Stab(s)|. Applying the orbit-


stabiliser theorem yields the result.

Here are some examples of orbits and stabilisers:

Example 3.54. 1. If G acts trivially on a set S, then for all s ∈ S, we


have Stab(s) = G. The orbits are all one-element sets.

2. In the left regular representation, all stabilisers are trivial, i.e. {e}.
This is because if gx = x, then g = e. The action is transitive, i.e.,
there is one orbit.
3. In the conjugation action of G on itself, we have Stab(x) = {g ∈ G |
gx = xg}. This is known as the centraliser of x and is the topic of
Section 3.5.2.
4. In the natural action of GLn(R) on Rn, the stabiliser of the unit vector (1, 0, · · · , 0) is the set of invertible matrices whose first column is (1, 0, · · · , 0)^T.

More generally, the stabiliser of the ith unit vector is the set of invertible n × n matrices whose ith column is the ith unit vector. In all of these cases, the orbit is the set of nonzero elements of Rn.
5. When G = Symn acts in the natural way on S = {1, · · · , n}, the stabiliser of any k ∈ S is a subgroup of Symn isomorphic to Sym_{n−1}. Note that this agrees with the Orbit-Stabiliser Theorem: the action is transitive, so the orbit of any k ∈ S is all of S, hence has size n. Thus the stabiliser has size |Symn|/n = n!/n = (n − 1)!, which is indeed the size of Sym_{n−1}.
6. More specifically, in the previous example, if n = 3, then Stab(1) = {id, (23)}, and Stab(3) = {id, (12)}.
7. Again referring to the previous example, if n = 4, then Stab(2) =
{id, (13), (14), (34), (134), (143)}.
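The last three examples can be checked by listing all of Sym4. The sketch below is ours (the helper names are illustrative); it computes orbits and stabilisers for the natural action and compares Stab(2) with the set given above.

```python
from itertools import permutations


def cycle(n, *elems):
    """The permutation of {1..n} given by one cycle; cycle(n) is the identity."""
    p = list(range(1, n + 1))
    for k, a in enumerate(elems):
        p[a - 1] = elems[(k + 1) % len(elems)]
    return tuple(p)


n = 4
sym_n = list(permutations(range(1, n + 1)))    # p[i-1] is the image of i


def orbit(k):
    """Orbit of k under the natural action of Sym_n."""
    return {p[k - 1] for p in sym_n}


def stabiliser(k):
    """All permutations in Sym_n fixing k."""
    return {p for p in sym_n if p[k - 1] == k}


stab2 = stabiliser(2)
expected = {
    cycle(4),                # id
    cycle(4, 1, 3),          # (13)
    cycle(4, 1, 4),          # (14)
    cycle(4, 3, 4),          # (34)
    cycle(4, 1, 3, 4),       # (134)
    cycle(4, 1, 4, 3),       # (143)
}

# Orbit-Stabiliser: |G| = |Stab(k)| * |Orb(k)| for every k.
orbit_stabiliser_holds = all(
    len(stabiliser(k)) * len(orbit(k)) == len(sym_n) for k in range(1, n + 1)
)

print(stab2 == expected, orbit_stabiliser_holds)
```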

3.5.2 Centralizers and Conjugacy Classes

The Orbit-Stabiliser theorem will allow us to prove non-trivial results about


the structure of finite groups (in Section 3.5.3) when we apply it to the action
known as conjugation. We now discuss the conjugation action in more detail.

For g ∈ G, the map
[g] : G → G
defined by
[g](x) = gxg −1
for x ∈ G is called conjugation by g. For a fixed g, this gives a bijection
from G to itself (to see that it is bijection, notice that [g −1 ] gives the inverse
bijection). As we let g vary, this defines an action of G on itself.

Given x ∈ G, define the centralizer of x in G by

CG (x) = {g ∈ G | gxg −1 = x}.

It is clear from the definition that this is the stabilizer

StabG (x)

of x under the conjugation action, so CG (x) must be a subgroup. It is also


simply the set of g ∈ G that commute with x.

So that tells us that g ∈ CG(x) if and only if g commutes with x. But that’s the same thing as saying that x commutes with g, so it’s equivalent to saying that x ∈ CG(g). In fact, by that reasoning, we have

CG(x) = {g ∈ G | x ∈ CG(g)}.

Recall the following definition from the homework:

Definition 3.55. Let (G, ∗) be a group. The center of G is the subset

Z(G) := {h ∈ G | g ∗ h = h ∗ g, ∀g ∈ G}.

It is clear that Z(G) is the intersection of all centralizers, i.e.,

Z(G) = ⋂_{x∈G} CG(x).

Note that the conjugation action is faithful iff the center is trivial, i.e.,
Z(G) = {e}.

The following is an important fact about conjugation that distinguishes


it from the left regular representation:

Proposition 3.56. If g ∈ G, then the map

ϕ(g) : G → G
x ↦ g x g^{−1}

is a homomorphism from G to itself.

Proof. We have

ϕ(g)(xy) = gxyg −1
= gxeyg −1
= gxg −1 gyg −1
= (gxg −1 )(gyg −1 )
= ϕ(g)(x)ϕ(g)(y),

which shows that ϕ(g) is a homomorphism. Notice that we don’t bother


writing parentheses for the most part, because the group operation is always
associative (but the order does matter!).

On the other hand, for the left regular representation, notice that ϕ(g)
sends e to g, so it can’t be a homomorphism from G to itself unless g = e.

Thus in the case of the conjugation action, the image of the homomor-
phism
ϕ : G → Σ(G)
sits inside the subset
Aut(G) ⊆ Σ(G)
of bijections from G to itself that also happen to be homomorphisms. One
implication of this is that every element in the same conjugacy class must
have the same order.

The orbit of x under this action is known as the conjugacy class of x and is denoted

Conj(x) = Orb(x) = {g ∗ x ∗ g^{−1} | g ∈ G}.

Remark 3.57. The following are equivalent:

• x ∈ Z(G)

• Conj(x) has one element

• CG (x) = G

By the Orbit-Stabilizer Theorem, we know that the size of the conjugacy


class of x times the size of CG (x) is |G| (at least assuming these are finite).

The previous fact is very important for computing the centralizer of an


element. If you just have some elements that commute with x, then you
know you’ve found some of CG (x), but it’s not clear that you’ve found all of
CG (x). But if you know |CG (x)|, and you’ve found that many elements that
commute with x, then you know you’ve found all of CG (x). See, for example,
Example 3.60.

Remark 3.58. Since x commutes with all of its powers, we always have

gp({x}) ⊆ CG(x).

Assuming G is finite, note that |gp({x})| is just the order of x. It follows that |CG(x)| ≥ ord(x), and thus the conjugacy class of x has size at most |G|/ord(x).

Example 3.59. If G is abelian (for example, Z/mZ), then CG (x) = G for
every element x, and every conjugacy class has one element. This is the same
as saying that the conjugation action is the trivial action.

Example 3.60. As an example, consider x = (123) ∈ G = Sym5. Then the conjugacy class of x is the set of all 3-cycles. To count the number of three-cycles, notice that there are C(5, 3) = 10 ways to choose the three elements that are cycled (here C(5, 3) is the binomial coefficient "5 choose 3"), and there are two cycles for each triple (think about how (123) and (132) are different elements of Sym5), so there are 20 three-cycles. Since |Sym5| = 5! = 120, the size of CG(x) must be 120/20 = 6.

What are these six elements? We can take the subgroup generated by
(123)(45). Notice that this element has order 6 (as it’s the LCM of 2 and 3)
and is in CG (x), so it must in fact generate all of CG (x).
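This example is small enough to confirm by exhaustive search. The following Python sketch (helper names are ours) computes the conjugacy class and centralizer of (123) in Sym5 and compares the centralizer with the subgroup generated by (123)(45).

```python
from itertools import permutations


def cycle(n, *elems):
    """The permutation of {1..n} given by one cycle; p[i-1] is the image of i."""
    p = list(range(1, n + 1))
    for k, a in enumerate(elems):
        p[a - 1] = elems[(k + 1) % len(elems)]
    return tuple(p)


def compose(p, q):
    """The product pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


G = list(permutations(range(1, 6)))            # Sym5
x = cycle(5, 1, 2, 3)                          # (123)

centraliser = {g for g in G if compose(g, x) == compose(x, g)}
conjugacy_class = {compose(compose(g, x), inverse(g)) for g in G}

# The cyclic subgroup generated by y = (123)(45): collect y, y^2, ... until repetition.
y = compose(cycle(5, 1, 2, 3), cycle(5, 4, 5))
powers, p = [], y
while p not in powers:
    powers.append(p)
    p = compose(p, y)

print(len(conjugacy_class), len(centraliser), set(powers) == centraliser)
```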

Example 3.61. For G = Sym3 , the conjugacy classes are {id}, {(12), (23), (31)},
and {(123), (132)}. Since |G| = 6, the stabilizer of id is G, the centralizer
of (12) has two elements, and the centralizer of (123) has three elements. In
fact, in the latter two cases, the centralizer of the element is just the sub-
group it generates (so the inclusion in Remark 3.58 is in fact an equality of
subgroups in these cases).

In fact, in general, there is a simple description of the conjugacy classes


in Symn . We begin with a lemma.

Lemma 3.62. If σ ∈ Symn is the k-cycle

(s1 · · · sk ),

and α ∈ Symn is any element, then

ασα−1 = (α(s1 ) · · · α(sk )),

i.e., it is still a k-cycle.
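The lemma says that conjugating by α simply relabels the entries of the cycle. Here is a quick brute-force check in Python for one 3-cycle in Sym5 and every α; the particular cycle (1 4 2) and the helper names are arbitrary choices of ours.

```python
from itertools import permutations


def cycle(n, *elems):
    """The permutation of {1..n} given by one cycle; p[i-1] is the image of i."""
    p = list(range(1, n + 1))
    for k, a in enumerate(elems):
        p[a - 1] = elems[(k + 1) % len(elems)]
    return tuple(p)


def compose(p, q):
    """The product pq: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


n = 5
s = (1, 4, 2)                  # entries of the 3-cycle (1 4 2)
sigma = cycle(n, *s)

# For every alpha, compare alpha sigma alpha^{-1} with (alpha(1) alpha(4) alpha(2)).
lemma_holds = all(
    compose(compose(alpha, sigma), inverse(alpha))
    == cycle(n, *(alpha[a - 1] for a in s))
    for alpha in permutations(range(1, n + 1))
)
print(lemma_holds)
```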

Theorem 3.63. Two permutations are conjugate in Symn if and only if they
have the same cycle structure.

Proof. Let σ, τ ∈ Symn have the same cycle structure {k1 , · · · , kr }. Hence
we may represent both in the form:

σ = σ1 σ2 · · · σr ,

τ = τ1 τ2 · · · τr ,

where σi and τi are ki-cycles, the cycles in each product are disjoint, and ∑_{i=1}^{r} ki = n. Write

σi = (ai1 · · · aiki )

τi = (bi1 · · · biki )
for i = 1, 2, · · · , r. Define α ∈ Symn such that α(aij ) = bij for all 1 ≤ i ≤ r,
1 ≤ j ≤ ki . Then by Lemma 3.62, we have

ασi α−1 = τi

for each i. Since conjugation by α is an automorphism of Symn (and hence


a homomorphism), we get
ασα−1 = τ.

Conversely, if α is any element of Symn , and

σ = σ1 σ2 · · · σr ,

then
ασi α−1
is a ki -cycle for each i by Lemma 3.62, and they are all disjoint (because α
is a bijection), so ασα−1 has the same cycle structure as σ.

Corollary 3.64. Conjugacy classes in Symn are indexed by cycle structures


(i.e. partitions of n).

Proof. Immediate from the above.

Thus the number of conjugacy classes in Symn is the number of par-


titions of n, often denoted p(n). To learn more about p(n), see https:
//en.wikipedia.org/wiki/Partition_function_(number_theory).
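For small n, the count of conjugacy classes can be compared with p(n) directly. In the sketch below (all function names are ours), the conjugacy classes are found by brute force as orbits of the conjugation action, and p(n) is computed by a standard dynamic-programming count of partitions.

```python
from itertools import permutations


def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def conjugacy_classes(n):
    """Conjugacy classes of Sym_n, computed as orbits under conjugation."""
    G = list(permutations(range(1, n + 1)))
    remaining = set(G)
    classes = []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(g, x), inverse(g)) for g in G}
        classes.append(cls)
        remaining -= cls
    return classes


def partition_count(n):
    """Number of partitions of n, allowing parts 1, 2, ..., n in turn."""
    ways = [1] + [0] * n       # ways[t] = partitions of t using the parts seen so far
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


class_counts = [len(conjugacy_classes(n)) for n in range(1, 6)]
partition_counts = [partition_count(n) for n in range(1, 6)]
sym4_class_sizes = sorted(len(c) for c in conjugacy_classes(4))

print(class_counts, partition_counts)
print(sym4_class_sizes)
```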

Example 3.65. For G = Sym4, the conjugacy classes are {id}, {(12), (23), (34), (41), (13), (24)}, {(123), (132), (124), (142), (134), (143), (234), (243)}, {(1234), (1243), (1324), (1342), (1423), (1432)} and {(12)(34), (13)(24), (14)(23)}.

The centralizer of id is G, as always.

The centralizer of (12) must have four elements, as its conjugacy class has six elements and 24/6 = 4. Recall that disjoint cycles commute, so (34) is in CG((12)). As well, by Remark 3.58, we know that (12) ∈ CG((12)). So we have CG((12)) = {id, (12), (34), (12)(34)}.

Notice that the conjugacy class of (123) has eight elements, so its cen-
tralizer has 24/8 = 3 elements. In fact, it has order 3, so its centralizer is
just the subgroup it generates. Similarly, the conjugacy class of (1234) has
six elements, so its centralizer has 4 elements, and it has order 4, so it must
generate its centralizer.

Finally, note that the conjugacy class of (12)(34) has three elements, so
its centralizer must have eight elements. Recall that (12)(34) commutes with
the cycles (12) and (34), so it commutes with the subgroup they generate,
i.e., {id, (12), (34), (12)(34)}. Finally, more subtle is the fact that (13)(24)
also commutes with (12)(34). Note carefully that (13) and (24) do NOT
commute with it. We can then take the subgroup generated by (13)(24)
and {id, (12), (34), (12)(34)}, and this indeed has eight elements so it is the
centralizer.

Example 3.66. If G = GLn (R), the group of invertible n × n matrices with


real entries (under matrix multiplication), then two matrices are in the same
conjugacy class if and only if they are similar, or equivalently, have the same
Jordan normal form. Note in particular that any two conjugate matrices
have the same eigenvalues and characteristic polynomial.

We end with a nice little fact about conjugation and stabilisers:

Fact 3.67. Let G act on a set S, let x, y ∈ S, and suppose τ ∈ G. If


τ (x) = y, then
σ ∈ Stab(x)

if and only if
τ στ −1 ∈ Stab(y).

In particular, elements of Stab(x) are conjugate to elements of the sta-


biliser of any element of the orbit of x.

3.5.3 Sylow’s Theorem

We start with a remark that will help us prove Theorem 3.69. This will be
our first example of a non-trivial theorem that uses the theory of conjugacy
classes.

Remark 3.68. If C1, · · · , Cr ⊂ G are the distinct conjugacy classes of a finite group G, we deduce that

|G| = ∑_{i=1}^{r} |Ci|

and |Ci| | |G| ∀i ∈ {1, · · · , r}. This is known as the class equation.

We can now prove the following.

Theorem 3.69. If |G| > 1 is a power of some prime number p, then Z(G)
is nontrivial (i.e., has more than one element).

Proof. This essentially follows by Remarks 3.57 and 3.68. Note that every
conjugacy class has size dividing |G|, so it must be a power of p. Therefore,
every conjugacy class has size divisible by p or size 1.

Let’s group the conjugacy classes into those of size 1 and those of size divisible by p; say C1, · · · , Cs have size 1, and Cs+1, · · · , Cr have size divisible by p. Notice that by Remark 3.57, s is just the size of Z(G).

We have

|G| = ∑_{i=1}^{r} |Ci| = s + ∑_{i=s+1}^{r} |Ci|,

so

s = |G| − ∑_{i=s+1}^{r} |Ci|.

But p | |G|, and p | |Ci| for i > s, so p | s. Since e ∈ Z(G), we have s ≥ 1, and therefore s ≥ p > 1, and we are done.
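To see the class equation at work in a concrete p-group, the Python sketch below builds a group of order 8 = 2^3 inside Sym4 and computes its conjugacy classes and centre. The choice of group (generated by the 4-cycle (1234) and the transposition (24)) and all helper names are ours, purely for illustration.

```python
def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def generate(gens):
    """The subgroup generated by gens (finite, so closing under left products suffices)."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


r = (2, 3, 4, 1)       # the 4-cycle (1234)
t = (1, 4, 3, 2)       # the transposition (24)
G = generate([r, t])

classes = {
    frozenset(compose(compose(g, x), inverse(g)) for g in G) for x in G
}
class_sizes = sorted(len(c) for c in classes)
centre = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

print(len(G), class_sizes, len(centre))
```

Here the number of classes of size 1 equals the size of the centre, and it is divisible by p = 2, as in the proof.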

Recall that Lagrange’s theorem says that if G is a finite group and H is


a subgroup then |H| divides |G|. It is not true, in general, that given any
divisor of |G| there is a subgroup of that order. We shall see an example
of such a group later. There are, however, partial converses to Lagrange’s
theorem.

Sylow’s Theorem. Let (G, ∗) be a finite group such that pn divides |G|,
where p is prime. Then there exists a subgroup of order pn .

Proof. Assume that |G| = pn m, where m = pr u with HCF (p, u) = 1. Our


central strategy is to consider a cleverly chosen group action of G and prove
one of the stabilizer subgroups has size pn . We’ll need to heavily exploit the
orbit-stabilizer theorem.

Let S be the set of all subsets of G of size p^n. An element of S is an unordered p^n-tuple of distinct elements in G. There is a natural action of G on S by term-by-term composition on the left.

Let ω ∈ S. If we fix an ordering ω = {ω1 , · · · , ωpn } ∈ S, then g(ω) :=


{g ∗ ω1 , · · · , g ∗ ωpn }.

• We first claim that |Stab(ω)| ≤ pn . To see this define the function

f : Stab(ω) → ω
g → g ∗ ω1

By the cancellation property for groups this is an injective map. Hence


|Stab(ω)| ≤ |ω| = pn .

• Observe that

|S| = C(p^n m, p^n) = (p^n m)! / ((p^n)! (p^n m − p^n)!) = ∏_{j=0}^{p^n − 1} (p^n m − j)/(p^n − j) = m ∏_{j=1}^{p^n − 1} (p^n m − j)/(p^n − j),

where C(a, b) denotes the binomial coefficient "a choose b".

Observe that if 1 ≤ j ≤ p^n − 1 then j is divisible by p at most n − 1 times. This means that p^n m − j and p^n − j have the same number of p factors, namely the number of p factors of j. This means that

∏_{j=1}^{p^n − 1} (p^n m − j)/(p^n − j)

has no p factors. Hence |S| = p^r v, where HCF(p, v) = 1.

Now recall that S is the disjoint union of the orbits of our action of G on S. Hence there must be an ω ∈ S such that |Orb(ω)| = p^s t, where s ≤ r and HCF(p, t) = 1 (otherwise p^{r+1} would divide the size of every orbit, and hence |S|). By the orbit-stabilizer theorem we know that |Stab(ω)| = p^{n+r−s} u/t. Because |Stab(ω)| ∈ N and u and t are coprime to p, we deduce that u/t ∈ N. Hence |Stab(ω)| ≥ p^n.

For this choice of ω ∈ S, we have both |Stab(ω)| ≤ p^n and |Stab(ω)| ≥ p^n, so Stab(ω) is a subgroup of size p^n.
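The counting step in the proof, namely that the binomial coefficient C(p^n m, p^n) is divisible by p exactly as many times as m is, can be tested numerically. The following sketch (with our own helper `p_valuation` and an arbitrary small range of test cases) uses Python's exact integer arithmetic.

```python
from math import comb


def p_valuation(k, p):
    """Largest exponent e such that p**e divides the positive integer k."""
    e = 0
    while k % p == 0:
        k //= p
        e += 1
    return e


# Test cases (p, n, m); the ranges are an arbitrary small sample.
cases = [(p, n, m) for p in (2, 3, 5) for n in (1, 2) for m in range(1, 13)]

valuations_agree = all(
    p_valuation(comb(p**n * m, p**n), p) == p_valuation(m, p)
    for p, n, m in cases
)
print(valuations_agree)

# One explicit instance: p = 2, n = 1, m = 6 = 2 * 3, so r = 1.
print(comb(12, 2), p_valuation(comb(12, 2), 2))   # 66 1
```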

Historically this is a slight extension of what is called Sylow’s First The-


orem. There are two more which describe the properties of such subgroups
in greater depth.

3.6 Symmetry of Sets with Extra Structure

Let S be a set and Σ(S) its permutation group. The permutation group
completely ignores the fact that there may be extra structure on S.

As an example, Rn naturally has the structure of a vector space. The permutation group Σ(Rn) does not take this into account. However within the full permutation group there are linear permutations, namely GLn(R). These are permutations which preserve the vector space structure.

Symmetry in Euclidean Space

Definition 3.70. Given n ∈ N, n-dimensional Euclidean space is the vector space Rn equipped with the standard inner product (the dot product).

Concretely, if x = (x1, · · · , xn)^T and y = (y1, · · · , yn)^T ∈ Rn, then ⟨x, y⟩ := x1 y1 + · · · + xn yn.

Definition 3.71. The distance between x and y in Rn is

d(x, y) := √⟨x − y, x − y⟩.

Definition 3.72. An isometry of Rn is a map of sets f : Rn → Rn (not nec-


essarily linear) such that ∀x, y ∈ Rn , d(x, y) = d(f (x), f (y)). The collection
of all isometries of Rn is denoted by Isom(Rn ).

Remark 3.73. • The identity function is an isometry and the composi-


tion of any two isometries is an isometry.

• We say an isometry, f , fixes the origin if f (0) = 0. It is a fact that f


fixes the origin if and only if f (x) = Ax for all x ∈ Rn , where A is an
orthogonal matrix.

• We say an isometry, f , is a translation if

f : Rn −→ Rn
x −→ x + y.

for some y ∈ Rn .

• Every isometry of Rn is a composition of an origin fixing isometry and


a translation. As a consequence, all isometries are bijective and their
inverses are isometries. This means Isom(Rn ) is a subgroup of Σ(Rn ).

Let X ⊂ Rn be a subset (not necessarily a subspace).

Definition 3.74. We define the symmetry group of X to be the subgroup
Sym(X) ⊂ Isom(Rn ) with the property that f ∈ Sym(X) if and only if f
permutes X.

There is a natural action of Sym(X) on the set X, coming from the fact
there is a natural homomorphism Sym(X) → Σ(X). Sym(X) measures how
much symmetry X has. The more symmetric X, the larger Sym(X).

The Dihedral Group

Let m ∈ N and X ⊂ R2 be a regular m-gon centered at the origin. We call


the symmetry group of X the dihedral group of rank m, and we denote it by
Dm .

First observe that every element of Dm must fix the center of X (the
origin). Thus we may view Dm as a subgroup of the group of 2×2 orthogonal
matrices. We shall not take this approach here.

Also observe that Dm acts faithfully and transitively on the set of vertices of X. Hence Dm can naturally be identified with a subgroup of Symm. Let σ be the rotation by 2π/m clockwise about the origin. All possible rotational symmetries are generated by σ, namely

Rotm = {e, σ, σ^2, · · · , σ^{m−1}} ⊂ Dm.

Hence Rotm is cyclic of order m.

Given a vertex a, Stab(a) = {e, τ}, where τ is the reflection through the straight line containing a and the origin. By the orbit-stabilizer theorem |Dm| = 2m, hence (Dm : Rotm) = 2. We deduce that

Dm = Rotm ⊔ τ Rotm.

The left coset τ Rotm is precisely the set of reflective symmetries. Hence every element of Dm can be written in the form σ^k (if a rotation) or τσ^k (if a reflection). The group structure is completely determined by the following properties

• ord(σ) = m

• ord(τ ) = 2

• τ σ = σ −1 τ (consider the action on the vertices)

Observe that the third property implies that Dm is not Abelian when m ≥ 3 (for then σ^{−1} ≠ σ).

[Figure: the symmetries of a regular triangle, i.e. the case m = 3.]
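The three properties above can be checked by realising Dm as permutations of the vertices. In the sketch below we assume the vertices are labelled 1, ..., m consecutively in the direction of the rotation σ, and we take τ to be the reflection fixing vertex 1; the function names `generate` and `dihedral` are ours.

```python
def compose(p, q):
    """The product pq: apply q first, then p. p[i-1] is the image of i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """The inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def generate(gens):
    """The subgroup generated by gens (finite, so closing under left products suffices)."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def dihedral(m):
    """Return (sigma, tau, D_m) as permutations of the vertices 1..m."""
    sigma = tuple((i % m) + 1 for i in range(1, m + 1))          # i -> i + 1 (mod m)
    tau = tuple(((1 - i) % m) + 1 for i in range(1, m + 1))      # reflection fixing vertex 1
    return sigma, tau, generate([sigma, tau])


checks = {}
for m in range(3, 9):
    sigma, tau, D = dihedral(m)
    identity = tuple(range(1, m + 1))
    checks[m] = (
        len(D) == 2 * m                                               # |D_m| = 2m
        and compose(tau, tau) == identity                             # ord(tau) = 2
        and compose(tau, sigma) == compose(inverse(sigma), tau)       # tau sigma = sigma^{-1} tau
        and compose(tau, sigma) != compose(sigma, tau)                # not Abelian
    )
print(checks)
```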

The Cube in R3

Let X ⊂ R3 be a solid cube centered at the origin. Again, elements of


Sym(X) must fix the origin, hence, if we wished, we could identify Sym(X)
with a subgroup of the group of 3 × 3 orthogonal matrices.

Again Sym(X) acts faithfully and transitively on the vertices. If a ∈ X is


a vertex, then Stab(a) can naturally be identified with D3 (see below figure)
which has size 6. Hence, by the orbit-stabilizer theorem, |Sym(X)| = 48.
The same logic applies to Rot , the rotational symmetries, although the
stabilizer of a now has size 3. This tells us that |Rot | = 24.

If τ ∈ Sym(X) is the symmetry sending x to −x (this is not a rotation), then again

Sym(X) = Rot ⊔ τ Rot.

It can be shown that τ σ = στ for all σ ∈ Rot . Thus it remains to


determine the group structure of Rot .

Color the vertices with four colors, making sure that opposite vertices
have the same color (see below figure). Rotational symmetries act on this set
of four colors, inducing a homomorphism from Rot to Sym4 . Given any two
colors, it is possible to transpose them (leaving the others fixed) by a rotation.
Because Sym4 is generated by transpositions, the induced homormorphism
Rot → Sym4 must be surjective. However, |Rot | = 24 = 4! = |Sym4 |.
Hence it must be an isomorphism. We deduce that Rot is isomorphic to
Sym4 .
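
The counts 48 and 24, and the faithfulness of the action on the four colors, can be checked by brute force. The sketch below assumes the cube is [−1, 1]^3 and uses the standard fact that its symmetries are exactly the 3 × 3 signed permutation matrices; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import permutations, product

def signed_permutation_matrices():
    """All 3x3 matrices with exactly one entry +-1 in each row and column."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            mats.append(tuple(
                tuple(signs[r] if c == perm[r] else 0 for c in range(3))
                for r in range(3)))
    return mats

def det(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def apply(m, v):
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

# The four "colors": each is a pair {v, -v} of opposite vertices.
diagonals = [frozenset({v, tuple(-x for x in v)})
             for v in product((1, -1), repeat=3) if v[0] == 1]

def action_on_diagonals(m):
    """The permutation of the four diagonals induced by the symmetry m."""
    return tuple(diagonals.index(frozenset(apply(m, v) for v in d))
                 for d in diagonals)

symmetries = signed_permutation_matrices()
rotations = [m for m in symmetries if det(m) == 1]
images = {action_on_diagonals(m) for m in rotations}

print(len(symmetries), len(rotations), len(images))   # 48 24 24
```

Since the 24 rotations induce 24 distinct permutations of the four diagonals, the map from Rot to Sym4 is injective, and hence bijective by counting.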

Interesting Question:

Let (G, ∗) be an abstract group. When is it true that we can find $X \subset \mathbb{R}^n$,
for some n ∈ N, such that
$G \cong \mathrm{Sym}(X)$?
Less formally, when can an abstract group be realised in geometry?

3.7 Normal Subgroups and Isomorphism Theorems

In linear algebra the predominant objects of study are the structure preserving
maps between vector spaces, rather than the vector spaces themselves.
This is a deep observation and it is true far beyond the confines of linear algebra.
Philosophically it’s saying that an object in isolation is uninteresting; it’s
how it relates to what’s around it that matters. The world of group theory
is no different. Here the objects are groups and the maps between them are
homomorphisms. Now we’ll study homomorphisms between abstract groups
in more detail.

Let G and H be two groups. We’ll suppress the ∗ notation as it will


always be obvious where composition is taking place. Let eG and eH be the
respective identity elements. Recall that a homomorphism from G to H is a
map of sets f : G → H such that ∀x, y ∈ G, f (xy) = f (x)f (y).
Definition. Given f : G → H a homomorphism of groups, we define the
kernel of f to be:

Ker(f ) := {x ∈ G | f (x) = eH }

We define the image of f to be:

Im(f ) := {y ∈ H | ∃x ∈ G such that f (x) = y}


Proposition. Given a homomorphism f : G → H, Ker(f )⊆ G and Im(f )
⊆ H are subgroups.

Proof. First we show this for Ker(f ):

1. f (eG ) = eH ⇒ eG ∈ Ker(f ).

2. Suppose x, y ∈ Ker(f ). Then f (xy) = f (x)f (y) = eH eH = eH ⇒ xy ∈ Ker(f ).

3. Given x ∈ Ker(f ), $f(x^{-1}) = f(x)^{-1} = e_H^{-1} = e_H$ ⇒ $x^{-1}$ ∈ Ker(f ).

Now we will show that Im(f ) is a subgroup:

1. f (eG ) = eH so eH ∈ Im(f ).

2. f (xy) = f (x)f (y)∀x, y ∈ G so Im(f ) is closed under composition.

3. If y = f (x) ∈ Im(f ), then $y^{-1} = f(x)^{-1} = f(x^{-1})$ ∈ Im(f ).

Proposition. A homomorphism f : G → H is injective if and only if Ker(f )
is trivial.

Proof. If f is injective then Ker(f ) = {eG }, because f (eG ) = eH . Conversely,
assume Ker(f ) = {eG }. Suppose x, y ∈ G such that f (x) = f (y).

$f(x) = f(y) \Rightarrow f(x)f(y)^{-1} = e_H$
$\Rightarrow f(x)f(y^{-1}) = e_H$
$\Rightarrow f(xy^{-1}) = e_H$
$\Rightarrow xy^{-1} = e_G$
$\Rightarrow x = y$

Thus f is injective.

Recall that for m ∈ N the set of left cosets of mZ in Z, denoted Z/mZ,
naturally inherited the structure of a group from + on Z. It would be reasonable
to expect that this is true in the general case, i.e. given a group G and a
subgroup H, the set G/H naturally inherits the structure of a group from G.
To make this a bit more precise let’s think about what naturally means. Let
xH, yH ∈ G/H be two left cosets. Recall that x and y are not necessarily
unique. The only obvious way for combining xH and yH would be to form
(xy)H.

Warning: in general this is not well defined. It will depend on the choice of
x and y.

Something very special happens in the case G = Z and mZ = H.

Some examples of kernels and images are given in Example 3.77, after
the statement of the First Isomorphism Theorem.

Fundamental Definition. We call a subgroup H ⊆ G normal if, for all
g ∈ G, $gHg^{-1} = \{ghg^{-1} \mid h \in H\} = H$. We denote a normal subgroup
by H ◁ G.

Remark 3.75. 1. Observe that this is not saying that given g ∈ G and
h ∈ H, then ghg −1 = h. It is merely saying that ghg −1 ∈ H. See
Example 3.78 for a good example of this.

2. A normal subgroup is a union of conjugacy classes of G.

3. If G is Abelian, every subgroup is normal, as $ghg^{-1} = h$ for all g, h ∈ G.

4. For any group G, the whole group G and the trivial group {eG } are
both normal as subgroups.

5. Let G = Sym3 , H = {e, (12)}. Then (13)(12)(13) = (23) ∉ H.
Hence H is not normal in Sym3 , so in general not all subgroups of a
group are normal.

6. The subgroup H = {id, (123), (132)} is normal in G = Sym3 .
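
The last two items of the remark can be verified by direct computation. In the sketch below, permutations of {1, 2, 3} are stored as tuples acting on the indices 0, 1, 2; this encoding and the helper names `compose`, `inverse` and `is_normal` are assumptions made for the illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p; permutations are tuples acting on 0, 1, 2."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(H, G):
    """Check that g H g^-1 = H for every g in G."""
    return all({compose(compose(g, h), inverse(g)) for h in H} == set(H)
               for g in G)

G = list(permutations(range(3)))      # Sym_3 acting on the points 0, 1, 2
e = (0, 1, 2)
swap = (1, 0, 2)                      # plays the role of the transposition (12)
cycle = (1, 2, 0)                     # a 3-cycle
H = {e, swap}
A = {e, cycle, compose(cycle, cycle)}

print(is_normal(H, G))   # False
print(is_normal(A, G))   # True
```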

Proposition 3.76. Let G and H be two groups. Let f : G → H be a
homomorphism. Then Ker(f ) ⊂ G is a normal subgroup.

Proof. Let h ∈ Ker(f ) and g ∈ G. Then $f(ghg^{-1}) = f(g)f(h)f(g^{-1}) = f(g)\,e_H\,f(g)^{-1} = e_H$ ⇒ $ghg^{-1}$ ∈ Ker(f ).

In general Im(f ) ⊂ H is not normal.

Fundamental Definition. We say a group G is simple if its only normal


subgroups are {e} and G.

Cyclic groups of prime order are trivially simple by Lagrange’s theorem. It
is in fact true that for n ≥ 5, Altn is simple, although proving this would take
us too far afield. As we shall see later, simple groups are the core building
blocks of group theory.

The importance of normal subgroups can be seen in the following:

Proposition. Let H ⊆ G be a normal subgroup. Then the binary operation:

G/H × G/H → G/H

(xH, yH) 7→ (xy)H

is well defined.

Proof. As usual the problem is that coset representatives are not unique,
and thus two choices of representatives could give different cosets. Thus our
goal is to show:

∀x1 , x2 , y1 , y2 ∈ G such that x1 H = x2 H and y1 H = y2 H, we have

$(x_1 y_1)H = (x_2 y_2)H.$

By assumption we know $x_1^{-1} x_2,\ y_1^{-1} y_2 \in H$. Consider

$u = (x_1 y_1)^{-1}(x_2 y_2) = y_1^{-1} x_1^{-1} x_2 y_2.$

Hence $u y_2^{-1} y_1 = y_1^{-1}(x_1^{-1} x_2) y_1$. Therefore, by the normality of H, $u y_2^{-1} y_1 \in H$.
Since $y_2^{-1} y_1 = (y_1^{-1} y_2)^{-1} \in H$, it follows that u ∈ H ⇒ $(x_1 y_1)H = (x_2 y_2)H$.

This shows that if H ⊂ G is normal, G/H can be endowed with a natural
binary operation.
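
To see the warning and the proposition side by side, one can test by brute force whether (xy)H depends only on the cosets xH and yH. The sketch below does this inside Sym3, encoded as tuples acting on 0, 1, 2; the encoding and the name `product_is_well_defined` are assumptions made for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_coset(x, H):
    return frozenset(compose(x, h) for h in H)

def product_is_well_defined(H, G):
    """True if the coset (xy)H depends only on the cosets xH and yH."""
    table = {}
    for x in G:
        for y in G:
            key = (left_coset(x, H), left_coset(y, H))
            value = left_coset(compose(x, y), H)
            if table.setdefault(key, value) != value:
                return False
    return True

G = list(permutations(range(3)))
e, swap, cycle = (0, 1, 2), (1, 0, 2), (1, 2, 0)
H = {e, swap}                                # not normal in Sym_3
A = {e, cycle, compose(cycle, cycle)}        # normal in Sym_3

print(product_is_well_defined(H, G))   # False
print(product_is_well_defined(A, G))   # True
```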

Proposition. Let G be a group and H ⊂ G a normal subgroup.
Then G/H is a group under the above binary operation. We call it the
quotient group.

Proof. Simple check of the three group axioms.

1. ∀x, y, z ∈ G, (xy)z = x(yz) ⇒ (xH ∗ yH) ∗ zH = xH ∗ (yH ∗ zH).

2. xH ∗ H = xH = H ∗ xH ⇒ H ∈ G/H is the identity.

3. $xH * x^{-1}H = xx^{-1}H = eH = H = x^{-1}xH = x^{-1}H * xH$ ⇒ inverses
exist.

Proposition. The natural map

φ : G −→ G/H
x −→ xH

is a homomorphism with Ker(φ) = H.

Proof. Observe that ∀x, y ∈ G, φ(xy) = xyH = xHyH = φ(x)φ(y) ⇒ φ is a
homomorphism.

Recall that the identity element in G/H is the coset H. Hence x ∈ Ker(φ)
⇔ φ(x) = xH = H ⇔ x ∈ H. Hence Ker(φ) = H.

Observe that this shows that any normal subgroup can be realised as the
kernel of a group homomorphism.

The First Isomorphism Theorem

Let G and H be groups, with respective identities eG and eH . Let φ : G → H
be a homomorphism. Recall that Ker(φ) ⊂ G is a normal subgroup. Hence
we may form the quotient group G/Ker(φ). Let x, y ∈ G. Recall that
xKer(φ) = yKer(φ) ⇔ x^{-1} y ∈ Ker(φ) ⇔ φ(x^{-1} y) = eH ⇔ φ(x^{-1})φ(y) = eH
⇔ φ(x)^{-1} φ(y) = eH ⇔ φ(x) = φ(y). In summary, φ(x) = φ(y) ⇔ xKer(φ) = yKer(φ).
Hence φ is constant on each coset of Ker(φ), and takes different values on
different cosets.

Hence we get a map of sets:

ϕ : G/Ker(φ) −→ Im(φ)
xKer(φ) −→ φ(x)

This is well defined precisely because of the above observations.
The First Isomorphism Theorem. Let G and H be two groups. Let
φ : G → H be a homomorphism, then the induced map

ϕ : G/Ker(φ) −→ Im(φ)
xKer(φ) −→ φ(x)

is an isomorphism of groups.

Proof. Firstly we observe that the induced map ϕ is surjective by definition
of Im(φ). Note that given x, y ∈ G, ϕ(xKer(φ)) = ϕ(yKer(φ)) ⇔ φ(x) =
φ(y) ⇔ xKer(φ) = yKer(φ), hence ϕ is injective.

It is left for us to show that ϕ is a homomorphism. Given x, y ∈ G,
ϕ(xKer(φ)yKer(φ)) = ϕ(xyKer(φ)) = φ(xy) = φ(x)φ(y) = ϕ(xKer(φ))ϕ(yKer(φ)).

Therefore ϕ : G/Ker(φ) → Im(φ) is a homomorphism, and thus an isomorphism.
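
A small finite case makes the theorem tangible. The sketch below uses the map x ↦ 3x on (Z/12Z, +), a hypothetical example chosen here and not taken from the notes, and checks that the cosets of the kernel correspond bijectively to the image.

```python
n = 12
G = range(n)

def phi(x):
    """The homomorphism (Z/12Z, +) -> (Z/12Z, +), x -> 3x (illustrative choice)."""
    return (3 * x) % n

kernel = frozenset(x for x in G if phi(x) == 0)
image = {phi(x) for x in G}
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# The induced map sends a coset to the common value of phi on it.
induced = {}
for coset in cosets:
    values = {phi(x) for x in coset}
    assert len(values) == 1           # phi is constant on each coset
    induced[coset] = values.pop()

print(sorted(kernel))                                  # [0, 4, 8]
print(sorted(image))                                   # [0, 3, 6, 9]
print(len(cosets) == len(image) == len(set(induced.values())))   # True
```

Here G/Ker(φ) has 12/3 = 4 elements, matching the 4 elements of Im(φ).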
Example 3.77. Here’s how the First Isomorphism Theorem works for every part of
Example 3.3:

1. The inclusion map from (Z, +) into (Q, +) is a homomorphism. Its kernel
is trivial, and its image is isomorphic to (Z, +), so it just expresses
the fact that for any group G, we have G/{eG } is isomorphic to G.

2. The map from (Z, +) to (Z/mZ, +) sending a ∈ Z to [a] ∈ Z/mZ is


surjective, so its image is Z/mZ, and its kernel is mZ. This expresses
the fact that Z/mZ is indeed the quotient of Z by mZ, as the notation
would suggest.

3. For any group G, the identity map from G to itself is an automorphism


of G. Like the first example, this is injective, so its kernel is trivial, so
the image is isomorphic to the original group G.

4. Complex conjugation is an automorphism of (C, +). Same comment as


in the previous example.

5. The map from GLn (R) to (R \ {0}, ×) sending a matrix A to its
determinant det(A) is a homomorphism. The kernel is the group SLn (R) of
matrices with determinant 1. The image is all of R \ {0}, so this tells
us that GLn (R)/SLn (R) is isomorphic to (R \ {0}, ×).

6. The exponential function $x \mapsto e^x$ is a homomorphism from (R, +) to
(R \ {0}, ×). Its kernel is trivial, but its image is R>0 = {x ∈ R | x >
0}, so it gives an isomorphism between (R, +) and (R>0 , ×).

7. The complex exponential $z \mapsto e^z$ from (C, +) to (C \ {0}, ×) is a
homomorphism. In contrast with the real exponential function, it is
surjective but not injective, because $e^z = e^{z+2\pi i}$. Its kernel is 2πiZ, i.e.,
the set {2πin | n ∈ Z}. This tells us that
(C, +)/2πiZ is isomorphic to (C \ {0}, ×).

8. The logarithm is a homomorphism from (R>0 , ×) to (R, +). In fact, it
is an isomorphism, so the same comment as in the first example applies.
9. For any group G and any group H, the map sending all elements of G
to eH ∈ H is a homomorphism. Its kernel is the whole group G, and its
image is the trivial group {eH }. The First Isomorphism Theorem here
expresses the fact that for any group G, G/G is isomorphic to the
trivial group.
Example 3.78. Consider the group Aff(1, R) of invertible affine maps from
the line R to itself. This may be defined as the subset of Σ(R) consisting of
maps of the form x ↦ ax + b for a, b ∈ R, a ≠ 0.

One can also define it as the set (R \ {0}) × R, with the binary operation
(a, b) ∗ (c, d) = (ac, ad + b), which corresponds to applying x ↦ cx + d first
and then x ↦ ax + b. Notice that the group is not abelian. When we
write Aff(1, R), we thus mean (R \ {0}) × R with this group operation.

The map sending (a, b) ∈ R \ {0} × R to a ∈ R \ {0} is a surjective


homomorphism from Aff(1, R) to (R \ {0}, ×). The kernel is the set of trans-
lations in Aff(1, R), i.e., the set of elements for which a = 1. The subgroup
of translations is isomorphic to (R, +).

Let $\mathrm{trans}_c$ denote the translation x ↦ x + c. Notice that if σ = (a, b),
then $\sigma\,\mathrm{trans}_c\,\sigma^{-1}$ is NOT equal to $\mathrm{trans}_c$. Rather, it equals $\mathrm{trans}_{ac}$. Despite
this, the set of translations is still a normal subgroup of Aff(1, R).
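
The conjugation formula can be confirmed with exact rational arithmetic. The sketch below represents an affine map x ↦ ax + b as the pair (a, b); the helper names `mul`, `inv` and `trans`, and the sample values a = 3, b = 5, c = 2, are assumptions chosen for illustration.

```python
from fractions import Fraction

def mul(s, t):
    """Group law of Aff(1, R): (a, b) * (c, d) = (ac, ad + b)."""
    (a, b), (c, d) = s, t
    return (a * c, a * d + b)

def inv(s):
    """Inverse of (a, b), namely (1/a, -b/a)."""
    a, b = s
    return (1 / a, -b / a)

def trans(c):
    """The translation x -> x + c, i.e. the pair (1, c)."""
    return (Fraction(1), Fraction(c))

sigma = (Fraction(3), Fraction(5))     # the map x -> 3x + 5
c = Fraction(2)
conjugate = mul(mul(sigma, trans(c)), inv(sigma))

print(conjugate == trans(3 * c))                       # True: equals trans_{ac}
print(conjugate != trans(c))                           # True: differs from trans_c
print(mul(sigma, trans(c)) != mul(trans(c), sigma))    # True: the group is not abelian
```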

The Third Isomorphism Theorem

Let G be a group and N a normal subgroup. The third isomorphism theorem


concerns the connection between certain subgroups of G and subgroups of
G/N .

Let H be a subgroup of G containing N . Observe that N is automatically


normal in H. Hence we may form the quotient group H/N = {hN | h ∈ H}.
Observe that H/N is naturally a subset of G/N .

Lemma. H/N ⊂ G/N is a subgroup.

Proof. We need to check the three properties.

1. Recall that N ∈ G/N is the identity in the quotient group. Observe


that N ⊂ H ⇒ N ∈ H/N .

2. Let x, y ∈ H. By definition xy ∈ H. Thus xN yN = (xy)N ∈ H/N .

3. Let x ∈ H. By definition x−1 ∈ H. Thus (xN )−1 = x−1 N ∈ H/N .

Conversely, let M ⊂ G/N be a subgroup. Let HM ⊂ G be the union of


the left cosets contained in M .

Lemma. HM ⊂ G is a subgroup.

Proof. We need to check the three properties.

1. Recall that N ∈ G/N is the identity in the quotient group. Hence


N ∈ M ⇒ N ⊂ HM . N is a subgroup hence eG ∈ N ⇒ eG ∈ HM .

2. Let x, y ∈ HM . This implies that xN, yN ∈ M . M is a subgroup,


hence xN yN = xyN ∈ M . This implies that xy ∈ HM .

3. Let x ∈ HM . Hence xN ∈ M . M is a subgroup, hence (xN )−1 =


x−1 N ∈ M . This implies that x−1 ∈ HM .

Hence we have two maps of sets:

α : {Subgroups of G containing N } −→ {Subgroups of G/N }
H −→ H/N
and
β : {Subgroups of G/N } −→ {Subgroups of G containing N }
M −→ HM
Proposition 3.79. These maps of sets are inverse to each other.

Proof. We need to show that composition in both directions gives the identity
function.

1. Let H be a subgroup of G containing N . Then βα(H) = β(H/N ) = H.


Thus βα is the identity map on {Subgroups of G containing N }.
2. Let M be a subgroup of G/N . Then αβ(M ) = α(HM ) = M . Thus αβ
is the identity map on {Subgroups of G/N }.

We deduce that both α and β are bijections and we have the following:
The Third Isomorphism Theorem. Let G be a group and N ⊂ G a
normal subgroup. There is a natural bijection between the subgroups of G
containing N and subgroups of G/N .

Proof. Either map α or β exhibits the desired bijection.
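
The bijection can be watched in a small case. The sketch below takes G = (Z/12Z, +) and N = {0, 6}, a hypothetical example chosen here, enumerates both families of subgroups by brute force, and checks that the maps α and β are mutually inverse. All function names are assumptions made for the illustration.

```python
from itertools import combinations

n = 12
N = frozenset({0, 6})

def coset(x):
    """The coset x + N as a frozenset of residues mod 12."""
    return frozenset((x + k) % n for k in N)

def is_subgroup(S):
    """A finite subset containing 0 and closed under subtraction is a subgroup."""
    return 0 in S and all((x - y) % n in S for x in S for y in S)

def is_quotient_subgroup(M):
    """The same test in G/N, computed on coset representatives."""
    reps = [min(c) for c in M]
    return N in M and all(coset(x - y) in M for x in reps for y in reps)

def subsets(items):
    items = list(items)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

subgroups_over_N = [S for S in subsets(range(n)) if N <= S and is_subgroup(S)]
quotient = {coset(x) for x in range(n)}
subgroups_of_quotient = [M for M in subsets(quotient) if is_quotient_subgroup(M)]

alpha = {H: frozenset(coset(h) for h in H) for H in subgroups_over_N}   # H -> H/N
beta = {M: frozenset().union(*M) for M in subgroups_of_quotient}        # M -> H_M

print(len(subgroups_over_N), len(subgroups_of_quotient))        # 4 4
print(all(beta[alpha[H]] == H for H in subgroups_over_N))        # True
print(all(alpha[beta[M]] == M for M in subgroups_of_quotient))   # True
```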

Normalizers

Let G be a group and H ⊆ G a subgroup. Given g ∈ G, let


gHg −1 := {ghg −1 | h ∈ H}.

Since conjugation by g is a group automorphism of G, and the image of a
subgroup under a homomorphism is also a subgroup, we find that gHg −1 is
a subgroup of G. Letting Sub(G) denote the set of all subgroups of G, this
construction defines an action of G on Sub(G).

The stabilizer of a subgroup H is known as the normalizer of H and


denoted NG (H). In other words,
NG (H) = {g ∈ G | gHg −1 = H}.

It follows obviously from the definition of normal that H is normal if and


only if NG (H) = G.

In fact, more generally, we have that H is a normal subgroup of NG (H)


(but not necessarily of G).

One of your homework problems asks you to compute a normalizer in a
simple case.
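
For a sense of what such a computation looks like, the sketch below finds the normalizer of a two-element subgroup of Sym4 by brute force. The choice of subgroup, the encoding of permutations as tuples acting on 0, 1, 2, 3, and the helper names are all assumptions made for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, H):
    """The subgroup g H g^-1."""
    return frozenset(compose(compose(g, h), inverse(g)) for h in H)

def normalizer(H, G):
    """All g in G with g H g^-1 = H."""
    return [g for g in G if conjugate(g, H) == H]

G = list(permutations(range(4)))                  # Sym_4
H = frozenset({(0, 1, 2, 3), (1, 0, 2, 3)})       # generated by one transposition
N = normalizer(H, G)

print(len(N))             # 4
print(H <= set(N))        # True: H is contained in its normalizer
print(len(N) == len(G))   # False: H is not normal in Sym_4
```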

3.8 Direct Products and Direct Sums

Definition 3.80. Let G and H be two groups, with respective identities eG
and eH . We may form the direct product G × H = {(x, g) | x ∈ G, g ∈ H}.
Let x, y ∈ G and g, h ∈ H. Observe that there is a natural binary operation
on G × H given by:
(x, g) ∗ (y, h) := (xy, gh).
Lemma. G × H is a group under the natural binary operation.

Proof. 1. Associativity holds for both G and H ⇒ associativity holds for
G × H.
2. (eG , eH ) is the identity.
3. For g ∈ G and h ∈ H, $(g, h)^{-1} = (g^{-1}, h^{-1})$.

There is an obvious generalization of this concept to any finite collection of
groups.
Remark 3.81. The set {(x, eH ) | x ∈ G} is a subgroup of G × H isomorphic
to G. It is in fact a normal subgroup, and the quotient by this subgroup is
isomorphic to H.

Similarly, there’s a subgroup isomorphic to H, and the quotient by this


subgroup is isomorphic to G.
Definition 3.82. Let G be a group and H, K ⊂ G two subgroups. Let us
furthermore assume that

1. ∀h ∈ H and ∀k ∈ K, hk = kh.
2. Given g ∈ G there exist unique h ∈ H, k ∈ K such that g = hk.

Under these circumstances we say that G is the direct sum of H and K


and we write G = H ⊕ K. Observe that the second property is equivalent
to:

3. H ∩ K = {eG } and for g ∈ G there exist h ∈ H, k ∈ K such that


g = hk.

For example, (Z/15Z, +) is the direct sum of gp([3]) and gp([5]).
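
This example is easy to verify directly. The sketch below lists the two cyclic subgroups and checks that every element of Z/15Z is a sum h + k with h ∈ gp([3]) and k ∈ gp([5]) in exactly one way; the variable names are illustrative only.

```python
n = 15
H = sorted({(3 * k) % n for k in range(n)})   # gp([3]), of order 5
K = sorted({(5 * k) % n for k in range(n)})   # gp([5]), of order 3

# For each g, collect every way of writing g = h + k with h in H and k in K.
decompositions = {g: [(h, k) for h in H for k in K if (h + k) % n == g]
                  for g in range(n)}

print(H)                                                           # [0, 3, 6, 9, 12]
print(K)                                                           # [0, 5, 10]
print(all(len(pairs) == 1 for pairs in decompositions.values()))   # True
print(set(H) & set(K) == {0})                                      # True
```

The first condition of the definition holds automatically here because the group is Abelian.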


Proposition 3.83. If G is the direct sum of the subgroups H, K ⊂ G then
$G \cong H \times K$.

Proof. Define the map

φ : H × K −→ G
(h, k) −→ hk
Let x, y ∈ H and g, h ∈ K. By property one φ((x, g)(y, h)) = φ(xy, gh) =
xygh = xgyh = φ(x, g)φ(y, h). Hence φ is a homomorphism. Property two
ensures that φ is bijective.

Remark 3.84. As in Remark 3.81, the subgroups H and K are normal in
G. Then G/K is isomorphic to H, and G/H is isomorphic to K.

The concept of direct sum has a clear generalization to any finite collection
of subgroups of G.

Note that for us, the main use of the term ‘direct sum’ is as a way to
recognize when a group is the direct product of two of its subgroups.

3.9 Finitely Generated Abelian Groups

Let G be an Abelian group. We shall now use additive notation to express


composition within G. In particular we will denote the identity by 0 (not
to be confused with 0 ∈ Z). We do this because we are very familiar with
addition on Z being commutative. Given m ∈ Z and a ∈ G, we write

$$ma = \begin{cases} a * a * \cdots * a \ \ (m \text{ times}), & \text{if } m > 0 \\ 0, & \text{if } m = 0 \\ a^{-1} * a^{-1} * \cdots * a^{-1} \ \ (-m \text{ times}), & \text{if } m < 0 \end{cases}$$

We have the identities:

1. m(a + b) = ma + mb

2. (m + n)a = ma + na

3. (mn)a = m(na)

∀a, b ∈ G; m, n ∈ Z

Now assume that G is finitely generated. Hence ∃{a1 , · · · , an } ⊂ G such


that gp({a1 , · · · , an }) = G. In other words, because G is Abelian, every
x ∈ G can be written in the form

x = λ1 a1 + · · · + λn an λi ∈ Z.

In general such an expression is not unique. For example, if G is of order
m ∈ N then (m + 1)a = a for all a ∈ G. This is because ma = 0. A
reasonable goal would be to find a generating set such that every expression
of the above form is unique (after possibly restricting 0 ≤ λi < ord(ai )) for
a given x ∈ G. Such a generating set is called a basis for G. Observe that it
is not clear at present that such a basis even exists. If {a1 , · · · , an } ⊂ G were
a basis then letting Ai = gp(ai ) ⊂ G we have the direct sum decomposition:

G = A1 ⊕ · · · ⊕ An .
Conversely, if G can be represented as the direct sum of cyclic subgroups
then choosing a generator for each gives a basis for G.

Definition 3.85. Let G be an Abelian group. x ∈ G is torsion if it is of
finite order. We denote the subgroup of torsion elements by tG ⊂ G, called
the torsion subgroup.

Lemma. tG ⊂ G is a subgroup.

Proof. This critically requires that G be Abelian. It is not true in general.

1. ord(0) = 1 ⇒ 0 ∈ tG

2. Let g, h ∈ tG ⇒ ∃n, m ∈ N such that ng = mh = 0 ⇒ nm(g + h) =
m(ng) + n(mh) = m0 + n0 = 0 ⇒ g + h ∈ tG.

3. ng = 0 ⇒ −(ng) = n(−g) = 0. Hence g ∈ tG ⇒ −g ∈ tG.

Clearly if G is finite then tG = G.

Definition 3.86. If tG = G we say that G is a torsion group. If tG = {0}


we say that G is torsion free.

Proposition 3.87. If G is torsion and finitely generated then G is finite.

Proof. Let {a1 , · · · , an } ⊂ G be a generating set. Each ai is of finite
order, hence every element x ∈ G can be written in the form

x = λ1 a1 + · · · + λn an , λi ∈ Z, 0 ≤ λi < ord(ai ).

There are only finitely many such expressions, so G is finite.

Proposition 3.88. G/tG is a torsion free Abelian group.

Proof. Firstly note that tG ⊂ G is normal as G is Abelian, hence G/tG is
naturally an abelian group. Let x ∈ G. Assume that x + tG ∈ G/tG is
torsion. Hence ∃n ∈ N such that n(x + tG) = nx + tG = tG. Hence nx ∈ tG
so ∃m ∈ N such that mnx = 0. Hence x ∈ tG ⇒ x + tG = tG.

Definition. A finitely generated Abelian group G is said to be free Abelian
if there exists a finite generating set {a1 , · · · , an } ⊂ G such that every
element of G can be uniquely expressed as

λ1 a1 + · · · + λn an where λi ∈ Z.

In other words, if we can find a basis for G consisting of non-torsion elements.

In this case

$G = \mathrm{gp}(a_1) \oplus \cdots \oplus \mathrm{gp}(a_n) \cong \mathbb{Z} \times \cdots \times \mathbb{Z} = \mathbb{Z}^n.$
Proposition 3.89. Let G be a finitely generated free abelian group. Any two
bases must have the same cardinality.

Proof. Let {a1 , · · · , an } ⊂ G be a basis. Let 2G := {2x | x ∈ G}. 2G ⊆ G
is a subgroup. Observe that 2G = {λ1 a1 + · · · + λn an | λi ∈ 2Z}. Hence
$(G : 2G) = 2^n$. But the left hand side is defined independently of the basis.
The result follows.

Definition 3.90. Let G be a finitely generated free Abelian group. The
rank of G is the size of any basis.

Theorem 3.91. A finitely generated Abelian group is free Abelian ⇔ it is
torsion free.

Proof. (⇒) is trivial.

(⇐)

Assume G is torsion-free, let {a1 , · · · , an } ⊂ G generate G. We will prove


the result by induction on n.

Base Case: n = 1. $G = \mathrm{gp}(a) \cong (\mathbb{Z}, +)$, which is free abelian. Therefore
the result is true for n = 1.

If {a1 , · · · , an } ⊂ G is a basis we have nothing to prove. Suppose that it is
not a basis. Then we have a non-trivial relation:

λ1 a1 + λ2 a2 + · · · + λn an = 0

If ∃d ∈ Z, d > 1, such that d | λi for all i, then we have
$d\left(\frac{\lambda_1}{d} a_1 + \frac{\lambda_2}{d} a_2 + \cdots + \frac{\lambda_n}{d} a_n\right) = 0$.
As G is torsion-free, $\frac{\lambda_1}{d} a_1 + \frac{\lambda_2}{d} a_2 + \cdots + \frac{\lambda_n}{d} a_n = 0$. We can therefore assume
that the λi are collectively coprime. If λ1 = 1, then we can shift terms
to get a1 = −(λ2 a2 + λ3 a3 + · · · + λn an ). Therefore, G is generated by
{a2 , · · · , an } ⊂ G and the result follows by induction. We will reduce to
this case as follows: Assume |λ1 | ≥ |λ2 | > 0. By the remainder theorem
we may choose α ∈ Z such that |λ1 − αλ2 | < |λ2 |. Let $a_2' = a_2 + \alpha a_1$ and
$\lambda_1' = \lambda_1 - \alpha\lambda_2$, then

$\lambda_1' a_1 + \lambda_2 a_2' + \cdots + \lambda_n a_n = 0.$

Also observe that $\{a_1, a_2', \cdots, a_n\} \subset G$ is still a generating set and $\{\lambda_1', \lambda_2, \cdots, \lambda_n\}$
are still collectively coprime. This process must eventually terminate
with one of the coefficients equal to either 1 or −1. In this case we can apply
the inductive step as above to conclude that G is free abelian.

Proposition 3.92. Let G be finitely generated and Abelian. Then G/tG is


a finitely generated free Abelian group.

Proof. G/tG is torsion free. We must show that G/tG is finitely generated.
Let {a1 , · · · , an } ⊂ G generate G. Then {a1 + tG, · · · , an + tG} ⊂ G/tG
forms a generating set. By the above theorem G/tG is free Abelian.
Definition 3.93. Let G be a finitely generated Abelian group. We define
the rank of G to be the rank of G/tG.

Let G be finitely generated and Abelian. Let G/tG be of rank n ∈ N


and let f1 , · · · , fn be a basis for G/tG. Let φ : G → G/tG be the natural
quotient homomorphism. Clearly φ is surjective. Choose {e1 , · · · , en } ⊂ G
such that φ(ei ) = fi ∀i ∈ {1, · · · , n}. None of the fi have finite order ⇒ none
of the ei have finite order. Moreover

φ(λ1 e1 + · · · + λn en ) = λ1 f1 + · · · + λn fn ∈ G/tG.

Because {f1 , · · · , fn } is a free basis for G/tG we deduce that λ1 e1 + · · · +


λn en = 0 ⇔ λi = 0∀i ⇒ F := gp{e1 , · · · , en } ⊆ G is free abelian with basis
{e1 , · · · , en } ⇒ F is torsion free. Therefore F ∩ tG = {0}.

Let g ∈ G. By definition, ∃λ1 , · · · , λn ∈ Z such that φ(g) = λ1 f1 +· · ·+λn fn .


Then we have:

φ(g) = λ1 f1 + · · · + λn fn ⇒ φ(g) = φ(λ1 e1 + · · · + λn en )


⇒ φ(g − (λ1 e1 + · · · + λn en )) = 0
⇒ g − (λ1 e1 + · · · + λn en ) ∈ kerφ = tG
⇒ ∃h ∈ tG s.t. g = (λ1 e1 + · · · + λn en ) + h

Hence every x may be written uniquely in the form x = f + g where f ∈ F


and g ∈ tG.
Proposition. Every finitely generated Abelian group can be written as a
direct sum of a free Abelian group and a finite group.

Proof. By the above, we may write

G = F ⊕ tG

Define the homomorphism :

G = F ⊕ tG −→ tG
f + h −→ h

This is surjective with kernel F , hence by the first isomorphism theorem tG is


isomorphic to G/F . The image of any generating set of G is a generating set
for G/F under the quotient homomorphism. Hence tG is finitely generated
and torsion, hence finite. F is free Abelian by construction.

Hence we have reduced the study of finitely generated Abelian groups to


understanding finite Abelian groups.

3.10 Finite Abelian Groups

Definition. A finite group G (not necessarily Abelian) is a p-group, with


p ∈ N a prime, if every element of G has order a power of p.

By Sylow’s Theorem the order of a finite p-group must be a power of p. From
now on let G be a finite Abelian group. Let p ∈ N be a prime. We define
Gp := {g ∈ G | ord(g) is a power of p} ⊂ G.

Theorem 3.94. Gp ⊂ G is a subgroup.

Proof. 1. ord(0) = 1 = $p^0$ ⇒ 0 ∈ Gp .

2. Let g, h ∈ Gp ⇒ ∃r, s ∈ N such that $p^r g = p^s h = 0$ ⇒ $p^{r+s}(g + h) = p^s(p^r g) + p^r(p^s h) = 0 + 0 = 0$ ⇒ g + h ∈ Gp .

3. Let g ∈ Gp ⇒ ∃r ∈ N such that $p^r g = 0$ ⇒ $-p^r g = p^r(-g) = 0$ ⇒
−g ∈ Gp

This critically relies on G being Abelian. By definition Gp is a p-group. Recall
that ∀g ∈ G, ord(g) | |G| by Lagrange’s Theorem. Therefore Gp = {0} unless
p divides |G|. By Sylow’s Theorem we deduce that if $|G| = p^n u$,
where HCF(p, u) = 1, then $|G_p| = p^n$. Thus Gp ⊆ G is the maximal
p-subgroup contained in G. The importance of the maximal p-subgroups is
shown by the following theorem.

Theorem 3.95. Let G be a finite Abelian group. Let {p1 , · · · , pr } be the
primes dividing |G|. Then

$G = G_{p_1} \oplus \cdots \oplus G_{p_r}$

Moreover this is the unique way to express G as a direct sum of p-subgroups
for distinct primes.

Proof. Let |G| = n = a1 a2 · · · ar where $a_i = p_i^{\alpha_i}$. Let Pi = n/ai . {P1 , · · · , Pr } ⊂
Z are collectively coprime ⇒ ∃ Q1 , · · · , Qr ∈ Z such that

P1 Q1 + · · · + Pr Qr = 1 (Extension of Euclid)

Let g ∈ G and gi = Pi Qi g. Clearly g = g1 + g2 + · · · + gr and $p_i^{\alpha_i} g_i = Q_i(ng) = 0$.
Hence $g_i \in G_{p_i}$.

We must prove the uniqueness of this sum. Assume we had

$g = g_1' + \cdots + g_r', \quad g_i' \in G_{p_i}.$

Therefore $x = g_1 - g_1' = (g_2' - g_2) + (g_3' - g_3) + \cdots + (g_r' - g_r)$. The right
hand side has order dividing P1 , the left hand side has order dividing a1 .
P1 and a1 are coprime ⇒ ∃ u, v ∈ Z such that uP1 + va1 = 1 ⇒ x =
u(P1 x) + v(a1 x) = 0 + 0 = 0 ⇒ $g_1 = g_1'$. Similarly we find $g_i = g_i'$ for all
i ∈ {1, · · · , r}, hence the sum is unique and we deduce

$G = G_{p_1} \oplus \cdots \oplus G_{p_r}.$

Let {q1 , · · · , qs } be a finite collection of distinct primes. Assume that G
can be expressed as the direct sum

$G = H_1 \oplus \cdots \oplus H_s \cong H_1 \times \cdots \times H_s$

where Hi is a finite qi -subgroup. Clearly Gqi = Hi and if p is a prime not
in {q1 , · · · , qs } then Gp = {0}. Thus {p1 , · · · , pr } = {q1 , · · · , qs } and any such
representation is unique.
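
The existence part of the proof is constructive, and the construction can be followed numerically. The sketch below works in G = Z/60Z, a hypothetical example chosen here, computes integers Q_i with P_1 Q_1 + · · · + P_r Q_r = 1 by repeated use of the extended Euclidean algorithm, and splits an element into its p-primary components g_i = P_i Q_i g. The function names are assumptions made for the illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def bezout(numbers):
    """Return (g, coeffs) with sum(c * m for c, m in zip(coeffs, numbers)) == g."""
    g, coeffs = numbers[0], [1]
    for m in numbers[1:]:
        g, x, y = extended_gcd(g, m)
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs

n = 60
prime_powers = [4, 3, 5]                  # the a_i, so that n = 4 * 3 * 5
P = [n // a for a in prime_powers]        # P_i = n / a_i, here [15, 20, 12]
g, Q = bezout(P)

def components(x):
    """Split x in Z/60Z as x = g_1 + g_2 + g_3 with a_i * g_i = 0."""
    return [(Pi * Qi * x) % n for Pi, Qi in zip(P, Q)]

parts = components(7)
print(g == 1)                                                          # True
print(sum(Pi * Qi for Pi, Qi in zip(P, Q)) == 1)                       # True
print(sum(parts) % n == 7)                                             # True
print(all((a * gi) % n == 0 for a, gi in zip(prime_powers, parts)))    # True
```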

We have thus reduced the study of finite abelian groups to that of finite abelian
p-groups.

Theorem 1. Every finite Abelian p-group is a direct sum of cyclic groups.

Proof. Let G be a finite Abelian p-group. If G is cyclic, we are done; otherwise
take a cyclic subgroup B = gp(b) of maximal order, say $p^n$. Our strategy
is to show that there is a p-subgroup D ⊂ G such that G = B ⊕ D. We
apply the following inductive hypothesis: For any finite Abelian p-group F
of size less than |G|, if M ⊂ F is a maximal cyclic subgroup then there exists
N ⊂ F such that M ⊕ N = F . This is clearly true for F trivial.

We claim that there is a subgroup C of order p such that B ∩ C = {0}.
Recall that because G is Abelian G/B is naturally an Abelian p-group. Let
c ∈ G \ B and suppose cB ∈ G/B has order $p^r$ for r > 0. Observe that
the maximal order of any element in G/B is less than or equal to $p^n$. Thus
we know n ≥ r. By definition $p^r(cB) = B$ ⇒ $p^r c \in B$. Thus there
exists s ∈ N such that $p^r c = sb$. By maximality of the order of b we know
$0 = p^n c = s p^{n-r} b$. But ord(b) = $p^n$, hence $p^n \mid s p^{n-r}$. Therefore we have p | s,
say $s = ps'$. Hence $c_1 = p^{r-1} c - s' b$ has order p and is not in B. Therefore
$C = \mathrm{gp}(c_1)$ is the required subgroup.

Let BC = {ab | a ∈ B, b ∈ C}. We claim that BC ⊂ G is a subgroup.

1. eG ∈ B and eG ∈ C ⇒ eG ∈ BC.

2. Let a1 , a2 ∈ B, b1 , b2 ∈ C. Then (a1 b1 )(a2 b2 ) = (a1 a2 )(b1 b2 ) ∈ BC.


Hence BC is closed under composition.

3. Let a1 ∈ B, b1 ∈ C. Then $(a_1 b_1)^{-1} = b_1^{-1} a_1^{-1} = a_1^{-1} b_1^{-1} \in BC$. Hence
BC is closed under taking inverses.

First observe that |G/C| < |G|. Hence the inductive hypothesis applies
to G/C. Observe that BC ⊂ G is a subgroup containing C. Observe that
BC/C is cyclic, generated by bC ∈ BC/C. Because B ∩ C = {0} we also
know that |BC/C| = pn . Note that the size of the maximal cyclic subgroup of
G must be larger than or equal to the size of the maximal cyclic subgroup of
G/C. However we have constructed a cyclic subgroup BC/C ⊂ G/C whose
order equals that of B. Hence BC/C ⊂ G/C is a maximal cyclic subgroup.
Thus by our inductive hypothesis ∃N ⊂ G/C such that BC/C ⊕ N = G/C.
By the third isomorphism theorem we know that N = D/C for a unique
subgroup D ⊂ G containing C. We claim that G is the direct sum of B and
D.

Let g ∈ G. Then g + C ∈ G/C is uniquely expressible in the form g + C =
(a + C) + (d + C) = (a + d) + C, where a ∈ B and d ∈ D. Hence g = a + d + c
for some c ∈ C. However C ⊂ D so this expresses g as a sum of elements of
B and D. Let x ∈ B ∩ D. Hence x + C ∈ BC/C ∩ D/C. Assume that x ≠ 0.
Note that x ∉ C. Hence x + C is non-zero in BC/C and D/C. However by
construction BC/C ∩ D/C = {C}. This is a contradiction. Hence B ∩ D = {0}
and we deduce that G = B ⊕ D.

Thus we have shown that given any finite Abelian p-group G and a max-
imal cyclic subgroup B ⊂ G, there exists a subgroup D ⊂ G such that
G = B ⊕ D. Observe that D is a finite Abelian p-group, thus we can con-
tinue this process until eventually it must terminate. The end result will be
an expression of G as a direct sum of cyclic p-groups.

Corollary 3.96. For any finite Abelian p-group G, there exists a unique
weakly decreasing sequence of natural numbers {r1 , · · · , rn } ⊂ N such that

$G \cong \mathbb{Z}/p^{r_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p^{r_n}\mathbb{Z}.$

Proof. By the previous theorem we know that G is the direct sum of cyclic
groups each of p-power order. Thus we know that such integers exist. We will
prove uniqueness by induction on |G|. Assume that there are isomorphisms

$G \cong \mathbb{Z}/p^{r_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p^{r_n}\mathbb{Z} \cong \mathbb{Z}/p^{s_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p^{s_m}\mathbb{Z},$

where the ri and sj are decreasing sequences of natural numbers. We therefore
see that $|G| = p^{\sum_{i=1}^{n} r_i} = p^{\sum_{j=1}^{m} s_j}$. Hence
$\sum_{i=1}^{n} r_i = \sum_{j=1}^{m} s_j$.

Let pG = {pg | g ∈ G}. It is a straightforward exercise (which we leave
to the reader) to prove that pG is a subgroup of G. Note that for r > 1,
$\mathbb{Z}/p^{r-1}\mathbb{Z} \cong p(\mathbb{Z}/p^r\mathbb{Z})$, where the isomorphism is given by sending $a + p^{r-1}\mathbb{Z}$
to $pa + p^r\mathbb{Z}$. We deduce therefore that there are isomorphisms

$pG \cong \mathbb{Z}/p^{r_1-1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p^{r_n-1}\mathbb{Z} \cong \mathbb{Z}/p^{s_1-1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p^{s_m-1}\mathbb{Z}.$

Observe now that |pG| < |G|, thus by induction we deduce that the ri
and sj agree when restricted to entries strictly greater than 1. This, together
with the fact that $\sum_{i=1}^{n} r_i = \sum_{j=1}^{m} s_j$, implies that the two sets are the same and
thus uniqueness is proven.

Proposition 3.97. Let G be a finite Abelian group and let p ∈ N be a prime
dividing |G|. Then Gp is non-trivial.

Proof. Recall that if {p1 , · · · , pr } are the primes dividing |G| then

$G \cong G_{p_1} \times \cdots \times G_{p_r}.$

Hence $|G| = |G_{p_1}| \cdots |G_{p_r}|$. By the above corollary pi divides |G| if and
only if $G_{p_i}$ is non-trivial.
Structure Theorem for Finitely Generated Abelian Groups. Every
finitely generated Abelian group G can be written as a direct sum of cyclic
groups:

G = β1 ⊕ · · · ⊕ βr

where each βi is either infinite or of prime power order, and the orders which
occur are uniquely determined (up to reordering of the indices).

Proof. G = F ⊕ tG. F is free and finitely generated, hence the direct sum of
infinite cyclic groups (Z, +). The number of these equals the rank of G. tG is
finite Abelian, hence is the unique direct sum of p-groups for distinct primes
p. Each p-group is the unique direct sum (up to order) of p-power cyclic
groups.

Note that we could have stated this theorem with direct product in place of
direct sum. Thus we have classified all finitely generated Abelian groups up
to isomorphism.
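
One consequence of the classification is a counting formula for finite Abelian groups: by Theorem 3.95 and Corollary 3.96, an Abelian group of order n is determined up to isomorphism by choosing, for each prime power $p^e$ exactly dividing n, a weakly decreasing sequence of positive integers summing to e, that is, a partition of e. The Python sketch below counts these choices; the function names are hypothetical and the factorization uses plain trial division, so it is only meant for small n.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(partitions(n - part, part)
               for part in range(1, min(n, largest) + 1))

def prime_exponents(n):
    """Exponents in the prime factorization of n, found by trial division."""
    exponents, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exponents.append(e)
        p += 1
    if n > 1:
        exponents.append(1)
    return exponents

def count_abelian_groups(n):
    """Number of Abelian groups of order n up to isomorphism."""
    result = 1
    for e in prime_exponents(n):
        result *= partitions(e)
    return result

print([count_abelian_groups(n) for n in (8, 12, 16, 36)])   # [3, 2, 5, 4]
```

For instance the three Abelian groups of order 8 are Z/8Z, Z/4Z × Z/2Z and Z/2Z × Z/2Z × Z/2Z, corresponding to the partitions 3, 2 + 1 and 1 + 1 + 1.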

3.11 The Classification of Finite Groups (Proofs Omitted)

In the last section we classified all finite Abelian groups up to isomorphism.


Is it possible to do the same for all finite groups? It turns out that the
situation is far more complicated in the non-Abelian case.

Here is the basic strategy:

• Show that any finite group G can be broken down into simple pieces.

• Classify these simple pieces.

• Understand how these simple pieces can fit together.

Definition 3.98. Let G be a finite group. A composition series for G is
a nested collection of subgroups

{e} = G0 ◁ G1 ◁ · · · ◁ Gr−1 ◁ Gr = G

such that

• Gi−1 ≠ Gi for all 0 < i ≤ r.

• Gi /Gi−1 is simple for all 0 < i ≤ r.

Remark 3.99. By the third isomorphism theorem a composition series cannot
be extended, meaning we cannot add any intermediate normal subgroups.

Theorem 3.100. Any finite group G has a composition series.

Observe that if G is simple then {e} = G0 ◁ G1 = G is a composition series.

If G = Sym3 then

{e} ◁ gp((123)) ◁ Sym3

gives a composition series. To see why, observe that the quotient groups have
sizes 3 and 2 and are therefore isomorphic to Z/3Z and Z/2Z, which are both simple.

Jordan-Holder Theorem. Let G be a finite group. Suppose we have two
composition series for G

{e} = G0 ◁ G1 ◁ · · · ◁ Gr−1 ◁ Gr = G

{e} = H0 ◁ H1 ◁ · · · ◁ Hs−1 ◁ Hs = G.

Then r = s and the quotient groups

{G1 /G0 , · · · , Gr /Gr−1 }, {H1 /H0 , · · · , Hs /Hs−1 }

are pairwise isomorphic (perhaps after reordering).


Definition 3.101. If G has composition series

{e} = G0 ◁ G1 ◁ · · · ◁ Gr−1 ◁ Gr = G

we call the quotient groups

{G1 /G0 , · · · , Gr /Gr−1 }

the simple components of G.

By the Jordan-Hölder Theorem the simple components are well-defined
up to isomorphism. It is possible that two non-isomorphic groups have the
same (up to isomorphism) simple components. As an example Sym3 and
Z/6Z both have simple components {Z/2Z, Z/3Z}.

Definition 3.102. A finite group is called solvable (or soluble) if its simple
components are Abelian. Note that solvable groups need not be Abelian
themselves.

Note that Sym3 is solvable, while Alt5 (being simple and non-Abelian) is
non-solvable.

To summarize our study: Finite group theory is much like the theory of
chemical molecules.

• The simple groups are like atoms


• Finite groups have simple components, like molecules have constituent
atoms.
• Non-isomorphic finite groups with the same simple components are like
molecules with the same atoms but different structure (isomers).

We now have two goals:

• Classify all finite simple groups up to isomorphism.

• Classify all finite groups with given simple components.

The theory of groups was initiated by Galois in 1832. Galois was the first to
discover simple groups, namely Z/pZ for p prime and Altn
for n > 4. Amazingly, it took until 2004 for a complete classification to be
known. The proof stretches across over 10000 pages and is the combined
work of thousands of mathematicians. Here’s a very rough breakdown of
the four distinct classes of finite simple group:

• Cyclic groups of prime order. These are the only Abelian simple groups.

• Altn for n > 4

• Finite groups of Lie type. These groups are very complicated to de-
scribe in general. The basic idea is that they can be realized as sub-
groups and quotients of matrix groups. There are 16 infinite families
of finite simple groups of Lie type.

• There are 26 sporadic groups. Very strangely these do not fall into
any fixed pattern. The first were discovered in 1861 by Mathieu, while
he was thinking about subgroups of finite permutation groups with
extremely strong transitivity properties. The largest sporadic group
was discovered in the 1970s. It’s called the monster group and has size

2^46 · 3^20 · 5^9 · 7^6 · 11^2 · 13^3 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71.

The monster contains all but six of the other sporadic groups as quo-
tients of subgroups.
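
To get a feel for the size of the monster group, its order can be multiplied out from the prime factorisation 2^46 · 3^20 · 5^9 · 7^6 · 11^2 · 13^3 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71. The short Python sketch below does only this arithmetic; the variable names are illustrative.

```python
# Prime factorisation of the order of the monster group, as (prime, exponent) pairs.
factorisation = [
    (2, 46), (3, 20), (5, 9), (7, 6), (11, 2), (13, 3),
    (17, 1), (19, 1), (23, 1), (29, 1), (31, 1),
    (41, 1), (47, 1), (59, 1), (71, 1),
]

order = 1
for prime, exponent in factorisation:
    order *= prime ** exponent

print(order)            # the exact order as a Python integer
print(len(str(order)))  # number of decimal digits
```

The result has 54 decimal digits, so the order is roughly 8 × 10^53.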

The theory of finite simple groups is one of the crown jewels of mathematics.
It demonstrates how profound the definition of a group really is. All of this
complexity is contained in those three innocent axioms.

The next question, of course, is to classify all finite groups with given
simple components. This is still a wide open problem. As such a complete
classification of all finite groups is still unknown.

One may ask about classifying infinite groups. Unsurprisingly the situ-
ation is even more complicated, although much progress has been made if
specific extra structure (topological, analytic or geometric) is imposed.

4 Rings, Ideals, and Homomorphisms

4.1 Basic Definitions

A group (G, ∗) is a set with a binary operation satisfying three properties.


The motivation for the definition reflected the behavior of (Z, +). Observe
that Z also comes naturally equipped with multiplication ×. In the first
lectures we collected some of the properties of (Z, +, ×). Motivated by this
we make the following fundamental definition:
Definition. A ring is a set R with two binary operations, +, called addition,
and ×, called multiplication, such that:

1. R is an Abelian group under addition.

2. R is a monoid under multiplication (inverses do not necessarily exist).

3. + and × are related by the distributive law:


(x + y) × z = x × z + y × z and x × (y + z) = x × y + x × z ∀x, y, z ∈ R

The identity for + is “zero”, denoted 0R (often just written as 0), and the
identity for × is “one”, denoted 1R (often just written as 1).
Remark 4.1. 1. To simplify the notation we will write x × y = xy for all
x, y ∈ R.

2. Distributivity implies that we can “multiply” together finite sums:


$$\Big(\sum_i x_i\Big)\Big(\sum_j y_j\Big) = \sum_{i,j} x_i y_j$$

in a well-defined way.

Here are some examples of rings:

1. The integers under the usual addition and multiplication.

2. The rational numbers under the usual addition and multiplication.

3. The real numbers under the usual addition and multiplication.

4. The complex numbers under the usual addition and multiplication.

5. Z/mZ under the addition and multiplication described in 2.3.

6. Let S be a set and P(S) be the set of all subsets. This is called the
power set of S. On P(S) define + and × by

X + Y = (X ∩ Y′) ∪ (X′ ∩ Y), XY = X ∩ Y

where X′ denotes the complement of X in S. Then P(S) is a ring
with ∅ = 0 and S = 1. This strange looking ring has applications to
mathematical logic.

7. In linear algebra the collection of linear maps from R^n to R^n can be
identified with the set M_{n×n}(R) (also denoted M_n(R)) of n × n real
matrices. This has the structure of a ring
under the usual addition and multiplication of matrices. For n ≥ 2 this is an
example of a non-commutative ring.

8. The product ring Z × Z. It is defined as the set of pairs of integers,


with addition and multiplication defined component-wise.

9. More generally, if (R, +R , ×R ) and (S, +S , ×S ) are rings, we define the


product ring
(R × S, +R×S , ×R×S )
by

(r1 , s1 ) +R×S (r2 , s2 ) := (r1 +R r2 , s1 +S s2 ) (2)


(r1 , s1 ) ×R×S (r2 , s2 ) := (r1 ×R r2 , s1 ×S s2 ) (3)

This can lead to strange rings, such as Q × Z/5Z.

10. If R is any ring, then the set Mn×n (R) of n×n matrices with coefficients
in R is a ring, with addition component-wise, and multiplication defined
by the usual formulas for matrix multiplication. If n ≥ 2, this is not
commutative.

11. If R is any ring, there is a ring R[x] of polynomials in the variable x with
coefficients in R. A polynomial is a formal expression a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n,
where a_i ∈ R for 0 ≤ i ≤ n. Addition and multiplication are defined
like usual addition and multiplication of polynomials. We discuss this
in much more detail in Section 4.4.
12. The subset of C defined by
{a + bi | a, b ∈ Z},
with the usual operations of addition and multiplication, is a ring
known as the Gaussian Integers. This ring is denoted Z[i].
13. The subset of R defined by

{a + b√2 | a, b ∈ Q}

is a ring. This ring is denoted Q[√2]. There is also a version Z[√2]
with integer instead of rational coefficients.
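
Example 6 in the list above (the power set ring) is unusual enough that a brute-force check is reassuring. The following is a minimal sketch in Python for the assumed choice S = {1, 2, 3}, with addition as symmetric difference and multiplication as intersection. The names add, mul and checks are illustrative.

```python
from itertools import combinations

S = frozenset({1, 2, 3})
# All subsets of S, i.e. the elements of P(S).
subsets = [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]

def add(X, Y):
    """Ring addition: (X ∩ Y') ∪ (X' ∩ Y), the symmetric difference."""
    return X ^ Y

def mul(X, Y):
    """Ring multiplication: intersection."""
    return X & Y

zero, one = frozenset(), S

# Check identities, additive inverses, associativity of + and distributivity.
checks = all(
    add(X, zero) == X
    and mul(X, one) == X
    and add(X, X) == zero                                # each X is its own negative
    and add(add(X, Y), Z) == add(X, add(Y, Z))
    and mul(X, add(Y, Z)) == add(mul(X, Y), mul(X, Z))
    for X in subsets for Y in subsets for Z in subsets
)
print(len(subsets), checks)
```

Note that every element is its own additive inverse, which already shows this ring behaves quite differently from Z.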

Note that matrix multiplication is not commutative in general. So it is


perfectly possible for a multiplication not to be commutative in a ring.
Definition 4.2. Let R be a ring with multiplication ×. If × is commutative,
i.e. xy = yx ∀ x, y ∈ R then we say that R is a commutative ring.
Definition 4.3. Let R and S be two rings. A homomorphism φ from R
to S is a map of sets φ : R → S such that ∀x, y ∈ R

1. φ(x + y) = φ(x) + φ(y)


2. φ(xy) = φ(x)φ(y)
3. φ(1R ) = 1S

Once again, if R = S and φ = IdR then we call it the identity homomorphism.

Note that R and S are abelian groups under + so φ is a group homomorphism


with respect to + so φ(0R ) = 0S . We have to include (3) as (R, ×) is only a
monoid, so it does not follow from (2) alone that φ(1R ) = 1S .

Remark 4.4.

1. As for groups, the composition of two ring homomorphisms is again a
ring homomorphism.

2. As before, an isomorphism is a bijective homomorphism, or equivalently
one with an inverse homomorphism. A homomorphism from R
to itself is called an endomorphism. An endomorphism which is also
an isomorphism is called an automorphism. This is exactly the same
terminology as for groups.

Example 4.5. Here’s an example that shows why condition (3) of Definition 4.3
is important. The map from Z to Z × Z sending n ∈ Z to (n, 0) ∈ Z × Z satisfies the
first two properties of ring homomorphisms, but it is NOT a ring homomorphism.
That’s because the multiplicative identity of Z × Z is (1, 1), whereas 1 is sent to (1, 0).
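
The failure in Example 4.5 can be seen concretely. Below is a minimal Python sketch, assuming elements of Z × Z are modelled as pairs of Python integers with component-wise operations. The names phi, add_pair and mul_pair are illustrative.

```python
def phi(n):
    """The map Z -> Z x Z sending n to (n, 0)."""
    return (n, 0)

def add_pair(u, v):
    """Component-wise addition in Z x Z."""
    return (u[0] + v[0], u[1] + v[1])

def mul_pair(u, v):
    """Component-wise multiplication in Z x Z."""
    return (u[0] * v[0], u[1] * v[1])

sample = range(-5, 6)
additive = all(phi(a + b) == add_pair(phi(a), phi(b)) for a in sample for b in sample)
multiplicative = all(phi(a * b) == mul_pair(phi(a), phi(b)) for a in sample for b in sample)
preserves_one = phi(1) == (1, 1)   # (1, 1) is the multiplicative identity of Z x Z

print(additive, multiplicative, preserves_one)
```

On this sample the first two conditions hold and only the third fails, which is exactly why it has to be imposed separately.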
Example 4.6. Here are some examples of ring homomorphisms:

1. The inclusion map from Z into Q is a ring homomorphism.


2. More generally, the inclusion map from Q into R, the inclusion map
from R into C, and the compositions of any of the above are ring
homomorphisms.
3. The map from Z to Z/mZ sending k ∈ Z to [k] = k + mZ ∈ Z/mZ
is a ring homomorphism. This is known as the projection or reduction
homomorphism.
4. The identity map from a ring to itself is always a ring homomorphism
(in fact an isomorphism).
5. Complex conjugation, the map from C to C sending a + bi to a − bi,
is a ring homomorphism. In fact, it is an isomorphism that is not the
identity (aka, a non-trivial automorphism). Non-trivial automorphisms
are very important in Galois theory.
There is aring homomorphism from R to M3 (R) sending x ∈ R to
6. 
x 0 0
 0 x 0 . More generally, we have such a ring homomorphism to
0 0 x
Mn (R) for any n.

7. The previous example actually works with R replaced by ANY ring R
(yes, even a finite ring like Z/mZ).

8. For any ring R and r ∈ R, there is a homomorphism

evr : R[x] → R

sending f (x) ∈ R[x] to f (r) ∈ R.

9. There is a homomorphism ev_i : R[x] → C sending a polynomial f(x) ∈
R[x] to f(i) ∈ C. Notice that this homomorphism sends x^2 + 1 to 0.

10. There is a homomorphism ev_√2 : Q[x] → R sending f(x) ∈ Q[x] to
f(√2) ∈ R. Its image is Q[√2]. Notice that it sends x^2 − 2 to 0.

11. If R, S are two rings, then there is a ring homomorphism prR : R × S →


R sending (r, s) ∈ R × S to r ∈ R. There is a similar homomorphism
prS : R × S → S sending (r, s) to s ∈ S. These are called projection
homomorphisms.
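
Two of the homomorphisms in Example 4.6 are easy to test numerically: reduction modulo m (item 3) and complex conjugation (item 5). The Python sketch below checks the three defining conditions on a finite sample only, so it is evidence rather than proof. The modulus m = 6 and the sample ranges are arbitrary assumptions.

```python
m = 6

def reduce_mod(k):
    """The reduction map Z -> Z/mZ, using the representative in {0, ..., m-1}."""
    return k % m

sample = range(-10, 11)
reduction_ok = reduce_mod(1) == 1 and all(
    reduce_mod(a + b) == (reduce_mod(a) + reduce_mod(b)) % m
    and reduce_mod(a * b) == (reduce_mod(a) * reduce_mod(b)) % m
    for a in sample for b in sample
)

# Gaussian integers with small entries, so that float arithmetic stays exact.
gaussian = [complex(a, b) for a in range(-3, 4) for b in range(-3, 4)]
conjugation_ok = complex(1, 0).conjugate() == 1 and all(
    (z + w).conjugate() == z.conjugate() + w.conjugate()
    and (z * w).conjugate() == z.conjugate() * w.conjugate()
    for z in gaussian for w in gaussian
)

print(reduction_ok, conjugation_ok)
```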

In any ring R we have the following elementary consequences of the axioms:

x0 = x(0 + 0) = x0 + x0 ⇒ x0 = 0

Similarly, 0x = 0 for all x ∈ R.

If R consists of one element, then 1 = 0. Conversely, if 1 = 0 then ∀x ∈ R,
x = x1 = x0 = 0, hence R consists of one element. The ring with one
element is called the trivial ring.

In a ring we abbreviate expressions like

a + a + a + · · · + a (n times) = na (n ∈ N)

It is clear that we may naturally extend this to all n ∈ Z.

Similarly,
a × a × · · · × a (n times) = a^n for n ∈ N.

By the Distributive Law, we have the identities

1. m(a + b) = ma + mb

2. (m + n)a = ma + na

3. (mn)a = m(na)

∀a, b ∈ R and m, n ∈ Z.

Definition 4.7. Given R and S two rings we say that R is a subring of S


if it is a subset and is a ring under the induced operations (with same 0 and
1). Eg. (Z, +, ×) ⊂ (Q, +, ×). More precisely,

1. R is a subgroup of S under addition.

2. R is closed under multiplication.

3. 1S ∈ R.

Remark 4.8. As with subgroups, an arbitrary intersection of subrings is


again a subring.

4.2 Ideals, Quotient Rings and the First Isomorphism Theorem for Rings

Let G and H be groups and φ : G → H a group homomorphism. Recall


that ker(φ) ⊂ G is a normal subgroup, thus the set of right cosets G/ker(φ)
naturally forms a group (the quotient group). Recall that all normal sub-
groups arise in this manner. The 1st Isomorphism theorem states that there
is a natural isomorphism

G/ker(φ) ≅ Im(φ).
Does something analogous hold for rings?

Let R and S be two rings. Let φ : R → S be a ring homomorphism.


Definition 4.9. The kernel of φ is the subset

ker(φ) := {r ∈ R | φ(r) = 0S } ⊂ R.

The image of φ is the subset

Im(φ) := {s ∈ S | ∃r ∈ R s.t. φ(r) = s} ⊂ S.

Remember that φ is a group homomorphism with respect to the additive


Abelian group structures on R and S. With respect to this structure these
definitions are exactly the same as in group theory. In particular we know
that
ker(φ) = {0R } ⇔ φ is injective .

We also know that ker(φ) ⊂ R and Im(φ) ⊂ S are subgroups under


addition.
Proposition 4.10. Im(φ) ⊂ S is a subring.

Proof. We need to check that Im(φ) is closed under multiplication and con-
tains 1S . Let s1 , s2 ∈ Im(φ). Hence ∃r1 , r2 ∈ R such that φ(r1 ) = s1 and
φ(r2 ) = s2 . But s1 s2 = φ(r1 )φ(r2 ) = φ(r1 r2 ). Hence s1 s2 ∈ Im(φ). Hence
Im(φ) is closed under multiplication.

By definition φ(1R ) = 1S . Hence 1S ∈ Im(φ). Thus Im(φ) is a subring.

If S is non-trivial then because φ(1_R) = 1_S we know that 1_R ∉ ker(φ).
Hence in this case ker(φ) ⊂ R is not a subring. What properties does it
satisfy?

1. ker(φ) ⊂ R is a subgroup under +.

2. Let a ∈ ker(φ) and r ∈ R. Observe that φ(ra) = φ(r)φ(a) = φ(r)0S =


0S . Hence ra ∈ ker(φ). Similarly ar ∈ ker(φ). Hence ker(φ) is closed
under both left and right multiplication by all of R.

Definition 4.11. Let R be a ring. An ideal I ⊂ R is a subset which is a


subgroup under addition and is closed under both left and right multiplication
by all of R. More precisely, if x ∈ I then xr, rx ∈ I for all r ∈ R.

Example 4.12. If R is a commutative ring, and a ∈ R, then the set aR :=


{ra | r ∈ R} of all multiples of a in R is an ideal. Many of our ideals will
be of this form.

Remark 4.13. Notice that Z ⊆ Q is a subring, but it is NOT an ideal.
In particular, the quotient Q/Z is a group under addition, but it is NOT a
ring, because multiplication is not well-defined. (For example, 1/2 and 3/2
represent the same element of Q/Z, but (1/2)^2 = 1/4 and (1/2)(3/2) = 3/4 do
not represent the same element of Q/Z.)
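
The counterexample in Remark 4.13 can be replayed with exact rational arithmetic. The following is a small Python sketch using the standard fractions module. The helper same_class is an illustrative name for "represents the same element of Q/Z".

```python
from fractions import Fraction

def same_class(x, y):
    """x and y represent the same element of Q/Z iff x - y is an integer."""
    return (x - y).denominator == 1

half = Fraction(1, 2)
a, b = Fraction(1, 2), Fraction(3, 2)

print(same_class(a, b))                # a and b represent the same coset
print(half * a, half * b)              # the two candidate products
print(same_class(half * a, half * b))  # the products land in different cosets
```

So multiplying the coset of 1/2 by "the same" coset gives different answers depending on the representative chosen.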

We have just shown that the kernel of a homomorphism is always an


ideal. An ideal is the ring theoretic analogue of normal subgroup in group
theory.

Let I ⊂ R be an ideal. Recall that (R, +) is an abelian group. Hence
(I, +) ⊂ (R, +) is a normal subgroup, and so the right cosets R/I naturally
have a group structure under addition. So far we have completely ignored the
multiplicative structure on R. Let us define a multiplication by:

(a + I) × (b + I) := (ab) + I, ∀a, b ∈ R.

Lemma. This binary operation is well defined.

Proof. Let a1 +I = a2 +I and b1 +I = b2 +I where a1 , a2 , b1 , b2 ∈ R. Observe


that
a1 b1 − a2 b2 = a1 (b1 − b2 ) + (a1 − a2 )b2

is contained in I because I is an ideal. Thus

a1 b1 + I = a2 b2 + I.

Proposition 4.14. R/I is a ring under the natural operations. We call it


the quotient ring.

Proof. This is just a long and tedious exercise to check the axioms which all
follow because they hold on R. Unsurprisingly 0 + I is the additive identity
and 1 + I is the multiplicative identity.

As in the case of groups there is a natural surjective quotient ring homomorphism

φ : R → R/I.
From the definitions we see that ker(φ) = I. We deduce that ideals of a ring
are precisely the kernels of ring homomorphisms. This is totally analogous
to the group theory situation.
The First Isomorphism Theorem. Let φ : R → S be a ring homomor-
phism. Then the induced map

ϕ : R/ker(φ) −→ Im(φ)
a + ker(φ) −→ φ(a)

is a ring isomorphism.

Proof. The first isomorphism theorem for groups tells us that it is an
isomorphism of additive groups. Hence we merely need to check that it is a ring
homomorphism.

Let a, b ∈ R. Then ϕ((a + ker(φ))(b + ker(φ))) = ϕ(ab + ker(φ)) = φ(ab) =
φ(a)φ(b) = ϕ(a + ker(φ))ϕ(b + ker(φ)). Also ϕ(1_R + ker(φ)) = φ(1_R) = 1_S.

Hence ϕ is a ring homomorphism and we are done.

Definition 4.15. An injective ring homomorphism φ : R → S is called
an embedding. By the first isomorphism theorem, R is isomorphic to the
subring Im(φ) ⊂ S.
Example 4.16. Here are some examples of how the First Isomorphism The-
orem applies:

1. Most of the examples in Example 4.6 are injective, which means their
kernel is the zero ideal {0R } (which is the same as 0R R, the set of
multiples of 0R ).
2. The kernel of the homomorphism from Z to Z/nZ listed in Example
4.6 is nZ, the set of multiples of n.
3. We will later prove that the kernel of ev_i : R[x] → C is (x^2 + 1)R[x],
the set of multiples of the polynomial x^2 + 1.
4. As an example of an ideal not of the form aR, consider the ring R =
Z[x], and consider the set of elements of the form
{5f(x) + xg(x) | f(x), g(x) ∈ Z[x]}.
This is the set of polynomials with integer coefficients whose constant
term is a multiple of 5. The quotient by this ideal is the ring Z/5Z
(with the usual operations, of course).
To see this, consider the homomorphism from Z[x] to Z/5Z sending
a polynomial f (x) ∈ Z[x] to the residue class of its constant term
modulo 5. This is a homomorphism because it is the composition of
the homomorphism ev0 : Z[x] → Z with the projection homomorphism
Z → Z/5Z. Its kernel is precisely the ideal mentioned above.

4.3 Properties of Elements of Rings

Definition 4.17. Let R be a ring. An element a ∈ R is said to be invertible,
or a unit, if it has a multiplicative inverse, i.e. ∃a′ ∈ R such that a′a = aa′ =
1. We know that such an inverse is unique if it exists, hence we shall write
it as a^{−1}. Note that if 1 ≠ 0 then 0 is never invertible. We denote the set of
units in R by R∗.

It is clear that for any ring R, (R∗ , ×) is a group.
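
As a concrete instance, the units of Z/12Z can be listed by brute force. The Python sketch below assumes residues are represented by the integers 0, ..., m − 1. The comparison with the residues coprime to m is a standard fact about Z/mZ that is not proved in this section, and is included only as a cross-check.

```python
from math import gcd

m = 12

# a is a unit of Z/mZ if some b satisfies ab = 1 (mod m).
units = [a for a in range(m) if any((a * b) % m == 1 for b in range(m))]

# The set of units is closed under multiplication, as a group must be.
closed = all((a * b) % m in units for a in units for b in units)

# Cross-check: the units are exactly the residues coprime to m.
coprime = [a for a in range(m) if gcd(a, m) == 1]

print(units, closed, units == coprime)
```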
Definition. A non-trivial ring R in which every non-zero element is invertible
(i.e R \ {0} = R∗ ) is called a division ring (or skew field). If R is a
commutative division ring then R is called a field.
Remark 4.18. 1. (Q, +, ×) is the canonical example of a field. Other
natural examples include (R, +, ×), (C, +, ×) and (Z/pZ, +, ×), where
p is a prime number. There are examples of division rings which are
not fields (i.e. not commutative) but we will not encounter them in
this course.

2. All of linear algebra (except the issue of eigenvalues existing) can be
set up over an arbitrary field. All proofs are exactly the same; we never
used anything else about R or C.

In an arbitrary ring it is possible that two non-zero elements can multiply
to give zero. For example, in M_{2×2}(R), the non-zero matrices

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}$$

multiply to give the zero matrix.
Definition. Let R be a non-trivial ring. Given a ∈ R \ {0}, if there exists
b ∈ R \ {0} such that ab = 0 or ba = 0, then a is said to be a zero-divisor.
Note that 0 is not a zero-divisor.
Definition 4.19. A non-trivial ring R with no zero divisors is said to
be entire; a commutative entire ring is called an integral domain. More
concretely: R is entire if and only if 1 ≠ 0 and ∀x, y ∈ R, xy = 0 ⇒ x = 0 or
y = 0.

(Z, +, ×), (Q, +, ×) are integral domains. (Z/mZ, +, ×) is an integral domain
⇔ m is prime. The above example shows that M_2(R) is not entire.
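
Both claims above can be tested by direct computation. The following Python sketch multiplies the two matrices A and B from the earlier example and then lists zero-divisors in Z/mZ for small m. The helper names and the range of m tested are assumptions made for illustration.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1], [0, 0]]
B = [[0, 2], [0, 0]]
print(matmul2(A, B))   # the zero matrix, so A and B are zero-divisors

def zero_divisors(m):
    """Non-zero residues a mod m with ab = 0 (mod m) for some non-zero b."""
    return [a for a in range(1, m) if any((a * b) % m == 0 for b in range(1, m))]

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

print(zero_divisors(6))   # [2, 3, 4]

# Z/mZ has no zero-divisors exactly when m is prime (checked for 2 <= m < 50).
agree = all((zero_divisors(m) == []) == is_prime(m) for m in range(2, 50))
print(agree)
```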
Theorem. A ring R is entire ⇔ its set of non-zero elements forms a monoid
under multiplication. Another way to state this is that R entire ⇔ R \ {0}
is closed under multiplication.

Proof. In any ring R observe that if x, y ∈ R are non-zero and neither is a
zero-divisor, then xy ∈ R is again non-zero and not a zero-divisor. Hence, if R
is non-trivial, the non-zero elements of R that are not zero-divisors form a
monoid under multiplication. If R is entire this set is precisely R \ {0},
which implies it is a monoid under multiplication. Conversely, if R \ {0} is a
monoid then firstly it is non-empty so R is non-trivial. But if x, y ∈ R \ {0}
then xy ∈ R \ {0}. Hence R is entire by definition.

Corollary. Any field F is an integral domain.

Proof. If x, y ∈ F, x ≠ 0 ≠ y then ∃ x^{−1}, y^{−1} ∈ F such that xx^{−1} = x^{−1}x =
1 = yy^{−1} = y^{−1}y, therefore xy is invertible (with inverse y^{−1}x^{−1}) so is non-zero.

Hence, non-zero elements are closed under multiplication, so F is entire. F
is a field so F is commutative, so it is an integral domain.

Cancellation Law:

Let R be a ring. If c ∈ R is non-zero and not a zero-divisor, then for any
a, b ∈ R such that ca = cb or ac = bc, we have a = b.

This is because ca − cb = c(a − b) and ac − bc = (a − b)c. In particular, if


R is entire, then we can “cancel” any non-zero element. It is important to
note that we cannot do this in an arbitrary ring.

Theorem 4.20. Every finite integral domain R is a field.

Proof. We need to show that R∗ = R \ {0}. Let a ∈ R \ {0}. Define the


following map of sets:

ψ : R \ {0} → R \ {0}

r ↦ ra.

ψ is well defined because R is an integral domain. By the cancellation law for
integral domains, we know that given r_1, r_2 ∈ R, r_1 a = r_2 a ⇒ r_1 = r_2, so ψ is
injective. Since R \ {0} is finite, ψ is also surjective, so ∃ b ∈ R \ {0} such that
ba = 1, and ab = ba = 1 because R is commutative. Hence a has a multiplicative
inverse. Therefore, R∗ = R \ {0}.
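
The proof of Theorem 4.20 is constructive enough to run. Here is a minimal Python sketch for the assumed example R = Z/7Z: for each non-zero a it builds the map ψ(r) = ra, confirms it is a bijection of the non-zero elements, and reads off the inverse of a as the r with ψ(r) = 1.

```python
p = 7
nonzero = list(range(1, p))   # the set R \ {0} for R = Z/7Z

def psi(a):
    """Values of the map r -> ra on R \\ {0}, listed in the order of `nonzero`."""
    return [(r * a) % p for r in nonzero]

# psi is a bijection of R \ {0} for every non-zero a.
bijective = all(sorted(psi(a)) == nonzero for a in nonzero)

# The preimage of 1 under psi is the multiplicative inverse of a.
inverses = {a: nonzero[psi(a).index(1)] for a in nonzero}

print(bijective)
print(inverses)
```

For a ring with zero-divisors such as Z/6Z the same map fails to be injective, which is where the argument breaks down.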

4.4 Polynomial Rings

Let R be a ring.

Definition. The polynomial ring in x with coefficients in a ring R consists
of formal expressions of the form:

g(x) = b_0 + b_1 x + b_2 x^2 + ··· + b_m x^m, b_i ∈ R, m ∈ N

If f(x) = a_0 + a_1 x + ··· + a_n x^n is another polynomial then we decree that
f(x) = g(x) ⇔ a_i = b_i ∀i. Note that we set a_i = 0 if i > n and b_j = 0 if
j > m. We refer to x as the indeterminate, b_m (when b_m ≠ 0) as the leading
coefficient, and b_0 as the constant term.

Addition and multiplication are defined by the rules

1. f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + ··· + (a_n + b_n)x^n (if m ≤ n)

2. f(x) × g(x) = (a_0 b_0) + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + ··· +
a_n b_m x^{n+m}

We will denote this ring by R[x].
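
The two rules can be written out as code. The following is a minimal Python sketch, assuming a polynomial a_0 + a_1 x + ··· + a_n x^n is stored as the non-empty coefficient list [a_0, a_1, ..., a_n] and that the coefficients are Python integers. The names poly_add and poly_mul are illustrative.

```python
def poly_add(f, g):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))   # pad the shorter list with zero coefficients
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Multiply two polynomials: the coefficient of x^k is the sum of a_i b_j with i + j = k."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b
    return product

f = [1, 2]       # 1 + 2x
g = [3, 0, 1]    # 3 + x^2
print(poly_add(f, g))   # [4, 2, 1], i.e. 4 + 2x + x^2
print(poly_mul(f, g))   # [3, 6, 1, 2], i.e. 3 + 6x + x^2 + 2x^3
```

The same code works over any ring whose elements support + and *, provided the literal 0 is replaced by that ring's zero.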

Exercise 4.1. Check this genuinely gives a ring structure on the set of
polynomials in x with coefficients in R.

Note that there is a natural embedding:

φ : R −→ R[x]
a −→ a (polynomial with m = 0 and a = a0 )

The image of this embedding is the set of constant polynomials.


Remark 4.21. 1. The zero and one elements in R[x] are the image of the
zero and one element in R under φ.

2. R commutative ⇒ R[x] commutative.

3. Given f (x) ∈ R[x] we can construct a map (of sets):

ϕf : R −→ R
a 7→ f (a),

where f(a) ∈ R is the element of R given by replacing x by a. For a
general ring R this process can be quite subtle as we shall see.

4. Alternatively, if we fix a and let f vary, we get a ring homomorphism

eva : R[x] → R

sending f (x) ∈ R[x] to f (a) ∈ R.


Definition 4.22. Let R be a ring and f ∈ R[x] be a non-zero polynomial.
We say that a ∈ R is a root, or zero, of f if f (a) = 0.

Note that a is a root of f if and only if f is in the kernel of ev_a.


Definition 4.23. Let R be a ring and f ∈ R[x] be a non-zero polynomial.
Hence we may write f = c_n x^n + c_{n−1} x^{n−1} + ··· + c_0, c_i ∈ R, c_n ≠ 0. We
call n the degree of f and write deg(f) = n. If in addition c_n = 1, we say
that f is monic. The elements of degree 0 are precisely the nonzero constant
polynomials.
Remark 4.24. If f (x) is the zero polynomial, then its degree is undefined.
One may also consider its degree to be −∞ and check that all of the state-
ments in the following theorem still make sense.

Theorem 4.25. The following facts are true about degree:

1. ∀f, g ∈ R[x] \ {0}, deg(f + g) ≤ max{deg(f ), deg(g)}

2. If deg(f + g) ≠ max{deg(f), deg(g)}, then f and g have the same
degree, and their leading coefficients are negatives (additive inverses)
of each other.

3. ∀f, g ∈ R[x] \ {0}, if deg(f) ≠ deg(g), then deg(f + g) = max{deg(f), deg(g)}

4. If R is entire, then ∀f, g ∈ R[x] \ {0}, fg ≠ 0 and deg(fg) = deg(f) +
deg(g).

Proof. By the definition of degree, (1) and (2) are clear. (3) follows easily
from (2). For (4):

Let deg(f) = n, deg(g) = m, and let a_n, b_m be the leading coefficients of f
and g respectively. Then the highest power of x appearing in fg is a_n b_m x^{n+m}. As
R is entire, a_n b_m ≠ 0 ⇒ fg ≠ 0 and deg(fg) = n + m = deg(f) + deg(g).

Corollary. R entire ⇒ R[x] entire.

Proof. Immediate from above.

Corollary. R an integral domain ⇒ R[x] an integral domain.

Proof. Immediate from above.

Example 4.26. Note that R = Z/15Z is not an entire ring. If we let
f(x) = [1] + [3]x and g(x) = [2] + [5]x, both in R[x], then deg(f) = deg(g) = 1,
but
deg(fg) = deg([2] + [11]x) = 1 ≠ deg(f) + deg(g) = 2.
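
The computation in Example 4.26 can be reproduced mechanically. The Python sketch below assumes polynomials over Z/15Z are stored as coefficient lists (constant term first) with residues represented by integers in 0, ..., 14. The names poly_mul_mod and degree are illustrative.

```python
def poly_mul_mod(f, g, m):
    """Multiply two coefficient lists, reducing every coefficient modulo m."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] = (product[i + j] + a * b) % m
    return product

def degree(f):
    """Index of the last non-zero coefficient, or None for the zero polynomial."""
    nonzero = [i for i, c in enumerate(f) if c != 0]
    return nonzero[-1] if nonzero else None

f = [1, 3]   # [1] + [3]x
g = [2, 5]   # [2] + [5]x
fg = poly_mul_mod(f, g, 15)

print(fg)           # [2, 11, 0]: the x^2 coefficient [3][5] = [15] vanishes
print(degree(fg))   # 1
```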

The process of adjoining indeterminates to a ring R can be iterated to form
polynomials in more than one variable with coefficients in R. We of course
use another symbol for each new indeterminate, i.e. R[x][y], polynomials in x and
y with coefficients in R, e.g. x^2 + y^2 x + x^3 y^6.

We simplify this notation to R[x][y] = R[x, y]. Inductively, we define

R[x1 , · · · , xn ] = R[x1 , · · · , xn−1 ][xn ]

f ∈ R[x_1, ···, x_n] has a unique expression of the form

$$f = \sum a_{i_1 \cdots i_n} x_1^{i_1} \cdots x_n^{i_n} \qquad (a_{i_1 \cdots i_n} \in R)$$

where the sum is finite.

Expressions of the form m_{(i)} = x_1^{i_1} ··· x_n^{i_n} are called monomials. The
example we’ll study most deeply is when R is a field.

Definition 4.27. Similar to the case of one indeterminate, we have evaluation
homomorphisms for polynomial rings in multiple variables. If a = (a_1, ···, a_n) ∈ R^n,
then we have an evaluation homomorphism

ev(a1 ,··· ,an ) = eva : R[x1 , · · · , xn ] → R

defined by sending f (x1 , · · · , xn ) ∈ R[x1 , · · · , xn ] to f (a1 , · · · , an ).

Remark 4.28. Notice that xi − ai ∈ ker eva for 1 ≤ i ≤ n.

4.5 Ring Extensions

Let R be a subring of S. Recall that this means R is a subgroup under
addition, is closed under multiplication and contains 1_S. In this case, we say
that S is a ring extension of R.

Given a ring extension S of R and (a1 , · · · , an ) ∈ S n , we have a more
general form of evaluation homomorphism:
ev(a1 ,··· ,an ) = eva : R[x1 , · · · , xn ] → S

Example 4.29. Let’s suppose that R = Q and S = R, and let α = √2.
Then we have the evaluation homomorphism
ev_√2 : Q[x] → R

The image of this homomorphism is the ring Q[√2]. It follows by the First
Isomorphism Theorem (for rings) that Q[√2] is isomorphic to the quotient of the
ring Q[x] by the kernel of ev_√2. Notice that x^2 − 2 is in this ideal; we will
explain in Proposition 5.19 and Remark 5.20 why this kernel is precisely the
principal ideal (x^2 − 2)Q[x].
Remark 4.30. In the preceding example, the kernel of ev_√2 contains no
nonzero polynomials of degree less than 2, because if it did, then √2 would
be rational.

However, if we were to take both the base ring and the extension to be the real
numbers, as in Remark 4.21(4), then the kernel of ev_√2 contains the linear
polynomial x − √2. Therefore, when considering the
kernel of an evaluation homomorphism, it’s important to specify the domain
of the homomorphism (which is not inherent in the notation ev_a).
Definition 4.31. The ring extension of R by {α1 , · · · , αn } ⊂ S is the image
of ev(α1 ,··· ,αn ) . Equivalently, it is the subring

R[α1 , · · · , αn ] = {f (α1 , · · · , αn ) | f ∈ R[x1 , · · · , xn ]}

This is the intersection of all subrings of S that contain both R and the
subset {α1 , · · · , αn }.

4.6 Field of Fractions

What is the process by which we go from (Z, +, ×) to (Q, +, ×)? Intuitively,
we are “dividing” through by all non-zero elements. Let us think more carefully
about what is actually happening and try to generalize the construction
to R an integral domain. What is an element of Q? We usually write it in
the form a/b with a, b ∈ Z, b ≠ 0. This is not unique: a/b = c/d ⇔ ad − bc = 0.
As we are all aware, we define + and × by the following rules:

1. $\frac{a}{b} + \frac{c}{d} = \frac{ad + cb}{bd}$

2. $\frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}$

We should therefore think of elements of Q as pairs of integers (a, b) such
that b ≠ 0, up to an equivalence relation.

(a, b) ∼ (c, d) ⇔ ad − cb = 0

Hence, Q can be thought of as (Z × (Z \ {0}))/∼. The well-definedness of +
and × is not obvious and needs checking, i.e. choosing different elements of
the same equivalence class should give the same results.

Let us now generalise this construction. Let R be an integral domain. We
define the relation ∼ on R × (R \ {0}) by:

(a, b) ∼ (c, d) ⇔ ad − bc = 0.
Proposition. ∼ is an equivalence relation.

Proof. 1. (a, b) ∼ (a, b) as ab − ab = 0 since R is commutative.


2. (a, b) ∼ (c, d) ⇒ ad − bc = 0 ⇒ bc − ad = 0 ⇒ (c, d) ∼ (a, b)
3. Let (a, b) ∼ (c, d) and (c, d) ∼ (e, f). Then ad − bc = 0, cf − de = 0.
Consider
(af − be)d = adf − bed
= f(ad − bc) + b(cf − de)
= f0 + b0 = 0
Since d ≠ 0 and R is an integral domain, af − be = 0 ⇒ (a, b) ∼ (e, f).

Let us denote the set of equivalence classes by (R × (R \ {0}))/∼. It is
convenient to use the usual notation: for (a, b) ∈ R × (R \ {0}) we denote the
equivalence class containing (a, b) by a/b. Let us define multiplication and
addition on (R × (R \ {0}))/∼ by

$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}, \qquad \frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}.$$

Proposition. + and × are well-defined on (R × (R \ {0}))/ ∼.

Proof. The first thing to note is that if b, d ∈ R \ {0} ⇒ bd ∈ R \ {0}


as R is an integral domain. We just need to check that choosing different
representatives gives the same answer. It’s just an exercise in keeping the
notation in order - you can do it.
Proposition. 0 ∈ (R × (R \ {0}))/∼ is given by the equivalence class
containing (0, 1). 1 ∈ (R × (R \ {0}))/∼ is given by the equivalence class
containing (1, 1).

Proof. For all (a, b) ∈ R × (R \ {0}),

$$\frac{a}{b} + \frac{0}{1} = \frac{a \times 1 + b \times 0}{b \times 1} = \frac{a}{b}, \qquad \frac{a}{b} \times \frac{1}{1} = \frac{a1}{b1} = \frac{a}{b}.$$

Both operations are clearly commutative because R is commutative. Hence
we are done.

It is a straightforward exercise to check that under these operations
(R × (R \ {0}))/∼ is a commutative ring. Also observe that (a, b) ∈ R × (R \ {0}) is
in the zero class if and only if a = 0. Similarly (a, b) gives the one class if and
only if a = b. This is good. It’s the same as in Q, so we’ve done something
right.
Theorem. (R × (R \ {0}))/ ∼ is a field.

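Proof. We just need to check non-zero elements have multiplicative inverses.
Let a/b ∈ (R × (R \ {0}))/∼ be non-zero. By the above this implies that
a ≠ 0. Hence b/a ∈ (R × (R \ {0}))/∼. But

$$\frac{a}{b} \times \frac{b}{a} = \frac{ab}{ab} = \frac{1}{1}.$$

Hence we are done.

Well-definedness of multiplication: Let (a_1, b_1) ∼ (a_2, b_2) and (c_1, d_1) ∼ (c_2, d_2).
Then a_1 b_2 = a_2 b_1 and c_1 d_2 = c_2 d_1, so (a_1 c_1)(b_2 d_2) = (a_2 c_2)(b_1 d_1),
which says (a_1 c_1, b_1 d_1) ∼ (a_2 c_2, b_2 d_2).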

Definition 4.32. Let R be an integral domain. The field of fractions of R
is the field Frac(R) := (R × (R \ {0}))/∼.

The canonical example is Frac(Z) = Q.
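
The construction of Frac(Z) can be imitated directly with pairs of integers. The following is a minimal Python sketch, assuming a fraction a/b is stored as the tuple (a, b) with b ≠ 0 and never reduced to lowest terms, so that equality has to go through the relation ∼. The function names are illustrative.

```python
def equivalent(p, q):
    """(a, b) ~ (c, d) iff ad - bc = 0."""
    (a, b), (c, d) = p, q
    return a * d - b * c == 0

def add(p, q):
    """a/b + c/d = (ad + bc)/(bd)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def mul(p, q):
    """a/b x c/d = (ac)/(bd)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

# Two representatives of 1/2 and two representatives of 1/3.
half, half_again = (1, 2), (2, 4)
third, third_again = (1, 3), (2, 6)

# Different representatives give equivalent sums and products.
sum_ok = equivalent(add(half, third), add(half_again, third_again))
product_ok = equivalent(mul(half, third), mul(half_again, third_again))

# A non-zero class a/b has inverse b/a.
inverse_ok = equivalent(mul((3, 4), (4, 3)), (1, 1))

print(add(half, third), add(half_again, third_again), sum_ok)
print(product_ok, inverse_ok)
```

This only spot-checks well-definedness on one pair of representatives; the general statement is the exercise left to the reader above.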


Definition 4.33. Given an integral domain R and indeterminates {x_1, ···, x_n}
we know that R[x_1, ···, x_n] is an integral domain. We define

R(x_1, ···, x_n) := Frac(R[x_1, ···, x_n]).


Theorem. The map

φ : R → Frac(R)
a ↦ a/1

is an embedding.

Proof. We need to check that φ is a homomorphism first.

1. Given a, b ∈ R, φ(a + b) = (a + b)/1 = a/1 + b/1 = φ(a) + φ(b).

2. Given a, b ∈ R, φ(ab) = ab/1 = a/1 × b/1 = φ(a)φ(b).

3. φ(1) = 1/1.

To check it is injective we just need to show that the kernel (as a
homomorphism of Abelian groups) is trivial.

φ(a) = a/1 = 0/1 ⇔ a = 0. Thus the kernel is trivial and so φ is injective.
Corollary 4.34. Every integral domain may be embedded in a field.

Proposition. Let R be a field. The natural embedding R ⊂ Frac(R) is an
isomorphism.

Proof. Let φ denote the natural embedding R ⊂ Frac(R). We must show φ is
surjective. Let a/b ∈ Frac(R). R is a field so there exists b^{−1}, a
multiplicative inverse to b. But a/b = ab^{−1}/1 = φ(ab^{−1}). Hence φ is
surjective. Therefore φ is an isomorphism.

This is backed up by our intuition. Clearly taking fractions of rationals just


gives the rationals again.

4.7 Characteristic

4.7.1 Characteristic in a General Ring

For any two rings A, B, we let Hom(A, B) denote the set of ring homomor-
phisms from A to B. We then have the following fact:

Fact 4.35. For any ring R, the set Hom(Z, R) has exactly one element.

This homomorphism from Z to R sends 0, 1 ∈ Z to 0_R, 1_R ∈ R, and more
generally sends n ∈ Z to n_R := n1_R ∈ R.

Definition 4.36. For a ring R, let I denote the kernel of the unique
homomorphism Z → R. Let m denote the unique non-negative integer
such that I = mZ. Then m is called the characteristic of R.

Fact 4.37. By the First Isomorphism Theorem for rings, the image of this
homomorphism is a subring of R isomorphic to Z/mZ (note that Z/0Z = Z,
and Z/1Z is the trivial ring).

Example 4.38. 1. Any subring of C has characteristic 0. This includes


Z, Q, R, etc.

2. A ring has characteristic 1 iff it is the trivial ring.

3. The ring Z/mZ has characteristic m. Thus Fp := Z/pZ has character-


istic p.

4. If R is a ring, then R[x] has the same characteristic as R.

5. More generally, if R is a subring of a ring S, then R and S have the


same characteristic.

6. As an example of the previous part, Fp [x] is a ring of finite characteristic


but infinitely many elements.

7. If R1 has characteristic m1 and R2 has characteristic m2 , then the


characteristic of R1 × R2 is LCM (m1 , m2 ).

8. For example, Z/2Z × Z/4Z has characteristic 4, Z/6Z × Z/4Z has
characteristic 12, and Z × R has characteristic 0 for any R.
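
The LCM rule for products can be checked by brute force for finite cyclic factors. The Python sketch below assumes both factors are of the form Z/mZ with m ≥ 1 and finds the characteristic as the smallest n > 0 with n·(1, 1) = (0, 0). The function names are illustrative.

```python
from math import gcd

def characteristic_of_product(m1, m2):
    """Smallest n > 0 with n * (1, 1) = (0, 0) in Z/m1Z x Z/m2Z (requires m1, m2 >= 1)."""
    n = 1
    while (n % m1, n % m2) != (0, 0):
        n += 1
    return n

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

print(characteristic_of_product(2, 4))   # 4
print(characteristic_of_product(6, 4))   # 12

# Compare with the LCM formula for all small moduli.
formula_ok = all(
    characteristic_of_product(a, b) == lcm(a, b)
    for a in range(1, 13) for b in range(1, 13)
)
print(formula_ok)
```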

In the last example, different elements have different additive orders (i.e.,
orders as elements of the abelian group (R, +)). In an entire ring, however,
every element has the same additive order. We therefore now focus our
attention on entire rings:

4.7.2 Characteristic in Entire Rings

Let R be entire (non-trivial with no zero-divisors). Recall that (R, +) is


an abelian group, hence given a ∈ R we may talk about its additive order.
Recall that if a ∈ R does not have finite order, then we say it has infinite
order.
Theorem. In an entire ring R, the additive order of every non-zero element
is the same. In addition, if this order is finite then it is prime.

Proof. Let a ∈ R \ {0} be of finite (additive) order k > 1, i.e. k is minimal


such that ka = 0. This implies (k × 1R )a = 0 ⇒ k × 1R = 0 as R is
entire and contains no zero-divisors. Therefore if we choose b ∈ R \ {0} then
kb = (k × 1R )b = 0 × b = 0 ⇒ every element has order dividing k. Choosing
a with minimal order k > 1 ensures that every nonzero element must have
order k. If no element has finite order, all elements must have infinite order.

Now assume that 1R ∈ R has finite order k > 1 and that we have factored
k = rs in N. Then k1R = (rs)1R = (r1R )(s1R ) = 0. Since R is entire, either
r1R = 0 or s1R = 0. However, since k is the smallest positive integer with
k1R = 0 and r, s ≤ k, we get r = k or s = k. Therefore, k must be prime.
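As a concrete illustration of the theorem, one can compare additive orders in Z/6Z (which has zero-divisors, so it is not entire) with those in Z/7Z (a field). This is only a sketch; the helper names are made up for the example.

```python
def additive_order(a, n):
    """Smallest k >= 1 with k * a congruent to 0 modulo n."""
    k = 1
    while (k * a) % n != 0:
        k += 1
    return k

def nonzero_orders(n):
    """Set of additive orders of the non-zero elements of Z/nZ."""
    return {additive_order(a, n) for a in range(1, n)}

print(sorted(nonzero_orders(6)))  # several different orders
print(sorted(nonzero_orders(7)))  # a single order
```

In Z/6Z the non-zero elements have orders 2, 3 and 6, while in Z/7Z every non-zero element has order 7, which is prime, as the theorem predicts for an entire ring.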

Fact 4.39. Suppose R is an entire ring. We say R has characteristic zero,
denoted char(R) = 0, if all of its non-zero elements have infinite additive order. If all
non-zero elements of R have additive order p ∈ N, then R has characteristic
p, or char(R) = p. In this case, we say R has finite characteristic.

When studying abstract fields, the characteristic is very important.

E.g. Q, R, C are all fields (hence entire) of characteristic zero. If p is a prime
number, Z/pZ is a field of characteristic p. We denote this latter field by Fp .

Theorem. There is an embedding of Q in any field F of characteristic 0.

Proof. Let 1F denote the multiplicative identity in F . Let 0F denote the


additive identity in F . We must find a suitable embedding of Q in F .
Because char(F ) = 0 the natural map:

φ:Z→F

n ↦ n1F

is injective. We claim that it is a homomorphism (of rings). Let a, b ∈ Z,


then φ(ab) = ab1F = ab1F 1F = a1F b1F = φ(a)φ(b); φ(a + b) = (a + b)1F =
a1F + b1F =φ(a) + φ(b). φ(1) = 1F . Thus φ is an injective homomorphism.

Now we will extend this notion to Q. We define the following map:

ψ : Q → F

n/m ↦ φ(n)φ(m)^{-1}

We must check that ψ is well defined and is an embedding.


For a, b, n, m ∈ Z with b, m ≠ 0, n/m = a/b ⇒ nb − am = 0. Therefore

φ(nb − am) = φ(0) = 0F = φ(nb) − φ(am) ⇒ φ(nb) = φ(am)
⇒ φ(n)φ(b) = φ(a)φ(m)
⇒ φ(n)φ(m)^{-1} = φ(a)φ(b)^{-1}
⇒ ψ(n/m) = ψ(a/b)

This shows that ψ is well defined.

Next: ψ is a homomorphism.

ψ(a/b + n/m) = ψ((am + bn)/(bm))
= (φ(a)φ(m) + φ(b)φ(n))φ(bm)^{-1}
= φ(a)φ(b)^{-1} + φ(n)φ(m)^{-1}
= ψ(a/b) + ψ(n/m)

ψ((a/b)(n/m)) = ψ((an)/(bm))
= φ(an)φ(bm)^{-1}
= φ(a)φ(n)φ(b)^{-1} φ(m)^{-1}
= φ(a)φ(b)^{-1} φ(n)φ(m)^{-1}
= ψ(a/b)ψ(n/m)

By definition ψ(1/1) = 1F . Thus we have a homomorphism. We claim that
it is injective.

We must show that the kernel (as a homomorphism of Abelian groups) is
trivial. Let n/m ∈ Q such that ψ(n/m) = 0. Then φ(n)φ(m)^{-1} = 0 ⇒ φ(n) =
0 ⇒ n = 0 as φ was already shown to be injective. Therefore the kernel is
trivial, so ψ is an embedding.
Theorem. Let p be a prime number and F a field of characteristic p. There
is an embedding of Fp into F .

Proof. Note that {0F , 1F , · · · , (p − 1)1F } ⊆ F is closed under + and ×,


hence forms a subring. Clearly Fp is isomorphic to this subring under the
embedding
ψ : Fp −→ F
[a] ↦ a1F

4.8 Principal, Prime and Maximal Ideals

Definition 4.40. An ideal I ⊂ R is proper if I ≠ R.

Note that I ⊂ R is proper if and only if R/I is a non-trivial ring.

Definition 4.41. Let R be a commutative ring. We say an ideal I ⊂ R is


principal if there exist a ∈ R such that I = {ra | r ∈ R}. In this case we
write I = (a).

Definition 4.42. Let R be a commutative ring. We say an ideal I ⊂ R is


prime if it is proper and given a, b ∈ R such that ab ∈ I then either a ∈ I or
b ∈ I.

Proposition 4.43. Let R be a commutative ring. Let I ⊂ R be an ideal.


Then I is prime if and only if R/I is an integral domain.

Proof. I is a proper ideal hence R/I is non-trivial.

Observe that R commutative trivially implies that R/I is commutative.


Let I ⊂ R be prime and assume that R/I has zero divisors. Then there
exist a, b ∈ R such that a, b ∉ I but (a + I)(b + I) = 0 + I. But this trivially
implies that ab ∈ I. But this contradicts the fact that I is prime.

Assume that R/I is an integral domain but I is not prime. Hence we can
find a, b ∈ R such that ab ∈ I but a, b ∉ I. But then (a + I) and (b + I) are
zero divisors, which is a contradiction.

Definition 4.44. Let R be a commutative ring. We say that an ideal is


maximal if it is maximal among the set of proper ideals. More precisely
I ⊂ R is a maximal ideal if given an ideal J ⊂ R such that I ⊂ J, then
either I = J or J = R.

Proposition 4.45. Let R be a commutative ring. Let I ⊂ R be an ideal.
Then I is maximal if and only if R/I is a field.

Proof. First observe that R commutative trivially implies that R/I is com-
mutative.

Assume that I ⊂ R is maximal. Take a non-zero element of R/I, i.e.
a + I for a ∉ I. Consider the ideal (a) ⊂ R. Consider the following new
ideal:

(a) + I = {ra + b | r ∈ R, b ∈ I}.


Note that this is certainly an ideal because it is closed under addition and
scalar multiplication by all R. Note that by construction I ⊂ (a) + I and
a ∈ (a) + I. Hence I is strictly contained in (a) + I. But I is maximal. Hence
(a) + I = R. Thus there exist r ∈ R and b ∈ I such that ra + b = 1. Hence
(r + I)(a + I) = ra + I = 1 + I. Thus (a + I) has a multiplicative inverse.
Hence R/I is a field.

Assume that R/I is a field. Assume that J is a proper ideal of R which
strictly contains I, i.e. I is not maximal. Let a ∈ J and a ∉ I. Thus (a + I)
is non-zero in R/I. Thus it has a multiplicative inverse. Hence there exists
b ∈ R such that ab + I = 1 + I. This implies that ab − 1 ∈ I, which in turn
implies that ab − 1 ∈ J. But a ∈ J, hence 1 ∈ J, which implies that J = R.
This is a contradiction. Hence I is maximal.

Corollary 4.46. Let R be a commutative ring. Let I ⊂ R be an ideal. Then


I maximal implies that I is prime.

Proof. I maximal ⇒ R/I is a field ⇒ R/I is an integral domain ⇒ I prime.

Example 4.47. 1. In R = Z, the ideal mZ is maximal iff m is a prime
number. The only nonmaximal prime ideal is {0}.

2. In a field, the ideal {0} is maximal.

3. In R = Q[x], the ideal {0} is prime but not maximal. The ideals xR,
(x − 3)R, (x^2 + 1)R, (x^2 − 2)R, and (x^2 − 3)R are maximal.
4. In R = R[x], the ideals (x^2 − 2)R and (x^2 − 3)R are not prime, but the
ideal (x^2 + 1)R is maximal (and therefore also prime).
5. If R = C[x] and f ∈ R, then the ideal f R is maximal iff f is a linear
polynomial with nonzero slope.
6. In R = Z[x], the ideals {0}, (7), and (x − 3) are prime but not maximal,
and the ideal (7, x − 3) is maximal.
7. If S is an integral domain, and R = S[x], then every ideal of the form
(x − s)R is prime, and such an ideal is maximal iff S is a field.
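Part 1 of this example, combined with Proposition 4.45, says that Z/mZ is a field exactly when m is prime. A brute-force sketch of that statement for small m (the function names are illustrative only):

```python
def is_field_mod(m):
    """True if every non-zero class in Z/mZ has a multiplicative inverse."""
    if m < 2:
        return False  # Z/1Z is the trivial ring, which is not a field
    return all(
        any((a * b) % m == 1 for b in range(1, m))
        for a in range(1, m)
    )

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % d != 0 for d in range(2, int(m ** 0.5) + 1))

for m in range(2, 13):
    print(m, is_field_mod(m), is_prime(m))
```

The two columns agree for every m printed, matching the claim that mZ is maximal iff m is prime.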

5 Polynomials and Factorization

5.1 Factorisation in Integral Domains

Let R be a ring. In Z we have the “Fundamental Theorem of Arithmetic” -


every non-zero element of Z is ±1 times a unique product of prime numbers.
Does something analogous hold for R? Clearly, if R is not commutative or
has zero-divisors the issue is very subtle. Hence we will restrict to the case
when R is an integral domain.

At some point, mathematicians proved that the unique factorization theorem
holds in some sense in rings such as Q[x], Fp [x], and Z[i] (in fact, it
holds in F [x], where F is any field). At some point in the 19th century, they
realized, to their dismay, that it does not hold in every integral domain. The
most basic example is that of R = Z[√−5], in which the equality

6 = (2)(3) = (1 + √−5)(1 − √−5)

demonstrates that unique factorization does not hold in R.

In this section, we will define some useful terms to explain what we even
mean by unique factorization in a general integral domain.

Let a, b ∈ R. As in Z, a | b will mean that ∃ c ∈ R such that b = ac.

5.1.1 Associated Elements

The first thing we have to deal with is what “unique” means in unique fac-
torization. More specifically, in Z, there is the subtlety that a and −a are
essentially the same as far as divisibility is concerned. We formalize this with
the following notion:
Definition. Two non-zero elements a, b in an integral domain R are associ-
ated if a | b and b | a, i.e. ∃ c, d ∈ R such that b = ac and a = bd.
Theorem 5.1. Let R be an integral domain, and let a, b ∈ R be two non-zero
elements. Then a and b are associated ⇔ a = bu for some u ∈ R∗ .

Proof. Association of a and b ⇒ a | b and b | a ⇒ ∃c, d ∈ R such that a = bd
and b = ac ⇒ a = acd ⇒ a = 0 or cd = 1. The case a = 0 is excluded
by assumption. Thus we have cd = 1 ⇒ c, d are inverses of each other
and thus units, so a = bd with d ∈ R∗ . Conversely, if a = bu with u ∈ R∗ ,
then b | a, and b = au^{-1} shows that a | b.
Theorem 5.2. Let R be an integral domain with a, b ∈ R. Then (a) ⊂ (b) ⇔
b | a. Hence a and b are associated if and only if (a) = (b).
Example 5.3. 1. In Z, m and n are associated if and only if n = ±m.
2. In Z[i], z and w are associated if and only if z equals ±w OR ±iw.

3. In Z[√2], there are infinitely many units, including all the integer powers
of 1 + √2. Therefore, if α ∈ Z[√2], then all elements of the form
α(1 + √2)^n , for n ∈ Z, are associated to α. Thus every nonzero element
has infinitely many associates.
4. If F is a field, and R = F [x] then the units of R are precisely the
polynomials of degree 0 (aka the nonzero constant polynomials). Then
f, g ∈ R are associate iff one is a constant multiple of the other.
5. In general, if S is an integral domain, and R = S[x], then S × = R× .
Thus Z[x]× = {±1}, and two polynomials are associate in Z[x] iff they
are equal or negatives of each other.

6. In Z[1/2], the units are all numbers of the form ±2^n for n ∈ Z.

7. In Z[1/6], the units are all numbers of the form ±2^n 3^m for m, n ∈ Z.
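Part 3 of this example can be explored numerically. The sketch below stores a + b√2 as the integer pair (a, b) and uses the norm N(a + b√2) = a^2 − 2b^2, a standard tool that is not introduced in these notes and is assumed here: an element of Z[√2] is a unit exactly when its norm is ±1.

```python
def mul(x, y):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), stored as pairs (a, b), (c, d)."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    """N(a + b*sqrt(2)) = a^2 - 2*b^2 (assumed helper, not from the notes)."""
    a, b = x
    return a * a - 2 * b * b

def powers_of_unit(count):
    """The first `count` positive powers of 1 + sqrt(2)."""
    unit = (1, 1)
    current = (1, 0)  # the element 1
    result = []
    for _ in range(count):
        current = mul(current, unit)
        result.append(current)
    return result

for p in powers_of_unit(5):
    print(p, norm(p))
print(mul((1, 1), (-1, 1)))  # (1 + sqrt 2)(-1 + sqrt 2)
```

Every power printed has norm ±1 and the powers are pairwise distinct, which is consistent with the claim that there are infinitely many units. The last line shows that −1 + √2 is the inverse of 1 + √2.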

5.1.2 Irreducible and Prime Elements

The second issue to address is what a prime element of R should mean. The
problem, as we will see, is that we can easily come up with different
natural definitions which are equivalent in Z, but may not be equivalent
in every integral domain. The two notions are those of prime element
and irreducible element, which are equivalent in Z but not, for example, in
Z[√−5]. As we shall see, 2, 3, 1 + √−5, 1 − √−5 are all irreducible, but not
prime, in Z[√−5].

Definition 5.4. We call a ∈ R\{0} an irreducible element of R if it is a


non-unit and is NOT the product of two non-units.

Remark 5.5. Notice that whether an element is irreducible depends on


which ring R you’re considering. For example, 5 is irreducible in Z, but not
irreducible in Z[i], as 5 = (1 + 2i)(1 − 2i).
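The factorization of 5 comes from writing 5 = 1^2 + 2^2: whenever p = a^2 + b^2, we get p = (a + bi)(a − bi) in Z[i]. The sketch below searches for such a representation; Gaussian integers are stored as integer pairs, and the function names are illustrative only.

```python
from math import isqrt

def gauss_mul(z, w):
    """Multiply a + bi and c + di, stored as integer pairs (a, b) and (c, d)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def sum_of_two_squares(p):
    """Return (a, b) with a*a + b*b == p and 0 < a <= b, or None if none exists."""
    a = 1
    while 2 * a * a <= p:
        rest = p - a * a
        b = isqrt(rest)
        if b * b == rest:
            return (a, b)
        a += 1
    return None

print(gauss_mul((1, 2), (1, -2)))  # (1 + 2i)(1 - 2i)
for p in (2, 3, 5, 7, 13):
    print(p, sum_of_two_squares(p))
```

For 5 and 13 the search succeeds, giving a factorization into two non-units of Z[i]; for 3 and 7 it returns None, so this particular construction does not apply to them.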

Definition 5.6. An element a ∈ R is prime if the ideal (a) = aR is a prime


ideal.

Notice that a is prime if a | bc implies a | b or a | c. In particular, Euclid’s


Lemma states that prime numbers are prime in the sense of Definition 5.6.

Proposition 5.7. If a is prime in R, then it is irreducible.

Proof. Suppose that a were reducible, i.e., a = bc, where b and c are non-units.
Then a | bc, so a | b or a | c because a is prime; WLOG let a | b. Since a = bc,
we also have b | a, so a and b are
associated, hence Theorem 5.1 tells us that a = bu for a unit u. But then
bu = bc, so since R is an integral domain, we have c = u, contradicting the
assumption that c is not a unit.

Example 5.8. 1. In Z, every irreducible element is prime. In fact, this


is true in any UFD (definition below).


2. In Z[√−5], the element 2 is irreducible, but not prime, because it
divides (1 + √−5)(1 − √−5) yet divides neither 1 + √−5 nor 1 − √−5.

If a is irreducible or prime, then so are all its associates.

In Z, m is irreducible if and only if it is ±1 times a prime.
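The claims about 2 in Z[√−5] can be checked with a short computation. This sketch stores a + b√−5 as the pair (a, b) and relies on the multiplicative norm N(a + b√−5) = a^2 + 5b^2, which is a standard tool assumed here rather than taken from the notes. Since N(2) = 4 and the only elements of norm 1 are ±1, a factorization of 2 into two non-units would require an element of norm 2.

```python
from math import isqrt

def mul(x, y):
    """Product of a + b*sqrt(-5) and c + d*sqrt(-5), stored as pairs."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)

def has_element_of_norm(n):
    """True if a^2 + 5*b^2 == n has an integer solution (a, b)."""
    bound = isqrt(n)
    return any(
        a * a + 5 * b * b == n
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
    )

def integer_divides(k, x):
    """True if the ordinary integer k divides a + b*sqrt(-5) in Z[sqrt(-5)]."""
    a, b = x
    return a % k == 0 and b % k == 0

print(mul((1, 1), (1, -1)))                            # the product equals 6
print(has_element_of_norm(2), has_element_of_norm(3))  # no elements of norm 2 or 3
print(integer_divides(2, (6, 0)), integer_divides(2, (1, 1)), integer_divides(2, (1, -1)))
```

So 2 divides the product 6 but neither factor, and no element of norm 2 exists, which is the computational content of "irreducible but not prime".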

5.1.3 Unique Factorization Domains

The Fundamental Theorem of Arithmetic says that every m ∈ Z can be


factored into irreducible elements in “essentially” one way. Here, essentially
means up to switching irreducibles for associated irreducibles, i.e. 10 =
2 × 5 = (−2) × (−5). This motivates the important definition:

Definition. A unique factorization domain (UFD) is an integral domain


in which every element NOT zero or a unit can be written as the product of
irreducibles. Moreover, given 2 complete factorizations of the same element

x = a1 · · · an = b1 · · · bm ,

into irreducibles, n = m and after renumbering ai is associated to bi for all


i ∈ {1, · · · , n}.

Clearly Z is a UFD by the Fundamental Theorem of Arithmetic. A natural


question to ask is whether all integral domains are UFDs. The answer, rather
surprisingly, is no.

Let R be a UFD. Many of the properties of Z carry over to R. For example


we can talk about highest common factor (HCF) and least common multiple
(LCM) for two a, b ∈ R \ {0}.

Definition. Given a, b ∈ R \ {0}, a highest common factor of a and b is
an element d ∈ R such that

1. d | a and d | b

2. Given d′ ∈ R such that d′ | a and d′ | b, then d′ | d.

Definition. Given a, b ∈ R \ {0}, a lowest common multiple of a and b is
an element c ∈ R such that

1. a | c and b | c

2. Given c′ ∈ R such that a | c′ and b | c′ , then c | c′ .

Remark 5.9. 1. It should be observed that there is no reason to believe


that HCFs and LCMs exist in an arbitrary integral domain. Indeed it
is not true in general.

2. Clearly a HCF (if it exists) is NOT unique: If d is an HCF of a and b
then so is d′ for d′ associated to d. Similarly for LCM. Hence when we
talk about the HCF or LCM of two elements we must understand they
are well defined only up to association.

Theorem. In a UFD any two non-zero elements have a HCF. Moreover,
if a = u p_1^{α_1} · · · p_r^{α_r} and b = v p_1^{β_1} · · · p_r^{β_r} where u, v are units, and the p_i are
pairwise non-associated irreducible elements, then HCF (a, b) = p_1^{γ_1} · · · p_r^{γ_r}
where γ_i = min(α_i , β_i ).

Proof. Let d be a common factor of a and b. By the uniqueness of complete
factorisation we know that (up to association) d is a product of p_i for i ∈
{1, · · · , r}. Without loss of generality we may therefore assume that d =
p_1^{δ_1} · · · p_r^{δ_r} . Again by the uniqueness of complete factorisation d is a common
factor of a and b ⇔ δ_i ≤ α_i and δ_i ≤ β_i ∀i. Therefore, δ_i ≤ γ_i ⇒ HCF (a, b) =
p_1^{γ_1} · · · p_r^{γ_r} .

Proposition. In a UFD any two non-zero elements have a LCM. Moreover,
if a = u p_1^{α_1} · · · p_r^{α_r} and b = v p_1^{β_1} · · · p_r^{β_r} where u, v are units, and the p_i are
pairwise non-associated irreducible elements, then LCM (a, b) = p_1^{γ_1} · · · p_r^{γ_r}
where γ_i = max(α_i , β_i ).

Proof. Exactly the same argument as above works in this case observing that
d = p_1^{δ_1} · · · p_r^{δ_r} is a common multiple of a and b if and only if δ_i ≥ α_i and δ_i ≥ β_i
for all i ∈ {1, · · · , r}.
Remark 5.10. If a ∈ R is a unit then

HCF (a, b) = 1, LCM (a, b) = b ∀b ∈ R \ {0}
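In the UFD Z, the min/max exponent formulas for HCF and LCM can be tried out directly. The following sketch restricts to positive integers and factors by trial division; the function names are illustrative, and the comparison with `math.gcd` is only a sanity check.

```python
from math import gcd

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def combine(a, b, pick):
    """Product of p ** pick(alpha_p, beta_p) over all primes p dividing a or b."""
    fa, fb = factorize(a), factorize(b)
    result = 1
    for p in set(fa) | set(fb):
        result *= p ** pick(fa.get(p, 0), fb.get(p, 0))
    return result

def hcf_by_exponents(a, b):
    return combine(a, b, min)

def lcm_by_exponents(a, b):
    return combine(a, b, max)

a, b = 360, 84  # 360 = 2^3 * 3^2 * 5 and 84 = 2^2 * 3 * 7
print(hcf_by_exponents(a, b), lcm_by_exponents(a, b), gcd(a, b))
```

Here the HCF is 2^2 · 3 = 12 and the LCM is 2^3 · 3^2 · 5 · 7 = 2520, and their product equals ab.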

5.2 Remainder Theorem for Polynomials

Recall that a polynomial f (x) = a0 + a1 x + · · · + an x^n ∈ R[x] with
an ≠ 0R is said to have degree n. Note that degree is defined only when f (x)
is not the zero polynomial. Recall that if R is an integral domain, then
deg f (x)g(x) = deg f (x) + deg g(x).

Note that a polynomial f (x) has degree zero if and only if it is nonzero
and constant.

We then have the remainder theorem for polynomials:


Theorem 5.11. Let F be a field, and let f (x), g(x) ∈ F [x], with g(x) ≠ 0
(i.e., it is not the zero polynomial). Then there exist q(x), r(x) ∈ F [x] such
that
f (x) = q(x)g(x) + r(x),
where either r(x) = 0, or deg r(x) < deg g(x).

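Proof. If f = 0, take q = r = 0. Otherwise let f = a_0 + a_1 x + · · · + a_n x^n ,
g = b_0 + b_1 x + · · · + b_m x^m where
a_i , b_j ∈ F, n, m ∈ N ∪ {0}, and a_n ≠ 0, b_m ≠ 0.

If deg(f ) < deg(g), take q = 0 and r = f .

Assume deg(f ) ≥ deg(g) ⇒ n ≥ m ⇒ n − m ≥ 0 ⇒ x^{n−m} ∈ F [x] ⇒
x^{n−m} b_m^{-1} a_n g has leading term a_n x^n ⇒ f − x^{n−m} b_m^{-1} a_n g is zero or has
degree less than deg(f ).

Hence setting c = a_n b_m^{-1} x^{n−m} , the polynomial f − cg is either zero or
satisfies deg(f − cg) < deg(f ). Repeating this step with f − cg in place of f
(formally, by induction on deg(f )), we obtain q_1 , r with f − cg = q_1 g + r, where
r = 0 or deg r < deg g. Then f = (c + q_1 )g + r.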
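The degree-lowering step in this proof is exactly one step of polynomial long division. Below is a minimal sketch over F = Q, assuming polynomials are stored as coefficient lists with the lowest degree first and using `fractions.Fraction` for exact arithmetic; the function names are illustrative only.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and either r == [] or deg r < deg g."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("g must be a non-zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while r and len(r) >= len(g):
        shift = len(r) - len(g)   # n - m in the proof
        c = r[-1] / g[-1]         # a_n * b_m^{-1}, which needs F to be a field
        q[shift] = c
        for i, gi in enumerate(g):
            r[i + shift] -= c * gi  # subtract c * x^(n-m) * g
        r = trim(r)               # the leading term has cancelled
    return trim(q), r

# (x^3 - 2x + 1) divided by (x - 1), and (x^2 + 1) divided by 2x
print(poly_divmod([1, -2, 0, 1], [-1, 1]))
print(poly_divmod([1, 0, 1], [0, 2]))
```

The division `r[-1] / g[-1]` is the only place where inverses of coefficients are used, which is why the argument works over a field but not over Z.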

Remark 5.12. Notice this proof crucially uses the fact that F is a field,
because you might have to divide by a coefficient. Therefore, the theorem
is false for Z[x] in place of F [x], as can be seen by taking f (x) = x and
g(x) = 2.
Remark 5.13. Notice that this looks very similar to the Remainder Theorem
for integers (2.5), with the absolute value in place of the degree function. The
notion of Euclidean Domain is a generalization of both of these examples,
and the absolute value (in the case of Z) and the degree (in the case of F [x])
are examples of Euclidean functions.

You don’t technically need to know the term “Euclidean domain,” but
you should understand the similarity between the Remainder Theorem for
Z and that for F [x]. And that similarity is precisely what the notion of
“Euclidean domain” is about.

The Remainder Theorem is useful because it allows one to show that F [x]
is a PID, which also implies that it is a UFD. We now talk about PID’s.

5.3 PID

Definition 5.14. An integral domain R is a principal ideal domain (PID) if


every ideal of R is of the form aR = (a) for some element a ∈ R.

Here are a few non-examples:


Example 5.15. The ring R = F [x, y] is not a PID. One may check that the
ideal (x, y) is not principal.
Example 5.16. The ring R = Z[x, y] is not a PID. One may check that the
ideal (2, x) is not principal.

Example 5.17. The ring R = Z[√−5] is not a PID. This is a bit harder,
and it follows from 6(b) on HW 11.

Here are some examples:

Example 5.18. The rings Z, Z[i], Z[√2], Z[√−2], and Z[(√−163 + 1)/2] are
PID’s. The first four can be proven using methods similar to those used for
F [x] below; the last one is harder to prove (and is not something we will
cover).

5.3.1 RT implies PID

We now explain how the remainder theorem can be used to show that F [x]
is a PID.

Proposition 5.19. The ring R = F [x] is a PID.

Proof. Let I be an ideal in R. If I = {0} or I = R, then I is principal (as it


is (0) or (1), respectively).

If not, then I has at least some nonzero element, call it f (x). Then f (x)
has a degree, which is a non-negative integer. If f (x) has the smallest possible
degree among nonzero elements of I, then we fix f (x); if not, we replace f (x)
with an element of I with smallest possible degree (there is always a smallest
possible degree, because the degree is ≥ 0). Let this element be g(x).

We want to show that I = g(x)R. For this, let h(x) be a general element
of I. We want to show that h(x) is a multiple of g(x). For this, apply the
remainder theorem to find h(x) = q(x)g(x) + r(x), where r(x) is zero or has
smaller degree than g(x). Note that because g(x), h(x) ∈ I, we have r(x) =
h(x) − q(x)g(x) ∈ I. Therefore, r(x) cannot have smaller degree than g(x)
(by the definition of g(x)), so r(x) = 0. That means that h(x) = g(x)q(x),
so g(x) divides h(x), as desired.

Remark 5.20. The proof of Proposition 5.19 implies that if I is a nonzero ideal and
f is a nonzero element of I of minimal degree, then f generates I. This is explained
in more detail in Fact 5.30 below.

5.3.2 Consequences of Being a PID

One important fact about PID’s is that they are UFD’s. We will not give
the entire proof of this fact. The proof has two important steps:

1. Showing that factorization into irreducibles exist

2. Showing that that factorization is unique

1. can be proven using the stuff about ascending chains of ideals in 4.10 of
Paulin, but we won’t worry about that. For 2., an important step is showing
that all irreducible elements are prime (this fact is both true in a UFD and
is a step in proving that a given ring is a UFD).
Example 5.21. To see why “irreducible implies prime” is related to uniqueness
of factorization, consider the ring Z[√−5], which is neither a PID nor
a UFD. Then the fact that (2)(3) = (1 + √−5)(1 − √−5) are two different
factorizations into irreducibles of the same element (6) is related to the fact
that 2 is not prime. Indeed, 2 is irreducible, divides (1 + √−5)(1 − √−5),
but does not divide either factor. So 2 is irreducible but not prime.

Let’s now prove that every irreducible element of a PID is prime. Remember
that a nonzero element is prime if and only if the ideal it generates
is prime (by definition), and recall that maximal ideals are always prime.
Therefore, it suffices to prove that every irreducible element generates an
ideal that is maximal:
Proposition 5.22. Let R be a PID, and suppose a ∈ R is irreducible. Then
aR is maximal.

Proof. First, note that aR is not R, or else a would be a unit, and by defi-
nition irreducible elements are not units.

Now suppose that J is an ideal with aR ⊆ J ⊆ R. As R is a PID, we


have J = bR for some b ∈ R. Since a ∈ aR ⊆ J, we know that b | a. As a is
irreducible, this means that either b is a unit, or b is associated to a. In the
former case, we have J = R, and in the latter case, we have J = aR. As J
was arbitrary, this means that aR is maximal.

As mentioned before, maximal implies prime (for ideals). And in any


UFD, every irreducible element is prime. HOWEVER, in UFD’s that are
NOT PID’s, there can be nonzero non-maximal primes. For example:

Example 5.23. In R = Z[x], let I = xR. Then x is in fact prime, but
R/I ≅ Z, which is an integral domain but not a field. Therefore, I is
prime but not maximal. This essentially happens because R is a UFD but
not a PID.

In fact, note that the ONLY non-maximal prime ideal in a PID is the
zero ideal. Therefore, if R is a PID, then every prime ideal is either aR for
a ∈ R irreducible or {0}. This latter fact can be used to prove, for example,
that Q[√2] is not just a ring but also a field; for it is the quotient of Q[x] by
the ideal generated by the irreducible polynomial x^2 − 2, and this ideal must
be maximal.

Here’s another fact that holds in PID’s but not in general UFD’s: Recall
that HCF’s always exist in a UFD. In a PID, we can say a little bit more
about HCF’s than just that they exist. Specifically, we can say the following:

Proposition 5.24. Let R be a PID, x, y ∈ R, and d an HCF of x and y.


Then there exist a, b ∈ R such that ax + by = d.

Proof. Let I be the ideal generated by x and y. Then I is the set of all
elements of R of the form ax + by for a, b ∈ R. Thus we have to show that
d ∈ I.

Because R is a PID, we know that I is principal, say I = zR for some


z ∈ R. Then since x, y ∈ I, we know z | x and z | y. By definition of HCF,
we find that z | d. But then d ∈ I = zR, so we are done.
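In the PID Z, the coefficients a and b of Proposition 5.24 can be computed explicitly with the extended Euclidean algorithm. Note that this is a different, constructive route from the ideal-theoretic proof above; it is included only as an illustration, and the function name is made up for the example.

```python
def extended_gcd(x, y):
    """Return (d, a, b) with a*x + b*y == d, where d >= 0 is an HCF of x and y."""
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: old_r == old_a*x + old_b*y and r == a*x + b*y
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    if old_r < 0:
        # HCFs are only defined up to a unit; normalise to the non-negative one
        old_r, old_a, old_b = -old_r, -old_a, -old_b
    return old_r, old_a, old_b

d, a, b = extended_gcd(240, 46)
print(d, a, b, a * 240 + b * 46)
```

The output exhibits d = 2 as an integer combination of 240 and 46, i.e. d lies in the ideal generated by the two numbers.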

5.4 Factorization of Polynomials

We collect some facts about factorization of polynomials that will be useful


in our discussion of field extensions in Section 5.5. These facts are mostly
just summaries of what was discussed in the previous two sections.

First, here’s a description of all the ideals in F [x]:

Fact 5.25. 1. Every ideal in F [x] is of the form (f (x)) for f (x) ∈ F [x].

2. Two such ideals (f (x)) and (g(x)) are the same ideal if and only if g(x)
is a nonzero constant multiple of f (x).

3. The only non-maximal prime ideal is (0).

4. The maximal ideals are precisely those of the form (p(x)) for an irre-
ducible polynomial p(x).

5. If I is a nonzero ideal, then it has a unique monic generator (recall that


monic means the leading coefficient is 1).

Recall that whether a given polynomial is irreducible depends on F . For
example, x^2 − 2 is irreducible in Q[x] but not in R[x].

Now let’s talk about how to recognize whether a given element generates
an ideal. First, a definition:

Definition 5.26. If I is an ideal in R = F [x], then f (x) ∈ I \ {0} has minimal degree
in I if for any g(x) ∈ I \ {0}, we have deg f (x) ≤ deg g(x).

Example 5.27. In C[x], the ideal generated by x^3 − x + 1 has no elements
of degrees 0, 1, or 2. All of its elements are either 0 (which does not have a
degree) or have degree at least 3.

Example 5.28. More generally, if f (x) is a polynomial of degree n in F [x],


then (f (x)) has no elements of degree less than n.

Example 5.29. An ideal I ⊆ F [x] has elements of degree zero if and only
if it is the whole ring.

We then have the following facts, which essentially follow from the proof
of 5.11 and from the facts mentioned above:

Fact 5.30. 1. If I is an ideal of F [x], and if f (x) ∈ I has minimal degree,


then f (x) generates I.

2. Any two elements of I of minimal degree are constant multiples of each


other.

3. If I is a nonzero ideal, then I has an element of minimal degree.

4. If I is generated by 0 ≠ f (x) ∈ F [x], then f (x) has minimal degree in
I.

Finally, we note the following important fact:

Proposition 5.31. If I is an ideal of R = F [x] not equal to R, and p(x) ∈ I is an irreducible
element of F [x], then p(x) generates I. In particular, p(x) has minimal
degree in I, and any other irreducible q(x) ∈ I is a nonzero constant multiple
of p(x).

Proof. The ideal (p(x)) is then contained in I. Since (p(x)) is maximal, and
I is not all of R, we must have I = (p(x)).

5.4.1 Linear Factors of Polynomials

Finally, here’s an important consequence of the Remainder Theorem. This


gives a relationship between factorization of polynomials and roots of poly-
nomials.

Lemma 5.32. If F is a field, α ∈ F , and f (x) ∈ F [x], then (x − α) | f (x)


if and only if f (α) = 0.

Proof. If (x − α) | f (x), then f (x) = (x − α)g(x) for some g(x) ∈ F [x].


Therefore, f (α) = (α − α)g(α) = 0g(α) = 0.

Conversely, suppose f (α) = 0. By the Remainder Theorem, we can write

f (x) = q(x)(x − α) + r(x),

where r(x) is either zero or has degree 0. Therefore, r(x) is constant. But
r(α) = f (α)−q(α)(α−α) = 0, so r(x) is just 0. Therefore, f (x) = q(x)(x−α),
so (x − α) | f (x).

One can repeatedly apply Lemma 5.32 to show that a polynomial of


degree n can have at most n roots.
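Lemma 5.32 can be seen in action with Horner's scheme (synthetic division), which produces the quotient and the remainder f (α) at the same time. The sketch below works over Q with coefficient lists stored lowest degree first and assumes f is given by a non-empty list; the function name is illustrative.

```python
from fractions import Fraction

def divide_by_linear(f, alpha):
    """Return (q, r) with f(x) = q(x) * (x - alpha) + r, where r == f(alpha)."""
    alpha = Fraction(alpha)
    value = Fraction(0)
    partial = []
    for c in reversed(f):          # start from the leading coefficient
        value = value * alpha + c
        partial.append(value)
    remainder = partial.pop()      # the last Horner value is f(alpha)
    quotient = partial[::-1]       # back to lowest degree first
    return quotient, remainder

f = [1, -2, 0, 1]                  # x^3 - 2x + 1
print(divide_by_linear(f, 1))      # remainder 0, so (x - 1) divides f
print(divide_by_linear(f, 2))      # non-zero remainder, so (x - 2) does not
```

For α = 1 the remainder is 0 and the quotient is x^2 + x − 1; for α = 2 the remainder is f (2) = 5.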

5.5 Ring and Field Extensions

Recall that if R is a subring of a ring S, and α ∈ S, then there is an evaluation


homomorphism
evα : R[x] → S,
sending f (x) ∈ R[x] to f (α) ∈ S. The image is a subring of S denoted R[α],
and R[α] is the smallest subring of S containing R and α.

If R and S are both fields, then we let R(α) denote the smallest subfield
of S containing R and α. Note that we always have R[α] ⊆ R(α), and these
are equal if and only if R[α] is already a field. We want to understand when
this does and doesn’t happen.

We now set F = R and E = S. The pair of E and F , often denoted E/F ,


is called a field extension. Note that this is NOT any kind of quotient - the
use of “/” is just historically a piece of notation used for field extensions.

Given a field extension E/F and α ∈ E, we set

Iα := ker evα .

Note that Iα is an ideal in F [x], so we can apply everything we know from


Section 5.4 to it.

By the first isomorphism theorem for rings, we have F [α] ≅ F [x]/Iα . Note
that E is a field and therefore an integral domain, so it has no zero-divisors.
But that means that F [α], being a subring of E, also has no zero-divisors.
Therefore, F [α] is an integral domain (ID), so Iα is a prime ideal in F [x].

By Fact 5.25, we either have Iα = (p(x)) for p(x) an irreducible element of


F [x], or Iα = {0}. We distinguish these two cases with a pair of definitions:

Definition 5.33. If E/F is a field extension and α ∈ E, then α is algebraic


over F if Iα has a nonzero element. Equivalently, α is the root of a nonzero
polynomial with coefficients in F .

Definition 5.34. If E/F is a field extension and α ∈ E, then α is transcen-


dental over F if Iα = {0}. Equivalently, α is not the root of any nonzero
polynomial with coefficients in F .

By Fact 5.25, we find that F [α] is a field (and hence F [α] = F (α)) if and
only if α is algebraic over F .

Example 5.35. Any element of F itself is trivially algebraic over F .


Example 5.36. The numbers √2, √3, ∛2, and more generally, ⁿ√a for
a ∈ Q and n ∈ N are algebraic over Q.

Example 5.37. The complex number i is algebraic over Q (and over R).

Example 5.38. The numbers e = 2.71828... and π = 3.14159... are transcendental
over Q. See https://en.wikipedia.org/wiki/Lindemann%E2%80%93Weierstrass_theorem
for some of the history of this. It was a conjecture
of Lambert in 1768, but only proven (in the case of π) in 1882 by
Lindemann.

Example 5.39. The number π is algebraic over R, even though it is transcendental
over Q. In fact, it is also algebraic over a field like Q(π^2 ).

The last two examples show that whether an element is transcendental


or algebraic depends on the field F . Classically, people only considered the
following definition:

Definition 5.40. A complex number α is said to be an algebraic number
if it is algebraic over Q. It is said to be a transcendental number if it is
transcendental over Q.

For some time, people didn’t know if there even were transcendental numbers.
The first number proven to be transcendental was

Σ_{n=1}^{∞} 1/10^{n!} ,

which was done by Liouville. You can find out about more numbers that are
known to be transcendental at
https://en.wikipedia.org/wiki/Transcendental_number#Numbers_proven_to_be_transcendental.
Remark 5.41. This material is non-examinable: The subset of C consisting
of all algebraic numbers is denoted Q̄ (Q with an overline). It turns out that this subset is in fact a
subfield, and it is known as the algebraic closure of Q. It is algebraically closed
in the sense that every non-constant polynomial with coefficients in it has a root (and in fact
splits into linear factors). You can read more at
https://en.wikipedia.org/wiki/Algebraic_number#The_field_of_algebraic_numbers and the
links contained therein.

5.5.1 Minimal Polynomials

Let E/F be a field extension, and suppose that α ∈ E is algebraic. Then the
ideal Iα ⊆ F [x] has a unique monic generator by Fact 5.25. This generator
is called the minimal polynomial of α over F .

How can we tell if a given polynomial is the minimal polynomial of a
given α ∈ E? Well if p(x) ∈ F [x] is irreducible such that p(α) = 0, then Proposition
5.31 implies that p(x) generates Iα . Therefore, the unique monic constant multiple of
p(x) is the minimal polynomial of α.

Note that α ∈ F if and only if its minimal polynomial is x − α (or


equivalently, as long as its minimal polynomial is linear).

Example 5.42. The minimal polynomial of √2 over Q is x^2 − 2.

Example 5.43. The minimal polynomial of √2 over Q[√2] is just x − √2.

Example 5.44. The minimal polynomial of i over Q, or even over R, is
x^2 + 1.

Example 5.45. The minimal polynomial of the Golden Ratio (√5 + 1)/2 over
Q is x^2 − x − 1.

In order to find the minimal polynomial of α, it suffices to find a polyno-


mial f (x) ∈ F [x] such that f (α) = 0 and then show that f (x) is irreducible.
This latter step can be tricky.
Example 5.46. The minimal polynomial of α = e^{2πi/7} = cos(2π/7) + i sin(2π/7) over Q
is f (x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1. Note that

f (x) = (x^7 − 1)/(x − 1),

so it is clear that f (α) = 0. To show that f (x) is irreducible in Q[x], one
needs to use the Eisenstein Criterion from p.74 of Paulin’s notes. However,
we will not cover the Eisenstein Criterion in this semester.

We can, however, prove irreducibility in the following cases:

Proposition 5.47. If f (x) ∈ F [x] is quadratic or cubic (i.e., degree 2 or 3),


then f (x) is irreducible iff f (x) has no root in F .

Proof. If f (x) were reducible, then because degrees add when you multiply
polynomials, it would have to have a (non-constant) linear factor. But any
nonconstant linear polynomial over F is of the form ax + b for a, b ∈ F , with
a ≠ 0. Since F is a field, this linear polynomial has a root −b/a ∈ F .
Therefore, if f (x) is reducible, then f (x) has a root in F .

Conversely, if f (x) has a root in F , then by Lemma 5.32, it is divisible


by a linear polynomial, so it is reducible (since it is quadratic or cubic and
therefore not linear).

Example 5.48. √ This allows one to prove that x2 − 2 is indeed the minimal
2
polynomial of 2 or that x + 1 is indeed the minimal polynomial of i (once
you prove that neither 2 nor −1 has a square root in Q).

Example 5.49. For a cubic example, note that 2 has no cube root in Q,√so
x3 − 2 is irreducible in Q[x], hence x3 − 2 is the minimal polynomial of 2
3

over Q.

Example 5.50. Note that $x^3 - 2$ is reducible in R[x], so it is not the minimal
polynomial of $\sqrt[3]{2}$ over R. In fact, in R[x], we have the factorization

$$x^3 - 2 = (x^2 + \sqrt[3]{2}\,x + \sqrt[3]{4})(x - \sqrt[3]{2}),$$

yet the polynomial $x^2 + \sqrt[3]{2}\,x + \sqrt[3]{4}$ is irreducible
over R (i.e., in R[x]) because it has no real roots.

Note that there are reducible quartic polynomials with no root in F. For
an easy example, take $(x^2 - 2)^2$ for F = Q.

6 Material Beyond Our Course

6.1 Toward Galois Theory

6.1.1 Degree of a Field Extension

Definition 6.1. Let E/F be a field extension. Then a (finite) basis of E over F
is a subset $\{x_1, \dots, x_n\} \subseteq E$ such that every x ∈ E can be
uniquely expressed as a linear combination

$$\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_n x_n$$

for $\lambda_i \in F$.

Definition 6.2. We say that an extension E/F is finite if it has a finite
basis. The degree of a finite extension E/F, denoted [E : F], is the size of
a basis. (Note: it is a theorem that this size does not depend on which
basis one chooses.)

Given an extension, how can one determine its degree? It should be clear
that $\mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ has degree 2
over Q. We also know that [E : F] = 1 if and only if E = F. The following fact
is helpful:

Fact 6.3. If α ∈ E is algebraic over F, then F[α]/F is a finite extension
whose degree is the degree of the minimal polynomial of α over F.

More concretely, if that degree is n, then one can choose
$1, \alpha, \alpha^2, \dots, \alpha^{n-1}$ as a basis.
For a more complicated field extension like $E = \mathbb{Q}[\sqrt{2}, \sqrt{3}]$,
there are a couple of approaches. One could try to show that in this case,
$E = \{a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \mid a, b, c, d \in \mathbb{Q}\}$ and
that this representation is unique. Or, one could show that
$E = \mathbb{Q}[\sqrt{2} + \sqrt{3}]$ as in one of the homework problems,
and then show that $\sqrt{2} + \sqrt{3}$ has minimal polynomial
$x^4 - 10x^2 + 1$ over Q.
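
To see both approaches at work, here is a small Python sketch that does exact arithmetic with elements written as a + b√2 + c√3 + d√6 and checks that √2 + √3 is a root of the quartic above. The tuple representation and the helper names are assumptions of this illustration. The code verifies only that the quartic vanishes at √2 + √3, not that it is irreducible.

```python
from fractions import Fraction

# An element a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6) is stored as (a, b, c, d).

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(k, u):
    return tuple(k * x for x in u)

def mul(u, v):
    """Multiply using sqrt2*sqrt3 = sqrt6, sqrt2*sqrt6 = 2*sqrt3, sqrt3*sqrt6 = 3*sqrt2."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,  # coefficient of 1
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),        # coefficient of sqrt(2)
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),        # coefficient of sqrt(3)
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,              # coefficient of sqrt(6)
    )

ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
alpha = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))  # sqrt(2) + sqrt(3)

alpha2 = mul(alpha, alpha)    # 5 + 2*sqrt(6)
alpha4 = mul(alpha2, alpha2)  # 49 + 20*sqrt(6)

# alpha^4 - 10*alpha^2 + 1, computed exactly
value = add(add(alpha4, scale(-10, alpha2)), ONE)
print(value == (0, 0, 0, 0))
```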

An even better way is to use the following fact:

Fact 6.4. If E/K/F is a sequence of (finite) field extensions, then
[E : F] = [E : K][K : F].

The proof proceeds by taking a basis $x_1, \dots, x_n$ of E over K and a basis
$y_1, \dots, y_m$ of K over F and then showing that the set of mn products
$\{x_i y_j\}$ is a basis for E over F.

The notion of degree of a field extension is the key to showing that one
cannot construct, for example, a regular heptagon (7-gon) using ruler and
compass. The idea
is this: whenever one does a ruler and compass construction, the coordinates
of the points one can construct can be found by addition, multiplication,
subtraction, division, and square roots (because of the distance formula in
Cartesian geometry, and because ruler and compass construction is all about
drawing circles!). This means that they all lie in a field extension of Q given
by taking square roots; by applying Fact 6.4 over and over, one sees that
such an extension must have degree a power of 2.

Note that any divisor of a power of 2 is also a power of 2. Thus any
subfield of such a field extension would also have degree a power of 2 by Fact
6.4. However, the number $\cos\frac{2\pi}{7}$ has minimal polynomial
$x^3 + \frac{x^2}{2} - \frac{x}{2} - \frac{1}{8}$, which has degree 3.
Therefore, $\mathbb{Q}[\cos\frac{2\pi}{7}]$ has degree 3 over Q, so it cannot
be contained in a field extension of Q whose degree is a power of 2.
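
The claim about cos(2π/7) can be spot-checked in Python. The sketch below clears denominators, working with 8x³ + 4x² − 4x − 1 (eight times the cubic above), confirms numerically that cos(2π/7) is a root, and then rules out rational roots exactly. The second step uses the rational root theorem, which is an outside fact not stated in this section. Since a cubic with no rational root is irreducible over Q, this is consistent with the degree being 3.

```python
import math
from fractions import Fraction

def g(x):
    """8x^3 + 4x^2 - 4x - 1, i.e. 8 times the cubic in the text."""
    return 8 * x**3 + 4 * x**2 - 4 * x - 1

c = math.cos(2 * math.pi / 7)
vanishes_numerically = abs(g(c)) < 1e-12  # floating-point check only

# A rational root p/q in lowest terms would need p | 1 and q | 8.
candidates = [Fraction(s, q) for q in (1, 2, 4, 8) for s in (1, -1)]
has_rational_root = any(g(r) == 0 for r in candidates)

print(vanishes_numerically, has_rational_root)
```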

6.1.2 Galois Theory

Given a polynomial f (x) ∈ F [x], one can define the splitting field Ef of f (x)
over F . It is a field obtained by “adjoining” (inside some larger field, such
as C) all the roots of f(x) to F. In other words, it is the smallest field E
containing F such that f(x) splits into linear factors in E[x].

One then defines the Galois group Gal(Ef /F ) of f (x) over F to be the
group of automorphisms of the field Ef that act as the identity on F .

Note that if σ ∈ Gal(Ef/F), α ∈ Ef, and g(x) ∈ F[x], then σ(g(α)) =
g(σ(α)) (this is an exercise in the definition of a ring homomorphism and of
a polynomial!). In particular, if α is a root of f(x), then σ(α) is also a root
of f(x). Thus the elements of Gal(Ef/F) permute the roots of f(x).
If we let Z denote the set of roots of f(x), then Gal(Ef/F) is a subgroup of
Σ(Z). For example, if f(x) is a quintic polynomial with distinct roots, then
Gal(Ef/F) is a subgroup of Sym5.

A basic result of Galois theory says that if f(x) is irreducible over F, then
Gal(Ef/F) acts transitively on the set of roots. The philosophy behind this is
that any two roots of an irreducible polynomial “look the same algebraically,
from the viewpoint of F.”

Here are some examples:



Example 6.5. If n ∈ Z is not a square, then $\mathbb{Q}[\sqrt{n}]/\mathbb{Q}$ has
Galois group of order 2. The non-identity element corresponds to the
automorphism $a + b\sqrt{n} \mapsto a - b\sqrt{n}$.
Example 6.6. We can do the previous example over R instead of Q if n is
negative, to get the extension C/R. The Galois group is once again size 2,
and the non-trivial automorphism (i.e. non-identity element of the group) is
complex conjugation.
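
The following Python sketch spot-checks, on a finite grid of sample elements, that the map a + b√n ↦ a − b√n from Example 6.5 respects addition and multiplication and fixes Q. It is a check on samples, not a proof. The choice n = 2, the pair representation (a, b), and the name `sigma` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

N = 2  # any non-square integer would do; 2 is just an illustrative choice

# An element a + b*sqrt(N) of Q[sqrt(N)] is stored as the pair (a, b).

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    a1, b1 = u
    a2, b2 = v
    return (a1 * a2 + N * b1 * b2, a1 * b2 + b1 * a2)

def sigma(u):
    """The candidate non-identity automorphism a + b*sqrt(N) -> a - b*sqrt(N)."""
    return (u[0], -u[1])

samples = [(Fraction(a), Fraction(b, 3))
           for a, b in product(range(-2, 3), repeat=2)]

respects_add = all(sigma(add(u, v)) == add(sigma(u), sigma(v))
                   for u in samples for v in samples)
respects_mul = all(sigma(mul(u, v)) == mul(sigma(u), sigma(v))
                   for u in samples for v in samples)
fixes_Q = all(sigma((a, Fraction(0))) == (a, Fraction(0)) for a, _ in samples)

print(respects_add, respects_mul, fixes_Q)
```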

Example 6.7. The field $\mathbb{Q}[\sqrt[3]{2}]$ is not a splitting field, because
$\sqrt[3]{2}$ is not the only root of the polynomial $x^3 - 2 \in \mathbb{Q}[x]$.
In fact, we also have to add the roots $\omega\sqrt[3]{2}$ and
$\omega^2\sqrt[3]{2}$, where $\omega = \frac{-1 + i\sqrt{3}}{2}$ is a primitive
third root of unity. The field $\mathbb{Q}[\sqrt[3]{2}, \omega]$ is the splitting
field of $x^3 - 2$ over Q. In this case, there are three roots, and the Galois
group is the full symmetric group Sym3.
This is an example of a non-abelian Galois group.

To learn more about Galois theory, pick up any text on abstract algebra,
search “Galois theory notes” on Google, or see my notes at
https://math.berkeley.edu/~dcorwin/files/galoisthy.pdf.

One set of notes I particularly like is Miles Reid's, at
https://homepages.warwick.ac.uk/~masda/MA3D5/Galois.pdf. He has a really
nice introductory section that explains the cubic and quartic formulas in
light of the philosophy of Galois theory, so it really helps motivate the
subject. Or see my account of the same topic at
https://math.berkeley.edu/~dcorwin/files/symmetry_cubic.pdf.

You can also find some short articles about topics in Galois theory at
https://kconrad.math.uconn.edu/blurbs/.

6.2 Algebraic Geometry

Algebraic geometry is a very important field of mathematics that has
influenced many other fields, ranging from number theory to mathematical
physics, and even to computer engineering.

Algebraic geometry is, on its surface, the study of solutions to polynomial
equations in multiple variables. More specifically, it is the study of
the relationship between solution sets of polynomials in multiple variables
(geometry) and the ring theory of certain rings (algebra).

How does one associate a ring to a system of polynomial equations? Let's
say $f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)$ is a collection of m
polynomials in n variables with coefficients in a field F. We define the
solution set or variety defined by $f_1, \dots, f_m$ to be the set

$$V(f_1, \dots, f_m) = \{(x_1, \dots, x_n) \in F^n \mid f_i(x_1, \dots, x_n) = 0 \ \forall\, i = 1, \dots, m\}.$$

Then one associates the ring

$$A = A(V(f_1, \dots, f_m)) := F[x_1, \dots, x_n]/(f_1, \dots, f_m).$$

The idea is that the polynomials $f_1, \dots, f_m$ are zero as functions on
$V(f_1, \dots, f_m)$, so we should mod out by them.

The ring $A(V(f_1, \dots, f_m))$ is known as the affine coordinate ring of the
variety. Algebraic geometers in the first half of the 20th century made the
important observation that geometric properties of $V(f_1, \dots, f_m)$ are
equivalent to certain algebraic properties of the ring A. Here are three
examples of this phenomenon:

If F is algebraically closed, then Hilbert's Nullstellensatz says that there's
a natural bijection between the points of $V(f_1, \dots, f_m)$ and the maximal
ideals of the ring A.

If the variety is smooth (this means that the Jacobian matrix of partial
derivatives of the map from $F^n$ to $F^m$ defined by the polynomials $f_i$ has
full rank at every point of $V(f_1, \dots, f_m)$, so that one may apply the
implicit function theorem), then A is locally a UFD: the localization of A at
each maximal ideal is a UFD, although A itself need not be one.

Finally, the ring A is an integral domain if and only if the variety is
irreducible, which roughly means that it cannot be broken down as a union
of smaller varieties. For example, the variety defined by the single equation
$x_1 x_2 = 0$ is reducible, because it is the union of the variety defined by
$x_1 = 0$ and the variety defined by $x_2 = 0$. Indeed, notice that
$F[x_1, x_2]/(x_1 x_2)$ is not an integral domain.
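
To make the last example concrete, here is a brute-force Python sketch that enumerates solution sets over a small finite field. Taking F to be the field with 5 elements is purely an assumption of this illustration (the text allows any field), and the helper name `variety` is made up. The sketch illustrates only the geometric side: the solution set of x1 x2 = 0 is the union of two strictly smaller solution sets, and x1 and x2 are each nonzero somewhere on it while their product vanishes everywhere on it.

```python
from itertools import product

P = 5  # work over F_5 = Z/5Z, an illustrative choice of field

def variety(polys, n):
    """All points of (F_P)^n at which every polynomial in polys vanishes mod P."""
    return {pt for pt in product(range(P), repeat=n)
            if all(f(*pt) % P == 0 for f in polys)}

V = variety([lambda x1, x2: x1 * x2], 2)   # x1 * x2 = 0
V1 = variety([lambda x1, x2: x1], 2)       # x1 = 0
V2 = variety([lambda x1, x2: x2], 2)       # x2 = 0

is_union = (V == V1 | V2)
pieces_are_smaller = (V1 != V and V2 != V)

# x1 and x2 are not identically zero on V, but their product is.
x1_nonzero_somewhere = any(pt[0] % P != 0 for pt in V)
x2_nonzero_somewhere = any(pt[1] % P != 0 for pt in V)
product_zero_everywhere = all((pt[0] * pt[1]) % P == 0 for pt in V)

print(is_union, pieces_are_smaller, len(V))
```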

The book
https://www.amazon.com/Invitation-Algebraic-Geometry-Universitext/dp/0387989803
is a wonderful introduction to algebraic geometry. You can
probably even start reading this book just with the background you learned
in this class!

6.3 p-adic Numbers

Check out http://www.maths.gla.ac.uk/~ajb/dvi-ps/padicnotes.pdf to
learn about p-adic numbers. These also show up in a more advanced algebraic
number theory course or in a commutative algebra course.

However, the basics of p-adic numbers are not too difficult, and I
recommend learning about them now!

6.4 Algebraic Number Theory

Galois theory studies fields, especially fields of the form Q[α], where α ∈ C
is some algebraic number. Because Q[α] is a field, every nonzero element
divides every other element, so divisibility isn’t interesting. Similarly, every
ideal is either {0} or the whole ring, so the theory of ideals is not interesting.

On the other hand, if we consider Z[α], divisibility and ideals are both
much more interesting. For example, we might consider rings of the form
$\mathbb{Z}[i]$, $\mathbb{Z}[\sqrt{2}]$, $\mathbb{Z}[\sqrt{-5}]$, and more. We might
ask whether or not they are UFDs or PIDs. Algebraic number theory studies
questions like these.

Remark 6.8. In general, we never have a chance of getting a UFD unless
we take the ring of integers. Concretely, this means that in the field
$\mathbb{Q}[\sqrt{5}]$, we must take the ring
$\mathbb{Z}\left[\frac{\sqrt{5}+1}{2}\right]$ instead of the seemingly obvious
$\mathbb{Z}[\sqrt{5}]$. More generally, if K = Q[α], we define

$$\mathcal{O}_K := \{\beta \in K \mid \beta \text{ is the root of a monic polynomial with integer coefficients}\}.$$

Then $\mathcal{O}_K$ is known as the ring of integers of K, and it is a subring
of K whose field of fractions is K.

Usually, people take a course titled “Algebraic Number Theory” or “Number
Fields” after taking a course in Galois theory. However, there's a lot that
you can learn before taking a full course in Galois theory. You can especially
study quadratic integer rings, as the Galois theory in that case is very simple
(it is just conjugation). For d ≠ 1 a squarefree integer, the quadratic integer
rings take the form

$$\mathcal{O}_{\mathbb{Q}[\sqrt{d}]} = \mathbb{Z}[\sqrt{d}]$$

if d ≡ 2, 3 (mod 4). If d ≡ 1 (mod 4), then $x^2 - x - \frac{d-1}{4}$ is a monic
polynomial with integer coefficients with roots $\frac{1 \pm \sqrt{d}}{2}$, so

$$\mathcal{O}_{\mathbb{Q}[\sqrt{d}]} = \mathbb{Z}\left[\frac{\sqrt{d} + 1}{2}\right]$$

if d ≡ 1 (mod 4). To read more about quadratic integer rings, check out
https://kconrad.math.uconn.edu/blurbs/gradnumthy/quadraticgrad.pdf
or Chapter 13 of Algebra by Michael Artin. These are
sources you should be able to read now.
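
The role of the condition d ≡ 1 (mod 4) can be seen by direct computation. The Python sketch below checks exactly that (1 + √d)/2 is a root of x² − x − (d − 1)/4 for several d, and reports whether the constant term is an integer. The function names and the sample values of d are choices made for this illustration. The sketch does not prove that the rings above are the full rings of integers.

```python
from fractions import Fraction

def constant_term(d):
    """Constant term -(d - 1)/4 of x^2 - x - (d - 1)/4, as an exact fraction."""
    return -Fraction(d - 1, 4)

def half_integer_is_root(d):
    """Check exactly that (1 + sqrt(d))/2 is a root of x^2 - x - (d - 1)/4.

    Elements a + b*sqrt(d) are stored as pairs (a, b) of Fractions.
    """
    a, b = Fraction(1, 2), Fraction(1, 2)
    square = (a * a + d * b * b, 2 * a * b)  # (a + b*sqrt(d))^2
    value = (square[0] - a + constant_term(d), square[1] - b)
    return value == (0, 0)

for d in (5, 13, -3, 2, 3, -5):
    coefficients_are_integers = constant_term(d).denominator == 1
    print(d, d % 4, half_integer_is_root(d), coefficients_are_integers)
```

The root check succeeds for every d, but the polynomial has integer coefficients only when d ≡ 1 (mod 4), which is why the extra element (1 + √d)/2 is an algebraic integer only in that case.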

For even more material, you can look at https://kconrad.math.uconn.edu/blurbs/.
For example,
https://kconrad.math.uconn.edu/blurbs/gradnumthy/dedekindf.pdf
introduces the idea that we should factor into ideals rather than elements, and
https://kconrad.math.uconn.edu/blurbs/gradnumthy/idealfactor.pdf
proves some basic facts about factorization of ideals. Some standard
introductory textbooks on algebraic number theory are
https://www.springer.com/gp/book/9783319902326 and
https://www.maa.org/press/maa-reviews/algebraic-theory-of-numbers.

6.5 Commutative Algebra

If you want to learn even more about the general theory of commutative
rings, check out http://www.math.toronto.edu/jcarlson/A--M.pdf.

Generally, commutative algebra is seen more as a tool for subjects like
algebraic number theory and algebraic geometry than as a subject in itself.
Therefore, some people find it too abstract to learn before learning more
about algebraic number theory and algebraic geometry for context, especially
because algebraic geometry gives geometric intuition for concepts in
commutative algebra. On the other hand, some people may like to learn
abstract theory before learning how to apply it.
