Gary Notes
Contents

3 Linear Codes
3.1 Introduction
3.2 The Generator and Parity Check Matrices
3.3 Cosets
3.3.1 Maximum Likelihood Decoding (MLD) of Linear Codes
5 Hamming Codes
5.1 Introduction
5.2 Extended Codes
6 Golay Codes
6.1 The Extended Golay Code: C24
6.2 The Golay Code: C23
7 Reed-Muller Codes
7.1 The Reed-Muller Codes RM(1, m)
8 Decimal Codes
8.1 The ISBN Code
8.2 A Single Error Correcting Decimal Code
8.3 A Double Error Correcting Decimal Code
9 Hadamard Codes
9.1 Background
9.2 Definition of the Codes
9.3 How good are the Hadamard Codes?
10 Introduction to Cryptography
10.1 Basic Definitions
10.2 Affine Cipher
10.2.1 Cryptanalysis of the Affine Cipher
10.3 Some Other Ciphers
10.3.1 The Vigenère Cipher
10.3.2 Cryptanalysis of the Vigenère Cipher: The Kasiski Examination
10.3.3 The Vernam Cipher
13 Factorisation Algorithms
13.1 Pollard's p − 1 Factoring Method (ca. 1974)
A Assignments
Part I Coding
Chapter 1
Introduction and Basic Ideas
1.1 Introduction
A model of a communication system is shown in Figure 1.1.
• Source Encoder: The source encoder transforms the source data into binary digits (bits), and aims to minimise the number of bits required to represent the source data. There are two options here: we may require that the source data be perfectly reconstructible (as with computer data), or we may allow a certain level of error in the reconstruction (as with images or speech).
• Encryption: Intended to make the data unintelligible to all but the intended receiver; that is, to preserve the secrecy of the messages in the presence of unwelcome monitoring of the channel. Cryptography is the science of maintaining secrecy of data against both passive intrusion (eavesdropping) and active intrusion (introduction or alteration of messages). This introduces the distinction between passive and active intrusion.
• Channel Encoder: Tries to maximise the rate at which information can be reliably transmitted on the channel in the presence of disruptions (noise) that can introduce errors. Error correction coding, located in the channel encoder, adds redundancy in a controlled manner to messages to allow transmission errors to be detected and/or corrected.

• Modulator: Transforms the data into a format suitable for transmission over the channel.

• Channel: The medium that we use to convey the data. This could be in the form of a wireless link (as in cell phones), a fiber optic link or even a storage medium such as magnetic disks or CDs.

• Demodulator: Converts the received data back into its original format (usually bits).

• Decryption: Converts the data back into an intelligible form. At this point one would also perform authentication and data integrity checks.

• Source Decoder: Adds back the original (naturally occurring) redundancy that was removed by the source encoder.
1.2 Terminology
Definition 1.2.1 (Alphabet). An alphabet is a finite, nonempty set of symbols. Usually
the binary alphabet K = {0, 1} is used and the elements of K are called binary digits or
bits.
Example 1.2.4. C1 = {00, 01, 10, 11} and C2 = {000, 001, 01, 1} are both codes over the
alphabet K.
Definition 1.2.5 (Block Code). A block code has all codewords of the same length;
this number is called the length of the code.
In our example C1 is a (binary) block code of length 2, while C2 is not a block code.
Definition 1.2.6 (Prefix code). A prefix code is a code such that there do not exist
distinct words wi and wj such that wi is a prefix (initial segment) of wj .
1. If n symbols are transmitted, then n symbols are received — though maybe not the
same ones. That is, nothing is added or lost.
3. Noise is scattered randomly rather than occurring in clumps (called bursts). That is, the probability of error is fixed and the channel stays constant over time. Most channels do not satisfy this property, with one notable exception: the deep space channel (used to transmit images and other data between spacecraft and earth). In almost all other channels the probability of error varies over time, for example on a CD, where certain areas are more damaged than others, and in cell phones, where the radio channel varies continually as the users move about their terrain.
A binary channel is called symmetric if 0 and 1 are transmitted with equal accuracy.
The reliability of such a channel is the probability p, 0 ≤ p ≤ 1, that the digit sent is the
digit received. The error probability q = 1 − p is the probability that the digit sent is
received in error.
Chapter 2
Detecting and Correcting Errors

Errors can be detected when the received word is not a codeword. If a codeword is
received, then it could be that no errors, or several errors (changing one codeword into
another) have occurred.
Consider C1 = {00, 01, 10, 11}. Every received word is a codeword, so no errors can
be detected (and none can be corrected). On the other hand consider
C2 = {000000, 010101, 101010, 111111},
obtained from C1 by repeating each word in C1 three times — this is known as a repetition
code. Here we can detect 2 or fewer errors as we need 3 or more errors to change one
codeword into another.
Example 2.1.2. Let C2 = {000000, 010101, 101010, 111111} be the code defined above.
Assume that 010101 is sent, but that 110111 is received. Since 110111 is not a codeword, we conclude that errors have occurred; by examining C2 we see that 1 or 2 errors are the most likely explanations (more are also possible). Keep in mind that we, as the receiver, only know the received word.
We can correct 1 error by using a majority rule: examine the three pairs of bits and decode the received word to the codeword that repeats the pair occurring in the majority of the pairs. So if 101010 is sent and 111010 is received, the pairs are 11, 10 and 10, so we decode to 101010. The idea is that if 1 error occurs, then there exists a unique codeword that differs in 1 place from the received word; we decode to this word. If 2 errors occur, the possibility exists that we may decode to the wrong codeword: if 010101 is sent and 110111 is received we will incorrectly decode this to 111111. Therefore this code cannot, in general, correct 2 errors (specific cases may be possible, but not all possible patterns of two errors are correctable).
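The majority rule above is easy to implement; here is a small sketch (the function name and the string representation of words are my own, not from the notes). Taking the majority value of each repeated bit agrees with the majority-pair rule whenever some pair occurs a majority of the times.

```python
# Majority-rule decoding for the repetition code C2 = {000000, 010101, 101010, 111111}.
def decode_repetition(word, block=2, copies=3):
    """Decode by taking, for each of the `block` information positions,
    the majority value among its `copies` repetitions."""
    pair = ''
    for i in range(block):
        bits = [word[i + j * block] for j in range(copies)]
        pair += '1' if bits.count('1') > copies // 2 else '0'
    return pair * copies  # the decoded codeword

print(decode_repetition('111010'))  # 1 error in 101010: decodes to 101010
print(decode_repetition('110111'))  # 2 errors in 010101: wrongly decodes to 111111
```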
Consider C3 = {000, 011, 101, 110}, formed from C1 by adding a third digit to each
codeword such that the total number of ones in the resulting word is even. This is called
a parity check code and the digit that was added is called the parity check digit. This
code can detect 1 error as no two codewords differ in exactly one place, but it cannot
detect 2 errors since two errors may change one codeword into another codeword.
Example 2.1.3. Let C3 = {000, 011, 101, 110} be the parity check code from above. If 000 is sent and 001 is received, we immediately detect an error as the received word has an odd number of ones. Even if we knew that only one error had occurred (in general we won't know how many errors have occurred), this would not help us decode the received word correctly, as 000, 011 and 101 could all have been the sent word.
If 011 is sent and 110 is received (i.e. two errors occurred), we will not detect an error since the received word has an even number of ones.
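The parity check code is equally simple to sketch in code (function names are my own):

```python
def add_parity(word):
    """Append a check digit making the total number of 1's even."""
    return word + ('1' if word.count('1') % 2 else '0')

def parity_ok(word):
    return word.count('1') % 2 == 0

C3 = sorted(add_parity(w) for w in ['00', '01', '10', '11'])
print(C3)                # ['000', '011', '101', '110']
print(parity_ok('001'))  # False: a single error is detected
print(parity_ok('110'))  # True: the double error 011 -> 110 goes unnoticed
```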
If u and v are binary words of the same length, we define u + v to be the binary word
obtained by component-wise addition modulo 2. That is 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1
and 1 + 1 = 0.
Example 2.1.4. 01101 + 11001 = 10100
Definition 2.1.5 (Hamming weight). Let v be a binary word of length n. The Ham-
ming weight of v, denoted by wt(v), is the number of times the digit 1 occurs in v.
Note that d(u, v) = wt(u + v) — in the places where u and v differ a 1 will appear in
u + v and in the places where u and v are the same a 0 will appear in u + v — from the
way that + was defined.
Example 2.1.7. d(01011, 00111) = 2 = wt(01011 + 00111) = wt(01100) and
d(10110, 10110) = 0 = wt(10110 + 10110) = wt(00000).
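These operations are direct to compute; a small sketch (helper names are my own) that checks the identity d(u, v) = wt(u + v) on the examples above:

```python
def add(u, v):
    """Componentwise addition mod 2 of two binary words."""
    return ''.join('1' if a != b else '0' for a, b in zip(u, v))

def wt(v):
    """Hamming weight: the number of 1's in v."""
    return v.count('1')

def dist(u, v):
    """Hamming distance: the number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

print(add('01101', '11001'))                                # 10100 (Example 2.1.4)
print(dist('01011', '00111') == wt(add('01011', '00111')))  # True: d(u,v) = wt(u+v)
```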
Let C be a binary code of length n. If v ∈ C is sent and w is received, then the error
pattern is u = v + w. That is the error pattern indicates those positions where an error
occurred by a 1. Also since u = v + w, w = v + u by the addition defined earlier (addition
and subtraction are the same under this rule). Therefore, the received word equals the
transmitted word plus the error pattern.
Definition 2.1.8 (Detecting an error pattern). A code C detects the error pattern u if u + v ∉ C for all v ∈ C.
Example 2.1.9. Let C = {001, 101, 110}. Then C detects the error pattern u = 010 because none of the codewords added to the error pattern is again a codeword: 001 + 010 = 011 ∉ C, 101 + 010 = 111 ∉ C and 110 + 010 = 100 ∉ C. On the other hand C does not detect u = 100 because the codeword 001 added to u is again a codeword: 001 + 100 = 101 ∈ C.
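Definition 2.1.8 translates directly into code (the function name is my own):

```python
def detects(C, u):
    """Definition 2.1.8: C detects u iff u + v is never a codeword."""
    xor = lambda a, b: ''.join('1' if x != y else '0' for x, y in zip(a, b))
    return all(xor(u, v) not in C for v in C)

C = {'001', '101', '110'}
print(detects(C, '010'))  # True: 011, 111 and 100 are all non-codewords
print(detects(C, '100'))  # False: 001 + 100 = 101 is a codeword
```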
Definition 2.1.10 (Minimum distance of a code). For a code C with |C| ≥ 2, the minimum distance of C, dmin(C), is the smallest distance between two distinct codewords. That is,

dmin(C) = min{ d(v, w) | v, w ∈ C, v ≠ w }.
Theorem 2.1.12. A code C can detect all nonzero error patterns of weight at most d − 1
⇐⇒ dmin (C) ≥ d.
Proof.
⇐:
Suppose C has dmin(C) ≥ d. Let u be a nonzero error pattern with wt(u) ≤ d − 1 and let v ∈ C. Suppose v is sent and w = v + u is received. Then d(v, w) = wt(v + w) = wt(v + v + u) = wt(u) ≤ d − 1. Since C has minimum distance at least d, w ∉ C, as the codeword closest to v (in terms of Hamming distance) is at least a distance d from v. Therefore C detects the error pattern u.
⇒:
Suppose C can detect all nonzero error patterns of weight at most d − 1. Let v, w ∈ C with v ≠ w. Then u = v + w is not a detectable error pattern, as v + u = v + v + w = w ∈ C. Thus wt(u) = 0 or wt(u) ≥ d. Since v ≠ w, wt(u) ≠ 0. Therefore wt(u) ≥ d, so that d(v, w) = wt(v + w) = wt(u) ≥ d. This shows that dmin(C) ≥ d, since v and w were two arbitrary, distinct codewords.
So, by Theorem 2.1.12, a code C is t-error detecting if and only if C has minimum distance at least t + 1.
Example 2.1.15. Let C = {0000, 1010, 0111}. Then C corrects the error pattern u = 0100 :
• If 0000 is sent, then (with u as error pattern) 0000 + u = 0100 is received. This
received word is closer to 0000 than to any other codeword.
• If 1010 is sent, then 1010 + u = 1110 is received. This received word is closer to
1010 than to any other codeword.
• If 0111 is sent, then 0111 + u = 0011 is received. This received word is closer to
0111 than to any other codeword.
On the other hand C does not correct the error pattern 1000 : If 0000 is sent, then
0000 + 1000 = 1000 is received and d(1000, 0000) = 1 = d(1000, 1010). Thus there is more
than one codeword that is closest to the received word.
Theorem 2.1.16. A code C will correct all error patterns of weight at most t ⇐⇒
dmin (C) ≥ 2t + 1.
Proof.
⇒:
Suppose that C corrects all error patterns of weight at most t, but that dmin(C) ≤ 2t. Let v, w ∈ C be such that d(v, w) = dmin(C) = d ≤ 2t. Let u be an error pattern obtained from v + w by replacing 1's by 0's until only ⌈d/2⌉ 1's remain. If v is sent and the error pattern u occurs, then

d(v, v + u) = wt(v + v + u) = wt(u) = ⌈d/2⌉,
d(w, v + u) = wt(w + v + u) = dmin(C) − ⌈d/2⌉ = ⌊d/2⌋ ≤ ⌈d/2⌉.
The second-to-last step, wt(w + v + u) = dmin(C) − ⌈d/2⌉, follows from the fact that wt(w + v) = dmin(C) and that u has its 1's in exactly the same locations where w + v has some of its 1's. So in computing w + v + u, u cancels exactly ⌈d/2⌉ of (w + v)'s 1's. Therefore the received word, v + u, is at least as close to w as it is to v. This implies that C does not correct u; but u is an error pattern of weight ⌈d/2⌉ ≤ t, so C was supposed to be able to correct u, a contradiction.
⇐:
Suppose C has dmin(C) ≥ 2t + 1. Let u be a nonzero error pattern of weight at most t. If v ∈ C is sent and v + u is received, then d(v, v + u) = wt(v + v + u) = wt(u) ≤ t. Now for any other w ∈ C, w ≠ v, d(w, v + u) = wt(w + v + u) ≥ 2t + 1 − wt(u) ≥ 2t + 1 − t = t + 1. As above, wt(w + v + u) is bounded by noting that wt(w + v) = d(w, v) ≥ 2t + 1, since w and v are distinct codewords and dmin(C) ≥ 2t + 1, while u can cancel at most wt(u) ≤ t of (w + v)'s 1's. Therefore the received word v + u is closer to v than to any other codeword w ≠ v. Thus v + u is correctly decoded, showing that C corrects all error patterns of weight at most t.
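Theorem 2.1.12 can be checked by brute force on the repetition code C2 from Chapter 2; a small sketch (all names are my own) computes dmin and verifies that every nonzero error pattern of weight at most dmin − 1 = 2 is detected:

```python
from itertools import combinations, product

def xor(a, b): return ''.join('1' if x != y else '0' for x, y in zip(a, b))
def wt(v): return v.count('1')
def dmin(C): return min(wt(xor(v, w)) for v, w in combinations(C, 2))

C = ['000000', '010101', '101010', '111111']
print(dmin(C))  # 3

patterns = [''.join(p) for p in product('01', repeat=6)]
detected = all(xor(u, v) not in C for u in patterns if 1 <= wt(u) <= 2 for v in C)
print(detected)  # True: all weight-1 and weight-2 patterns are detected
```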
Chapter 3
Linear Codes
3.1 Introduction
Definition 3.1.1 (Linear code). A code C is called a linear code if u +v ∈ C whenever
u, v ∈ C.
Example 3.1.2. Let C1 = {000, 001, 101}. C1 is not a linear code since 001 + 101 = 100 ∉ C1. Let C2 = {0000, 1001, 0110, 1111}; then C2 is linear: if v ∈ C2, then v + v = 0000 ∈ C2, and also 0000 + u = u for all u ∈ C2. Therefore we only have to consider the addition of two nonzero, distinct words:
1001 + 0110 = 1111 ∈ C,
1001 + 1111 = 0110 ∈ C,
0110 + 1111 = 1001 ∈ C.
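The closure check of Example 3.1.2 can be automated (the function name is my own):

```python
def is_linear(C):
    """A code is linear iff it is closed under componentwise addition mod 2."""
    xor = lambda a, b: ''.join('1' if x != y else '0' for x, y in zip(a, b))
    return all(xor(u, v) in C for u in C for v in C)

print(is_linear({'000', '001', '101'}))             # False: 001 + 101 = 100
print(is_linear({'0000', '1001', '0110', '1111'}))  # True
```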
Theorem 3.1.3. The minimum distance of a linear code is the smallest weight of a
nonzero codeword.
Proof.
See assignment 1 in the appendix.
Recall that we let K = {0, 1} be the binary alphabet. If we define K^n to be the set of words (vectors) of length n over K, then K^n together with the addition defined earlier and scalar multiplication by elements of K is a vector space.
Let S ⊆ K^n, say S = {v1, v2, . . . , vk}; then the subspace spanned by S (or generated by S) is

⟨S⟩ = {w ∈ K^n | w = α1v1 + α2v2 + · · · + αkvk, αi ∈ K},
13
14 CHAPTER 3. LINEAR CODES
if S ≠ ∅. If S = ∅, then ⟨S⟩ = {0}. Note that a linear code can also be thought of as a subspace generated by some set S.
Example 3.1.4. Let S = {0100, 0011, 1100}. Then the subspace (code) generated by S, C = ⟨S⟩, is

C = {w | w = α1(0100) + α2(0011) + α3(1100), αi ∈ K}.
α1 α2 α3 w
0 0 0 0000
0 0 1 1100
0 1 0 0011
0 1 1 1111
1 0 0 0100
1 0 1 1000
1 1 0 0111
1 1 1 1011
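The table above can be reproduced by running over all coefficient choices (α1, α2, α3); a sketch with my own helper names:

```python
from itertools import product

def xor(a, b): return ''.join('1' if x != y else '0' for x, y in zip(a, b))

S = ['0100', '0011', '1100']
C = set()
for coeffs in product((0, 1), repeat=len(S)):   # all (a1, a2, a3)
    w = '0000'
    for a, v in zip(coeffs, S):
        if a:
            w = xor(w, v)
    C.add(w)

print(sorted(C))
# ['0000', '0011', '0100', '0111', '1000', '1011', '1100', '1111']
```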
Recall that two vectors u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) in K^n are orthogonal if their dot product u · v = u1v1 + u2v2 + · · · + unvn = 0.
Example 3.1.5. If u = 11001 and v = 01101, then
u · v = 1 × 0 + 1 × 1 + 0 × 1 + 0 × 0 + 1 × 1 = 0 + 1 + 0 + 0 + 1 = 0,
so u and v are orthogonal, keeping in mind that addition is done mod 2.
If S ⊆ K^n, a vector v ∈ K^n is orthogonal to S if v · x = 0 for all x ∈ S. The set of vectors orthogonal to S is called the orthogonal complement of S and is denoted by S⊥.
Theorem 3.1.6. If V is a vector space and S ⊆ V (note a subset, not necessarily a
subspace), then S ⊥ is a subspace of V .
If S ⊆ K^n and C = ⟨S⟩ (the linear code generated by S), we write C⊥ = S⊥ and call C⊥ the dual code of C.
Example 3.1.7. Let S = {0100, 0101}. Then C = ⟨S⟩ = {0000, 0101, 0100, 0001}. To find C⊥ we must find all words v = v1v2v3v4 such that v · 0100 = 0 and v · 0101 = 0. It is enough to ensure that v is orthogonal to the two basis vectors, as v will then also be orthogonal to any linear combination of these vectors. This translates to
0v1 + 1v2 + 0v3 + 0v4 = 0,
0v1 + 1v2 + 0v3 + 1v4 = 0.
From the first equation we find that v2 = 0, and then from the second equation we see that v4 = 0. Thus as long as v2 = v4 = 0, v will be orthogonal to C; v1 and v3 may assume arbitrary values. This implies that C⊥ = {0000, 0010, 1000, 1010}.
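The dual code of Example 3.1.7 can also be found directly from the definition, by testing every word of K^4 for orthogonality to S (names are my own):

```python
from itertools import product

def dot(u, v):
    """Dot product mod 2."""
    return sum(int(a) * int(b) for a, b in zip(u, v)) % 2

S = ['0100', '0101']
dual = [''.join(w) for w in product('01', repeat=4)
        if all(dot(w, x) == 0 for x in S)]
print(dual)  # ['0000', '0010', '1000', '1010']
```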
Definition 3.1.8 (Dimension). The dimension of a linear code C = ⟨S⟩ is the dimension of ⟨S⟩. We denote this by dim(C).
Proposition 3.1.10. A linear code C of dimension k contains exactly |C| = 2^k codewords.
Proof.
Suppose C = ⟨S⟩ has dimension k and let {v1, v2, . . . , vk} be a basis for ⟨S⟩. Then each codeword w ∈ C can be written uniquely as α1v1 + α2v2 + · · · + αkvk with αi ∈ K = {0, 1}. Since there are 2 choices for each αi, there are 2^k choices for (α1, α2, . . . , αk), and each such choice gives a different codeword. Therefore |C| = 2^k.
Definition 3.1.11 (Information rate). The information rate of a code C of length n is

i(C) = (log2 |C|) / n.
Corollary 3.1.12. The information rate of a linear code C of dimension k and length n
is k/n.
Proof.
Since |C| = 2^k (Proposition 3.1.10), i(C) = (log2 2^k)/n = k/n.
A 000
E 001
H 010
K 011
L 100
M 101
P 110
X 111
Then the message HELP is encoded as a sequence of four codewords by computing
Definition 3.2.2 (Generator matrix). A generator matrix for a linear code C = ⟨S⟩ is a matrix G whose rows are a basis for C.
• Any linear code C has a generator matrix in row echelon form or reduced row echelon
form.
• Any matrix G whose rows are linearly independent is the generator matrix for some
linear code.
Definition 3.2.3 (Parity check matrix). A parity check matrix for a linear code C is
a matrix whose columns are a basis for the dual code C ⊥ .
Therefore H is a parity check matrix for C if and only if H T is a generator matrix for
C ⊥ . Also, if C has length n and dim(C) = k, then C ⊥ has length n and dim(C ⊥ ) = n − k.
Thus H has n rows and n − k columns.
Theorem 3.2.4. Let C be a linear code of length n and dimension k, and let H be a parity check matrix for C. Then C = {w ∈ K^n | wH = 0}.
Proof.
Let H = [x1 | x2 | x3 | · · · | x_{n−k}], where the columns of H, x1, x2, x3, . . . , x_{n−k}, are a basis for C⊥. Then wH is the vector [w · x1, w · x2, w · x3, . . . , w · x_{n−k}]. Now w ∈ K^n has wH = 0 if and only if each w · xi = 0. That is, wH = 0 if and only if w is orthogonal to each xi, if and only if w ∈ (C⊥)⊥ = C.
Note that if G is a generator matrix for C and H a parity check matrix for C, GH = 0.
This follows from the fact that the rows of G are a basis for C, while the columns of H
are a basis for C ⊥ .
We now consider the following special case. Let C = ⟨S⟩ be a linear code of length n and dimension k. If A is the matrix whose rows are the words in S and if A can be put in reduced row echelon form (which is not always possible, hence the special case), that is

A → [ Ik  X ]
    [ 0   0 ],
then G = [Ik | X] is a generator matrix for C. Note the dimensions of the submatrix X: it has k rows and n − k columns.
Let

H = [ X      ]
    [ I_{n−k} ],

then

GH = [Ik | X] H = Ik X + X I_{n−k} = X + X = 0.
This implies that every column of H lies in C ⊥ , furthermore there are n − k linearly
independent columns (because of the identity matrix) and so the columns of H form a
basis for C ⊥ . Thus H is a parity check matrix for C.
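The relation GH = 0 in this special case is easy to verify computationally. A sketch, with matrices as lists of rows and my own helper names; the block X below is an arbitrary illustrative choice.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul2(A, B):
    """Matrix product over GF(2)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

k, n = 3, 5
X = [[0, 1], [1, 1], [1, 1]]                   # an arbitrary k x (n-k) block
G = [identity(k)[i] + X[i] for i in range(k)]  # G = [Ik | X]
H = X + identity(n - k)                        # H = [X ; I_{n-k}]

print(matmul2(G, H))  # [[0, 0], [0, 0], [0, 0]]
```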
So in this special case we can transform between the matrices HC⊥, GC⊥, GC and HC, which are the parity check and generator matrices for the codes C and C⊥, as shown in the following diagram.

HC⊥  ←∗→  GC⊥
 ↕T         ↕T
GC   ←∗→  HC

Here T denotes transpose and ∗ denotes the operation described above.
Example 3.2.5. (This example is continued from the previous one.)
In the previous example we had S = {11101, 10110, 01011, 11010} and C = ⟨S⟩. We found a basis for C by writing the elements of S into a matrix A (as its rows) and reducing A to row echelon form. We now take this one step further and reduce A to reduced row echelon form:

A =
[1 1 1 0 1]
[1 0 1 1 0]
[0 1 0 1 1]
[1 1 0 1 0]

→ (row echelon form, from the previous example)

[1 1 1 0 1]
[0 1 0 1 1]
[0 0 1 1 1]
[0 0 0 0 0]

→

[1 1 0 1 0]
[0 1 0 1 1]
[0 0 1 1 1]
[0 0 0 0 0]

→

[1 0 0 0 1]
[0 1 0 1 1]
[0 0 1 1 1]
[0 0 0 0 0]

= [ I3  X ]
  [ 0   0 ].
Therefore

G = [I3 | X] =
[1 0 0 0 1]
[0 1 0 1 1]
[0 0 1 1 1],
so that C1 = {000, 001, 100, 101} (this can easily be seen, as C1 is just all possible sums
of the rows of G).
Let C2 be obtained from C1 by exchanging the last two digits of every codeword of
C1. Then C2 = {000, 010, 100, 110}, so that C1 and C2 are equivalent.
Equivalent codes are exactly the same, except for the order of the digits.
Proposition 3.2.8. Any linear code C is equivalent to a (linear) code C 0 having a gen-
erator matrix in the standard form G0 = [Ik | X].
is a generator matrix for C2 in standard form and by the previous example we know that
C1 and C2 are equivalent.
Let G = [Ik | X] be a standard form generator matrix for a code C, and let y ∈ K^k be encoded as w = yG = y[Ik | X] = [yIk | yX] = [y | yX]. From the last expression it is clear that the first k digits of w carry the information being encoded (i.e. y) and the last n − k digits are there to detect/correct errors. In this case the first k digits are known as the information digits and the last n − k digits are known as the parity check digits.
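The encoding map y ↦ [y | yX] can be sketched as follows (names are my own; X is an arbitrary illustrative parity block):

```python
def encode(y, X):
    """w = yG = [y | yX] for G = [Ik | X] in standard form."""
    parity = [sum(y[i] * X[i][j] for i in range(len(y))) % 2
              for j in range(len(X[0]))]
    return y + parity   # information digits followed by parity check digits

X = [[0, 1], [1, 1], [1, 1]]   # an arbitrary illustrative parity block
print(encode([1, 0, 1], X))    # [1, 0, 1, 1, 0]
```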
3.3 Cosets
Recall from Algebra that (K^n, +) may be regarded as a group and that a linear code C ⊆ K^n can be thought of as a subgroup of (K^n, +). For u ∈ K^n, the coset of C determined by u is

C + u = {v + u | v ∈ C}.

Furthermore, from Algebra it also follows that the distinct cosets partition K^n. The following result is well known.
1. u ∈ C + u.
2. u ∈ C + v ⇒ C + u = C + v.
3. u + v ∈ C ⇒ C + u = C + v.
4. u + v ∉ C ⇒ C + u ≠ C + v.
5. Either C + u = C + v or (C + u) ∩ (C + v) = ∅.
6. |C + u| = |C|.
8. C = C + 0 is a coset.
Example 3.3.3. Let C = {0000, 1011, 0101, 1110}. Then the cosets (in this case left and
right cosets are the same since addition is commutative) of C are :
The cosets are determined as follows. By Theorem 3.3.2 part 8, C itself is always a coset, so we write down its elements first. Next, Theorem 3.3.2 part 5 guarantees that the cosets partition K^n. We now look for a word in K^n that does not appear in any of the cosets found so far and add it to each element of C. This generates a new coset, as the elements of this coset do not appear in any other coset (by part 5 of the Theorem). We keep repeating this process until we have accounted for all the elements of K^n.
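The procedure just described can be sketched directly for Example 3.3.3 (names are my own):

```python
from itertools import product

def xor(a, b): return ''.join('1' if x != y else '0' for x, y in zip(a, b))

C = ['0000', '1011', '0101', '1110']
Kn = [''.join(p) for p in product('01', repeat=4)]

cosets, seen = [], set()
for u in Kn:                        # scan K^n for a word not yet covered
    if u not in seen:
        coset = {xor(v, u) for v in C}
        cosets.append(coset)
        seen |= coset

print(len(cosets))  # 4: the cosets partition K^4 into 16/4 parts of size |C|
```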
The need to perform these operations as quickly as possible prompts the following use of
the parity check matrix.
The syndrome for w = 1101 is wH = 11. In fact all x ∈ C + w have syndrome 11 (see
next Theorem).
Theorem 3.3.7. Let C ⊆ K^n be a linear code with parity check matrix H. Then for u, w ∈ K^n:

1. wH = 0 ⇐⇒ w ∈ C.

2. wH = uH ⇐⇒ C + w = C + u.
Proof.
2.
wH = uH ⇐⇒ (w + u)H = 0,
⇐⇒ w + u ∈ C (by 1.),
⇐⇒ C + w = C + u.
• We can identify each coset by its syndrome. If C has dimension k, there are 2^{n−k} cosets and 2^{n−k} syndromes. The fact that there are 2^{n−k} syndromes follows from considering the form of the parity check matrix: it has an identity matrix of size n − k in its bottom part, and all possible words of length n − k can be formed by summing these rows in the appropriate way (which is what the product wH, arising when calculating the syndrome, amounts to).
• Suppose C is a linear code with parity check matrix H, that v ∈ C is sent, and that the error pattern u occurs. Then w = v + u is received, and the syndrome of w is wH = (v + u)H = vH + uH = 0 + uH = uH (since v is a codeword, vH = 0). Therefore the syndrome of w is the sum of those rows of H that correspond to the positions in v where the errors occurred.
1. Receive a word w.

2. Compute the syndrome wH.

3. Find the coset leader u (a word of least weight) in the coset C + w, i.e. the coset with syndrome wH.

4. Decode w as w + u.
Definition 3.3.8 (Standard decoding array, Coset leader). A standard decoding ar-
ray (SDA) is a table that matches each syndrome with a word of least weight — the coset
leader — in the corresponding coset.
Here ❖ serves to indicate that the coset contains more than one word of least weight.
Therefore we either ask for a retransmission (IMLD) or choose one of the two words
arbitrarily (CMLD).
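The SDA-based decoding just described can be sketched as follows, using the parity check matrix H = [X ; I2] with X = [[0,1],[1,1],[1,1]] from earlier in the chapter (all names are my own). When a syndrome has more than one word of least weight in its coset, whichever is found first is kept, which corresponds to the arbitrary CMLD choice.

```python
from itertools import product

H = [[0, 1], [1, 1], [1, 1], [1, 0], [0, 1]]      # parity check matrix [X ; I2]

def syndrome(w, H):
    return tuple(sum(int(w[i]) * H[i][j] for i in range(len(w))) % 2
                 for j in range(len(H[0])))

words = [''.join(p) for p in product('01', repeat=5)]

sda = {}                                          # syndrome -> coset leader
for w in sorted(words, key=lambda v: v.count('1')):
    sda.setdefault(syndrome(w, H), w)             # a least-weight word wins

def decode(w):
    u = sda[syndrome(w, H)]                       # estimated error pattern
    return ''.join('1' if a != b else '0' for a, b in zip(w, u))

print(decode('10111'))  # '10110': the single error is removed, a codeword remains
```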
Chapter 4
Bounds for Codes
In this chapter we consider a variety of bounds on the parameters of a code. Note that
we do not restrict our attention only to linear codes. We will explicitly indicate when a
code is linear. We begin with the following Proposition.
Proposition 4.0.10. Let v be a word of length n and let 0 ≤ t ≤ n. Then the number of words w with d(v, w) ≤ t is

C(n, 0) + C(n, 1) + C(n, 2) + · · · + C(n, t).

Proof.
Note that the binomial coefficient C(n, i) ("n choose i") counts the number of words that differ from v in exactly i places. The final result follows by summing over all possible values of i.
Theorem (Hamming bound). If a code C of length n corrects all error patterns of weight at most t, then

|C| [ C(n, 0) + C(n, 1) + · · · + C(n, t) ] ≤ 2^n.

Proof.
For v ∈ C define the set Bt(v) = {w ∈ K^n | d(v, w) ≤ t}, the set of all words within distance t of v. Since C corrects all error patterns of weight at most t, these sets are disjoint for distinct codewords, and their union lies inside K^n.
By Proposition 4.0.10,

|Bt(v)| = C(n, 0) + C(n, 1) + C(n, 2) + · · · + C(n, t),

so that

|C| [ C(n, 0) + C(n, 1) + C(n, 2) + · · · + C(n, t) ] ≤ 2^n.
|C| ≤ 2^6 / ( C(6, 0) + C(6, 1) ) = 64/7 = 9 1/7.
Since |C| ∈ N ∪ {0}, we have |C| ≤ 9. If C is a linear code then |C| is a power of two
(Proposition 3.1.10). In this case |C| ≤ 8 and dim(C) ≤ 3.
Proposition. A linear code of length n, dimension k and minimum distance d exists if

C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) < 2^{n−k}.

Proof.
Suppose the inequality holds. We construct a parity check matrix H for the proposed
code. Such a matrix H would have n rows and n − k linearly independent columns. To
The reason is that any such choice would create a linearly dependent set of size at most d − 1, while we are trying to keep every set of d − 1 rows linearly independent. The number of rows that are not allowed (the zero row, any existing row, any sum of 2 rows, . . . , any sum of d − 2 rows) is

1 + C(l, 1) + C(l, 2) + · · · + C(l, d−2)
  = C(l, 0) + C(l, 1) + C(l, 2) + · · · + C(l, d−2)
  ≤ C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2)
  < 2^{n−k}.
Here we used the fact that if l ≤ n − 1, then C(l, i) ≤ C(n−1, i). Therefore another row is available, and so by induction H can be constructed.
Since the columns of H are linearly independent this implies that the dual code has
dimension n − k which in turn implies that the code itself has dimension k. By selecting
the last row to be the sum of some d − 1 rows of H, we guarantee that the minimum
distance equals d.
Theorem (Gilbert-Varshamov bound). For given n and d there exists a linear code C of length n and minimum distance d with

|C| ≥ 2^{n−1} / ( C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) ).
Proof.
Let k be the largest integer less than or equal to n such that

C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) < 2^{n−k}.
Such an integer k exists since the inequality holds in at least one case, namely k = 0. For this k we have, by the Proposition, a linear code C with |C| = 2^k and

2^k [ C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) ] < 2^k · 2^{n−k} = 2^n.
Since we chose k to be the largest integer such that the inequality holds, the inequality
will be reversed for k + 1. Therefore
2^k < 2^n / ( C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) ) ≤ 2^{k+1},
or written differently
|C| = 2^k ≥ 2^{n−1} / ( C(n−1, 0) + C(n−1, 1) + C(n−1, 2) + · · · + C(n−1, d−2) ).
Definition 4.0.15 ([n, k, d]-code). An [n, k, d]-code is a linear code with length n, di-
mension k and minimum distance d.
Example 4.0.16.
(a) Is there a [9, 2, 5]-code?
According to the Gilbert-Varshamov bound such a code will exist if

C(8, 0) + C(8, 1) + C(8, 2) + C(8, 3) = 1 + 8 + 28 + 56 = 93

is less than 2^{9−2} = 2^7 = 128. Since the inequality holds we know such a code exists.
(c) Find bounds on the size and dimension of a linear code C with n = 9 and dmin (C) =
5.
Using the Hamming bound (dmin(C) = 2t + 1 = 5 ⇒ t = 2) we get

|C| ≤ 2^9 / ( C(9, 0) + C(9, 1) + C(9, 2) ) = 512 / (1 + 9 + 36) = 512/46 ≈ 11.1.
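The two bounds used in this example are easy to evaluate numerically; a sketch with my own function names, using math.comb for the binomial coefficients C(n, i):

```python
from math import comb

def hamming_bound(n, t):
    """Largest integer |C| allowed by |C| * sum_{i<=t} C(n, i) <= 2^n."""
    return 2**n // sum(comb(n, i) for i in range(t + 1))

def gv_exists(n, k, d):
    """Gilbert-Varshamov condition for a linear [n, k, d]-code to exist."""
    return sum(comb(n - 1, i) for i in range(d - 1)) < 2**(n - k)

print(hamming_bound(6, 1))  # 9: matches 64/7 = 9 1/7 rounded down
print(hamming_bound(9, 2))  # 11: matches 512/46, about 11.1
print(gv_exists(9, 2, 5))   # True: 93 < 128
```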
(a) The code C = K^n, with dmin(C) = 1 (t = 0), is perfect:

2^n = |K^n| = 2^n / C(n, 0).
(b) The code C = {000 · · · 0, 111 · · · 1} with length n = 2t + 1 and dmin (C) = 2t + 1 is
perfect :
2 = |C| = 2^n / ( C(n, 0) + C(n, 1) + · · · + C(n, t) ) = 2^n / (½ · 2^n) = 2.

Here we used the fact that if n = 2t + 1, then C(n, 0) + C(n, 1) + · · · + C(n, t) contains exactly half of the terms in C(n, 0) + C(n, 1) + · · · + C(n, t) + C(n, t+1) + · · · + C(n, n−1) + C(n, n) = 2^n; by the symmetry C(n, i) = C(n, n−i), this half sums to 2^{n−1} = ½ · 2^n.
The two examples above, K^n and the repetition code, are called the trivial perfect codes.
Theorem 4.0.19 (Tietäväinen and van Lint). If C is a nontrivial perfect code of length n and minimum distance dmin(C) = 2t + 1, then either C has the parameters of a Hamming code (t = 1 and n = 2^r − 1), or C is a Golay code of length 23 with t = 3.
Chapter 5
Hamming Codes
5.1 Introduction
In this chapter we focus our attention on one of the first classes of codes that was discov-
ered, namely the Hamming codes. We do this by describing their parity check matrices.
Definition 5.1.1 (Hamming codes). Let r ≥ 2 and let H be the (2^r − 1) × r matrix whose rows are the nonzero binary words of length r. The linear code that has H as its parity check matrix is called the Hamming code of length 2^r − 1.
Note that H has linearly independent columns since r of its rows contain those binary
words with only a single 1. In the example below (for the case r = 3) these rows are the
last three rows.
Example 5.1.2. If we let r = 3 in the definition above, we find that the Hamming code of length 7 has the following parity check matrix:

H =
[1 1 1]
[1 1 0]
[1 0 1]
[0 1 1]
[1 0 0]
[0 1 0]
[0 0 1].
In the example above we see that the matrix is in standard form. It will always be
possible to place it in this form by ensuring that the appropriate rows are at the bottom
of the matrix. If

H = [ X  ]
    [ Ir ],

then the generator matrix will be of the form

G = [ I_{2^r − 1 − r} | X ].

Therefore the Hamming code has dimension 2^r − r − 1, so that it has 2^{2^r − r − 1} codewords.
Since H has no row equal to zero and no two identical rows, every set of 2 rows is lin-
early independent, so that the minimum distance is at least three. On the other hand there
is a linearly dependent set of three rows, for example {1000 · · · 00, 0100 · · · 00, 1100 · · · 00}.
This implies that the minimum distance of the Hamming code is exactly three, i.e. it is a
1-error correcting code.
Let's consider the Hamming bound in this case: here n = 2^r − 1 and dmin = 3 = 2t + 1, so t = 1. Therefore

2^{2^r − r − 1} = |C| ≤ 2^{2^r − 1} / ( C(2^r − 1, 0) + C(2^r − 1, 1) )
                     = 2^{2^r − 1} / ( 1 + (2^r − 1) )
                     = 2^{2^r − 1} / 2^r
                     = 2^{2^r − r − 1}.

Therefore Hamming codes are perfect.
It's easy to make up a standard decoding array for a Hamming code: the coset leaders are the words of weight at most 1 (see the exercises). All we therefore need is to compute the syndrome of each one.
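Since the syndrome of a single error in position i is exactly row i of H, decoding reduces to looking the syndrome up among the rows. A sketch for the length-7 code of Example 5.1.2 (function names are my own):

```python
H = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1],
     [1, 0, 0], [0, 1, 0], [0, 0, 1]]   # rows: the nonzero words of length 3

def syndrome(w):
    return [sum(w[i] * H[i][j] for i in range(7)) % 2 for j in range(3)]

def correct(w):
    s = syndrome(w)
    if s == [0, 0, 0]:
        return w                        # no error detected
    i = H.index(s)                      # error position: syndrome = row i of H
    return w[:i] + [w[i] ^ 1] + w[i + 1:]

v = [1, 0, 0, 0, 1, 1, 1]               # a codeword (first row of G = [I4 | X])
w = v[:]; w[2] ^= 1                     # introduce a single error
print(correct(w) == v)                  # True
```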
Example 5.1.3. (Continued from previous example.)
Recall that equivalent codes have exactly the same parameters. Consider then the Hamming code whose parity check matrix has as its rows the nonzero binary words of length r in numerical order. For the example above this would mean that
H′ =
[0 0 1]
[0 1 0]
[0 1 1]
[1 0 0]
[1 0 1]
[1 1 0]
[1 1 1].
p1 = x1 + x2 + x3 ,
p2 = x1 + x2 + x4 ,
p3 = x1 + x3 + x4 .
Definition 5.2.2 (Extended code). Let C be a linear code of length n and C ∗ a code
obtained from C by adding one extra digit to each codeword so that each word in C ∗ has
even weight. Then C ∗ is called the extended code of C.
If C is a linear code with parity check matrix H, consider the following matrix

H* = [ H  j ]
     [ 0  1 ],

obtained from H by adding a column j of 1's to H, and then adding a row that has zeros everywhere except in its last entry. Thus the final column of H* is made up of 1's.
Then H ∗ has n − k + 1 linearly independent columns and for every v ∈ C ∗, vH ∗ = 0.
Furthermore, dim(C ∗ ) = k : We can use the same set of basis vectors for C ∗ that was used
for C by adding a digit to each one of the basis vectors such that each new basis vector
has even weight. Each new codeword is still a sum of these new basis vectors. Therefore
C ∗ is the null space of H ∗ , implying that C ∗ is a linear code. As discussed above the
generator matrix, G∗ , for C ∗ can be obtained from the generator matrix of C by adding
a bit to every row such that each row has even weight. We then again have
G*H* = [G | i] [ H  j ]
               [ 0  1 ]
     = [ GH | Gj + i ]
     = 0,

where i represents the column that was added to G so that each row of G* = [G | i] now has an even number of 1's. Here GH = 0 as before, and each entry of the last column, Gj + i, is just the sum of the entries of the corresponding row of [G | i]; since each row has an even number of 1's, such a sum equals 0.
Example 5.2.3. Extending the Hamming code of length 7 we get the following parity check matrix.
Therefore
dmin(C∗) = { dmin(C)        if dmin(C) is even,
           { dmin(C) + 1    if dmin(C) is odd.
Thus if dmin (C) is odd, then C ∗ will detect one more error than C. This implies that
extending C is only useful when dmin (C) is odd.
The extended Hamming code of length 8, Ce , has a minimum distance of 4 (it was
3 before it was extended). Looking back at the example above in which the generator
and parity check matrices were given for this code, we see that any two rows of G∗ are
orthogonal (even if the rows aren’t distinct). Therefore
G∗ (G∗ )T = 0.
This implies that the rows of G∗ are all in the dual code, Ce⊥ . The dual code has dimension
4 and G∗ has 4 linearly independent rows. Therefore the rows of G∗ are (also) a basis for
Ce⊥ . This shows that the extended Hamming code of length 8 is self-dual : Ce = Ce⊥ .
Chapter 6
Golay Codes
Step 1 : Let B1 be the 11 × 11 matrix whose first row is 11011100010 and each subsequent
row is obtained by cyclically shifting its predecessor one position left. That is
B1 = [ 1 1 0 1 1 1 0 0 0 1 0
       1 0 1 1 1 0 0 0 1 0 1
       0 1 1 1 0 0 0 1 0 1 1
       1 1 1 0 0 0 1 0 1 1 0
       1 1 0 0 0 1 0 1 1 0 1
       1 0 0 0 1 0 1 1 0 1 1
       0 0 0 1 0 1 1 0 1 1 1
       0 0 1 0 1 1 0 1 1 1 0
       0 1 0 1 1 0 1 1 1 0 0
       1 0 1 1 0 1 1 1 0 0 0
       0 1 1 0 1 1 1 0 0 0 1 ].
1. B = B^T (B is symmetric, since B1 is).
2. BB^T = BB = I.
Step 3 : The extended Golay code, C24 , is the linear code with generator matrix
G = [I12 | B] =

[ 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 1 0 1
  0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1
  0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 1 1 1
  0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 1 0 1 1 0 1
  0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 1 0 1 1
  0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 1 1 1
  0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 1 1
  0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 0 1 1 1 0 1
  0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 1 0 0 1
  0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 1 1 1 0 0 0 1
  0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 1 1 1 0 0 0 1 1
  0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 ].
We now have
2. The matrix

   [ B
     I12 ]

   is a parity check matrix for C24.
3. Furthermore

   H = [ I12
         B ]

   is also a parity check matrix for C24 (with respect to the same generator matrix G) :

   GH = [I12 | B] [ I12
                    B ] = I12 + BB = I + I = 0,
5. C24 is self-dual :
GG^T = GH = 0

(note that G^T = [I12 ; B] = H, since B = B^T), therefore every basis vector for C24 belongs to C24⊥. Further, dim(C24) = dim(C24⊥) = 12, so that the basis vectors for C24 are also basis vectors for C24⊥. This implies that C24 = C24⊥.
The last step follows from the fact that G has rows of weight 8 and each of
these rows is also a codeword of C24.
(c) C24 has no codewords of weight 4 :
Suppose v ∈ C24 has weight 4. We noted earlier that both [I12 | B] and [B | I12]
are generator matrices for C24. Therefore v = w1[I12 | B] = w2[B | I12] for
some w1, w2 ≠ 0. Now neither of the two halves of v can be identically zero,
because of the identity matrices in the generator matrices and because
w1, w2 ≠ 0. Further, if either half of v contained only one 1, then v would equal
a row of either [I12 | B] or [B | I12], but each row has weight at least eight.
Therefore each half of v must contain exactly two 1's. This implies that
wt(w1) = wt(w2) = 2, but the sum of two rows of B has weight at least 4.
Therefore wt(v) = wt(w1) + wt(w1B) ≥ 2 + 4 = 6 > 4, a contradiction.
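All of the properties quoted above can be verified mechanically. The following Python sketch (our own, not part of the notes) rebuilds B from the first row of B1 and checks B = B^T, BB = I and the weight distribution of C24:

```python
first = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
B1 = [first[i:] + first[:i] for i in range(11)]       # cyclic left shifts
B = [row + [1] for row in B1] + [[1] * 11 + [0]]      # B = [B1 j; j^T 0]
I12 = [[int(i == j) for j in range(12)] for i in range(12)]
G = [I12[i] + B[i] for i in range(12)]                # G = [I12 | B]

assert all(B[i][j] == B[j][i] for i in range(12) for j in range(12))   # B = B^T
assert all(sum(B[i][k] * B[k][j] for k in range(12)) % 2 == I12[i][j]
           for i in range(12) for j in range(12))                      # BB = I

def encode(u):
    """The codeword uG of C24 for a binary message u of length 12."""
    return [sum(u[k] * G[k][j] for k in range(12)) % 2 for j in range(24)]

weights = {sum(encode([u >> k & 1 for k in range(12)]))
           for u in range(1, 2**12)}
assert sorted(weights) == [8, 12, 16, 24]             # in particular dmin = 8
```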
Recall, that the extended Golay code, C24, has generator matrix [I12 | B], where
B = [ B1   j
      j^T  0 ].
The Golay code C23 is obtained by removing the last digit from every codeword in C24 .
Therefore it has the generator matrix G = [I12 | B′], where

B′ = [ B1
       j^T ].
Then G is a 12 × 23 matrix with linearly independent rows. Therefore C23 has length
n = 23 and dimension k = 12.
Notice that the extended Golay code, C24 , is the extension (as defined previously) of
the Golay code C23 (or C23 resulted from puncturing C24 ). Since the extended code, C24 ,
has minimum distance 8, C23 has minimum distance 7 or 8. Since G has rows of weight
7, dmin (C23 ) = 7. Thus C23 is a [23, 12, 7]-code. This also means that C23 is perfect (as
the computation below shows) and so corrects all error patterns of weight at most 3 and
no others (every word in K^23 is within distance 3 of some codeword in C23).
2^12 = |C23| = 2^23 / [ (23 choose 0) + (23 choose 1) + (23 choose 2) + (23 choose 3) ]
             = 2^23 / (1 + 23 + 253 + 1771)
             = 2^23 / 2^11
             = 2^12.
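The computation is easily confirmed; a two-line Python check (ours):

```python
from math import comb

sphere = sum(comb(23, i) for i in range(4))   # 1 + 23 + 253 + 1771
assert sphere == 2**11
assert 2**23 // sphere == 2**12               # |C23| meets the bound exactly
```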
Chapter 7
Reed-Muller Codes
The rth order Reed-Muller [?, ?] code (of length 2^m), denoted by RM(r, m), 0 ≤ r ≤ m,
is defined recursively as follows.

1. RM(0, m) = {00···0, 11···1}, the two constant words of length 2^m. Further, RM(m, m) = K^{2^m}.

2. For 0 < r < m, RM(r, m) = {(x, x + y) | x ∈ RM(r, m − 1) and y ∈ RM(r − 1, m − 1)}.
Example 7.0.2.
RM (0, 0) = {0, 1}.
RM (0, 1) = {00, 11}.
RM (1, 1) = {00, 01, 10, 11}.
RM (0, 2) = {0000, 1111}.
RM (1, 2) = {(x, x + y) | x ∈ RM (1, 1) and y ∈ RM (0, 1)},
= {(x, x + y) | x ∈ {00, 01, 10, 11} and y ∈ {00, 11}}
= {0000, 0011, 0101, 0110, 1010, 1001, 1111, 1100}.
RM(2, 2) = K^4.
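The recursion translates directly into code. A Python sketch of our own (feasible only for small m, since it lists RM(m, m) = K^{2^m} exhaustively):

```python
def RM(r, m):
    """All codewords of RM(r, m), as bit-tuples, via the recursive definition."""
    if r == 0:
        return {(0,) * 2**m, (1,) * 2**m}
    if r == m:
        # K^(2^m): every binary word of length 2^m
        return {tuple(u >> k & 1 for k in range(2**m))
                for u in range(2**(2**m))}
    return {x + tuple(a ^ b for a, b in zip(x, y))      # the word (x, x + y)
            for x in RM(r, m - 1) for y in RM(r - 1, m - 1)}
```

RM(1, 2) reproduces the eight codewords listed above, and RM(1, 3) has 2^4 = 16 codewords of minimum nonzero weight 4 = 2^{3−1}.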
Let G(r, m) denote the generator matrix for RM (r, m). These can be defined recur-
sively.
1. G(0, m) = [ 1 1 ··· 1 ]   (2^m ones).

2. For 0 < r < m,

   G(r, m) = [ G(r, m − 1)   G(r, m − 1)
               0             G(r − 1, m − 1) ].

3. G(m, m) = [ G(m − 1, m)
               0 0 ··· 0 1 ].
Example 7.0.3.
1.
G(0, 1) = [1 1].
2.
" #
1 1
G(1, 1) = .
0 1
3.
G(1, 2) = [ G(1, 1)   G(1, 1)
            0         G(0, 1) ] = [ 1 1 1 1
                                    0 1 0 1
                                    0 0 1 1 ].
4.
G(2, 2) = [ G(1, 2)
            0 0 0 1 ] = [ 1 1 1 1
                          0 1 0 1
                          0 0 1 1
                          0 0 0 1 ].
5.
G(1, 3) = [ G(1, 2)   G(1, 2)
            0         G(0, 2) ] = [ 1 1 1 1 1 1 1 1
                                    0 1 0 1 0 1 0 1
                                    0 0 1 1 0 0 1 1
                                    0 0 0 0 1 1 1 1 ].
Theorem 7.0.4. The rth order Reed-Muller code RM(r, m) has the following properties.

1. Length n = 2^m.

2. Minimum distance d = 2^{m−r}.

3. Dimension
   k = (m choose 0) + (m choose 1) + · · · + (m choose r).

4. RM(r − 1, m) ⊆ RM(r, m) for r > 0.

Proof.
We only prove 1, 2 and 4.
All proofs are by induction. From the previous example we see that all statements are
true when m = 0, 1, 2.
1. When r = 0 or r = m this follows from the definition. If 0 < r < m then RM (r, m) is
constructed from two Reed-Muller codes, both of length 2^{m−1}. Therefore RM(r, m)
will have length 2^{m−1} + 2^{m−1} = 2^m.
4. The proof uses induction on r + m. We see from a previous example that RM(0, 1) ⊆
RM(1, 1), therefore the statement is true if r + m ≤ 2. Assume that if r + m < t and
r > 0, then RM (r − 1, m) ⊆ RM (r, m) and consider the situation where r + m = t
and r > 0. There are three cases.
and
G(r, m) = [ G(r, m − 1)   G(r, m − 1)
            0             G(r − 1, m − 1) ].
By hypothesis G(r − 1, m− 1) is a sub-matrix of G(r, m− 1) and G(r −2, m− 1)
is a sub-matrix of G(r − 1, m − 1). Therefore G(r − 1, m) is a sub-matrix of
G(r, m) and the result follows by induction.
2. Again the proof uses induction on r+m. The result is clear if r+m = 0 or r+m = 1.
Suppose the minimum distance of RM(s, t) is 2^{t−s} whenever s + t < l and consider
the case r + m = l and the code RM (r, m). If r = 0 or r = m, then the result is
true, so assume 0 < r < m. We know
d(u, w) > 6, then d(w, u + 11111111) < 2 and so w may be decoded to u + 11111111.
Therefore no more than half the codewords need to be examined to find a codeword closest
to w, if one exists. Note that 11111111 is always a codeword so that u + 11111111 is a
codeword if u is a codeword.
Chapter 8
Decimal Codes
Group Identifier
This identifies the group of countries where the book was published. It has more digits
if the group produces fewer books. As an example x1 = 0 identifies the English speaking
countries, x1 = 3 the German speaking countries and x1 x2 = 87 Denmark.
Publisher Prefix
This is anywhere from 2 to 7 digits and identifies the publisher.
Title Number
It is 1 to 6 digits in length and is assigned by the publisher.
Parity Check
The tenth digit, x10 , is chosen so that
1 · 0 + 2 · 2 + 3 · 0 + 4 · 1 + 5 · 3 + 6 · 4 + 7 · 2 + 8 · 9 + 9 · 2 + 10 · 8
= 0 + 4 + 0 + 4 + 15 + 24 + 14 + 72 + 18 + 80
≡ 4 + 4 + 4 + 2 + 3 + 6 + 7 + 3 (mod 11)
≡0 (mod 11).
1 · x1 + 2 · x2 + · · · + 10 · x10 + i · e ≡ 0 + i · e (mod 11).

Suppose i · e ≡ 0 (mod 11). This implies that 11 | i · e, so that 11 | i or 11 | e, but
neither i nor e is congruent to 0 mod 11. Therefore we will have a nonzero sum if
an error occurred.
• It is possible to correct an error if you know which digit is in error. We show this
by way of an example.
1 · 0 + 2 · 1 + 3 · 3 + 4 · 8 + 5 · 3 + 6 · x + 7 · 0 + 8 · 9 + 9 · 4 + 10 · 3 ≡ 6x + 9 ≡ 0 (mod 11),
∴ 6x ≡ −9 ≡ 2 (mod 11),
∴ x ≡ 4 (mod 11).
• The ISBN code detects any single error arising from transposing two digits (adjacent
or not).
Let x1 x2 · · · x10 be the correct ISBN number. Assume that xi and xj get interchanged
in the recording of the number (i < j). The check sum becomes
This simplifies to
Adding the two equations together we get 10x9 ≡ 0 (mod 11), which implies that x9 = 0.
From this we get x10 = 1.
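The ISBN check is a one-liner in Python. A sketch of our own, using the worked digits 0 2 0 1 3 4 2 9 2 8 from the example above:

```python
def isbn_sum(digits):
    """Weighted sum 1*x1 + 2*x2 + ... + 10*x10; valid ISBNs give 0 mod 11."""
    return sum(i * x for i, x in enumerate(digits, start=1)) % 11

valid = [0, 2, 0, 1, 3, 4, 2, 9, 2, 8]                  # the worked example
assert isbn_sum(valid) == 0
assert isbn_sum([1, 2, 0, 1, 3, 4, 2, 9, 2, 8]) != 0    # a single error
assert isbn_sum([2, 0, 0, 1, 3, 4, 2, 9, 2, 8]) != 0    # a transposition
```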
So in general we will find that
Suppose now that x1 x2 · · · x10 is sent and a single error e ≠ 0 occurs in the i'th digit.
Then the received word becomes x1 x2 · · · xi−1 (xi + e)xi+1 · · · x10 . Therefore
So, the second equation gives the magnitude of the error, e. Once we know this we can
also calculate the position of the error : i ≡ e^{−1} S1 ≡ S2^{−1} S1 (mod 11).
Therefore the decoding may be summed up as follows.
1. If the received word is r, then compute the syndrome,
rH = ( S1 , S2 )   (mod 11),

where

H = [  1 1
       2 1
       3 1
       ···
      10 1 ].
Note that point 4 above always occurs when two different digits are transposed (S1 6= 0
and S2 = 0). Therefore this code can detect all errors involving the transposition of two
digits.
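The decoding procedure can be sketched as follows (our own Python; `pow(x, -1, 11)` computes an inverse mod 11 and needs Python 3.8 or later, and the 10-digit codeword used below is of our own construction, chosen so that S1 ≡ S2 ≡ 0):

```python
def decode(r):
    """Single-error decoding: magnitude e = S2, position i = S2^(-1)*S1 mod 11."""
    S1 = sum(i * x for i, x in enumerate(r, start=1)) % 11
    S2 = sum(r) % 11
    if S1 == 0 and S2 == 0:
        return r                       # no error detected
    if S2 == 0:
        return None                    # S1 != 0, S2 == 0: transposition detected
    i = (pow(S2, -1, 11) * S1) % 11
    if i == 0:
        return None                    # inconsistent syndromes: more than one error
    w = r[:]
    w[i - 1] = (w[i - 1] - S2) % 11    # remove the error e = S2 from digit i
    return w

codeword = [1, 2, 3, 4, 5, 6, 7, 7, 0, 9]      # S1 = 286 ≡ 0, S2 = 44 ≡ 0 (mod 11)
assert decode(codeword) == codeword
received = [1, 2, 3, 4, 8, 6, 7, 7, 0, 9]      # error e = 3 in position 5
assert decode(received) == codeword
```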
• Length n = 10.
• The last four digits, x7 x8 x9 x10 , are the parity check digits. They are chosen so that
S1 = Σ_{i=1}^{10} i · xi ≡ 0 (mod 11),

S2 = Σ_{i=1}^{10} xi ≡ 0 (mod 11),

S3 = Σ_{i=1}^{10} i² · xi ≡ 0 (mod 11),

S4 = Σ_{i=1}^{10} i³ · xi ≡ 0 (mod 11).
The digits x1 through x6 will be known since they are information digits. Therefore the
equations above will assume the form
Suppose that x1 x2 · · · x10 is sent and that a single error occurs in location i and is
of size e ≠ 0. This situation is exactly the same as in the previous section and we will be
able to correct this single error if we know that only this error occurred. It turns out that
it is indeed possible to detect the presence of only one error. To do this we examine the
general case of two errors first.
Suppose therefore that two errors occur in locations i and j with sizes e1 ≠ 0 and
e2 ≠ 0. The presence of these errors affects the parity check equations above as follows.
ax2 + bx + c,
where
a = S1² − S2 S3,
b = S2 S4 − S1 S3,
c = S3² − S1 S4.
This quadratic can be solved using a formula analogous to the one used for real numbers.
The derivation of this formula is done below. Note that all calculations are in Z11 and
that the square-root used here may not exist. The root represents the number in Z11 that
upon squaring yields the number under the root sign.
ax² + bx + c = 0,   a ≠ 0,
x² + a⁻¹bx + a⁻¹c = 0,
x² + a⁻¹bx + (2⁻¹a⁻¹b)² − (2⁻¹a⁻¹b)² = −a⁻¹c,
(x + 2⁻¹a⁻¹b)² = (2⁻¹a⁻¹b)² − a⁻¹c,
x + 2⁻¹a⁻¹b = ±√(2⁻²a⁻²b² − a⁻¹c),
x = −2⁻¹a⁻¹b ± 2⁻¹a⁻¹√(b² − 4ac)
  = [−b ± √(b² − 4ac)](2a)⁻¹.
So, using this formula we can find i and j and then solve
S1 ≡ ie1 + je2 ,
S2 ≡ e1 + e2 ,
4. If the syndrome (S1, S2, S3, S4) ≠ 0 and a ≠ 0, c ≠ 0 and b² − 4ac is a square in Z11, then there are two errors
e1 and e2 in locations i and j as before.
S1 = 1 · 3 + 2 · 2 + 3 · 5 + 4 · 4 + · · · + 10 · 6 ≡ 2 (mod 11),
S2 = 3 + 2 + 5 + · · · + 6 ≡ 1 (mod 11),
S3 = 1² · 3 + 2² · 2 + · · · + 10² · 6 ≡ 10 (mod 11),
S4 = 1³ · 3 + 2³ · 2 + · · · + 10³ · 6 ≡ 3 (mod 11).
So that
Now
Adding 4 times S2 to S1 gives 7e1 ≡ 6, implying that e1 = 4 and e2 = 8. Thus the decoded
word is 3214574396.
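The quadratic step can be automated; a Python sketch (ours) of the formula derived above:

```python
def solve_quadratic_mod11(a, b, c):
    """Roots in Z11 of ax^2 + bx + c, via x = (-b ± sqrt(b^2 - 4ac)) (2a)^(-1);
    if b^2 - 4ac is not a square in Z11 there are no roots."""
    disc = (b * b - 4 * a * c) % 11
    sqrts = [r for r in range(11) if (r * r) % 11 == disc]
    inv2a = pow(2 * a, -1, 11)
    return sorted({((-b + r) * inv2a) % 11 for r in sqrts})

assert solve_quadratic_mod11(1, 2, 3) == [2, 7]   # (x-2)(x-7) ≡ x^2+2x+3 (mod 11)
assert solve_quadratic_mod11(1, 0, 1) == []       # -1 is not a square mod 11
```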
Example 8.4.1. A 50 cents off Kellogg’s Rice Krispies coupon has UPC 53800051150x12 .
Therefore

3(5 + 8 + 0 + 5 + 1 + 0) + (3 + 0 + 0 + 1 + 5) + x12 = 57 + 9 + x12 ≡ 0 (mod 10).

Thus x12 = 4.
Suppose that a single error of size e ≠ 0 occurs in the i'th digit. The check sum
either changes by e or by 3e depending on i being even or odd. Since gcd(10, 3) = 1,
10|3e ⇐⇒ 10|e. Further 1 ≤ e ≤ 9, so that 10 does not divide e. Thus the error is
detectable since we have a nonzero checksum. Note that the error cannot be corrected
unless you know which digit is in error.
If adjacent digits xi and xj are transposed then the change to the check sum is −3xi +
xi − xj + 3xj = 2(xj − xi). This type of error is detected unless (xj − xi) = ±5.
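A sketch of the check-digit computation (our own code, assuming the standard UPC weighting consistent with the discussion above: odd positions weighted 3, even positions weighted 1):

```python
def upc_check_digit(d):
    """Check digit for the first 11 UPC digits, mod 10."""
    s = sum(3 * x if i % 2 == 1 else x for i, x in enumerate(d, start=1))
    return (-s) % 10

# the Rice Krispies coupon above: 53800051150 -> check digit 4
assert upc_check_digit([5, 3, 8, 0, 0, 0, 5, 1, 1, 5, 0]) == 4
```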
Here x1 x2 x3 · · · x10 represents the number with the digits x1 , x2 , . . . , x10 and is not
the product of the numbers x1 , x2 , . . . , x10. Also 0 ≤ x11 ≤ 8.
Should an error occur in x11 then it will be detected since x1 + x2 + x3 + · · · + x10 will
not be equivalent to x11 + e (mod 9).
If an error occurs in the first 10 digits then it will be detected unless a 0 is changed
into a 9 or 9 is changed into a 0. To estimate the probability of this we proceed as follows.
The probability that the i'th digit is a 0 is 10^9/10^10 = 0.1. The probability of the error
being a 9 is 1/9. Therefore the probability of changing a 0 into a 9 is 1/90. Similarly we
find that the probability of changing a 9 into a 0 is also 1/90. So, the probability of an
undetectable error occurring in the i'th (1 ≤ i ≤ 10) digit is 2/90.
• The code is printed in a bar-coded format on the envelope. The bar-code is delimited
by two long bars on either side. Each decimal digit is represented by a group of 5
long and short bars.
• Each such group of 5 bars has exactly 2 long and 3 short bars. Thus there are
(5 choose 2) = 10 such groups, one for each decimal digit.
• We can think of a long bar as representing a 1 and a short bar representing a 0. Then
the correspondence between the bars and the decimal digits is given by : if abcde
is the binary sequence of bars then 7a + 4b + 2c + d is the decimal digit associated
with it. In tabular form this is
Binary Decimal
00011 1
00101 2
00110 3
01001 4
01010 5
01100 6
10001 7
10010 8
10100 9
11000 0
For example, the bar pattern long, short, short, long, short represents 10010,
which in turn represents 8.
Suppose that the scanner makes a single error when reading the code (i.e. it reads a 1
as 0 or a 0 as 1). Then some block of length 5 doesn’t have 2 ones — so the block itself
can be identified. If one error occurs in the i’th block, then the decimal digit associated
with it will be undefined. This digit can be recovered from the parity check equation since
we now know which digit is in error.
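In code (a sketch of our own):

```python
# 1 = long bar, 0 = short bar; decoding follows the 7a + 4b + 2c + d rule,
# with 11000 giving 11, which stands for the digit 0 (hence the mod 11)
def digit(bars):
    """Decode one 5-bar group; a group without exactly two long bars is a
    scanner error, flagged as None and recoverable from the parity check."""
    if bars.count("1") != 2:
        return None
    a, b, c, d, _ = (int(x) for x in bars)
    return (7 * a + 4 * b + 2 * c + d) % 11

assert digit("10010") == 8
assert digit("11000") == 0
assert digit("11100") is None      # misread: three long bars
```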
Chapter 9
Hadamard Codes
Suppose we want a 2-error correcting code, C, of length 11. Then according to the
Hamming bound we have
|C| ≤ 2^11 / [ (11 choose 0) + (11 choose 1) + (11 choose 2) ] = 2048/67 ≈ 30.57.
Therefore C can have at most 30 codewords. Since 2^4 < 30 < 2^5, the largest binary, linear
2-error correcting code of length 11 will have at most 2^4 = 16 codewords.
On the other hand there exists a nonlinear code of length 11 that is 2-error correcting
and that has 24 codewords — a definite improvement. This is a Hadamard code and its
construction is presented at the end of this chapter. To be able to construct these codes
we need a fair amount of information on Hadamard matrices. After we have constructed
these matrices we use their rows as the codewords for the Hadamard code.
9.1 Background
Definition 9.1.1 (Hadamard Matrix). A Hadamard matrix, H, of order n is an n× n
matrix of +1’s and −1’s such that
HH T = nI.
Therefore any two distinct rows of a Hadamard matrix are orthogonal (using the real
inner product). Furthermore, multiplying any row or column of a Hadamard matrix by −1
changes it into another Hadamard matrix. This allows us to put a given Hadamard matrix
into the so-called normalised form where the first row and first column only contains 1’s.
Example 9.1.2. Normalised Hadamard matrices of orders 1, 2, 4 and 8 are shown below.
Note that +1’s are only shown as +, while −1 are shown as −.
n = 1 : H1 = [ + ],      n = 2 : H2 = [ + +
                                        + − ],

n = 4 : H4 = [ + + + +
               + − + −
               + + − −
               + − − + ],

n = 8 : H8 = [ + + + + + + + +
               + − + − + − + −
               + + − − + + − −
               + − − + + − − +
               + + + + − − − −
               + − + − − + − +
               + + − − − − + +
               + − − + − + + − ].
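The matrices of order 2^k above all arise from the doubling construction H_{2n} = [H H; H −H]; a Python sketch of our own:

```python
def sylvester(n):
    """A Hadamard matrix of order n (n a power of 2) by repeated doubling."""
    H = [[1]]
    while len(H) < n:
        H = [r + r for r in H] + [r + [-x for x in r] for r in H]
    return H

H8 = sylvester(8)
# HH^T = nI: distinct rows are orthogonal, each row has squared length 8
assert all(sum(H8[i][k] * H8[j][k] for k in range(8)) == (8 if i == j else 0)
           for i in range(8) for j in range(8))
```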
Proof.
By the previous example we know that Hadamard matrices of order 1 and 2 exist.
Let H be a Hadamard matrix of order n > 2. Put H in normalised form; this does
not change n, so the first row of H is made up entirely of +1's. The inner product
of the first row with any other row has to be zero, but this inner product is just the
sum of the entries of that other row. Therefore all rows of H (except the first) have an
equal number of +1's and −1's.
By permuting the columns of H we can change the second row into one that has as its
first n/2 entries +1’s and its second n/2 entries all −1’s. Therefore the first two rows of
H are as follows.
          n/2 entries        n/2 entries
Row 1 :   + + ··· + +        + + ··· + +
Row 2 :   + + ··· + +        − − ··· − −
Let u be any row of H except the first two. Again u has n/2 +1’s and n/2 −1’s. Say u
has x +1’s in its first n/2 positions (so that it has (n/2 − x) −1’s in its first n/2 positions)
and y +1’s in its second n/2 positions (and therefore (n/2 − y) −1’s in its second n/2
positions). Since the inner product between u and the second row is zero we have
x − (n/2 − x) − y + (n/2 − y) = 2x − 2y = 0,
∴ x − y = 0.
Further, the inner product between u and the first row is also zero and this leads to

x − (n/2 − x) + y − (n/2 − y) = 2x + 2y − n = 0,
∴ x + y = n/2.

Combining the two results gives x = y = n/4. Since x is an integer, n must be divisible by 4.
Definition 9.1.4 (Quadratic residue). Let p be an odd prime. The nonzero squares
modulo p : 1², 2², 3², . . . reduced (mod p), are called quadratic residues (mod p) (or just
residues (mod p)).
Only the squares 1², 2², . . . , ((p − 1)/2)² need to be considered, since (p − a)² = p² − 2pa + a² ≡ a² (mod p).
These are all distinct, for if i² ≡ j² (mod p) with 1 ≤ i, j ≤ (p − 1)/2 then
(i − j)(i + j) = i² − j² ≡ 0 (mod p). That is, p | (i − j) or p | (i + j). In the first case
i ≡ j (mod p) and in the latter case i ≡ −j (mod p), but (p − 1)/2 + 1 ≤ p − j ≤ p − 1
while 1 ≤ i ≤ (p − 1)/2. So i² ≡ j² (mod p) is only possible if i ≡ j (mod p).
Therefore there are (p − 1)/2 quadratic residues (mod p). The remaining (p − 1)/2
numbers are called nonresidues.
Example 9.1.5. For p = 11, the quadratic residues are

1² = 1,
2² = 4,
3² = 9,
4² = 16 ≡ 5 (mod 11) and
5² = 25 ≡ 3 (mod 11).
1. The product of two quadratic residues or of two nonresidues is a residue, while the
product of a residue and a nonresidue is a nonresidue.
3. We define the following function χ on Zp, also known as the Legendre symbol (but we
use a different notation to simplify later expressions) : χ(a) = 0 if a ≡ 0 (mod p),
χ(a) = 1 if a is a residue (mod p), and χ(a) = −1 if a is a nonresidue (mod p).
(i) First construct the Jacobsthal matrix Q = (qij ). This is a p × p matrix whose rows
and columns are labelled 0, 1, 2, . . . , p − 1 and qij = χ(j − i).
Note that qij = χ(j − i) = χ(−1)χ(i − j) = −qji, since p is of the form 4k + 3 and,
by property 2, −1 is a nonresidue. That is, Q is skew-symmetric : Q^T = −Q.
(iii) Let

H = [ 1    j
      j^T  Q − I ].
Therefore
HH^T = [ p + 1   0
         0       (p + 1)I ] = (p + 1)I.
Thus H is a normalised Hadamard matrix of order p + 1 which is said to be of Paley
type.
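For p = 11 the whole construction fits in a few lines; this sketch of ours checks HH^T = 12I:

```python
p = 11
residues = {(i * i) % p for i in range(1, p)}             # {1, 3, 4, 5, 9}

def chi(a):
    """The Legendre-symbol function: 0 at 0, +1 on residues, -1 otherwise."""
    a %= p
    return 0 if a == 0 else (1 if a in residues else -1)

Q = [[chi(j - i) for j in range(p)] for i in range(p)]    # Jacobsthal matrix
H = [[1] * (p + 1)] + \
    [[1] + [Q[i][j] - (i == j) for j in range(p)] for i in range(p)]  # [1 j; j^T Q-I]

assert all(sum(H[i][k] * H[j][k] for k in range(p + 1)) == ((p + 1) if i == j else 0)
           for i in range(p + 1) for j in range(p + 1))   # HH^T = 12 I
```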
The first twelve rows form the (11, 12, 6) Hadamard code A12. All 24 rows form the
(11, 24, 5) Hadamard code B12 (this is the 2-error correcting code with 24 codewords
referred to at the start of the chapter).
Theorem 9.3.1 (Plotkin bound). For any (n, M, d) code C with dmin = d > n/2 we
have
M ≤ 2 ⌊ d/(2d − n) ⌋.
Proof.
We will assume throughout that M ≥ 2, which is what is needed for a code to be useful.
Consider the sum
S = Σ_{u∈C} Σ_{v∈C} d(u, v).
∴ ((2d − n)/2) M² − dM ≤ 0,
∴ ((2d − n)/2) M ≤ d,
∴ M ≤ 2d/(2d − n).
The last step is possible since d > n/2. We have ⌊2x⌋ ≤ 2⌊x⌋ + 1, so

M ≤ 2 ⌊ d/(2d − n) ⌋ + 1,

but since M is even

M ≤ 2 ⌊ d/(2d − n) ⌋.
Suppose now that M is odd. Then S is maximised if all xi = (M − 1)/2. In this case
S ≤ n · 2 · ((M − 1)/2) · ((M + 1)/2) = (n/2)(M² − 1).
Therefore
M(M − 1)d ≤ (n/2)(M² − 1),
∴ M(M − 1)d ≤ (n/2)(M − 1)(M + 1),
∴ Md ≤ (n/2)(M + 1),
∴ Md − (n/2)M ≤ n/2,
∴ ((2d − n)/2) M ≤ n/2,
∴ M ≤ n/(2d − n) = 2d/(2d − n) − 1.
Again using b2xc ≤ 2bxc + 1, we find
M ≤ 2 ⌊ d/(2d − n) ⌋ + 1 − 1 = 2 ⌊ d/(2d − n) ⌋.
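As a quick numerical check (ours), the (11, 12, 6) code A12 from the previous section meets this bound with equality:

```python
n, d = 11, 6            # parameters of A12; note d > n/2, as the bound requires
assert 2 * d > n
plotkin = 2 * (d // (2 * d - n))
assert plotkin == 12    # A12 has exactly 12 codewords
```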
Chapter 10
Introduction to Cryptography
• A key set, K (also called the key space). Often keys are in a (key) pair k = (e, d),
where e is used for encryption and d is used for decryption.
• Encryption function Ek : M → C.
Our goal is to provide secrecy and the ability to detect altered or forged messages.
Although the delivery of messages cannot always be guaranteed (meaning that the message
may not arrive at all, or may not be the original, intended message) we can send messages
regularly to discover communications disruptions.
“Breaking” a cryptosystem will be taken to mean that the plain-text can be discovered
from the cipher-text. This leads to various levels of security being defined.
A cryptosystem is
• Unconditionally secure if the adversary can’t gain any knowledge about the plain-
text (except maybe the length) regardless of the amount of cipher-text available and
the amount of computing resources available.
3. The apparatus should be portable and be of such a nature that one person can
operate it.
The security of the cipher should rest with the keys alone. That is, security is
maintained even if the adversary knows the encryption scheme.
• Known plain-text attack. The adversary has some quantity of plain-text and the
corresponding cipher-text.
One of the earliest ciphers is the simple substitution cipher. In this system the
keys are just permutations of A. The following example illustrates this.
Example 10.1.3. (The shift cipher).
Number A, B, C, . . . , Z by 0, 1, 2, . . . , 25. The encryption key e is a fixed shift (mod 26).
That is
α 7→ α + e (mod 26).
The decryption key d = 26 − e; the additive inverse (mod 26). If e = 3 then d = 23,
E3 (HOCKEY) = KRFNHB and D23 (KRFNHB) = HOCKEY.
This system is completely insecure against chosen plain-text attacks — choose as text
A, B, C, . . . , Z.
Two examples of symmetric key encryption schemes are the simple substitution cipher
where the keys are permutations of the alphabet and the shift cipher where the key
specifies the amount by which a letter is shifted mod 26.
Note that for decryption to be possible, Ek(x) = ax + b (mod 26) needs to be one-to-one. This implies that
we need gcd(a, 26) = 1; any b may be used.
Therefore the encryption key e is a pair, (a, b), with gcd(a, 26) = 1. The number of
encryption keys thus is φ(26) · 26 = 12 · 26 = 312, where φ is Euler’s phi function.
As far as decryption is concerned, note that if y ≡ ax + b (mod 26), then y − b ≡ ax
mod 26, so that x ≡ a−1 (y − b) (mod 26). Here a−1 is the multiplicative inverse of a
among the elements relatively prime to 26. We can determine a−1 as follows : since
gcd(a, 26) = 1 there exist integers r and s such that ar + 26s = 1. These integers can be
found using the Euclidean algorithm. Therefore ar ≡ 1 (mod 26), so that a⁻¹ ≡ r (mod 26).
Therefore the decryption key is also a pair (a−1 , b) and the decryption function is
Dk (y) = a−1 (y − b).
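A Python sketch of the affine cipher (our own code; it reproduces the G → T, A → D, R → S, Y → P computations of the next example):

```python
from math import gcd

def affine_encrypt(msg, a, b):
    """E_k(x) = ax + b (mod 26) on the letters A..Z; needs gcd(a, 26) = 1."""
    assert gcd(a, 26) == 1
    return "".join(chr((a * (ord(ch) - 65) + b) % 26 + 65) for ch in msg)

def affine_decrypt(msg, a, b):
    """D_k(y) = a^(-1)(y - b) (mod 26)."""
    a_inv = pow(a, -1, 26)             # 7^(-1) = 15, since 7*15 = 105 ≡ 1
    return "".join(chr((a_inv * (ord(ch) - 65 - b)) % 26 + 65) for ch in msg)

assert affine_encrypt("GARY", 7, 3) == "TDSP"
assert affine_decrypt("TDSP", 7, 3) == "GARY"
```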
Example 10.2.1. As an example of the affine cipher let our key pair be k = (e, d) =
((7, 3), (15, 3)). Therefore
G : 7·6+3 = 45 ≡ 19 → T,
A : 7·0+3 = 3 ≡ 3 → D,
R : 7 · 17 + 3 = 122 ≡ 18 → S,
Y : 7 · 24 + 3 = 171 ≡ 15 → P,
The cryptanalysis of the affine cipher is based on examining the frequency with which
cipher-text occurs. As such we require the frequency with which the letters occur in
everyday usage. Below is a table that shows the letters, their associated decimal number
and their frequency of occurrence in everyday English (plain-text).
Consider the following block of text that was obtained from an affine cipher.
FMXVEDKAPHFENDRB
NDKRXRSREFMORUDS
DKDVSHVUFEDKAPRK
DLYEVLRHHRH
Counting the number of occurrences of the letters above we find that the most frequent
characters are : R (8 times), D (8 times), E,H,K (5 times) and F,V (4 times). Based on
this we guess that one of the most frequent characters in the cipher-text, R, represents
the most frequent character in plain-text, E. Next, we guess that the most frequent letter
after R in the cipher-text, D, represents the second most frequent letter in plain-text, T.
Therefore our guess is that E → R and T → D. This is the same as Ek (4) = 17 and
Ek (19) = 3 or a · 4 + b = 17 and a · 19 + b = 3. This implies that 15a = −14 ≡ 12
(mod 26), so that 7 · 15 · a ≡ a ≡ 7 · 12 ≡ 6 (mod 26) and b = 19. This cannot be a valid
key since gcd(6, 26) = 2 ≠ 1.
After this we might guess that E → R and T → E, but this leads to the same sort of
problem as does E → R and T → H.
Our next best guess would be that E → R and T → K. Therefore Ek (4) = 17 and
Ek (19) = 10. This implies that 4a + b ≡ 17 (mod 26) and 19a + b ≡ 10 (mod 26). Thus
15a ≡ −7 ≡ 19 (mod 26), so that a ≡ 3 (mod 26) and b ≡ 5 (mod 26). Therefore our
encryption key would be (3, 5), which is a valid key since 3 is relatively prime to 26.
The decryption key corresponding to this is (3−1 , 5) = (9, 5) and then Dk (y) = 9(y − 5).
Applying this to the cipher-text we find the following.
ALGORITHMSAREQUI
TEGENERALDEFINIT
IONSOFARITHMETIC
PROCESSES
This corresponds to the text algorithms are quite general definitions of arithmetic pro-
cesses. Seeing that we have a piece of plain-text that “makes sense” we can assume that
we have the key.
Plain-text :  d o u b l e e a g l e
Key :         g o l f g o l f g o l
Cipher-text : j c f g r s p f m z p

Cipher-text : u b p r a g e k u z w t c h s w u i r m
Key :         g o l f g o l f g o l f g o l f g o l f
Plain-text :  o n e m u s t f o l l o w t h r o u g h
In this case the cipher-text ubpragekuzwtchswuirm corresponds to the plain-text one must
follow through.
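In Python (a sketch of our own; encryption adds the repeated keyword mod 26 and decryption subtracts it):

```python
def vigenere(msg, key, decrypt=False):
    """Vigenère cipher on A..Z with a repeating keyword."""
    sign = -1 if decrypt else 1
    return "".join(
        chr((ord(m) - 65 + sign * (ord(key[i % len(key)]) - 65)) % 26 + 65)
        for i, m in enumerate(msg))

assert vigenere("DOUBLEEAGLE", "GOLF") == "JCFGRSPFMZP"
assert vigenere("ONEMUSTFOLLOWTHROUGH", "GOLF") == "UBPRAGEKUZWTCHSWUIRM"
assert vigenere("UBPRAGEKUZWTCHSWUIRM", "GOLF", decrypt=True) == "ONEMUSTFOLLOWTHROUGH"
```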
Offset Cipher-text
0 UPVZB BVUPN KKFOL OGAKU FBTKF LFXUJ VIPZV KFZXO FIDLO ONLUP
50 KKFUZ OMQFQ MQXKU AFIUP VVVVK KFDFL DMFIU PVVFI ZVTMU XDBZY
100 FVVYF ZTHBA ZQHEY LTXVU JVXFM IDRSQ EJNCI PVZZQ HQEYJ BZQHB
150 YHTWL OUWND OLVUJ VREZA JHTWW VPTZW VLVDM TROPV XWIMN KJBVE
200 FITKV XRQEL FZOBY HSMND TVFOJ DZQHB YLOOZ QTQXK UISLS LNLUP
250 RESWB HOEZQ HERVC MRWJV XWIMR LSISR WMIHF TZQHN CXUBV UJVXF
300 JZTOJ VXGJA REMMU GPEEG PEEWP BYHXI KHS
The idea of the Kasiski examination is to try and recover the key-length used in
the Vigenère cipher. This is accomplished by analysing the distances between identical
pieces of cipher-text : occasionally identical portions of plain-text will align with the same
fragment of the key producing exactly the same cipher-text. The possibility also exists for
“accidental” matches. That is identical fragments of cipher-text which did not result from
the same plain-text, but these tend to be rare especially with longer match lengths. In
the case of a “correct” match the key-length has to divide the difference in the positions
where these pieces of identical cipher-text occurred. Since accidental matches may also
occur it may not be enough to only examine the greatest common divisor of all distances.
In our example the fragment of cipher-text ZQH occurs in a number of places which
are underlined in the table above. The corresponding offsets are : 110, 138, 146, 226,
258, 286. It is therefore likely that the keyword length, l, divides the difference of any
of these. The differences 138 − 110 = 28 = 2² · 7 and 226 − 110 = 116 = 2² · 29 suggest that l divides
gcd(2² · 7, 2² · 29) = 2² = 4. The case l = 1 corresponds to a simple shift.
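Collecting the distances between repeated fragments is easy to automate. A sketch of ours, shown here on a synthetic cipher-text rather than the table above:

```python
from math import gcd
from functools import reduce

def repeat_distances(ct, length=3):
    """Distances between consecutive occurrences of repeated fragments."""
    seen, dists = {}, []
    for i in range(len(ct) - length + 1):
        frag = ct[i:i + length]
        if frag in seen:
            dists.append(i - seen[frag])
        seen[frag] = i
    return dists

# synthetic example: the fragment "ABC" recurs at offsets 0, 8 and 20
dists = repeat_distances("ABCDEFGHABCIJKLMNOPQABC")
assert dists == [8, 12]
assert reduce(gcd, dists) == 4     # candidate keyword lengths divide 4
```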
At this point we may now examine the frequency distributions for each candidate
key-length. The idea being that at these fixed lengths the plain-text was combined with
the same letter in the keyword and so will give an accurate distribution. The table below
shows the frequencies that were found at each position of the key, that is for l = 2 we
examine the letters in the odd and even positions separately.
For a fixed source distribution each line of the correct key-length should reflect the
source frequencies. In this example, the frequencies suggest that l = 4 is more likely than
l = 2.
Since we know that the letter ‘e’ is the most frequent letter in everyday English, we
suspect that one of the higher-frequency cipher-text letters, in each line of the l = 4 case,
corresponds to the plain-text letter ‘e’. Now each line in the l = 4 case represents a letter
of the keyword. So if the highest frequency cipher-text letter in each row corresponded
to ‘e’, then all we would have to do is to subtract ‘e’ from the cipher-text letter to get the
corresponding keyword letter and so retrieve the keyword. Since we are not absolutely
sure that the highest-frequency cipher-text letter in each line corresponds to ‘e’
our best strategy would be to consider the 3 or 4 most frequent letters in each row (‘e’
would definitely be among the highest frequency ones). The table below shows the four
most frequent letters in each row as well as the required keyword letter that maps these
cipher-text letters back to ‘e’.
Cipher-text    Keyword (computed as c − e (mod 26))
F J U P        B F Q L
B V M I        X R I E
V Z R X        R V N T
K Q H L        G M D H
At this point one would do an exhaustive search using keywords that are built from
the likely letters shown above. That is one chooses a letter from the first row as the
possible first letter of the keyword, a letter from the second row for the second possible
letter of the keyword and so on. This requires 4⁴ = 256 decryptions. If the keyword was
chosen from a dictionary this simplifies things a great deal. In our example this would be
words such as LEND and BIRD. Using BIRD on the cipher-text yields the following.
The water of the Gulf stretched out before her, gleaming with the million lights
of the sun. The voice of the sea is seductive, never ceasing, whispering,
clamoring, murmuring, inviting, the soul to wander in abysses of solitude. All
along the white beach, up and down, there was no living thing in sight. A
bird with a broken wing was beating the air above, reeling fluttering, circling
disabled down, down to the water.
The ZQH in the cipher-text used in the Kasiski examination corresponds to ‘ing’ in the
plain-text.
Definition 10.3.4 (Stream cipher (state cipher)). In a stream cipher the mapping
of a block may depend on its position in the message.
If the key digits are the result of independent Bernoulli trials with probability 1/2
(and used only once), then the cipher is known as a one-time-pad and is unconditionally
secure against a cipher-text only attack. The reason for this is that every message of the
same length as the cipher-text maps to the cipher-text for some choice of key, and all keys
are equally likely.
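A binary one-time pad in full (our own sketch; the `secrets` module supplies the independent uniform key bits):

```python
import secrets

def vernam(bits, key):
    """XOR each message bit with the corresponding key bit."""
    return [b ^ k for b, k in zip(bits, key)]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
key = [secrets.randbits(1) for _ in msg]   # fresh, uniform, used only once
ct = vernam(msg, key)
assert vernam(ct, key) == msg              # decryption is the same XOR
```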
Chapter 11
Public Key Cryptography
This chapter is devoted to one of the most widely used forms of cryptography, namely
public key cryptography. The RSA cryptosystem, which is an example of this, is used by
millions of people around the world.
Definition 11.0.5 (Private key system). In a private key cryptosystem Dk is either
the same as Ek , or easy to get from it. If Ek is known the system is insecure. Therefore
Dk and Ek must be kept private.
Definition 11.0.6 (Public key system). In a public key system if Ek is known it is
(believed to be) computationally infeasible to use it to determine Dk . Therefore Ek can
be made public (for instance in a directory) and anyone can look up Ek when they want
to send a message.
The believed computational infeasibility of determining Dk makes the exchange of
keys unnecessary.
The idea of a public key system arose in 1976 in the work of Diffie and Hellman [?].
The first public key system was discovered in 1977 by Rivest, Shamir and Adleman [?]
(the RSA system).
Public key systems can never be unconditionally secure. If the adversary has the
cipher-text, they can use Ek to encrypt every possible piece of plain-text until a match
is found. This process might not be feasible in practice, but is nonetheless possible in
principle.
Definition 11.0.7 (One-way function). A function f : M → C is one-way if
• f (m) is “easy” to compute for all m ∈ M .
It is not known whether one-way functions exist. A number of functions have been
identified that seem to be one-way.
Example 11.0.8. An example of a one-way function is found in the UNIX operating
system. Here the password to a user’s account is stored in a file that is readable by all
users of the system. The passwords are encrypted using what is believed to be a one-way
function.
So, in public key systems the extra information in a trapdoor function makes it possible
to find Dk .
• The key pair k = (a, b) where ab ≡ 1 (mod φ(n)) with φ(n) = φ(pq) = (p − 1)(q − 1)
the Euler phi function.
Let’s compute Dk (Ek (x)) to see that we can recover an encrypted message. To do this
we need the following facts.
We have Dk (Ek (x)) = Dk (x^b) = (x^b)^a = x^{ab} (mod n). Since ab ≡ 1 (mod φ(n)) we can
write ab = 1 + tφ(n) for some integer t ≥ 0, so x^{ab} = x^{1+tφ(n)} = x · x^{tφ(n)}
(mod n). Now there are two cases to consider.
1. gcd(x, n) = 1.
By Euler’s Theorem x^{φ(n)} ≡ 1 (mod n), so that x^{tφ(n)} = (x^{φ(n)})^t ≡ 1^t ≡ 1 (mod n).
Therefore Dk (Ek (x)) = x · x^{tφ(n)} ≡ x (mod n).
2. gcd(x, n) > 1.
Since n = pq, in this case p | x or q | x. If both divide x then x ≡ 0 (mod n) and the claim
is immediate, so say (without loss of generality) p | x and gcd(x, q) = 1. Then x ≡ 0 (mod p),
so x^{ab} ≡ 0 ≡ x (mod p). On the other hand, by Fermat’s Theorem x^{q−1} ≡ 1 (mod q),
and since (q − 1) | φ(n) we get x^{tφ(n)} = (x^{q−1})^{t(p−1)} ≡ 1 (mod q), so that
x^{ab} = x · x^{tφ(n)} ≡ x (mod q). Since x^{ab} ≡ x modulo both p and q, the Chinese
Remainder Theorem gives x^{ab} ≡ x (mod pq). This gives Dk (Ek (x)) ≡ x (mod n).
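The two cases above can be checked numerically. The primes and exponent below are illustrative choices, not values from the text (three-argument pow with exponent −1, available from Python 3.8, computes the modular inverse):

```python
p, q = 61, 53                  # illustrative primes, not from the text
n = p * q                      # n = 3233
phi = (p - 1) * (q - 1)        # phi(n) = 3120
b = 17                         # public exponent, gcd(b, phi) = 1
a = pow(b, -1, phi)            # private exponent: a*b ≡ 1 (mod phi)

for x in (65, 61):             # gcd(65, n) = 1 (case 1); 61 | n (case 2)
    y = pow(x, b, n)           # E_k(x) = x^b mod n
    assert pow(y, a, n) == x   # D_k(E_k(x)) = x in both cases
```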
3. Choose a random b with 0 < b < φ(n) and gcd(b, φ(n)) = 1. The last requirement
ensures that b−1 exists, which is needed in the next step.
5. Publish b and n.
The choice of the b in step 3 is made by picking random values and using the Euclidean
algorithm to check whether gcd(b, φ(n)) = 1. The probability that a randomly chosen b is
relatively prime to φ(n) is φ(φ(n))/φ(n). Therefore we need to try about φ(n)/φ(φ(n))
different b’s on average before one that is relatively prime to φ(n) is found.
The last nonzero remainder is the gcd (=1). We now use the results of the algorithm
in reverse to express 1 (the gcd) in terms of 11200 and 3533.
1 = 5 − 2 × 2,
= 5 − 2(17 − 5 × 3) = 7 × 5 − 2 × 17,
= 7(73 − 17 × 4) − 2 × 17 = 7 × 73 − 30 × 17,
= 7 × 73 − 30(528 − 73 × 7) = 217 × 73 − 30 × 528,
= 217(601 − 528) − 30(528) = 217 × 601 − 247 × 528,
= 217 × 601 − 247(3533 − 601 × 5) = 1452 × 601 − 247 × 3533,
= 1452(11200 − 3533 × 3) − 247 × 3533 = 1452 × 11200 − 4603 × 3533,
= 1452 × 11200 + (−4603) × 3533.
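The back-substitution above can be automated. The recursive extended Euclidean algorithm below reproduces the same Bézout identity for 11200 and 3533:

```python
def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = ext_gcd(b, a % b)
    return g, v, u - (a // b) * v

g, u, v = ext_gcd(11200, 3533)
assert g == 1 and u * 11200 + v * 3533 == 1
# the back-substitution gives 1 = 1452*11200 + (-4603)*3533
assert 1452 * 11200 - 4603 * 3533 == 1
# hence 3533^{-1} ≡ -4603 ≡ 6597 (mod 11200)
assert (3533 * 6597) % 11200 == 1
```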
n = pq,
φ(n) = (p − 1)(q − 1),
for the unknowns p and q we can factor n. If we substitute q = n/p into the second
equation, we obtain a quadratic in the unknown p
p^2 − (n − φ(n) + 1)p + n = 0.
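Solving this quadratic factors n. A short routine (the values n = 3233, φ(n) = 3120 are illustrative, not from the text):

```python
from math import isqrt

def factor_from_phi(n, phi):
    """Recover p and q from n = pq and phi = (p-1)(q-1)."""
    s = n - phi + 1            # p + q
    d = s * s - 4 * n          # discriminant of p^2 - s*p + n = 0
    r = isqrt(d)
    assert r * r == d          # must be a perfect square
    return (s - r) // 2, (s + r) // 2

p, q = factor_from_phi(3233, 3120)
assert p * q == 3233 and (p - 1) * (q - 1) == 3120    # p = 53, q = 61
```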
A yes-based Monte Carlo algorithm has an error probability of ε if the algorithm gives
an incorrect answer “no” (when the answer is really “yes”) with probability at most ε.
Here the probability is computed over all possible random choices made by the algorithm
when it is run with a given input.
The basic idea behind the identification of possible primes is as follows. Generate
random odd numbers and use an algorithm that answers “yes” or “no” to the question
“is n composite?” A “yes” answer is always correct, but a “no” answer may be incorrect
with some probability ε < 1. Run the algorithm k times. If it ever answers “yes”, then n
is composite. Otherwise n is prime with probability at least 1 − ε^k.
lim_{n→∞} π(n)/(n/ ln(n)) = 1,
From the Prime Number Theorem we see that the probability that a randomly chosen
integer k is prime is approximately
(k/ ln(k))/k = 1/ ln(k).
11.2. PROBABILISTIC PRIMALITY TESTING 87
If only odd integers are considered, the probability is approximately 2/ ln(k). So on
average one could expect to test about ln(k)/2 = ln(√k) odd integers before a prime is
found.
If we need to find a prime with about 100 digits, then k ≈ 10^100 above, so that we
need to test about ln(√k) ≈ 115 integers on average before a prime is found.
The algorithm for identifying possible primes that we will be discussing is known as
the Miller-Rabin algorithm. We introduce this algorithm by way of the following theorem.
5. For i = 0, 1, 2, . . . , k − 1
If b ≡ −1 (mod n), then answer “prime” and stop,
else replace b by b2 , i by i + 1 and go to step 5.
Proof.
Assume the algorithm answers “composite” for some prime integer n. We will obtain a
contradiction.
Since the answer is “composite”, it must be the case that a^m ≢ 1 (mod n). In step 5
of the algorithm, the sequence of values tested is
b = a^m, b^2 = a^{2m}, b^4 = a^{2^2 m}, b^8 = a^{2^3 m}, . . . , b^{2^{k−1}} = a^{2^{k−1} m}.
Fact The error probability in the Miller-Rabin Algorithm is less than or equal to 0.25.
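The earlier steps of the algorithm are not reproduced in this excerpt; the standard form, writing n − 1 = 2^k m with m odd, is assumed in this sketch:

```python
import random

def miller_rabin(n, rounds=20):
    """Return False if n is definitely composite, True if probably prime."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # write n - 1 = 2^k * m with m odd
    m, k = n - 1, 0
    while m % 2 == 0:
        m //= 2
        k += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        b = pow(a, m, n)
        if b == 1 or b == n - 1:
            continue                      # this round answers "prime"
        for _ in range(k - 1):
            b = (b * b) % n
            if b == n - 1:
                break                     # answers "prime"
        else:
            return False                  # "composite" -- always correct

    return True

assert miller_rabin(36259) is False       # 36259 = 101 * 359
assert miller_rabin(3533) is True         # 3533 is prime
```

Running the test with `rounds = k` repetitions drives the error probability below 0.25^k, in line with the fact above.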
Chapter 12

The Rabin Cryptosystem
The second example of a public key cryptosystem that we will consider is the Rabin
Cryptosystem [?]. This system has the following features.
• It uses two distinct primes p and q each of which is congruent to 3 (mod 4). Let
n = pq.
For this system to be of use we need to know that there are infinitely many primes of
the form 4k + 3 (i.e. primes that are congruent to 3 (mod 4)). This can be seen in one of
two ways. The first is Dirichlet’s Theorem and the second is a direct proof, both of which
are shown below.
90 CHAPTER 12. THE RABIN CRYPTOSYSTEM
Since gcd(4, 3) = 1, we know that there are infinitely many primes of the form 4k + 3.
We now show directly that there are infinitely many primes congruent to 3 (mod 4) :
Assume that there are only finitely many primes congruent to 3 (mod 4), say p0 = 3,
p1 = 7, p2 = 11, . . . , pn . Consider N = 4p1 p2 · · · pn + 3 (note that p0 is not included in
this product). We know that if p and q are both congruent to 1 (mod 4), then so is their
product. Therefore a number congruent to 3 (mod 4) must have a prime divisor that
is congruent to 3 (mod 4) (if all were congruent to 1 (mod 4), the number itself would
be congruent to 1 (mod 4)). We have N ≡ 3 (mod 4), so that N has a prime divisor
congruent to 3 (mod 4). On the other hand if 3 | N, then 3 | 4p1 p2 · · · pn and since 3 is
prime this implies that 3 | 4 or 3 | pi for some i ≥ 1, a contradiction. Thus 3 ∤ N . Also, if
i ≥ 1 then pi | N and pi | 4p1 p2 · · · pn implies that pi | 3, a contradiction. That is, there exists
a prime congruent to 3 (mod 4) not among p0 , p1 , . . . , pn , a contradiction. Therefore the
number of primes congruent to 3 (mod 4) is infinite.
It turns out that the encryption function is not one-to-one. To see this we need the
following two results.
x ≡ a1 (mod n1 ),
x ≡ a2 (mod n2 ),
..
.
x ≡ at (mod nt ),
Proposition 12.0.9. If p and q are distinct odd primes, then the congruence x2 ≡ 1
(mod pq) has exactly four solutions (mod pq).
Proof.
x^2 ≡ 1 (mod pq) ⇐⇒ x^2 − 1 ≡ 0 (mod pq) ⇐⇒ x^2 − 1 = kpq, k ∈ Z ⇐⇒ p | (x^2 − 1)
and q | (x^2 − 1) ⇐⇒ x^2 ≡ 1 (mod p) and x^2 ≡ 1 (mod q).
Now x^2 − 1 = (x − 1)(x + 1), so p | (x^2 − 1) ⇐⇒ p | (x − 1)(x + 1) ⇒ p | (x − 1) or
p | (x + 1). That is x ≡ 1 (mod p) or x ≡ −1 (mod p). Similarly x ≡ ±1 (mod q).
By the Chinese Remainder Theorem each system has a unique solution (mod pq) and each
solution leads to a solution of x^2 ≡ 1 (mod pq). Since p and q are odd primes, 1 ≢ −1
(mod p) and 1 ≢ −1 (mod q), so the four solutions are all distinct.
This implies that there are four different plain-texts that encrypt to the same cipher-text
as x.
Next we treat the decryption process. The receiver is given a cipher-text y and wants
to determine x such that x^2 + Bx ≡ y (mod n). To simplify notation let x1 = x + B/2,
that is x = x1 − B/2. The congruence then becomes
y ≡ (x1 − B/2)^2 + B(x1 − B/2),
  = x1^2 + B^2/4 − B^2/2,
  = x1^2 − B^2/4 (mod n).
Therefore
x1^2 ≡ y + B^2/4 (mod n).
By letting C = y + B^2/4 we get x1^2 ≡ C (mod n), that is, the pair of congruences
x1^2 ≡ C (mod p) and x1^2 ≡ C (mod q).
Each congruence in the system has zero or two solutions. These can be combined, as
before, to get up to four solutions (mod pq).
To determine the solutions to the congruences above we need the following concept
and theorem.
1^2 ≡ 1,
2^2 ≡ 4,
3^2 ≡ 2,
(−2)^2 ≡ 4,
(−3)^2 ≡ 2,
(−1)^2 ≡ 1 (mod 7).
As an aside we note that if m is prime then there exist (m − 1)/2 quadratic residues
and (m − 1)/2 quadratic non-residues (mod m).
One way to determine if a is a quadratic residue (mod m) is to use Euler’s criterion.
Theorem 12.0.13 (Euler’s Criterion). Let m be prime. Then a number a 6≡ 0 (mod m)
is a quadratic residue (mod m) ⇐⇒ a(m−1)/2 ≡ 1 (mod m).
Note that if a ≢ 0 (mod m) is a quadratic non-residue then a^{(m−1)/2} ≡ −1 (mod m).
Euler’s Criterion only answers yes or no as to whether there exists an x such that
x^2 ≡ a (mod m). It does not say how to find this x. If our prime, m, is of a specific form
then the determination of this x becomes easy.
If m is prime and m ≡ 1 (mod 4), then there is no known, efficient algorithm to find
square roots, i.e. the x, (mod m).
On the other hand when m is a prime and m ≡ 3 (mod 4), the square roots are easy
to find. They are just ±a^{(m+1)/4}. To check this we compute
(±a^{(m+1)/4})^2 ≡ a^{(m+1)/2},
                 ≡ a · a^{(m−1)/2},
                 ≡ a (mod m),
where the last step uses Euler’s Criterion (a is a quadratic residue).
Using the above procedure we find the two square roots of C (mod p) and the two square
roots (mod q). These may then be combined using the Chinese Remainder Theorem to
find the square roots of C (mod n).
Recall that we have
x = x1 − B/2,
C = y + B^2/4.
We know that
x1 = √C = √(y + B^2/4),
so that
x = √(y + B^2/4) − B/2 = Dk (y).
The four square roots of C (mod n) lead to the four possibilities for x.
Example 12.0.14. n = 7 × 11 = 77 and B = 9.
The encryption function is Ek (x) = x^2 + Bx = x^2 + 9x (mod 77).
Thus the square roots of 23 (mod 7) are ±4 and the square roots of 23 (mod 11) are ±1.
We use this to get the four square roots of 23 (mod 77).
From 4 ≡ √23 (mod 7) and 1 ≡ √23 (mod 11) we get x1 ≡ √23 (mod 77), where
x1 = 4 × 11 × y1 + 1 × 7 × y2 (mod 77),
and
10 − 43 ≡ 44 (mod 77),
67 − 43 ≡ 24 (mod 77),
32 − 43 ≡ 66 (mod 77),
45 − 43 ≡ 2 (mod 77).
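The example can be replayed in code. The intercepted cipher-text y is not shown in this excerpt; y = 22 is the consistent value, since Ek(2) = 2^2 + 9 · 2 ≡ 22 (mod 77):

```python
# Reconstruction of the worked example: n = 77 = 7*11, B = 9, y = 22 (assumed).
p, q, B, n = 7, 11, 9, 77
y = 22
half_B = (B * pow(2, -1, n)) % n          # B/2 ≡ 43 (mod 77)
C = (y + half_B * half_B) % n             # C = y + B^2/4 ≡ 23 (mod 77)

# square roots of C modulo p and q (both ≡ 3 mod 4) via ±C^((m+1)/4)
rp = pow(C, (p + 1) // 4, p)              # 4 (mod 7)
rq = pow(C, (q + 1) // 4, q)              # 1 (mod 11)

# combine with the Chinese Remainder Theorem
roots = set()
for sp in (rp, p - rp):
    for sq in (rq, q - rq):
        x1 = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % n
        roots.add(x1)
assert roots == {10, 32, 45, 67}

plaintexts = sorted((x1 - half_B) % n for x1 in roots)
assert plaintexts == [2, 24, 44, 66]      # the four values in the text
assert all((x * x + B * x) % n == y for x in plaintexts)
```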
n. Since it is suspected that factorisation is “difficult,” one can expect that the design
of such a decryption algorithm will be at least as hard as designing an algorithm that
factors n.
So suppose that the attacker has a decryption algorithm A. We may then proceed as
follows.
1. Choose a random r, 1 ≤ r ≤ n − 1.
4. Let x1 = x + B/2.
We will see below that this algorithm factors n with probability at least 1/2. First we
explain its working.
In step 2 we have
Ek (r − B/2) = (r − B/2)(r − B/2 + B),
             = (r − B/2)(r + B/2),
             ≡ r^2 − B^2/4 (mod n),
             = y.
are all distinct by Proposition 12.0.9. Note that any two values in the same equivalence
class give the same value for y in step 2. In step 3, given y the decryption algorithm
returns x. By the calculation in step 4, x1 is a member of the equivalence class of r. If it
is ±r, the algorithm fails. If it is ±ωr the algorithm factors n as explained above.
Since r is chosen at random it is equally likely to be any of the four members of its
equivalence class. Two of the four members lead to success, therefore the probability of
success is at least 1/2.
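The reduction can be sketched concretely for the running example n = 77, B = 9. The decryption algorithm A is simulated here by brute force (which a real attacker could not do; the point is only the use made of A in steps 1-4), and r sweeps 1, 2, . . . instead of being chosen at random:

```python
from math import gcd

p, q, B, n = 7, 11, 9, 77
half_B = (B * pow(2, -1, n)) % n            # B/2 ≡ 43 (mod 77)

def oracle(y):
    """Stand-in for the attacker's decryption algorithm A: return one
    plain-text x with x^2 + Bx ≡ y (mod n), found here by brute force."""
    for x in range(n):
        if (x * x + B * x) % n == y:
            return x
    raise ValueError("no preimage")

d = None
for r in range(1, n):                       # step 1 (deterministic sweep)
    y = (r * r - half_B * half_B) % n       # step 2: y = E_k(r - B/2)
    x = oracle(y)                           # step 3
    x1 = (x + half_B) % n                   # step 4: x1 is a square root of r^2
    if x1 not in (r, (n - r) % n):          # x1 = ±ωr, so n factors
        d = gcd((x1 - r) % n, n)
        break
assert d in (7, 11) and n % d == 0
```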
Chapter 13
Factorisation Algorithms
The purpose of this chapter is to discuss some of the algorithms available for attempting
to factor a given integer.
Proposition 13.1.1. If every prime power q with q | (p − 1) satisfies q ≤ B, then (p − 1) | B!.
Proof.
Suppose p − 1 has the prime factorisation
Example 13.1.2. Let’s apply the algorithm to n = 36259 and B = 10. We need to calculate
a = 2^{10!} (mod 36259). This can be done by repeatedly raising to the powers 2, 3, . . . , 10 (mod 36259).
That is a ≡ 25251 (mod 36259). So gcd(a − 1, n) = gcd(25250, 36259) = 101. From this
we find that n = 36259 = 101 × 359.
Note that in the example we have p = 101 so that p − 1 = 100 = 2^2 · 5^2. By the
proposition we need B to be at least 25 for the algorithm to be guaranteed to work. In
our case B = 10 worked, so the algorithm apparently does not need B to be as big as
required by the proposition. The proposition just says that if B is this big the algorithm
will work.
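The example can be reproduced directly; following the remark that B = 10 worked, a = 2^{10!} (mod n) is built up one factor of B! at a time:

```python
from math import gcd

def pollard_p_minus_1(n, B):
    """The p-1 method: compute a = 2^{B!} mod n, then gcd(a - 1, n)."""
    a = 2
    for k in range(2, B + 1):
        a = pow(a, k, n)      # after the loop, a = 2^{B!} (mod n)
    return gcd(a - 1, n)

assert pollard_p_minus_1(36259, 10) == 101    # 36259 = 101 * 359
```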
Chapter 14

The ElGamal Cryptosystem

In this chapter we introduce yet another public key cryptosystem. It is known as the
ElGamal Cryptosystem [?] and relies on discrete logarithms, which we introduce first.
So if a is a primitive element of Z∗p , then a, a^2, a^3, . . . , a^{p−1} (mod p) are just the
elements 1, 2, 3, . . . , p − 1 in some order. That is, for each x ∈ Z∗p there is a number
e ∈ {1, 2, 3, . . . , p − 1} such that a^e ≡ x (mod p).
Definition 14.1.2 (Discrete Logarithm). We call e the discrete logarithm (or index)
of x with respect to a if a^e ≡ x (mod p) and denote it by loga (x).
The problem of finding loga (x) in Zp is generally regarded as being difficult. Modular
exponentiation is easy, but its inverse — discrete logarithm — is not. That is modular
exponentiation is believed to be a one-way function.
100 CHAPTER 14. THE ELGAMAL CRYPTOSYSTEM
• The keys k = (p, α, e, β), where β = α^e (mod p), that is logα (β) = e.
Ek (x) = (y1 , y2 ),
where
y1 = α^k (mod p),
y2 = xβ^k (mod p),
Firstly let’s check that decryption works. So with y1 and y2 as given above we find.
• Then compute
Euclidean algorithm.
23 = 1 × 17 + 6,
17 = 2 × 6 + 5,
6 = 1 × 5 + 1.
1 = 6 − 5,
= 6 − (17 − 2 × 6) = 3 × 6 − 17,
= 3(23 − 17) − 17 = 3 × 23 − 4 × 17.
From the last step we find −4×17 = 1+(−3)×23 ≡ 1 (mod 23). That is 17−1 ≡ −4 ≡ 19
(mod 23).
To complete decryption we compute y2 (y1e )−1 = 4 × 19 ≡ 76 ≡ 7 (mod 23).
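In code the whole decryption step is short, since Python's three-argument pow computes the modular inverse directly:

```python
# Verifying the arithmetic above: p = 23, y1^e = 17, y2 = 4.
p = 23
inv = pow(17, -1, p)          # the Euclidean algorithm gave 17^{-1} ≡ 19 (mod 23)
assert inv == 19
assert (4 * inv) % p == 7     # recovered plain-text x = 7
```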
e = mj + i, 0 ≤ i ≤ m − 1.
Since m = ⌈√(p − 1)⌉ and 0 ≤ e ≤ p − 2, we have 0 ≤ j ≤ m − 1. Now β ≡ α^e (mod p) ⇐⇒
β ≡ α^{mj+i} (mod p) ⇐⇒ βα^{−i} ≡ α^{mj} (mod p). This is the basis of Shanks’ algorithm.
1. Compute αmj (mod p), 0 ≤ j ≤ m − 1 and store the pairs (j, αmj ) in a list sorted
by increasing value of the second coordinate. The reason for storing the numbers in
this way is to simplify searching through the list later.
2. Compute βα^{−i} (mod p), 0 ≤ i ≤ m − 1 and store the pairs (i, βα^{−i}) in a list sorted
by increasing second coordinate.
3. Find a pair (j, y) in the list from step 1 and a pair in the list from step 2 having the
same second coordinate.
4. e = logα β = mj + i (mod p − 1)
Note that in the last step we are computing (mod p − 1). The reason for this is that the
powers of α (the generator) are always going to be in the range 1, 2, 3, . . . , p − 1.
Example 14.3.1. Let p = 23 and α = 5. We would like to find log5 (11).
m = ⌈√(p − 1)⌉ = ⌈√22⌉ = 5, so that m − 1 = 4.
We compute the following.
Therefore list 1 is (0, 1), (2, 9), (4, 12), (3, 19), (1, 20).
Next we need to compute 11 · 5^{−i}, 0 ≤ i ≤ 4. To do this note that 5^{22} ≡ 1 (mod 23).
Therefore 5^{−i} ≡ 5^{22−i} (mod 23).
This gives list 2 as (3, 8), (0, 11), (1, 16), (2, 17), (4, 20).
14.3. ATTACKING THE ELGAMAL SYSTEM 103
Scanning through the lists we find (1, 20) in the first list and (4, 20) in the second list,
giving e = log5 (11) = 5 × 1 + 4 = 9 (mod 22).
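The four steps of Shanks' algorithm can be sketched as follows, with a dictionary standing in for the sorted list of step 1:

```python
from math import isqrt

def shanks(p, alpha, beta):
    """Baby-step giant-step computation of log_alpha(beta) in Z_p*."""
    n = p - 1
    m = isqrt(n)
    if m * m < n:
        m += 1                              # m = ceil(sqrt(p - 1))
    # step 1: pairs (j, alpha^{mj}), keyed by the second coordinate
    giant = {pow(alpha, m * j, p): j for j in range(m)}
    inv_alpha = pow(alpha, -1, p)
    b = beta % p                            # step 2 values beta * alpha^{-i}
    for i in range(m):
        if b in giant:                      # step 3: matching second coordinate
            return (m * giant[b] + i) % n   # step 4: e = mj + i (mod p - 1)
        b = (b * inv_alpha) % p
    return None

e = shanks(23, 5, 11)
assert e == 9 and pow(5, e, 23) == 11       # log_5(11) = 9
```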
Chapter 15

Elliptic Curve Cryptography

This chapter discusses a generalisation of the ElGamal cryptosystem, namely elliptic curve
cryptography. This was proposed by Koblitz and Miller [?, ?].
y1 = αk ,
y2 = x ◦ β k ,
106 CHAPTER 15. ELLIPTIC CURVE CRYPTOGRAPHY
(“y1^e”)^{−1} = −0 ≡ 0 (mod 30). Lastly we compute y2 ◦ (“y1^e”)^{−1} = 17 + 0 ≡ 17 (mod 30).
Suppose that Eve intercepts the cipher-text (9, 16) and tries to decrypt it. She knows
that α = 3 and β = 12 (these are made public). She needs to find an e such that e × α = β
in Z30 . That is, 3e ≡ 12 (mod 30), which has the solutions e = 4, 14 and 24.
If Eve knew that |H| = 10, then she’d be done. On the other hand by Lagrange’s Theorem
we have that |H| divides |G| and therefore |H| ≤ 15 which rules out 24.
This shows that we might also want to keep the order of H secret.
Definition 15.2.1 (Elliptic Curve). Let p > 3 be prime. The elliptic curve
y^2 = x^3 + ax + b,
over Zp , where a, b ∈ Zp are constants with 4a^3 + 27b^2 ≢ 0 (mod p), is the set of
solutions (x, y) ∈ Zp × Zp of the congruence
y^2 ≡ x^3 + ax + b (mod p),
together with a special point O called the point at infinity.
An elliptic curve E can be made into an Abelian group by using the following opera-
tion, where arithmetic is in Zp .
Let P = (x1 , y1 ) and Q = (x2 , y2 ) be points on E, then
P + Q = O if x1 = x2 and y1 = −y2 , and P + Q = (x3 , y3 ) otherwise,
where
x3 = λ2 − x1 − x2 ,
y3 = λ(x1 − x3 ) − y1 ,
and
λ = (y2 − y1)/(x2 − x1) if P ≠ Q, and λ = (3x1^2 + a)/(2y1) if P = Q.
x     x^3 + x + 6    square?              y
0     6              6^5 ≡ −1, ✗
1     8              8^5 ≡ −1, ✗
2     5              5^5 ≡ 1, ✓           ±5^3 ≡ ±4
3     3              ✓                    ±3^3 ≡ ±5
4     8              ✗
5     4              ✓                    ±2
6     8              ✗
7     4              ✓                    ±2
8     9              ✓                    ±3
9     7              ✗
10    4              ✓                    ±2
(All arithmetic is mod 11; the “square?” test is Euler’s Criterion with (p − 1)/2 = 5.)
Therefore E has 13 points (including the point at infinity) and they are :
O; (2, 4); (2, 7); (3, 5); (3, 6); (5, 2); (5, 9);
(7, 2); (7, 9); (8, 3); (8, 8); (10, 2); (10, 9).
Since 13 is prime any nonidentity element will generate the group. Note also that
(E, +) is cyclic and isomorphic to Z13 .
Let α = (2, 7). Then the powers of α are the multiples of α in this group and we have
the following.
k kα
1 (2, 7)
2 (5, 2)
3 (8, 3)
4 (10, 2)
5 (3, 6)
6 (7, 9)
7 (7, 2)
8 (3, 5)
9 (10, 9)
10 (8, 8)
11 (5, 9)
12 (2, 4)
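The addition law can be checked against this table. In the sketch below, O is represented as None, and the parameters p = 11, a = 1 are read off from the curve y^2 = x^3 + x + 6 over Z11:

```python
p, a = 11, 1

def ec_add(P, Q):
    """Add two points on y^2 = x^3 + x + 6 over Z_11 (None is O)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = O
    if P != Q:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p     # chord slope
    else:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

alpha = (2, 7)
P, multiples = alpha, []
while P is not None:
    multiples.append(P)
    P = ec_add(P, alpha)
assert multiples[1] == (5, 2)       # 2*alpha, as in the table
assert multiples[6] == (7, 2)       # 7*alpha
assert len(multiples) == 12         # 13 points in all, counting O
```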
In general one would like to be able to know how many points there are on a given el-
liptic curve over Zp . This is needed so that one may be able to construct a correspondence
between plain-text and the points on the curve. The following theorem gives bounds on
the number of points.
Theorem 15.2.3 (Hasse’s Theorem). Let p > 3 be prime and E an elliptic curve over
Zp . Then the number, N(E), of points on E satisfies
p + 1 − 2√p ≤ N (E) ≤ p + 1 + 2√p.
Theorem 15.2.4. Let p > 3 be prime and E an elliptic curve over Zp . Then there exist
integers n1 and n2 such that
(E, +) ≅ Zn1 × Zn2 .
The last theorem implies that there exists a cyclic subgroup of (E, +) isomorphic to
Zn1 . We may be able to use this in an ElGamal system if we can find it.
Example 15.2.5. (Continued)
We use the elliptic curve in the previous example to set up an ElGamal system with
α = (2, 7) and an exponent e = 7. So β = α^e = 7 · α = (7, 2) (from the table).
Ek (x) = (kα, kβ + x) = (k(2, 7), k(7, 2) + x), where x ∈ E and k is chosen at random
from {0, 1, 2, . . . , 12}.
Dk (y1 , y2 ) = y2 − ey1 = y2 − 7y1 .
To encrypt (10, 9) (which is a point on the curve) :
1. Choose k, say k = 3.
2. Compute
y1 = 3α = (8, 3),
y2 = 3(7, 2) + (10, 9),
= 3(7α) + (10, 9),
= 21α + (10, 9),
= 8α + (10, 9),
= (3, 5) + (10, 9),
= (10, 2).
• An elliptic curve E over Zp with p > 3 is used such that (E, +) has a cyclic subgroup
H = hαi for which computing discrete logarithms is “hard.”
and
(c1 , c2 ) = kβ ∈ E.
x′ = y1 c1^{−1} (mod p),
x″ = y2 c2^{−1} (mod p),
and
We now show that the decryption function is in fact the inverse of the encryption
function. So suppose (as above) that we have encrypted the message (x1 , x2 ) as (y0, y1 , y2 ).
Then computing Dk (y0 , y1 , y2 ) yields the following.
ey0 = ekα = kβ = (c1 , c2 ),
x′ = y1 c1^{−1} ≡ c1 x1 c1^{−1} ≡ x1 (mod p),
x″ = y2 c2^{−1} ≡ c2 x2 c2^{−1} ≡ x2 (mod p).
2. Compute
This chapter describes a system that was developed around 1978, but that was subse-
quently broken a few years later. In spite of this it remains an interesting system that
can be used in conjunction with other systems.
The Merkle-Hellman [?] or “knapsack” cryptosystem revolves around the subset sum
problem.
Definition 16.0.2 (Subset sum problem). Given positive integers s1 , s2 , . . . , sn and
T — the sizes and target — try to find a binary vector x = (x1 , x2 , . . . , xn ) such that
x1 s1 + x2 s2 + · · · + xn sn = T .
This problem is known to be NP-complete in general, but there are easy special cases.
A list of sizes (s1 , s2 , . . . , sn ) is called superincreasing if sj > s1 + s2 + · · · + sj−1
for 2 ≤ j ≤ n.
If the list of sizes in the subset sum problem is superincreasing, then the problem is
easy to solve, as demonstrated by the following algorithm (shown in Figure 16) that finds
the binary vector in this case.
The reason why the algorithm works is simply that sn is greater than the sum of
all the other sizes, so if sn ≤ T it has to be chosen to try and reach the target — all the
other sizes put together are not enough to reach the target. Once sn is chosen (or discarded
114 CHAPTER 16. THE MERKLE-HELLMAN “KNAPSACK” SYSTEM
Figure 16.1: Algorithm for solving the subset problem for a superincreasing sequence
if sn > T ) and the size of T reduced to T − sn (or left as T if sn is not chosen) we have a
new subset sum problem with a smaller target (or the same) and n − 1 sizes that have to
be chosen; we just repeat this procedure. To see the uniqueness, realize that each si that
was chosen had to be chosen — there was no possibility of not using it.
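Since the figure with the algorithm is not reproduced in this excerpt, the following sketch implements the greedy procedure just described:

```python
def solve_superincreasing(sizes, T):
    """Solve the subset sum problem for a superincreasing list of sizes.
    Return the binary vector x, or None if there is no solution."""
    x = [0] * len(sizes)
    for j in range(len(sizes) - 1, -1, -1):   # largest size first
        if sizes[j] <= T:
            x[j] = 1
            T -= sizes[j]
    return x if T == 0 else None

assert solve_superincreasing([2, 5, 10, 25], 37) == [1, 0, 1, 1]
assert solve_superincreasing([2, 5, 10, 25], 4) is None
```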
The Merkle-Hellman “knapsack” system is made up of the following components.
• The keys k = (S, p, a, t), where t is made public and S, p and a are kept private.
First let’s show that the decryption function is the inverse of the encryption function.
So suppose that the binary vector x1 , x2 , . . . , xn is encrypted as y = Σ xi ti , as shown
above. We need to show that Σ xi si = T = a^{−1} y (mod p).
y = x1 t1 + x2 t2 + · · · + xn tn ,
∴ a^{−1} y ≡ x1 a^{−1} t1 + x2 a^{−1} t2 + · · · + xn a^{−1} tn (mod p),
           ≡ x1 s1 + x2 s2 + · · · + xn sn (mod p),
           = x1 s1 + x2 s2 + · · · + xn sn .
The equality in the last step follows from the fact that p > Σ si . Also, since (s1 , s2 , . . . , sn )
is superincreasing the solution is unique.
Example 16.0.4. S = (2, 5, 10, 25) is superincreasing. Choose p = 53 and a = 10. The
public list of sizes, t, is
t1 = 20 (mod 53),
t2 = 50 (mod 53),
t3 = 100 ≡ 47 (mod 53),
t4 = 250 ≡ 38 (mod 53).
1 · 20 + 0 · 50 + 1 · 47 + 1 · 38 = 105.
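The example can be completed with a round trip; a^{-1} ≡ 16 (mod 53) and the greedy solve are as described earlier:

```python
S, p, a = [2, 5, 10, 25], 53, 10
t = [(a * s) % p for s in S]
assert t == [20, 50, 47, 38]                 # the public sizes above

x = [1, 0, 1, 1]
y = sum(xi * ti for xi, ti in zip(x, t))
assert y == 105                              # the cipher-text above

T = (pow(a, -1, p) * y) % p                  # a^{-1} ≡ 16, so T = 37
recovered = []
for s in reversed(S):                        # greedy solve (superincreasing)
    if s <= T:
        recovered.append(1)
        T -= s
    else:
        recovered.append(0)
recovered.reverse()
assert T == 0 and recovered == x
```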
In the previous chapter we saw an example of a cryptosystem that was constructed using
an easy instance of a “hard” problem. The system that we present in this chapter is based
on the same idea and appeared in [?]. Here the hard problem is that of decoding a binary
linear code where the generator matrix is given. As an easy special case we consider the
class of Goppa codes (which include the Hamming codes).
The Goppa codes have the following properties.
• They are [2^m, 2^m − mt, 2t + 1]-codes.
• There exist many inequivalent codes in this family, all with the same parameters.
The McEliece cryptosystem has the following components.
• G is a generator matrix for a [2^m, 2^m − mt, 2t + 1] Goppa code.
• G′ = SGP .
• The keys k = (G, S, P, G′), where G′ is made public and S, P and G are kept private.
118 CHAPTER 17. THE MCELIECE CRYPTOSYSTEM
• The decryption function is a four step process that operates on y ∈ (Z2 )n as follows.
(i) Compute y1 = yP −1 .
(ii) Decode y1 , obtaining y1 = x1 + e1 , where x1 ∈ C.
(iii) Compute x0 ∈ (Z2 )k such that x0 G = x1 .
(iv) Compute x = x0 S −1 .
Suppose now that we would like to encrypt the plain-text x = (1, 1, 0, 1). Since the
Hamming code is a single error correcting code our random error vector has to be of
weight one. Say we choose e = (0, 0, 0, 0, 1, 0, 0). The corresponding cipher-text is
y = xG′ + e,

                   1 1 1 1 0 0 0
                   1 1 0 0 1 0 0
  = (1, 1, 0, 1)                   + (0, 0, 0, 0, 1, 0, 0),
                   1 0 0 1 1 0 1
                   0 1 0 1 1 1 0

  = (0, 1, 1, 0, 0, 1, 0) + (0, 0, 0, 0, 1, 0, 0),
  = (0, 1, 1, 0, 1, 1, 0).
Assume now that we receive the cipher-text (0, 1, 1, 0, 1, 1, 0) and that we would like
to decrypt it. First we compute
y1 = yP^{−1},

                              0 0 0 1 0 0 0
                              1 0 0 0 0 0 0
                              0 0 0 0 1 0 0
   = (0, 1, 1, 0, 1, 1, 0)    0 1 0 0 0 0 0 ,
                              0 0 0 0 0 0 1
                              0 0 0 0 0 1 0
                              0 0 1 0 0 0 0

   = (1, 0, 0, 0, 1, 1, 1).
Next we need to decode y1 . Looking at the generator matrix we see that y1 is Hamming
distance one from the first row of G and since the Hamming code is a single error correcting
code we would decode y1 as x1 = (1, 0, 0, 0, 1, 1, 0). At this point we get x0 = (1, 0, 0, 0).
Finally we compute
                                 1 1 0 1
                                 1 1 0 0
x = x0 S^{−1} = (1, 0, 0, 0)             = (1, 1, 0, 1).
                                 0 1 1 1
                                 1 0 0 1
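The matrix arithmetic of this example can be replayed over GF(2). P^{-1} and S^{-1} below are the matrices shown above; the decoding step (one corrected bit) is taken from the text rather than implemented:

```python
def mat_vec(v, M):
    """Row vector v times matrix M over GF(2)."""
    cols = len(M[0])
    return tuple(sum(v[i] * M[i][j] for i in range(len(v))) % 2
                 for j in range(cols))

P_inv = [[0,0,0,1,0,0,0],
         [1,0,0,0,0,0,0],
         [0,0,0,0,1,0,0],
         [0,1,0,0,0,0,0],
         [0,0,0,0,0,0,1],
         [0,0,0,0,0,1,0],
         [0,0,1,0,0,0,0]]
S_inv = [[1,1,0,1],
         [1,1,0,0],
         [0,1,1,1],
         [1,0,0,1]]

y = (0,1,1,0,1,1,0)                 # received cipher-text
y1 = mat_vec(y, P_inv)
assert y1 == (1,0,0,0,1,1,1)        # as computed above

x1 = (1,0,0,0,1,1,0)                # y1 decoded (one bit corrected), per the text
x0 = (1,0,0,0)                      # message symbols with x0 G = x1
assert mat_vec(x0, S_inv) == (1,1,0,1)   # the original plain-text
```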
Appendix A
Assignments
MATH 433D/550
Assignment 1
Due: Monday, January 21, 2002, at the start of class.
1. Recall that the reliability of a binary symmetric channel (BSC) is the probability
p, 0 ≤ p ≤ 1, that the digit sent is the digit received.
(a) [1] Would you use a BSC with p = 0? If so, how? What about a BSC with p = 1/2?
(b) [1] Explain how to convert a BSC with 0 ≤ p < 1/2 into a channel with 1/2 < p ≤ 1.
122 APPENDIX A. ASSIGNMENTS
(iii) If the channel is in constant use, about how long do you expect must pass between
undetected incorrectly transmitted words? Express your answer as a number of days.
3. [4] Establish the following three properties of the Hamming distance (for binary codes
C):
(a) d(u, w) = 0 if and only if u = w.
(b) d(v, w) = d(w, v).
(c) d(v, w) ≤ d(v, u) + d(u, w), ∀u ∈ C.
4. Let C be the code consisting of all binary words of length 4 that have even weight.
(a) [2] Find the error patterns C detects.
(b) [2] Find the error patterns C corrects.
5. [2] Prove that the minimum distance of a linear code is the smallest weight of a non-zero
codeword.
6. [6] Prove that a code can simultaneously correct all error patterns of weight at most
t, and unambiguously detect all non-zero error patterns of weight t + 1 to d (where
t ≤ d) if and only if it has minimum distance at least t + d + 1. (For example, consider
C = {000, 111}. This single error correcting code detects all non-zero error patterns of
weight at most 2. But, if 000 is sent and 110 is received, then only one error is detected
and the received word is incorrectly decoded as 111. The ambiguity here is that it is not
clear whether the error pattern is 110 or 001.)
MATH 550
Assignment 1
Solutions
Question 1
(a) Yes. Since the probability of error equals 1, each bit is received incorrectly. By
inverting each bit at the receiver we obtain the original bit. On the other hand a
BSC with p = 1/2 is completely unreliable. The probability of seeing a specific bit
at the receiving end equals 1/2; this situation is similar to flipping an unbiased coin
at the receiver and recording heads as 1 and tails as 0. Thus the channel is not able
to carry any information.
Question 2
(a) We have 1 ≤ |C| ≤ 2n (one codeword; all words of length n). Therefore since log is
monotone increasing
log2 (1)/n ≤ log2 |C|/n ≤ log2 (2^n)/n,
so that
0 ≤ i(C) ≤ 1.
(i) i(C) = log2 |C|/n = 11/11 = 1.
(ii) This code cannot detect any errors since all words of length 11 are codewords
— any codeword will be changed into another codeword by any error pattern.
The undetected error probability then is (q = 1 − p the error probability)
Pe (C) = C(11,1) q p^{10} + C(11,2) q^2 p^9 + C(11,3) q^3 p^8 + · · · + C(11,10) q^{10} p + C(11,11) q^{11},
       = 1.1 × 10^{−7}.
(iii) 10^7 bits per second implies 864 × 10^9 bits per day; this is approximately
78545454545 words (of length 11) per day. We expect a fraction of 1.1 × 10^{−7}
of these to be in error. Therefore about 8640 words per day are in error.
i(C′) = log2 |C′|/n′ = 11/12.
(ii) This parity check code can detect all error patterns of odd weight. On the
other hand an error pattern of even weight results in a received word of even
weight (either two ones cancel or both ones contribute to the weight of the
received word). Thus even weight error patterns are not detectable. Therefore
Pe (C′) = C(12,2) q^2 p^{10} + C(12,4) q^4 p^8 + C(12,6) q^6 p^6 + C(12,8) q^8 p^4 + C(12,10) q^{10} p^2 + q^{12},
        = 6.6 × 10^{−15}.
(iii) We are transmitting about 72 × 109 words per day. From part (ii) we know
that the undetected error rate is 6.6 × 10−15 . Therefore about (72 × 109 )(6.6 ×
10−15 ) = 4.752 × 10−4 words per day are in error. This is the same as about 1
word error every 2104 days (≈ 5.77 years).
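The two probabilities computed above can be reproduced numerically. The bit error probability q is not stated in this excerpt; q = 10^{-8} is the value consistent with both quoted figures (1.1 × 10^{-7} ≈ 11q and 6.6 × 10^{-15} ≈ 66q^2):

```python
from math import comb

q = 1e-8          # assumed channel bit error probability (not in the excerpt)
p = 1 - q

# length-11 code, all error patterns undetected
Pe_C = sum(comb(11, k) * q**k * p**(11 - k) for k in range(1, 12))
assert abs(Pe_C - 1.1e-7) < 1e-9

# length-12 parity check code, only even-weight error patterns undetected
Pe_Cprime = sum(comb(12, k) * q**k * p**(12 - k) for k in range(2, 13, 2))
assert abs(Pe_Cprime - 6.6e-15) < 1e-16
```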
Question 3
(a) d(u, w) = 0 ⇐⇒ u = w.
⇒:
Let u, w ∈ C and d(u, w) = 0. This means that u and w differ in 0 places, therefore
u = w.
⇐:
d(u, u) = wt(u + u) = wt(0) = 0.
Question 4
(a) Any odd weight error pattern is detectable since it changes a codeword (of even
weight) into a word of odd weight. Even weight error patterns are not detectable since
they change a codeword into a word of even weight, which is a codeword. Thus the
detectable error patterns are precisely the error patterns of odd weight.
(b) It is easily verified that this code is linear and so its minimum distance is 2. To
be able to correct all error patterns of weight t we need a minimum distance of at
least 2t + 1. In our case this implies t = 1/2. This then says that no errors of
weight greater than 1 will be correctable but some errors of weight one may still be
correctable. Adding a weight one error pattern to any of the codewords produces a
received word that is closer to more than one codeword, therefore decoding becomes
impossible. Thus the code cannot correct any errors.
Question 5
The last step follows from the fact that C is linear so that the sum of two codewords is
again a codeword. Further, all nonzero codewords can be formed in this way since if v is
a nonzero codeword, then v ≠ 0 and v = v + 0.
Question 6
A code C can simultaneously correct all error patterns of weight at most t and detect
all nonzero error patterns of weight t + 1 to d (d ≥ t) ⇐⇒ dmin (C) ≥ t + d + 1.
⇐:
Let dmin (C) ≥ t + d + 1 ≥ 2t + 1, since d ≥ t. By a previous theorem we now know that
C can correct all error patterns of weight t. Let u ∈ C be sent, w be received and z be
an error pattern such that t + 1 ≤ wt(z) ≤ d. Therefore t + 1 ≤ d(u, w) ≤ d. Let v ∈ C
and v ≠ u, then
d(v, w) ≥ d(u, v) − d(u, w) ≥ (t + d + 1) − d = t + 1.
Therefore w lies outside the t-ball around v. Since v was an arbitrary codeword, w does
not lie inside any of the t-balls that surround the codewords of C. This implies that w
will not be decoded to any codeword, as only words that lie inside a t-ball of a codeword
will be decoded to that codeword. We are therefore able to detect that more than t, but
less than d + 1 errors have occurred.
⇒:
Let C be a code that can simultaneously correct t and detect t + 1 to d (d ≥ t) errors.
The fact that C can correct t errors implies that dmin (C) ≥ 2t + 1. Let u, v ∈ C such
that d(u, v) = dmin (C). Place a ball of radius d around u and a ball of radius t around
v. By the error detection property of the code, any error pattern of weight d will not be
able to change u in such a way that it lies inside the t-ball around v. Therefore the two
balls above are disjoint. This implies that d(u, v) = dmin (C) ≥ t + d + 1.
Math 433D/550
Assignment 2
3. [4] Let H be a parity check matrix for a linear code C. Prove that C has minimum
distance d if and only if any set of d − 1 rows of H is linearly independent and some
set of d rows of H is linearly dependent.
4. [4] Suppose that C is a linear code of length n with minimum distance at least 2t + 1.
Prove that every coset of C contains at most one word of weight t or less. Use this to
show that syndrome decoding corrects all error patterns of weight at most t.
5. (a) [3] Prove the Singleton bound: For an (n, k, d)-code, d − 1 ≤ n − k. (Hint: consider
a parity check matrix.)
(b) [4] An (n, k, d)-code is called maximum distance separable (MDS) if equality holds in
the Singleton bound, that is, if d = n − k + 1. Prove that the following statements are
equivalent.
(1) C is MDS,
(2) every n − k rows of the parity check matrix are linearly independent,
(3) every k columns of the generator matrix are linearly independent.
(c) [3] Show that the dual of an (n, k, n − k + 1) MDS code is an (n, n − k, k + 1) MDS
code.
MATH 550
Assignment 2
Solutions
Question 1
(a) We write the rows of S into the matrix A and put it in reduced row echelon form.
Starting from

    1 1 0 0 0
    0 1 1 1 1
    1 1 1 1 0
    0 1 0 1 0

we add row 1 to row 3, add row 2 to row 4, add row 3 to row 4, add row 2 to row 1,
add row 3 to row 1, add row 3 to row 2, and finally add row 4 to row 3.
Therefore the generator matrix is

        1 0 0 0 1
    G = 0 1 0 0 1 ,
        0 0 1 0 1
        0 0 0 1 1

and the parity check matrix is

        1
        1
    H = 1 .
        1
        1
(c) Using question 3 we see that dmin (C) = 2. Further, GT is a parity check matrix
for C ⊥ from which we see (using question 3 again) that dmin (C ⊥ ) = 5. Therefore
C is a [5, 4, 2]-code and C ⊥ is a [5, 1, 5]-code.
Question 2
(a) The message CALL HOME is encoded as : 0010101 0000000 1011001 1011001
0111000 1110100 1100001 0100110.
This is a parity check matrix for the (7, 4, 3) Hamming code. We can also see using
question 3 that the minimum distance is 3. Therefore this code can only correct one
error and so the coset leaders are the words of weight at most 1. The syndrome for
a coset leader of weight 1 will be the row of H corresponding to the position where
the coset leader has a one. The received words have the following syndromes.
Received Syndrome
0111000 000
0110110 101
1011001 000
1011111 110
1100101 100
0100110 000
The syndrome 101, which is row three of H, shows that an error occurred in the
third position of the second received word. The syndrome 110, which is the second
row of H, shows that an error occurred in the second position of the fourth received
word. Similarly, the fifth received word has an error in position five. Of course, the
zero syndrome indicates no errors. Decoding then produces the following words :
0111 0100 1011 1111 1100 0100. This corresponds to : HELP ME.
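The decoding above can be replayed mechanically. The following sketch assumes the parity check matrix H whose rows are 111, 110, 101, 011, 100, 010, 001 (consistent with the syndromes quoted above) and the numbering A = 0000, B = 0001, and so on, with spaces dropped.

```python
# rows of H, as inferred from the syndromes used above (an assumption)
H = ['111', '110', '101', '011', '100', '010', '001']

def syndrome(w):
    s = 0
    for bit, row in zip(w, H):
        if bit == '1':
            s ^= int(row, 2)          # GF(2) sum of the rows where w has a 1
    return format(s, '03b')

def decode(w):
    s = syndrome(w)
    if s != '000':
        i = H.index(s)                # single-error position
        w = w[:i] + ('0' if w[i] == '1' else '1') + w[i + 1:]
    return w[:4]                      # the data digits are the first four

received = ['0111000', '0110110', '1011001', '1011111', '1100101', '0100110']
data = [decode(w) for w in received]
print(''.join(chr(ord('A') + int(d, 2)) for d in data))  # HELPME
```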
Question 3
Let C be a linear code and H its parity check matrix. Then C has minimum distance d
if and only if any set of d − 1 rows of H is linearly independent and some set of d rows is
linearly dependent.
Proof :
⇐:
Let ri1 , ri2 , . . . , rid be the set of linearly dependent rows. That is ri1 + ri2 + · · · + rid = 0.
Let x = x1 x2 x3 . . . xn ∈ K n such that xj = 1 if j ∈ {i1 , i2 , . . . , id } and xj = 0 otherwise.
Then xH = ri1 + ri2 + · · · + rid = 0. This implies that wt(x) = d and x ∈ C. Let z ∈ C
with z ≠ 0. Then zH = 0. Assume wt(z) ≤ d − 1. This would imply that a sum of d − 1
or fewer rows of H is equal to zero. But by hypothesis any set of d − 1 rows of H (and
hence any smaller set) is linearly independent, so such a sum is nonzero, a contradiction.
Thus wt(z) ≥ d, so that dmin (C) = d.
⇒:
Let dmin (C) = d. Then there exists x ∈ C such that wt(x) = d and x 6= 0. Further,
xH = 0, which implies that a sum of d rows of H equals zero meaning that they are linearly
dependent. Since wt(z) ≥ d for all nonzero z ∈ C, if y ∈ K n is nonzero with wt(y) ≤ d − 1,
then y is not in C and yH ≠ 0. Therefore any set of d − 1 or fewer rows of H is linearly independent.
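The theorem can be tested on a small example. Here we check it for the parity check matrix of the (7, 4, 3) Hamming code from question 2 (an assumption: rows 111, 110, 101, 011, 100, 010, 001), stored as 3-bit integers.

```python
from itertools import combinations

# rows of the Hamming (7,4,3) parity check matrix, as 3-bit integers
H = [0b111, 0b110, 0b101, 0b011, 0b100, 0b010, 0b001]

def independent(rows):
    # over K = GF(2): independent iff no nonempty subset XORs to zero
    for r in range(1, len(rows) + 1):
        for sub in combinations(rows, r):
            acc = 0
            for v in sub:
                acc ^= v
            if acc == 0:
                return False
    return True

# every set of d - 1 = 2 rows is linearly independent ...
assert all(independent(s) for s in combinations(H, 2))
# ... and some set of d = 3 rows is linearly dependent
assert any(not independent(s) for s in combinations(H, 3))
print('consistent with dmin = 3')
```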
Question 4
Let C be a linear code of length n and dmin (C) ≥ 2t + 1. One of the cosets of C is C
itself. For every x ∈ C with x ≠ 0, wt(x) ≥ 2t + 1. Therefore in the coset C, 0 is the
only word that has weight at most t. Let y ∈ K n with wt(y) ≤ t and consider the
coset y + C. Let z ∈ y + C, therefore z = y + c, c ∈ C. If c 6= 0, then wt(y + c) ≥ t + 1,
since y has only at most t 1’s to cancel the 1’s of c (of which there are at least 2t + 1).
Now y ∈ y + C and y = y + 0, so y is the only word in y + C that has weight at most
t. Thus the set of cosets {y + C | wt(y) ≤ t} each has a unique word of weight at most
t. All cosets are disjoint and any cosets that remain, apart from the ones above (if there
are any), will all contain words of weight at least t + 1. Therefore every coset has at most
one word of weight t or less (some cosets may have none of these words).
Let u be an error pattern of weight at most t, v ∈ C be sent and w = v +u be received.
Then wH = (u + v)H = uH + vH = uH. Therefore the coset is uniquely determined by
the error pattern. If we let the syndrome uH correspond to the coset u + C then u is the
unique word of weight at most t in u + C and so u will be chosen as the (correct) error
pattern and decoding will be successful.
Question 5
(a) (i) An (n, k, d)-code is equivalent to a code in standard form that has a parity
check matrix of the form

H =
[ X ]
[ I(n−k) ]

where X is k × (n − k). Since H has n − k columns, at most n − k of its rows can be
linearly independent. By question 3 some set of d − 1 rows of H is linearly independent,
so d − 1 ≤ n − k.
(b) (i) C is MDS ⇐⇒ every n − k rows of the parity check matrix are linearly independent.
⇒:
Follows from question 3.
⇐:
We again use question three and assume that C is in standard form. Note that
the linearly dependent set of n − k + 1 rows can be taken as the n − k rows of
In−k together with a row from X.
(ii) C is MDS ⇐⇒ every k columns of the generator matrix are linearly independent.
⇒:
Let C be an MDS code and assume some set of k columns of its generator
matrix, G, are linearly dependent, say ci1 , ci2 , . . . , cik . Consider the square
matrix M = [ci1 , ci2 , . . . , cik ]. Since dimension of the row space is equal to the
dimension of the column space and since M has k linearly dependent columns,
we know that the k rows of M are also linearly dependent; that is, some nonempty
set of them sums to zero. Summing the corresponding rows of G (which is the
same as encoding a nonzero message word) produces a nonzero codeword with
zeros in positions i1 , i2 , . . . , ik . Therefore this codeword has at least k zeros, so
that it can have at most n − k ones. Since C is MDS, dmin = n − k + 1, so every
nonzero codeword has weight at least n − k + 1, a contradiction.
(c) Assume C is in standard form. We know that if G and H are the generator and parity
check matrices for C, then G^T and H^T are the parity check and generator matrices
for C ⊥ . So from part (b) above we know that the parity check matrix for C ⊥ (that is, G^T )
has every set of k rows linearly independent, and we can also find a set of k + 1 linearly
dependent rows (as above). Then by question 3 this shows that dmin (C ⊥ ) = k + 1.
Therefore C ⊥ is an [n, n − k, k + 1]-code and since k + 1 = n − (n − k) + 1 it is also
MDS.
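As a sanity check on (b)(ii): the [5, 4, 2]-code from question 1 of this assignment is MDS (d = 2 = n − k + 1), so every k = 4 columns of its generator matrix should be linearly independent. A short enumeration over GF(2) confirms this.

```python
from itertools import combinations

# columns of G = [I4 | parity] for the [5,4,2] even-weight code, as 4-bit ints
G_cols = [0b1000, 0b0100, 0b0010, 0b0001, 0b1111]

def independent(vecs):
    # over GF(2): independent iff no nonempty subset XORs to zero
    for r in range(1, len(vecs) + 1):
        for sub in combinations(vecs, r):
            acc = 0
            for v in sub:
                acc ^= v
            if acc == 0:
                return False
    return True

assert all(independent(cols) for cols in combinations(G_cols, 4))
print('every 4 columns of G are independent, as (b)(ii) predicts')
```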
Math 433D/550
Assignment 3
1. [5] Is it true that in a self-dual code all words have even weight?
2. [4] Let C be a Hamming code of length 15. Find the number of error patterns the
extended code C ∗ will detect and the number of error patterns that C ∗ will correct.
3. [4] Count the number of codewords of weight 7 in the Golay code C23 . (Hint: Start by
proving that every word of weight 4 in K 23 is distance 3 from exactly one codeword.)
4. [4] Let G(1, 3) be the generator matrix for RM (1, 3). Decode the following received
words: (i) 01011110; (ii) 01100111; (iii) 00010100; (iv) 11001110.
5. [5] If possible, devise a single error correcting code with length 6, 4 information
digits, and using the digits 0, 1, 2, 3, 4, 5. Describe the code, an encoding procedure,
and a decoding procedure. Prove that your code corrects all single errors. What is its
information rate? If not possible, say why not.
7. Consider the code with 10 decimal digits in which the check digit x10 is the least residue
of x1 x2 · · · x9 (mod 7) (that is, 0 ≤ x10 ≤ 6). (As of my last information, this code is
used by UPS and Federal Express.)
(a) [3] Under what conditions are single errors undetected?
(b) [3] Assuming that each single error is equally likely, what percentage of single errors
are undetected?
(c) [2] Repeat (b) for errors involving transposition of digits.
MATH 550
Assignment 3
Solutions
Question 1
Question 2
All Hamming codes have a minimum distance of 3. The extended code C ∗ has minimum
distance 4. Therefore C ∗ can detect all errors of weight 1, 2 or 3. Further C ∗ can detect
all error patterns of odd weight. This can be seen from the parity check matrix for C ∗ .
H ∗ =
0 0 0 1 1
0 0 1 0 1
0 0 1 1 1
0 1 0 0 1
0 1 0 1 1
0 1 1 0 1
0 1 1 1 1
1 0 0 0 1
1 0 0 1 1
1 0 1 0 1
1 0 1 1 1
1 1 0 0 1
1 1 0 1 1
1 1 1 0 1
1 1 1 1 1
0 0 0 0 1
The syndrome associated with an error pattern of odd weight is the sum of an odd number
of rows of H ∗ . Such a sum will always have a 1 in the last digit and will thus be nonzero,
enabling us to detect the error. The number of odd weight error patterns is

C(16, 1) + C(16, 3) + C(16, 5) + · · · + C(16, 15) = (1/2) · 2^16 = 2^15.
An even weight error pattern that is not detectable takes one codeword into another
codeword. Therefore this error pattern is itself a codeword. All codewords of C ∗ have
even weight, and therefore they are exactly the even weight error patterns that C ∗ cannot
detect. C ∗ has 2^(2^4 − 4 − 1) = 2^11 codewords. So the number of even weight error
patterns that are detectable is
C(16, 0) + C(16, 2) + C(16, 4) + · · · + C(16, 16) − 2^11 = (1/2) · 2^16 − 2^11 = 30720.
Thus the number of detectable error patterns is the number of odd weight error
patterns plus the number of even weight detectable error patterns. This is 2^15 + 30720 =
63488.
All error patterns of weight one are correctable since the minimum distance is 4. The
syndrome corresponding to an error pattern of weight 2 is the sum of two rows of H ∗ .
Such a sum is equal to a row of H ∗ except that the entry on the right will be 0 instead of
1. Each such sum can arise in at least two different ways: use the row itself plus the last
row of H ∗ , or use a pair of other rows, for example row 1 = row 2 + row 3, row 2 =
row 8 + row 10, row 3 = row 1 + row 2, row 4 = row 5 + row 1, and so on. Therefore the syndrome associated
with error patterns of weight 2 is not unique. In other words the coset containing an error
pattern of weight 2 contains at least one other error pattern of weight 2. Thus no error
pattern of weight 2 is correctable.
An error pattern of weight k ≥ 3 is the sum of an error pattern of weight 2 and an
error pattern of weight k − 2. Therefore the syndrome associated with this error pattern
is the sum of the syndrome of the weight 2 error pattern and the syndrome of the weight
k − 2 error pattern. Since the first syndrome is not unique, these syndromes (for weight
k error patterns) can arise in more than one way. So no error pattern of weight k ≥ 3 is
correctable.
Therefore the number of correctable error patterns equals the number of single errors,
which is 16.
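The binomial arithmetic above is easy to re-check with math.comb:

```python
from math import comb

odd = sum(comb(16, i) for i in range(1, 17, 2))    # odd weight patterns
even = sum(comb(16, i) for i in range(0, 17, 2))   # even weight (incl. weight 0)
assert odd == 2 ** 15
assert even - 2 ** 11 == 30720                     # even weight and detectable
print(odd + even - 2 ** 11)                        # 63488 detectable patterns
```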
Question 3
The Golay code C23 is a perfect (23, 12, 7)-code. Therefore every word in K 23 is in exactly
one ball with center a codeword and radius 3. The ball around the zero codeword contains
words of weight at most 3, so no word of weight 4 is inside this ball. Let v ∈ C23 be a
codeword of weight at least 8. Then the words inside the ball around v differ from v in
1, 2 or 3 places. Therefore the words inside this ball have weight at least 5. Every nonzero
codeword in C23 has weight at least 7. So consider now a codeword u ∈ C23 of weight 7.
The ball around u contains words that differ from u in 1, 2 or 3 places. By changing 3 of
u's 1's to 0's we obtain a word of weight 4 inside this ball, so there are C(7, 3) = 35 words
of weight 4 inside this ball. There are C(23, 4) = 8855 words of weight 4 in K 23 and they
all have to lie inside a ball of radius 3 with center a codeword of weight 7. Since each such
ball contains 35 of these words, 8855/35 = 253 such balls are required.
Therefore there are 253 codewords of weight 7.
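The two counting facts used here, the perfection of C23 and the quotient 8855/35, can be verified directly:

```python
from math import comb

ball = 1 + comb(23, 1) + comb(23, 2) + comb(23, 3)  # words in a radius-3 ball
assert 2 ** 12 * ball == 2 ** 23                    # C23 is perfect

assert comb(23, 4) == 8855                          # weight-4 words in K^23
assert comb(7, 3) == 35                             # weight-4 words per ball
print(comb(23, 4) // comb(7, 3))                    # 253 codewords of weight 7
```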
Question 4
G(1, 3) =
1 1 1 1 1 1 1 1
0 1 0 1 0 1 0 1
0 0 1 1 0 0 1 1
0 0 0 0 1 1 1 1
Note that RM (1, 3)⊥ = RM (3 − 1 − 1, 3) = RM (1, 3). Therefore G(1, 3)G(1, 3)^T = 0.
Also G(1, 3)^T has 4 linearly independent columns, so that G(1, 3)^T is a parity check matrix
for RM (1, 3). Further dmin (RM (1, 3)) = 4.

(i) 01011110 · G(1, 3)^T = 1100 + 1110 + 1001 + 1101 + 1011 = 1101. This equals the sixth
row of G(1, 3)^T and therefore we assume an error in the sixth position and decode to
01011010.

(ii) 01100111 · G(1, 3)^T = 1100 + 1010 + 1101 + 1011 + 1111 = 1111. This is row 8, so we
assume a single error in position 8 and decode to 01100110.

(iii) 00010100 · G(1, 3)^T = 1110 + 1101 = 0011. This doesn't equal a row of G(1, 3)^T , so
more than one error probably occurred. Furthermore this syndrome can arise in
two different ways: errors in positions 4 and 6, as well as errors in positions 3 and
5. So the best we can do is ask for a retransmission.

(iv) 11001110 · G(1, 3)^T = 1000 + 1100 + 1001 + 1101 + 1011 = 1011. This equals row 7,
so we assume a single error in position 7 and decode to 11001100.
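The decoding procedure can be sketched as follows: the columns of G(1, 3) are stored as 4-bit integers, and the syndrome w · G(1, 3)^T is the GF(2) sum of the columns in the positions where w has a 1.

```python
# columns of G(1,3), stored as 4-bit integers
cols = [0b1000, 0b1100, 0b1010, 0b1110, 0b1001, 0b1101, 0b1011, 0b1111]

def decode(w):
    s = 0
    for i, bit in enumerate(w):
        if bit == '1':
            s ^= cols[i]              # syndrome w * G(1,3)^T over GF(2)
    if s == 0:
        return w                      # no error detected
    if s in cols:                     # syndrome matches a single column:
        i = cols.index(s)             # assume one error in that position
        return w[:i] + ('0' if w[i] == '1' else '1') + w[i + 1:]
    return None                       # ambiguous; ask for retransmission

for w in ['01011110', '01100111', '00010100', '11001110']:
    print(w, '->', decode(w))
```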
Question 5
A codeword will be made up of 4 information digits x1 , x2 , x3 and x4 . The two parity
digits x5 and x6 will be chosen such that

S1 = x1 + 2x2 + 3x3 + 4x4 + 5x5 + 6x6 ≡ 0 (mod 7),
S2 = x1 + x2 + x3 + x4 + x5 + x6 ≡ 0 (mod 7).
4. If only one of the Si is nonzero and the other is equal to zero, then assume more
than one error occurred.
We now modify the code so that it only uses the digits 0 through 5. By simply
restricting the possibilities for the digits x1 , x2, x3 and x4 to {0, 1, 2, 3, 4, 5} we ensure
that they meet the requirement. The possibility still exists that x5 , x6 or both could equal
6. So among the 6^4 = 1296 codewords that have x1 , x2 , x3 and x4 in {0, 1, 2, 3, 4, 5}, we
want to remove those with x5 , x6 or both equal to 6.
For x5 = 6 we have

y1 + y2 + y3 + y4 = 7k + 6,

where y1 = 2x1 , y2 = 3x2 , y3 = 4x3 and y4 = 5x4 , so that y1 ∈ {0, 2, 4, 6, 8, 10},
y2 ∈ {0, 3, 6, 9, 12, 15}, y3 ∈ {0, 4, 8, 12, 16, 20} and y4 ∈ {0, 5, 10, 15, 20, 25}. The
generating function for counting the number of solutions is

(1 + x^2 + x^4 + x^6 + x^8 + x^10 )(1 + x^3 + x^6 + x^9 + x^12 + x^15 )(1 + x^4 + x^8 + x^12 + x^16 + x^20 )(1 + x^5 + x^10 + x^15 + x^20 + x^25 ).

We want the coefficient of x^i for i = 6, 13, 20, 27, 34, 41, 48, 55, 62, 69, where i corresponds
to the possible values 7k + 6 for k = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. In each case
the corresponding coefficients are 3, 10, 22, 33, 38, 36, 25, 13, 5, 0. So in total there are
3 + 10 + 22 + 33 + 38 + 36 + 25 + 13 + 5 + 0 = 185 solutions (or codewords then) that
have at least x5 = 6.
Proceeding in the same manner as above we find that for x6 = 6 we get
y1 + y2 + y3 + y4 = 7k + 6,
with y1 ∈ {0, 4, 8, 12, 16, 20}, y2 ∈ {0, 3, 6, 9, 12, 15}, y3 ∈ {0, 2, 4, 6, 8, 10} and y4 ∈
{0, 1, 2, 3, 4, 5}.
Here the generating function is

(1 + x^4 + x^8 + x^12 + x^16 + x^20 )(1 + x^3 + x^6 + x^9 + x^12 + x^15 )(1 + x^2 + x^4 + x^6 + x^8 + x^10 )(1 + x + x^2 + x^3 + x^4 + x^5 ).

The coefficients that we are after are the coefficients of x^i for i = 6, 13, 20, 27, 34, 41, 48, 55, 62, 69.
In this case they are 8, 27, 46, 51, 36, 15, 2, 0, 0, 0. Therefore there are 8 + 27 + 46 + 51 +
36 + 15 + 2 + 0 + 0 + 0 = 185 codewords that have at least x6 = 6.
Lastly we need the case x5 = x6 = 6. In this case we have
y1 + y2 + y3 = 7k,
where y1 = 5x1 , y2 = 2x3 and y3 = 4x4 so that y1 ∈ {0, 5, 10, 15, 20, 25}, y2 ∈ {0, 2, 4, 6, 8, 10}
and y3 ∈ {0, 4, 8, 12, 16, 20}. The generating function for counting the number of solutions
is

(1 + x^5 + x^10 + x^15 + x^20 + x^25 )(1 + x^2 + x^4 + x^6 + x^8 + x^10 )(1 + x^4 + x^8 + x^12 + x^16 + x^20 ).

Here we want the coefficients of x^i for i = 0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70. They are
respectively 1, 1, 5, 5, 7, 7, 3, 2, 0, 0, 0. So there are 1+1+5+5+7+7+3+2+0+0+0 = 31
solutions where x5 = x6 = 6.
Therefore the total number of solutions where x5 , x6 or both equal 6 is 185 + 185 − 31 =
339. So by removing these codewords from the code we get a code with 6^4 − 339 = 957
codewords.
For the rate we notice that there are 6^4 = 1296 possible codewords but that only 957
of them are used; therefore the rate is 957/6^4 ≈ 0.73.
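The generating-function coefficients above can be recomputed by direct polynomial multiplication; the digit ranges (multiples of 2, 3, 4, 5 and so on) are as in the derivations above.

```python
def poly(step):
    # 1 + x^step + x^(2 step) + ... + x^(5 step), as a coefficient list
    p = [0] * (5 * step + 1)
    for j in range(6):
        p[j * step] = 1
    return p

def mult(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def count(steps, residue):
    # total of the coefficients of x^i with i = residue (mod 7)
    p = [1]
    for s in steps:
        p = mult(p, poly(s))
    return sum(c for i, c in enumerate(p) if i % 7 == residue)

n5 = count([2, 3, 4, 5], 6)      # codewords with x5 = 6
n6 = count([4, 3, 2, 1], 6)      # codewords with x6 = 6
n56 = count([5, 2, 4], 0)        # codewords with x5 = x6 = 6
print(n5, n6, n56)               # 185 185 31
print(6 ** 4 - (n5 + n6 - n56))  # 957
```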
Question 6
(c) If two digits from x1 , x2 , x3 or x4 are transposed this will not be detected since
the parity check equation remains valid. As an example say that x4 and x5 are
transposed, then the check equation becomes x1 + x2 + x3 + x5 ≡ x4 (mod 9). This
is the same as x1 + x2 + x3 + x4 + x5 ≡ 2x4 (mod 9), which in turn is 2x5 ≡ 2x4
(mod 9) implying x4 ≡ x5 (mod 9). The only way in which this can occur is if
x4 = 9 and x5 = 0.
Therefore transpositions involving the check digit will be detectable as long as the
digits involved are not 9 (for x1 through x4 ) and 0 (for x5 ).
Question 7
The check equation is x1 x2 · · · x9 ≡ x10 (mod 7) (as above). Therefore it is the same as
x1 · 10^8 + x2 · 10^7 + · · · + x9 · 10^0 ≡ x10 (mod 7).
(a) Say a single error of size e occurs in position i, 1 ≤ i ≤ 9. The left side of the check
equation becomes x1 · 10^8 + x2 · 10^7 + · · · + (xi + e) · 10^(9−i) + · · · + x9 · 10^0 ≡ x10 + e · 10^(9−i)
(mod 7). The error will be undetectable if e · 10^(9−i) ≡ 0 (mod 7). This implies that
e = ±7. This corresponds to the following errors : 0 → 7, 1 → 8, 2 → 9, 7 → 0,
8 → 1 and 9 → 2.
If an undetectable error of size e occurs in position 10, then it has to be the case
(also) that e ≡ 0 (mod 7). This corresponds to the same set of errors as above,
but now since 0 ≤ x10 ≤ 6, the only possible undetectable errors in position 10 are
0 → 7, 1 → 8 and 2 → 9. Since all these errors change x10 into something bigger
than 6, they will be detectable.
(b) Consider first errors occurring in positions 1 through 9. For each of the 9 possible
positions there are 10 possible digits for each position and for each digit and position
there are 9 possible errors. Therefore in the first nine positions there are 10× 9×9 =
810 possible errors. In the tenth position there can be any one of 7 digits and each
digit can be changed into one of 9 possibilities. Therefore there are 7 × 9 = 63 single
errors involving the tenth position. So in total there are 810 + 63 = 873 possible
single errors.
All errors in position 10 are detectable, so we only concern ourselves with the first
nine positions. In these positions there are 6 possible undetectable errors. Each of
these 6 errors can occur in one of 9 positions, so there are 9 × 6 = 54 undetectable
errors. Therefore the percentage of undetectable errors is 54/873 ≈ 6%.
(c) Suppose the digits xi and xj in positions i < j ≤ 9 are transposed. The error will be
undetectable if (xj − xi )(10^(9−i) − 10^(9−j) ) ≡ 0 (mod 7). That is, if (xj − xi ) = ±7 or if
(10^(9−i) − 10^(9−j) ) ≡ 0 (mod 7). The last equation is the same as
10^(9−j) (10^(j−i) − 1) ≡ 0 (mod 7). Now 7 ∤ 10^(9−j) , so 7 | 10^(j−i) − 1. That is 10^(j−i) ≡ 1
(mod 7), implying j − i ≡ 0 (mod φ(7)). Since φ(7) = 6 this corresponds to j = 9,
i = 3; j = 8, i = 2; and j = 7, i = 1. Therefore if the digits in positions 1 and 7, 2
and 8, or 3 and 9 are transposed, an undetectable error occurs (regardless of the digits
involved). Now there are 90 × 3 = 270 transpositions involving these positions.
Furthermore if the size of a transposition is ±7, that is (xi − xj ) = ±7, the trans-
position is undetectable. This corresponds to the 6 transpositions 0 ↔ 7, 1 ↔ 8,
2 ↔ 9, 7 ↔ 0, 8 ↔ 1 and 9 ↔ 2. Transpositions of size ±7 have already been
considered for positions 3 and 9, 2 and 8 and positions 1 and 7, above. Thus we
are left with considering transpositions involving all the other positions. There are
(1+2+3+· · ·+8)−3 = 33 of these (the total number of transpositions involving the
first 9 positions − the 3 considered above). So there are 33 × 6 = 198 transpositions
of size 7 in the remaining positions.
The check equation is equivalent to
2x1 + 3x2 + x3 + 5x4 + 4x5 + 6x6 + 2x7 + 3x8 + x9 + 6x10 ≡ 0 (mod 7).
Therefore x10 and x6 can always be transposed and it will not be detected. There
are 90 such transpositions.
So, there are 270 + 198 + 90 = 558 undetectable transpositions. The percentage
then is 558/4050 ≈ 14%.
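A brute force count over positions 1 through 9 (where the analysis above applies directly) reproduces the figures 54 and 270 + 198 = 468. The coefficients 10^(9−i) mod 7 are those of the check equation above.

```python
coef = [2, 3, 1, 5, 4, 6, 2, 3, 1]   # 10^(9-i) mod 7 for positions i = 1..9

# single errors: digit a replaced by digit b in some position
singles = sum(1 for c in coef for a in range(10) for b in range(10)
              if a != b and (b - a) * c % 7 == 0)

# transpositions within positions 1..9: digits a, b in positions i < j swapped
trans = sum(1 for i in range(9) for j in range(i + 1, 9)
            for a in range(10) for b in range(10)
            if a != b and (a - b) * (coef[j] - coef[i]) % 7 == 0)

print(singles, trans)  # 54 468
```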
Math 433D/550
Assignment 4
2. Suppose Bob has an RSA cryptosystem with a large modulus n which cannot be
factored easily. Alice sends a message to Bob by representing the alphabetic characters
A, B, ..., Z as 0, 1, ..., 25, respectively, and then encrypting each character (i.e., number)
separately.
(a) [4] Explain how a message encrypted in this way can easily be decrypted.
(b) [2] The following cipher-text was encrypted using the scheme described above with
n = 18721 and b = 25:
365, 0, 4845, 14930, 2608, 2608, 0
Illustrate your method from (a) by decrypting this cipher-text without factoring n.
This example illustrates a protocol failure in RSA. It demonstrates that a cipher-text can
sometimes be decrypted by an adversary if the system is used in a careless way. Thus a
secure cryptosystem is not enough to assure secure communication; it must also be used
properly.
3. [5] What happens if the RSA system is set up using p and q where p is prime but q
is not? Does encryption work (is EK (x) 1-1)? Can all encrypted messages be uniquely
decrypted? Illustrate your points with an example where p and q are two digit numbers.
6. [4] Factor 262063 and 9420457 using the p − 1 method. In each case, how big does B
have to be to be successful?
7. [6] Gary's Poor Security (GPS) public key cryptosystem has Ek (x) = ax (mod n),
where n = 17575, a is the receiver's public key, and gcd(a, n) = 1. The plain-text space
and cipher-text space are both Zn , and each element of Zn represents three alphabetic
characters as in the following examples:

DOG → 3 × 26^2 + 14 × 26 + 6 = 2398
CAT → 2 × 26^2 + 0 × 26 + 19 = 1371.
The following message has been encrypted using Gary’s public key, 1411.
7017, 17342, 5595, 16298, 12285
Explain how to break the system and decrypt the message. Do it. Show your work.
8. [5] The following message was encrypted using the Rabin cryptosystem with
n = 19177 = 127 × 151 and B = 5679:
2251, 8836, 7291, 6035
The elements of Zn correspond to triples of alphabetic characters as in question 7. Decrypt
the message. Explain how you decided among the four possible plain-texts for each cipher-text
symbol.
MATH 550
Assignment 4
Solutions
Question 1
Letter Frequency        Letter Frequency
A 13                    O 2
B 21                    P 20
C 32                    Q 4
D 9                     R 12
E 13                    S 1
F 10                    U 6
H 1                     V 4
I 16                    X 2
J 6                     Y 1
K 20                    Z 4
N 1
The most frequent letters turn out to be C, B, K, P and I. Based on this we guess that
E → C and T → B. That is
Ek (4) = 2, ∴ a · 4 + b = 2,
Ek (19) = 1, ∴ a · 19 + b = 1.
From this we find that a ≡ 19 (mod 26) and b ≡ 4 (mod 26). Therefore
Dk (A) = I    Dk (O) = G
Dk (B) = T    Dk (P) = R
Dk (C) = E    Dk (Q) = C
Dk (D) = P    Dk (R) = N
Dk (E) = A    Dk (S) = Y
Dk (F) = L    Dk (U) = U
Dk (H) = H    Dk (V) = F
Dk (I) = S    Dk (X) = B
Dk (J) = D    Dk (Y) = M
Dk (K) = O    Dk (Z) = X
Dk (N) = V
OCANADATERREDENOSAIEUXTONFRONTESTCEINTDEFLEUR
ONSGLORIEUXCARTONBRASSAITPORTERLEPEEILSAITPOR
TERLACROIXTONHISTOIREESTUNEEPOPEEDESPLUSBRILL
ANTSEXPLOITSETTAVALEURDEFOITREMPEEPROTEGERANO
SFOYERSETNOSDROITS
Ô Canada!
Terre de nos aïeux.
Ton front est ceint,
De fleurons glorieux.
Car ton bras
Sait porter l’épée,
Il sait porter la croix.
Ton histoire est une épopée,
des plus brillants exploits.
Et ta valeur,
de foi trempée,
protègera nos foyers et nos droits.
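The key recovery above can be replayed mechanically from the two guessed correspondences E → C and T → B, solving the two congruences by brute force over Z26:

```python
# solve Ek(4) = 2 and Ek(19) = 1 for the key (a, b) by brute force over Z26
key = [(a, b) for a in range(26) for b in range(26)
       if (4 * a + b) % 26 == 2 and (19 * a + b) % 26 == 1]
print(key)  # [(19, 4)]

a, b = key[0]
decrypt = lambda y: (pow(a, -1, 26) * (y - b)) % 26
# spot-check against the table: A -> I, B -> T, C -> E
assert [decrypt(y) for y in (0, 1, 2)] == [8, 19, 4]
```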
Question 2
(a) The eavesdropper encrypts each letter of the alphabet A, B, . . . , Z. He or she then
knows the cipher-text of each plain-text letter, and since the encryption function is
one-to-one, the plain-text corresponding to each cipher-text "letter" can be read off.
(b) Encrypting the alphabet using the given parameters we find the following.
A → 0        N → 4845
B → 1        O → 1375
C → 6400     P → 13444
D → 18718    Q → 16
E → 17173    R → 13663
F → 1759     S → 1437
G → 18242    T → 2940
H → 12359    U → 10334
I → 14930    V → 365
J → 9        W → 10789
K → 6279     X → 8945
L → 2608     Y → 11373
M → 4644     Z → 5116
Reading off from the table we find that the plain-text message is: VANILLA.
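The attack in (a) amounts to building a small dictionary; with the parameters n = 18721 and b = 25 from (b):

```python
n, b = 18721, 25
table = {pow(x, b, n): x for x in range(26)}       # encrypt A..Z once
cipher = [365, 0, 4845, 14930, 2608, 2608, 0]
print(''.join(chr(ord('A') + table[c]) for c in cipher))  # VANILLA
```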
Question 3
In general the encryption function will not be one-to-one. To see this let p = 11 and q = 12.
Then n = pq = 132 = 11 × 3 × 2^2 and φ(n) = 132(1 − 1/11)(1 − 1/3)(1 − 1/2) =
40 = 5 × 2^3 . Thus if we choose b = 3 then gcd(b, φ(n)) = 1. Further a ≡ b^−1 (mod φ(n)),
so that a ≡ 27 (mod 40). We now find that Ek (2) ≡ 2^b ≡ 2^3 ≡ 8 (mod 132) and
Dk (8) ≡ 8^a ≡ 8^27 ≡ 68 (mod 132). So here Dk (Ek (2)) ≠ 2. Also, Ek (68) ≡ 68^b ≡ 68^3 ≡ 8
(mod 132). Therefore two different elements, 2 and 68, both encrypt to 8.
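The failure can be confirmed numerically (n = 132, b = 3, a = 27 as above):

```python
n, b, a = 132, 3, 27
assert pow(2, b, n) == 8       # Ek(2) = 8
assert pow(68, b, n) == 8      # Ek(68) = 8 as well: Ek is not one-to-one
assert pow(8, a, n) == 68      # Dk(8) = 68, so Dk(Ek(2)) != 2
print('encryption is not 1-1 when q = 12')
```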
Question 4
Question 5
M1 = 211, M2 = 199,
y1 ≡ M1^−1 ≡ 83 (mod 199), y2 ≡ M2^−1 ≡ 123 (mod 211).
Question 6
Using the p − 1 method we find that 262063 = 521 × 503 and the first B for which the
method produces an answer is B = 13. Further 9420457 = 2351 × 4007 and the smallest
B that works in this case is B = 47.
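A minimal sketch of the p − 1 method as used here: raise 2 to the exponent B! step by step, then take a gcd. This is the plain textbook version, not an optimized implementation.

```python
from math import gcd

def p_minus_1(n, B):
    a = 2
    for j in range(2, B + 1):
        a = pow(a, j, n)       # after the loop, a = 2^(B!) mod n
    return gcd(a - 1, n)

print(p_minus_1(262063, 13))   # 521
print(p_minus_1(9420457, 47))  # 2351
```

The method succeeds with these bounds because 521 − 1 = 2^3 · 5 · 13 divides 13! while 503 − 1 = 2 · 251 does not, and likewise 2351 − 1 = 2 · 5^2 · 47 divides 47! while 4007 − 1 = 2 · 2003 does not.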
Question 7
We are given that Ek (x) = ax (mod 17575), with a = 1411. From this we see that
Dk (y) = a^−1 y (mod 17575). Therefore all we need to do is find a^−1 (mod 17575). We
find that a^−1 ≡ 16591 (mod 17575). Decrypting the given cipher-text we get 2247, 797,
13070, 8743, 3160.
Reading off the letters we find that the plain-text is: DILBERT IS MY HERO.
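The whole break fits in a few lines; the base-26 digits of each decrypted block give the three letters.

```python
n, a = 17575, 1411
cipher = [7017, 17342, 5595, 16298, 12285]
plain = [pow(a, -1, n) * c % n for c in cipher]    # [2247, 797, 13070, 8743, 3160]
letters = ''.join(chr(ord('A') + d)
                  for m in plain
                  for d in (m // 676, (m // 26) % 26, m % 26))
print(letters)  # DILBERTISMYHERO
```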
Question 8
Now B^2 ≡ 14504 (mod 19177) and 4^−1 ≡ 14383 (mod 19177), so that 4^−1 B^2 ≡ 3626
(mod 19177). Also, 2^−1 ≡ 9589 (mod 19177), therefore 2^−1 B ≡ 12428 (mod 19177). We
now have
Dk (y) = √(y + 3626) − 12428 ≡ √(y + 3626) + 6749 (mod 19177).
For y = 2251 we need the square roots of 2251 + 3626 = 5877 (mod 19177); these satisfy x ≡ ±17 (mod 127) and x ≡ ±21 (mod 151). From x ≡ 17 (mod 127) and x ≡ 21 (mod 151) the Chinese Remainder Theorem gives
x ≡ 17 × 151 × y1 + 21 × 127 × y2 (mod 19177). Here y1 ≡ 151^−1 ≡ 90 (mod 127) and
y2 ≡ 127^−1 ≡ 44 (mod 151). Therefore x ≡ 17 × 151 × 90 + 21 × 127 × 44 ≡ 3192
(mod 19177).
From x ≡ 17 (mod 127) and x ≡ −21 (mod 151), we find x ≡ 17 × 151 × 90 +(−21) ×
127 × 44 ≡ 17797 (mod 19177).
From x ≡ −17 (mod 127) and x ≡ 21 (mod 151), we find x ≡ −17 × 151 × 90 + 21 ×
127 × 44 ≡ 1380 (mod 19177).
From x ≡ −17 (mod 127) and x ≡ −21 (mod 151), we find x ≡ −17 × 151 × 90 +
(−21) × 127 × 44 ≡ 15985 (mod 19177).
Therefore
Dk (2251) ≡ 3192 + 6749 ≡ 9941 (mod 19177),
Dk (2251) ≡ 17797 + 6749 ≡ 5369 (mod 19177),
Dk (2251) ≡ 1380 + 6749 ≡ 8129 (mod 19177),
Dk (2251) ≡ 15985 + 6749 ≡ 3557 (mod 19177).
Giving us the following
9941 = 14 × 26^2 + 18 × 26 + 9,
5369 = 7 × 26^2 + 24 × 26 + 13,
8129 = 12 × 26^2 + 0 × 26 + 17,
3557 = 5 × 26^2 + 6 × 26 + 21.
These correspond to the plain-texts OSJ, HYN, MAR and FGV respectively. At this point the
plain-text that holds the most promise seems to be MAR.
Next, Dk (8836) ≡ √12462 + 6749 (mod 19177), where 12462 = 8836 + 3626. Here we find that the four square-roots of
12462 (mod 19177) are: 18038, 13974, 5203 and 1139. Therefore
Dk (8836) ≡ 18038 + 6749 ≡ 5610 (mod 19177),
Dk (8836) ≡ 13974 + 6749 ≡ 1546 (mod 19177),
Dk (8836) ≡ 5203 + 6749 ≡ 11952 (mod 19177),
Dk (8836) ≡ 1139 + 6749 ≡ 7888 (mod 19177).
We now have
5610 = 8 × 26^2 + 7 × 26 + 20,
1546 = 2 × 26^2 + 7 × 26 + 12,
11952 = 17 × 26^2 + 17 × 26 + 18,
7888 = 11 × 26^2 + 17 × 26 + 10.
Here the corresponding plain-texts are IHU, CHM, RRS and LRK. Considered on their own,
none of these seems a better choice than the others. If we combine them with the
first set we find that CHM seems to be a good choice, as this gives MARCHM.
Next, Dk (7291) ≡ √10917 + 6749 (mod 19177), where 10917 = 7291 + 3626. The four square-roots of 10917 are 12519, 15567, 3610
and 6658. This gives

Dk (7291) ≡ 12519 + 6749 ≡ 91 (mod 19177),
Dk (7291) ≡ 15567 + 6749 ≡ 3139 (mod 19177),
Dk (7291) ≡ 3610 + 6749 ≡ 10359 (mod 19177),
Dk (7291) ≡ 6658 + 6749 ≡ 13407 (mod 19177).

This leads to
91 = 0 × 26^2 + 3 × 26 + 13,
3139 = 4 × 26^2 + 16 × 26 + 19,
10359 = 15 × 26^2 + 8 × 26 + 11,
13407 = 19 × 26^2 + 21 × 26 + 17.
The plain-texts are ADN, EQT, PIL and TVR. Out of these four plain-texts the only one that
combines with the result so far in a sensible manner is the first one. This combined with
the result so far gives MARCHMADN.
Finally, Dk (6035) ≡ √9661 + 6749 (mod 19177), where 9661 = 6035 + 3626. The four square-roots of 9661 are 17904, 15618, 3559 and
1273. This gives

Dk (6035) ≡ 17904 + 6749 ≡ 5476 (mod 19177),
Dk (6035) ≡ 15618 + 6749 ≡ 3190 (mod 19177),
Dk (6035) ≡ 3559 + 6749 ≡ 10308 (mod 19177),
Dk (6035) ≡ 1273 + 6749 ≡ 8022 (mod 19177).
The four plain-texts are: ICQ, ESS, PGM and LWO. Here the second one seems to be
the only one that fits in with the results so far. This gives MARCHMADNESS → March
Madness.
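The four candidates per cipher-text symbol can be generated mechanically: square roots modulo the small primes 127 and 151 are found by brute force and combined with the Chinese Remainder Theorem, using the constants 3626 and 6749 derived above. Choosing the sensible candidate from each list is still done by eye.

```python
p, q, n = 127, 151, 19177

def sqrts(t, m):
    # brute-force square roots of t modulo the small prime m
    return [r for r in range(m) if r * r % m == t % m]

def candidates(y):
    t = (y + 3626) % n
    out = []
    for rp in sqrts(t, p):
        for rq in sqrts(t, q):
            # combine with the CRT: x = rp (mod p), x = rq (mod q)
            x = (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n
            m = (x + 6749) % n
            out.append(''.join(chr(ord('A') + d)
                               for d in (m // 676, (m // 26) % 26, m % 26)))
    return out

for y in [2251, 8836, 7291, 6035]:
    print(candidates(y))
# choosing the sensible triple from each list spells MAR CHM ADN ESS
```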