EE 387, Notes 9, Handout #13

Systematic generator matrices


Definition: A systematic generator matrix is of the form
 
G = [ P | Ik ] =
    ⎡ p0,0      ···  p0,n−k−1      1  0  ···  0 ⎤
    ⎢ p1,0      ···  p1,n−k−1      0  1  ···  0 ⎥
    ⎢   ⋮        ⋱      ⋮          ⋮  ⋮   ⋱   ⋮ ⎥
    ⎣ pk−1,0    ···  pk−1,n−k−1    0  0  ···  1 ⎦
Advantages of systematic generator matrices:
◮ Message symbols appear unscrambled in each codeword, in the
rightmost positions n − k, . . . , n − 1.
◮ Encoder complexity is reduced; only check symbols need be computed:
cj = m0 g0,j + m1 g1,j + · · · + mk−1 gk−1,j (j = 0, . . . , n − k − 1)
◮ Check symbol encoder equations easily yield parity-check equations:
cj − cn−k g0,j − cn−k+1 g1,j − · · · − cn−1 gk−1,j = 0 (mi = cn−k+i )
◮ Systematic parity-check matrix is easy to find: H = [ I | −P T ] .
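The reduced encoder complexity can be sketched in a few lines. This is illustrative Python (not from the notes), using as P the k × (n − k) block of the (6,3) code that appears later in these notes:

```python
import numpy as np

# k x (n-k) block P of the (6,3) systematic generator matrix
# from a later slide; the encoder itself is a sketch, not the notes' code.
P = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])

def encode_systematic(m):
    """Systematic GF(2) encoder: compute only the n-k check symbols
    c_j = m_0 p_{0,j} + ... + m_{k-1} p_{k-1,j}, then append the message."""
    checks = m @ P % 2                  # check symbols c_0, ..., c_{n-k-1}
    return np.concatenate([checks, m])  # message fills the rightmost k slots

c = encode_systematic(np.array([1, 0, 1]))
assert list(c) == [1, 0, 1, 1, 0, 1]   # message unscrambled in positions 3..5
```

Only the n − k check symbols require arithmetic; the message symbols are copied through unchanged.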
EE 387, October 9, 2015 Notes 9, Page 1
Systematic parity-check matrix
Let G be a k × n systematic generator matrix:
 
G = [ P | Ik ] =
    ⎡ p0,0      ···  p0,n−k−1      1  0  ···  0 ⎤
    ⎢ p1,0      ···  p1,n−k−1      0  1  ···  0 ⎥
    ⎢   ⋮        ⋱      ⋮          ⋮  ⋮   ⋱   ⋮ ⎥
    ⎣ pk−1,0    ···  pk−1,n−k−1    0  0  ···  1 ⎦
The corresponding (n − k) × n systematic parity-check matrix is
 
H = [ In−k | −P T ] =
    ⎡ 1  0  ···  0   −p0,0        ···  −pk−1,0      ⎤
    ⎢ 0  1  ···  0   −p0,1        ···  −pk−1,1      ⎥
    ⎢ ⋮  ⋮   ⋱   ⋮      ⋮          ⋱       ⋮        ⎥
    ⎣ 0  0  ···  1   −p0,n−k−1    ···  −pk−1,n−k−1  ⎦
(The minus signs are not needed for fields of characteristic 2, i.e., GF(2m ).)
Each row of H corresponds to an equation satisfied by all codewords.
These equations tell how to compute the check symbols c0 , . . . , cn−k−1
in terms of the information symbols cn−k , . . . , cn−1 .
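The relationship between the two systematic matrices can be checked numerically. A minimal sketch (my code, not the notes'), assuming GF(2) arithmetic so the minus signs vanish, using the (6,3) code from a later slide:

```python
import numpy as np

def systematic_H(G, n, k):
    """From a systematic G = [P | I_k] over GF(2), return the
    parity-check matrix H = [I_{n-k} | P^T] (no minus signs needed
    in characteristic 2)."""
    P = G[:, : n - k]
    return np.concatenate([np.eye(n - k, dtype=int), P.T], axis=1)

G = np.array([[0, 1, 1, 1, 0, 0],   # (6,3) systematic generator matrix
              [1, 0, 1, 0, 1, 0],   # from a later slide in these notes
              [1, 1, 0, 0, 0, 1]])
H = systematic_H(G, 6, 3)
# G H^T = P + P = 0 mod 2, so every codeword mG satisfies (mG)H^T = 0
assert not (G @ H.T % 2).any()
```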
Minimum weight and columns of H
cH T = 0 for every codeword c = (c0 , c1 , . . . , cn−1 ). Any nonzero codeword
therefore determines a linear dependence among the columns of H:
cH T = 0 =⇒ 0 = (cH T )T = HcT = c0 h0 + c1 h1 + · · · + cn−1 hn−1 ,
where hj is column j of H; the nonzero components of c select a linearly
dependent subset of the columns.
Theorem: The minimum weight of a linear block code is the smallest
number of linearly dependent columns of any parity-check matrix.
Proof : Each linearly dependent subset of w columns corresponds to a
codeword of weight w.
Recall that a set of columns of H is linearly dependent if one column is a
linear combination of the other columns.
◮ An LBC has w ∗ ≤ 2 iff some column of H is zero or a scalar multiple of
another column.
◮ For binary Hamming codes, w ∗ = 3: the columns of H are nonzero and
distinct, so no two columns are dependent, but some three columns
(e.g., h1 , h2 , h1 + h2 ) sum to 0.
The Big Question: how do we find H such that every set of 2t columns is
linearly independent (so that d∗ ≥ 2t + 1)?
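The theorem above can be verified by brute force on a small code. This sketch (my code, using the (6,3) parity-check matrix from a later slide) confirms that the smallest linearly dependent set of columns has the same size as the minimum codeword weight:

```python
from itertools import combinations, product
import numpy as np

# (6,3) parity-check matrix from a later slide, used here for illustration
H = np.array([[1, 0, 0, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 1, 1, 0]])
n = H.shape[1]

# smallest w such that some w columns of H sum to zero over GF(2)
dep = next(w for w in range(1, n + 1)
           if any(not (np.sum(H[:, list(S)], axis=1) % 2).any()
                  for S in combinations(range(n), w)))

# smallest Hamming weight of a nonzero codeword (c H^T = 0)
wmin = min(sum(c) for c in product([0, 1], repeat=n)
           if any(c) and not (np.dot(c, H.T) % 2).any())

assert dep == wmin == 3   # the theorem, verified for this code
```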
Computing minimum weight
The rank of H is the maximum number of linearly independent columns.
The rank can be determined in time O(n3 ) using linear operations, e.g.,
using Gaussian elimination.
Minimum distance is the smallest number of linearly dependent columns.
Finding the minimum distance is difficult (NP-hard). We might have to
look at large numbers of subsets of columns.
Solution: design codes whose minimum distance can be proven to have
desired lower bounds.
The dimension of the column space of H is at most n − k. Thus any n − k + 1
columns are linearly dependent. Therefore for any linear block code,
d∗ = w∗ ≤ n − k + 1
This is known as the Singleton bound.
Exercise: Show that the Singleton bound holds for all (n, k) block codes,
not just linear codes.
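For contrast with the hardness of minimum distance, rank is cheap to compute. A minimal GF(2) Gaussian-elimination sketch (assumed helper code, not from the notes):

```python
# GF(2) Gaussian elimination: the rank of H is computable in O(n^3)
# field operations, unlike the minimum distance.
def gf2_rank(rows):
    """rows: list of 0/1 lists. Returns the rank over GF(2)."""
    mat = [list(r) for r in rows]
    rank = 0
    for col in range(len(mat[0])):
        pivot = next((i for i in range(rank, len(mat)) if mat[i][col]), None)
        if pivot is None:
            continue                      # no pivot in this column
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for i in range(len(mat)):
            if i != rank and mat[i][col]:
                mat[i] = [(a + b) % 2 for a, b in zip(mat[i], mat[rank])]
        rank += 1
    return rank

H = [[1, 0, 0, 0, 1, 1],                  # (6,3) parity-check matrix
     [0, 1, 0, 1, 0, 1],                  # from a later slide
     [0, 0, 1, 1, 1, 0]]
assert gf2_rank(H) == 3                   # full rank: n - k = 3
```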
Maximum distance separable codes
Codes that achieve the Singleton bound with equality are called
maximum-distance separable (MDS) codes.
Every repetition code (k = 1) satisfies the Singleton bound with equality:
d∗ = n = (n − 1) + 1 = (n − k) + 1
Another class of MDS codes are the simple parity-check codes (n − k = 1):
d∗ = 2 = 1 + 1 = (n − k) + 1
The best known nonbinary MDS codes are the Reed-Solomon codes
over GF(Q). The RS code parameters are
(n, k, d∗ ) = (Q − 1, Q − d∗ , d∗ ) =⇒ n − k = d∗ − 1 .

Exercise: Show that the repetition codes and the simple parity-check codes
are the only nontrivial binary MDS codes.



Linear block codes: summary
◮ An (n, k) linear block code is a k-dimensional subspace of F n .
Sums, differences, and scalar multiples of codewords are also codewords.
◮ A group code over additive group G is closed under sum and difference.
◮ An (n, k) LBC over F = GF(q) has M = q k codewords and rate k/n .
◮ A linear block code C can be defined by two matrices.
◮ Generator matrix G: the rows of G are a basis for C, i.e., C = {mG : m ∈ F k }
◮ Parity-check matrix H: the rows of H span C ⊥ , hence C = {c ∈ F n : cH T = 0}
◮ Hamming weight of an n-tuple is the number of nonzero components.
◮ Minimum weight w∗ of a block code is the smallest Hamming weight of
any nonzero codeword.
◮ Minimum distance of every LBC equals minimum weight: d∗ = w∗ .
◮ Minimum weight of a linear block code is the smallest number of linearly
dependent columns of any parity-check matrix.



Syndrome decoding
Linear block codes are much simpler than general block codes:
◮ Encoding is vector-matrix multiplication.
(Cyclic codes are even simpler: polynomial multiplication/division.)
◮ Decoding is inherently nonlinear. Fact: linear decoders are very weak.
However, several steps in the decoding process are linear:
◮ syndrome computation
◮ final correction after error pattern and location have been found
◮ extracting estimated message from estimated codeword

Definition: The error vector or error pattern e is the difference between the
received n-tuple r and the transmitted codeword c:

e = r − c =⇒ r = c + e

Note: The physical noise model may not be additive noise, and the probability distribution for the
error e may depend on the data c. We assume a channel error model determined by Pr(e).
Syndrome decoding (cont.)
Multiply both sides of the equation r = c + e by H:

s = rH T = (c + e)H T = cH T + eH T = 0 + eH T = eH T .
The syndrome of the senseword r is defined to be s = rH T .
The syndrome of r (known to receiver) equals the syndrome of the error
pattern e (not known to receiver, must be estimated).
Decoding consists of finding the most plausible error pattern e such that
eH T = s = rH T .
“Plausible” depends on the error characteristics:
◮ For binary symmetric channel, most plausible means smallest number of
bit errors. Decoder estimates ê of smallest weight satisfying êH T = s.
◮ For bursty channels, error patterns are plausible if the symbol errors are
close together.
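A quick numerical check (my code) that the syndrome of r = c + e depends only on e, using the (8,4) matrices from the example that follows:

```python
import numpy as np

# Parity-check matrix of the (8,4) code used in the example that follows
H = np.array([[1, 0, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])

c = np.array([0, 1, 1, 0, 0, 1, 1, 0])   # a codeword (c H^T = 0)
e = np.array([0, 0, 0, 0, 0, 1, 0, 0])   # a single-bit error pattern
r = (c + e) % 2                          # received senseword

assert not (c @ H.T % 2).any()                    # codewords have syndrome 0
assert np.array_equal(r @ H.T % 2, e @ H.T % 2)   # s(r) = s(e)
```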



Syndrome decoding (cont.)
Syndrome table decoding consists of these steps:
1. Calculate syndrome s = rH T of received n-tuple.
2. Find most plausible error pattern e with eH T = s.
3. Estimate transmitted codeword: ĉ = r − e.
4. Determine message m̂ from the encoding equation ĉ = m̂G.
Step 4 is not needed for systematic encoders, since m̂ = ĉ[n−k : n−1].
Only step 2 requires nonlinear operations.
For small values of n − k, lookup tables can be used for step 2.
For BCH and Reed-Solomon codes, the error locations are the zeroes of
certain polynomials over the channel alphabet.
The coefficients of these error-locator polynomials satisfy linear
equations determined by the syndrome.
Challenge: find, then solve, the polynomials.



Syndrome decoding: example
An (8, 4) binary linear block code C is defined by systematic matrices:
   
H = ⎡ 1 0 0 0 | 0 1 1 1 ⎤        G = ⎡ 0 1 1 1 | 1 0 0 0 ⎤
    ⎢ 0 1 0 0 | 1 0 1 1 ⎥  =⇒       ⎢ 1 0 1 1 | 0 1 0 0 ⎥
    ⎢ 0 0 1 0 | 1 1 0 1 ⎥           ⎢ 1 1 0 1 | 0 0 1 0 ⎥
    ⎣ 0 0 0 1 | 1 1 1 0 ⎦           ⎣ 1 1 1 0 | 0 0 0 1 ⎦
Consider two possible messages:
m1 = [ 0 1 1 0 ] m2 = [ 1 0 1 1 ]
c1 = [ 0 1 1 0 0 1 1 0 ] c2 = [ 0 1 0 0 1 0 1 1 ]
Suppose error pattern e = [ 0 0 0 0 0 1 0 0 ] is added to both codewords.
r1 = [ 0 1 1 0 0 0 1 0 ] r2 = [ 0 1 0 0 1 1 1 1 ]
s1 = [ 1 0 1 1 ] s2 = [ 1 0 1 1 ]
Both syndromes equal column 5 of H (numbering positions from 0), so the
decoder corrects bit 5.

C is an expanded Hamming code with weight enumerator A(x) = 1 + 14x4 + x8 .
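The example can be reproduced with a small single-error syndrome decoder. A sketch (assumed implementation, matching the matrices above):

```python
import numpy as np

# Parity-check matrix of the (8,4) code above
H = np.array([[1, 0, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])

def decode_single_error(r):
    """Correct at most one bit error: a nonzero syndrome matches the
    column of H at the error position (0-indexed)."""
    s = r @ H.T % 2
    if not s.any():
        return r.copy()                  # syndrome 0: accept r as a codeword
    j = next(j for j in range(H.shape[1]) if np.array_equal(H[:, j], s))
    c = r.copy()
    c[j] ^= 1                            # flip the erroneous bit
    return c

c1 = np.array([0, 1, 1, 0, 0, 1, 1, 0])
r1 = (c1 + np.array([0, 0, 0, 0, 0, 1, 0, 0])) % 2
assert np.array_equal(decode_single_error(r1), c1)
```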


Standard array
Syndrome table decoding can also be described using the standard array.
The standard array of a group code C is the coset decomposition of F n
with respect to the subgroup C.
 0    c2        c3        ···   cM
 e2   c2 + e2   c3 + e2   ···   cM + e2
 e3   c2 + e3   c3 + e3   ···   cM + e3
 ⋮      ⋮         ⋮        ⋱      ⋮
 eN   c2 + eN   c3 + eN   ···   cM + eN

◮ The first row is the code C, with the zero vector in the first column.
◮ Every other row is a coset.
◮ The n-tuple in the first column of a row is called the coset leader.
We usually choose the coset leader to be the most plausible error
pattern, e.g., the error pattern of smallest weight.
Standard array: decoding
An (n, k) LBC over GF(Q) has M = Qk codewords.
Every n-tuple appears exactly once in the standard array. Therefore the
number of rows N satisfies
M N = Qn =⇒ N = Qn−k .
All vectors in a row of the standard array have the same syndrome.
Thus there is a one-to-one correspondence between the rows of the
standard array and the Qn−k syndrome values.
Decoding using the standard array is simple: decode senseword r to the
codeword at the top of the column that contains r.
The decoder subtracts the coset leader from the received vector to obtain
the estimated codeword.
The decoding region for a codeword is the column headed by that codeword.



Standard array and decoding regions
[Figure: the standard array drawn as nested decoding regions. The top row
holds the codewords; below it, the coset leaders form shells of radius
1, 2, . . . , t (weight-1, weight-2, . . . , weight-t error patterns), and the
remaining rows are led by vectors of weight > t.]


Standard array: example
The systematic generator and parity-check matrices for a (6, 3) LBC are
   
G = ⎡ 0 1 1 | 1 0 0 ⎤         H = ⎡ 1 0 0 | 0 1 1 ⎤
    ⎢ 1 0 1 | 0 1 0 ⎥   =⇒       ⎢ 0 1 0 | 1 0 1 ⎥
    ⎣ 1 1 0 | 0 0 1 ⎦            ⎣ 0 0 1 | 1 1 0 ⎦
The standard array has 6 coset leaders of weight 1 and one of weight 2.
000000 001110 010101 011011 100011 101101 110110 111000
000001 001111 010100 011010 100010 101100 110111 111001
000010 001100 010111 011001 100001 101111 110100 111010
000100 001010 010001 011111 100111 101001 110010 111100
001000 000110 011101 010011 101011 100101 111110 110000
010000 011110 000101 001011 110011 111101 100110 101000
100000 101110 110101 111011 000011 001101 010110 011000
001001 000111 011100 010010 101010 100100 111111 110001

See http://www.stanford.edu/class/ee387/src/stdarray.pl for the short Perl script that generates
the above standard array. This code is a shortened Hamming code.
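The referenced Perl script is not reproduced here, but the same standard array can be built with this Python sketch (my code), which greedily chooses minimum-weight coset leaders:

```python
from itertools import product

# Systematic generator matrix of the (6,3) code from the slide above
G = [[0, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
n, k = 6, 3

def encode(m):
    return tuple(sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n))

code = [encode(m) for m in product([0, 1], repeat=k)]

# Visit all 2^n vectors in order of increasing weight; each vector not yet
# seen starts a new coset and becomes its minimum-weight coset leader.
seen, rows = set(), []
for leader in sorted(product([0, 1], repeat=n), key=sum):
    if leader in seen:
        continue
    coset = [tuple((a + b) % 2 for a, b in zip(leader, c)) for c in code]
    seen.update(coset)
    rows.append((leader, coset))

assert len(rows) == 2 ** (n - k)                   # 8 cosets
assert sum(sum(l) == 1 for l, _ in rows) == 6      # six weight-1 leaders
assert sum(sum(l) == 2 for l, _ in rows) == 1      # one weight-2 leader
```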
Standard array: summary
The standard array is a conceptual arrangement of all n-tuples.
 0    c2        c3        ···   cM
 e2   c2 + e2   c3 + e2   ···   cM + e2
 e3   c2 + e3   c3 + e3   ···   cM + e3
 ⋮      ⋮         ⋮        ⋱      ⋮
 eN   c2 + eN   c3 + eN   ···   cM + eN

◮ The first row is the code C, with the zero vector in the first column.
◮ Every other row is a coset.
◮ The n-tuple in the first column of a row is called the coset leader.
◮ Senseword r is decoded to codeword at top of column that contains r.
◮ The decoding region for a codeword is the column headed by that codeword.
◮ Decoder subtracts coset leader from r to obtain estimated codeword.
Syndrome decoding: summary
Syndrome decoding is closely connected to standard array decoding.
1. Calculate syndrome s = rH T of received n-tuple.
2. Find most plausible error pattern e with eH T = s.
This error pattern is the coset leader of the coset containing r.
3. Estimate transmitted codeword: ĉ = r − e.
The estimated codeword ĉ is the entry at the top of the column
containing r in the standard array.
4. Determine the message m̂ from the encoding equation ĉ = m̂G.
In general, m̂ = ĉR, where R is an n × k pseudoinverse of G (GR = Ik ).
If the code is systematic, then R = [ 0(n−k)×k | Ik×k ]T .
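A sketch of step 4 in the systematic GF(2) case (my code): R = [0 | I]T simply picks off the last k symbols, and GR = Ik:

```python
import numpy as np

n, k = 6, 3
G = np.array([[0, 1, 1, 1, 0, 0],    # systematic generator of the (6,3) code
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1]])

# R = [0 | I]^T : an n x k matrix whose product with c extracts c[n-k : n-1]
R = np.concatenate([np.zeros((n - k, k), dtype=int), np.eye(k, dtype=int)])

assert np.array_equal(G @ R % 2, np.eye(k, dtype=int))  # R is a right inverse

m = np.array([1, 0, 1])
c = m @ G % 2
assert np.array_equal(c @ R % 2, m)                     # message recovered
```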
Only step 2 requires nonlinear operations. Step 2 is conceptually the most
difficult.

Surprisingly, most computational effort is spent on syndrome computation.

