Using Linear Programming To Decode Binary Linear Codes

Index Terms: Belief propagation (BP), iterative decoding, low-density parity-check (LDPC) codes, linear codes, linear programming (LP), LP decoding, minimum distance, pseudocodewords.
I. INTRODUCTION
Low-density parity-check (LDPC) codes were first discovered by Gallager in 1962 [7]. In the 1990s, they were rediscovered by a number of researchers [8], [4], [9], and have
since received a lot of attention. The error-correcting performance of these codes is unsurpassed; in fact, Chung et al. [10]
have given a family of LDPC codes that come within 0.0045 dB
of the capacity of the channel (as the block length goes to infinity). The decoders most often used for this family are based
Manuscript received May 6, 2003; revised December 8, 2004. The work of
J. Feldman was conducted while the author was at the MIT Laboratory of Computer Science and supported in part by the National Science Foundation Postdoctoral Research Fellowship DMS-0303407. The work of D. Karger was supported in part by the National Science Foundation under Contract CCR-9624239
and a David and Lucille Packard Foundation Fellowship. The material in this
paper was presented in part at the Conference on Information Sciences and Systems, Baltimore, MD, June 2003.
J. Feldman is with the Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027 USA (e-mail:
jonfeld@ieor.columbia.edu).
M. J. Wainwright is with the Department of Electrical Engineering and
Computer Science and the Department of Statistics, University of California,
Berkeley, Berkeley, CA, 94720 USA (e-mail: wainwrig@eecs.berkeley.edu).
D. R. Karger is with the Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology, Cambridge, MA 02139
USA (e-mail: karger@mit.edu).
Communicated by R. L. Urbanke, Associate Editor for Coding Techniques.
Digital Object Identifier 10.1109/TIT.2004.842696
C. Outline
We begin the paper in Section II by giving background on
factor graphs for binary linear codes, and the ML decoding
problem. We present the LP relaxation of ML decoding in
Section III. In Section IV, we discuss the basic analysis of
LP decoding. We define pseudocodewords in Section V, and
fractional distance in Section VI. In Section VII, we draw
connections between various iterative decoding algorithms and
our LP decoder, and present some experiments. In Section VIII,
we discuss various methods for tightening the LP in order to
get even better performance. We conclude and discuss future
work in Section IX.
D. Notes and Recent Developments
Preliminary forms of part of the work in this paper have appeared in the conference papers [15], [16], and in the thesis of
one of the authors [17]. Since the submission of this work, it
has been shown that the LP decoder defined here can correct a
constant fraction of error in certain LDPC codes [18], and that
a variant of the LP can achieve capacity using expander codes
[19].
Additionally, relationships between LP decoding and iterative
decoding have been further refined. Discovered independently
of this work, Koetter and Vontobel's notion of a graph cover
[20] is equivalent to the notion of a pseudocodeword graph
defined here. More recent work by the same authors [21], [22]
explores these notions in more detail, and gives new bounds for
error performance.
II. BACKGROUND
A linear code $C$ with parity-check matrix $H$ can be represented by a Tanner or factor graph $G$, which is defined in the following way. Let $I = \{1, \ldots, n\}$ and $J = \{1, \ldots, m\}$ be indices for the columns (respectively, rows) of the parity-check matrix of the code. With this notation, $G$ is a bipartite graph with independent node sets $I$ and $J$. We refer to the nodes in $I$ as variable nodes, and the nodes in $J$ as check nodes. All edges in $G$ have one endpoint in $I$ and the other in $J$. For each pair $(i, j)$, the edge $(i, j)$ is included in $G$ if and only if $H_{ji} = 1$. The neighborhood of a check node $j$, denoted by $N(j)$, is the set of nodes $i \in I$ such that check node $j$ is incident to variable node $i$ in $G$. Similarly, we let $N(i)$ be the set of check nodes incident to a particular variable node $i$.

Imagine assigning to each variable node $i \in I$ a value in $\{0, 1\}$, representing the value of a particular code bit. A parity-check node $j$ is satisfied if the collection of bits assigned to the variable nodes in $N(j)$ has even parity. The binary vector $y = (y_1, \ldots, y_n)$ is a codeword if and only if all check nodes are satisfied. Fig. 1 shows an example of a linear code and its associated factor graph. In this Hamming code, any setting of the bits under which the neighborhood of every check node has even parity represents a codeword; the all-zeros vector is one such codeword, and this $(7, 4, 3)$ code contains $15$ other codewords.
Let $d_v^{\max}$ denote the maximum variable (left) degree of the factor graph; i.e., the maximum, among all nodes $i \in I$, of the degree of $i$. Let $d_v^{\min}$ denote the minimum variable degree, and let $d_c^{\max}$ and $d_c^{\min}$ denote the maximum and minimum check (right) degrees of the factor graph.

Fig. 1. A factor graph for the (7, 4, 3) Hamming code. The nodes {1, 2, 3, 4, 5, 6, 7}, drawn in open circles, correspond to variable nodes, whereas the nodes {A, B, C}, drawn in black squares, correspond to check nodes.

A. Channel Assumptions

A binary codeword of length $n$ is sent over a noisy channel, and a corrupted word is received. In this paper, we assume an arbitrary discrete memoryless symmetric channel. We use the notation $\Pr[x \mid y]$ to denote the probability that $x$ was the codeword sent over the channel, given that $y$ was received. We assume that all information words are equally likely a priori. By Bayes' rule, this assumption implies that maximizing $\Pr[x \mid y]$ over codewords $x$ (ML decoding) is equivalent to maximizing the likelihood $\Pr[y \mid x]$. Equivalently, ML decoding minimizes the cost $\sum_i \gamma_i x_i$, where $\gamma_i = \log\left( \Pr[y_i \mid x_i = 0] / \Pr[y_i \mid x_i = 1] \right)$ is the log-likelihood ratio of code bit $i$.

We will frequently exploit the fact that the cost vector $\gamma$ can be uniformly rescaled without affecting the solution of the ML problem. In the BSC, for example, rescaling allows us to assume that $\gamma_i = +1$ if $y_i = 0$, and $\gamma_i = -1$ if $y_i = 1$.

III. DECODING WITH LINEAR PROGRAMMING
Note that the codeword polytope $\operatorname{poly}(C)$, the convex hull of the codewords of $C$, is a polytope contained within the $n$-dimensional hypercube $[0,1]^n$, and includes exactly those vertices of the hypercube corresponding to codewords. Every point in $\operatorname{poly}(C)$ corresponds to a vector $f \in [0,1]^n$, where each element $f_i$ is defined by a convex combination (a weighted summation) over codewords.

The vertices of a polytope are those points that cannot be expressed as convex combinations of other points in the polytope. A key fact is that any linear program attains its optimum at a vertex of the polytope [23]. Consequently, the optimum will always be attained at a vertex of $\operatorname{poly}(C)$, and these vertices are in one-to-one correspondence with codewords.

We can therefore define ML decoding as the problem of minimizing $\sum_i \gamma_i f_i$ subject to the constraint $f \in \operatorname{poly}(C)$. This formulation is a linear program, since it involves minimizing a linear cost function over the polytope $\operatorname{poly}(C)$.
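As a concrete illustration of this formulation (not part of the original development), the vertex property means that on a small code the LP optimum over $\operatorname{poly}(C)$ can be found by minimizing the linear cost over the enumerable codeword list. The sketch below assumes a standard (7, 4, 3) Hamming parity-check matrix, which may be labeled differently from Fig. 1:

```python
import itertools

# A standard (7,4,3) Hamming parity-check matrix (an assumed labeling).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def codewords(H, n=7):
    """Enumerate all binary vectors that satisfy every parity check."""
    for x in itertools.product([0, 1], repeat=n):
        if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H):
            yield x

def ml_decode(gamma):
    """Minimize the linear cost sum_i gamma_i * x_i over codewords.
    Since a linear program attains its optimum at a vertex, this equals
    the LP optimum over poly(C)."""
    return min(codewords(H), key=lambda x: sum(g * xi for g, xi in zip(gamma, x)))

# BSC with rescaled costs: gamma_i = +1 if y_i = 0, and -1 if y_i = 1.
y = (1, 0, 0, 0, 0, 0, 0)                    # one flip from the all-zeros codeword
gamma = [1 if yi == 0 else -1 for yi in y]
print(ml_decode(gamma))                      # (0, 0, 0, 0, 0, 0, 0)
```

Enumeration is exponential in the code dimension and serves only to illustrate the vertex argument on a toy code.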
B. LP Relaxation
The most common practical method for solving a linear program is the simplex algorithm [23], which generally requires an
explicit representation of the constraints. In the LP formulation of exact ML decoding we have just described, although $\operatorname{poly}(C)$ can be characterized by a finite number of linear constraints, the number of constraints is in general exponential in the code length $n$.
Even the Ellipsoid algorithm [24], which does not require such
an explicit representation, is not useful in this case, since ML
decoding is NP-hard in general [25].
Therefore, our strategy will be to formulate a relaxed polytope, one that contains all the codewords, but has a more manageable representation. More concretely, we motivate our LP relaxation with the following observation. Each check node $j$ in a factor graph defines a local code; i.e., the set of binary vectors that have even weight on its neighborhood variables $N(j)$. The global code corresponds to the intersection of all the local codes. In LP terms, each check contributes a local codeword polytope. Let $E_j$ denote the collection of even-sized subsets of $N(j)$. For each check $j$ and each $S \in E_j$, we introduce an auxiliary variable $w_{j,S}$, indicating the weight placed on the local codeword in which exactly the bits in $S$ are set to $1$. The relaxed polytope $P$ consists of the points $(f, w)$ satisfying

$$w_{j,S} \ge 0 \quad \text{for all } j \in J,\ S \in E_j \tag{5}$$

$$\sum_{S \in E_j} w_{j,S} = 1 \quad \text{for all } j \in J \tag{6}$$

$$f_i = \sum_{S \in E_j : i \in S} w_{j,S} \quad \text{for all edges } (i, j). \tag{7}$$

The resulting relaxation, which we refer to as LCLP, is the linear program

$$\text{minimize } \sum_{i \in I} \gamma_i f_i \quad \text{s.t. } (f, w) \in P. \tag{8}$$
Proposition 1: If $(f, w)$ is an integral point in $P$, then $f$ is a codeword. Conversely, for every codeword $x$, there is a $w$ such that $(x, w)$ is a point in $P$.

Proof: Suppose $(f, w)$ is a point in $P$ where all $f_i \in \{0, 1\}$ and all $w_{j,S} \in \{0, 1\}$. Now suppose $f$ is not a codeword, and let $j$ be some parity check unsatisfied by setting $x_i = f_i$ for all $i \in I$. By the constraints (6), and the fact that $w$ is integral, $w_{j,S} = 1$ for some $S \in E_j$, and $w_{j,S'} = 0$ for all other $S' \in E_j$ where $S' \neq S$. By the constraints (7), we have $f_i = 1$ for all $i \in S$, and $f_i = 0$ for all $i \in N(j) \setminus S$. Since $|S|$ is even, check $j$ is satisfied by setting $x_i = f_i$, a contradiction.

For the second part of the claim, let $x$ be a codeword, and let $f = x$. For all $j \in J$, let $S_j$ be the set of nodes $i$ in $N(j)$ where $x_i = 1$. Since $x$ is a codeword, check $j$ is satisfied by $x$, so $|S_j|$ is even, and the variable $w_{j, S_j}$ is present. Set $w_{j, S_j} = 1$ and $w_{j, S'} = 0$ for all other $S' \in E_j$. All constraints are satisfied, and all variables are integral.
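The codeword embedding used in the second half of this proof can be checked mechanically. The sketch below (the matrix `H` is an assumed standard (7, 4, 3) Hamming parity-check matrix, and the helper names are ours) builds the point $(f, w)$ from a codeword and verifies nonnegativity, normalization, and consistency:

```python
import itertools

H = [
    [1, 0, 1, 0, 1, 0, 1],     # assumed (7,4,3) Hamming parity checks
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
neighborhoods = [tuple(i for i, h in enumerate(row) if h) for row in H]

def even_subsets(nbhd):
    """E_j: all even-sized subsets of a check's neighborhood."""
    return [S for r in range(0, len(nbhd) + 1, 2)
            for S in itertools.combinations(nbhd, r)]

def embed(x):
    """The embedding from the proof: w_{j,S} = 1 exactly when S is the
    support of the codeword x within N(j), and f = x."""
    f = list(x)
    w = {}
    for j, nbhd in enumerate(neighborhoods):
        Sj = tuple(i for i in nbhd if x[i] == 1)
        for S in even_subsets(nbhd):
            w[(j, S)] = 1.0 if S == Sj else 0.0
    return f, w

def in_P(f, w):
    """Check the relaxation's constraints: w >= 0, the weights at each
    check sum to one, and f_i matches the weight on sets containing i."""
    for j, nbhd in enumerate(neighborhoods):
        subsets = even_subsets(nbhd)
        if any(w[(j, S)] < 0 for S in subsets):
            return False
        if abs(sum(w[(j, S)] for S in subsets) - 1.0) > 1e-9:
            return False
        for i in nbhd:
            if abs(f[i] - sum(w[(j, S)] for S in subsets if i in S)) > 1e-9:
                return False
    return True

assert in_P(*embed((1, 1, 1, 0, 0, 0, 0)))      # a weight-3 codeword
assert in_P(*embed((0, 0, 0, 0, 0, 0, 0)))      # the all-zeros codeword
```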
Overall, the decoding algorithm based on LCLP consists of the following steps. We first solve the LP in (8) to obtain an optimal solution $(f^*, w^*)$. If $f^*$ is integral, we output it as the optimal codeword; otherwise, $f^*$ is fractional, and we output an error. From Proposition 1, we get the following.
Proposition 2: LP decoding has the ML certificate property:
if the algorithm outputs a codeword, it is guaranteed to be the
ML codeword.
Proof: If the algorithm outputs a codeword $f^*$, then $f^*$ has cost less than or equal to all points in $P$. For every codeword $x$, we have that $(x, w)$ is a point in $P$ by Proposition 1. Therefore, $f^*$ has cost less than or equal to the cost of $x$.
Given a cycle-free factor graph, it can be shown that any optimal solution to LCLP is integral [26]. Therefore, LCLP is an exact formulation of the ML decoding problem in the cycle-free case. In contrast, for a factor graph with cycles, the optimal solution to LCLP may not be integral. Take, for example, the Hamming code in Fig. 1. Suppose that we define a cost vector as follows: for a variable node $i_0$ of degree $3$, set $\gamma_{i_0} = -2$, and for all other nodes $i \neq i_0$, set $\gamma_i = +1$. It is not hard to verify that under this cost function, all codewords have nonnegative cost: any codeword with negative cost would have to set $x_{i_0} = 1$, and therefore set at least two other $x_i = 1$ (the minimum distance is $3$), for a total cost of at least $0$. Consider, however, the following fractional solution to LCLP: first, set $f_{i_0} = 1$, set $f_i = 1/2$ on three other variable nodes, and set $f_i = 0$ elsewhere; then, at each of the three check nodes, assign weight $1/2$ to each of two local codewords containing bit $i_0$ and one of the half-valued bits. It can be verified that $(f, w)$ satisfies all of the LCLP constraints. However, the cost of this solution is $-2 + 3 \cdot (1/2) = -1/2$, which is strictly less than the cost of any codeword.

Note that this solution is not a convex combination of codewords, and so is not contained in $\operatorname{poly}(C)$. This solution gets outside of $\operatorname{poly}(C)$ by exploiting the local perspective of the relaxation: each check node is satisfied by its own pair of local configurations, but no single codeword is consistent with all the local configurations used. The analysis to follow will provide further insight into the nature of such fractional (i.e., nonintegral) solutions to LCLP.
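A fractional solution of this kind can be verified numerically. The sketch below instantiates one on an assumed standard (7, 4, 3) parity-check matrix (here $i_0$ is the 0-indexed bit 6, the bit appearing in all three checks); the particular subsets reflect our labeling, not necessarily Fig. 1's:

```python
import itertools

H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
N = [tuple(i for i, h in enumerate(row) if h) for row in H]   # 0-indexed

gamma = [1, 1, 1, 1, 1, 1, -2]      # -2 on the degree-3 bit, +1 elsewhere

# Every codeword has nonnegative cost under this cost vector.
cws = [x for x in itertools.product([0, 1], repeat=7)
       if all(sum(x[i] for i in nb) % 2 == 0 for nb in N)]
assert min(sum(g * xi for g, xi in zip(gamma, x)) for x in cws) >= 0

# A fractional LCLP solution: f = 1 on bit 6, halves on the three bits
# that each share a pair of checks with it, and weight 1/2 on two local
# codewords containing bit 6 at every check.
f = [0, 0, 0.5, 0, 0.5, 0.5, 1]
w = {(0, (2, 6)): 0.5, (0, (4, 6)): 0.5,
     (1, (2, 6)): 0.5, (1, (5, 6)): 0.5,
     (2, (4, 6)): 0.5, (2, (5, 6)): 0.5}

for j, nb in enumerate(N):
    assert sum(v for (jj, S), v in w.items() if jj == j) == 1.0   # distribution
    for i in nb:                                                  # consistency
        assert f[i] == sum(v for (jj, S), v in w.items() if jj == j and i in S)

cost = sum(g * fi for g, fi in zip(gamma, f))
print(cost)      # -0.5: strictly below the cost of every codeword
```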
It is worthwhile noting that the local codeword constraints (7)
are identical to those enforced in the Bethe free energy formulation of BP [27]. For this reason, it is not surprising that the
performance of our LP decoder turns out to be closely related to
that of the BP and min-sum algorithms.
In the remainder of this section we define two new polytopes. The first is an explicit description of $\bar{P}$, the projection of $P$ onto the $f$ variables, which will be useful for defining (and computing) the fractional distance of the code, which we cover in Section VI. The second polytope is equivalent to $P$, but has a small overall representation, even for high-density codes. This equivalence shows that LCLP can be solved efficiently for any binary linear code.

2) Projected Polytope: In this subsection, we derive an explicit description of the polytope $\bar{P}$. The following definition of $\bar{Q}$ in terms of constraints on $f$ was derived from the parity polytope of Jeroslow [29], [30]. We first enforce $0 \le f_i \le 1$ for all $i \in I$. Then, for every check $j$, we explicitly forbid every bad configuration of the neighborhood of $j$. Specifically, for all $S \subseteq N(j)$ with $|S|$ odd, we require

$$\sum_{i \in S} f_i + \sum_{i \in N(j) \setminus S} (1 - f_i) \;\le\; |N(j)| - 1. \tag{9}$$

Note that the integral settings of the bits that satisfy these constraints for some check $j$ are exactly the local codewords for $j$, as before.

Let $Q_j$ be the set of points that satisfy (9) for a particular check $j$, and all $S \subseteq N(j)$ with $|S|$ odd. We can further understand the constraints in $Q_j$ by rewriting (9) as follows:

$$\sum_{i \in S} (1 - f_i) + \sum_{i \in N(j) \setminus S} f_i \;\ge\; 1. \tag{10}$$

In other words, the distance between (the relevant portion of) $f$ and the incidence vector for each odd-sized set $S$ is at least one. This constraint ensures that $f$ is separated by at least one bit flip from all illegal configurations. In three dimensions (i.e., $|N(j)| = 3$), it is easy to see that these constraints are equivalent to the convex hull of the even-sized subsets of $N(j)$, as shown in Fig. 2. In fact, the following theorem states that in general, if we enforce (9) for all checks, we get an explicit description of $\bar{P}$.

Theorem 4: Let the polytope $\bar{Q}$ be the set of points $f \in [0,1]^n$ that satisfy (9) for all checks $j$ and all $S \subseteq N(j)$ with $|S|$ odd. Then $\bar{Q}$ and $\bar{P}$ are equivalent. In other words, the polytope $\bar{Q}$ is exactly the projection of $P$ onto the $f$ variables:

$$\bar{Q} = \bar{P} = \{ f : \exists w \text{ s.t. } (f, w) \in P \}. \tag{11}$$

Since the objective function of LCLP only involves the $f$ variables, optimizing over $\bar{Q}$ and over $P$ will produce the same result.

Proof: For each check $j$, consider the projection

$$\bar{P}_j = \{ f : \exists w \text{ s.t. } (f, w) \text{ satisfies the constraints of } P \text{ for check } j \}.$$

In other words, $\bar{P}_j$ is the convex hull of the local codeword sets defined by the sets $S \in E_j$. Note that $\bar{P} = \bigcap_j \bar{P}_j$, since each $\bar{P}_j$ exactly expresses the constraints associated with check $j$. Recall that $Q_j$ is the set of points that satisfy the constraints (9) for a particular check $j$. Since $\bar{Q} = [0,1]^n \cap \bigcap_j Q_j$, it suffices to show $\bar{P}_j = Q_j \cap [0,1]^n$ for all $j$. This is shown by Jeroslow [29]. For completeness, we include a proof of this fact in Appendix I.

We now proceed to provide combinatorial characterizations of decoding success and analyze the performance of LP decoding in various settings.
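The three-dimensional case mentioned above can be verified numerically: for a degree-3 check, the odd-set constraints on $[0,1]^3$ carve out exactly the convex hull of the four even-weight configurations. A pure-Python sketch (the closed-form hull test is specific to this small case):

```python
import itertools, random

even_vertices = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]

def satisfies_odd_set_constraints(f):
    """For every odd-sized S subset of {0,1,2}:
    sum_{i in S} f_i + sum_{i not in S} (1 - f_i) <= 2."""
    idx = (0, 1, 2)
    for r in (1, 3):
        for S in itertools.combinations(idx, r):
            lhs = sum(f[i] for i in S) + sum(1 - f[i] for i in idx if i not in S)
            if lhs > 2:
                return False
    return all(0 <= fi <= 1 for fi in f)

def in_hull(f):
    """Membership in conv(even_vertices): the convex coefficients of the
    four vertices are uniquely determined here, so just check signs."""
    f1, f2, f3 = f
    coeffs = [1 - (f1 + f2 + f3) / 2,     # weight on (0,0,0)
              (f1 + f2 - f3) / 2,         # weight on (1,1,0)
              (f1 - f2 + f3) / 2,         # weight on (1,0,1)
              (-f1 + f2 + f3) / 2]        # weight on (0,1,1)
    return all(c >= 0 for c in coeffs)

random.seed(0)
for _ in range(10000):
    f = tuple(random.random() for _ in range(3))
    assert satisfies_odd_set_constraints(f) == in_hull(f)
```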
V. PSEUDOCODEWORDS
In this section, we introduce the concept of a pseudocodeword
for LP decoding, which we will define as a scaled version of a
solution to LCLP. As a consequence, Theorem 5 will hold for
pseudocodewords in the same way that it holds for solutions to
LCLP.
The following definition of a codeword motivates the notion of a pseudocodeword. Recall that $E_j$ is the set of even-sized subsets of the neighborhood of check node $j$. Let $h$ be a vector in $\{0, 1\}^n$, and let $u$ be a setting of nonnegative integer weights, one weight $u_{j,S}$ for each check $j$ and $S \in E_j$, with $\sum_{S \in E_j} u_{j,S} = 1$ for each check $j$. We say that $(h, u)$ is a codeword if, for all edges $(i, j)$ in the factor graph $G$, $h_i = \sum_{S \in E_j : i \in S} u_{j,S}$. This corresponds exactly to the consistency constraint (7) in LCLP. It is not difficult to see that this construction guarantees that the binary vector $h$ is always a codeword of the original code.

We obtain the definition of a pseudocodeword $(h, u)$ by removing the restriction $\sum_{S \in E_j} u_{j,S} = 1$, and instead allowing each $u_{j,S}$ to take on arbitrary nonnegative integer values. In other words, a pseudocodeword is a vector $h = (h_1, \ldots, h_n)$ of nonnegative integers such that, for every parity check $j$, the neighborhood $(h_i : i \in N(j))$ is a sum of local codewords (incidence vectors of even-sized sets in $E_j$).
With this definition, any codeword is (trivially) a pseudocodeword as well; moreover, any sum of codewords is a pseudocodeword. However, in general, there exist pseudocodewords that cannot be decomposed into a sum of codewords. As an illustration, consider the Hamming code of Fig. 1; earlier, we constructed a fractional LCLP solution for this code. If we simply scale this fractional solution by a factor of two, the result is a pseudocodeword $(h, u)$: each fractional weight $w_{j,S} = 1/2$ becomes an integer weight $u_{j,S} = 1$, and $h = 2f$, so that $h_i = 2$ where $f_i = 1$, and $h_i = 1$ on the half-valued bits.
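This scaling can be checked directly. The sketch below takes a doubled fractional solution on an assumed standard (7, 4, 3) parity-check matrix and verifies the pseudocodeword condition, namely, that at every check the neighborhood values of $h$ are a nonnegative integer combination of even-sized (local-codeword) incidence vectors:

```python
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
N = [tuple(i for i, h in enumerate(row) if h) for row in H]

h = [0, 0, 1, 0, 1, 1, 2]              # 2 * (0, 0, 1/2, 0, 1/2, 1/2, 1)
u = {(0, (2, 6)): 1, (0, (4, 6)): 1,   # integer weights u_{j,S}; their
     (1, (2, 6)): 1, (1, (5, 6)): 1,   # sums per check are now 2, not 1
     (2, (4, 6)): 1, (2, (5, 6)): 1}

for j, nb in enumerate(N):
    # every weighted set is an even-sized subset of check j's neighborhood
    assert all(len(S) % 2 == 0 and set(S) <= set(nb)
               for (jj, S) in u if jj == j)
    # h_i equals the total weight on sets containing i, for every edge (i, j)
    for i in nb:
        assert h[i] == sum(v for (jj, S), v in u.items() if jj == j and i in S)
```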
Theorem 9: For a code with fractional distance $d_{\mathrm{frac}}$, the LP decoder is successful if at most $\lceil d_{\mathrm{frac}} / 2 \rceil - 1$ bits are flipped by the binary symmetric channel.

Proof: Suppose the LP decoder fails; i.e., the optimal solution $f^*$ to LCLP has $f^* \neq 0^n$. We know that $f^*$ must be a vertex of $\bar{P}$. Since $f^* \neq 0^n$, we have $\sum_i f_i^* \ge d_{\mathrm{frac}}$, by the definition of the fractional distance. Let $E \subseteq I$ be the set of bits flipped by the channel. Under the BSC, and the all-zeros assumption, we have $\gamma_i = -1$ if $i \in E$, and $\gamma_i = +1$ if $i \notin E$. Therefore, we can write the cost of $f^*$ as the following:

$$\sum_i \gamma_i f_i^* = \sum_i f_i^* - 2 \sum_{i \in E} f_i^*. \tag{12}$$

Since at most $\lceil d_{\mathrm{frac}} / 2 \rceil - 1$ bits are flipped, and each $f_i^* \le 1$, we have that $\sum_{i \in E} f_i^* \le \lceil d_{\mathrm{frac}} / 2 \rceil - 1 < d_{\mathrm{frac}} / 2$. It follows that $\sum_{i \in E} f_i^* < \frac{1}{2} \sum_i f_i^*$, since $\sum_i f_i^* \ge d_{\mathrm{frac}}$. Therefore, by (12), we have $\sum_i \gamma_i f_i^* > 0$. However, by Theorem 5 and the fact that the decoder failed, the optimal solution $f^*$ to LCLP must have cost less than or equal to zero; i.e., $\sum_i \gamma_i f_i^* \le 0$. This is a contradiction.
Note again the analogy to the classical case: just as exact ML
decoding has a performance guarantee in terms of classical distance, Theorem 9 establishes that the LP decoder has a performance guarantee specified by the fractional distance of the code.
¹We thank G. David Forney for suggesting the study of the normal realizations of the Reed-Muller codes.
Theorem 12: Under the BEC, there is a nonzero pseudocodeword with zero cost if and only if there is a stopping set. Therefore, the performance of LP and BP decoding are equivalent for the BEC.

Proof: If there is a zero-cost pseudocodeword, then there is a stopping set. Let $(h, u)$ be a pseudocodeword with $h \neq 0^n$ and $\sum_i \gamma_i h_i = 0$. Let $V = \{ i : h_i > 0 \}$. Since all $\gamma_i \ge 0$ under the BEC, we must have $\gamma_i h_i = 0$ for all $i$; therefore, $\gamma_i = 0$ for all $i \in V$.

Suppose $V$ is not a stopping set; then there is some check node $j$ that has only one neighbor $i$ in $V$. By the definition of a pseudocodeword, we have $h_i = \sum_{S \in E_j : i \in S} u_{j,S}$. Since $h_i > 0$ (by the definition of $V$), there must be some $S \in E_j$ with $i \in S$ such that $u_{j,S} > 0$. Since $S$ has even cardinality, there must be at least one other code bit $i' \in S$, which is also a neighbor of check $j$. We have $h_{i'} \ge u_{j,S} > 0$ by the definition of a pseudocodeword, and so $i' \in V$. This contradicts the fact that $j$ has only one neighbor in $V$.

If there is a stopping set, then there is a zero-cost pseudocodeword. Let $V$ be a stopping set contained in the erased positions, so that $\gamma_i = 0$ for all $i \in V$. Construct a pseudocodeword $(h, u)$ as follows. For all $i \in V$, set $h_i = 2$; for all $i \notin V$, set $h_i = 0$. Since $\gamma_i = 0$ for all $i \in V$, we immediately have $\sum_i \gamma_i h_i = 0$. For a check $j$, let $V_j = N(j) \cap V$. For all $j$ where $|V_j|$ is even and nonzero, set $u_{j, V_j} = 2$. By the definition of a stopping set, $|V_j| \neq 1$, so if $|V_j|$ is odd, then $|V_j| \ge 3$. For all $j$ where $|V_j|$ is odd, let $a$ and $b$ be two arbitrary distinct members of $V_j$, and set $u_{j, V_j \setminus \{a\}} = u_{j, V_j \setminus \{b\}} = u_{j, \{a, b\}} = 1$; note that all three sets have even size. Set all other $u_{j,S} = 0$ that we have not set in this process. We then have, for every check $j$ and every $i \in N(j)$, $\sum_{S \in E_j : i \in S} u_{j,S} = 2$ if $i \in V_j$, and $0$ otherwise, which equals $h_i$. Therefore, $(h, u)$ is a pseudocodeword.
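This converse construction can be exercised on a small example. Under the same assumed (7, 4, 3) parity-check matrix as in our earlier sketches, the 0-indexed set $V = \{2, 4, 5, 6\}$ meets every check in three bits, hence is a stopping set; the code builds $(h, u)$ as in the proof and verifies the pseudocodeword condition:

```python
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
N = [tuple(i for i, hh in enumerate(row) if hh) for row in H]

V = {2, 4, 5, 6}                               # |N(j) ∩ V| = 3 for every check
assert all(len(set(nb) & V) != 1 for nb in N)  # the stopping-set property

h = [2 if i in V else 0 for i in range(7)]
u = {}
for j, nb in enumerate(N):
    Vj = tuple(i for i in nb if i in V)
    if len(Vj) % 2 == 0:
        if Vj:
            u[(j, Vj)] = 2                     # even intersection: one set, weight 2
    else:                                      # odd |Vj| >= 3: three even-sized sets
        a, b = Vj[0], Vj[1]
        for S in (tuple(i for i in Vj if i != a),
                  tuple(i for i in Vj if i != b),
                  (a, b)):
            u[(j, S)] = u.get((j, S), 0) + 1

for j, nb in enumerate(N):                     # consistency at every edge (i, j)
    for i in nb:
        assert h[i] == sum(v for (jj, S), v in u.items() if jj == j and i in S)
```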
B. Cycle Codes
A cycle code is a binary linear code described by a factor
graph whose variable nodes all have degree . In this case, pseudocodewords consist of a collection of cycle-like structures we
call promenades [1]. This structure is a closed walk through the
graph that is allowed to repeat nodes, and even traverse edges in
different directions, as long as it makes no U-turns; i.e., it does
not use the same edge twice in a row. Wiberg [4] calls these same
structures irreducible closed walks. We may conclude from this
connection that iterative and LP decoding have identical performance in the case of cycle codes.
We note that even though cycle codes are poor in general,
they are an excellent example of when LP decoding can decode
beyond the minimum distance. For cycle codes, the minimum
distance is no better than logarithmic. However, we showed [1]
that there are cycle codes for which LP decoding has a WER of
for any
, requiring only that the crossover probability
is bounded by a certain function of the constant (independent
of ).
C. Tail-Biting Trellises
On tail-biting trellises, one can write down a linear program
similar to the one we explored for turbo codes [1] such that pseudocodewords in this LP correspond to those analyzed by Forney
et al. [5]. This linear program is, in fact, an instance of network
flow, and therefore is solvable by a more efficient algorithm than
a generic LP solver. (See [17] for a general treatment of LPs for
trellis-based codes, including turbo-like codes.)
In this case, pseudocodewords correspond to cycles in a directed graph (a circular trellis). All cycles in this graph have length $cn$ for some positive integer $c$. Codewords are simple cycles of length exactly $n$. Forney et al. [5] show that iterative
decoding will find the pseudocodeword with minimum weight-per-symbol. Using basic network flow theory, it can be shown
that the weight-per-symbol of a pseudocodeword is the same as
the cost of the corresponding LP solution. Thus, these two algorithms have identical performance.
We note that to get this connection to tail-biting trellises, if
the code has a factor graph representation, it is not sufficient
simply to write down the factor graph for a code and plug in the polytope $P$. This would be a weaker relaxation in general.
One has to define a new linear program like the one we used
for turbo-like codes [1]. With this setup, the problem reduces
directly to min-cost flow.
D. Tree-Reweighted Max-Product
In earlier work [2], we explored the connection between
this LP-based approach applied to turbo codes, and the
tree-reweighted max-product message-passing algorithm developed by Wainwright, Jaakkola, and Willsky [26]. Similar
to the usual max-product (min-sum) algorithm, the algorithm
is based on passing messages between nodes in the factor
graph. It differs from the usual updates in that the messages are
suitably reweighted according to the structure of the factor graph.
By drawing a connection to the dual of our linear program, we
showed that whenever this algorithm converges to a codeword,
it must be the ML codeword. Note that the usual min-sum
algorithm does not have such a guarantee.
E. Min-Sum Decoding
The deviation sets defined by Wiberg [4], and further refined
by Forney et al. [6] can be compared to pseudocodeword graphs.
The computation tree of the iterative min-sum algorithm is a
map of the computations that lead to the decoding of a single
bit at the root of the tree. This bit will be decoded correctly
(assuming the all-zeros word is sent) unless there is a negative-cost, locally consistent, minimal configuration of the tree that sets this bit to $1$. Such a configuration is called a deviation set, or a pseudocodeword.
All deviation sets have a support, which is the set of nodes in the configuration that are set to $1$. All such supports are acyclic graphs of the following form: the nodes are nodes from the factor graph, possibly with multiple copies of a node. Furthermore, as Wiberg writes:

Since the (graph) is finite, an infinite deviation cannot behave completely irregularly; it must repeat itself somehow. It appears natural to look for repeatable, or closed structures (in the graph), with the property that any deviation can be decomposed into such structures. [4]
Our definition of a pseudocodeword is the natural closed
structure within a deviation set. However, an arbitrary deviation
set cannot be decomposed into pseudocodewords, since it may
be irregular near the leaves. Furthermore, as Wiberg points out,
the cost of a deviation set is dominated by the cost near the
leaves, since the number of nodes grows exponentially with the
depth of the tree.
Thus, strictly speaking, min-sum decoding and LP decoding
are incomparable. However, experiments suggest that it is rare
for min-sum decoding to succeed and LP decoding to fail (see
Fig. 7). We also conclude from our experiments that the irregular
unclosed portions of the min-sum computation tree are not
worth considering; they more often hurt the decoder than help it.
F. New Iterative Algorithms and ML Certificates
From the LP Dual
In earlier work [2], we described how the iterative subgradient ascent [34] algorithm can be used to solve the LP dual for
RA codes. Thus, we have an iterative decoder whose error-correcting performance is identical to that of LP decoding in this
case. This technique may also be applied in the general setting
of LDPC codes [17]; thus, we have an iterative algorithm for any
LDPC code with all the performance guarantees of LP decoding.
APPENDIX I
PROVING THEOREM 4
Recall that $Q_j$ is the set of points $f$ such that $0 \le f_i \le 1$ for all $i$, and, for all $S \subseteq N(j)$ with $|S|$ odd,

$$\sum_{i \in S} f_i + \sum_{i \in N(j) \setminus S} (1 - f_i) \;\le\; |N(j)| - 1. \tag{13}$$
For a particular odd-sized $S \subseteq N(j)$, the constraint (13) defines a facet of $Q_j$, which we specify in terms of the $f$ variables.
For all $j \in J$ and all even $k$, $0 \le k \le |N(j)|$, we have a variable $\alpha_{j,k}$, $0 \le \alpha_{j,k} \le 1$. This variable indicates the contribution of weight-$k$ local codewords.

For all $j \in J$, $i \in N(j)$, and even $k$, $0 \le k \le |N(j)|$, we have a variable $f_{i,j,k}$, $0 \le f_{i,j,k} \le 1$, indicating the portion of $f_i$ locally assigned to local codewords of weight $k$.

Using these variables, we have the following constraint set. For all edges $(i, j)$,

$$f_i = \sum_k f_{i,j,k}. \tag{14}$$

For all $j \in J$,

$$\sum_k \alpha_{j,k} = 1. \tag{15}$$

For all $j \in J$ and all even $k$, $0 \le k \le |N(j)|$,

$$\sum_{i \in N(j)} f_{i,j,k} = k \, \alpha_{j,k}. \tag{16}$$

Finally, for all $j \in J$, $i \in N(j)$, and even $k$,

$$f_{i,j,k} \le \alpha_{j,k} \tag{17}$$

$$f_{i,j,k} \ge 0 \tag{18}$$

$$\alpha_{j,k} \ge 0. \tag{19}$$

Let $Q'$ be the set of points $(f, \alpha)$ such that the above constraints hold. This polytope has only $O(d_c^{\max})$ variables $\alpha_{j,k}$ per check node $j$, plus $O((d_c^{\max})^2)$ variables $f_{i,j,k}$ per check node, in addition to the $n$ variables $f_i$; the number of constraints per check node is likewise $O((d_c^{\max})^2)$. In total, this representation has size polynomial in the size of the factor graph, even for high-density codes. We must now show that optimizing over $Q'$ is equivalent to optimizing over $P$. Since the cost function only affects the $f$ variables, it suffices to show that the two polytopes have the same projection onto the $f$ variables. Before proving this, we need the following fact.
Lemma 14: Let $g_1, \ldots, g_d$ satisfy $\sum_{i=1}^d g_i = k\mu$ and $0 \le g_i \le \mu$ for all $i$, where $k$, $\mu$, $d$, and all $g_i$ are nonnegative integers. Then, $g$ can be expressed as the sum of $\mu$ sets of size $k$. Specifically, there exists a setting of the variables $u_S$, over sets $S \subseteq \{1, \ldots, d\}$ with $|S| = k$, to nonnegative integers such that $\sum_S u_S = \mu$, and for all $i$, $g_i = \sum_{S : i \in S} u_S$.

Proof: By induction on $\mu$. The base case $\mu = 1$ is simple; all $g_i$ are equal to either $0$ or $1$, and so exactly $k$ of them are equal to $1$. Set $u_S = 1$ for $S = \{ i : g_i = 1 \}$.

For the induction step, assume w.l.o.g. that $g_1 \ge g_2 \ge \cdots \ge g_d$. Set $g' = (g'_1, \ldots, g'_d)$, where $g'_i = g_i - 1$ if $i \le k$, and $g'_i = g_i$ otherwise. The facts that $\sum_i g_i = k\mu$ and $0 \le g_i \le \mu$ for all $i$ imply that $g_i \ge 1$ for all $i \le k$, and $g_i \le \mu - 1$ for all $i > k$. Therefore, $0 \le g'_i \le \mu - 1$ for all $i$. We also have $\sum_i g'_i = k(\mu - 1)$. Therefore, by induction, $g'$ can be expressed as the sum of $\mu - 1$ sets of size $k$. Set $S = \{1, \ldots, k\}$; if $u_S$ is already present, increase $u_S$ by $1$, and otherwise set $u_S = 1$. This setting of $u$ expresses $g$.
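The inductive argument is effectively an algorithm: repeatedly decrement the $k$ largest coordinates, $\mu$ times. A short Python sketch of this procedure (the names are ours):

```python
def decompose(g, k, mu):
    """Write g (nonnegative integers with sum(g) = k*mu and each
    g_i <= mu) as a multiset of mu size-k subsets, following the
    induction in Lemma 14: peel off the k largest coordinates."""
    g = list(g)
    assert sum(g) == k * mu and all(0 <= gi <= mu for gi in g)
    sets = []
    for _ in range(mu):
        S = sorted(range(len(g)), key=lambda i: -g[i])[:k]   # k largest
        for i in S:
            g[i] -= 1
        sets.append(frozenset(S))
    assert all(gi == 0 for gi in g)      # everything is accounted for
    return sets

sets = decompose((3, 2, 2, 1), k=2, mu=4)
# every coordinate i is covered by exactly g_i of the size-2 sets
assert all(sum(1 for S in sets if i in S) == gi
           for i, gi in enumerate((3, 2, 2, 1)))
```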
Proposition 15: The set of points $f$ such that $(f, \alpha) \in Q'$ for some $\alpha$ is equal to the set $\{ f : \exists w \text{ s.t. } (f, w) \in P \}$. Therefore, optimizing over $Q'$ is equivalent to optimizing over $P$.

Proof: Suppose $(f, w) \in P$. Set

$$\alpha_{j,k} = \sum_{S \in E_j : |S| = k} w_{j,S} \tag{20}$$

$$f_{i,j,k} = \sum_{S \in E_j : |S| = k,\ i \in S} w_{j,S}. \tag{21}$$
It is clear that the constraints (17)-(19) are satisfied by this setting. Constraint (14) is implied by (7) and (21). Constraint (15) is implied by (6) and (20). Finally, we have, for all $j$ and even $k$,

$$\sum_{i \in N(j)} f_{i,j,k} = \sum_{i \in N(j)} \, \sum_{S \in E_j : |S| = k,\ i \in S} w_{j,S} \quad \text{(by (21))} \quad = \; k \sum_{S \in E_j : |S| = k} w_{j,S} = k \, \alpha_{j,k} \quad \text{(by (20))}$$

giving constraint (16).

Now suppose $(f, \alpha)$ is a vertex of the polytope $Q'$, and so all variables are rational. For all $j$ and even $k$, consider the set
APPENDIX III
PROVING THEOREM 6
In this appendix, we show that the all-zeros assumption is
valid when analyzing LP decoders defined on factor graphs.
Specifically, we prove the following theorem.
Theorem 6: The probability that the LP decoder fails is independent of the codeword that was transmitted.
Proof: Recall that $\Pr[\mathrm{err} \mid x]$ is the probability that the LP decoder makes an error, given that $x$ was transmitted. For an arbitrary transmitted word $x$, we need to show that $\Pr[\mathrm{err} \mid x] = \Pr[\mathrm{err} \mid 0^n]$. Define $B(x)$ to be the set of received words $y$ that would cause decoding failure, assuming $x$ was transmitted. By Theorem 5, we get

$$\Pr[\mathrm{err} \mid x] = \sum_{y \in B(x)} \Pr[y \mid x]. \tag{25}$$
The set under consideration consists of integers between $0$ and $\mu$. By (16), we have that the sum of the elements in the set is equal to $k\mu$. So, by Lemma 14, the set can be expressed as the sum of $\mu$ sets of size $k$. Set the variables $u_S$ according to Lemma 14, and set the weights $w_{j,S}$ from the $u_S$, for all $S \in E_j$. We immediately satisfy (5). By Lemma 14 we get (22) and (23), and by (14) and (22), the resulting setting of $w$ projects onto the given $f$.
In particular, a chain of equalities (26)-(28) relates the two error probabilities:

$$\sum_{y \in B(x)} \Pr[y \mid x] = \sum_{y \in B(0^n)} \Pr[y \mid 0^n].$$

Equations (26) and (28) follow from the definition of $B$, and (27) follows from the symmetry of the channel.
(29)
where
. Note that
if
(30)
so we get
, satisfying the distribution
constraints.
It remains to show that $w$ satisfies the consistency constraints (7). In the following, we assume that sets $S$ are contained within the appropriate set $E_j$, which will be clear from context. For all edges $(i, j)$ in $G$, we have the relation (32).
Case 1:
, there is some
The last step follows from the fact that
Case 2:
as long as
.
. From (32), we have
(by (31))
(33)
, such that
The last step follows from the fact that
Therefore,
. A symmetric argument (using the other
then
.
half of the lemma) shows that if
Before proving Lemma 16, we need to define the notion of a relative solution in $P$, and prove results about its feasibility and cost. For two sets $S$ and $S'$, let $S \,\triangle\, S'$ denote the symmetric difference of $S$ and $S'$; i.e., $S \,\triangle\, S' = (S \cup S') \setminus (S \cap S')$. Let $(f^x, w^x)$ be the point in LCLP corresponding to the codeword $x$ sent over the channel. For a particular feasible solution $(f, w)$ to LCLP, set $(\tilde{f}, \tilde{w})$ to be the relative solution with respect to $x$ as follows. For all bits $i$, set $\tilde{f}_i = |f_i - x_i|$. For all checks $j$, let $S^x_j$ be the member of $E_j$ where $w^x_{j, S^x_j} = 1$. For all $S \in E_j$, set $\tilde{w}_{j,S} = w_{j, S \,\triangle\, S^x_j}$.

Note that for a fixed $x$, the operation of making a relative solution is its own inverse; i.e., the relative solution to $(\tilde{f}, \tilde{w})$ is $(f, w)$.
as long as
, we get
, we have
Recall that the variable and check nodes of the pseudocodeword graph $H$ partition into classes of copies of the same node in the underlying factor graph $G$. For a node $v$ of $H$, let $\bar{v}$ be the corresponding node in $G$; i.e., $\bar{v}$ is a variable node of $G$ if $v$ is a variable node, and a check node of $G$ if $v$ is a check node.

Claim 19: Every promenade of length less than the girth of $G$ is a simple path in $H$, and also represents a simple path in $G$.
APPENDIX IV
PROVING THEOREM 11
Before proving Theorem 11, we will prove a few useful facts about pseudocodewords and pseudocodeword graphs. For all the theorems in this section, let $G$ be a factor graph with all variable nodes having degree at least $d_v$, and all check nodes having degree at least $d_c$. Let $g$ be the girth of $G$, and let $H$ be the graph of some arbitrary pseudocodeword $(h, u)$ of $G$.

We define a promenade to be a path $\mathcal{P} = (v_1, v_2, \ldots)$ in $H$ that may repeat nodes and edges, but takes no U-turns; i.e., $v_{t+1} \neq v_{t-1}$ for all $t$. We will also use $\mathcal{P}$ to represent the set of nodes on the path (the particular use will be clear from context). Note that each $v_t$ could be a variable or a check node. These paths are similar to the notion of promenade in [1], and to the irreducible closed walk of Wiberg [4]. A simple path of a graph is one that does not repeat nodes.
Proof: First note that the image $\bar{\mathcal{P}}$ of the promenade is a valid walk in $G$: by construction, if there is an edge $(u, v)$ in $H$, there must be an edge $(\bar{u}, \bar{v})$ in $G$. If $\bar{\mathcal{P}}$ is simple, then $\mathcal{P}$ must be simple, so we only need to show that $\bar{\mathcal{P}}$ is simple. This is true since the length of $\bar{\mathcal{P}}$ is less than the girth of the graph.
For the remainder of this appendix, suppose w.l.o.g. that the promenade $\mathcal{P}$ has the length fixed above. Note that this length is even, since $G$ is bipartite. For each node $v$, let $T_v$ be the set of nodes in $H$ within a fixed distance of $v$; i.e., $T_v$ is the set of nodes with a path in $H$ of length at most that distance from $v$.
Claim 20: The subgraph induced by the node set is a tree.
Proof: Suppose this is not the case. Then, for some node $u$ in $T_v$, there are at least two different paths from $v$ to $u$, each with length at most the distance bound defining $T_v$. This implies a cycle in $H$ of length less than $g$; a contradiction to Claim 19.
Claim 21: The node subsets $T_v$ in $H$ are all mutually disjoint.

Proof: Suppose this is not the case; then, for some pair of nodes $v$ and $v'$, the sets $T_v$ and $T_{v'}$ share at least one vertex. Let $u$ be the vertex in $T_v$ closest to the root $v$ that also appears in $T_{v'}$. Now consider the promenade from $v$ to $u$ to $v'$, where the subpath from $v$ to $u$ is the unique such path in the tree $T_v$, and the subpath from $u$ to $v'$ is the unique such path in the tree $T_{v'}$. We must show that this walk has no U-turns. The subpaths from $v$ to $u$ and from $u$ to $v'$ are simple, so we must show only that no U-turn occurs at $u$. Since we chose $u$ to be the node closest to $v$ that appears in $T_{v'}$, the predecessor of $u$ on the first subpath must not appear on the second, and so there is no U-turn. Since the resulting walk is shorter than the girth, it must be a simple path by Claim 19. However, it is not, since one node appears twice, once at the beginning and once at the end. This is a contradiction.
Claim 22: The number of variable nodes in $H$ is at least the bound stated in Theorem 11.

Proof: Take one node set $T_v$. We will count the number of nodes on each level of the tree induced by $T_v$. Each level $\ell$ consists of all the nodes at distance $\ell$ from $v$. Note that even levels contain variable nodes, and odd levels contain check nodes.
Recall that
It follows that
where
REFERENCES
[1] J. Feldman and D. R. Karger, "Decoding turbo-like codes via linear programming," in Proc. 43rd Annu. IEEE Symp. Foundations of Computer Science (FOCS), Vancouver, BC, Canada, Nov. 2002, pp. 251-260.
[2] J. Feldman, M. J. Wainwright, and D. R. Karger, "Linear programming-based decoding of turbo-like codes and its relation to iterative approaches," in Proc. Allerton Conf. Communications, Control and Computing, Monticello, IL, Oct. 2002.
[3] C. Di, D. Proietti, I. E. Telatar, T. J. Richardson, and R. L. Urbanke, "Finite-length analysis of low-density parity-check codes on the binary erasure channel," IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 1570-1579, Jun. 2002.
[4] N. Wiberg, "Codes and decoding on general graphs," Ph.D. dissertation, Linköping University, Linköping, Sweden, 1996.
[5] G. D. Forney, F. R. Kschischang, B. Marcus, and S. Tuncel, "Iterative decoding of tail-biting trellises and connections with symbolic dynamics," in Codes, Systems and Graphical Models. New York: Springer-Verlag, 2001, pp. 239-241.
[6] G. D. Forney, R. Koetter, F. R. Kschischang, and A. Reznik, "On the effective weights of pseudocodewords for codes defined on graphs with cycles," in Codes, Systems and Graphical Models. New York: Springer-Verlag, 2001, pp. 101-112.
[7] R. Gallager, "Low-density parity-check codes," IRE Trans. Inf. Theory, vol. IT-8, no. 1, pp. 21-28, Jan. 1962.
[8] D. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 399-431, Mar. 1999.
[9] M. Sipser and D. Spielman, "Expander codes," IEEE Trans. Inf. Theory, vol. 42, no. 6, pp. 1710-1722, Nov. 1996.
[10] S.-Y. Chung, G. D. Forney, T. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Commun. Lett., vol. 5, no. 2, pp. 58-60, Feb. 2001.
[11] R. McEliece, D. MacKay, and J. Cheng, "Turbo decoding as an instance of Pearl's belief propagation algorithm," IEEE J. Sel. Areas Commun., vol. 16, no. 2, pp. 140-152, Feb. 1998.
[12] T. J. Richardson and R. L. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 599-618, Feb. 2001.
[13] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, "Improved low-density parity-check codes using irregular graphs and belief propagation," in Proc. IEEE Int. Symp. Information Theory, Cambridge, MA, Oct. 1998, p. 117.
[14] B. J. Frey, R. Koetter, and A. Vardy, "Signal-space characterization of iterative decoding," IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 766-781, Feb. 2001.
[15] J. Feldman, M. J. Wainwright, and D. R. Karger, "Using linear programming to decode linear codes," presented at the 37th Annu. Conf. on Information Sciences and Systems (CISS '03), Baltimore, MD, Mar. 2003.
[16] J. Feldman, D. R. Karger, and M. J. Wainwright, "LP decoding," in Proc. 41st Annu. Allerton Conf. Communications, Control, and Computing, Monticello, IL, Oct. 2003.
[17] J. Feldman, "Decoding error-correcting codes via linear programming," Ph.D. dissertation, MIT, Cambridge, MA, 2003.
[18] J. Feldman, T. Malkin, R. A. Servedio, C. Stein, and M. J. Wainwright, "LP decoding corrects a constant fraction of errors," in Proc. IEEE Int. Symp. Information Theory, Chicago, IL, Jun./Jul. 2004, p. 68.
[19] J. Feldman and C. Stein, "LP decoding achieves capacity," in Proc. Symp. Discrete Algorithms (SODA '05), Vancouver, BC, Canada, Jan. 2005.
[20] R. Koetter and P. O. Vontobel, "Graph-covers and iterative decoding of finite length codes," in Proc. 3rd Int. Symp. Turbo Codes, Brest, France, Sep. 2003, pp. 75-82.
[21] P. Vontobel and R. Koetter, "On the relationship between linear programming decoding and max-product decoding," paper submitted to Int. Symp. Information Theory and its Applications, Parma, Italy, Oct. 2004.
[22] P. Vontobel and R. Koetter, "Lower bounds on the minimum pseudo-weight of linear codes," in Proc. IEEE Int. Symp. Information Theory, Chicago, IL, Jun./Jul. 2004, p. 70.
[23] A. Schrijver, Theory of Linear and Integer Programming. New York: Wiley, 1987.
[24] M. Grötschel, L. Lovász, and A. Schrijver, "The ellipsoid method and its consequences in combinatorial optimization," Combinatorica, vol. 1, no. 2, pp. 169-197, 1981.