A Refinement of Expurgation

Giuseppe Cocco, Albert Guillén i Fàbregas and Josep Font-Segura

Giuseppe Cocco is with the Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, 08034, Barcelona, Spain (e-mail: [email protected]). Albert Guillén i Fàbregas is with the Department of Engineering, University of Cambridge, CB2 1PZ Cambridge, U.K., and also with the Department of Information and Communication Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain (e-mail: [email protected]). Josep Font-Segura is with the Department of Information and Communication Technologies, Universitat Pompeu Fabra, 08018, Barcelona, Spain (e-mail: [email protected]). This work was supported in part by the Ramon y Cajal fellowship program (grant RYC2021-033908-I) funded by the Spanish Government through MCIN/AEI/10.13039/501100011033 and the European Union “NextGenerationEU” Recovery Plan, the European Research Council under ERC Agreement 725411 and by the Spanish Ministry of Economy and Competitiveness under Grant PID2020-116683GB-C22.

Abstract

We show that for a wide range of channels and code ensembles with pairwise-independent codewords, with probability tending to $1$ with the code length, expurgating an arbitrarily small fraction of codewords from a randomly selected code results in a code attaining the expurgated exponent.

I Preliminaries

We consider the problem of reliable communication of $M_{n}$ equiprobable messages over noisy channels described by a random transformation $W^{n}\mkern-1.5mu(\boldsymbol{y}|\boldsymbol{x})$, where $\boldsymbol{x}\in\mathcal{X}^{n}$ and $\boldsymbol{y}\in\mathcal{Y}^{n}$ are the channel input and output sequences, and $\mathcal{X}$ and $\mathcal{Y}$ are the input and output alphabets, respectively. Each message $m\in\{1,\dotsc,M_{n}\}$, where $M_{n}=\lceil 2^{nR}\rceil$, $R$ being the code rate, is mapped onto an $n$-length codeword $\boldsymbol{x}_{m}$ sent over the channel. The code is defined as $\mathcal{C}(M_{n},n)=\{\boldsymbol{x}_{1},\dotsc,\boldsymbol{x}_{M_{n}}\}$. We denote with $P_{{\rm e},m}\bigl(\mathcal{C}(M_{n},n)\bigr)$ the error probability when codeword $m\in\{1,\dotsc,M_{n}\}$ from code $\mathcal{C}(M_{n},n)$ is transmitted; similarly, $P_{\rm e}\bigl(\mathcal{C}(M_{n},n)\bigr)=\frac{1}{M_{n}}\sum_{m=1}^{M_{n}}P_{{\rm e},m}\bigl(\mathcal{C}(M_{n},n)\bigr)$ denotes the average error probability of the code. Let $\mathsf{C}(M_{n},n)=\{\boldsymbol{X}_{1},\dotsc,\boldsymbol{X}_{M_{n}}\}$ be a random code, i.e., a set of $M_{n}$ random codewords generated with probability $\mathbb{P}[\mathsf{C}(M_{n},n)=\mathcal{C}(M_{n},n)]=\mathbb{P}[\boldsymbol{X}_{1}=\boldsymbol{x}_{1},\dotsc,\boldsymbol{X}_{M_{n}}=\boldsymbol{x}_{M_{n}}]$. We assume that codewords are generated in a pairwise independent manner, that is, for any two indices $m,k\in\{1,\ldots,M_{n}\}$, $m\neq k$, it holds that $\mathbb{P}[\boldsymbol{X}_{m}=\boldsymbol{x}_{m},\boldsymbol{X}_{k}=\boldsymbol{x}_{k}]=Q^{n}(\boldsymbol{x}_{m})Q^{n}(\boldsymbol{x}_{k})$, where $Q^{n}(\boldsymbol{x}_{m})=\mathbb{P}[\boldsymbol{X}_{m}=\boldsymbol{x}_{m}]$ is a probability distribution defined over $\mathcal{X}^{n}$.

Let $P_{{\rm e},m}\bigl(\mathsf{C}(M_{n},n)\bigr)$ and $P_{\rm e}\bigl(\mathsf{C}(M_{n},n)\bigr)$ be the random variables denoting the error probability of the $m$-th codeword for random code $\mathsf{C}(M_{n},n)$ and the average error probability of the code, respectively. We denote the $n$-length error exponents of such random variables by $E_{m}\bigl(\mathsf{C}(M_{n},n)\bigr)=-\frac{1}{n}\log P_{{\rm e},m}\bigl(\mathsf{C}(M_{n},n)\bigr)$ and $E\bigl(\mathsf{C}(M_{n},n)\bigr)=-\frac{1}{n}\log P_{\rm e}\bigl(\mathsf{C}(M_{n},n)\bigr)$, respectively. For some ensembles and channels the ensemble average of the code error probability $\mathbb{E}\bigl[P_{\rm e}\bigl(\mathsf{C}(M_{n},n)\bigr)\bigr]$ is known to decay exponentially in $n$ [1]. A lower bound on the error exponent $-\frac{1}{n}\log\mathbb{E}\bigl[P_{\rm e}\bigl(\mathsf{C}(M_{n},n)\bigr)\bigr]$ is given by Gallager’s multi-letter random coding exponent $E_{{\rm r}}^{n}(R,Q^{n})$ in [2, Eq. (5.6.16)]. For the discrete memoryless channel (DMC), this bound is known to coincide with the sphere-packing upper bound on the reliability function [3, 4] in the high rate region.

In [5, Sec. 5.7] Gallager showed that, for some channels and ensembles, there exists a code with strictly higher error exponent than $E_{{\rm r}}^{n}(R,Q^{n})$ at low rates. In order to show this, Gallager considered a pairwise-independent ensemble with $M_{n}^{\prime}=2M_{n}-1$ codewords. Using Markov’s inequality he showed that

$$\mathbb{P}\Bigl[P_{{\rm e},m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\geq 2^{\frac{1}{s}}\,\mathbb{E}\bigl[P_{{\rm e},m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)^{s}\bigr]^{\frac{1}{s}}\Bigr]\leq\frac{1}{2} \tag{1}$$

for any $s>0$ . He then introduced the indicator function

$$\varphi_{m}\bigl(\mathcal{C}(M_{n},n)\bigr)=\begin{cases}1 & \text{if } P_{{\rm e},m}\bigl(\mathcal{C}(M_{n},n)\bigr)<2^{\frac{1}{s}}\,\mathbb{E}\bigl[P_{{\rm e},m}\bigl(\mathsf{C}(M_{n},n)\bigr)^{s}\bigr]^{\frac{1}{s}}\\ 0 & \text{otherwise}\end{cases} \tag{2}$$

and showed that, using (1) and (2), the following inequality holds

$$\mathbb{E}\Biggl[\sum_{m=1}^{M_{n}^{\prime}}\varphi_{m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\Biggr]\geq M_{n}. \tag{3}$$
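For reference, the Markov step behind (1) can be written out explicitly. The short derivation below is a sketch that abbreviates $P_{{\rm e},m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)$ as $P_{{\rm e},m}$ and assumes $\mathbb{E}[P_{{\rm e},m}^{s}]>0$; it uses only that $x\mapsto x^{s}$ is increasing for $s>0$.

```latex
\begin{align*}
\mathbb{P}\Bigl[P_{{\rm e},m}\geq 2^{\frac{1}{s}}\,\mathbb{E}\bigl[P_{{\rm e},m}^{s}\bigr]^{\frac{1}{s}}\Bigr]
&=\mathbb{P}\Bigl[P_{{\rm e},m}^{s}\geq 2\,\mathbb{E}\bigl[P_{{\rm e},m}^{s}\bigr]\Bigr]
&& \text{(raise both sides to the power } s\text{)}\\
&\leq\frac{\mathbb{E}\bigl[P_{{\rm e},m}^{s}\bigr]}{2\,\mathbb{E}\bigl[P_{{\rm e},m}^{s}\bigr]}=\frac{1}{2}
&& \text{(Markov's inequality)}.
\end{align*}
```

Consequently, each indicator in (2), evaluated on the random code, has expectation at least $\frac{1}{2}$, which is the per-codeword ingredient that is summed over $m$ in (3).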

From (3) it follows that, since the average number of codewords that have a probability of error smaller than $2^{\frac{1}{s}}\mathbb{E}\bigl[P_{{\rm e},m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)^{s}\bigr]^{\frac{1}{s}}$ in a randomly generated code with $M_{n}^{\prime}=2M_{n}-1$ codewords is at least $M_{n}$, there must exist a code having at least $M_{n}$ codewords, out of the $M_{n}^{\prime}$, fulfilling this property. Thus, by removing (expurgating) the worst half of the codewords from the code with $M_{n}^{\prime}$ codewords we obtain a new code with $M_{n}$ codewords, each of which satisfies the condition in the first line of the right-hand side of (2). Finally, restricting $s$ to $0<s\leq 1$, Gallager derives a lower bound on the exponent of $2^{\frac{1}{s}}\mathbb{E}\bigl[P_{{\rm e},m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)^{s}\bigr]^{\frac{1}{s}}$, given by

$$E_{\rm ex}^{n}(R,Q^{n})=E_{\rm x}^{n}(\hat{\rho}_{n},Q^{n})-\hat{\rho}_{n}R, \tag{4}$$

where

$$E_{\rm x}^{n}(\rho,Q^{n})=-\frac{1}{n}\log\biggl(\sum_{\boldsymbol{x}}\sum_{\boldsymbol{x}^{\prime}}Q^{n}(\boldsymbol{x})Q^{n}(\boldsymbol{x}^{\prime})Z_{n}(\boldsymbol{x},\boldsymbol{x}^{\prime})^{\frac{1}{\rho}}\biggr)^{\rho}, \tag{5}$$

$Z_{n}(\boldsymbol{x},\boldsymbol{x}^{\prime})=\sum_{\boldsymbol{y}}\sqrt{W^{n}\mkern-1.5mu(\boldsymbol{y}|\boldsymbol{x})W^{n}\mkern-1.5mu(\boldsymbol{y}|\boldsymbol{x}^{\prime})}$ is the Bhattacharyya coefficient between codewords $\boldsymbol{x},\boldsymbol{x}^{\prime}\in\mathcal{X}^{n}$, while

$$\hat{\rho}_{n}=\operatorname*{arg\,max}_{\rho\geq 1}\bigl\{E_{\rm x}^{n}(\rho,Q^{n})-\rho R\bigr\} \tag{6}$$

is the parameter that yields the highest exponent. The preceding argument is valid for the maximal probability of error, since every codeword in the expurgated code attains the same exponent. In addition, observe that since (3) relies on the standard ensemble-average argument (i.e., taking the average over the ensemble), it only shows the existence of a code with the desired property. The exponent in (4) is the expurgated exponent. We refer to the code with $M_{n}^{\prime}$ codewords before expurgation as a mother code. We say that a mother code is good if, once expurgated, it yields a code with asymptotically the same rate, each of whose codewords has an exponent at least as large as the expurgated exponent.
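As a concrete illustration of (4)-(6), the following minimal sketch evaluates the expurgated exponent numerically. It rests on assumptions that are not part of the text: a binary symmetric channel with crossover probability `p`, an i.i.d. uniform input distribution (so that (5) reduces to its single-letter form), logarithms in base 2, and a plain grid search over $\rho\geq 1$ in place of an exact maximization. All function names are hypothetical.

```python
import math


def bhattacharyya_bsc(p: float) -> float:
    """Single-letter Bhattacharyya coefficient Z(x, x') for x != x' on a BSC(p)."""
    return 2.0 * math.sqrt(p * (1.0 - p))


def e_x(rho: float, p: float) -> float:
    """Single-letter version of (5), in bits, for a BSC(p) with uniform Q."""
    z = bhattacharyya_bsc(p)
    # Q(x)Q(x') = 1/4 for each of the four input pairs; Z = 1 when x == x'.
    inner = 0.5 + 0.5 * z ** (1.0 / rho)
    return -rho * math.log2(inner)


def expurgated_exponent(rate: float, p: float,
                        rho_max: float = 50.0, steps: int = 5000):
    """Grid search over 1 <= rho <= rho_max of E_x(rho) - rho * R, as in (4) and (6).

    Returns the pair (exponent, rho_hat). The grid and rho_max are
    illustrative choices, not part of the original derivation.
    """
    best_val, best_rho = -math.inf, 1.0
    for i in range(steps + 1):
        rho = 1.0 + (rho_max - 1.0) * i / steps
        val = e_x(rho, p) - rho * rate
        if val > best_val:
            best_val, best_rho = val, rho
    return best_val, best_rho


if __name__ == "__main__":
    for rate in (0.01, 0.03, 0.3):
        exponent, rho_hat = expurgated_exponent(rate, p=0.1)
        print(f"R={rate}: E_ex ~ {exponent:.4f}, rho_hat ~ {rho_hat:.3f}")
```

In this toy setting the maximizer sits at $\rho=1$ for larger rates and moves above $1$ as the rate decreases, which is the low-rate regime where expurgation improves on the random coding exponent.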

A refinement of the above follows from (1). Specifically, for $\epsilon>0$ it can be shown that there exists a code with $M_{n}^{\prime}=M_{n}(1+\epsilon)$ codewords such that removing $\epsilon M_{n}$ codewords yields a code that attains the expurgated exponent [6, Lemma 1]. Although [6, Lemma 1] generalizes Gallager’s method, it still only shows the existence of a code that attains the expurgated exponent.

II Main Result

This paper strengthens existing results on expurgation by showing that, with probability tending to $1$ with the code length, a randomly generated code with $M_{n}^{\prime}=(1+\epsilon)M_{n}$ codewords contains a code with at least $M_{n}$ codewords, each of which achieves the expurgated exponent. We define the sequence $\delta_{n}=\frac{\hat{\rho}_{n}}{n}\log\gamma_{n}$, where $\gamma_{n}$ is such that $\lim_{n\rightarrow\infty}\gamma_{n}=\infty$ while $\lim_{n\rightarrow\infty}\frac{\log\gamma_{n}}{n}=0$, and $\hat{\rho}_{n}$ is the positive sequence defined in (6), which depends on the channel, the ensemble and the rate. From the definition of $\delta_{n}$ it can be seen that if $\hat{\rho}_{n}$ either converges to a constant or grows sufficiently slowly, there exists a $\gamma_{n}$ such that $\delta_{n}\rightarrow 0$. Similarly to Gallager, for a given $\delta_{n}$, we define the indicator function

$$\phi_{m}\bigl(\mathcal{C}(M_{n},n)\bigr)=\begin{cases}1 & \text{if } E_{m}\bigl(\mathcal{C}(M_{n},n)\bigr)>E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}\\ 0 & \text{otherwise},\end{cases} \tag{7}$$

and the number of codewords attaining an exponent higher than $E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}$ as

$$\Phi\bigl(\mathcal{C}(M_{n}^{\prime},n)\bigr)=\sum_{m=1}^{M_{n}^{\prime}}\phi_{m}\bigl(\mathcal{C}(M_{n}^{\prime},n)\bigr). \tag{8}$$
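The quantities $\delta_{n}$, (7) and (8) can be sketched in a few lines of code. The snippet below is only an illustration: it assumes base-2 logarithms (consistent with $M_{n}=\lceil 2^{nR}\rceil$, although the base is not fixed in the text), takes the per-codeword error probabilities as given inputs, and uses hypothetical function names.

```python
import math


def delta_n(rho_hat: float, n: int, gamma_n: float) -> float:
    """delta_n = (rho_hat / n) * log(gamma_n); base-2 logarithm is an assumption."""
    return rho_hat / n * math.log2(gamma_n)


def count_good_codewords(error_probs, n: int, e_ex: float, delta: float) -> int:
    """Phi in (8): number of codewords whose exponent E_m exceeds e_ex - delta, as in (7)."""
    count = 0
    for p_em in error_probs:
        # E_m = -(1/n) log P_{e,m}; a zero error probability gives an infinite exponent.
        e_m = math.inf if p_em == 0.0 else -math.log2(p_em) / n
        if e_m > e_ex - delta:
            count += 1
    return count


if __name__ == "__main__":
    n = 1000
    d = delta_n(rho_hat=2.0, n=n, gamma_n=float(n))  # gamma_n = n is an illustrative choice
    toy_error_probs = [2.0 ** -50, 2.0 ** -20, 2.0 ** -40]  # made-up values
    print(d, count_good_codewords(toy_error_probs, n, e_ex=0.05, delta=d))
```

With $\gamma_{n}=n$ and a bounded $\hat{\rho}_{n}$, `delta_n` shrinks as $n$ grows, which matches the requirement $\delta_{n}\rightarrow 0$ used in Theorem 1.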

Theorem 1

Consider a pairwise-independent code ensemble with $M_{n}^{\prime}=M_{n}(1+\epsilon)$ codewords and any $\epsilon>0$ . If the sequence $\{\delta_{n}\}_{n=1}^{\infty}$ , which depends on the channel and the ensemble, satisfies $\lim_{n\rightarrow\infty}\delta_{n}=0$ , then for any $0<\epsilon_{1}<\epsilon$ , it holds that

$$\lim_{n\rightarrow\infty}\mathbb{P}\bigl[\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\geq M_{n}(1+\epsilon_{1})\bigr]=1. \tag{9}$$

Proof:

See Section III. ∎

In words, with high probability we find a mother code with $M_{n}^{\prime}=(1+\epsilon)M_{n}$ codewords, $M_{n}$ of which attain the expurgated exponent. That is, good mother codes are found easily and only contain an arbitrarily small fraction $\epsilon/(1+\epsilon)$ of codewords that need to be expurgated. Theorem 1 extends Gallager’s method, and applies, among others, to independent and identically distributed (i.i.d.) and constant composition codes over DMCs, as well as channels with memory such as the finite-state channel in [2, Sec. 4.6], for which the expurgated exponent is derived in [7].

As a final remark, recent works [8, 9, 7, 10] show that for many ensembles, most low-rate codes have an error exponent $E\bigl(\mathsf{C}(M_{n},n)\bigr)$ that is strictly larger than the exponent of the ensemble average error probability, i.e., the random coding exponent. Similarly, Theorem 1 implies that for most codes, almost any codeword has an associated error exponent $E_{m}\bigl(\mathsf{C}(M_{n},n)\bigr)$ that is strictly larger than the ensemble average of the exponent of the error probability of the codebook $\mathbb{E}\bigl[E\bigl(\mathsf{C}(M_{n},n)\bigr)\bigr]$. In both cases the smaller error exponent of the average probability of error is due to a relatively small number of elements (codes in the first case, codewords in the second) that perform poorly. Furthermore, as shown in [9, 10] for i.i.d. and constant composition codes over DMCs, the error exponents of the codes in the ensemble concentrate around the typical random coding (TRC) exponent [11, 8]. Similarly to such works, it can be shown that the error exponent $E_{m}\bigl(\mathsf{C}(M_{n},n)\bigr)$, for any $m$, concentrates around its mean, the expurgated exponent. The proof makes use of Lemma 1 in Section III and follows almost identical steps as in [10, Theorem 1], [7, Theorem 1] and [7, Theorem 2] once $P_{\rm e}(\mathcal{C})$ is replaced by $P_{{\rm e},m}(\mathcal{C})$; it is omitted here.

III Proof of Theorem 1

We start with the following lemma, whose proof is almost identical to that of [7, Lemma 1].

Lemma 1

For a channel $W^{n}$ and a pairwise-independent code ensemble with $M_{n}^{\prime}$ codewords and codeword distribution $Q^{n}$, for any $m\in\{1,\ldots,M_{n}^{\prime}\}$ it holds that

$$\mathbb{P}\bigl[E_{m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)>E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}\bigr]\geq 1-\frac{1}{\gamma_{n}}, \tag{10}$$

where $\gamma_{n}$ and $\delta_{n}$ are positive real-valued sequences.

The proof of Lemma 1 follows from Markov’s inequality

$$\mathbb{P}\Bigl[P_{{\rm e},m}(\mathsf{C}_{n})\geq\gamma_{n}^{\frac{1}{s}}\,\mathbb{E}\bigl[P_{{\rm e},m}(\mathsf{C}_{n})^{s}\bigr]^{\frac{1}{s}}\Bigr]\leq\frac{1}{\gamma_{n}} \tag{11}$$

and applying the same steps as in [7, Theorem 1] once $P_{\rm e}(\mathsf{C}_{n})$ is replaced with $P_{{\rm e},m}(\mathsf{C}_{n})$. The sequences $\gamma_{n}$ and $\delta_{n}$ are the same as those introduced in Section II. Observe that using inequality (11) and following similar steps as in [7] it can be shown that $\lim_{n\rightarrow\infty}E_{\rm ex}^{n}(R,Q^{n})$ is a lower bound on $\lim_{n\rightarrow\infty}\mathbb{E}\bigl[E_{m}\bigl(\mathsf{C}(M_{n},n)\bigr)\bigr]$. Furthermore, using similar arguments as in [10] it can be shown that this bound is tight at least for i.i.d. and constant composition codes over DMCs. That is, for such ensembles and channels $\lim_{n\rightarrow\infty}\mathbb{E}\bigl[-\frac{1}{n}\log P_{{\rm e},m}\bigl(\mathsf{C}(M_{n},n)\bigr)\bigr]=\lim_{n\rightarrow\infty}E_{\rm ex}^{n}(R,Q^{n})$, i.e., the expurgated exponent is the typical codeword exponent.
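The origin of $\delta_{n}=\frac{\hat{\rho}_{n}}{n}\log\gamma_{n}$ in Lemma 1 can be traced with a short calculation. The sketch below is not the argument of [7]; it assumes, as a working hypothesis, a Gallager-type bound of the form $\mathbb{E}\bigl[P_{{\rm e},m}(\mathsf{C}_{n})^{s}\bigr]^{\frac{1}{s}}\leq 2^{-nE_{\rm ex}^{n}(R,Q^{n})}$ at $s=1/\hat{\rho}_{n}$ and base-2 logarithms, and the exact constants in the full proof may differ.

```latex
% Assumption (hypothetical form, not quoted from the text): with s = 1/\hat{\rho}_n,
%   \mathbb{E}[P_{{\rm e},m}(\mathsf{C}_n)^{s}]^{1/s} \le 2^{-n E_{\rm ex}^{n}(R,Q^{n})}.
% On the complement of the event in (11), which has probability at least 1 - 1/\gamma_n:
\begin{align*}
P_{{\rm e},m}(\mathsf{C}_{n})
&<\gamma_{n}^{\hat{\rho}_{n}}\,\mathbb{E}\bigl[P_{{\rm e},m}(\mathsf{C}_{n})^{1/\hat{\rho}_{n}}\bigr]^{\hat{\rho}_{n}}
&& \text{(complement of (11) with } s=1/\hat{\rho}_{n}\text{)}\\
&\leq\gamma_{n}^{\hat{\rho}_{n}}\,2^{-nE_{\rm ex}^{n}(R,Q^{n})}
&& \text{(assumed bound)},\\
E_{m}(\mathsf{C}_{n})=-\frac{1}{n}\log P_{{\rm e},m}(\mathsf{C}_{n})
&>E_{\rm ex}^{n}(R,Q^{n})-\frac{\hat{\rho}_{n}}{n}\log\gamma_{n}
=E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}.
\end{align*}
```

Under this assumption, the event in (10) contains the complement of the event in (11), which gives the probability bound $1-\frac{1}{\gamma_{n}}$.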

If the positive sequence $\hat{\rho}_{n}$ , defined in (6), converges or grows sufficiently slowly, then there exists a sequence $\gamma_{n}$ such that $\lim_{n\rightarrow\infty}\gamma_{n}=\infty$ , $\lim_{n\rightarrow\infty}\frac{\log\gamma_{n}}{n}=0$ , for which $\delta_{n}=\frac{\hat{\rho}_{n}}{n}\log\gamma_{n}\rightarrow 0$ . For rate zero, that is when $\lim_{n\to\infty}\frac{1}{n}\log{M_{n}}=0$ , the $n$ -length error exponent in (4) depends on the particular subexponential growth of $M_{n}$ , while $\hat{\rho}_{n}$ tends to infinity with a growth that depends on the channel and the ensemble. In this case, as discussed in the paragraph succeeding [7, Eq. (89)], the assumption that $\frac{\hat{\rho}_{n}}{n}\log\gamma_{n}\rightarrow 0$ holds if the normalized variance of the Bhattacharyya coefficient $Z_{n}(\boldsymbol{x},\boldsymbol{x}^{\prime})$ grows slower than $\smash{\sqrt{\frac{n}{\log\gamma_{n}}}}$ . In any case, choosing such $\gamma_{n}$ and applying Lemma 1 we have that

$$\mathbb{P}\bigl[E_{m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)>E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}\bigr]\geq 1-\frac{1}{\gamma_{n}}. \tag{12}$$

The random variable $\Phi(\mathsf{C}(M_{n}^{\prime},n))$ , averaged across the ensemble, satisfies

\begin{align}
\mathbb{E}\bigl[\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\bigr]&=\sum_{m=1}^{M_{n}(1+\epsilon)}\mathbb{E}\bigl[\phi_{m}\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\bigr] \tag{13}\\
&\geq\sum_{m=1}^{M_{n}(1+\epsilon)}\left(1-\frac{1}{\gamma_{n}}\right) \tag{14}\\
&=M_{n}(1+\epsilon)\left(1-\frac{1}{\gamma_{n}}\right), \tag{15}
\end{align}

where (14) follows from the definition of the indicator function (7) and (12).

We define $\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)=M_{n}^{\prime}-\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)$, which is the number of codewords with exponent not exceeding $E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}$. From (15) it follows that

$$\mathbb{E}\bigl[\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\bigr]\leq\frac{M_{n}(1+\epsilon)}{\gamma_{n}}. \tag{16}$$
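The step from (15) to (16) uses only linearity of expectation and $M_{n}^{\prime}=M_{n}(1+\epsilon)$. Writing $\Psi$ and $\Phi$ for $\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)$ and $\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)$, it can be spelled out as follows.

```latex
\begin{align*}
\mathbb{E}[\Psi]
&=M_{n}^{\prime}-\mathbb{E}[\Phi]
&& \text{(definition of } \Psi\text{)}\\
&\leq M_{n}(1+\epsilon)-M_{n}(1+\epsilon)\left(1-\frac{1}{\gamma_{n}}\right)
&& \text{(by (15))}\\
&=\frac{M_{n}(1+\epsilon)}{\gamma_{n}}.
\end{align*}
```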

Then, for sufficiently large $n$ we have that

$$\mathbb{P}\biggl[\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)>\frac{M_{n}(1+\epsilon)}{\sqrt{\gamma_{n}}}\biggr]\leq\frac{1}{\sqrt{\gamma_{n}}}, \tag{17}$$

where (17) follows from Markov’s inequality and (16). This shows that the probability of finding a code with many codewords with exponent strictly smaller than $E_{\rm ex}^{n}(R,Q^{n})-\delta_{n}$ vanishes with $n$ . To prove our main result, we write the tail probability in (9) as

\begin{align}
\mathbb{P}\Bigl[\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\geq M_{n}(1+\epsilon_{1})\Bigr]&=1-\mathbb{P}\Bigl[\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)<M_{n}(1+\epsilon_{1})\Bigr] \tag{18}\\
&=1-\mathbb{P}\Bigl[\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)>M_{n}(\epsilon-\epsilon_{1})\Bigr], \tag{19}
\end{align}

where we used the definitions of $\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)$ and $M_{n}^{\prime}$. Since $\gamma_{n}$ tends to infinity, there must exist an $n_{0}\in\mathbb{N}$ such that $\epsilon-\epsilon_{1}>\frac{(1+\epsilon)}{\sqrt{\gamma_{n}}}$ for $n>n_{0}$ and therefore

\begin{align}
\lim_{n\to\infty}\mathbb{P}\Bigl[\Phi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)\geq M_{n}(1+\epsilon_{1})\Bigr]&\geq\lim_{n\to\infty}\biggl(1-\mathbb{P}\biggl[\Psi\bigl(\mathsf{C}(M_{n}^{\prime},n)\bigr)>\frac{M_{n}(1+\epsilon)}{\sqrt{\gamma_{n}}}\biggr]\biggr) \tag{20}\\
&\geq\lim_{n\to\infty}\biggl(1-\frac{1}{\sqrt{\gamma_{n}}}\biggr), \tag{21}
\end{align}

where (21) follows from (17). Finally, since $\gamma_{n}\rightarrow\infty$, the right-hand side of (21) equals $1$, and since a probability cannot exceed $1$, this yields (9).
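The last steps of the proof can be made concrete with a small numerical sketch. It assumes the illustrative choice $\gamma_{n}=n$ (any sequence with $\gamma_{n}\rightarrow\infty$ and $\frac{\log\gamma_{n}}{n}\rightarrow 0$ would serve) and hypothetical function names; it finds the first block length at which the condition $\epsilon-\epsilon_{1}>\frac{1+\epsilon}{\sqrt{\gamma_{n}}}$ holds and evaluates the lower bound in (21).

```python
import math


def proof_threshold(eps: float, eps1: float, gamma=lambda n: float(n)) -> int:
    """First n with eps - eps1 > (1 + eps) / sqrt(gamma(n)).

    gamma(n) = n is an illustrative, increasing choice; for such a gamma the
    condition keeps holding for every larger n.
    """
    if not 0.0 < eps1 < eps:
        raise ValueError("need 0 < eps1 < eps")
    n = 1
    while eps - eps1 <= (1.0 + eps) / math.sqrt(gamma(n)):
        n += 1
    return n


def probability_lower_bound(n: int, gamma=lambda n: float(n)) -> float:
    """Lower bound 1 - 1/sqrt(gamma_n) appearing in (21)."""
    return 1.0 - 1.0 / math.sqrt(gamma(n))


if __name__ == "__main__":
    n_first = proof_threshold(eps=0.5, eps1=0.25)
    print("condition first holds at n =", n_first)
    for n in (n_first, 10 * n_first, 100 * n_first):
        print(n, probability_lower_bound(n))
```

The sketch also shows that the bound converges slowly for this choice of $\gamma_{n}$: a faster-growing $\gamma_{n}$ tightens (21), but enlarges $\delta_{n}$, so the two sequences trade off against each other.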

References

[1] A. Feinstein, “Error bounds in noisy channels without memory,” IRE Trans. on Information Theory, vol. 1, no. 2, pp. 13–14, 1955.
[2] R. Gallager, Information Theory and Reliable Communication. USA: John Wiley & Sons, Inc., 1968.
[3] C. Shannon, R. Gallager, and E. Berlekamp, “Lower bounds to error probability for coding on discrete memoryless channels. I,” Information and Control, vol. 10, no. 1, pp. 65–103, 1967. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0019995867900526
[4] R. Fano, Transmission of Information: A Statistical Theory of Communication. Massachusetts Institute of Technology Press, 1961.
[5] R. Gallager, “A simple derivation of the coding theorem and some applications,” IEEE Trans. Inf. Theory, vol. 11, no. 1, pp. 3–18, 1965.
[6] J. Scarlett, L. Peng, N. Merhav, A. Martinez, and A. Guillén i Fàbregas, “Expurgated random-coding ensembles: Exponents, refinements, and connections,” IEEE Trans. Inf. Theory, vol. 60, no. 8, pp. 4449–4462, 2014.
[7] G. Cocco, A. Guillén i Fàbregas, and J. Font-Segura, “Typical error exponents: A dual domain derivation,” IEEE Trans. Inf. Theory, vol. 69, no. 2, pp. 776–793, Feb. 2023.
[8] N. Merhav, “Error exponents of typical random codes,” IEEE Trans. Inf. Theory, vol. 64, no. 9, pp. 6223–6235, Sep. 2018.
[9] R. Tamir, N. Merhav, N. Weinberger, and A. Guillén i Fàbregas, “Large deviations behavior of the logarithmic error probability of random codes,” IEEE Trans. Inf. Theory, vol. 66, no. 11, pp. 6635–6659, 2020.
[10] L. V. Truong, G. Cocco, J. Font-Segura, and A. Guillén i Fàbregas, “Concentration properties of random codes,” IEEE Trans. Inf. Theory, vol. 69, no. 12, pp. 7499–7537, Dec. 2023.
[11] A. Barg and G. Forney, “Random codes: Minimum distances and error exponents,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2568–2573, Sep. 2002.