Index Coding with Multiple Interpretations

Pedrosa, Valéria G.; Costa, Max H. M.

doi:10.3390/e24081149

Open Access Article

by Valéria G. Pedrosa * and Max H. M. Costa *

School of Electrical and Computer Engineering, University of Campinas, Campinas 13083-852, SP, Brazil

* Authors to whom correspondence should be addressed.

Entropy 2022, 24(8), 1149; https://doi.org/10.3390/e24081149

Submission received: 20 June 2022 / Revised: 12 August 2022 / Accepted: 16 August 2022 / Published: 18 August 2022

(This article belongs to the Special Issue Applications of Information Theory in Statistics)

Abstract:

The index coding problem consists of a system with a server and multiple receivers with different side information and demand sets, connected by a noiseless broadcast channel. The server knows the side information available to the receivers. The objective is to design an encoding scheme that enables all receivers to decode their demanded messages with a minimum number of transmissions, referred to as an index code length. The problem of finding the minimum length index code that enables all receivers to correct a specific number of errors has also been studied. This work establishes a connection between index coding and error-correcting codes with multiple interpretations from the tree construction of nested cyclic codes. The notion of multiple interpretations using nested codes is as follows: different data packets are independently encoded, and then combined by addition and transmitted as a single codeword, minimizing the number of channel uses and offering error protection. The resulting packet can be decoded and interpreted in different ways, increasing the error correction capability, depending on the amount of side information available at each receiver. Motivating applications are network downlink transmissions, information retrieval from datacenters, cache management, and sensor networks.

Keywords:

index coding; pliable index coding; error correcting index coding

1. Introduction

In this work, we consider a source coding variant, introduced by Birk and Kol [1], originally called informed source coding-on-demand (ISCOD), and further developed by Bar-Yossef et al. [2]. Motivating applications include satellite transmission of large files, audio and video on demand (such as streaming networks), database data retrieval, cache management for network applications and sensor networks. The model considered in [1] involves a source that possesses n messages and m receivers. Each receiver knows a proper subset of the messages, referred to as its side information, and demands a specific message unknown to it. The source, aware of the messages possessed by each receiver, uses this knowledge to develop a transmission scheme that satisfies the demands of all receivers using as few transmissions as possible; the number of transmissions is referred to as the index code length.

Index coding can be viewed as a special case of rate distortion with multiple receivers, each with some side information about the source [3]. Index coding has received considerable attention recently, motivated by applications in multi-user broadcast scenarios, such as audio and video on demand, streaming networks and satellite communications, and by its connection to network coding. In [4,5], the equivalence between network coding and index coding was established. This research topic has been extended in other directions, such as pliable index coding [6], a variation of index coding in which we still consider a server and m clients with side information, but where the receivers are flexible and satisfied to receive any message that is not in their side information set; such flexibility can reduce the amount of communication, sometimes significantly. This has applications in music streaming services or internet searching, such as content distribution networks (CDNs) [7]; a CDN manages servers in multiple geographically distributed locations, stores copies of the web content (including documents, images, audio and others) in its servers and attempts to direct each user request to a CDN location that will provide the best user experience. In this application, each receiver may be interested in receiving any message that it does not already possess as side information. Suppose that we are searching for the latest news and we already have some information. We are happy if we obtain any additional news that we do not have, with minimum delay. Here, we do not specify the news. On a music streaming service, users do not know which song will play next; they are usually only guaranteed that it will be one of a certain group and that it will not be repeated. In online advertising systems, customers do not require a specific advertisement to view; it is the distributor who chooses which one will be placed on customers’ screens. The distributor may wish to avoid repeating the same advertisement for the same customer, as this can decrease customer satisfaction.

How much can we gain in terms of bandwidth and user satisfaction if recommendation systems become bandwidth-aware and take into account more than just the user preferences? Song and Fragouli [8] formulated this as a new problem in the context of index coding, where they relaxed the index coding requirements and considered the case where the customer is satisfied to receive any message that they do not already have, with satisfaction proportional to their preference for that message.

A promising research area that has recently emerged is the use of index coding to improve communication efficiency in distributed computing systems, especially for data shuffling in iterative computations [9,10]. Data shuffling can form a major communication bottleneck for big data applications, and index coding has been proposed to make it more efficient. In particular, pliable index coding can offer a more efficient framework for data shuffling, as it can better leverage the many possible shuffling choices to reduce the number of transmissions.

The index coding problem subject to transmission errors was initially considered by Dau et al. [11]. In this work, we establish a connection between index coding and error-correcting codes with multiple interpretations from the tree construction of nested cyclic codes proposed in [12]. The notion of multiple interpretation using nested codes [13] is as follows: multiple information packets are separately encoded via linear channel codes, and then combined by addition and transmitted as a single codeword, minimizing the number of channel uses and offering error protection. The resulting packet can be decoded and interpreted in different ways, yielding an increase in error correction capability, depending on the amount of side information available at each receiver.

Part of the content of this paper was presented in [14]. In the current version, evidence to verify our claims has been added, as well as some examples. The results in this paper are an extension of the results in [12,14].

The main contributions of this paper are as follows.

We verify that, for cyclic codes, there will not always be an increase in error correction capability between different levels of the code tree. For this reason, we initially restrict the study to Reed–Solomon codes, since they are maximum distance separable (MDS) codes and provide an increase in Hamming distance at each level. This means that, under certain conditions, knowledge of side information can be interpreted as an increase in error correction capability.
We propose a new variant for the index coding problem, which we call “index coding with multiple interpretations”. We assume that receivers demand all the messages from the source and that the sender is unaware of the subset of messages already known by the receivers. The sender performs encoding such that any side information may be used by the decoder in order to increase its error correction capability. Moreover, if a receiver has no side information, the decoder considers the received word to belong to the highest rate code, associated with the root node of the tree.
We also propose a solution to relax some constraints on how side information should occur at the receivers, using graph coloring associated with the pliable index coding problem.

2. Preliminaries

2.1. Notation and Definitions

For any positive integer $n$, we let $[n] := \{1, \dots, n\}$. We write $F_{q}$ to denote the finite field of size $q$, where $q$ is a prime power, and use $F_{q}^{n \times t}$ to denote the vector space of all $n \times t$ matrices over $F_{q}$.

2.2. Review of Linear and Cyclic Codes

We now introduce the notation and briefly review some of the relevant properties of linear and cyclic codes based on [15,16]. The purpose of a code is to add extra check symbols to the data symbols so that errors may be found and corrected at the receiver. That is, a sequence of data symbols is represented by some longer sequence of symbols with enough redundancy to protect the data. In general, to design coding schemes for receivers with side information, we will consider collections of linear codes of length $n$ over $F_{q}$.

Structure of Linear Block Codes

Recall that under componentwise vector addition and componentwise scalar multiplication, the set of $n$-tuples of elements from $F_{q}$ is the vector space called $F_{q}^{n}$. For the vectors $u = (u_{1}, \dots, u_{n}) \in F_{q}^{n}$ and $v = (v_{1}, \dots, v_{n}) \in F_{q}^{n}$, the Hamming distance between $u$ and $v$ is defined to be the number of coordinates in which $u$ and $v$ differ, i.e.,

$$d(u, v) = |\{i \in [n] : u_{i} \neq v_{i}\}|.$$

Definition 1.

A $k$-dimensional subspace $C$ of $F_{q}^{n}$ is called a linear $(n, k, d)_{q}$ code over $F_{q}$ if the minimum distance of $C$,

$$d(C) \triangleq \min_{u, v \in C,\ u \neq v} d(u, v),$$

is equal to $d$. Sometimes, we only use $(n, k)_{q}$ to refer to the code $C$, where $n$ is the length of the codewords and $k$ is the dimension of the code. The code’s rate is the ratio $\frac{k}{n}$.

Such an $(n, k)_{q}$ linear code $C$ can be completely described by any set of $k$ linearly independent codewords $v_{1}, v_{2}, \dots, v_{k}$; thus, any codeword is one of the $q^{k}$ linear combinations $\sum_{i=1}^{k} \alpha_{i} v_{i}$, $\alpha_{i} \in F_{q}$. If we arrange these $k$ codewords as the rows of a $k \times n$ matrix $G$, we say that $G$ is a generator matrix for $C$.
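As a minimal sketch of these definitions, the following Python snippet enumerates the $q^{k}$ codewords generated by a generator matrix and computes the minimum distance by brute force. The matrix `G` is an assumed illustrative choice (a systematic generator matrix of a binary (7, 4) Hamming code), not one taken from this paper, and the arithmetic modulo `q` is only valid when `q` is prime.

```python
from itertools import product


def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def codewords(G, q=2):
    """All linear combinations of the rows of G over F_q (q assumed prime)."""
    k, n = len(G), len(G[0])
    words = set()
    for alpha in product(range(q), repeat=k):
        word = tuple(sum(a * row[j] for a, row in zip(alpha, G)) % q
                     for j in range(n))
        words.add(word)
    return words


def minimum_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    code = list(code)
    return min(hamming_distance(u, v)
               for i, u in enumerate(code) for v in code[i + 1:])


# Assumed example: generator matrix [I | P] of a binary (7, 4) Hamming code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0, 1],
]

C = codewords(G)
print(len(C), minimum_distance(C))
```

The exhaustive search is exponential in $k$ and is only meant for toy parameters.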

A special case of major importance is $F_{2}^{n}$, which is the vector space of all binary words of length $n$, with two such vectors added by modulo-2 addition in each component. A binary code of size $M = 2^{k}$ for an integer $k$ is referred to as an $(n, k)$ binary code.

We consider cyclic codes of length $n$ over $F_{q}$ with $\gcd(n, q) = 1$. Label the coordinates of $c \in F_{q}^{n}$ with the elements of $Z_{n} = \{0, 1, \dots, n - 1\}$ and associate the vector $c = (c_{0}, \dots, c_{n - 1})$ with the polynomial $c(x) = c_{0} + c_{1} x + \dots + c_{n - 1} x^{n - 1}$. With this correspondence, a cyclic code $C$ is an ideal of the ring $R_{n} = F_{q}[x] / (x^{n} - 1)$. We use $g(x)$ to denote the generator polynomial of $C$ and write

$$C = \langle g(x) \rangle = \{C(x) \in F_{q}[x] ;\ g(x) \mid C(x)\}$$

to describe a $t$-error-correcting cyclic code.
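The polynomial description can be checked on a small binary case. The sketch below stores a polynomial over $F_{2}$ as a Python integer (bit $i$ is the coefficient of $x^{i}$) and verifies that the multiples of $g(x)$ are closed under cyclic shifts. The length $n = 7$ and the generator $g(x) = 1 + x + x^{3}$ are assumed example values, and the helper names are hypothetical.

```python
def gf2_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(a, b):
    """Quotient and remainder of a(x) divided by b(x) over GF(2), b != 0."""
    quotient = 0
    deg_b = b.bit_length()
    while a.bit_length() >= deg_b:
        shift = a.bit_length() - deg_b
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a


N = 7             # code length (assumed toy value)
G_POLY = 0b1011   # g(x) = 1 + x + x^3 (assumed example generator)

# g(x) must divide x^n - 1, which equals x^n + 1 over GF(2).
_, remainder = gf2_divmod((1 << N) | 1, G_POLY)

# Multiples p(x) g(x) with deg p < n - deg g form the code <g(x)>.
K = N - (G_POLY.bit_length() - 1)
code = {gf2_mul(p, G_POLY) for p in range(1 << K)}


def cyclic_shift(c, n=N):
    """Multiply c(x) by x modulo x^n - 1, i.e., rotate the coefficients."""
    return ((c << 1) | (c >> (n - 1))) & ((1 << n) - 1)


closed = all(cyclic_shift(c) in code for c in code)
print(remainder == 0, len(code), closed)
```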

2.3. Index Coding with Side Information

The system shown in Figure 1 illustrates the index coding problem. Receiver $R_{i}$ is requesting the message $x_{i}$, $i \in \{1, 2, 3\}$, and knows other messages as side information: $R_{1}$ knows $x_{3}$, $R_{2}$ knows $x_{1}$ and $x_{3}$, and $R_{3}$ knows $x_{1}$ and $x_{2}$.

The goal of index coding is to perform the joint encoding of the messages, in order to simultaneously meet the demands of all receivers, while transmitting the resulting messages at the highest possible rate.

Assuming a noiseless broadcast channel, the server would communicate all messages by sending one at a time, in three transmissions.

Alternatively, when the server transmits the two coded messages $x_{1}$ and $x_{2} \oplus x_{3}$, the receiver $R_{1}$ obtains $x_{1}$ directly, while $R_{2}$ and $R_{3}$ recover their demands from $(x_{2} \oplus x_{3}) \oplus x_{3} = x_{2}$ and $(x_{2} \oplus x_{3}) \oplus x_{2} = x_{3}$, respectively.
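The two-transmission scheme above can be checked exhaustively for binary messages. The following sketch is only an illustration of the Figure 1 instance as described in the text; the function names `encode` and `decode` are hypothetical.

```python
from itertools import product

# Side information of the Figure 1 instance: receiver i -> indices it knows.
side_info = {1: {3}, 2: {1, 3}, 3: {1, 2}}


def encode(x):
    """Broadcast the two symbols x1 and x2 XOR x3."""
    return (x[1], x[2] ^ x[3])


def decode(i, y, known):
    """Receiver i recovers x_i from the broadcast y and its side information."""
    if i == 1:
        return y[0]              # x1 is sent uncoded
    if i == 2:
        return y[1] ^ known[3]   # (x2 + x3) + x3 = x2
    return y[1] ^ known[2]       # (x2 + x3) + x2 = x3


all_ok = True
for bits in product((0, 1), repeat=3):
    x = dict(zip((1, 2, 3), bits))
    y = encode(x)
    for i, S in side_info.items():
        known = {j: x[j] for j in S}
        all_ok = all_ok and decode(i, y, known) == x[i]

print(all_ok)
```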

The index coding problem is formulated as follows. Suppose that a server $S$ wants to send a vector $x = (x_{1}, x_{2}, \dots, x_{n})$, where $x_{i} \in F_{q}$ for all $i \in [n]$, to $n$ receivers $R_{1}, R_{2}, \dots, R_{n}$. Each receiver $R_{i}$ has $x_{S_{i}} = \{x_{j} ;\ j \in S_{i} \subseteq [n] \setminus \{i\}\}$ as side information and is interested in receiving the message $x_{i}$. The codeword $C(x) \in F_{q}^{\ell}$ is sent and allows each receiver $R_{i}$ to retrieve $x_{i}$. $C$ is a scalar index code over $F_{q}$ of length $\ell$. The goal of $S$ is to find an index code of minimum length. The index code is called linear if $C(x)$ is a linear function.

Index Coding via Fitting Matrices

A directed graph $G = (V, E)$ with $n$ vertices specifies an instance of the index coding problem. Each vertex of $G$ corresponds to a receiver (and its demand) and there is a directed edge $i \to j$ if and only if the receiver $R_{i}$ knows $x_{j}$ as side information. Then, we write:

$$S_{i} = \{j : (i, j) \text{ is an edge of } G\}$$

Definition 2.

Let $G = (V, E)$ be a directed graph on $n$ vertices without self-loops.

1. A 0-1 matrix $M = (m_{ij})$, whose rows and columns are labeled by the elements of $V = [n]$, fits $G$ if, for all $i$ and $j$,

(i): $m_{ii} = 1$ ;
(ii): For $i \neq j$ ,

$m_{ij} = \begin{cases} * \in \{0, 1\}, & \text{if } (i, j) \text{ is an edge of } G; \\ 0, & \text{otherwise.} \end{cases}$

Thus, $M - I$ is the adjacency matrix of an edge-induced subgraph of $G$, where $I$ denotes the $n \times n$ identity matrix.

2. The minrank of $G$ over the field $F_{2}$ is defined as follows:

$$\operatorname{minrk}_{2}(G) \triangleq \min \{\operatorname{rank}_{2}(M) : M \text{ fits } G\}$$

Remark 1.

The term $\operatorname{rank}_{2}(M)$ denotes the rank of such a matrix $M$ over $F_{2}$, after each “*” has been assigned a value of 0 or 1. As an example, for the index coding problem instance described in Figure 1, the matrix $M$ would be given as follows:

$$M = \begin{bmatrix} 1 & 0 & * \\ * & 1 & * \\ * & * & 1 \end{bmatrix}$$

Example 1.

Consider the side information graph $G$ and a matrix $M$ that fits $G$, as shown in Figure 2. As $\operatorname{minrk}_{2}(G) = 2$, we can select two linearly independent rows of a minimum-rank matrix $M$ to form a matrix $M'$, and design a linear index code with the shortest possible length. The codeword sent will be $M' x$.

Theorem 1 ([2]). For any side information graph $G$, there exists a linear index code for $G$ whose length equals $\operatorname{minrk}_{2}(G)$. This length is optimal among all linear index codes for $G$.
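For very small instances, the minrank can be found by exhaustive search over all assignments of the “*” entries. The sketch below does this over $F_{2}$ for the Figure 1 instance, using the fitting pattern of Remark 1 with 0-based indices. It is an illustrative brute-force routine with hypothetical function names, not the Matlab implementation referred to in this paper, and its running time is exponential in the number of edges.

```python
from itertools import product


def rank_gf2(rows):
    """Rank over F_2 of a 0-1 matrix given as a list of row lists."""
    rows = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def minrank_gf2(n, side_info):
    """Minimum F_2-rank over all matrices that fit the side information graph."""
    free = [(i, j) for i in range(n) for j in sorted(side_info[i])]
    best, best_matrix = n + 1, None
    for values in product((0, 1), repeat=len(free)):
        M = [[int(i == j) for j in range(n)] for i in range(n)]
        for (i, j), v in zip(free, values):
            M[i][j] = v
        r = rank_gf2(M)
        if r < best:
            best, best_matrix = r, M
    return best, best_matrix


# Figure 1 instance (0-based): R1 knows x3; R2 knows x1, x3; R3 knows x1, x2.
side_info = {0: {2}, 1: {0, 2}, 2: {0, 1}}
best, M = minrank_gf2(3, side_info)
print(best, M)
```

For this instance the search returns rank 2, which matches the two-transmission scheme $x_{1}$, $x_{2} \oplus x_{3}$ described earlier.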

In [17], the index coding problem was generalized. Suppose that a sender wants to transmit $n$ messages $(X_{1}, \dots, X_{n})$, where $X_{i} \in F_{q}^{t}$ for all $i \in [n]$, to $m$ receivers $R_{1}, \dots, R_{m}$, through a noiseless channel. The receiver $R_{j}$ is interested in recovering a single block $X_{f(j)}$, where $f : [m] \to [n]$, and knows $X_{S_{j}} = \{X_{i} ;\ i \in S_{j} \subseteq [n] \setminus \{f(j)\}\}$. The goal is to satisfy the demands of all receivers, exploiting their side information, in a minimum number of transmissions.

When $m = n$, $f(j) = j$ for all $j \in [m]$, and $t = 1$, we have a scalar index code [2]. Otherwise, we have a vector index code.

Let $I = \{S_{j} ;\ j \in [m]\}$ be the set of side information of all receivers. An instance of an index coding problem given by $(m, n, I, f)$ can be described by a directed hypergraph.

Definition 3.

The side information hypergraph $H(V, E_{H}) = H(m, n, I, f)$ is defined by the set of vertices $V = [n]$ and the (directed) hyperedges $E_{H}$, where

$$E_{H} = \{e_{j} = (f(j), S_{j}) ;\ j \in [m]\}$$

A hyperedge $e_{j} = (f(j), S_{j})$ represents the demand and side information of the receiver $R_{j}$.

Example 2.

Consider the instance of an index coding problem in Figure 3. The hypergraph in Figure 3b describes the problem, where $n = 4$ (messages) and $m = 5$ (receivers), with demands $f(1) = 1$, $f(2) = 3$, $f(3) = 4$, $f(4) = 4$ and $f(5) = 2$, and with the side information sets $S_{1} = \{3, 4\}$, $S_{2} = \{2, 4\}$, $S_{3} = \{1\}$, $S_{4} = \{2\}$ and $S_{5} = \{1, 3\}$, respectively.

Definition 4.

Given an instance of an index coding problem described by $H(m, n, I, f)$, a map

$$C : F_{q}^{n \times t} \to F_{q}^{\ell \times t}$$

is an $F_{q}$-index code of length $\ell$ for the instance described by $H$ if, for each receiver $R_{j}$, $j \in [m]$, there exists a decoding function

$$D_{j} : F_{q}^{\ell \times t} \times F_{q}^{t |S_{j}|} \to F_{q}^{t}$$

satisfying $D_{j}(C(X), X_{S_{j}}) = X_{f(j)}$ for all $X \in F_{q}^{n \times t}$.

The transmission rate of the code is defined as $\frac{\ell}{t}$. If $t = 1$, then the index code is known as a scalar index code; otherwise, it is known as a vector index code. A linear coding function $C$ is also called a linear index code. The goal of index coding is to find optimal index codes, i.e., those with the minimum possible transmission rate. For scalar linear index codes, we refer to the quantity $\ell$ as the length of the code, and thus rate optimality translates to minimal length codes.

Definition 5.

$C$ is an $F_{q}$-linear index code if $C(X) = G X$ for all $X \in F_{q}^{n \times t}$, where $G \in F_{q}^{\ell \times n}$. $G$ is the matrix that generates the linear index code $C$.

The following definitions generalize the minrank over $F_{q}$ of the side information graph $G$, which was defined in [2], to a hypergraph $H(m, n, I, f)$.

Definition 6.

Let $\operatorname{Supp}(v) = \{i \in [n] : v_{i} \neq 0\}$ be the support of a vector $v \in F_{q}^{n}$. The Hamming weight of $v$, denoted by $\omega(v) = |\operatorname{Supp}(v)|$, is the number of nonzero coordinates of $v$.

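Definition 7 ([11]). Suppose that $H(m, n, I, f)$ corresponds to an instance of index coding with side information (ICSI). Then, the minrank of $H$ over $F_{q}$ is defined as

$$\operatorname{minrk}_{q}(H) \triangleq \min \{\operatorname{rank}_{q}(\{v_{i} + e_{f(i)}\}_{i \in [m]}) : v_{i} \in F_{q}^{n},\ \operatorname{Supp}(v_{i}) \subseteq S_{i}\},$$

where $e_{f(i)}$ denotes the unit vector of $F_{q}^{n}$ with a 1 in position $f(i)$ and 0 elsewhere.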

This may be rewritten as follows.

Definition 8.

Let a side information hypergraph $H$ correspond to an instance of the ICSI problem. A matrix $M = (m_{ij}) \in F_{q}^{m \times n}$ fits $H$ if

$$m_{ij} = \begin{cases} 1 & \text{if } j = f(i), \\ 0 & \text{if } j \notin S_{i} \cup \{f(i)\}, \end{cases}$$

while the remaining entries ($j \in S_{i}$) may be arbitrary elements of $F_{q}$.

The minrank of $H$ over the field $F_{q}$ is defined as follows:

$$\operatorname{minrk}_{q}(H) \triangleq \min \{\operatorname{rank}_{q}(M) : M \text{ fits } H\}$$

Theorem 2 ([2]). Given an instance of an index coding problem described by the hypergraph $H(m, n, I, f)$, the optimal length of a linear index code over the field $F_{q}$ is $\operatorname{minrk}_{q}(H)$.

In [2], it was proven that, in several cases, linear index codes are optimal. The authors conjectured that, for any side information graph $G$, the shortest index code would always be linear and have length $\operatorname{minrk}_{2}(G)$. The conjecture was refuted by Lubetzky and Stav in [18]. In any case, as shown by Peeters [19], calculating the minrank of an arbitrary graph is a difficult task. More specifically, Peeters showed that deciding whether a graph has minrank three is an NP-complete problem.

Example 3.

Consider the instance of the index coding problem given in Example 2. Then, we find that a matrix $M$ that fits the hypergraph $H$ has the form:

$$M = \begin{bmatrix} 1 & 0 & * & * \\ 0 & * & 1 & * \\ * & 0 & 0 & 1 \\ 0 & * & 0 & 1 \\ * & 1 & * & 0 \end{bmatrix}$$

The rows are associated with the receivers $R_{1}, \dots, R_{5}$ and the columns with the message indexes $1, 2, 3$ and $4$. The symbol “*” can be replaced by an arbitrary element of the field $F_{q}$.

As an example, consider the field $F_{2}$. A matrix that fits the hypergraph $H$ has rank at least 3. Thus, we select

$$M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}$$

which achieves the minimum rank 3. Now, we take three linearly independent rows of $M$ to form the matrix $G$, so that

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \Rightarrow G x = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{bmatrix} = \begin{bmatrix} x_{1} \\ x_{2} \oplus x_{3} \\ x_{4} \end{bmatrix}$$

The decoding process goes as follows. Since $R_{2}$ and $R_{5}$ already know $\{x_{2}, x_{4}\}$ and $\{x_{1}, x_{3}\}$, respectively, they obtain $x_{3}$ and $x_{2}$, respectively, from the second packet, $x_{2} \oplus x_{3}$. Receiver $R_{1}$ obtains $x_{1}$ from the first packet, and both $R_{3}$ and $R_{4}$ obtain $x_{4}$ from the third packet.
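The decoding steps of Example 3 can be verified exhaustively over $F_{2}$. In the sketch below, the demands, side information sets and the matrix $G$ are those of Examples 2 and 3; the decoder is a simplified assumption that looks for a single transmitted packet involving only the demanded message and known messages, which suffices here but is not a general linear decoder.

```python
from itertools import product

G = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
demand = {1: 1, 2: 3, 3: 4, 4: 4, 5: 2}                         # f(j)
side_info = {1: {3, 4}, 2: {2, 4}, 3: {1}, 4: {2}, 5: {1, 3}}   # S_j


def encode(x):
    """Compute G x over F_2 for x = (x1, x2, x3, x4)."""
    return [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]


def decode(j, y, known):
    """Recover x_{f(j)} from one packet whose other terms are all known."""
    d = demand[j]
    for row, symbol in zip(G, y):
        support = {i + 1 for i, g in enumerate(row) if g}
        if d in support and support - {d} <= side_info[j]:
            for i in support - {d}:
                symbol ^= known[i]   # cancel the known messages
            return symbol
    raise ValueError("no single packet is decodable for this receiver")


all_ok = True
for x in product((0, 1), repeat=4):
    y = encode(x)
    for j in demand:
        known = {i: x[i - 1] for i in side_info[j]}
        all_ok = all_ok and decode(j, y, known) == x[demand[j] - 1]

print(all_ok)
```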

Remark 2.

We have made available at [20] an algorithm (m-files) in Matlab, which is designed to solve the small examples in this work, since, as mentioned above, no polynomial-time algorithm is known for computing the minrank of an arbitrary graph.

2.4. Pliable Index Coding

The pliable index coding problem (PICOD), introduced by Brahma and Fragouli in [6], is a variant of the index coding problem. In PICOD, users do not have predetermined messages to decode, as in the case of classic index coding; instead, each user is satisfied to decode any message that is not present in its side information set. Figure 4 illustrates this system model.

The problem is formalized as follows: a transmitter with $n$ messages $\{x_{i} : i \in [n]\}$, $x_{i} \in X$, is connected to $m$ receivers $R_{1}, \dots, R_{m}$ through a noiseless channel. Each receiver $R_{j}$ knows $x_{S_{j}} = \{x_{i} : i \in S_{j}\}$ as side information. We denote by $I_{j} \triangleq [n] \setminus S_{j}$ the index set of the messages that are not available to $R_{j}$. Then, $x_{I_{j}} = \{x_{i} : i \in I_{j}\}$ denotes the set of requests of $R_{j}$. Each receiver $R_{j}$ is satisfied if it can successfully recover any message that is not present in its side information set, i.e., any message $x_{d} \in x_{I_{j}}$.

We can represent an instance of a pliable index coding problem using an undirected bipartite graph, one side representing the message indexes and the other side representing the receivers. We connect $R_{j}$ to the indices belonging to $I_{j}$, as in Figure 5.

Remark 3.

By having this freedom to choose the desired message for each user, PICOD can satisfy all users with a significant reduction in the number of transmissions compared to the index coding problem with the same set of messages and the same sets of user side information.

Example 4.

We will consider the case described in Example 2 as a pliable index coding problem. Now, we have the bipartite graph in Figure 5 describing the problem. Note, for example, that client 1 demands any of the messages indexed in $I_{1} = \{1, 2\}$ and knows the messages indexed in $S_{1} = \{3, 4\}$; client 3 will be satisfied to receive any of the messages $x_{2}$, $x_{3}$ or $x_{4}$, since it only knows $x_{1}$.

Pliable Index Coding via Colorings of Hypergraphs

In [21], a graph coloring approach was presented for pliable index coding. The authors have shown the existence of a coding scheme that has length $O(\log^{2} \Gamma)$, where $\Gamma$ refers to a hypergraph parameter that captures the maximum number of intersections between edges of the PICOD hypergraph.

An instance of the pliable index coding problem is described by $(m, n, I)$-PICOD, where $I = \{I_{j} ;\ j \in [m]\}$, and can be conveniently represented by a hypergraph.

Definition 9.

The hypergraph $H(V, E_{H})$, with $V = [n]$ vertices and $E_{H} = \{e_{j} = (I_{j}) ;\ j \in [m]\}$ hyperedges, completely describes the $(m, n, I)$-PICOD. The hyperedge $e_{j} = (I_{j})$ represents the set of requests for $R_{j}$ (i.e., $E_{H} = I$).

The problem illustrated in Example 4 can be represented by a hypergraph, as can be seen in Figure 6.

Let $H = (V, E_{H})$ be a hypergraph and $C : V \to [L]$ be a coloring of $V$, where $L$ is a positive integer. We say that $C$ is a conflict-free coloring for the hyperedges if each hyperedge in $E_{H}$ has at least one vertex with a unique color, i.e., a color assigned to no other vertex of that hyperedge. The smallest number of colors required for such a coloring is called the conflict-free chromatic number of $H$, denoted by $\chi_{CF}(H)$. This parameter was first introduced by Even et al. [22].

Remark 4.

In [21], pliable index coding was given a graph coloring approach. The authors have shown the existence of a coding scheme that has length $O(\log^{2} \Gamma)$, where $\Gamma$ refers to a hypergraph parameter that captures the maximum number of “incidences” of other hyperedges in any given hyperedge. This result improves the best known achievable results in some parameter regimes.

Definition 10.

A pliable index code (PIC) consists of an encoding function on the server that encodes the $n$ messages into an $\ell$-length codeword,

$$\phi : X^{n} \to X^{\ell},$$

and decoding functions

$$\psi_{j} : X^{\ell} \times X^{|S_{j}|} \to X,$$

satisfying $\psi_{j}(\phi(x), x_{S_{j}}) = x_{d}$ for some $d \in I_{j}$.

The quantity $\ell$ is called the length of the pliable index code. We are interested in designing pliable index codes that have small $\ell$.

We will assume that $X = F^{k}$ for some finite field $F$ and integer $k \geq 1$. If $k > 1$, we refer to this code as a $k$-vector PIC, while the $k = 1$ case is also called a scalar PIC. We will concentrate on linear PICs. In this case, the coding function $\phi$ is represented by an $\ell \times nk$ matrix (denoted by $G$) such that $\phi(x_{i} : i \in [n]) = G x^{T}$, where

$$x = (x_{11}, \dots, x_{1k}, \dots, x_{n1}, \dots, x_{nk}).$$

The smallest $\ell$ for which there is a linear $k$-vector PIC for an instance of the pliable index coding problem given by the hypergraph $H$ will be denoted by $\ell_{k}^{*}(H)$.

Definition 11.

Let $C : V \to [L]$ be a conflict-free coloring of the hypergraph $H$ that represents a PICOD. The $L \times n$ indicator matrix $G_{C}$ associated with this coloring is given by

$$G_{C}(c, i) = \begin{cases} 1, & \text{if the vertex } i \text{ received the color } c; \\ 0, & \text{otherwise.} \end{cases}$$

Theorem 1 ([21]). The indicator matrix $G_{C}$ generates a pliable index code for the problem described by the hypergraph $H$.

Example 5.

Consider the same PICOD represented in Figure 6. The coloring shown in Figure 7 is a conflict-free coloring with two colors. Then, the indicator matrix is

$$G_{C} = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}.$$

From the messages $x_{1} \oplus x_{4}$ and $x_{2} \oplus x_{3}$, all receivers can successfully recover at least one message from their request set.

Using the same parameters as in Example 2, we see that the length of the index code for this instance is $\ell = 3$, while, for the PICOD case, $\ell_{k}^{*}(H) = 2$.
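The coloring argument of Example 5 can be sketched in a few lines of Python. The coloring below (messages 1 and 4 in color 1, messages 2 and 3 in color 2) is read off from the indicator matrix of Example 5, and the side information sets are those of Example 2; the function names are hypothetical and messages are assumed to be single bits.

```python
from collections import Counter
from itertools import product

n = 4
side_info = {1: {3, 4}, 2: {2, 4}, 3: {1}, 4: {2}, 5: {1, 3}}           # S_j
requests = {j: set(range(1, n + 1)) - S for j, S in side_info.items()}  # I_j
coloring = {1: 1, 2: 2, 3: 2, 4: 1}                                     # vertex -> color


def is_conflict_free(requests, coloring):
    """Every hyperedge I_j must contain a vertex whose color is unique in it."""
    for edge in requests.values():
        counts = Counter(coloring[v] for v in edge)
        if 1 not in counts.values():
            return False
    return True


def indicator_matrix(coloring, n):
    """Row c has a 1 in column i exactly when vertex i received color c."""
    L = max(coloring.values())
    return [[int(coloring[i] == c) for i in range(1, n + 1)]
            for c in range(1, L + 1)]


def decode(j, y, known):
    """Return (d, x_d) for a requested index d with a unique color in I_j."""
    for d in sorted(requests[j]):
        c = coloring[d]
        if [v for v in requests[j] if coloring[v] == c] == [d]:
            symbol = y[c - 1]
            # Every other vertex of color c lies outside I_j, so it is known.
            for v, col in coloring.items():
                if col == c and v != d:
                    symbol ^= known[v]
            return d, symbol
    raise ValueError("coloring is not conflict-free for this receiver")


G_C = indicator_matrix(coloring, n)
all_ok = is_conflict_free(requests, coloring)
for x in product((0, 1), repeat=n):
    y = [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G_C]
    for j in requests:
        d, value = decode(j, y, {i: x[i - 1] for i in side_info[j]})
        all_ok = all_ok and d in requests[j] and value == x[d - 1]

print(G_C, all_ok)
```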

2.5. Index Coding via MDS Codes

The index coding model defined in Section 2.3, via graph theory, is only one of many approaches used to describe and solve an index coding problem. One of the most interesting index coding schemes is based on codes with the maximum distance separable (MDS) property: it consists of transmitting the $\kappa(G) = n - \operatorname{mindeg}(G)$ parity symbols of a systematic MDS code with parameters $(n + \kappa, n)$, where $\operatorname{mindeg}(G)$ represents the minimum amount of side information available at any receiver, i.e., for a general index coding problem with side information graph $G$,

$$\operatorname{mindeg}(G) \triangleq \min_{i \in [n]} |\{j ;\ (i, j) \in E(G)\}| = \min_{i \in [n]} |S_{i}|.$$

Then, every receiver has at least $n$ code symbols (including its side information) and, by the MDS property, it can successfully recover its desired message.

Proposition 1 ([1]). Consider an index coding problem with $n$ messages and $n$ receivers represented by the side information graph $G$. Let $S_{i}$ be the side information set of the receiver $R_{i}$, $i \in [n]$. Then,

$$\operatorname{minrk}_{q}(G) \leq \kappa(G) = n - \operatorname{mindeg}(G) = n - \min_{i \in [n]} |S_{i}|.$$

Corollary 1.

If $G$ is a complete graph, then $\operatorname{mindeg}(G) = n - 1$ and the transmission of the parity symbol $x_{1} + \dots + x_{n}$ of an $(n + 1, n)$ MDS code over $F_{q}$ achieves $\operatorname{minrk}_{q}(G) = 1$.
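As a small illustration of Proposition 1 and Corollary 1, the sketch below computes $\kappa(G)$ for the Figure 1 instance and for a complete side information graph, and checks the single-parity scheme in the binary case. The choice $n = 4$ for the complete graph and the restriction to $F_{2}$ are assumptions made for the example; the function names are hypothetical.

```python
from functools import reduce
from itertools import product
from operator import xor


def kappa(side_info):
    """kappa(G) = n - mindeg(G), with mindeg(G) = min_i |S_i|."""
    n = len(side_info)
    return n - min(len(S) for S in side_info.values())


# Figure 1 instance: mindeg = 1, so two transmissions suffice by Proposition 1.
figure1 = {1: {3}, 2: {1, 3}, 3: {1, 2}}

# Complete graph on n = 4 vertices: each receiver knows every other message.
n = 4
complete = {i: set(range(1, n + 1)) - {i} for i in range(1, n + 1)}


def parity(x):
    """Single parity symbol x1 + ... + xn over F_2."""
    return reduce(xor, x)


# Receiver i cancels its n - 1 known messages from the parity symbol.
all_ok = all(
    parity(x) ^ reduce(xor, (x[j] for j in range(n) if j != i), 0) == x[i]
    for x in product((0, 1), repeat=n)
    for i in range(n)
)

print(kappa(figure1), kappa(complete), all_ok)
```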

3. Results

The tree construction method proposed in [12] can be interpreted as a network coding problem with multiple sources and multiple users. In the proposed model, both encoding and decoding are performed by polynomial operations, without the need for side information; however, if side information exists, it may allow multiple interpretations at the receivers, depending on the side information available at each receiver. Figure 8 illustrates this system model.

Given the connection between network and index coding problems, established in [4], we can also interpret the coding with nested cyclic codes, at the stage where the packets are XOR-ed together, as a case of MDS index coding according to Corollary 1, in the particular case where each receiver is unaware only of the message it is requesting, which may be a rare occurrence. However, it is possible to take advantage of the method’s distinguishing feature, namely the possibility of multiple interpretations at the receivers, and, by imposing some extra conditions, design an index code model that has greater flexibility over the side information sets.

In the next subsections, we present some results and algorithm implementations, and in Section 4, we present in detail the proposed index coding with multiple interpretations.

3.1. Index Coding from Reed–Solomon Codes

We establish a connection between index coding and error-correcting codes based on the tree construction method of nested cyclic codes proposed in [12]. We implement a few algorithms to perform tree construction using the Matlab language, which allows us to work over finite fields in a practical and efficient way and helps to solve some implementation problems encountered in [12]. We prove that for cyclic codes, there will not always be an increase in error correction capability between the levels of the tree, as suggested in [12]. This is why we have initially limited this study to Reed–Solomon codes: they are MDS codes, which guarantees an increase in Hamming distance at each level, meaning that, under certain conditions, the knowledge of side information will be interpreted as an increase in the decoder’s ability to correct errors.

A Tree Construction with Nested Cyclic Codes

A nested code is characterized by a global code where each element is given by a sum of codewords, each belonging to a different subcode. That is,

$$c = i_{1} G_{1} \oplus i_{2} G_{2} \oplus \dots \oplus i_{N} G_{N},$$

where $\oplus$ represents an XOR operation. For an information vector $i_{\ell}$, $1 \leq \ell \leq N$, the codeword $i_{\ell} G_{\ell}$ belongs to a subcode $C_{\ell}$ of the code $C$, and $c \in C$.

Nested cyclic codes, whose subcodes are generated by generator polynomials, were proposed by Heegard [23], and were originally called partitioned linear block codes. They can be defined as follows:

Definition 12.

Let $C = \{C(x) \in F_{q}[x] ;\ g(x) \mid C(x)\}$ be a $t$-error-correcting cyclic code having $g(x)$ as the generator polynomial. Note that $C = \langle g(x) \rangle$ is an ideal of the ring $R_{n} = F_{q}[x] / (x^{n} - 1)$, but is also a vector subspace of $F_{q}^{n}$, such that

$$C(x) = p_{1}(x) g_{1}(x) + p_{2}(x) g_{2}(x) + \dots + p_{N}(x) g_{N}(x),$$

where $C_{\ell}(x) = p_{\ell}(x) g_{\ell}(x)$, $1 \leq \ell \leq N$, is an encoded packet belonging to the $t_{\ell}$-error-correcting subcode

$$C_{\ell} = \{C_{\ell}(x) \in F_{q}[x] ;\ g_{\ell}(x) \mid C_{\ell}(x)\},$$

generated by $g_{\ell}(x)$ and satisfying the following conditions:

1.: $g_{\ell}(x) \mid g_{\ell + 1}(x)$ ;
2.: $\deg[C_{\ell}(x)] < \deg[g_{\ell + 1}(x)]$ .

The tree-based algebraic construction of nested cyclic codes, proposed in [12], aims to

1.: Encode, independently, different data packets, providing protection against channel errors;
2.: Encode different data packets producing codewords that are added, resulting in the packet $C_{0}$ ;
3.: Correct the errors on $C_{0}$ and, finally, recover the data in the receiver by polynomial operations.

Consider a tree in which the root node is associated with the vector subspace of an encompassing error correcting code. Thus, the root node is defined as the code

C_{i 0}

, such that

C_{i 0} = 〈 g_{i 0} (x) 〉 = {C_{i 0} (x) \in F_{q} [x]; g_{i 0} (x) | C_{i 0} (x)} .

This subspace corresponds to a

t_{0}

-error-correcting cyclic code

C_{i 0} (n, k_{i 0})

, generated by the polynomial

g_{i 0} (x)

.

Definition 13.

A tree of nested cyclic codes is a finite tree such that

1.: Each inner node (including the root node) can be subdivided into another inner node and a terminal node;
2.: The jth inner node is associated with a linear subspace $C_{i j} \subset F_{q}^{n}$ of dimension $k_{i j}$ , and can be subdivided into the subspaces

$C_{i j} = C_{i (j + 1)} + C_{t (j + 1)}$ with $C_{i (j + 1)} \cap C_{t (j + 1)} = {0}$ and $k_{i j} = k_{i (j + 1)} + k_{t (j + 1)}$ ;
3.: The subspace $C_{i j}$ , associated with the jth inner node, must be a cyclic linear block code generated by $g_{i j} (x)$ ;
4.: If $C_{i j} = 〈 g_{i j} (x) 〉$ and $C_{i (j + 1)} = 〈 g_{i (j + 1)} (x) 〉$ , then $g_{i j} (x) | g_{i (j + 1)} (x)$ ; furthermore, $g_{i j} (x) | x^{n} - 1$ for any $g_{i j} (x)$ ;
5.: To conclude, the last inner node will have no ramifications.

Remark 5.

Figure 9 illustrates the model described above.

Let

p_{j} (x)

be the data packet associated with the terminal node, for

1 \leq j \leq T

. The encoding is given by

C_{j} (x) = p_{j} (x) g_{i (j - 1)} (x) .

Then, the encoded packets are summed up and the resulting codeword is sent out by the transmitter

C_{0} (x) = C_{1} (x) + C_{2} (x) + \dots + C_{T} (x) .

After the error correction phase, the jth packet

p_{j} (x)

is decoded by the operations:

p_{j} (x) = \{\begin{matrix} [C_{0} (x) mod g_{i j} (x)] / g_{i (j - 1)} (x) & if 1 \leq j < T, \\ C_{0} (x) / g_{i (T - 1)} (x) & if j = T . \end{matrix}

(1)

The information is contained in the remainder of the division of $C_{0} (x)$ by $g_{i j} (x)$, since the modulo operation eliminates the contribution of all packets encoded with generator polynomials of degree equal to or greater than the degree of $g_{i j} (x)$. The quotient of the final division then provides the desired information, since all remaining packets have degree less than the degree of the divisor polynomial. In the case of the last packet, only the division operation is required. We suggest consulting [12] for more details on the encoding process using the tree construction method.
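As a short check of why the two operations in (1) isolate $p_{j} (x)$, the following derivation may help. It uses only the nesting property $g_{i (m - 1)} (x) | g_{i m} (x)$ and the degree bound $deg [C_{m} (x)] < deg [g_{i m} (x)]$, rewritten in the tree notation; the summation index $m$ is introduced here for illustration only.

```latex
% Every packet C_m with m > j is a multiple of g_{ij}, so it vanishes modulo g_{ij};
% the packets with m <= j have degree below deg g_{ij} and are left unchanged:
C_{0}(x) \bmod g_{i j}(x) = \sum_{m=1}^{j} C_{m}(x), \qquad 1 \leq j < T .

% Packets with m < j have degree below deg g_{i(j-1)}, so they only form the remainder
% of the division by g_{i(j-1)}:
\sum_{m=1}^{j} C_{m}(x) = p_{j}(x)\, g_{i (j-1)}(x)
  + \underbrace{\sum_{m=1}^{j-1} C_{m}(x)}_{\deg \, < \, \deg g_{i (j-1)}} .

% Hence the quotient of that division is exactly p_j(x).
```

For $j = T$ there are no packets above the node, so the modulo step is unnecessary and the quotient of $C_{0} (x)$ by $g_{i (T - 1)} (x)$ is $p_{T} (x)$.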

3.2. Tree Construction: Algorithm and Considerations

We describe a few Matlab algorithms, available at [20], together with considerations for fitting them to the tree construction model. They allow us to perform the calculations over finite fields by making the appropriate transformations from the integer representation to powers of $α$. Below, we illustrate the main idea of the algorithm.

Example 6.

For $T = 3$, let $C_{i 0} (7, 5)$ be a Reed–Solomon code over $G F (8)$ and let $k_{t 1} = k_{t 2} = 2$ be the dimensions of the subspaces $C_{t 1}$ and $C_{t 2}$, respectively, which are associated with the terminal nodes of the tree. The last node of the tree, which is an inner node without ramification, is associated with $C_{i 2}$ of dimension $k_{i 2} = 1$.

The packets $p_{1} (x) = x + α^{2}$ and $p_{2} (x) = α^{3} x + α$, both associated with the terminal nodes, have length 2; $p_{3} (x) = α^{5}$ has length 1 and is associated with the last node. Let $α$ be a primitive element of $G F (8)$; the generator polynomials are

$deg (g_{i 0} (x)) = n - k_{i 0} = 2$ $\Rightarrow g_{i 0} (x) = \prod_{j = 1}^{2} (x - α^{j}) = x^{2} + α^{4} x + α^{3};$
$deg (g_{i 1} (x)) = n - k_{i 1} = 4$ $\Rightarrow g_{i 1} (x) = \prod_{j = 1}^{4} (x - α^{j}) = x^{4} + α^{3} x^{3} + x^{2} + α x + α^{3};$
$deg (g_{i 2} (x)) = n - k_{i 2} = 6$ $\Rightarrow g_{i 2} (x) = \prod_{j = 1}^{6} (x - α^{j}) = x^{6} + x^{5} + x^{4} + x^{3} + x^{2} + x + 1 .$

Then, the encoded packets are

$C_{1} (x) = p_{1} (x) g_{i 0} (x) = x^{3} + α x^{2} + α^{4} x + α^{5};$
$C_{2} (x) = p_{2} (x) g_{i 1} (x) = α^{3} x^{5} + α^{5} x^{4} + α^{6} x^{3} + α^{2} x^{2} + x + α^{4};$
$C_{3} (x) = p_{3} (x) g_{i 2} (x) = α^{5} x^{6} + α^{5} x^{5} + α^{5} x^{4} + α^{5} x^{3} + α^{5} x^{2} + α^{5} x + α^{5} .$

The transmitted codeword $C_{0} (x)$ is given by

$C_{0} (x) = C_{1} (x) + C_{2} (x) + C_{3} (x) = α^{5} x^{6} + α^{2} x^{5} + 0 x^{4} + α^{3} x^{3} + 1 x^{2} + 0 x + α^{4} .$
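The encoding in this example can be reproduced with a few lines of Python. This is only an illustrative sketch, not the Matlab code of [20]: it assumes that $G F (8)$ is built from the primitive polynomial $x^{3} + x + 1$ (a choice that is consistent with the coefficients above), stores field elements as 3-bit integers, and stores polynomials as coefficient lists with the lowest degree first. The helper names (`gf_mul`, `poly_mul`, `poly_add`, `rs_generator`) are hypothetical.

```python
# GF(8) as powers of alpha, assuming the primitive polynomial x^3 + x + 1
# (alpha^3 = alpha + 1); elements are stored as 3-bit integers.
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    """Multiply two GF(8) elements via discrete logarithms."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]

def poly_mul(p, q):
    """Product of two polynomials (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)      # addition in GF(8) is XOR
    return out

def poly_add(p, q):
    """Sum of two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) ^ (q[i] if i < len(q) else 0)
            for i in range(n)]

def rs_generator(num_roots):
    """g(x) = (x - alpha)(x - alpha^2)...(x - alpha^num_roots)."""
    g = [1]
    for j in range(1, num_roots + 1):
        g = poly_mul(g, [EXP[j], 1])        # x - alpha^j equals x + alpha^j here
    return g

g_i0, g_i1, g_i2 = rs_generator(2), rs_generator(4), rs_generator(6)

p1 = [EXP[2], 1]          # x + alpha^2
p2 = [EXP[1], EXP[3]]     # alpha^3 x + alpha
p3 = [EXP[5]]             # alpha^5

C1, C2, C3 = poly_mul(p1, g_i0), poly_mul(p2, g_i1), poly_mul(p3, g_i2)
C0 = poly_add(poly_add(C1, C2), C3)

# Coefficients of C_0(x), lowest degree first, printed as powers of alpha.
print(["0" if c == 0 else f"a^{LOG[c]}" for c in C0])
```

Reading the printed list from right to left gives the coefficients of $x^{6}$ down to $x^{0}$ of the transmitted codeword.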

Remark 6.

Each terminal node is a shortened version of the code associated with the inner node from which the terminal node emanates. It is implicit that the codewords of shortened codes are prefixed with zeros to achieve length n and, therefore, that these codes are not cyclic.

3.2.1. Decoding—Error Correction

Considering tree construction based on Reed–Solomon codes and assuming that the receiver has side information available, when will there be an increase in error correction capability?

Proposition 2.

Due to the nesting structure, the variable error correctability characteristic can only be observed if there is a sequential removal of the packets associated with the nodes from the root to the top of the tree.

Proof.

Supposing that

C_{ℓ} (x), 1 \leq ℓ \leq T

is the first coded packet known at the receiver, then

\begin{matrix} C_{0} (x) & = p_{1} (x) g_{i 0} (x) + \dots + p_{(ℓ - 1)} (x) g_{i (ℓ - 2)} (x) + p_{(ℓ + 1)} (x) g_{i ℓ} (x) + \dots + p_{T} (x) g_{i (T - 1)} (x) \\ = [p_{1} (x) + \dots + p_{(ℓ - 1)} (x) q_{(ℓ - 1)} (x) + p_{(ℓ + 1)} (x) q_{(ℓ + 1)} (x) + \dots + p_{T} (x) q_{T} (x)] g_{i 0} (x), \end{matrix}

therefore, $C_{0} (x) \in C_{i 0} (n, k_{i 0})$, whose error correction capability is $t_{0}$. Note that even if the receiver knows other packets $C_{j} (x), ℓ < j \leq T$, the result does not change. On the other hand, if all packets $C_{j} (x), 1 \leq j < ℓ$, are also known to the receiver, we can write

\begin{matrix} C_{0} (x) & = p_{(ℓ + 1)} (x) g_{i ℓ} (x) + \dots + p_{T} (x) g_{i (T - 1)} (x) \\ = [p_{(ℓ + 1)} (x) {\bar{q}}_{(ℓ + 1)} (x) + \dots + p_{T} (x) {\bar{q}}_{T} (x)] g_{i ℓ} (x), \end{matrix}

thus,

C_{0} (x) \in C_{i ℓ} (n, k_{i ℓ})

, whose error correction capability is

t_{ℓ} ⩾ t_{0}

, and equality occurs only when

d_{m i n} (C_{ℓ}) - d_{m i n} (C_{0}) < 2 .

□

Example 7.

Consider the same tree as in Example 6.

If all packets are unknown ⇒ the decoding is performed by $C_{i 0} (7, 5)$ ∴ $t_{0} = 1$ ;
If $C_{1} (x)$ is known ⇒ the decoding is performed by $C_{i 1} (7, 3)$ ∴ $t_{1} = 2$ ;
If $C_{1} (x)$ and $C_{2} (x)$ are known ⇒ the decoding is performed by $C_{i 2} (7, 1)$ ∴ $t_{2} = 3$ .

However, if $C_{2} (x)$ is known but $C_{1} (x)$ is not, then the resulting codeword still belongs to $C_{i 0} (7, 5)$, and there is no improvement in error correction capability, since

\begin{matrix} C_{0} (x) & = p_{1} (x) g_{i 0} (x) + p_{3} (x) g_{i 0} (x) \bar{q} (x) \\ = [p_{1} (x) + p_{3} (x) \bar{q} (x)] g_{i 0} (x) . \end{matrix}
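The membership claims of this example can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes $G F (8)$ with primitive polynomial $x^{3} + x + 1$, takes the coefficients of $g_{i 0}$, $g_{i 1}$, $C_{0}$, $C_{1}$ and $C_{2}$ from Example 6, and uses a hypothetical helper `poly_mod` to test divisibility by a monic generator.

```python
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i, alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) ^ (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mod(p, g):
    """Remainder of p(x) divided by a monic g(x); lists are lowest degree first."""
    r, dg = list(p), len(g) - 1
    for i in range(len(r) - 1, dg - 1, -1):
        c = r[i]
        if c:
            for j, b in enumerate(g):
                r[i - dg + j] ^= gf_mul(c, b)   # subtract c * x^(i-dg) * g(x)
    return r[:dg]

# Coefficients copied from Example 6 (alpha^k written as EXP[k]).
g_i0 = [EXP[3], EXP[4], 1]
g_i1 = [EXP[3], EXP[1], 1, EXP[3], 1]
C0 = [EXP[4], 0, 1, EXP[3], 0, EXP[2], EXP[5]]
C1 = [EXP[5], EXP[4], EXP[1], 1]
C2 = [EXP[4], 1, EXP[2], EXP[6], EXP[5], EXP[3]]

after_C1 = poly_add(C0, C1)   # receiver knows C_1 and removes it
after_C2 = poly_add(C0, C2)   # receiver knows only C_2 and removes it

in_Ci1_after_C1 = not any(poly_mod(after_C1, g_i1))   # multiple of g_i1?
in_Ci1_after_C2 = not any(poly_mod(after_C2, g_i1))
in_Ci0_after_C2 = not any(poly_mod(after_C2, g_i0))

print(in_Ci1_after_C1, in_Ci1_after_C2, in_Ci0_after_C2)
```

Removing $C_{1} (x)$ leaves a multiple of $g_{i 1} (x)$, whereas removing only $C_{2} (x)$ leaves a word that is a multiple of $g_{i 0} (x)$ but not of $g_{i 1} (x)$, which is the behavior described in Proposition 2.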

Another advantage of Reed–Solomon codes is that they are easily decoded using an algebraic method known as syndrome decoding.

Syndrome Decoding

Syndrome decoding is an algebraic method based on the Berlekamp–Massey algorithm, which became a prototype for the decoding of many other linear codes.

If the coded packet $C_{1} (x)$ is known and an error $e (x)$ occurs, then the received word, once the known packet has been removed, will be

\begin{matrix} r (x) & = C_{0} (x) + C_{1} (x) + e (x) \end{matrix}

Suppose that the error is given by

e (x) = 0 x^{6} + 0 x^{5} + α^{2} x^{4} + α^{5} x^{3} + 0 x^{2} + 0 x + 0

, and then

\begin{matrix} r (x) & = α^{5} x^{6} + α^{2} x^{5} + α^{2} x^{4} + α^{6} x^{3} + α^{3} x^{2} + α^{4} x + 1 . \end{matrix}

Remark 7.

Notice that we need to find the error locations and their values, which is the main difference with binary codes, since, for binary codes, it is enough to determine the error locations.

The decoding process can be divided into three stages.

1.: Syndrome calculation

The syndrome calculation stage consists of evaluating $r (x)$ at the roots of the generator polynomial. If every result is null, the sequence belongs to the set of codewords and, therefore, there are no errors. Any nonzero value indicates the presence of an error.

If the encoded packet $C_{1} (x)$ is known, then the error correction algorithm is executed by $C_{i 1}$, which is an $R S (7, 3)$ code generated by $g_{i 1} (x) = (x - α) (x - α^{2}) (x - α^{3}) (x - α^{4})$. Then,

\begin{matrix} r (x) & = m (x) g_{i 1} (x) + e (x), \end{matrix}

Therefore, when $r (x)$ is evaluated at the roots of $g_{i 1} (x)$, the result will be null only when there are no errors in the transmission.

∴ S_{i} = r (x) |_{x = α^{i}} = e (α^{i}), \forall i = 1, \dots, n - k = 2 t .

\begin{matrix} S_{1} = r (α) = α^{5}; & S_{3} = r (α^{3}) = 0; \\ S_{2} = r (α^{2}) = α^{6}; & S_{4} = r (α^{4}) = α^{6} . \end{matrix}
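The four syndromes can be recomputed by evaluating $r (x)$ at $α, α^{2}, α^{3}, α^{4}$. The following minimal sketch assumes $G F (8)$ with primitive polynomial $x^{3} + x + 1$ and uses a hypothetical Horner-style helper `poly_eval`; it is an illustration only.

```python
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i, alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def poly_eval(p, x):
    """Horner evaluation of p at x; p is a coefficient list, lowest degree first."""
    acc = 0
    for c in reversed(p):
        acc = gf_mul(acc, x) ^ c
    return acc

# r(x) = a^5 x^6 + a^2 x^5 + a^2 x^4 + a^6 x^3 + a^3 x^2 + a^4 x + 1
r = [1, EXP[4], EXP[3], EXP[6], EXP[2], EXP[2], EXP[5]]

syndromes = [poly_eval(r, EXP[i]) for i in range(1, 5)]   # S_1 .. S_4
print(["0" if s == 0 else f"a^{LOG[s]}" for s in syndromes])
```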

2.: Error Localization

Let $μ$ be the number of errors, $0 \leq μ \leq t$, which occur at locations $ℓ_{1}, \dots, ℓ_{μ}$. The error polynomial can then be written as

e (x) = e_{ℓ_{1}} x^{ℓ_{1}} + \dots + e_{ℓ_{μ}} x^{ℓ_{μ}} .

To correct

r (x)

, we must find the values and locations of the errors, which are denoted, respectively, by

e_{ℓ_{1}}, \dots, e_{ℓ_{μ}}

and

x^{ℓ_{1}}, \dots, x^{ℓ_{μ}}

. Substituting

α^{j}, 1 \leq j \leq 2 t

, into the error polynomial

e (x)

, we obtain

\{\begin{matrix} \begin{matrix} S_{1} & = & e (α) & = & e_{ℓ_{1}} α^{ℓ_{1}} + e_{ℓ_{2}} α^{ℓ_{2}} + \dots + e_{ℓ_{μ}} α^{ℓ_{μ}} \\ S_{2} & = & e (α^{2}) & = & e_{ℓ_{1}} {(α^{ℓ_{1}})}^{2} + e_{ℓ_{2}} {(α^{ℓ_{2}})}^{2} + \dots + e_{ℓ_{μ}} {(α^{ℓ_{μ}})}^{2} \\ ⋮ \\ S_{2 t} & = & e (α^{2 t}) & = & e_{ℓ_{1}} {(α^{ℓ_{1}})}^{2 t} + e_{ℓ_{2}} {(α^{ℓ_{2}})}^{2 t} + \dots + e_{ℓ_{μ}} {(α^{ℓ_{μ}})}^{2 t} \end{matrix} \end{matrix} .

Obtain

X_{i} = α^{ℓ_{i}}

and

Y_{i} = e_{ℓ_{i}}

for

1 \leq i \leq μ

, where

X_{i}

and

Y_{i}

will represent, respectively, the locations and values of the errors. Note that we will have

2 t

equations and

2 t

unknowns, t being error values and t being locations.

\{\begin{matrix} \begin{matrix} S_{1} & = & Y_{1} X_{1} + Y_{2} X_{2} + \dots + Y_{μ} X_{μ} \\ S_{2} & = & Y_{1} {X_{1}}^{2} + Y_{2} {X_{2}}^{2} + \dots + Y_{μ} X_{μ}^{2} \\ ⋮ \\ S_{2 t} & = & Y_{1} {X_{1}}^{2 t} + Y_{2} {X_{2}}^{2 t} + \dots + Y_{μ} X_{μ}^{2 t} \end{matrix} \end{matrix} .

It can be shown that this nonlinear system has a unique solution if

0 \leq μ \leq t

[15]. The techniques that solve this system of equations include defining the error locator polynomial (ELP)

σ (z)

[24].

Definition 14.

Define the error locator polynomial $σ (z)$ as

\begin{matrix} σ (z) & = & (1 - X_{1} z) (1 - X_{2} z) \dots (1 - X_{μ} z) \\ = & σ_{μ} z^{μ} + \dots + σ_{2} z^{2} + σ_{1} z + 1 . \end{matrix}

The roots of $σ (z)$ are $1 / X_{1}, \dots, 1 / X_{μ}$; their inverses indicate the locations of the errors.

To find the error locations $X_{i}, 1 \leq i \leq μ$, we first determine the coefficients $σ_{1}, σ_{2}, \dots, σ_{μ}$ and then calculate the zeros of $σ (z)$. To find the coefficients, we use a syndrome matrix, as shown below:

\begin{matrix} [\begin{matrix} S_{1} & S_{2} & \dots & S_{μ} \\ S_{2} & S_{3} & \dots & S_{μ + 1} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ S_{μ} & S_{μ + 1} & \dots & S_{2 μ - 1} \end{matrix}] [\begin{matrix} σ_{μ} \\ σ_{μ - 1} \\ ⋮ \\ σ_{1} \end{matrix}] & = [\begin{matrix} - S_{μ + 1} \\ - S_{μ + 2} \\ ⋮ \\ - S_{2 μ} \end{matrix}] \end{matrix}

Returning to the $R S (7, 3)$ code, where the error correction capability is $t = 2$, we must find $σ_{1}$ and $σ_{2}$:

\begin{matrix} [\begin{matrix} S_{1} & S_{2} \\ S_{2} & S_{3} \end{matrix}] [\begin{matrix} σ_{2} \\ σ_{1} \end{matrix}] & = [\begin{matrix} S_{3} \\ S_{4} \end{matrix}] \\ [\begin{matrix} α^{5} & α^{6} \\ α^{6} & 0 \end{matrix}] [\begin{matrix} σ_{2} \\ σ_{1} \end{matrix}] & = [\begin{matrix} 0 \\ α^{6} \end{matrix}] \Rightarrow σ_{2} = 1 \text{ and } σ_{1} = α^{6} \end{matrix}

Thus, $σ (z) = z^{2} + α^{6} z + 1$, with roots $α^{3}$ and $α^{4}$, so there are errors at the locations $α^{- 3} = α^{4}$ and $α^{- 4} = α^{3}$. Then,

e (x) = e_{3} x^{3} + e_{4} x^{4} .
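The error localization step can be sketched as follows. The code assumes $G F (8)$ with primitive polynomial $x^{3} + x + 1$, solves the $2 \times 2$ syndrome system by Cramer's rule (signs are dropped because the field has characteristic 2) and finds the roots of $σ (z)$ by trying every nonzero field element. It replaces the Berlekamp–Massey procedure with a direct solve purely for illustration, and the helper `gf_div` is hypothetical.

```python
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i, alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def gf_div(a, b):
    """a / b in GF(8); b must be nonzero."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]

S1, S2, S3, S4 = EXP[5], EXP[6], 0, EXP[6]

# Solve [[S1, S2], [S2, S3]] [sigma2, sigma1]^T = [S3, S4]^T by Cramer's rule.
det = gf_mul(S1, S3) ^ gf_mul(S2, S2)
sigma2 = gf_div(gf_mul(S3, S3) ^ gf_mul(S2, S4), det)
sigma1 = gf_div(gf_mul(S1, S4) ^ gf_mul(S2, S3), det)

def sigma(z):
    """sigma(z) = sigma2 z^2 + sigma1 z + 1."""
    return gf_mul(sigma2, gf_mul(z, z)) ^ gf_mul(sigma1, z) ^ 1

# Exhaustive root search over z = alpha^i, i = 0..6.
root_exponents = [i for i in range(7) if sigma(EXP[i]) == 0]
# Each root is 1/X, so the error position is the negated exponent modulo 7.
locations = sorted((-i) % 7 for i in root_exponents)

print(sigma2, sigma1, root_exponents, locations)
```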

3.: Determining the error values

Calculating

e (x) = e_{3} x^{3} + e_{4} x^{4}

at the points

α

and

α^{2}

, we can use the syndromes already obtained,

S_{1}

and

S_{2}

, to determine the values of the errors, solving the following system:

\{\begin{matrix} \begin{matrix} S_{1} & = & e (α) & = & e_{3} α^{3} + e_{4} α^{4} \\ S_{2} & = & e (α^{2}) & = & e_{3} α^{6} + e_{4} α^{8} \end{matrix} \end{matrix}

Therefore, the error polynomial is given by

e (x) = α^{5} x^{3} + α^{2} x^{4} .

Now, correcting the received word

r (x)

, we have

\begin{matrix} C_{0} (x) & = r (x) + C_{1} (x) + e (x) \\ = α^{5} x^{6} + α^{2} x^{5} + 0 x^{4} + α^{3} x^{3} + 1 x^{2} + 0 x + α^{4} . \end{matrix}
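The last stage can be verified in the same way. The sketch below, again assuming $G F (8)$ with primitive polynomial $x^{3} + x + 1$, solves the two-equation system for $e_{3}$ and $e_{4}$ by Cramer's rule and then adds $C_{1} (x)$ and $e (x)$ back to $r (x)$. Variable names are illustrative.

```python
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i, alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def gf_div(a, b):
    """a / b in GF(8); b must be nonzero."""
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 7]

S1, S2 = EXP[5], EXP[6]
X3, X4 = EXP[3], EXP[4]                     # error locators alpha^3 and alpha^4

# Solve S1 = e3*X3 + e4*X4 and S2 = e3*X3^2 + e4*X4^2 by Cramer's rule.
a, b = X3, X4
c, d = gf_mul(X3, X3), gf_mul(X4, X4)
det = gf_mul(a, d) ^ gf_mul(b, c)
e3 = gf_div(gf_mul(S1, d) ^ gf_mul(b, S2), det)
e4 = gf_div(gf_mul(a, S2) ^ gf_mul(c, S1), det)

# Coefficient lists, lowest degree first, padded to length n = 7.
r = [1, EXP[4], EXP[3], EXP[6], EXP[2], EXP[2], EXP[5]]
C1 = [EXP[5], EXP[4], EXP[1], 1, 0, 0, 0]
e = [0, 0, 0, e3, e4, 0, 0]

# In characteristic 2, subtracting e(x) and restoring C_1(x) are both XORs.
C0 = [ri ^ ci ^ ei for ri, ci, ei in zip(r, C1, e)]
print(["0" if v == 0 else f"a^{LOG[v]}" for v in C0])
```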

3.2.2. Decoding—Data Recovery

Example 8.

For the cases in the previous examples, where

T = 3

, the original data can be recovered as follows:

\begin{matrix} • p_{1} (x) & = \frac{C_{0} (x) mod g_{i 1} (x)}{g_{i 0} (x)} = \frac{p_{1} g_{i 0} mod g_{i 1}}{g_{i 0} (x)} = \frac{p_{1} (x) g_{i 0} (x)}{g_{i 0} (x)}; \\ • p_{2} (x) & = \frac{C_{0} (x) mod g_{i 2} (x)}{g_{i 1} (x)} = \frac{p_{1} g_{i 0} mod g_{i 2} + p_{2} g_{i 1} mod g_{i 2}}{g_{i 1} (x)}; \\ • p_{3} (x) & = \frac{C_{0} (x)}{g_{i 2} (x)} = \frac{p_{1} (x) g_{i 0} (x)}{g_{i 2} (x)} + \frac{p_{2} (x) g_{i 1} (x)}{g_{i 2} (x)} + \frac{p_{3} (x) g_{i 2} (x)}{g_{i 2} (x)}; \end{matrix}

In summary, the modulo operation removes the branches above the node of interest and the division operation removes the branches below. Therefore, no side information is needed at the receiver in order to recover the data packets.
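A minimal sketch of the recovery operations for the Reed–Solomon tree of Example 6 is given below. It assumes $G F (8)$ with primitive polynomial $x^{3} + x + 1$ and relies on the generators being monic, so that polynomial division needs no field inversion; `poly_divmod` is a hypothetical helper and not part of the code in [20].

```python
EXP = [1, 2, 4, 3, 6, 7, 5]                 # EXP[i] = alpha^i, alpha^3 = alpha + 1
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % 7]

def poly_divmod(p, g):
    """Quotient and remainder of p(x) / g(x) for monic g; lowest degree first."""
    r, dg = list(p), len(g) - 1
    q = [0] * max(len(r) - dg, 1)
    for i in range(len(r) - 1, dg - 1, -1):
        c = r[i]
        q[i - dg] = c
        if c:
            for j, b in enumerate(g):
                r[i - dg + j] ^= gf_mul(c, b)
    return q, r[:dg]

g_i0 = [EXP[3], EXP[4], 1]
g_i1 = [EXP[3], EXP[1], 1, EXP[3], 1]
g_i2 = [1] * 7
C0 = [EXP[4], 0, 1, EXP[3], 0, EXP[2], EXP[5]]

# p_1: remainder modulo g_i1, then quotient by g_i0.
p1 = poly_divmod(poly_divmod(C0, g_i1)[1], g_i0)[0]
# p_2: remainder modulo g_i2, then quotient by g_i1.
p2 = poly_divmod(poly_divmod(C0, g_i2)[1], g_i1)[0]
# p_3: quotient by g_i2 only (last node).
p3 = poly_divmod(C0, g_i2)[0]

print(p1, p2, p3)
```

The three results correspond to $x + α^{2}$, $α^{3} x + α$ and $α^{5}$, written as integer coefficient lists with the lowest degree first.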

Will There Always Be an Increase in Error Correction Capability?

We analyze two cases of tree construction of nested cyclic codes with the same parameters at each level. In one of them, we observe no increase in the error correction capability from the second to the last internal node of the tree. This is due to the variety of possible generator polynomials for a cyclic code with parameters $(n, k)$. As a result, we demonstrate in Proposition 3 that, for Reed–Solomon codes, the increase in error correction capability is guaranteed provided that:

k_{i j} - k_{i (j + 1)} \geq 2, \forall j = 0, \dots, T - 1 .

Example 9.

Let $C_{i 0} (15, 10)$ be a cyclic code over $G F (2)$ and let $k_{t 1} = 4, k_{t 2} = 2$ be the dimensions of the subspaces $C_{t 1}, C_{t 2}$, respectively. The last node is associated with $C_{i 2}$ with dimension $k_{i 2} = 4$. The construction is depicted in Figure 10.

We consider the factorization:

x^{15} - 1 = (1 + x) (1 + x^{3} + x^{4}) (1 + x + x^{2} + x^{3} + x^{4}) (1 + x + x^{2}) (1 + x + x^{4})

Case 1.

•	$g_{i 0} (x) = (1 + x) (1 + x^{3} + x^{4})$	⇒	$t_{0} = 1;$
•	$g_{i 1} (x) = g_{i 0} (x) (1 + x + x^{2} + x^{3} + x^{4})$	⇒	$t_{1} = 2;$
•	$g_{i 2} (x) = g_{i 1} (x) (1 + x + x^{2})$	⇒	$t_{2} = 3 .$

Case 2.

•	$g_{i 0} (x) = (1 + x) (1 + x + x^{4})$	$\Rightarrow$	$t_{0} = 1;$
•	$g_{i 1} (x) = g_{i 0} (x) (1 + x^{3} + x^{4})$	$\Rightarrow$	$t_{1} = 2;$
•	$g_{i 2} (x) = g_{i 1} (x) (1 + x + x^{2})$	$\Rightarrow$	$t_{2} = 2 .$

Remark 8.

We have provided an m-file algorithm at [20], which can be run through Matlab and performs the operations described in Examples 8 and 9.
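Independently of the Matlab file mentioned above, the two cases of Example 9 can be checked by brute force, since the largest code has only $2^{10}$ codewords. The Python sketch below is an illustration under stated assumptions: binary polynomials are stored as integers (bit $i$ is the coefficient of $x^{i}$), and the names `pmul`, `min_distance` and `capabilities` are hypothetical.

```python
def pmul(a, b):
    """Carry-less product of two GF(2) polynomials stored as integers."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def min_distance(g, n=15):
    """Minimum weight over all nonzero codewords m(x) g(x) with deg m < k."""
    k = n - (g.bit_length() - 1)
    return min(bin(pmul(m, g)).count("1") for m in range(1, 1 << k))

# Irreducible factors of x^15 - 1 over GF(2), bit i = coefficient of x^i.
F1 = 0b11        # 1 + x
F34 = 0b11001    # 1 + x^3 + x^4
F1234 = 0b11111  # 1 + x + x^2 + x^3 + x^4
F12 = 0b111      # 1 + x + x^2
F14 = 0b10011    # 1 + x + x^4

def capabilities(levels):
    """Error correction capability t = floor((d - 1) / 2) at each tree level."""
    g, ts = 1, []
    for factors in levels:
        for f in factors:
            g = pmul(g, f)
        ts.append((min_distance(g) - 1) // 2)
    return ts

case1 = capabilities([(F1, F34), (F1234,), (F12,)])
case2 = capabilities([(F1, F14), (F34,), (F12,)])
print(case1, case2)
```

In the second case, the generator of the last level is $(1 + x) (1 + x^{5} + x^{10})$, which has weight 6, so the minimum distance cannot grow beyond that of the previous level.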

Proposition 3.

Given an $(n, k)$ Reed–Solomon code, which has minimum distance $d = n - k + 1$, one can guarantee an increase in error correction capability at each level of the tree provided that $k_{i j} - k_{i (j + 1)} \geq 2, \forall j = 0, \dots, T - 1$.

Proof.

We must prove that

t_{i (j + 1)} \geq t_{i j} + 1, \forall j = 0, \dots, T - 1

. For simplicity but without loss of generality, set

j = 0

. If

k_{i 0} - k_{i 1} \geq 2

, then we can write:

\begin{matrix} (- d_{i 0} + n + 1) + d_{i 1} - n - 1 & \geq 2 \\ d_{i 1} - 1 & \geq d_{i 0} - 1 + 2 \\ \lfloor \frac{d_{i 1} - 1}{2} \rfloor & \geq \lfloor \frac{d_{i 0} - 1}{2} \rfloor + 1 \\ t_{i 1} & \geq t_{i 0} + 1 . \end{matrix}

This completes the proof. □
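The condition of Proposition 3 is easy to illustrate numerically. The short sketch below uses only the MDS relation $d = n - k + 1$; the function name `rs_capability` and the second chain of dimensions are illustrative choices that do not come from the text.

```python
def rs_capability(n, k):
    """t = floor((d - 1) / 2) for an (n, k) MDS code with d = n - k + 1."""
    return (n - k) // 2

# Chain of Example 6: dimensions drop by 2 at every level.
steps_of_two = [rs_capability(7, k) for k in (5, 3, 1)]

# Hypothetical chain in which the dimension drops by only 1 at every level.
steps_of_one = [rs_capability(7, k) for k in (5, 4, 3)]

print(steps_of_two, steps_of_one)
```

With a drop of 2 in dimension, the capability increases at every level, whereas a drop of 1 can leave it unchanged from one level to the next.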

The observation that, for cyclic codes, the error correction capability does not always increase between the levels of the tree, in contrast to what is considered in [12], leads us to ask how to choose the generator polynomials of an $(n, k)$ code and of its subcodes so that the subcodes have larger Hamming distance and the error correction capability increases between the levels of the tree. An approach to constructing chains of some linear block codes while keeping the minimum distances of the generated subcodes as large as possible is presented in [25] and may be the solution to this problem.

3.3. An Example with a BCH Code

According to Luo and Vinck [25], constructing a chain of BCH subcodes that keeps the minimum distance as large as possible is more difficult, because the subcodes may be neither BCH nor cyclic codes, and therefore their minimum distance might not be found easily. However, for primitive BCH codes, the minimum distance coincides with the weight of the generator polynomial, which makes it feasible to use them for the construction of the nested subcode chain that we seek. For non-primitive BCH codes, this statement is not always valid. For an extensive description of the minimum distance of BCH codes, we recommend consulting [26].

In Table 1, we present the parameters for binary primitive BCH codes of length

n = 2^{m} - 1

; it will guide the tree construction.

Example 10.

Consider the root node associated with the BCH code $C_{i 0} (15, 11)$. Suppose that we want to encode the packets $p_{1} (x) = x^{3} + x^{2} + x$, $p_{2} (x) = x$ and $p_{3} (x) = x^{4} + x^{2} + 1$ associated with nodes whose dimensions are $k_{t 1} = 4$, $k_{t 2} = 2$ and $k_{i 2} = 5$, respectively. The polynomials $g_{i 0} (x) = x^{4} + x + 1$, $g_{i 1} (x) = x^{8} + x^{7} + x^{6} + x^{4} + 1$ and $g_{i 2} (x) = x^{10} + x^{8} + x^{5} + x^{4} + x^{2} + x + 1$ generate the codes associated with the internal nodes, $C_{i 0} (15, 11)$, $C_{i 1} (15, 7)$ and $C_{i 2} (15, 5)$, respectively, as shown in Figure 11.

Encoding the packets, we have:

$C_{1} (x) = p_{1} (x) g_{i 0} (x) = x^{7} + x^{6} + x^{5} + x^{4} + x;$
$C_{2} (x) = p_{2} (x) g_{i 1} (x) = x^{9} + x^{8} + x^{7} + x^{5} + x;$
$C_{3} (x) = p_{3} (x) g_{i 2} (x) = x^{14} + x^{9} + x^{7} + x^{4} + x^{3} + x + 1 .$

The transmitted codeword

C_{0} (x)

is given by:

\begin{matrix} C_{0} (x) & = & C_{1} (x) + C_{2} (x) + C_{3} (x) \\ = & x^{14} + x^{8} + x^{7} + x^{6} + x^{3} + x + 1 \end{matrix}

Alternatively, it is possible to represent the codeword in vector form:

C_{0} (x) = (1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1)

After the error correction process, which will be performed on the sum of the coded packets, taking into account the side information available at each receiver, data recovery will occur as follows:

\begin{matrix} • p_{1} (x) & = \frac{(x^{14} + x^{8} + x^{7} + x^{6} + x^{3} + x + 1) mod (x^{8} + x^{7} + x^{6} + x^{4} + 1)}{(x^{4} + x + 1)}; \\ • p_{2} (x) & = \frac{(x^{14} + x^{8} + x^{7} + x^{6} + x^{3} + x + 1) mod (x^{10} + x^{8} + x^{5} + x^{4} + x^{2} + x + 1)}{(x^{8} + x^{7} + x^{6} + x^{4} + 1)}; \\ • p_{3} (x) & = \frac{(x^{14} + x^{8} + x^{7} + x^{6} + x^{3} + x + 1)}{(x^{10} + x^{8} + x^{5} + x^{4} + x^{2} + x + 1)} . \end{matrix}
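The encoding and recovery steps of this example can be reproduced with integer bit masks, where bit $i$ holds the coefficient of $x^{i}$. This Python sketch is only an illustration and is not the Matlab code of [20]; the helper names `pmul` and `pdivmod` are hypothetical.

```python
def pmul(a, b):
    """Carry-less product of two GF(2) polynomials stored as integers."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def pdivmod(a, b):
    """Quotient and remainder of GF(2) polynomial division a(x) / b(x)."""
    q, db = 0, b.bit_length() - 1
    while a.bit_length() > db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

g_i0 = 0b10011          # x^4 + x + 1
g_i1 = 0b111010001      # x^8 + x^7 + x^6 + x^4 + 1
g_i2 = 0b10100110111    # x^10 + x^8 + x^5 + x^4 + x^2 + x + 1

p1, p2, p3 = 0b1110, 0b10, 0b10101   # x^3+x^2+x, x, x^4+x^2+1

# Encode each packet with the generator of its parent inner node and add them.
C1, C2, C3 = pmul(p1, g_i0), pmul(p2, g_i1), pmul(p3, g_i2)
C0 = C1 ^ C2 ^ C3

# Recovery by the modulo and division operations of (1).
r1 = pdivmod(pdivmod(C0, g_i1)[1], g_i0)[0]
r2 = pdivmod(pdivmod(C0, g_i2)[1], g_i1)[0]
r3 = pdivmod(C0, g_i2)[0]

print(format(C0, "015b"))            # coefficients of x^14 down to x^0
print(r1 == p1, r2 == p2, r3 == p3)
```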

Remark 9.

We have made available at [20] an m-file Matlab algorithm that performs the tree construction operations and the data recovery for the BCH code.

4. Index Coding with Multiple Interpretations

In the problem of index coding with multiple interpretations, we assume that receivers demand all the messages from the source and that the sender is unaware of the subset of messages already known at each receiver. The sender therefore performs an encoding such that any side information may be used by the decoder to increase its error correction capability. If a receiver has no side information, the decoder considers the received word to belong to the highest-rate code, which is associated with the root node of the tree.

The proposed encoding process is shown in Figure 12 and can be performed in four main steps:

1.: Encoding of the different data packets with nested cyclic codes, which consists of subdividing the vector space of a linear block code into vector subspaces, using each of them for encoding a different user;
2.: Implementation of index coding at the relay node; the basic idea is that the different data packets, encoded by polynomial multiplications with linearly independent generators, are added and then forwarded to the receivers;
3.: Multiple interpretations at the receivers that occur at the error correction stage, where each receiver can decode the received message at different rates depending on the known side information;
4.: The data recovery stage, i.e., the process of decoding $C_{1} (x), \dots, C_{T} (x)$ through polynomial operations (1), as described in Section 3.1.

The notion of multiple interpretations was introduced in [13], indicating that the error correction capability in decoding gradually improves as the amount of side information available at the receiver increases. However, as we prove in Proposition 2, because of the nested structure of the tree, this variable error correction capability can only be observed if there is a sequential removal of the packets associated with the nodes, i.e., the side information must be available sequentially from the root to the top of the tree. In practice, this is not always the case. Thus, if we want to ensure that any information can be used efficiently in the decoder, it will be necessary to assume knowledge of the side information by the relay node, or even of the demand set if we have a PICOD problem.

The following is a proposal for pliable index coding with multiple interpretations.

Pliable Index Coding with Multiple Interpretations

As in the pliable index coding problem [6], we will assume that the transmitter knows the demand set of each receiver and that all receivers are satisfied by receiving any message contained in their demand set. For example, if we are searching on the internet for a red flower image and we already have some previously downloaded pictures on our computer, if we find any other image that we do not have yet, we will be satisfied.

The goal of the server is to find an encoding scheme that satisfies all receivers with as few transmissions as possible, while ensuring that all side information associated with nodes located below the node of the packet to be recovered may be interpreted as a gain in error correction capability, even when it does not appear in such a sequence.

The idea behind this proposal is to apply conflict-free coloring to the hypergraph that represents the demands of all receivers. Instead of sending the encoded word $C_{0} (x) = C_{1} (x) + C_{2} (x) + \dots + C_{T} (x)$, we select the packets in a way that maximizes the possibility of a gain in error correction capability, since, as mentioned above, known packets only improve the error correction capability if they occur sequentially.

Example 11.

Consider the instance of pliable index coding with multiple interpretations in Figure 13a, where the encoded packets $C_{1} (x) = g_{i 0} (x) p_{1} (x), C_{2} (x) = g_{i 1} (x) p_{2} (x), C_{3} (x) = g_{i 2} (x) p_{3} (x)$ and $C_{4} (x) = g_{i 3} (x) p_{4} (x)$ will be sent to receivers $R_{1}, R_{2}$ and $R_{3}$, which have demand sets $I_{1} = {1, 2}$, $I_{2} = {3}$ and $I_{3} = {2, 4}$, respectively, as we see in Figure 13b.

Figure 14 shows conflict-free coloring with two colors and

G_{c} = [\begin{matrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 \end{matrix}]

, which represents the pliable index code.

Note that if we send only the message

C_{0} = C_{1} \oplus C_{3} \oplus C_{4}

, all receivers recover one and only one message from their request set, as we can see in Table 2.
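The conflict-free property used here can be checked mechanically. In the following sketch, the demand sets are taken from the example (with $R_{3}$ demanding messages 2 and 4), the rows of $G_{c}$ are read as color classes, and the function name `is_conflict_free` is a hypothetical choice; it is an illustration rather than a general PICOD solver.

```python
demand = {"R1": {1, 2}, "R2": {3}, "R3": {2, 4}}

# Rows of G_c: each row marks the messages that share one color.
G_c = [[1, 0, 1, 1],
       [0, 1, 0, 0]]
color_classes = [{j + 1 for j, bit in enumerate(row) if bit} for row in G_c]

def is_conflict_free(classes, demands):
    """Every demand set meets at least one color class in exactly one message."""
    return all(any(len(cls & d) == 1 for cls in classes)
               for d in demands.values())

# Transmission C_0 = C_1 + C_3 + C_4 corresponds to the first color class.
sent = color_classes[0]
decoded = {name: sorted(sent & d) for name, d in demand.items()}

print(is_conflict_free(color_classes, demand), decoded)
```

Each receiver finds exactly one of its demanded messages in the transmitted sum, which is the PICOD(1) behavior discussed next.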

Depending on the problem, this would be an ideal solution, since the transmitter may want each receiver to decode only one message. In that case, we would have a PICOD(1) problem, in which no client can receive more than one message from its request set. The case of PICOD(1) is dealt with in detail in [27], and the following example illustrates its use.

Consider a media service provider whom we pay for movies. The provider has a set of movies and customers pay for a certain number of movies, e.g., one movie. Suppose that the service is being sold in such a way that customers will be happy to receive any movie that they have not watched yet. There is a restriction on the service provider’s side, since customers who have paid for only one movie should not receive more than one. Therefore, it can only supply one film for each client.

5. Conclusions

The observation that, for cyclic codes, the error correction capability does not always increase between the levels of the tree leads us to search for ways to correctly choose the generator polynomials of a code and its subcodes, in order to guarantee subcodes with larger Hamming distance and an increase in error correction capability in consecutive levels of the tree. A method for the construction of chains of some linear block codes that keeps the minimum distances of the generated subcodes as large as possible is presented in [25] and may be useful in addressing this issue.

Our work deals with the construction of index coding. We treat index coding as a network coding problem and we show how it is possible to construct pliable index codes with multiple interpretations by exploiting the conflict-free coloring of a hypergraph. Studying conflict-free coloring of a hypergraph in the context of the general index coding problem seems to be an interesting direction for future studies.

Author Contributions

Conceptualization, V.G.P. and M.H.M.C.; Formal analysis, V.G.P. and M.H.M.C.; Investigation, V.G.P. and M.H.M.C.; Methodology, V.G.P. and M.H.M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001.

Conflicts of Interest

The authors declare no conflict of interest.

References

Birk, Y.; Kol, T. Informed-source coding-on-demand (ISCOD) over broadcast channels. In Proceedings of the IEEE INFOCOM’98, the Conference on Computer Communications, Seventeenth Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, USA, 29 March–2 April 1998; Volume 3, pp. 1257–1264.
Bar-Yossef, Z.; Birk, Y.; Jayram, T.S.; Kol, T. Index Coding With Side Information. IEEE Trans. Inf. Theory 2011, 57, 1479–1494.
Unal, S.; Wagner, A.B. A Rate–Distortion Approach to Index Coding. IEEE Trans. Inf. Theory 2016, 62, 6359–6378.
El Rouayheb, S.; Sprintson, A.; Georghiades, C. On the index coding problem and its relation to network coding and matroid theory. IEEE Trans. Inf. Theory 2010, 56, 3187–3195.
Effros, M.; El Rouayheb, S.; Langberg, M. An Equivalence Between Network Coding and Index Coding. IEEE Trans. Inf. Theory 2015, 61, 2478–2487.
Brahma, S.; Fragouli, C. Pliable Index Coding. IEEE Trans. Inf. Theory 2015, 61, 6192–6203.
Kurose, J.F.; Ross, K.W. Computer Networking: A Top-Down Approach, 5th ed.; Addison-Wesley Publishing Company: San Francisco, CA, USA, 2009.
Song, L.; Fragouli, C. Making Recommendations Bandwidth Aware. IEEE Trans. Inf. Theory 2018, 64, 7031–7050.
Song, L.; Srinivasavaradhan, S.R.; Fragouli, C. The benefit of being flexible in distributed computation. In Proceedings of the 2017 IEEE Information Theory Workshop (ITW), Kaohsiung, Taiwan, 6–10 November 2017; pp. 289–293.
Song, L.; Fragouli, C.; Zhao, T. A Pliable Index Coding Approach to Data Shuffling. IEEE Trans. Inf. Theory 2020, 66, 1333–1353.
Dau, S.H.; Skachek, V.; Chee, Y.M. Error Correction for Index Coding With Side Information. IEEE Trans. Inf. Theory 2013, 59, 1517–1531.
Barbosa, F.C.; Costa, M.H.M. A tree construction method of nested cyclic codes. In Proceedings of the 2011 IEEE Information Theory Workshop, Paraty, Brazil, 16–20 October 2011; pp. 302–305.
Xiao, L.; Fuja, T.E.; Kliewer, J.; Costello, D.J. Nested codes with multiple interpretations. In Proceedings of the 2006 40th Annual Conference on Information Sciences and Systems, Princeton, NJ, USA, 22–24 March 2006; pp. 851–856.
Alencar, V.P.; Costa, M.H.M. Index coding from Reed-Solomon Codes. In Proceedings of the Brazilian Society of Computational and Applied Mathematics, SBMAC, São Paulo, Brazil, 10–11 November 2021; Volume 8.
Blahut, R.E. Algebraic Codes for Data Transmission, 1st ed.; Cambridge University Press: Cambridge, UK, 2003.
Pless, V. FJ MacWilliams and NJA Sloane, The theory of error-correcting codes. I and II. Bull. Am. Math. Soc. 1978, 84, 1356–1359.
Alon, N.; Hasidim, A.; Lubetzky, E.; Stav, U.; Weinstein, A. Broadcasting with side information. In Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science, Philadelphia, PA, USA, 25–28 October 2008; pp. 823–832.
Lubetzky, E.; Stav, U. Nonlinear Index Coding Outperforming the Linear Optimum. IEEE Trans. Inf. Theory 2009, 55, 3544–3551.
Peeters, R. Orthogonal representations over finite fields and the chromatic number of graphs. Combinatorica 1996, 16, 417–431.
Alencar, V.G.P. Construcao de Arvore em MatLab. 2021. Available online: https://github.com/valeriaurca/Construcao-de-Arvore.git (accessed on 4 June 2022).
Krishnan, P.; Mathew, R.; Kalyanasundaram, S. Pliable Index Coding via Conflict-Free Colorings of Hypergraphs. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021; pp. 214–219.
Even, G.; Lotker, Z.; Ron, D.; Smorodinsky, S. Conflict-Free Colorings of Simple Geometric Regions with Applications to Frequency Assignment in Cellular Networks. SIAM J. Comput. 2003, 33, 94–136.
Heegard, C. Partitioned linear block codes for computer memory with ’stuck-at’ defects. IEEE Trans. Inf. Theory 1983, 29, 831–842.
Marvasti, F.; Hasan, M.; Echhart, M.; Talebi, S. Efficient algorithms for burst error recovery using FFT and other transform kernels. IEEE Trans. Signal Process. 1999, 47, 1065–1075.
Han Vinck, A.J.; Luo, Y. Optimum distance profiles of linear block codes. In Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, ON, Canada, 6–11 July 2008; pp. 1958–1962.
Lin, S.; Costello, D.J. Error Control Coding: Fundamentals and Applications, 2nd ed.; Pearson-Prentice Hall: Upper Saddle River, NJ, USA, 2004.
Sasi, S.; Rajan, B.S. On pliable index coding. arXiv 2019, arXiv:1901.05809.

Figure 1. Index coding problem with three receivers.

Figure 2. Graph and matrix related to the problem described in Figure 1.

Figure 3. A single sender with multiple receivers having side information: (a) An instance of an index coding problem with $m = 5$ receivers and $n = 4$ messages. (b) The hypergraph that describes this instance has four vertices and five hyperedges: $e_{1} = (1, \{3, 4\})$, $e_{2} = (3, \{2, 4\})$, $e_{3} = (4, \{1\})$, $e_{4} = (4, \{2\})$ and $e_{5} = (2, \{1, 3\})$.

Figure 4. Pliable index coding scheme.

Figure 5. Bipartite graph for PICOD.

Figure 6. Hypergraph.

Figure 7. Conflict-free coloring with two colors.

Figure 8. Coding model with nested cyclic codes.

Figure 9. Tree construction. The sum of the dimensions associated with the last node and the terminal nodes is equal to the dimension of the root node.

Figure 10. Tree construction.

Figure 11. Tree construction for a BCH code.

Figure 12. Scheme for index coding with multiple interpretations.

Figure 13. Pliable index coding with multiple interpretations: (a) The construction representation of an instance of a pliable index coding problem with $m = 3$ receivers and $n = 4$ messages. (b) The hypergraph that describes this instance.

Figure 14. Conflict-free coloring with two colors.

Table 1. Parameters for values of $m \leq 6$.

$m$	$n$	$k$	$t$ (listed in the same order as $k$)
3	7	4	1
4	15	11, 7, 5	1, 2, 3
5	31	26, 21, 16, 11, 6	1, 2, 3, 5, 7
6	63	57, 51, 45, 39, 36, 30, 24, 18, 16, 10, 7	1, 2, 3, 4, 5, 6, 7, 10, 11, 13, 15

Note that, for every fixed n in the table, the error correction capability t increases strictly as the dimension k decreases.
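The pattern in Table 1 can be checked mechanically. The following is a minimal sketch, not part of the original work: it transcribes the table into a hypothetical dictionary `table1` and verifies, for each m, that n = 2^m − 1, that t grows strictly as k shrinks, and that n − k ≤ mt. The last check assumes the entries are primitive binary BCH codes, for which this bound on the number of parity digits is standard; the table itself does not name the code family.

```python
# Table 1 transcribed as m -> (n, [(k, t), ...]); the name `table1` is illustrative.
table1 = {
    3: (7, [(4, 1)]),
    4: (15, [(11, 1), (7, 2), (5, 3)]),
    5: (31, [(26, 1), (21, 2), (16, 3), (11, 5), (6, 7)]),
    6: (63, [(57, 1), (51, 2), (45, 3), (39, 4), (36, 5), (30, 6),
             (24, 7), (18, 10), (16, 11), (10, 13), (7, 15)]),
}


def check_row(m, n, pairs):
    """Return True if one row group of Table 1 is internally consistent."""
    ks = [k for k, _ in pairs]
    ts = [t for _, t in pairs]
    length_ok = n == 2**m - 1
    # k strictly decreasing and t strictly increasing, entry by entry.
    k_decreasing = all(a > b for a, b in zip(ks, ks[1:]))
    t_increasing = all(a < b for a, b in zip(ts, ts[1:]))
    # Assumed BCH property: at most m*t parity-check digits.
    parity_ok = all(n - k <= m * t for k, t in pairs)
    return length_ok and k_decreasing and t_increasing and parity_ok


results = {m: check_row(m, n, pairs) for m, (n, pairs) in table1.items()}
print(results)
```

Each group passes all three checks, which agrees with the note under the table for the listed values; the sketch says nothing about code lengths beyond m = 6.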

Table 2. Receivers with their side information sets.

Receivers	Side Information Sets	Decodes from Transmission
$R_{1}$	$S_{1} = \{3, 4\}$	$C_{1} \oplus C_{3} \oplus C_{4} \oplus C_{3} \oplus C_{4} = C_{1}$
$R_{2}$	$S_{2} = \{1, 2, 4\}$	$C_{1} \oplus C_{3} \oplus C_{4} \oplus C_{1} \oplus C_{4} = C_{3}$
$R_{3}$	$S_{3} = \{1, 3\}$	$C_{1} \oplus C_{3} \oplus C_{4} \oplus C_{1} \oplus C_{3} = C_{4}$

Each receiver $R_j$ is satisfied if it can successfully recover any new message that is not present in its side information set, i.e., any message $x_d \in x_{I_{j}}$, where $I_{j} \triangleq [n] \setminus S_{j}$.
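The cancellations listed in Table 2 can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the 7-bit integers in `codewords` are arbitrary stand-ins for $C_1, \dots, C_4$ (they are not codewords taken from the paper's construction), and the helper `decode` is a hypothetical name. Only the side information sets and the transmitted sum $C_1 \oplus C_3 \oplus C_4$ come from the table.

```python
from functools import reduce
from operator import xor

# Illustrative 7-bit stand-ins for C_1..C_4; the values are assumptions.
codewords = {1: 0b1011000, 2: 0b0101100, 3: 0b0010110, 4: 0b0001011}

# Side information sets S_j from Table 2.
side_info = {1: {3, 4}, 2: {1, 2, 4}, 3: {1, 3}}

# Indices combined in the single broadcast: C_1 xor C_3 xor C_4.
sent = {1, 3, 4}
transmission = reduce(xor, (codewords[i] for i in sent))


def decode(receiver):
    """XOR out every transmitted codeword that the receiver already holds."""
    known = side_info[receiver] & sent
    residual = reduce(xor, (codewords[i] for i in known), transmission)
    (recovered,) = sent - known  # exactly one unknown index must remain
    return recovered, residual


for receiver in side_info:
    index, value = decode(receiver)
    assert value == codewords[index]
    print(f"R_{receiver} recovers C_{index}")
```

Receiver $R_2$ also holds message 2, which is not part of the transmission and is therefore ignored during cancellation. The sketch covers only the noiseless data-recovery step and does not model error correction.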

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Pedrosa, V.G.; Costa, M.H.M. Index Coding with Multiple Interpretations. Entropy 2022, 24, 1149. https://doi.org/10.3390/e24081149


