Publications
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Optimal Error Resilience of Adaptive Message Exchange
Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC 2021)
[Abstract]
[Paper: PDF]
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Binary Interactive Error Resilience Beyond 1/8
Foundations of Computer Science (FOCS 2020)
[Abstract]
[Paper: PDF]
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Interactive Error Resilience Beyond $\frac{2}{7}$
Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC 2020)
[Abstract]
[Paper: PDF]
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Radio Network Coding Requires Logarithmic Overhead
Foundations of Computer Science (FOCS 2019)
[Abstract]
[Paper: PDF]
We consider the celebrated {\em radio network} model for abstracting communication in wireless networks. In this model, in any round, each node in the network may broadcast a message to all its neighbors. However, a node is able to hear a message broadcast by a neighbor only if no {\em collision} occurred, meaning that it was the only neighbor broadcasting.
While the (noiseless) radio network model received a lot of attention over the last few decades, the effect of noise on radio networks is still not well understood.
In this paper, we take a step forward and show that making radio network protocols resilient to noise may require a substantial performance overhead. Specifically, we construct a multi-hop network and a communication protocol over this network that works in {\em T} rounds when there is no noise. We prove that any scheme that simulates our protocol and is resilient to stochastic noise, requires at least {\em cT log(n)} rounds, for some constant {\em c}.
This stands in contrast to our previous result (STOC 2018), showing that protocols over the single-hop (clique) network can be made noise resilient with only a constant overhead. Our result also settles a recent conjecture by Censor-Hillel, Haeupler, Hershkowitz, and Zuzic (2018).
We complement the above result by giving a scheme that simulates any protocol with a fixed order of transmissions with only an $O(\log n)$ overhead.
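The reception rule of the radio network model can be made concrete with a tiny one-round simulation; the graph, messages, and function name below are illustrative assumptions, not taken from the paper.

```python
def radio_round(neighbors, broadcasters, messages):
    """One round of the (noiseless) radio network model: a node hears a
    message iff exactly one of its neighbors broadcast this round;
    otherwise (silence or collision) it hears nothing."""
    received = {}
    for v, nbrs in neighbors.items():
        talking = [u for u in nbrs if u in broadcasters]
        received[v] = messages[talking[0]] if len(talking) == 1 else None
    return received

# Toy 4-node path 0-1-2-3; nodes 0 and 2 broadcast simultaneously.
nbrs = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
out = radio_round(nbrs, broadcasters={0, 2}, messages={0: "a", 2: "b"})
# Node 1 suffers a collision (both its neighbors broadcast); node 3 hears "b".
```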
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Interactive coding over the noisy broadcast channel
Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC 2018)
[Abstract]
[Paper: PDF]
In this paper we give the first constant-rate coding scheme for the noisy broadcast model defined by El Gamal in 1984. In this model, a set of $n$ players, each holding a private input bit, communicate over a noisy broadcast channel. Their mutual goal is for all players to learn all inputs. In each round, one of the players broadcasts a bit to all the other players, and the bit received by each player is flipped with a fixed constant probability (independently for each recipient). How many rounds are needed?
The best known protocol before our work was given in 1988 by Gallager, who designed an elegant noise-resistant protocol requiring only $\mathcal{O}(n \log \log n)$ rounds. We show that $\mathcal{O}(n)$ rounds suffice. Moreover, we generalize this result and initiate the study of interactive coding over the noisy broadcast channel.
We show that any interactive protocol that works over the noiseless broadcast channel can be simulated over our restrictive noisy broadcast model with only a constant blowup in communication (independent of the number of players).
Goyal, Kindler, and Saks~\cite{GKS08} showed in 2008 that Gallager's algorithm is essentially tight for a restricted (non-adaptive) model of communication\footnote{They do not state this explicitly, but their proofs rely on it.}. Before our work, this restriction was not considered significant.
In this work we show that it is: adaptivity can significantly improve the rate of communication.
Klim Efremenko and Gillat Kol and Raghuvansh Saxena
Noisy Beeps
Principles of Distributed Computing (PODC) 2020
[Abstract]
[Paper: PDF]
We study the effect of noise on the $n$-party {\em beeping model}. In this model, in every round, each party may decide to either `beep' or not. All parties hear a beep if and only if at least one party beeps. The beeping model is becoming increasingly popular, as it offers a very simple abstraction of wireless networks and is very well suited for studying biological phenomena. Still, the noise resilience of the beeping model is yet to be understood.
Our main result is a lower bound, showing that making protocols in the beeping model resilient to noise may have a large performance overhead. Specifically, we give a protocol that works over the (noiseless) beeping model, and prove that any scheme that simulates this protocol over the beeping model with correlated stochastic noise will blow up the number of rounds by an $\Omega(\log n)$ multiplicative factor.
We complement this result by a matching upper bound, constructing a noise-resilient simulation scheme with $\mathcal{O}(\log n)$ overhead for any noiseless beeping protocol.
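As a minimal sketch of the model's semantics (not of the paper's simulation scheme), one round of the beeping model can be written as below. The per-listener independent bit flips are a simplifying assumption of this sketch; the paper's lower bound concerns correlated stochastic noise.

```python
import random

def beep_round(n, beepers, flip_prob=0.0, rng=None):
    """One round of the n-party beeping model: every party hears a beep
    iff at least one party beeps. With noise, each party's received bit
    is flipped independently with probability flip_prob."""
    rng = rng or random.Random(0)
    signal = 1 if beepers else 0  # 1 = a beep is on the channel
    return [signal ^ (rng.random() < flip_prob) for _ in range(n)]

heard = beep_round(4, beepers={2})            # noiseless: [1, 1, 1, 1]
silent = beep_round(4, beepers=set())         # noiseless: [0, 0, 0, 0]
noisy = beep_round(4, beepers={2}, flip_prob=0.2)  # some bits may flip
```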
Mark Braverman and Klim Efremenko and Ran Gelles and Michael A. Yitayew
Optimal Short-Circuit Resilient Formulas.
Computational Complexity Conference (CCC 2019)
[Abstract]
[Paper: PDF]
We consider fault-tolerant boolean formulas in which the output of a faulty gate
is short-circuited to one of the gate's inputs. A recent result by Kalai et al. [FOCS 2012]
converts any boolean formula into a resilient formula of polynomial size that works correctly if less than a fraction $1/6$ of the gates (on every input-to-output path) are faulty.
We improve the result of Kalai et al., and show how to efficiently fortify any boolean formula against
a fraction $1/5$ of short-circuit gates per path, with only a polynomial blowup in size. We additionally show that it is impossible to obtain formulas with higher resilience and sub-exponential growth in size.
\medskip
Towards our results, we consider interactive coding schemes when noiseless feedback is present; these produce resilient boolean formulas via a Karchmer-Wigderson relation.
We develop a coding scheme that resists up to a fraction $1/5$ of corrupted transmissions \emph{in each direction of the interactive channel}.
We further show that such a level of noise is maximal for coding schemes with sub-exponential blowup in communication. Our coding scheme takes a surprising inspiration from Blockchain technology.
Klim Efremenko and Aryeh Kontorovich and Moshe Noivirt
Fast and Bayes-consistent nearest neighbors.
[Abstract]
[Paper: PDF]
Research on nearest-neighbor methods tends to focus somewhat dichotomously on either the statistical or the computational aspects: on, say, Bayes consistency and rates of convergence, or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm's prediction runtime compares favorably against state-of-the-art approximate NN methods, while maintaining Bayes consistency and attaining rates comparable to minimax. On samples of size $n$ in $\R^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d\log n)$ per query point.
Klim Efremenko and Ankit Garg and Rafael Oliveira and Avi Wigderson
Barriers for rank methods in arithmetic complexity
Innovations in Theoretical Computer Science Conference (ITCS 2018)
[Abstract]
[Paper: PDF]
{\em Arithmetic complexity}, the study of the cost of computing polynomials via additions and multiplications, is considered (for many good reasons) simpler to understand than {\em Boolean complexity}, namely computing Boolean functions via logical gates. And indeed, we seem to have significantly more lower bound techniques and results in arithmetic complexity than in Boolean complexity. Despite many successes and rapid progress, however, foundational challenges, like proving super-polynomial lower bounds on circuit or formula size for explicit polynomials, or super-linear lower bounds on explicit 3-dimensional tensors, remain elusive.
At the same time (and possibly for similar reasons), we have plenty more excuses, in the form of ``barrier results'' for failing to prove basic lower bounds in Boolean complexity than in arithmetic complexity. Efforts to find barriers to arithmetic lower bound techniques seem harder, and despite some attempts we have no excuses of similar quality for these failures in arithmetic complexity. This paper aims to add to this study.
In this paper we address {\em rank methods}, which were long recognized as encompassing and abstracting almost all known arithmetic lower bounds to-date, including the most recent impressive successes. Rank methods (under the name of {\em flattenings}) are also in wide use in algebraic geometry for proving tensor rank and symmetric tensor rank lower bounds. Our main results are barriers to these methods. In particular,
\begin{itemize}
\item Rank methods {\em cannot} prove a better than $\Omega_d (n^{\lfloor d/2 \rfloor})$ lower bound on the tensor rank of {\em any} $d$-dimensional tensor of side $n$. (In particular, they cannot prove super-linear, indeed even $>8n$, tensor rank lower bounds for {\em any} 3-dimensional tensor.)
\item Rank methods {\em cannot} prove an $\Omega_d (n^{\lfloor d/2 \rfloor})$ lower bound on the {\em Waring rank}\footnote{A very restricted form of depth-3 circuits.} of any $n$-variate polynomial of degree $d$. (In particular, they cannot prove such lower bounds for stronger models, including depth-3 circuits.)
\end{itemize}
The proofs of these bounds use simple linear-algebraic arguments, leveraging connections between the {\em symbolic} rank of matrix polynomials and the usual rank of their evaluations. These techniques can perhaps be extended to barriers for other arithmetic models on which progress has halted.
To see how these barrier results directly inform the state-of-art in arithmetic complexity we note the following.
First, the bounds above nearly match the best explicit bounds we know for these models, hence offer an explanation of why rank methods got stuck there. Second, the bounds above are a far cry (quadratically away) from the true complexity (e.g., of random polynomials) in these models, which, {\em if} achieved (by any method), is known to imply super-polynomial formula lower bounds.
We also explain the relation of our barrier results to other attempts, and in particular how they significantly differ from the recent attempts to find analogues of ``natural proofs'' for arithmetic complexity. Finally, we discuss the few arithmetic lower bound approaches which fall outside rank methods, and some natural directions our barriers suggest.
Klim Efremenko and Elad Haramaty and Yael Kalai
Interactive Coding with Nearly Optimal Round and Communication Blowup
Innovations in Theoretical Computer Science Conference (ITCS 2020)
[Abstract]
[Paper: PDF]
The problem of constructing error-resilient interactive protocols was introduced in the seminal works of Schulman (FOCS 1992, STOC 1993). These works show how to convert any two-party interactive protocol into one that is resilient to constant-fraction of error, while blowing up the communication by only a constant factor. Since these seminal works, there have been many followup works which improve the error rate, the communication rate, and the computational efficiency.
All these works assume that in each round of the underlying protocol, each party sends a single bit. Thus, in this model, any protocol with $2T$ bits of communication requires $T$ rounds of back-and-forth communication. Moreover, all these works assume that the communication complexity of the underlying protocol is {\em fixed} (and known a priori).
In this work, we consider the model where in each round each party may send a message of {\em arbitrary length}, where the length of the messages and the length of the protocol may be {\em adaptive}, and may depend on the private inputs of the parties and on previous communication. In particular, we do not assume any bound on the communication complexity of the underlying protocol.
This model is known as the (synchronous) {\em message passing model}; it is commonly used in distributed computing and is the most common model used in cryptography.
We consider the adversarial error model, where an $\epsilon$-fraction of the communication may be corrupted. Our error model not only allows the adversary to flip the corrupted bits, but also allows it to insert and delete bits. We also assume a bound $\epsilon'$ on the fraction of rounds that can be (fully) corrupted, and argue that such a bound is necessary in order to obtain an error-resilient version with a small blowup in the round complexity.
We show how to convert any protocol $\Pi$ into another protocol $\Pi'$ with comparable efficiency guarantees, that is resilient to such adversarial error (for some fixed constants $\epsilon,\epsilon'>0$).
We construct such a $\Pi'$ for various ranges of blowup parameters.
In particular, we construct such a $\Pi'$ with a $(1+\tilde{O}\left(\epsilon^{1/4}\right))$ blowup in communication and an $O(1)$ blowup in rounds. We also show how to reduce the blowup in rounds at the expense of increasing the blowup in communication, and construct $\Pi'$ where both the blowup in rounds and in communication approach one (i.e., no blowup) as $\epsilon$ and $\epsilon'$ approach zero.
We also give evidence that our parameters are optimal.
Finally, we emphasize that our transformation is quite simple and preserves the computational efficiency of the protocol. Namely, if in the underlying protocol~$\Pi$ the parties run in time~$T$, then their runtime in $\Pi'$ is $\poly(T)$.
Ankit Singh Rawat and Itzhak Tamo and Venkatesan Guruswami and Klim Efremenko
MDS Code Constructions With Small Sub-Packetization and Near-Optimal Repair Bandwidth
IEEE Transactions on Information Theory 2018
Mark Braverman and Klim Efremenko and Ran Gelles and Bernhard Haeupler
Constant-rate coding for multiparty interactive communication is impossible.
Journal of the ACM 2017; Symposium on Theory of Computing (STOC) 2016
[Abstract]
[Paper: PDF]
We study coding schemes for multiparty interactive communication
over synchronous networks that suffer from stochastic noise, where each bit is independently flipped with probability~$\eps$. We analyze the minimal overhead that must be added by the coding scheme in order to succeed in performing the computation despite the noise.
Our main result is a lower bound on the communication of any noise-resilient protocol over a synchronous star network with $n$-parties (where all parties communicate in every round).
Specifically, we show a task that can be solved by communicating
$T$~bits over the noise-free network, but for which any protocol with success probability of~${1-o(1)}$ must communicate at least $\Omega(T \frac{\log n}{\log\log n})$ bits when the channels are noisy.
By a 1994 result of Rajagopalan and Schulman, the slowdown we prove is the highest one can obtain on any topology, up to a $\log \log n$ factor.
We complete our lower bound with a matching coding scheme that achieves the same overhead; thus, the capacity of (synchronous) star networks is $\Theta(\log\log n /\log n)$. Our bounds prove that, despite several previous coding schemes with rate $\Omega(1)$ for certain topologies, no coding scheme with constant rate exists for arbitrary $n$-party noisy networks.
Noga Alon and Mark Braverman and Klim Efremenko and Ran Gelles and Bernhard Haeupler
Reliable Communication over Highly Connected Noisy Networks
Principles of Distributed Computing (PODC) 2016; invited to the Distributed Computing (DIST) special issue
[Abstract]
[Paper: PDF]
We consider the task of multiparty computation performed over networks in
the presence of random noise. Given an $n$-party protocol that takes $R$
rounds assuming noiseless communication, the goal is to find a coding
scheme that takes $R'$ rounds and computes the same function with high
probability even when the communication is noisy, while maintaining a
constant asymptotic \emph{rate}, i.e., while keeping $\liminf_{n,R\to
\infty} R/R'$ positive.
Rajagopalan and Schulman (STOC '94) were the first to consider this
question, and provided a coding scheme with rate~$O(1/\log (d+1))$, where
$d$ is the maximal degree in the network. While that
scheme provides a constant rate coding for many practical situations,
in the worst case, e.g., when the network is a complete graph, the rate
is~$O(1/\log n)$, which tends to $0$ as $n$ tends to infinity.
We revisit this question and provide an efficient coding scheme with
a constant rate for the interesting case of fully connected networks.
We furthermore extend the result and show that if a ($d$-regular) network has
mixing time~$m$, then there exists an efficient coding scheme with
rate $O(1/m^3\log m)$. This implies a constant rate coding scheme for
any $n$-party protocol over a $d$-regular network with a constant mixing time,
and in particular for random graphs with $n$ vertices and
degrees $n^{\Omega(1)}$.
Noga Alon and Klim Efremenko and Benny Sudakov
Testing Equality in Communication Graphs.
IEEE Transactions on Information Theory 2017
[Abstract]
[Paper: PDF]
Let $G=(V,E)$ be a connected undirected graph with $k$ vertices. Suppose
that on each vertex of the graph there is a player having an $n$-bit
string. Each player is allowed to communicate with its neighbors according
to an agreed communication protocol, and the players must decide,
deterministically, if their inputs are all equal. What is the minimum
possible total number of bits transmitted in a protocol solving
this problem? We determine this minimum up to a lower-order
additive term in many cases (but not for all graphs).
In particular, we show that it is $kn/2+o(n)$ for any
Hamiltonian $k$-vertex graph, and that for any $2$-edge connected
graph with $m$ edges containing no
two adjacent vertices of degree exceeding $2$ it is $mn/2+o(n)$.
The proofs combine graph theoretic ideas with
tools from additive number theory.
Klim Efremenko, Joseph Landsberg, Hal Schenck and Jerzy Weyman
The method of shifted partial derivatives cannot separate the permanent from the determinant
Mathematics of Computation 2018
[Abstract]
[Paper: PDF]
The method of shifted partial
derivatives
introduced in
\cite{DBLP:journals/eccc/Kayal12,gupta4} alone cannot prove that the padded permanent
$\ell^{n-m}\tperm_m$ cannot
be realized inside the $GL_{n^2}$-orbit closure of the determinant $\tdet_n$
when $n>2m^2+2m$.
Klim Efremenko, Joseph Landsberg, Hal Schenck and Jerzy Weyman
On minimal free resolutions and the method of shifted partial derivatives in complexity theory.
Journal of Algebra 2018
[Abstract]
[Paper: PDF]
The minimal free resolutions of the Jacobian ideals of the determinant polynomial
were computed by Lascoux \cite{MR520233}, and it is an active area of
research to understand the Jacobian ideals of
the permanent, see e.g., \cite{MR1777172,MR2386244}. As a step in this direction we compute several new cases
and completely determine the linear strands of the minimal free resolutions
of the ideals generated by sub-permanents.
Our motivation is an exploration of the utility and limits of the method of shifted partial
derivatives
introduced in
\cite{DBLP:journals/eccc/Kayal12,gupta4}. The method of
shifted partial derivatives amounts to computing Hilbert functions
of Jacobian ideals, and the Hilbert functions
are in turn the Euler characteristics of the minimal free resolutions of the
Jacobian ideals. We compute several such Hilbert functions
relevant for complexity theory. We show that the method of shifted partial derivatives alone cannot prove the padded permanent
$\ell^{n-m}\tperm_m$ cannot
be realized inside the $GL_{n^2}$-orbit closure of the determinant $\tdet_n$
when $m< 1.5 n^{2}$.
Mark Braverman and Klim Efremenko
List and Unique Coding for Interactive Communication in the Presence of Adversarial Noise
Special issue of the SIAM Journal on Computing for FOCS 2014.
[Abstract]
[Paper: PDF]
In this paper we extend the notion of list-decoding to the setting of interactive communication and study its limits. In particular, we show that any protocol can be encoded, with a constant rate, into a list-decodable protocol which is resilient
to a noise rate of up to $\frac{1}{2}-\varepsilon$, and that this is tight.
Using our list-decodable construction, we study a more nuanced model of noise where the adversary can corrupt up to a fraction $\alpha$ of Alice's communication and up to a fraction $\beta$ of Bob's communication. We use list-decoding in order to fully characterize the region $\mathcal{R}_U$ of pairs $(\alpha, \beta)$ for which unique decoding with a constant rate is possible. The region $\mathcal{R}_U$ turns out to be quite unusual in its shape. In particular, it is bounded by a piecewise-differentiable curve with infinitely many pieces.
We show that outside this region, the rate must be exponential. This suggests that in some error regimes, list-decoding is necessary for optimal unique decoding.
We also consider the setting where only one party of the communication must output the correct answer. We precisely characterize the region of all pairs $(\alpha,\beta)$ for which one-sided unique decoding is possible, in the sense that Alice outputs the correct answer.
Klim Efremenko and Ran Gelles and Bernhard Haeupler
Maximal Noise in Interactive Communication over Erasure Channels and Channels with Feedback
IEEE Transactions on Information Theory 2016, (appeared in ITCS 2015)
[Abstract]
[Paper: PDF]
We provide tight upper and lower bounds on the noise resilience of interactive communication
over noisy channels with \emph{feedback}.
In this setting, we show that the maximal fraction of noise that any robust protocol can resist is~$1/3$.
Additionally, we provide a simple and efficient robust protocol that succeeds as long as the fraction of noise is at most~$1/3-\eps$.
Surprisingly, both bounds hold regardless of whether the parties communicate via a binary or an arbitrarily large alphabet.
This is contrasted with the bounds of~$1/4$ and $1/8$ shown by Braverman and Rao (STOC~'11) for the case of robust protocols over noisy channels without feedback, assuming a large (constant size) alphabet or a binary alphabet respectively.
We also consider interactive communication over \emph{erasure} channels. We provide a protocol that matches the optimal tolerable erasure rate of $1/2 - \eps$ of previous protocols (Franklin et~al., CRYPTO~'13) but operates in a much simpler and more efficient way. Our protocol works with an alphabet of size~$6$, in contrast to prior protocols in which the alphabet size grows as~$\eps\to0$.
Building on the above algorithm with \emph{fixed} alphabet we are able to devise
a protocol that works for \emph{binary} erasure channels,
improving the best previously known bound on the tolerable erasure rate from $1/4-\eps$ to~$1/3-\eps$.
Klim Efremenko
From Irreducible Representations to Locally Decodable Codes.
The 44th ACM Symposium on Theory of Computing, STOC 2012
[Abstract]
[BiBTeX]
[Paper: PDF]
A Locally Decodable Code (LDC) encodes a message in such a way that one can decode any particular symbol of the message by reading only a constant number of locations, even if a constant fraction of the encoded message is adversarially corrupted.
In this paper we present a new approach for the construction of LDCs. We show that if there exist an irreducible representation $(\rho, V)$ of a group $G$ and $q$ elements $g_1,g_2,\ldots, g_q$ in $G$ such that some linear combination of the matrices $\rho(g_i)$ has rank one, then we can construct a $q$-query Locally Decodable Code $C: V \to \mathbb{F}^G$.
We show the potential of this approach by constructing constant query LDCs of sub-exponential length matching the parameters of the best known constructions.
@article{Efremenko11,
  author  = {Klim Efremenko},
  title   = {From Irreducible Representations to Locally Decodable Codes},
  journal = {Electronic Colloquium on Computational Complexity (ECCC)},
  year    = {2011}
}
Avraham Ben-Aroya, Klim Efremenko and Amnon Ta-Shma
Local List Decoding with a Constant Number of Queries.
51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010
[Abstract]
[BiBTeX]
[Paper: PDF]
Recently, Efremenko showed locally decodable codes of sub-exponential length. That result showed that these codes can handle up to a $\frac{1}{3}$ fraction of errors. In this paper we show that the same codes can be locally unique-decoded from error rate $\half-\alpha$ for any $\alpha>0$ and locally list-decoded from error rate $1-\alpha$ for any $\alpha>0$, with only a constant number of queries and a constant alphabet size. This gives the first sub-exponential codes that can be locally list-decoded with a constant number of queries.
@inproceedings{BET10,
author = {Avraham Ben-Aroya and
Klim Efremenko and
Amnon Ta-Shma},
title = {Local List Decoding with a Constant Number of Queries},
booktitle = {FOCS},
year = {2010},
pages = {715-722},
ee = {http://dx.doi.org/10.1109/FOCS.2010.88}
}
Avraham Ben-Aroya, Klim Efremenko and Amnon Ta-Shma
A Note on Amplifying the Error-Tolerance of Locally Decodable Codes.
Electronic Colloquium on Computational Complexity (ECCC) 17: 134 (2010)
[Abstract]
[BiBTeX]
[Paper: PDF]
Trevisan [Tre03] suggested a transformation that allows amplifying the error rate a
code can handle. We observe that this transformation, that was suggested in the non-local
setting, works also in the local setting and thus gives a generic, simple way to amplify
the error-tolerance of locally decodable codes. Specifically, this shows how to transform a
locally decodable code that can tolerate a constant fraction of errors to a locally decodable
code that can recover from a much higher error-rate, and how to transform such locally
decodable codes to locally list-decodable codes.
The transformation of [Tre03] involves a simple composition with an approximately
locally (list) decodable code. Using a construction of such codes by Impagliazzo et
al. [IJKW10], the transformation incurs only a negligible growth in the length of the code
and in the query complexity.
@article{DBLP:journals/eccc/Ben-AroyaET10a,
author = {Avraham Ben-Aroya and
Klim Efremenko and
Amnon Ta-Shma},
title = {A Note on Amplifying the Error-Tolerance of Locally Decodable
Codes},
journal = {Electronic Colloquium on Computational Complexity (ECCC)},
volume = {17},
year = {2010},
pages = {134},
ee = {http://eccc.hpi-web.de/report/2010/134},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Klim Efremenko
3-Query Locally Decodable Codes of Subexponential Length
SIAM Journal on Computing, STOC 2009 special issue
[Abstract]
[BiBTeX]
[Paper: PDF]
Locally Decodable Codes (LDCs) allow one to decode any particular symbol of the input message by making a constant number of queries to a codeword, even if a constant fraction of the codeword is damaged. In a recent work~\cite{Yekhanin08}, Yekhanin constructed a 3-query LDC with sub-exponential length. However, this construction requires the conjecture that there are infinitely many Mersenne primes. In this paper, we give the first unconditional constant-query LDC construction with sub-exponential codeword length. In addition, our construction reduces the codeword length.
@article{Efremenko09,
author = {Klim Efremenko},
title = {3-Query Locally Decodable Codes of Subexponential Length },
year = {2009},
booktitle= {The 41st ACM Symposium on Theory of Computing}
}
Klim Efremenko and Omer Reingold
How Well Do Random Walks Parallelize?
APPROX-RANDOM, 2009
[Abstract]
[BiBTeX]
[Paper: PDF]
A random walk on a graph is a process that explores the graph in a random way: at each step, the walk moves from its current vertex to a uniformly selected neighbor. Random walks are extremely useful in computer science and in other fields. A very natural problem, recently raised by Alon, Avin, Koucky, Kozma, Lotker, and Tuttle (though implicit in several previous papers), is to analyze the behavior of $k$ independent walks in comparison with the behavior of a single walk. In particular, Alon et al. showed that in various settings (e.g., for expander graphs), $k$ random walks cover the graph (i.e., visit all its nodes) $\Omega(k)$-times faster (in expectation) than a single walk. In other words, in such cases $k$ random walks efficiently ``parallelize'' a single random walk. Alon et al.\ also demonstrated that, depending on the specific setting, this ``speedup'' can vary from logarithmic to exponential in $k$.
In this paper we initiate a more systematic study of multiple random walks. We give lower and upper bounds both on
the cover time {\em and on the hitting time} (the time it takes to hit one specific node) of multiple random walks. Our study revolves around three alternatives for the starting vertices of the random walks: the worst starting vertices (those that maximize the hitting/cover time), the best starting vertices, and starting vertices selected from the stationary distribution. Among our results, we show that the speedup when starting the walks at the worst vertices cannot be too large: the hitting time cannot improve by more than an $O(k)$ factor, and the cover time cannot improve by more than a $\min\{k \log n,k^2\}$ factor (where $n$ is the number of vertices). These results should be contrasted with the fact that there was no previously known upper bound on the speedup, and that the speedup can even be {\em exponential} in $k$ for random starting vertices. We further show that for $k$ that is not too large (as a function of various parameters of the graph), the speedup in cover time is $O(k)$ {\em even for walks that start from the best vertices} (those that minimize the cover time). As a rather surprising corollary of our theorems, we obtain a new bound which relates
the cover time $C$ and the mixing time $\mix$ of a graph. Specifically, we show that $C=O(m \sqrt{\mix}\log^2 n)$ (where $m$ is the number of edges).
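The cover-time speedup from running several walks in parallel can be observed in a small experiment; the cycle graph, random seed, walk count, and trial count below are arbitrary illustrative choices, not taken from the paper.

```python
import random

def cover_time(adj, starts, rng):
    """Steps until the walks (one per entry of `starts`, all moving
    simultaneously) have collectively visited every vertex."""
    walks = list(starts)
    visited = set(walks)
    steps = 0
    while len(visited) < len(adj):
        walks = [rng.choice(adj[v]) for v in walks]
        visited.update(walks)
        steps += 1
    return steps

# Cycle on 64 vertices; all walks start at the same vertex.
n = 64
adj = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
rng = random.Random(1)

def avg_cover(k, trials=20):
    return sum(cover_time(adj, [0] * k, rng) for _ in range(trials)) / trials

a1, a8 = avg_cover(1), avg_cover(8)  # 8 walks cover the cycle much faster
```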
@inproceedings{DBLP:conf/approx/EfrReingold09,
author = {Klim Efremenko and
Omer Reingold},
title = {How Well Do Random Walks Parallelize?},
booktitle = {APPROX-RANDOM},
year = {2009},
pages = {476-489},
crossref = {DBLP:conf/approx/2009},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raphaël Clifford, Klim Efremenko, Ely Porat and Amir Rothschild
From Coding Theory to Efficient Pattern Matching
20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2009
[Abstract]
[BiBTeX]
[Paper: PDF]
We consider the classic problem of pattern matching with few mismatches in the presence of promiscuously matching wildcard symbols. Given a text $t$ of length $n$ and a pattern $p$ of length $m$ with optional wildcard symbols and a bound $k$, our algorithm finds all the alignments for which the pattern matches the text with Hamming distance at most $k$ and also returns the location and identity of each mismatch. The algorithm we present is deterministic and runs in $\tilde{O}(kn)$ time, matching the best known randomised time complexity to within logarithmic factors. The solutions we develop borrow from the tool set of algebraic coding theory and provide a new framework in which to tackle approximate pattern matching problems.
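For concreteness, the task can be pinned down by a naive $O(nm)$ baseline: report, for each alignment with at most $k$ mismatches, the location and identity of every mismatch. This sketch only states the problem; it is far slower than the paper's $\tilde{O}(kn)$ algorithm, and the wildcard symbol and function name are illustrative assumptions.

```python
WILDCARD = "?"

def k_mismatch(text, pattern, k):
    """Naive O(nm) baseline: for every alignment of the pattern against
    the text, collect (position, text symbol, pattern symbol) for each
    mismatch, keeping alignments with at most k mismatches. Wildcards
    ('?') match any symbol."""
    n, m = len(text), len(pattern)
    results = {}
    for i in range(n - m + 1):
        errs = [(j, text[i + j], pattern[j])
                for j in range(m)
                if WILDCARD not in (text[i + j], pattern[j])
                and text[i + j] != pattern[j]]
        if len(errs) <= k:
            results[i] = errs
    return results

hits = k_mismatch("abcabd", "a?c", k=1)
# Alignment 0 matches exactly; alignment 3 has one mismatch: 'd' vs 'c'.
```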
@inproceedings{DBLP:conf/soda/CliffordEPR09,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Ely Porat and
Amir Rothschild},
title = {From coding theory to efficient pattern matching},
booktitle = {SODA},
year = {2009},
pages = {778-784},
ee = {http://doi.acm.org/10.1145/1496770.1496855},
crossref = {DBLP:conf/soda/2009},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Klim Efremenko and Ely Porat
Approximating General Metric Distances Between a Pattern and a Text
19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2008
[Abstract]
[BiBTeX]
[Paper: PDF]
Let $T=t_0 \ldots t_{n-1}$ be a text and $P = p_0 \ldots p_{m-1}$
a pattern over some finite alphabet $\Sigma$, and let
$\mathrm{dist}$ be a metric on $\Sigma$. We consider the problem of
calculating the sum of distances between the symbols of $P$ and the
symbols of substrings of $T$ of length $m$, for all possible offsets.
We present an $\varepsilon$-approximation algorithm for this problem
which runs in time $O(\frac{1}{\varepsilon^2}n\cdot \mathrm{polylog}(n,|\Sigma|))$.
The algorithm is based on a low-distortion embedding of the metric space
into a normed space (in particular, into $\ell_{\infty}$), performed as a
preprocessing stage, combined with a sampling technique.
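The profile being approximated can be stated as a naive $O(nm)$ baseline; the paper's contribution is approximating it in near-linear time. This reference sketch (the function name and interface are illustrative, not from the paper) computes the exact profile for an arbitrary metric passed as a callable.

```python
def metric_distance_profile(text, pattern, dist):
    """Naive O(nm) baseline: D[j] = sum_i dist(p_i, t_{j+i}) for every
    offset j. The paper approximates this profile in near-linear time
    via an embedding into l_infinity plus sampling."""
    n, m = len(text), len(pattern)
    return [sum(dist(pattern[i], text[j + i]) for i in range(m))
            for j in range(n - m + 1)]
```

With `dist` set to the discrete (0/1) metric, the profile specializes to the sliding-window Hamming distance.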
@inproceedings{PoratE08,
author = {Ely Porat and
Klim Efremenko},
title = {Approximating general metric distances between a pattern
and a text},
booktitle = {SODA},
year = {2008},
pages = {419-427}
}
Raphaël Clifford, Klim Efremenko, Ely Porat and Amir Rothschild
Pattern Matching with Don't Cares and Few Errors
Journal of Computer and System Sciences (JCSS), 2010
[Abstract]
[BiBTeX]
[Paper: PDF]
We present solutions for the $k$-mismatch pattern matching problem with don't cares. Given a text $t$ of length $n$ and a pattern $p$ of length $m$ with don't care symbols and a bound $k$, our algorithms find all the places where the pattern matches the text with at most $k$ mismatches. We first give an $O(n(k+\log m \log k)\log n)$ time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic $O(nk^2\log^2 m)$ time solution that uses tools originally developed for group testing. Taking our derandomisation approach further, we develop an approach based on $k$-selectors that runs in $O(nk\,\mathrm{polylog}\,m)$ time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.
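For reference, this is a brute-force $O(nm)$ sketch of the problem the paper solves in near-linear time: report, for each alignment, the mismatch positions whenever there are at most $k$ of them, with don't cares matching everything. The function name and dictionary-based interface are illustrative assumptions.

```python
def k_mismatch_with_wildcards(text, pattern, k, wild='?'):
    """Brute-force reference for k-mismatch with don't cares:
    maps each alignment with at most k mismatches to the list of
    its mismatch positions (don't cares never mismatch)."""
    n, m = len(text), len(pattern)
    out = {}
    for j in range(n - m + 1):
        mm = [i for i in range(m)
              if pattern[i] != wild and text[j + i] != wild
              and pattern[i] != text[j + i]]
        if len(mm) <= k:
            out[j] = mm
    return out
```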
@article{DBLP:journals/jcss/CliffordEPR10,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Ely Porat and
Amir Rothschild},
title = {Pattern matching with don't cares and few errors},
journal = {J. Comput. Syst. Sci.},
volume = {76},
number = {2},
year = {2010},
pages = {115-124},
ee = {http://dx.doi.org/10.1016/j.jcss.2009.06.002},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raphaël Clifford, Klim Efremenko, Benny Porat and Ely Porat
A Black Box for Online Approximate Pattern Matching
Information and Computation, 2011
[Abstract]
[BiBTeX]
[Paper: PDF]
We present a deterministic black box solution for online approximate matching.
Given a pattern of length $m$ and a streaming text of length $n$ that
arrives one character at a time, the task is to report the distance
between the pattern and a sliding window of the text as soon as the
new character arrives. Our solution requires $O(\sum_{j=1}^{\log_2{m}} T(n,2^{j-1})/n)$ time for each input character, where $T(n,m)$ is the total running time of the best offline
algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, $k$-mismatch and numerical measures such as the $L_2$ and $L_1$ norms. For these examples, the resulting online algorithms take $O(\log^2{m})$, $O(\sqrt{m\log{m}})$, $O(\log^2{m}/{\epsilon}^2)$, $O(\sqrt{k \log k} \log{m})$, $O(\log^2{m})$ and $O(\sqrt{m\log{m}})$ time per character respectively. The space overhead is $O(m)$ which we show is optimal.
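The interface the black box accelerates can be sketched as a naive online matcher that pays $O(m)$ per arriving character; the paper's construction replaces this per-character cost with $O(\sum_{j} T(n,2^{j-1})/n)$ by delegating doubling-size blocks of the pattern to any offline algorithm. The class below is an illustrative baseline only, not the paper's construction.

```python
class OnlineHamming:
    """Naive online matcher: on each arriving character, report the
    Hamming distance between the pattern and the current text window,
    or None while the window is not yet full. Costs O(m) per character."""
    def __init__(self, pattern):
        self.p = pattern
        self.window = []

    def feed(self, c):
        self.window.append(c)
        if len(self.window) > len(self.p):
            self.window.pop(0)          # slide the window
        if len(self.window) < len(self.p):
            return None                 # window not yet full
        return sum(a != b for a, b in zip(self.p, self.window))
```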
@inproceedings{DBLP:conf/cpm/CliffordEPP08,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Benny Porat and
Ely Porat},
title = {A Black Box for Online Approximate Pattern Matching},
booktitle = {CPM},
year = {2008},
pages = {143-151},
ee = {http://dx.doi.org/10.1007/978-3-540-69068-9_15},
crossref = {DBLP:conf/cpm/2008},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raphaël Clifford, Klim Efremenko, Benny Porat, Ely Porat and Amir Rothschild
Mismatch sampling
Information and Computation, 2012
[Abstract]
[BiBTeX]
[Paper: PDF]
We consider the well-known problem of pattern matching under the
Hamming distance. Previous approaches have shown how to count the
number of mismatches efficiently, especially when a bound is known
on the maximum Hamming distance. Our interest is different in
that we wish to collect a random sample of mismatches of fixed size
at each position in the text. Given a pattern $p$ of length $m$
and a text $t$ of length $n$,
we show how to sample with high probability $c$ mismatches, where possible, from every alignment of $p$ and $t$.
Further, we guarantee that the mismatches are sampled uniformly and can therefore be seen as representative
of the types of mismatches that occur.
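The sampling guarantee can be mimicked by brute force: enumerate the mismatches at each alignment and keep a uniform reservoir of up to $c$ of them. This is an illustrative sketch only; the point of the paper is achieving such samples without enumerating all mismatches.

```python
import random

def sample_mismatches(text, pattern, c, seed=0):
    """Brute-force sketch of mismatch sampling: at every alignment,
    keep a uniform sample of up to c mismatch positions via
    reservoir sampling over the mismatches seen so far."""
    rng = random.Random(seed)
    n, m = len(text), len(pattern)
    samples = []
    for j in range(n - m + 1):
        reservoir, seen = [], 0
        for i in range(m):
            if pattern[i] != text[j + i]:
                seen += 1
                if len(reservoir) < c:
                    reservoir.append(i)
                else:
                    r = rng.randrange(seen)   # standard reservoir step
                    if r < c:
                        reservoir[r] = i
        samples.append(reservoir)
    return samples
```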
@article{DBLP:conf/spire/CliffordEPPR08,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Benny Porat and
Ely Porat and
Amir Rothschild},
title = {Mismatch Sampling},
journal = {Information and Computation},
year = {2012},
pages = {99-108},
ee = {http://dx.doi.org/10.1007/978-3-540-89097-3_11},
bibsource = {DBLP, http://dblp.uni-trier.de}
}