TY - GEN
T1 - Optimal mean-based algorithms for trace reconstruction
AU - De, Anindya
AU - O'Donnell, Ryan
AU - Servedio, Rocco A.
N1 - Funding Information:
Supported by start-up grant from Northwestern University. Supported by NSF grant CCF-1618679. Supported by NSF grants CCF-1420349 and CCF-1563155.
Publisher Copyright:
© 2017 ACM.
PY - 2017/6/19
Y1 - 2017/6/19
N2 - In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruc tion problem was due to Holenstein et al. [SODA 2008]; it uses exp(Õ(n1/2)) samples and running time for any fixed 0 < δ < 1. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least nΩ(log n) samples. In this paper we improve both of these results, obtaining match ing upper and lower bounds for mean-based trace reconstruction For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n1/3)) time and traces; we also prove that any mean-based algorithm must use at least exp(Ω(n1/3)) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ:= 1 - δ subconstant: when (log3 n)/n << δ < 1/2 the bound is exp(-Θ(δn)1/3), and when 1/√n << ρ ≤ 1/2 the bound is exp(-Θ(n/ρ)1/3). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.
AB - In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruc tion problem was due to Holenstein et al. [SODA 2008]; it uses exp(Õ(n1/2)) samples and running time for any fixed 0 < δ < 1. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least nΩ(log n) samples. In this paper we improve both of these results, obtaining match ing upper and lower bounds for mean-based trace reconstruction For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n1/3)) time and traces; we also prove that any mean-based algorithm must use at least exp(Ω(n1/3)) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ:= 1 - δ subconstant: when (log3 n)/n << δ < 1/2 the bound is exp(-Θ(δn)1/3), and when 1/√n << ρ ≤ 1/2 the bound is exp(-Θ(n/ρ)1/3). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.
KW - Deletion channel
KW - Littlewood polynomials
KW - Trace reconstruction
UR - http://www.scopus.com/inward/record.url?scp=85024385040&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85024385040&partnerID=8YFLogxK
U2 - 10.1145/3055399.3055450
DO - 10.1145/3055399.3055450
M3 - Conference contribution
AN - SCOPUS:85024385040
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 1047
EP - 1056
BT - STOC 2017 - Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing
A2 - McKenzie, Pierre
A2 - King, Valerie
A2 - Hatami, Hamed
PB - Association for Computing Machinery
T2 - 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017
Y2 - 19 June 2017 through 23 June 2017
ER -