TY - JOUR

T1 - Optimal mean-based algorithms for trace reconstruction1,2

AU - De, Anindya

AU - O’Donnell, Ryan

AU - Servedio, Rocco A.

N1 - Funding Information:
1Supported by start up grant from Northwestern University, NSF Grant CCF-1618679, CCF-1420349 and CCF-1563155. The authors would like to thank the Simons’ Foundation for sponsoring the symposium on analysis of Boolean functions where the authors began work on this project. A. De would like to thank Aravindan Vijayaraghavan for useful discussions about this problem.

PY - 2019/4

Y1 - 2019/4

N2 - In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [in Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 389–398 (2008) ACM]; it uses exp(O(n 1/2)) samples and running time for any fixed 0 < δ < 1. It is also what we call a “mean-based algorithm,” meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least n( log n) samples. In this paper, we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n1/3)) time and traces; we also prove that any mean-based algorithm must use at least exp((n1/3)) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ = 1 − δ subconstant: when (log3 n)/n δ ≤ 1/2 the bound is exp(−(δn)1/3), and when 1/n ρ ≤ 1/2 the bound is exp(−(n/ρ)1/3). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.

AB - In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [in Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms 389–398 (2008) ACM]; it uses exp(O(n 1/2)) samples and running time for any fixed 0 < δ < 1. It is also what we call a “mean-based algorithm,” meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least n( log n) samples. In this paper, we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n1/3)) time and traces; we also prove that any mean-based algorithm must use at least exp((n1/3)) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ = 1 − δ subconstant: when (log3 n)/n δ ≤ 1/2 the bound is exp(−(δn)1/3), and when 1/n ρ ≤ 1/2 the bound is exp(−(n/ρ)1/3). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.

KW - Deletion channel

KW - Littlewood polynomials

KW - Trace reconstruction

UR - http://www.scopus.com/inward/record.url?scp=85060906435&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060906435&partnerID=8YFLogxK

U2 - 10.1214/18-AAP1394

DO - 10.1214/18-AAP1394

M3 - Article

AN - SCOPUS:85060906435

VL - 29

SP - 851

EP - 874

JO - Annals of Applied Probability

JF - Annals of Applied Probability

SN - 1050-5164

IS - 2

ER -