On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation

DSpace/Manakin Repository

On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation

Citable link to this page


Title: On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation
Author: Nelson, Jelani; Nguyẽn, Huy L.; Woodruff, David P.

Note: Order does not necessarily reflect citation order of authors.

Citation: Nelson, Jelani, Huy L. Nguyẽn, and David P. Woodruff. 2014. “On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation.” Linear Algebra and Its Applications 441: 152–167.
Full Text & Related Files:
Abstract: We study classic streaming and sparse recovery problems using deterministic linear sketches, including \(\ell_1/\ell_1\) and \(\ell_{\infty}/\ell_1\) sparse recovery problems (the latter also being known as ℓ1ℓ1-heavy hitters), norm estimation, and approximate inner product. We focus on devising a fixed matrix \(A \epsilon \mathbb{R}^{m \times n}\) and a deterministic recovery/estimation procedure which work for all possible input vectors simultaneously. Our results improve upon existing work, the following being our main contributions:

• A proof that \(\ell_{\infty}/\ell_1\) sparse recovery and inner product estimation are equivalent, and that incoherent matrices can be used to solve both problems. Our upper bound for the number of measurements is \(m=O(\varepsilon^{-2}min\{log n,(log n/log(1/\varepsilon))^2\})\). We can also obtain fast sketching and recovery algorithms by making use of the Fast Johnson–Lindenstrauss transform. Both our running times and number of measurements improve upon previous work. We can also obtain better error guarantees than previous work in terms of a smaller tail of the input vector.

• A new lower bound for the number of linear measurements required to solve \(\ell_1/\ell_1\) sparse recovery. We show \(\Omega(k/\varepsilon^2+k log(n/k)/\varepsilon)\) measurements are required to recover an x′ with \(‖x-x′‖_1\leq(1+\varepsilon)‖x_{tail(k)}‖_1\), where \(x_{tail(k)}\) is x projected onto all but its largest k coordinates in magnitude.

• A tight bound of \(m=\theta(\varepsilon^{-2}log(\varepsilon^2n))\) on the number of measurements required to solve deterministic norm estimation, i.e., to recover \(‖x‖_2\pm\varepsilon‖x‖_1\).

For all the problems we study, tight bounds are already known for the randomized complexity from previous work, except in the case of \(\ell_1/\ell_1\) sparse recovery, where a nearly tight bound is known. Our work thus aims to study the deterministic complexities of these problems. We remark that some of the matrices used in our algorithms, although known to exist, currently are not yet explicit in the sense that deterministic polynomial time constructions are not yet known, although in all cases polynomial time Monte Carlo algorithms are known.
Published Version: doi:10.1016/j.laa.2012.12.025
Other Sources: http://arxiv.org/abs/1206.5725
Terms of Use: This article is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#OAP
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:13629629
Downloads of this work:

Show full Dublin Core record

This item appears in the following Collection(s)


Search DASH

Advanced Search