Publication:

Randomized Methods in Streaming Algorithms and Error Correction

Loading...
Thumbnail Image

Date

2019-09-10

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Blasiok, Jaroslaw. 2019. Randomized Methods in Streaming Algorithms and Error Correction. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

\emph{Streaming algorithms} are designed to efficiently process a massive amount of data in the context of strict space requirements. Specifically, a large dataset is processed sequentially by an algorithm with working space significantly smaller than the size of the entire dataset. Upon processing the entire dataset, the algorithm should produce probabilistic estimates of statistics of interest. A memory footprint of such an algorithm, a \emph{sketch}, could be thought of as a highly compressed version of the entire dataset --- it consists of enough information to recover estimates for quantities of interest, as well as allows for limited further processing, for example in order to incorporate additional data. In the area of streaming algorithms, the goal is to understand the limits of computation under restricted resource requirements: for a specific task at hand, what is the minimum space consumption of a streaming algorithm? How this space depends on natural parameters of interests, such as the size of the data domain, the relative error of the estimator, and the failure probability? Error correction is, in a sense, a dual problem: for communication over a noisy channel, we wish to \emph{introduce redundancy} and send a longer transmission over the channel --- such that even if the transmission gets corrupted, the receiver can recover the original message. The main question in this area is how to introduce the amount of redundancy as small as possible for a given error rate, and how to achieve it with efficient algorithms for encoding and decoding. In this thesis, I present several new contributions to these two areas. \begin{itemize} \item The first algorithm for classical streaming problem \emph{distinct elements} with optimal space complexity with respect to all of the natural parameters of interest: domain size, relative error, and failure probability. \item Algorithms for \emph{tracking} variants of streaming problems, where the statistic of interest is reported after each update in the data stream. Specifically, an optimal algorithm for tracking of distinct elements, and improved algorithms for tracking of $F_p$ moments of frequency vectors, for $p\in (0, 2)$. \item Matching upper and lower bounds (up to polylogarithmic factors) for streaming space complexity of computing arbitrary \emph{symmetric norm} of the frequency vector, in terms of geometric properties of this norm. \item Modular framework for the analysis of polar codes. Polar codes are a first known construction of codes that \emph{efficiently achieve capacity} --- i.e., achieve the redundancy rate that is arbitrarily close to the theoretical limit (\emph{capacity} of a channel), together with efficient algorithms for encoding and decoding, where the minimum length of the codeword grows just polynomially in the inverse of the desired \emph{gap to capacity}. The new framework introduced here allows showing that much more general class of polar codes than was previously known efficiently achieve capacity. \item New results in the analysis of polar codes, showing exponentially small failure probability for all non-trivial families of polar codes in the small blocklength regime. \end{itemize}

Description

Other Available Sources

Research Data

Keywords

streaming algorithms, distinct elements, continuous monitoring, error-correcting codes, polar codes

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Related Stories