Publication: Streaming Algorithms for Code Sparsification
Open/View Files
Date
Authors
Published Version
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
Citation
Abstract
The rapid growth of data collection and analysis in the 21st century has prompted the development of several new techniques in theoretical computer science. The streaming model, in which an algorithm has sequential access to a stream of input data and can only store a small fraction of the input in random access memory, has emerged as a powerful approach for processing datasets that are too large to store on a single system. From the perspective of algorithmic techniques, on the other hand, sparsification algorithms allow for the compression of large structures like graphs while preserving important global properties. These two ideas have a rich and interconnected history, and streaming algorithms for graph and hypergraph sparsification have become essential tools in the processing of massive datasets like social networks. Recently, a new framework called code sparsification has been developed as a method of generalizing existing sparsification algorithms through the lens of compressing matrices over finite fields. However, existing works only describe how to implement code sparsification in the classical computational setting, limiting its power. In this thesis, we develop the first algorithms for code sparsification in the insertion-only and dynamic streaming models. This provides a unified theory of graph and hypergraph sparsification over streams, and extends these techniques to allow for the space-efficient sparsification of Cayley graphs and a subset of constraint satisfaction problems.