Publication:

Streaming Algorithms and Graph Connectivity

Loading...
Thumbnail Image

Date

2019-05-06

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Wang, Zhengyu. 2019. Streaming Algorithms and Graph Connectivity. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

The streaming model treats the input as a sequence of items that can be read in one single pass using only limited storage, preferably poly-logarithmic in the input size. Streaming algorithms have numerous applications such as Internet monitoring, databases, and trend detection. In the first part of the thesis, I describe several new contributions to streaming algorithms, including: (i) A space and time optimal algorithm (taking O(1) words, O(1) update and query time) that finds ell_2-heavy hitters in insertion streams. ell_2-heavy hitter is the strongest notion of heavy hitters (or frequent items) achievable in poly-logarithmic space. (ii) An optimal space lower bound for samplers. Samplers are used as the building blocks for streaming graph algorithms such as graph connectivity. Graph connectivity is one of the most fundamental graph problems. In the second part of the thesis, I present new algorithms for graph connectivity in the dynamic streaming setting and parallel computing setting: (i) A randomized algorithm for dynamic graph connectivity using O(n log^3 n) bits with improved update time: With 1/poly(n) failure probability, the algorithm has worst-case running time O(log^3 n) per edge insertion, O(log^4 n) per edge deletion, and O(log n / loglog n) per query, where n is the number of vertices. (ii) A randomized graph connectivity algorithm that runs in O(log diam(G) loglog_{M/n} n) parallel time under the Massive Parallel Computing (MPC) model for undirected graph G with n vertices and total memory constraint M, where diam(G) refers to the largest diameter among all its connected components.

Description

Other Available Sources

Research Data

Keywords

streaming algorithm, graph connectivity, parallel algorithm, heavy hitter, sampler, massive parallel computing

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Related Stories