Optimal Las Vegas Approximate Near Neighbors in L_p

Wei, Alexander

dc.contributor.author	Wei, Alexander
dc.date.accessioned	2020-08-28T10:50:38Z
dc.date.created	2020-05
dc.date.issued	2020-06-17
dc.date.submitted	2020
dc.identifier.citation	Wei, Alexander. 2020. Optimal Las Vegas Approximate Near Neighbors in L_p. Bachelor's thesis, Harvard College.
dc.identifier.uri	https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364767	*
dc.description.abstract	In the approximate near neighbor search problem, one is given a database D of n points and asked to construct a data structure which, given a query point q, returns a "near" neighbor of q in D. As a fundamental algorithmic primitive for processing high-dimensional data, approximate near neighbor search has found a wide variety of applications, both within computer science and beyond, including to problems in machine learning, computer vision, data mining, and genomics. We show that high-dimensional approximate near neighbor search can be solved without false negatives while matching the performance of the state-of-the-art algorithms (which can produce false negatives). Our results hold for approximate near neighbor search for Euclidean space as well as all l_p spaces with 1 <= p < 2. We achieve this result in two ways: In the data-independent setting, in which algorithms do not adapt to the dataset, we show that the optimal ball-carving LSH of Andoni and Indyk (FOCS 2006) can be adapted into an algorithm without false negatives. In the data-dependent setting, in which algorithms can adapt to structure found in the dataset, we modify the data structure of Andoni et al. (SODA 2017) to produce no false negatives by constructing a new "Las Vegas" locality-sensitive filter family for the unit sphere. Our data-independent construction improves on the recent Las Vegas data structure of Ahle (FOCS 2017) for l_p when 1 < p <= 2. Our data-dependent construction does even better for l_p for all p in [1, 2] and is the first Las Vegas approximate near neighbors data structure to make use of data-dependent approaches. We also answer open questions of Indyk (SODA 2000), Pagh (SODA 2016), and Ahle by showing that for approximate near neighbors, Las Vegas data structures can match state-of-the-art Monte Carlo data structures in performance for both the data-independent and data-dependent settings and across space-time tradeoffs.
dc.description.sponsorship	Computer Science
dc.description.sponsorship	Computer Science
dc.format.mimetype	application/pdf
dc.language.iso	en
dash.license	LAA
dc.title	Optimal Las Vegas Approximate Near Neighbors in L_p
dc.type	Thesis or Dissertation
dash.depositing.author	Wei, Alexander
dc.date.available	2020-08-28T10:50:38Z
thesis.degree.date	2020
thesis.degree.grantor	Harvard College
thesis.degree.grantor	Harvard College
thesis.degree.level	Undergraduate
thesis.degree.level	Undergraduate
thesis.degree.name	AB
thesis.degree.name	AB
dc.type.material	text
thesis.degree.department	Computer Science
thesis.degree.department	Computer Science
dash.identifier.vireo
dash.author.email	w31.alex@gmail.com

Files in this item

Name:: WEI-SENIORTHESIS-2020.pdf
Size:: 570.4Kb
Format:: PDF

View/Open

This item appears in the following Collection(s)

FAS Theses and Dissertations [6136]

Show simple item record