The Performance of Johnson-Lindenstrauss Transforms: Beyond the Classical Framework
MetadataShow full item record
CitationJagadeesan, Meena. 2020. The Performance of Johnson-Lindenstrauss Transforms: Beyond the Classical Framework. Bachelor's thesis, Harvard College.
AbstractEuclidean dimensionality reduction is a commonly used technique to scale up algorithms in machine learning and data science. The goal of Euclidean dimensionality reduction is to reduce the dimensionality of a dataset, while preserving Euclidean distances between data points. Given the high-dimensionality of modern datasets, Euclidean dimensionality reduction serves as an effective pre-processing technique: it enables a significant speedup of computational tasks (such as clustering and nearest neighbors) while preserving their accuracy. Beginning with a seminal work by Johnson and Lindenstrauss in the 1980s, Euclidean dimensionality reduction has been studied for decades in the theoretical computer science and mathematics literatures. Recently, the performance of Euclidean dimensionality reduction has been studied in settings that depart from the classical framework, motivated by machine learning and neuroscience considerations.
In this undergraduate thesis, we continue the study of how Euclidean dimensionality reduction performs in settings beyond the classical framework. Our first result is a characterization of the performance of the standard dimensionality reduction schemes (called sparse Johnson-Lindenstrauss transforms) on "well-behaved" datasets; we generalize a line of work in the machine learning literature initiated by Weinberger et al. (ICML '09). Our second result is an analysis of neuroscience-motivated dimensionality reduction schemes (called sparse, sign-consistent Johnson-Lindenstrauss transforms); we build on work by Allen-Zhu et al.(PNAS '15). A shared technical theme between our results is a new perspective on analyzing Johnson-Lindenstrauss transforms.
Citable link to this pagehttps://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364687
- FAS Theses and Dissertations