Publication: Estimating Curvature of Data Manifolds with Diffusion Models
Abstract
Curvature is a fundamental characterization of the geometry of data. Differential geometry supplies many notions of curvature; this thesis gives an exposition that organizes these notions into a tree. Most importantly, we extend recent research into a novel method that harnesses diffusion models to estimate curvature from data. This new tool may aid downstream applications such as shape analysis, learning theory, and adversarial robustness.
The key novel contributions of this thesis are (i) the introduction of a diffusion model that learns the data manifold, so that its latent representation can be probed rather than the raw data, (ii) a comparison of previous estimation methods based on quadratic regression and diffusion maps, investigating how they deteriorate with increasing noise and dimension, and (iii) a new approach that uses geodesic interpolations generated by a diffusion model to estimate directions of curvature more precisely.
Our findings on toy manifolds show that curvature estimation through a diffusion model is more robust to noise, although this depends greatly on the fidelity of the diffusion model. Finally, we provide initial estimates of the curvature of small real-world image datasets in the MNIST family. These suggest that the datasets may be relatively flat, which would be consistent with how easy they are to learn empirically.
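To make the quadratic regression baseline mentioned above concrete, the following is a minimal sketch of one common form of it on a toy manifold. It is not the thesis's implementation: the function names `sample_sphere_cap` and `principal_curvatures`, the choice of a sphere, and all parameter values are assumptions made for illustration. The sketch estimates a tangent plane by PCA, fits the height above that plane as a quadratic in the tangent coordinates, and reads the principal curvatures off the fitted Hessian.

```python
import numpy as np


def sample_sphere_cap(n, radius, max_angle, noise, rng):
    """Sample n points near the north pole of a sphere, plus Gaussian noise.

    Polar angles are drawn uniformly in [0, max_angle], which is not
    area-uniform but is sufficient for a sketch.
    """
    theta = rng.uniform(0.0, max_angle, n)
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    pts = radius * np.stack(
        [np.sin(theta) * np.cos(phi),
         np.sin(theta) * np.sin(phi),
         np.cos(theta)],
        axis=1,
    )
    return pts + noise * rng.standard_normal(pts.shape)


def principal_curvatures(points, base):
    """Estimate principal curvatures of a 2D surface in R^3 at `base`.

    The sign of the result depends on the (arbitrary) orientation of the
    normal found by PCA, so only magnitudes and the product are meaningful.
    """
    centered = points - points.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance:
    # the first two span the estimated tangent plane, the last is the normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    local = (points - base) @ vt.T
    u, v, h = local[:, 0], local[:, 1], local[:, 2]
    # Model: h ~ c0 + c1*u + c2*v + (a/2)*u^2 + b*u*v + (c/2)*v^2
    design = np.stack(
        [np.ones_like(u), u, v, 0.5 * u**2, u * v, 0.5 * v**2], axis=1
    )
    coef, *_ = np.linalg.lstsq(design, h, rcond=None)
    hessian = np.array([[coef[3], coef[4]], [coef[4], coef[5]]])
    return np.linalg.eigvalsh(hessian)


# Toy check: a sphere of radius 2 has principal curvatures of magnitude 1/2
# and Gaussian curvature 1/4 everywhere.
rng = np.random.default_rng(0)
radius = 2.0
points = sample_sphere_cap(2000, radius, max_angle=0.3, noise=0.0, rng=rng)
curvatures = principal_curvatures(points, base=np.array([0.0, 0.0, radius]))
gaussian_curvature = curvatures[0] * curvatures[1]
print(np.abs(curvatures), gaussian_curvature)
```

Raising the `noise` argument or embedding the points in a higher-dimensional ambient space is one simple way to observe the kind of deterioration the abstract refers to, since both the PCA tangent estimate and the quadratic fit are sensitive to off-manifold perturbations.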