Estimating Curvature of Data Manifolds with Diffusion Models

Date

2025-05-16

The Harvard community has made this article openly available.

Citation

Wang, Jason. 2025. Estimating Curvature of Data Manifolds with Diffusion Models. Bachelors Thesis, Harvard University Engineering and Applied Sciences.

Abstract

In the quest to understand the geometry of data, curvature is a fundamental characterization. Differential geometry supplies many notions of curvature; this thesis presents an exposition that organizes these notions into a tree. Most importantly, we extend recent research into a novel method that harnesses diffusion models to estimate curvature from data. This new tool may aid downstream applications such as shape analysis, learning theory, and adversarial robustness.

The key novel contributions of this thesis are (i) the introduction of a diffusion model that learns the data manifold, so that its latent representation can be probed instead of the raw data; (ii) a comparison of previous estimation methods based on quadratic regression and diffusion maps, investigating how they deteriorate with increasing noise and dimension; and (iii) a new approach that uses geodesic interpolations generated by a diffusion model to estimate directions of curvature more precisely.
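For readers unfamiliar with the quadratic-regression baseline named in (ii), the general idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (`sample_sphere`, `gaussian_curvature_quadratic_fit`), the choice of a 2-sphere in three dimensions as the toy manifold, the use of local PCA to obtain a tangent frame, and the neighborhood size `k` are all hypothetical choices made for this example.

```python
import numpy as np


def sample_sphere(n, radius=1.0, noise=0.0, seed=0):
    """Sample n points near a 2-sphere of the given radius (hypothetical toy data)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))
    x = radius * x / np.linalg.norm(x, axis=1, keepdims=True)
    # Optional isotropic Gaussian noise in the ambient space.
    return x + noise * rng.normal(size=x.shape)


def gaussian_curvature_quadratic_fit(points, p, k=100):
    """Estimate Gaussian curvature of a surface in R^3 at p by local quadratic regression."""
    # 1. Collect the k nearest neighbours of p.
    dist = np.linalg.norm(points - p, axis=1)
    nbrs = points[np.argsort(dist)[:k]]

    # 2. Local PCA: the two leading right singular vectors span the tangent
    #    plane, the last one approximates the normal direction.
    centered = nbrs - nbrs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    tangent, normal = vt[:2], vt[2]

    # 3. Express neighbours as tangent coordinates u and normal height h.
    u = (nbrs - p) @ tangent.T
    h = (nbrs - p) @ normal

    # 4. Least-squares fit h ~ c + g.u + 0.5 * u^T H u.
    design = np.column_stack([
        np.ones(len(nbrs)), u[:, 0], u[:, 1],
        0.5 * u[:, 0] ** 2, u[:, 0] * u[:, 1], 0.5 * u[:, 1] ** 2,
    ])
    coef, *_ = np.linalg.lstsq(design, h, rcond=None)

    # 5. The fitted Hessian approximates the second fundamental form at p;
    #    its determinant approximates the Gaussian curvature (the small
    #    gradient correction is ignored in this sketch).
    hessian = np.array([[coef[3], coef[4]], [coef[4], coef[5]]])
    return np.linalg.det(hessian)


if __name__ == "__main__":
    for noise in (0.0, 0.01, 0.05):
        pts = sample_sphere(5000, radius=1.0, noise=noise, seed=0)
        est = gaussian_curvature_quadratic_fit(pts, pts[0], k=100)
        print(f"noise={noise}: estimated Gaussian curvature {est:.3f} (true value 1.0)")
```

Running the loop with increasing `noise` gives a rough feel for how this kind of local fit degrades as noise grows, which is the behaviour the comparison in (ii) investigates. The sketch handles only a two-dimensional surface in three dimensions; higher-dimensional manifolds need one fitted Hessian per normal direction.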

Our findings on toy manifolds show that curvature estimation through a diffusion model is more robust to noise, although this depends greatly on the fidelity of the diffusion model. Finally, we provide initial estimates of the curvature of small real-world image datasets in the MNIST family. These estimates suggest that the datasets might be relatively flat, which would be consistent with how easy they are to learn in practice.

Keywords

curvature, differential geometry, diffusion map, diffusion models, manifold learning, regression, Artificial intelligence, Applied mathematics, Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
