Publication: Training Energy-Based Models to Learn Gaussian Mixture Distributions via Langevin Dynamics
Abstract
Energy-Based Models (EBMs) offer a flexible framework for modeling probability distributions through an energy function, avoiding the need for an explicitly normalized density function. Using toy distributions such as Gaussian Mixture Models (GMMs), this thesis examines variations of Langevin dynamics as a method for sampling from and training EBMs, providing concrete implementations. We present the theoretical foundations and practical implementations of key EBM training algorithms, including the Unadjusted Langevin Algorithm (ULA), the Metropolis-Adjusted Langevin Algorithm (MALA), and an adaptation of the Jarzynski equality for sampling and training. Results demonstrate the benefits and limitations of these methods: the Jarzynski equality offers a time-dependent perspective that accelerates the discovery of distribution modes, but it also introduces practical challenges, such as implementation difficulty and instability during training. Future work involves refining the Jarzynski-based training procedure, exploring alternative loss functions such as the Fisher divergence, and extending the compositional capabilities of EBMs for tasks in robotics and reinforcement learning.
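As a rough illustration of the simplest sampler named in the abstract, the sketch below runs the Unadjusted Langevin Algorithm on a one-dimensional, two-component Gaussian mixture. It is not taken from the thesis: the mixture parameters, the step size, and the function names (`energy_grad`, `ula_sample`) are assumptions chosen for illustration, and the energy is the known mixture energy E(x) = -log p(x) rather than a learned network. The update used is x_{k+1} = x_k - h * dE/dx(x_k) + sqrt(2h) * xi_k, with xi_k drawn from a standard normal.

```python
import math
import random

# Hypothetical 1D two-component Gaussian mixture (illustrative values only).
WEIGHTS = [0.5, 0.5]
MEANS = [-2.0, 2.0]
SIGMA = 0.5


def energy_grad(x):
    """Gradient of E(x) = -log p(x) for the mixture above.

    The normalizing constant of p does not affect the gradient.
    """
    # Log of each weighted, unnormalized component density at x.
    logits = [math.log(w) - 0.5 * ((x - m) / SIGMA) ** 2
              for w, m in zip(WEIGHTS, MEANS)]
    # Shift by the max before exponentiating to avoid underflow.
    top = max(logits)
    resp = [math.exp(l - top) for l in logits]
    total = sum(resp)
    # dE/dx = sum_k r_k(x) * (x - mu_k) / sigma^2, with r_k the responsibilities.
    return sum(r * (x - m) for r, m in zip(resp, MEANS)) / (SIGMA ** 2 * total)


def ula_sample(n_chains=300, n_steps=500, step=0.01, seed=0):
    """Run independent ULA chains and return their final positions."""
    rng = random.Random(seed)
    # Assumed initialization: uniform over an interval covering both modes.
    xs = [rng.uniform(-4.0, 4.0) for _ in range(n_chains)]
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        # Gradient step on the energy plus Gaussian noise (no accept/reject).
        xs = [x - step * energy_grad(x) + noise_scale * rng.gauss(0.0, 1.0)
              for x in xs]
    return xs
```

Calling `ula_sample()` returns one final position per chain. With these assumed settings the chains should settle near one of the two means; because the modes are well separated, each chain will most likely stay in the mode nearest its starting point. MALA would add a Metropolis accept/reject step after each proposal, which this sketch omits.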