On Programming Languages for Probabilistic Modeling
Huang, Daniel E.
MetadataShow full item record
AbstractThe problem of probabilistic modeling and inference, at a high-level, can be viewed as constructing a (model, query, inference) tuple, where an inference algorithm implements a query on a model. Notably, the derivation of inference algorithms can be a difficult and error-prone task. Hence, researchers have explored how ideas from probabilistic programming can be applied. In the context of constructing these tuples, probabilistic programming can be seen as taking a language-based approach to probabilistic modeling and inference. For instance, by using (1) appropriate languages for expressing models and queries and (2) devising inference techniques that operate on encodings of models (and queries) as program expressions, the task of inference can be automated. Notably, these modeling languages will typically provide continuous distributions. Hence, there is no direct connection with traditional, discrete notions of computation. Nevertheless, there is an intuitive understanding of such languages given in terms of sampling.
In this dissertation, I investigate aspects of probabilistic programming languages (PPLs) in two parts. In the first part, I show how (Type-2) computable distributions can be used to give both distributional and (algorithmic) sampling semantics to a high-level, PCF-like language extended with continuous distributions. One motivation for using computable distributions, as opposed to more generally measures, is so that we can think of a Turing-complete probabilistic modeling language as expressing computable distributions. In the second part, I describe a compiler that generates Markov Chain Monte Carlo (MCMC) inference sampling algorithms for a language called AugurV2, which expresses a class of fixed-structure, parametric models. To manage the large design space of MCMC algorithms and implementations, I propose a sequence of intermediate languages that enable a compiler to gradually and successively refine a declarative description of a probabilistic model and a query for posterior samples into an executable inference algorithm. The compilation strategy produces composable MCMC algorithms for execution on the CPU and GPU. I will also show how to use the semantics based on computable distributions to justify aspects of AugurV2's design and implementation.
Citable link to this pagehttp://nrs.harvard.edu/urn-3:HUL.InstRepos:40046525
- FAS Theses and Dissertations