Publication: Dimensionality Reduction of Dynamic Topic Models Using a Finite State Machine Representation
Abstract
Dynamic topic models produce semantically interpretable topics that describe data evolving through time. The number of these topics grows with the length of the time range fitted, which diminishes the interpretability of the model output. In this work, we introduce a finite state machine representation that characterizes the trajectory of the data with a smaller collection of topics. We develop a hierarchical clustering scheme to generate the states of the finite state machine, as well as a transition model to describe data evolution. The primary advantages of our model, lower-dimensional summaries of data and a more intuitive representation of text evolution, are important for human interpretation. We apply our model to a dataset of electronic health records (EHRs) of patients diagnosed with Autism Spectrum Disorder (ASD). We offer interpretations of model output describing disease trajectories of patients with ASD that corroborate previous findings on comorbidities common to ASD. Additionally, we show that predictive performance is retained despite the reduced dimension of the model parameters.
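The general idea of grouping topics into states and then describing movement between those states can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the distance measure, the linkage rule, or how transitions are estimated, so the Euclidean distance, average linkage, and count-based transition estimates below are all assumptions, and the toy topic vectors and sequences are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two topic vectors (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(topics, n_states):
    """Average-linkage agglomerative clustering (assumed linkage rule).

    Merges the closest pair of clusters until n_states remain and
    returns one state label per topic.
    """
    clusters = [[i] for i in range(len(topics))]
    while len(clusters) > n_states:
        def linkage(pair):
            a, b = pair
            total = sum(
                euclidean(topics[i], topics[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            return total / (len(clusters[a]) * len(clusters[b]))

        # combinations yields a < b, so deleting index b is safe
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(topics)
    for state, members in enumerate(clusters):
        for i in members:
            labels[i] = state
    return labels


def transition_matrix(state_sequences, n_states):
    """Row-normalized counts of consecutive state pairs (assumed estimator)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in state_sequences:
        for s, t in zip(seq, seq[1:]):
            counts[s][t] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy data: four topics as word distributions over three words.
topics = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
labels = agglomerate(topics, n_states=2)

# Hypothetical per-record sequences of dominant topic indices over time.
topic_sequences = [[0, 1, 2, 3], [1, 0, 3]]
state_sequences = [[labels[t] for t in seq] for seq in topic_sequences]
matrix = transition_matrix(state_sequences, n_states=2)

print(labels)
print(matrix)
```

In this toy setting four topics collapse into two states, and the transition matrix summarizes how records move between them, which is the kind of lower-dimensional summary the abstract describes.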