Publication:

Bayesian Weighted Clustering Methods for Dietary Survey Data

Date

2025-01-15

The Harvard community has made this article openly available.

Citation

Wu, Stephanie. 2025. Bayesian Weighted Clustering Methods for Dietary Survey Data. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

In this dissertation, we examine the intersection between Bayesian and survey statistics. Motivated by questions in nutritional epidemiology and health disparities, we introduce three Bayesian weighted clustering models to better understand dietary behaviors of understudied populations using data from population surveys.

Chapter 1 provides an introduction and a brief overview of the motivation behind our methods. Chapter 2 contains a literature review of Bayesian survey statistics methods and their applications in population health. We summarize advantages and disadvantages, review recent applications and extensions, and consider how these approaches may be leveraged to improve research in population health equity.

In Chapter 3, we introduce a survey-weighted extension to joint models that cluster multivariate categorical variables supervised by a binary outcome, referred to as supervised weighted overfitted latent class analysis (SWOLCA). We evaluate model performance using simulation studies and apply SWOLCA to data from the National Health and Nutrition Examination Survey (NHANES) to derive dietary patterns associated with hypertensive outcomes among low-income women in the United States (US).

Chapter 4 extends Bayesian clustering models to non-probability samples (NPS), where the mechanism of inclusion into the sample is unknown, through the introduction of a Weighted Overfitted Latent Class Analysis for Non-probability samples (WOLCAN). Dietary behavior patterns are identified for adults in Puerto Rico aged 30 to 75 and analyzed in association with type 2 diabetes, hypertension, and hypercholesterolemia.

Chapter 5 highlights a survey-weighted extension to model-based clustering that addresses subpopulation heterogeneity, referred to as weighted robust profile clustering (WRPC). With data from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), we derive dietary patterns among individuals of Puerto Rican heritage living in the US and analyze food consumption heterogeneity by income, nativity, sex, and age.

Chapter 6 provides concluding remarks and avenues for future research.

Keywords

Bayesian clustering, Dietary pattern analysis, Latent class analysis, Non-probability samples, Nutritional epidemiology, Survey statistics, Biostatistics, Nutrition, Public health

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
