Publication: Developments in Aggregate Relational Data
Date
2024-07-11
The Harvard community has made this article openly available.
Citation
da Silva Baum, Derick. 2024. Developments in Aggregate Relational Data. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
Aggregate relational data (ARD) on relationships between individuals and subpopulations have been informative for studying egocentric network size, assessing segregation in contact with subgroups, and estimating the size of unlisted groups. Despite their wide range of applications, ARD survey questions are difficult to answer, making them prone to considerable measurement error. Additionally, the data generated by these questions can be challenging to model and analyze, necessitating various assumptions about the nature of acquaintanceship with subgroups.
This dissertation consists of three chapters addressing the following research questions about the quality of ARD and strategies for modeling these data: 1) What are the properties of models for analyzing ARD? 2) How can we evaluate the fit of ARD models to the observed data? 3) How reliable are ARD survey items and the network size measure obtained by combining them? We highlight key findings related to each of these questions. In addressing the first question, we found that, under some conditions, simpler and more sophisticated model specifications yield identical estimates for quantities of interest, such as network size and subgroup prevalence. In those cases, analysts might opt for the simplest alternative to avoid the extra variance that redundant parameters can introduce.
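As a point of reference for what a "simple" ARD specification can look like, the sketch below computes the basic scale-up estimate of a respondent's network size: the total number of reported acquaintances across subgroups of known size, divided by the share of the population those subgroups represent. This is a generic illustration and is not claimed to be one of the specifications compared in the dissertation; the function name `scale_up_degree` and all numbers are hypothetical.

```python
def scale_up_degree(counts, group_sizes, population_size):
    """Basic scale-up estimate of one respondent's network size.

    counts[k]       -- reported number of acquaintances in subgroup k
    group_sizes[k]  -- known size of subgroup k in the population
    population_size -- total population size
    """
    if len(counts) != len(group_sizes):
        raise ValueError("counts and group_sizes must have the same length")
    # Reported contacts divided by the population share of the subgroups.
    return population_size * sum(counts) / sum(group_sizes)


# Hypothetical respondent and subgroup sizes, for illustration only.
counts = [2, 0, 5]
group_sizes = [10_000, 5_000, 35_000]
population_size = 1_000_000

estimate = scale_up_degree(counts, group_sizes, population_size)
print(estimate)
```

More elaborate models typically add respondent-level or subgroup-level parameters on top of this kind of ratio.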
In addressing the second question, we showed that a stepwise approach to model augmentation, which considers models progressively from simpler to more complex, can reveal novel insights into the effects of different model assumptions. This approach offers a more nuanced perspective on patterns of acquaintanceship with subgroups than the typical procedure in the ARD literature, which focuses primarily on parameter estimates from a single model. For example, we demonstrated that subgroups with similar levels of overdispersion, a statistic commonly used to summarize the extent of segregation in contact with subgroups, can exhibit vastly different distributions of reported connections.
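The point about overdispersion can be illustrated with a toy example. The sketch below uses the variance-to-mean ratio of reported counts as a stand-in for overdispersion; that choice is an assumption made for illustration, since the dissertation's statistic may be a model-based parameter. The two count vectors are invented and are not data from the dissertation.

```python
from collections import Counter
from statistics import mean, pvariance


def variance_to_mean(counts):
    """Population variance divided by the mean (equals 1 for a Poisson)."""
    return pvariance(counts) / mean(counts)


# Hypothetical reported connections to two subgroups, eight respondents each.
group_a = [0, 0, 0, 0, 4, 4, 4, 4]  # half know nobody, half know four
group_b = [1, 1, 1, 1, 1, 1, 3, 7]  # everyone knows someone, one long tail

ratio_a = variance_to_mean(group_a)
ratio_b = variance_to_mean(group_b)

# Same summary statistic, different distributions of reported connections.
print(ratio_a, ratio_b)
print(sorted(Counter(group_a).items()))
print(sorted(Counter(group_b).items()))
```

Both vectors have the same mean and variance, so the ratio cannot distinguish them, yet the share of respondents reporting zero connections differs sharply. Inspecting the full distribution exposes a difference that the single summary hides.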
Finally, the third chapter shows that measurement error in individual ARD items is severe, with reliability estimates falling below the standard adequacy threshold of 0.70. Measurement errors at the item level likely affect quantities derived from ARD models, leading to less precise and potentially biased estimates. We illustrated this for network size, whose reliability is also below that threshold.
Keywords
Sociology, Statistics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service