Publication:
Developments in Aggregate Relational Data

Date

2024-07-11

Citation

da Silva Baum, Derick. 2024. Developments in Aggregate Relational Data. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Aggregate relational data (ARD) on relationships between individuals and subpopulations have been informative for studying egocentric network size, assessing segregation in contact with subgroups, and estimating the size of unlisted groups. Despite their wide range of applications, ARD survey questions are difficult to answer, making them prone to considerable measurement error. Additionally, the data generated by these questions can be challenging to model and analyze, necessitating various assumptions about the nature of acquaintanceship with subgroups. This dissertation consists of three chapters addressing the following research questions about the quality of ARD and strategies for modeling these data: 1) What are the properties of models for analyzing ARD? 2) How can we evaluate the fit of ARD models to the observed data? 3) How reliable are ARD survey items and the network size measure obtained by combining them? We highlight key findings related to each of these questions. In addressing the first question, we found that under some conditions, simpler and more sophisticated modeling specifications yield identical estimates for quantities of interest, such as network and subgroup prevalence. Analysts might opt for the simplest alternative to prevent unnecessary extra variance that could arise from including redundant parameters. Our endeavors to answer the second question showed that a stepwise approach to model augmentation that considers models progressively, from simpler to more complex, can reveal novel insights into the effects of different model assumptions. This approach enables a more nuanced perspective on patterns of acquaintanceship with subgroups compared to the typical procedure adopted in the ARD literature, which primarily focuses on parameter estimates from a single model. 
For example, we demonstrated that subgroups with similar levels of overdispersion, a statistic commonly used to summarize the extent of segregation in contact with subgroups, can exhibit vastly different distributions of reported connections. Finally, the third chapter shows that measurement error in individual ARD items is severe, with reliability estimates falling below the standard adequacy threshold of 0.70. Measurement errors at the item level likely affect quantities derived from ARD models, leading to less precise and potentially biased estimates. We illustrated this for network size, whose reliability is also below that threshold.

Keywords

Sociology, Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
