Publication: Respondent-Driven Sampling and Homophily in Network Data
Date
2013-02-13
Authors
Published Version
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
The Harvard community has made this article openly available. Please share how this access benefits you.
Citation
Nesterko, Sergiy O. 2012. Respondent-Driven Sampling and Homophily in Network Data. Doctoral dissertation, Harvard University.
Research Data
Abstract
Data that can be represented as a network, where there are measurements both on units and on pairs of units, are becoming increasingly prevalent in the social sciences and public health. Homophily in network data, or the tendency of units to connect based on similar nodal attribute values (i.e. income, HIV status) more often than expected by chance is receiving strong attention from researchers in statistics, medicine, sociology, public health and others. Respondent-Driven Sampling (RDS) is a link-tracing network sampling strategy heavily used in public health worldwide that is cost efficient and allows us to survey populations inaccessible by conventional techniques. Via extensive simulation we study the performance of existing methods of estimating population averages, and show that they have poor performance if there is homophily on the quantity surveyed. We propose the first model-based approach for this setting and show its superiority as a point estimator and in terms of uncertainty intervals coverage rates, and demonstrate its application to a real life RDS-based survey. We study how the strength of homophily effects can be estimated and compared across networks and different binary attributes under several network sampling schemes. We give a proof that homophily can be effectively estimated under RDS and propose a new homophily index. This work moves towards a deeper understanding of network structure as a function of nodal attributes and network sampling under homophily.
Description
Other Available Sources
Keywords
homophily, model-based estimation, network, respondent-driven sampling, sampling, statistics
Terms of Use
Metadata Only