Publication:
Respondent-Driven Sampling and Homophily in Network Data

Date

2013-02-13

Citation

Nesterko, Sergiy O. 2012. Respondent-Driven Sampling and Homophily in Network Data. Doctoral dissertation, Harvard University.

Abstract

Data that can be represented as a network, with measurements on both units and pairs of units, are becoming increasingly prevalent in the social sciences and public health. Homophily in network data, the tendency of units to connect with others that have similar nodal attribute values (e.g., income, HIV status) more often than expected by chance, is receiving strong attention from researchers in statistics, medicine, sociology, public health, and other fields. Respondent-Driven Sampling (RDS) is a cost-efficient link-tracing network sampling strategy, heavily used in public health worldwide, that allows us to survey populations inaccessible by conventional techniques. Via extensive simulation, we study the performance of existing methods of estimating population averages and show that they perform poorly if there is homophily on the quantity surveyed. We propose the first model-based approach for this setting, show its superiority both as a point estimator and in terms of the coverage rates of its uncertainty intervals, and demonstrate its application to a real-life RDS-based survey. We study how the strength of homophily effects can be estimated and compared across networks and different binary attributes under several network sampling schemes. We give a proof that homophily can be effectively estimated under RDS and propose a new homophily index. This work moves towards a deeper understanding of network structure as a function of nodal attributes and of network sampling under homophily.
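
To make the notion of homophily "more often than expected by chance" concrete, the following minimal Python sketch compares the observed share of same-attribute ties in a toy network with the share expected if ties were formed at random while keeping each node's degree fixed. This is only an illustration under stated assumptions: it is not the homophily index proposed in the dissertation, and the function names, the toy edge list, and the binary attribute values are all hypothetical.

```python
def same_attribute_edge_fraction(edges, attr):
    """Observed fraction of edges whose two endpoints share the same attribute value."""
    same = sum(1 for u, v in edges if attr[u] == attr[v])
    return same / len(edges)


def chance_same_fraction(edges, attr):
    """Same-attribute fraction expected under random pairing of edge endpoints.

    Assumes a binary (0/1) attribute. Each edge contributes two endpoints;
    p is the share of endpoints attached to nodes with attribute value 1.
    """
    endpoints = [attr[u] for u, _ in edges] + [attr[v] for _, v in edges]
    p = sum(endpoints) / len(endpoints)
    return p * p + (1 - p) * (1 - p)


# Hypothetical toy network: two triangles joined by a single bridging edge.
attr = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

observed = same_attribute_edge_fraction(edges, attr)
expected = chance_same_fraction(edges, attr)
print(f"observed same-attribute share: {observed:.3f}")
print(f"share expected by chance:      {expected:.3f}")
```

An observed share well above the chance share, as in this toy network, is the pattern the abstract refers to as homophily. Estimating such a quantity from an RDS sample rather than from the full network is the harder problem the dissertation addresses, and this sketch does not attempt it.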

Keywords

homophily, model-based estimation, network, respondent-driven sampling, sampling, statistics

Terms of Use

Metadata Only
