Empirical Assessment of Respondent Driven Sampling from Total Survey Error Perspectives

$549,999 | FY2015 | SBE | NSF

Regents Of The University Of Michigan - Ann Arbor, Ann Arbor MI

Investigators

Sunghee Lee (contact), Juliette Roddy, Michael Elliott, Sean E McCabe

Abstract

This research project will provide an empirical investigation into the realities of respondent driven sampling (RDS) data collection. RDS is a new method for sampling rare or hidden populations. The method starts with members of the target population and traces their social networks as well as the networks of those to whom they are connected. Although this method is growing in popularity, there is a scarcity of publicly available RDS datasets. As a result, methodological assessments of RDS are very limited. This research project will conduct a theory-driven methodological assessment of RDS. The project will provide a platform to develop design features for RDS studies that minimize potential errors and violations of critical assumptions. With appropriate inference strategies, rare or hidden populations will be more fully represented in the data collected by RDS methods. Absent these types of examinations, behavioral and social science data collected through RDS run the risk of mischaracterizing rare or hidden populations in unknown ways. To the extent that these data inform public policies, rare or hidden populations may benefit from this research. New datasets will be generated and made publicly available to promote methodological research beyond this project.

This research project will conduct an empirical assessment of RDS within the Total Survey Error (TSE) framework. TSE is a framework in survey methodology that allows for the systematic examination of errors. This project seeks to improve current RDS data collection and inference practices by examining sampling productivity, error properties, and replicability. The investigators will collect data on two rare populations in Los Angeles County for which external probability-based sample data are available. The probability-based data will be used as "gold standards" against which estimates from the RDS-collected data will be compared, providing a unique opportunity to assess RDS not only as a sampling method but also as a data collection method. The data sets will empirically inform a simulation study that will examine the effects of various network structures and response propensities on sampling productivity and inferential errors. The project is supported by the Methodology, Measurement, and Statistics Program and a consortium of federal statistical agencies as part of a joint activity to support research on survey and statistical methodology.
