Empirically testing network sampling strategies in unbounded risk populations

$467,253 | R01 | FY2013 | DA | NIH

Family Health International, Durham NC

Abstract

DESCRIPTION (provided by applicant): Network-based investigations and interventions are hindered in part by uncertainties about sampling. Hidden, elusive populations are not amenable to the master-frame, top-down sampling that permits valid estimation and hypothesis testing. One result has been a burgeoning of theoretical approaches and simulation. Completing the loop from theoretical result to empirical verification has received less attention. We propose that the theoretical concerns about empirical network sampling may not be justified, and we will compare three sampling approaches in two areas (Dar es Salaam, Tanzania and Atlanta, GA) to determine whether they produce similar network configurations. We will conduct the study in two phases. Phase One, an intense geographic risk assessment, will use key informant workshops that focus on mental maps of the community, collection of GPS coordinates for sites identified by these maps, and rapid assessment surveys to establish where different types of risk activity occur within the communities in which we plan to sample. These data will then be entered into ArcGIS and analyzed to select the venues and informal areas from which to recruit seed participants in Phase Two. In Phase Two, in the selected high-risk activity areas, we will apply three sampling approaches: time-space; short chain-link (one seed and referral of three risk contacts); and long chain-link (one seed and a subsequent chain of nine risk contacts). In all three approaches, we will elicit information on all contacts (social, sexual, or drug-using) in each respondent's personal network, and follow up with interviews of the risk contacts, in accordance with the specific design.
In keeping with our hypotheses, we anticipate that the groups created by each design will link up to form a large connected component and will be characterized by similar network attributes (such as degree distribution, clustering, mixing, concurrency, component distribution, and geographic contiguity), despite differences in sampling. In addition, we posit that a relatively small number of persons is required to elucidate the underlying network configuration. If these hypotheses are substantiated, they will provide a better empirical basis for verifying theoretical relationships and for developing network-based interventions.
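
The comparison described above can be illustrated with a small simulation. This is a minimal sketch, not the study's protocol: the toy network, the function names, the number of seeds, and the reading of both chain-link designs as single referral chains of length three and nine are all assumptions made for illustration. It draws a short-chain and a long-chain sample from the same synthetic network, treats each respondent's full contact list as elicited, and reports two of the attributes named above (mean degree and the share of observed persons in the largest connected component).

```python
import random
from collections import deque


def make_toy_network(n, k, seed):
    """Hypothetical stand-in for a risk network: each person names k random others."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for u in rng.sample(range(n), k):
            if u != v:
                adj[v].add(u)
                adj[u].add(v)
    return adj


def chain_link_sample(adj, seeds, chain_length, rng):
    """From each seed, follow one referral chain of up to chain_length new contacts."""
    sampled = set()
    for s in seeds:
        sampled.add(s)
        current = s
        for _ in range(chain_length):
            candidates = sorted(u for u in adj[current] if u not in sampled)
            if not candidates:
                break  # chain dies out: no unsampled contact to refer
            current = rng.choice(candidates)
            sampled.add(current)
    return sampled


def observed_network(adj, respondents):
    """Respondents plus every contact they name (all contacts are elicited)."""
    obs = {}
    for r in respondents:
        obs.setdefault(r, set())
        for u in adj[r]:
            obs[r].add(u)
            obs.setdefault(u, set()).add(r)
    return obs


def largest_component_fraction(g):
    """Share of nodes in the largest connected component, found by breadth-first search."""
    seen, best = set(), 0
    for start in g:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for u in g[v]:
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
        best = max(best, size)
    return best / len(g) if g else 0.0


def summarize(adj, respondents):
    obs = observed_network(adj, respondents)
    return {
        "respondents": len(respondents),
        "persons_observed": len(obs),
        "mean_degree": sum(len(adj[r]) for r in respondents) / len(respondents),
        "largest_component_fraction": largest_component_fraction(obs),
    }


# Assumed sizes: 500 persons, 20 seeds; chain lengths 3 and 9 follow the abstract.
network = make_toy_network(n=500, k=3, seed=1)
seeds = random.Random(0).sample(range(500), 20)
short_sample = chain_link_sample(network, seeds, 3, random.Random(2))
long_sample = chain_link_sample(network, seeds, 9, random.Random(3))

for name, sample in [("short chain-link", short_sample), ("long chain-link", long_sample)]:
    print(name, summarize(network, sample))
```

Under the hypotheses stated above, the two designs would be expected to yield similar summaries. A random toy network cannot confirm or refute that for real risk populations, so the sketch shows only how such a comparison might be set up.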
