Mapping and predicting HIV-transmission hotspots with phylogenetics and geospatial machine learning
National Institute On Drug Abuse
Investigators
Abstract
With this project, we are bringing something very new to HIV tracking and prevention, using two different sets of big-data techniques in which we are already ahead of our peers. The first set of techniques is geospatial: we are validating new ways to interpolate physical/social-disorder data from directly observed city blockfaces to regions not directly observed. We use full-land-coverage data (e.g., satellite imagery, tax data) to account for spatial discontinuities such as parks and major roads. We are also working on ways to extrapolate beyond the outer spatial boundaries of the regions where observations have been made. We have a methods manuscript in preparation, and our success has already elicited great enthusiasm from local governmental agencies that have agreed to provide environmental data and assist in participant recruitment. The second set of techniques is temporal: we are validating new ways to generate live predictions of outpatients' imminent risk of drug craving or stress several hours into the future. We do this with machine-learning models that use several hours of a patient's GPS tracks in combination with person-level information. We have a provisional patent for this process (Method and System for a Mobile Health Platform, PCT/US2016/029553), and we are finishing a manuscript for publication. Together, these techniques can be applied to create a proactive epidemiological approach to HIV prevention. The future-prediction algorithms that we use on a time scale of hours for individuals will be adapted to work on a time scale of days, weeks, or months for areas and social venues that become hotspots for HIV transmission.
Using viral phylogenetic data, social-contact data, and activity-space data, we intend to develop wall-to-wall surface maps of HIV reservoir and transmission risk in the city of Baltimore, develop algorithms for predicting changes in those maps, and use viral phylogenetic data to tailor our algorithms for specific key populations such as men who have sex with men (MSM) of color. We have collected data from research participants to provide some of the empirical grounding for this project. One of our postdocs is working with an extramural HIV epidemiologist who has access to city and state databases that will provide input to our machine-learning models. The ultimate goal of the project is to help focus PrEP and other prevention resources on the geographical areas where they are about to become most needed, using strategies that are predictive rather than reactive.
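To make the geospatial idea concrete, the following is a minimal sketch of covariate-assisted interpolation from observed blockfaces to an unobserved location. It is not the project's method, which the abstract does not specify; it substitutes plain inverse-distance weighting, and the function name `interpolate_disorder`, the dictionary keys, the single covariate, and all numeric values are hypothetical. The assumption illustrated is that distance can be measured jointly in physical space and in full-coverage covariate space, so that an observation across a discontinuity (for example, on the far side of a park) counts for less than one that is equally close but in similar surroundings.

```python
import math


def interpolate_disorder(observed, target, power=2.0, covariate_weight=1.0):
    """Inverse-distance-weighted estimate of a disorder score at `target`.

    `observed` is a list of dicts with hypothetical keys "xy", "covariates",
    and "score"; `target` has "xy" and "covariates" only. Distance combines
    physical separation with separation in covariate space.
    """
    if not observed:
        raise ValueError("at least one observed blockface is required")

    weighted_sum = 0.0
    weight_total = 0.0
    for obs in observed:
        spatial = math.dist(obs["xy"], target["xy"])
        covariate = math.dist(obs["covariates"], target["covariates"])
        # Combined distance: covariate differences act like extra separation.
        distance = math.hypot(spatial, covariate_weight * covariate)
        if distance == 0.0:
            # Target coincides with an observation: return its score directly.
            return obs["score"]
        weight = 1.0 / distance ** power
        weighted_sum += weight * obs["score"]
        weight_total += weight
    return weighted_sum / weight_total


# Illustrative, made-up data: two observed blockfaces and one unobserved point
# midway between them that resembles the first in its land-coverage covariate.
observed = [
    {"xy": (0.0, 0.0), "covariates": (0.0,), "score": 2.0},
    {"xy": (2.0, 0.0), "covariates": (1.0,), "score": 6.0},
]
target = {"xy": (1.0, 0.0), "covariates": (0.0,)}

estimate = interpolate_disorder(observed, target)
purely_spatial = interpolate_disorder(observed, target, covariate_weight=0.0)
print(round(estimate, 2), round(purely_spatial, 2))
```

With the covariate term switched off, the midpoint receives the plain average of its two neighbors; with it switched on, the estimate moves toward the neighbor whose surroundings it resembles. A validated approach would need a principled weighting of the covariates and an uncertainty estimate, neither of which this sketch attempts.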