CCF-BSF: AF: Small: Collaborative Research: Algorithmic Techniques for Inferring Transmission Networks from Noisy Sequencing Data
Georgia State University Research Foundation, Inc., Atlanta GA
Investigators
Abstract
Many viruses encode their genome in RNA and exhibit high genomic diversity within their hosts. Advances in sequencing technologies have made it feasible to track viral transmissions and detect outbreaks in a timely manner on a global scale. The goal of this project is to develop a comprehensive set of predictive mathematical models and accurate computational methods for integrated analysis of the massive epidemiological and sequencing datasets generated by emerging molecular surveillance programs. Research results will be broadly disseminated via journal publications and presentations at international conferences, including the Workshop on Computational Advances in Molecular Epidemiology organized by the PIs. Prototype implementations of the developed algorithms will be distributed as open-source packages and incorporated into the cloud-based Global Hepatitis Outbreak and Surveillance Toolkit (GHOST) developed at CDC. The project will provide ample opportunities for promoting participation of women and underrepresented groups in bioinformatics and molecular epidemiology research at GSU, UCONN, Georgia Tech and Tel Aviv University. An important aspect of the project is to disseminate core concepts and ideas from Computer Science and Computational Biology to wide audiences by (1) teaching Computer Science, in an informal setting, to middle and high school students, and (2) incorporating computational thinking into the Life Science curriculum at the undergraduate level.
The proposed research and education activities will leverage the extensive expertise of an interdisciplinary team composed of computer scientists, mathematicians, and molecular epidemiologists to develop accurate mathematical models and computational methods for key problems in molecular epidemiology. These problems include deconvolution and inference of viral variants from error-prone pooled sequencing data, inference of relatedness between viral samples and of transmission networks, inference of transmission event times and network parameters, and predictive modeling of transmission network dynamics. The team will carry out extensive algorithm validation on massive molecular surveillance datasets generated at CDC and develop robust prototype software implementations.