Data Analysis Tools for Emerging High-throughput Technologies
Dana-Farber Cancer Inst, Boston MA
Abstract
Project Summary: Biomedical research and basic sciences are increasingly dependent on high-throughput technologies capable of simultaneously measuring thousands of nucleic acid molecules within a sample. As the use of these methods increases, the resulting complex datasets require the development of advanced statistical techniques for accurate data processing and interpretation. Our group has previously demonstrated that robust statistical methodologies significantly outperform the default ad hoc algorithms provided by technology developers. Our primary objective is to address the most pressing challenges faced by the research community where our expertise can have the greatest impact. The high citation rates of our statistical methodology and the widespread adoption of our open source software highlight the success and impact of our work. The applications of high-throughput technologies have evolved beyond their original purpose of DNA sequence analysis to include dynamic and quantitative outcomes such as gene expression levels, with measurements now available at the single-cell level. These advances introduce variability that further complicates data analysis, particularly in distinguishing biologically relevant signals from unwanted noise. In addition, these measurements are often subject to significant technological and biological biases, which can profoundly affect subsequent analyses. Over the past five years, we have continued to develop solutions to tackle these types of challenges. The R35 mechanism has improved our productivity, broadened our research scope, and fostered new collaborations. Our contributions in single-cell RNA sequencing have improved dimension reduction, cell annotation, and clustering. Our ideas have established new standards and influenced the field's direction. This work has been highly cited and has served as the foundation for other statistically rigorous methods.
We have also made significant contributions in spatial transcriptomics, developing the most widely used computational tool for cell-type annotation in this technology, thereby establishing best practices in this rapidly expanding area. The flexibility of the R35 mechanism has also allowed us to extend our impact beyond genomics, as demonstrated by our contributions to public health surveillance during the COVID-19 pandemic. In the next five years, we plan to continue to address the most pressing challenges faced by investigators who rely on high-throughput technologies. We will leverage the expertise of our collaborators to prioritize projects. Although predicting specific future needs is challenging, we have identified key areas that we intend to address. Specifically, we will develop methodologies and software to meet the most urgent needs of users of spatial transcriptomics and scRNA-Seq technologies, create general approaches for inference in region-of-interest detection, and tackle challenges identified through recently initiated collaborations. We will use the flexibility of the R35 mechanism to maximize the impact of our work and ensure that our efforts continue to advance the field.