ITR: Collaborative Research: New Approaches to Experimental Design and Statistical Analysis of Genomic and Structural Biologic Data from Multiple Sources
Cornell University, Ithaca NY
Investigators
Abstract
The biological sciences are advancing by posing increasingly complex and quantitative questions, which require increasingly complex experimental procedures and the analysis of increasingly large and complex data sets. Information technology is pervasive throughout this process. Computation is necessary both before the laboratory work begins, for planning the experiment, and afterward, for analyzing the results. In gene chip experiments for determining gene activity levels, planning issues include which biological hypotheses should be considered and what chemical conditions will yield the most informative results; computation is then needed to reduce the collected data, which can amount to gigabytes of information, to forms that biological scientists can understand and exploit. In electron microscope experiments for determining the 3-D structure of viruses, planning issues include electron energy, defocus level, beam current, number of tilts, and tilt angles; computation is then needed to reduce the measured data, which can comprise one hundred thousand or more images, to a biologically plausible 3-D structure. Historically, insufficient attention has been devoted to using highly sophisticated information technology for the quantitative planning and analysis of experiments in a way that jointly takes into account the behavior of the measurement apparatus, the goals of the experiment, the unavoidable uncertainty in the system, and the algorithmic complexity that a particular experimental design implies for the subsequent computational analysis of the experimental data.
The research objective of this ITR project is to bring together a team of investigators from MIT, Purdue and NYU-Courant, along with their industrial collaborators, to apply principles from information, coding and systems theory, together with advanced computational methods for statistical inference and numerical optimization, to create a unified approach to the planning and analysis of complex quantitative experiments in the biological sciences, such as the determination of gene expression using gene chips and the determination of 3-D viral structure from scattering and electron microscopy experiments. These biological problems will challenge the state of the art in information technology, and an important characteristic of the project is the parallel development of new information technology and new biological applications. The human-resources objective of this ITR project is to give undergraduate students, graduate students, and postdoctoral associates the opportunity to learn about and contribute to this exciting area at the interface between information technology and the biological sciences. Because of the biological focus of the research, it is anticipated that the proposed project will be an outstanding opportunity to recruit women and other underrepresented minorities into the Systems, Information and Computer Science endeavor.