Collaborative Research: Inference for Statistical Graphics

$189,974 | FY2010 | MPS | NSF

Iowa State University, Ames IA

Abstract

Since the publication of the NSF landmark report "Visualization in Scientific Computing" in 1987, computer-aided visualization has been recognized as one of the most potent tool sets for scientific discovery. However, discoveries based on data displays are often criticized because they are not secured by statistical inference. The team of researchers from Iowa State University, Rice University and the University of Pennsylvania is addressing exactly this issue by bringing the rigor of statistical inference to visual data exploration. Statistical inference for plots is cast as a comparison of a plot of the actual data with plots of null data simulated under a null hypothesis. If the actual plot stands out from a background of "null plots", this amounts to a rejection of the null hypothesis. Executing this idea leads to rigorous protocols that can confer proper statistical significance on visual discoveries. Tools of mathematical statistics are employed to reduce composite null hypotheses to single reference distributions: conditioning on a minimal sufficient statistic, bootstrap plug-ins, and posterior predictive sampling. The protocols also have the potential to shift the perception of exploration-based findings in the scientific communities and dramatically increase the impact that these findings are allowed to have. The testing protocols will be made accessible through an implementation in the open-source R language.

Data graphics are an essential part of communicating information. But how reliable is the information that we gather from them? The investigators will develop a rigorous framework for visual inference modeled after formal statistical testing. This framework allows the reader of a graphic to determine whether structure is real or spurious (is that a man in the moon, or just some rocks?). These protocols have the potential to shift the perception of exploration-based findings in the scientific community and dramatically increase the impact of exploratory work. Some aspects of the protocols are so intuitive that they can be used for general audiences and integrated into the teaching of introductory statistics from grade school to college.
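The comparison of an actual plot against a background of null plots can be sketched in a few lines. The following is a minimal illustration only, written in Python rather than the R implementation the abstract mentions, and it is not the investigators' code. The function names `make_lineup` and `lineup_p_value`, the choice of 20 panels, the use of permutation to generate null data (which assumes a null hypothesis of no association between `x` and `y`), and the binomial calculation for several independent observers are all assumptions made for this example.

```python
import math
import random


def make_lineup(x, y, m=20, seed=None):
    """Build m datasets: the real one at a random position, the rest nulls.

    Null data are generated by permuting y, which assumes the null
    hypothesis is "no association between x and y" (an assumption of
    this sketch, not a method stated in the abstract).
    """
    rng = random.Random(seed)
    true_pos = rng.randrange(m)  # hidden position of the actual data
    panels = []
    for i in range(m):
        if i == true_pos:
            panels.append((list(x), list(y)))
        else:
            y_null = list(y)
            rng.shuffle(y_null)  # break any x-y relationship
            panels.append((list(x), y_null))
    return panels, true_pos


def lineup_p_value(n_observers, n_correct, m=20):
    """Chance that at least n_correct of n_observers pick the real panel.

    Assumes observers are independent and that, under the null, each
    picks the real panel with probability 1/m.
    """
    p = 1.0 / m
    return sum(
        math.comb(n_observers, k) * p**k * (1 - p) ** (n_observers - k)
        for k in range(n_correct, n_observers + 1)
    )


if __name__ == "__main__":
    data_rng = random.Random(1)
    x = [data_rng.gauss(0, 1) for _ in range(50)]
    y = [xi + data_rng.gauss(0, 0.5) for xi in x]  # real data with a trend
    panels, true_pos = make_lineup(x, y, m=20, seed=42)
    # Each panel would be drawn as a scatterplot and shown to observers
    # who do not know which one holds the actual data.
    print(len(panels), "panels; actual data hidden at index", true_pos)
    print("p-value if 3 of 5 observers pick it:", lineup_p_value(5, 3))
```

Under these assumptions, a single observer who singles out the actual plot among 20 panels corresponds to a p-value of 1/20 = 0.05, which is why 20 is a convenient number of panels for the sketch.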