Informative Model Specification Tests Using Coarsened Data

$104,866 | FY2010 | MPS | NSF

University South Carolina Research Foundation, Columbia SC

Investigators

Abstract

This research aims to develop novel methods that use coarsened data to assess the validity of model specifications. The proposed methods are motivated by the finding that different sources of model misspecification, interacting with different (coarsened) data-generating schemes, affect statistical inference differently. By evaluating how inference outcomes change as the data are strategically coarsened, the new methods can not only detect violations of model assumptions but also pinpoint the most influential source of model misspecification. This direction of research will lead to significant improvement over existing model diagnostic methods, most of which provide only an overall assessment of goodness of fit or allow testing only one model assumption at a time. A crucial thread running through the investigation is the study of statistical inference based on coarsened data in the presence of model misspecification. The project will advance the understanding of the so-called "wrong model analysis" for coarsened data. Because data are rarely collected exactly as planned, coarsened data are ubiquitous in practice, and the understanding gained from the proposed research is therefore valuable.

Statistical modeling is a key step of statistical analysis in nearly all fields of application. A poorly formulated model often leads to misleading conclusions. As researchers entertain increasingly complex statistical models to explain random phenomena, the need for more sophisticated diagnostic techniques becomes pressing. The investigator will conduct comprehensive analyses of the effects of model misspecification on statistical inference based on data of different structures. This knowledge will then be used to develop a rich class of informative diagnostic tools that protect data analysts from inappropriate model assumptions and direct model improvement. The idea underlying the proposed methods is original and revolutionary: it advocates sacrificing information in the data in order to reveal the mechanism that governs random phenomena, an insight unattainable without such a counterintuitive sacrifice. The project will integrate research and education by sharing the rationale and the investigation with the graduate students who work with the investigator or take the advanced topics course recently developed by the investigator.
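
The idea of comparing inference from raw and coarsened data can be illustrated with a toy sketch. This is not the investigator's method. It is a minimal example under assumptions chosen here for illustration: the assumed model is exponential, the coarsening keeps only whether each observation exceeds a cutoff, and the function names, sample size, seed, and cutoff are all hypothetical. When the exponential model is correct, the rate estimated from the raw data and the rate estimated from the coarsened data should roughly agree. When the data actually come from a gamma distribution, the two estimates drift apart, and that gap signals misspecification.

```python
import math
import random


def rate_from_raw(sample):
    """Exponential-model MLE of the rate from the fully observed data."""
    return len(sample) / sum(sample)


def rate_from_coarsened(sample, cutoff):
    """Exponential-model MLE of the rate when only 1{y > cutoff} is kept.

    Assumes some, but not all, observations exceed the cutoff.
    """
    frac_above = sum(y > cutoff for y in sample) / len(sample)
    # Under the exponential model, P(Y > cutoff) = exp(-rate * cutoff).
    return -math.log(frac_above) / cutoff


def estimate_gap(sample, cutoff):
    """Difference between the raw-data and coarsened-data rate estimates."""
    return rate_from_raw(sample) - rate_from_coarsened(sample, cutoff)


rng = random.Random(2010)  # fixed seed so the illustration is repeatable
n, cutoff = 20000, 0.5     # illustrative choices, not from the source

# Case 1: the assumed exponential model is correct (rate 1).
exp_data = [rng.expovariate(1.0) for _ in range(n)]
# Case 2: the model is wrong; data are gamma with shape 3 and mean 1.
gamma_data = [rng.gammavariate(3.0, 1.0 / 3.0) for _ in range(n)]

gap_correct = estimate_gap(exp_data, cutoff)
gap_misspecified = estimate_gap(gamma_data, cutoff)

print(f"gap under correct model:      {gap_correct:.3f}")
print(f"gap under misspecified model: {gap_misspecified:.3f}")
```

A formal test would compare the gap with its sampling variability. The sketch only shows the direction of the effect, namely that coarsening leaves the estimate essentially unchanged under a correct model and shifts it under a wrong one.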
