Collaborative Research: Renyi Divergence-based Robust Inference in Regression, Time Series and Association Studies.

$75,999 | FY2013 | MPS | NSF

University Of Georgia Research Foundation Inc, Athens GA

Abstract

This collaborative research project focuses on developing a novel approach to dimension reduction in regression, time series, and multivariate association studies based on a family of Rényi divergences, with a central theme of providing estimators that are inherently robust to data contamination, sustaining only a minimal loss in efficiency. This family not only characterizes the conditional independence underlying the concept of sufficient dimension reduction in regression and time series, but also characterizes independence between canonical variates in multivariate association studies. The novelty of the approach lies in exploiting a tuning parameter of the family, which balances the efficiency and the degree of robustness of the estimators. In each of the three areas, this project focuses on investigating a host of issues such as: (i) the computation of estimates, (ii) the detection of the true dimension, (iii) the selection of an optimal tuning parameter, and (iv) a formal justification of the method via theory. Furthermore, the project focuses on carrying out an in-depth study of robustness via influence functions and sample/empirical influence functions. Finally, the project focuses on finding an optimal Rényi divergence measure that is both robust and efficient, without the need for prior outlier detection or removal.

Rapid advances in technology have led to an information overload in most sciences. A typical characteristic of many contemporary datasets is that they are relatively high-dimensional in nature. This has prompted a shift in the applied sciences toward a different relationship-study genre arising in regression, time series and multivariate association, popularly known as dimension reduction, whose goal is to reduce the dimensionality of the variables as a first phase in the data analysis. However, the presence of outliers in high-dimensional datasets adversely affects the performance of existing dimension reduction methodologies, resulting in conclusions that are not completely reliable. Given that outliers are commonly encountered in high-dimensional datasets and that their presence is hard to detect, there is an urgent need to identify dimension reduction methods that possess some degree of automatic robustness, or non-sensitivity, to outliers. The proposed project provides robust dimension reduction methods, which would contribute significantly to the analysis of high-dimensional data arising in fields such as the social sciences, machine learning, sports, economics, environmental studies, morphometrics and cancer studies, among others. In fact, this project will not only provide novel tools for scientists in various disciplines to obtain reliable conclusions on high-dimensional data analysis, but also significantly advance the statistical theory, thereby paving a new research path in dimension reduction.
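
For orientation, the abstract's "family of Rényi divergences" and its tuning parameter can be read against the standard textbook definition sketched below. This is an assumption about notation only: the abstract does not state which parametrization the project uses, and the symbols p, q, and \alpha are introduced here purely for illustration. The last line shows why a divergence of this kind can characterize independence: it vanishes exactly when the joint density factors into its marginals.

```latex
% Textbook Rényi divergence of order \alpha (illustrative notation, not taken from the award record)
\begin{align*}
D_\alpha(p \,\|\, q)
  &= \frac{1}{\alpha - 1} \log \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx,
  \qquad \alpha > 0,\ \alpha \neq 1, \\
% The limiting case recovers the Kullback--Leibler divergence
\lim_{\alpha \to 1} D_\alpha(p \,\|\, q)
  &= \int p(x) \log \frac{p(x)}{q(x)}\, dx, \\
% Zero divergence between the joint and the product of marginals is equivalent to independence
D_\alpha\!\left(p_{X,Y} \,\|\, p_X\, p_Y\right) = 0
  &\iff X \text{ and } Y \text{ are independent}.
\end{align*}
```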
