Doctoral Dissertation Research: Envelope Models and Methods
University Of Minnesota-Twin Cities, Minneapolis MN
Abstract
Multivariate linear regression (MLR) is a paradigm for studying the relationship between two groups of variables, the predictors and the responses. It is applied in many disciplines to explain how the responses depend on the predictors and to predict future outcomes. With advances in technology for data collection and measurement, many contemporary problems involve high-dimensional datasets, in which a considerable amount of the response information may be redundant or irrelevant. This redundant or irrelevant part of the data introduces extraneous variation into MLR estimation, making it inefficient. To address this problem, a new class of models called envelopes was introduced by Cook et al. (2010). Envelope models use dimension-reduction techniques to identify and extract the relevant information in the data, so that estimation is based only on the relevant part. The envelope class, however, is still in its infancy: it places restrictions on the data structure, and its advantages cannot always be realized. This doctoral dissertation research project will study the envelope class and bring it to maturity, making it more flexible and achieving further efficiency gains by enriching the class with new models and methods. New models that address scale invariance, heteroscedasticity, and small-sample-size issues will be developed, as will new models that yield efficiency gains beyond those of the current models. These extensions of the envelope class will make minimal assumptions on data structure and extend the applicability and power of the enveloping idea, making it more appealing. This research will result in more efficient data analysis methods for sociology, economics, genetics, and many other disciplines in science and engineering.
These methods are expected to achieve the same accuracy in analysis with a smaller sample size, making experiments and the data collection process shorter, easier, and less expensive. The project also will link existing statistical tools, such as dimension-reduction techniques and methods for estimating large covariance matrices, to the field of MLR in a novel way that opens new frontiers for their application. User-friendly software that implements the new methodology will be developed. As a Doctoral Dissertation Research Improvement award, support is provided to enable a promising student to establish a strong, independent research career.