Integrative Multivariate Analysis of Multi-View Data
University of Connecticut, Storrs, CT
Investigators
Abstract
Multi-view data, in which several distinct yet interrelated sets of characteristics are measured on the same set of subjects, possibly from a variety of sources, have become increasingly common in engineering and scientific research. This project develops new methodologies, statistical theory, and scalable computational tools to tackle a range of statistical learning problems with multi-view data. An integrated statistical analysis of the mechanisms that generate multi-view data, enabled by this project, will provide deeper insight into real-world phenomena by combining information obtained through different lenses and from different angles. The PI will develop several generalizations of the reduced-rank matrix structure to enable a spectrum of multivariate statistical methods for multi-view learning. Reduced-rank estimation is one of the most critical ingredients in modern multivariate analysis; however, its potential for handling multi-view data is far from fully realized or understood. This project has the following overarching objectives: (1) develop integrative multivariate regression for joint learning, which exploits multiple sets of features to build an integrated predictive model of a multivariate response; (2) develop integrative canonical correlation analysis for shared learning, which combines the exploration of shared low-dimensional association structures among multiple sets of features with the development of coherent predictive models for a multivariate response; (3) develop integrative dimension reduction for multi-scale learning, which uses both the global and local low-dimensional structures among sub-matrices of a high-dimensional matrix object; (4) develop diagnostic measures for robust learning, which will enable reliable multi-view data integration and data quality assessment.
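For readers unfamiliar with the reduced-rank structure mentioned in the abstract, the following is a minimal sketch of classical reduced-rank regression applied to two feature views that are simply concatenated. It is not the project's integrative method; the function name `reduced_rank_regression`, the simulated views `X1` and `X2`, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression (illustrative helper, not from the project).

    Minimizes ||Y - X C||_F subject to rank(C) <= rank.
    """
    # Ordinary least-squares coefficient matrix (p x q).
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # SVD of the fitted values; the leading right singular vectors
    # span the best rank-r subspace in the response space.
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the OLS coefficients onto that subspace.
    return C_ols @ V_r @ V_r.T

# Simulated data (assumption): two views measured on the same n subjects.
rng = np.random.default_rng(0)
n, q, true_rank = 200, 6, 2
X1 = rng.normal(size=(n, 5))   # view 1 features
X2 = rng.normal(size=(n, 4))   # view 2 features
X = np.hstack([X1, X2])        # naive integration: concatenate the views

# Low-rank coefficient matrix linking all features to the multivariate response.
C_true = rng.normal(size=(X.shape[1], true_rank)) @ rng.normal(size=(true_rank, q))
Y = X @ C_true + 0.1 * rng.normal(size=(n, q))

C_hat = reduced_rank_regression(X, Y, rank=true_rank)
rel_err = np.linalg.norm(C_hat - C_true) / np.linalg.norm(C_true)
print("estimated rank:", np.linalg.matrix_rank(C_hat))
print("relative estimation error:", rel_err)
```

Concatenation treats all views as one feature block with a single shared rank constraint; the generalizations described in the abstract aim to go beyond this by modeling structure within and across views.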