Collaborative Research: High-Dimensional Spatial-Temporal Modeling and Inference for Large Multi-Source Environmental Monitoring Systems
Michigan State University, East Lansing MI
Investigators
Abstract
Remote sensing technologies and Geographic Information Systems continue to bring about dramatic developments in scientific discovery. Scientists in a variety of disciplines today have unprecedented access to massive spatial and temporal databases comprising high-resolution remotely sensed measurements. Statistical modeling and analysis for such data often entail reckoning with spatial associations and variations at multiple levels while attempting to recognize underlying patterns and potentially complex relationships among the scientific variables. Traditional statistical hypothesis testing is no longer adequate for these inferential objectives, and statisticians are increasingly turning to multi-level or hierarchical modeling structures for analyzing complex spatial-temporal data. However, substantial computational bottlenecks remain as scientists encounter a deluge of remotely sensed data that demands specialized "BIG DATA" technologies. The PIs will address these problems by developing probabilistic machine learning tools for spatial-temporal BIG DATA within the context of scientific advancements in forest structure, topography, and weather-related events (e.g., storms) that can have far-reaching public health, economic, environmental, and security implications. Several innovations in statistical and computational methods and related software development are envisioned. The proposed data products will offer quantification of forest damage/change and landslide risk assessment for Puerto Rico following Hurricanes Irma and Maria. Key educational components include dissemination of the proposed technologies across scientific communities, including data scientists, engineers, foresters, ecologists, and climate scientists. The PIs plan to train the next generation of data scientists through dissemination efforts for undergraduate and graduate students in STEM fields.
The PIs will develop a statistical framework for executing elaborate case studies and data analysis on high-dimensional remotely sensed data, where "high dimension" refers to a massive number of one or more of the following: (i) spatial locations; (ii) time points; and (iii) responses or outcomes. The PIs will introduce massively scalable multivariate spatial process models within a rich Bayesian hierarchical framework to obtain fully model-based inference for the underlying data generating process. Innovative statistical methodologies are proposed to implement hierarchical models at scales involving tens of millions of spatial locations, thousands of time points, and possibly hundreds of remotely sensed variables. The massive scalability of these models will be achieved through sparsity-inducing spatial-temporal processes and other graphical models, matrix-variate low-rank models, conjugate Bayesian distribution theory, and meta-learning paradigms using approximations of a collection of posterior distributions. Theoretical results that enhance current methods will be explored, as will several proposed case studies at unprecedented scales. The PIs will develop a full suite of spatial models in a wide variety of experiments involving massive data sets. Because complex relationships can be detected effectively in massive data sets, the proposed methods are well-suited for modeling complex scientific phenomena. Key substantive inference and statistical quantification will be offered for forest damage/change and landslide risk assessment for Puerto Rico following Hurricanes Irma and Maria. The PIs will provide probability-based uncertainty quantification and will substantially enhance the scientific community's understanding of storm-related damage assessment. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.