CAREER: Big Data Climate Causality Analytics
University of Maryland Baltimore County, Baltimore, MD
Abstract
A fundamental problem in climate science is climate causality analysis, which studies cause-effect relationships among climate variables such as temperature and humidity. Findings on how the climate system works from a causality perspective can inform many research areas, including climate variability, climate dynamics, climate simulation, and extreme climate prediction. Climate causality studies now face many computing challenges, such as processing very large, high-dimensional datasets and managing the complexity of modern computing resources. To tackle these challenges, this project targets novel causality discovery algorithms and related scalable computing techniques. The project is expected to greatly help Earth system scientists and climate scientists explore new hypotheses and use cases related to climate causality. It includes an integrated program of research, education, and outreach that aims to help better understand and evaluate climate simulation, foster workforce development for a multidisciplinary research community on "Data + Computing + Climate Science", and raise interest in both information technology and climate studies among K-12 students and various underrepresented groups. The project thus serves the national interest, as stated in NSF's mission, by promoting the progress of science and advancing national prosperity and welfare.
The goal of this CAREER project is to study efficient and reproducible causality analytics for large-scale climate data, so that climate scientists can easily test their causal hypotheses, reproduce existing studies, and compare results from different causality analytics. To handle the increasing dimensionality and resolution of spatiotemporal climate datasets, the project will study incremental causality discovery algorithms for large-scale climate datasets and parallel causality discovery for spatiotemporal climate data. To address the variety of both causal discovery algorithms and climate simulation/observation datasets, the project will study how to effectively measure climate causality results from different causality algorithms and different climate datasets, and how to integrate those results through ensemble techniques. To cope with difficulties in conducting and reproducing causality analytics with large-scale climate datasets, the project will study cloud computing for constructing big data climate analytics pipelines and optimizing their execution. The project will be evaluated from two perspectives. From the computing perspective, the research will be evaluated in terms of algorithm computational complexity, accuracy, and scalability. From the climate perspective, its applicability will be evaluated through collaboration with climate scientists in their specific research programs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.