
ITR: Collaborative Focused Mining of Atmospheric Aerosol Datasets: Integration of Mass Spectrometry and Environmental Monitoring

$2,040,000 | FY2003 | CSE | NSF

University of Wisconsin-Madison, Madison, WI

Abstract

Growing concern over the role of atmospheric particles (aerosols) in global climate change, human health and welfare, and the Earth's ecosystem has created a pressing need to better understand the sources, dynamics, composition, and influence of atmospheric pollutants, so that control strategies can be developed to mitigate the onset of climate change and the degradation of the environment and our quality of life. Recent advances in aerosol science have led to a new generation of real-time instruments that provide continuous or semi-continuous streams of data about certain aerosol properties. However, these instruments have made atmospheric aerosol data significantly more complex and have dramatically increased the amount of data to be collected, managed, and analyzed.

This project aims to mine atmospheric aerosol datasets and, in the process, to develop novel mining environments that can also be applied in other domains. Time-series analysis, clustering, and decision trees are well-known techniques for which robust software is available, and they will be used to analyze atmospheric aerosol datasets. In addition, new approaches will be explored as appropriate, including a framework called subset mining that is especially suited to finding correlations between (parts of) different datasets, such as mass spectrometry and environmental monitoring data. An important objective is to develop a framework for creating multi-step analyses that use one or more mining techniques, and for focusing the patterns these techniques generate by incorporating domain knowledge into the analysis. The goal is to reduce the time required for complex analyses by leveraging user input to automatically explore a large space of alternative models, and to take advantage of optimizations made possible by exploring several models in parallel.

A related objective is to enable research groups to share their results, all the way down to the datasets and how they were processed to arrive at those results, in an analysis environment with continually evolving datasets and complex analysis chains. This project will train graduate and undergraduate students at UW-Madison and Carleton College in two disciplines, Computer Science and Atmospheric Chemistry. In addition to the educational impact, the research results and tools will be widely disseminated through publications and the Web (www.cs.wisc.edu/~raghu/admitr), and are expected to significantly advance the state of the art in two directions: (1) scientific and regulatory efforts to understand and mitigate the impacts of air pollution and environmental contaminants, and (2) foundations, algorithms, and technology for data mining.
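
The abstract describes subset mining only as a way of finding correlations between parts of different datasets, so the following is an illustrative sketch of that general idea and not the project's actual framework. It assumes two series already aligned to the same timestamps and scans fixed-width time windows for the one where the series are most strongly correlated. The function names, the window-based notion of a "subset", and the sample readings are all hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_correlated_window(series_a, series_b, width):
    """Scan every contiguous window of `width` samples and return
    (start_index, correlation) for the window with the largest |correlation|."""
    if len(series_a) != len(series_b):
        raise ValueError("series must be aligned to the same timestamps")
    if width < 2 or width > len(series_a):
        raise ValueError("width must be between 2 and the series length")
    best = (0, 0.0)
    for start in range(len(series_a) - width + 1):
        r = pearson(series_a[start:start + width], series_b[start:start + width])
        if abs(r) > abs(best[1]):
            best = (start, r)
    return best


# Hypothetical, made-up readings on a shared hourly grid (not real measurements).
sulfate_signal = [1.0, 3.0, 2.0, 4.0, 5.0, 6.0, 7.0, 2.0, 9.0, 1.0]
so2_monitor = [5.0, 1.0, 4.0, 2.0, 10.0, 12.0, 14.0, 3.0, 1.0, 8.0]

start, r = best_correlated_window(sulfate_signal, so2_monitor, width=4)
print(f"strongest window starts at index {start}, r = {r:.3f}")
```

An exhaustive scan like this is only practical for small inputs. A real system would presumably restrict the candidate subsets using domain knowledge, which is the kind of focusing the abstract describes.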
