III/EAGER: Temporal Relationships Among Clusters in Data Streams (TRACDS)
Southern Methodist University, Dallas TX
Investigators
Abstract
State-of-the-art data stream clustering algorithms developed by the data mining community do not use the temporal order of events, so all temporal information is lost in the resulting clustering. This is surprising, because temporal ordering of events is one of the salient features of data streams. In this project we develop a technique to efficiently incorporate temporal ordering into the clustering process and demonstrate its usefulness on large, high-throughput data streams. Temporal ordering is introduced into the data stream clustering process by dynamically constructing an evolving Markov chain in which the states represent clusters. Our approach is based on the previously developed Extensible Markov Model (EMM). The results of this project will provide a framework on which important stream mining applications, such as anomaly detection and prediction of future events, can be easily implemented. By showing that state-of-the-art data stream clustering algorithms can incorporate temporal order information efficiently, this project will have a broad impact on many areas where temporal order is essential. As examples, NOAA hurricane data and NASA satellite data will be used throughout this project. Results, including open source software, will be distributed via the project Web site (http://lyle.smu.edu/ida/tracds).
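The idea of a Markov chain whose states are clusters can be illustrated with a minimal Python sketch. This is not the project's software or the EMM algorithm itself: the class name `TemporalClusterSketch`, the fixed distance `threshold`, the running-mean centroid update, and the Euclidean distance are all assumptions chosen for illustration. Each arriving point is assigned to the nearest cluster (or opens a new one), and the transition from the previous point's cluster to the current one is counted.

```python
from collections import defaultdict
import math


class TemporalClusterSketch:
    """Illustrative only: threshold clustering plus a transition-count Markov chain."""

    def __init__(self, threshold):
        self.threshold = threshold           # hypothetical cutoff for opening a new cluster
        self.centers = []                    # one centroid per cluster (one Markov state each)
        self.sizes = []                      # number of points absorbed by each cluster
        self.transitions = defaultdict(int)  # (from_state, to_state) -> observed count
        self.current = None                  # state assigned to the most recent point

    def _nearest(self, point):
        """Return (index, distance) of the closest centroid, or (None, inf) if none exist."""
        best, best_dist = None, math.inf
        for i, center in enumerate(self.centers):
            d = math.dist(point, center)
            if d < best_dist:
                best, best_dist = i, d
        return best, best_dist

    def update(self, point):
        """Assign one point to a cluster and record the temporal transition."""
        state, dist = self._nearest(point)
        if state is None or dist > self.threshold:
            # No cluster is close enough: create a new cluster, i.e. a new state.
            self.centers.append(list(point))
            self.sizes.append(1)
            state = len(self.centers) - 1
        else:
            # Move the centroid toward the point using a running mean.
            self.sizes[state] += 1
            n = self.sizes[state]
            self.centers[state] = [
                c + (x - c) / n for c, x in zip(self.centers[state], point)
            ]
        if self.current is not None:
            self.transitions[(self.current, state)] += 1
        self.current = state
        return state

    def transition_probability(self, src, dst):
        """Estimate P(dst | src) from counts; 0.0 if src has no outgoing transitions."""
        total = sum(n for (s, _), n in self.transitions.items() if s == src)
        return self.transitions.get((src, dst), 0) / total if total else 0.0
```

Under these assumptions, a rarely observed transition (low estimated probability) could serve as a simple anomaly signal, and the most probable outgoing transition from the current state as a simple prediction of the next cluster.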