Mining Dynamics of Data Streams in Multi-Dimensional Space
University of Illinois at Urbana-Champaign, Urbana, IL
Investigators
Abstract
Stream data processing and mining represent an important, emerging class of data-intensive applications in which data flows in and out dynamically, in huge (possibly infinite) volumes, is amenable only to single-scan algorithms, and often demands fast or even real-time responses. Based on the observation that the majority of stream data resides at the primitive level of abstraction, whereas the most interesting patterns may need to be discovered at higher levels of abstraction in multi-dimensional space, this project focuses on the issues of stream data mining and develops effective, efficient, and scalable methods for mining the dynamics of data streams in multi-dimensional space. The scope of this study includes the discovery of changes, trends, and evolution of characteristics, clusters, classification models, and frequent patterns in data streams. The methodology is to capture sufficient statistical information in compact, aggregate form within concise data structures to facilitate the efficient processing of both continuous and ad hoc stream mining queries. Several strategically important applications are being explored, including network intrusion detection, telecommunication and Web data flow analysis, and financial data flow analysis. The results of this effort will contribute to the development of principles and new methods for real-time data mining systems and promote their strategically important applications, including timely discovery of terrorist or criminal activities for homeland security, intrusion detection, multi-dimensional analysis of data-intensive, fast-changing events, and other areas with broad impact. The research results will be published in a timely manner in conferences and journals (and made available at http://www.cs.uiuc.edu/~hanj/projs/streamine.htm) for wide dissemination, industry adoption, and the education of a new generation of information technology students and workers.
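As a rough illustration of the single-scan, aggregate-summary idea described in the abstract, the following Python sketch rolls primitive-level records up to a coarser level of abstraction while keeping only a count, sum, and sum of squares per cell. It is not the project's actual method: the record layout (timestamp in seconds, IP address, byte count), the roll-up rule (minute and two-octet address prefix), and the names `Summary`, `roll_up`, and `summarize` are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Summary:
    """Running count, sum, and sum of squares for one aggregate cell."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, x: float) -> None:
        # Constant-time update; the raw value is not stored.
        self.n += 1
        self.total += x
        self.total_sq += x * x

    @property
    def mean(self) -> float:
        return self.total / self.n if self.n else 0.0


def roll_up(record):
    """Hypothetical mapping from a primitive record to a coarser cell.

    Assumed record layout: (timestamp_seconds, ip_address, byte_count).
    """
    ts, ip, _ = record
    minute = ts // 60                      # second -> minute
    prefix = ".".join(ip.split(".")[:2])   # full address -> first two octets
    return (minute, prefix)


def summarize(stream):
    """Scan the stream once, updating one Summary per rolled-up cell."""
    cells = defaultdict(Summary)
    for record in stream:
        cells[roll_up(record)].add(record[2])
    return cells


# Made-up sample records, for illustration only.
stream = [
    (0, "10.1.2.3", 100.0),
    (30, "10.1.9.9", 300.0),
    (61, "10.1.2.3", 50.0),
    (75, "192.168.0.1", 70.0),
]
cells = summarize(stream)
```

Because each cell holds only a few numbers, memory grows with the number of distinct high-level cells rather than with the length of the stream, and a query such as the mean traffic for a given minute and address prefix can be answered from `cells` without rescanning the data.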