ACT SGER: Locating Sparse Events in High Speed Stream Data, with a Focus on Statistical Analysis
Georgia Tech Research Corporation, Atlanta GA
Investigators
Abstract
The PIs propose to study statistical problems that arise in identifying sparse events in high-speed data streams. Stream data are encountered in many applications. Examples include (1) internet traffic data used in internet pricing and network security, and (2) the vast amount of information carried by communication networks, together with the demand for monitoring (terrorist) activity. These problems are closely related to the theme of Approaches to Combat Terrorism (ACT), and they share two properties: the events of interest are extremely sparse, and the identification must be carried out at very high speed. Moreover, in both network security and terrorist activity monitoring, a missed detection is far more disastrous than a false alarm. Finding efficient methods to identify sparse events in stream data at high speed poses a major challenge to both statisticians and computational scientists.
The PIs propose to study these problems with an emphasis on statistical modeling, namely, which modeling and computational strategies are optimal for solving them. The main challenge in stream data is the development of fast real-time algorithms. Statisticians have developed powerful tools for sequential analysis, e.g., MCMC and maximum-likelihood-based approaches, but the proposed problem has so far received little attention in statistics. New analysis is needed to integrate optimality in mathematical statistics with efficiency in scientific computing. The PIs will start with two key ideas: a tree-based matching method and a special kind of decision tree, the cascade. They will benefit from interactions with researchers in other fields, such as computer vision and computer security. The support will allow the PIs to train two graduate students in this area and will help jump-start a larger effort on computationally efficient statistical modeling of stream data.
The PIs will develop software for their algorithms and make it available over the internet. This award is supported jointly by the NSF and the Intelligence Community. The Approaches to Combat Terrorism Program in the Directorate for Mathematical and Physical Sciences supports new concepts in basic research and workforce development with the potential to contribute to national security.