From Frames to Events: A Statistical Approach to Activity Analysis in Multi-Camera Systems

$507,364 · FY2009 · CSE · NSF

Trustees Of Boston University, Boston

Investigators

Venkatesh Saligrama (contact), Janusz Konrad

Abstract

Venkatesh Saligrama and Janusz Konrad, Boston University, MA 02215

Unlike other sensors, cameras provide excellent resolution, long viewing range, wide field of view, and low latency, thus permitting pervasive, wide-area visual surveillance. However, most network cameras deployed today are simple capture/compression/transmission devices that at most support rudimentary motion detection; all higher-level processing is highly centralized. This centralized architecture stems from human-centric visual analytics as well as limited in-camera processing capacity, and it does not scale to large multi-camera systems. With over 30 million surveillance cameras in use in the United States today, producing 4 billion hours of video footage per week, monitoring by human operators is not sustainable. An autonomous, distributed, bandwidth-efficient, real-time video analytics system is needed.

This project takes a step towards building such a system. At its core is a novel statistical framework for activity discovery and analysis that departs from the centralized model and leverages the processing power of camera nodes. Whereas traditional activity analysis operates at the object level (objects are identified, tracked, and tested for abnormality), the methods under development in this project perform activity analysis at the pixel level. Once abnormal activity is reliably identified, object extraction and tracking can focus on the region of interest and are therefore relatively straightforward, because clutter is absent. To reliably identify pixel-level abnormalities, or more generally activities, a novel event-based video representation is used that decomposes video into independent and identically distributed (i.i.d.) samples, lending itself to the application of statistical learning. To facilitate multi-camera collaboration, geometric invariance of dynamic events is exploited, thus bypassing the difficult issues related to viewing-angle-dependent 3-D geometry.
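
The abstract does not spell out the statistical framework, so the following is only a rough sketch of what pixel-level activity analysis might look like, not the project's actual method. It assumes a simple stand-in pipeline: background differencing yields a binary motion label per pixel, each pixel's label sequence is summarized by the lengths of its "busy" periods (treated here as the events), and a per-pixel duration limit is learned from normal footage as an empirical quantile. All function names, the intensity threshold, and the quantile are hypothetical choices made for illustration.

```python
import numpy as np


def motion_labels(frames, background, threshold=25.0):
    """Binary motion label per pixel and frame via background differencing.

    frames: array of shape (T, H, W); background: array of shape (H, W).
    The threshold value is an illustrative assumption.
    """
    diff = np.abs(frames.astype(np.float64) - background.astype(np.float64))
    return diff > threshold


def busy_durations(labels_1d):
    """Lengths of consecutive runs of True in a 1-D boolean sequence."""
    padded = np.concatenate(([False], labels_1d, [False])).astype(np.int8)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)   # index where a run begins
    ends = np.flatnonzero(changes == -1)    # index just past where it ends
    return ends - starts


def learn_duration_limits(train_labels, quantile=0.99):
    """Per-pixel limit on busy-period duration, learned from normal footage.

    Pixels that never show motion in training get a limit of 0.
    """
    _, height, width = train_labels.shape
    limits = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            durations = busy_durations(train_labels[:, y, x])
            if durations.size:
                limits[y, x] = np.quantile(durations, quantile)
    return limits


def abnormal_pixels(test_labels, limits):
    """Flag pixels whose longest busy period exceeds the learned limit."""
    _, height, width = test_labels.shape
    flags = np.zeros((height, width), dtype=bool)
    for y in range(height):
        for x in range(width):
            durations = busy_durations(test_labels[:, y, x])
            longest = durations.max() if durations.size else 0
            flags[y, x] = longest > limits[y, x]
    return flags
```

Because each pixel is processed independently and only a small per-pixel limit has to be stored, a scheme of this kind could plausibly run on a camera node, which is the sort of in-camera processing the abstract argues for. A region flagged this way could then be handed to object extraction and tracking.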
