CAREER: A General Framework for Methodical and Interpretable Anomaly Mining
Carnegie Mellon University, Pittsburgh PA
Investigators
Abstract
Anomaly mining is the task of finding irregularities in data. It has applications in many domains, such as security, finance, astronomy, and medicine. Despite its immense popularity, however, it remains an extremely challenging task for many real-world applications. For many practitioners, the task is poorly defined and under-specified, as existing definitions and solutions are often too simplistic and do not directly correspond to the needs of modern applications. This project takes the essential steps to bridge the gap between research and practice, to dramatically improve the usability, effectiveness, and interpretability of anomaly mining techniques, and ultimately to mature the field into a more valuable contributor to the larger world. It promises significant impact on many concrete problems, such as insider threat, tax evasion, and health-care fraud detection, that are important for government, industry, and society. Collaborations with industry and hospital partners aim to shepherd innovations into deployed technology, with tangible impact on security and healthcare.

The primary agenda for achieving these goals is to develop a new framework for anomaly mining that uses multiple heterogeneous data sources and techniques in a corroborative fashion to fundamentally reframe our understanding of, and ability to define, detect, and describe, real-world anomalies. The project formalizes novel definitions of complex anomalies that fuse multiple data sources, and invents detection algorithms for complex anomalies that also present descriptions providing the rationale for each detected anomaly. Research also explores and models anomaly ensembles that systematically harness evidence from multiple detection techniques. Ultimately, this project strives to push the boundaries of anomaly mining as a field through this quest for principled foundations and practices.