GGrantIndex
← Search

CAREER: Transforming data analysis via new algorithms for feature extraction

$329,094FY2016CSENSF

University Of California-Davis, Davis CA

Investigators

Abstract

Analysis and exploration of data, including classification, inference, and retrieval, are ubiquitous tasks in science and applied fields. Given any such task, a fundamental paradigm is the extraction of features that are relevant. In the design of algorithms for the analysis and exploration of data, feature extraction techniques act as basic building blocks or primitives that can be combined to model complex behavior. Some of the fundamental feature extraction tools include Principal Component Analysis (PCA), Independent Component Analysis (ICA), and half-space-based learning and classification. Data rarely satisfy the precise assumptions of these models and feature extraction tools, and combining these tools amplifies errors. This motivates the challenging task of designing new algorithms that are robust against noise and that can be combined as building blocks while keeping the error propagation under control. The proposed work will: (1) Raise ICA from a very successful practical tool to an algorithmic primitive with strong theoretical guarantees and applicability to a rich family of problems beyond independence. (2) Find reasonable assumptions and algorithms that allow efficient learning of intersections of half-spaces. (3) Systematically study the following well-motivated refinement of PCA known as the subset selection problem. This refinement aims to select relevant features among the given features of the input data, unlike PCA, which creates new and possibly artificial features. New feature extraction algorithms enhance the toolbox available to researchers in data-intensive fields such as biology, signal processing and computer vision. They also enable improved data analysis by practitioners in security, marketing, business and government processes, and essentially any field that involves the analysis of feature-rich data. The proposed work includes the implementation of the more practical algorithms. Education and outreach aspects of this project include the mentoring of young researchers, the design of a new course for graduate and undergraduate students incorporating some of the PI's research, and the involvement of pre-college students and local communities into science and research.

View original record on NSF Award Search →