RI: Small: Efficient Learning and Inference with Perturbations

$272,979 | FY2017 | CSE | NSF

Purdue University, West Lafayette IN

Abstract

Learning and inference drive much of the research in many diverse domains, such as natural language processing, computer vision, speech processing, and computational biology. In these fields, complex models are required to better represent real-world objects (e.g., sentences, images, speech, proteins). To gain representational power, such models express objects as the interaction of a large number of constituent elements. While this produces more realistic models, it also increases the computational cost of inferring such objects, as well as of learning such inference models from data. The situation worsens as real-world objects become large scale, which opens the opportunity to investigate the use of randomized algorithms to make learning and inference more computationally efficient. This project will also provide education and outreach opportunities through a Hands-on Learning Theory course, undergraduate involvement in research, and workshops at major conferences on the topic of learning and inference.

The goal of this project is to develop novel randomized polynomial-time algorithms for learning and inference in large-scale structured prediction problems. The project aims to analyze maximum margin models, maximum a posteriori perturbation models, latent variable models, and the relationship between regularization and different notions of perturbation. The project uses theoretical methods to create new algorithms with practical advantages over current methods. The project aims to produce algorithms that run in polynomial time, use a small number of training samples that is both sufficient and necessary, and have a guarantee of small generalization error. All software produced in this project will be open-sourced and made available for download.
