III: Small: Collaborative Research: Approximate Learning and Inference in Graphical Models
University Of Texas At Dallas, Richardson TX
Investigators
Abstract
This project is designing efficient methods for approximate learning and prediction in graphical models. In a typical setting, the parameters of an unknown graphical model are to be estimated from data observations. Once learned, the parameters are often used to make predictions about unseen data. The learning problem can be solved by estimating the parameters of the model that generate the observed data with the highest probability (a process known as maximum likelihood estimation), and the prediction task is typically performed by a statistical inference method. As exact learning and prediction are computationally intractable, in practice, we seek to replace the maximum likelihood estimation and prediction tasks with more tractable surrogates. This project is developing such surrogates that (a) can be leveraged for learning in large, real-world graphical models with hidden variables, (b) are orders of magnitude faster than the current state-of-the-art methods, and (c) provide a rigorous alternative to the more heuristic methods that are often employed at scale. Under certain conditions, surrogates can outperform exact maximum likelihood estimation combined with an approximate inference algorithm for prediction. However, many of the typical approaches are much too slow or too limited in power to be used to learn the kinds of large-scale graphical models with many hidden variables that arise in practice. This project studies the design of fast, distributed approximate learning and inference procedures based on the Bethe approximation, a surrogate that is known to perform well in practice. The core observation is that approximate maximum likelihood estimation using the Bethe surrogate can be reduced to solving a series of approximate inference problems via the Frank-Wolfe algorithm. The benefit of this approach is that many fast, combinatorial algorithms already exist for approximate inference in graphical models. In addition to provable bounds and convergence rates, the methods are practically evaluated using several publicly available datasets, primarily social network and image data. Baselines will be application-specific methods known to work well on those datasets, with the developed code made publicly available.
View original record on NSF Award Search →