AF:Small:Mathematical Programming for Average-Case Problems
University Of California-Berkeley, Berkeley CA
Investigators
Abstract
Optimization, perhaps the most ubiquitous computational task of all, consists of finding a solution that either maximizes or minimizes a certain function while satisfying a set of constraints. If the space of solutions to the problem is finite and discrete, then such problems are referred to as "combinatorial optimization" problems. Unfortunately, most combinatorial optimization problems are computationally intractable to solve exactly, i.e., it is widely believed that finding the optimal solution would be prohibitively time-consuming. To cope with this computational intractability, it is natural to try to find an approximate solution to the optimization problem. A classic approach to find approximate solutions that has been studied for more than four decades now, is the use of ``convex relaxations". The idea of convex relaxation is to convert the underlying optimization task into a convex optimization task, since efficient algorithms are known for convex optimization. Intuitively, a convex optimization problem is one whose underlying solution space is continuous, and for every pair of solutions, their average is also a solution. Converting an arbitrary combinatorial optimization problem in to a convex optimization task is inherently lossy, in that we only recover an approximate solution to the original problem in the process. The question then is, how good is the approximation? and which convex relaxation yield the best approximation? Over the last two decades, these questions have been extensively studied for two fundamental classes of convex relaxations widely used in algorithms -- namely linear and semi-defintie programming. However, the study of convex relaxations have been mostly centered around the ``worst-case" where the goal is to demonstrate guarantees on the approximation obtained by the convex relaxation, on every instance of the underlying problem. Although worst case analysis is useful in that it yields a guaranteed approximation on every instance of the underlying problem, it is too pessimistic at times. In many real-life settings, the underlying input to the optimization problem is chosen from a natural probability distribution. For example, the optimization problem could be set on a social network, wherein the connections are not completely arbitrary (worst case), but have distinct structure that can be modeled using a probability distribution. Another example, is that of recovering a signal in presence of noise, the noise is typically not completely arbitrary (worst case), but somewhat random (average case). This proposal aims to study the performance of convex relaxation techniques namely linear and semidefinite programming in the average-case setting. The proposal will develop new algorithms using convex relaxations and characterize the approximation obtained via convex relaxations in the average case setting. Algorithms for the average case might translate to better approximations in practical applications. Furthermore, the techniques developed and the questions raised will advance the state-of the-art in many different areas including, approximation algorithms, computational complexity theory, probability and random matrix theory and convex optimization. The PI will disseminate the deeper insights so gained, to a broader of audience through designing a graduate course on convex relaxations, supervising research projects for undergraduates, and organizing a workshop, apart from delivering conference talks and seminars. Technical Overview: A majority of the work on mathematical programming for combinatorial optimization is devoted to algorithms on worst-case inputs. In other words, these algorithms have approximation or run-time guarantees on every input. This research project is concerned with understanding the power of mathematical-programming based algorithms in the average case, where the inputs to the algorithm are assumed to sampled from a natural underlying distribution. A new and growing body of work studies a diverse set of problems like random constraint satisfaction problems, statistical block models, planted clique, sparse PCA , tensor completion, tensor PCA, compressed sensing and matrix completion, where the underlying instance is drawn from a probability distribution. This project intends to develop a theory characterizing the power of mathematical programming, especially the sum-of-squares (SoS) SDP hierarchy in this context. Our approach has the potential to yield remarkably simple necessary and su cient conditions which precisely predict the efficacy of LP and SDP hierarchies for broad classes of distributional problems. It begins with a seemingly trite, but important conceptual shift in our approach for analyzing SoS SDPs in distributional settings, namely cast the distributional problems as the task of distinguishing samples from two diferent distributions. Viewing the problem from this standpoint, exposes the central thesis underlying this proposal that for broad classes of average-case problems, the power of LP or SDP hierarchies are characterized by simple families of distinguishing functions. For example, we argue that the power of SoS SDP relaxations for distinguishability are characterized by so-called low-degree spectral distinguishers. These are functions that compute a matrix whose entries are low-degree polynomials of the input and then output a simple function of the eigenvalues, say the largest one. This characterization opens up several exciting research avenues, both to prove lower bounds against and to devise algorithms for distributional problems. By translating these problems in to the realm of indistinguishability with respect to simple families of distinguishers, it makes them amenable to tools from random matrix theory and pseudorandomness. On the one hand, this research program is likely spur and be accompanied by, corresponding advances in random matrix theory. On the other, it holds the promise to shed light on the computational complexity of several basic tasks in high-dimensional statistics and unsupervised learning such as sparse PCA, tensor PCA and statistical block models. For average-case problems like these, in the absence of conventional complexity theoretic notions such as NP-completeness, characterizing the power of LP/SDP relaxations is possibly the most compelling evidence of their complexity that one can hope to produce.
View original record on NSF Award Search →