Semi-algebraic complexity and models for massive data set processing

$497,377FY2008CSENSF

University Of Washington, Seattle WA

Investigators

Abstract

This project focuses on the computational complexity of problems in semi-algebraic inference and in massive data set processing, as well as on advancing the analytical techniques of multiparty communication complexity, which has been a useful tool for understanding problems in both of these areas. Semi-algebraic inference systems express sets of constraints using polynomial inequalities and use rules of inference to derive new polynomial inequalities from existing ones. They are widely used in operations research and combinatorial optimization. Part of this project is to investigate what can be derived efficiently using semi-algebraic inference, including the complexity of proofs of unsatisfiability based on semi-algebraic inference, the quality of the linear and semi-definite programming approximations for NP-hard optimization problems that can derived using known semi-algebraic inference, and the extent to which our understanding of inference over binary domains can be leveraged to yield better algorithms for classes of semi-algebraic inference over the real numbers. Networked computers have allowed the accumulation of data sets of unprecedented size. New distributed methods that process such massive data sets efficiently by giving only approximate answers are having substantial impact in practice. However, the theoretical models of massive data set processing that have been analyzed to date represent only a very limited class of methods and do not capture techniques already used in practice. Part of this research is aimed at producing and analyzing theoretical models that will allow us to understand the limitations of these methods for massive data set processing and to make better use of them. Multiparty communication complexity measures the amount of information that must be communicated between multiple participants, each having partial information about the inputs to a function, in order to compute its output. Because of its flexibility it is widely applicable to problems in computational complexity. The final goal of this project is to add to the rather limited set of techniques that can be used to analyze multiparty communication complexity with the aim goal of improving its applications to semi-algebraic inference and distributed massive data set processing.

View original record on NSF Award Search →