Unifying Information- and Optimization-Theoretic Approaches for Modeling and Training Generative Adversarial Networks

$1,099,999 | FY2021 | MPS | NSF

Arizona State University, Scottsdale AZ

Investigators

Lalitha Sankar (contact), Angelia Nedich, Giulia Pedrielli, Shiwei Lan

Abstract

The success of modern machine learning (ML) is driven by the copious amounts of data needed to learn complex predictive models. However, the lack of publicly available data to develop and test ML algorithms has large societal implications, including unverifiable concerns of algorithmic bias in black-box models. The need for public datasets is even stronger for critical systems such as the electric grid, where realistic datasets are essential for both real-time decision-making and robust long-term planning. Synthetic data promises a secure and consistent way to develop ML algorithms; yet developing principled methods to generate synthetic data with guarantees on learning the data distribution remains an open problem. Generative adversarial networks (GANs) have emerged as an effective deep learning approach for generating synthetic data. GANs involve two modules, modeled in practice as deep neural networks: a generator of synthetic samples, and a discriminator that classifies its inputs as real or fake. The opposing goals of the two modules yield a minimum-maximum (min-max) game. Despite their success, GANs are difficult to train due to a range of instability problems, non-convergence, and mode collapse. This project develops a unified information- and optimization-theoretic framework to address these challenges, leveraging information theory, optimization and game theory, Bayesian methods, and stochastic sequential search techniques. The educational component trains graduate students, postdocs, and undergraduates, particularly from underrepresented minority groups (via an Arizona State University Summer Undergraduate Research Initiative), across electrical engineering, computer science, and statistics to address emerging challenges in data science and machine learning. Connections between vanishing gradients and GAN loss functions are addressed via a loss function-based tunable framework for GANs that recovers several oft-used GANs.
The project tackles GAN optimization problems in two novel ways: (i) establishing existence of solutions for a general class of nonconvex and functional min-max problems; and (ii) introducing a unifying variational inequality (VI) framework to systematically solve deterministic and stochastic VI problems, including min-max GANs. The project addresses mode collapse by training GANs with Bayesian priors on generator and discriminator parameters. A key novelty of this approach is in identifying the role of the latent space in mode collapse using an inverse problem methodology. Finally, the project applies and evaluates the proposed approaches for training tunable GANs and develops a stochastic sequential search algorithm to assure global optimality of trained GANs. Theoretical results are evaluated using both public and proprietary datasets. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
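
For readers unfamiliar with the min-max game and the variational inequality (VI) view mentioned in the abstract, the following is a sketch in standard textbook notation. The abstract gives no formulas, so the symbols here (generator G_theta, discriminator D_phi, data distribution p_data, latent distribution p_z, feasible set W) are assumptions for illustration, and the value function shown is the classical GAN objective rather than the project's tunable loss.

```latex
% Classical GAN value function (assumed notation; not taken from the award text).
\[
\min_{\theta}\,\max_{\phi}\; V(\theta,\phi)
  = \mathbb{E}_{x\sim p_{\mathrm{data}}}\bigl[\log D_{\phi}(x)\bigr]
  + \mathbb{E}_{z\sim p_{z}}\bigl[\log\bigl(1 - D_{\phi}(G_{\theta}(z))\bigr)\bigr]
\]
% VI view: stack the parameters as w = (theta, phi) and define the operator F.
\[
F(w) =
\begin{pmatrix}
  \nabla_{\theta} V(\theta,\phi) \\
  -\nabla_{\phi} V(\theta,\phi)
\end{pmatrix},
\qquad
\text{find } w^{\ast}\in\mathcal{W} \text{ such that }
\langle F(w^{\ast}),\, w - w^{\ast}\rangle \ge 0
\quad \forall\, w\in\mathcal{W}.
\]
```

When V is convex in theta and concave in phi, solutions of this VI coincide with saddle points of the game; in the nonconvex setting the abstract targets, they are in general only first-order stationary points.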
