TRIPODS: Towards a Unified Theory of Structure, Incompleteness & Uncertainty in Heterogeneous Graphs

$1,581,509 · FY2017 · CSE · NSF

University Of California-Santa Cruz, Santa Cruz CA

Investigators

Abstract

This project brings together researchers from mathematics, statistics, and computer science to develop a unified theory of data science applied to uncertain and heterogeneous graph and network data. Most real-world applications of networks involve complex phenomena, such as socio-behavioral interactions, biological and/or chemical processes, technical systems like data centers, and communication systems for smart cities. These data are heterogeneous, including multiple modalities and multiple scales. Crucially, the observed data are often incomplete and very noisy. A new foundation for data science needs to be built in order to address these challenges in the context of graph and network data. Similarly, we lack a clear unified theory that allows us to understand how to quantify the uncertainty in the system that arises from the uncertainty in the relationships among its actors. This is a fertile area for transdisciplinary collaboration between statisticians, mathematicians, and computer scientists, with strong impacts on industry, academia, government, and broader society.

This project centers around two research themes. In the first theme, the PIs will investigate models of algorithms on uncertain network data, and specifically combine techniques from sub-linear algorithms with Bayesian methods. The second theme focuses on how algorithms can benefit from data uncertainty, in the context of privacy, disclosure, and robustness to noise. In both themes, technical advances will be achieved by marrying computational approaches to uncertainty with statistical and mathematical approaches to uncertainty.

In addition to the research agenda, the project involves an ambitious vision for data science capabilities spanning academia to industry. The education aspects of this vision include a series of themed workshops and the development of comprehensive educational resources spanning secondary, undergraduate, and advanced graduate materials. The project also involves collaboration between UC Santa Cruz and Silicon Valley companies that will ground the proposed theoretical and algorithmic advances in practical applications to real-world problems and data. Furthermore, these collaborations with industrial partners will lead to specialized workforce development. Looking forward towards Phase II, the project aims to build collaborations with industry partners and various academic institutions in the area to establish a Silicon Valley/Greater Bay Area Institute on Foundations of Data Science, potentially to be located at the University of California Santa Cruz Silicon Valley campus in Santa Clara.

Funds for the project come from CISE Information Technology Research and MPS Division of Mathematical Sciences.
