GGrantIndex
← Search

CAREER: Querying Evolving Graphs

$497,808FY2018CSENSF

New York University, New York NY

Investigators

Abstract

Graphs are used to represent a plethora of phenomena, including the Web, social networks, biological pathways, transportation networks, and semantic knowledge bases. Many interesting and important questions about graphs concern their evolution rather than their static state: Which Web pages are showing an increasing popularity trend? How does influence propagate in social networks? How do the utilization of transportation options and the cost of ridership in a city change during the day and throughout the week? How does knowledge evolve? Formulating these questions as programs is currently beyond the skills of most data scientists. Executing such programs poses tremendous efficiency challenges, especially for graphs with billions of edges, and with non-trivial evolution rates. Much research and engineering effort today goes into developing sophisticated graph analytics and their efficient implementations, both stand-alone and in scope of data processing platforms. Yet, systematic support for querying and analysis of evolving graphs is still lacking. This support is urgently needed, due both to the scalability challenges inherent in evolving graph analysis, and to considerations of usability and ease of dissemination. This project will fill this gap by establishing the fundamental principles of effective modeling and efficient analysis of evolving graphs, and by making results available to the community of use in an open-source platform called Portal. This project will build on the state of the art in temporal data management, making the principles and techniques that were developed over decades of research and practice in that domain available to evolving graph applications. The project will develop: (1) a data model for evolving graphs and an expressive compositional algebra; (2) an efficient implementation of the data structures and of the algebraic operations, together with any necessary algebraic primitives and physical representations / access methods, in scope of a distributed data-parallel framework; (3) a declarative query language that supports concise specification of sophisticated graph analysis tasks, and a query optimizer that generates efficient query execution plans; (4) a principled evaluation methodology of usability and efficiency, based on real and synthetic datasets and analysis tasks. This project will impact research and practice in data management, by contributing novel representation, analysis and benchmarking methods for evolving graph data. Results of this project will help incorporate sophisticated evolving graph analysis into larger applications, and will enable scaling up to modern volumes. The Portal framework will support computational and data scientists who work with evolving graphs in social network analysis, knowledge management and network traffic analysis. A prominent set of use cases for this work will come from data science for social-good applications, including urban homelessness and analysis of transportation utilization and cost in cities. For further information see the project web page: portaldb.github.io. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →