CIF: Small: Collaborative Research: Analytics on Edge-labeled Hypergraphs: Limits to De-anonymization

$250,000 · FY2016 · CSE · NSF

Princeton University, Princeton NJ

Abstract

Data analytics is a rapidly growing field, aided by the availability of huge amounts of data and significant computing power. The immense potential of data analytics to benefit society in application areas such as health, economics, and finance depends on meeting the fundamental and urgent challenge of protecting user privacy. This project studies new theoretical paradigms and approaches to address the privacy vulnerability of users in network environments in the presence of big data. The vulnerability results from the endogenous structural dependencies in the network as well as from exogenous auxiliary information outside the network that permits deanonymization of the users. This project has transformative potential to impact a broad class of applications where user privacy is critical. The project's inherently interdisciplinary nature and real-world technological potential complement the investigators' ongoing efforts to engage more students (especially women and minorities) in studying topics at the intersection of application and quantitative reasoning in the STEM disciplines.

The research is divided into three thrusts:

(1) Development of information-theoretic converses for the deanonymization problem in random edge-labeled hypergraphs for adversaries with access to correlated information sources. Such converses make it possible to derive necessary conditions under which the adversary cannot deanonymize the system, no matter how much computational power or storage is available.

(2) Practical achievable schemes: Besides the tight (but not necessarily efficient) achievable schemes required for calibrating the converses, the project explores the design of practical deanonymization algorithms to quantify how much attackers can learn when the released datasets do not meet the necessary conditions of the converse.

(3) Real-world evaluations: The performance of the algorithms and their practical applicability are evaluated on real-world datasets.
