EAGER: SaTC: Quantifying the Fair Value of Data and Privacy in Distributed Learning
University of California-Berkeley, Berkeley, CA
Investigators
Abstract
Data-driven decision-making powers the modern economy. As data becomes an increasingly important resource, understanding its value becomes critical. This project combines the economic study of data with the growing field of data privacy to present a framework for quantifying the value of data and privacy. It explores the relationship between users who generate data and platforms that collect and benefit from it, advancing our understanding of what constitutes a fair and open data market. Emphasis is placed on fair payment for data: well-known concepts from economics are extended to include the critical component of user privacy. A better understanding of the value of data and privacy can empower individuals and regulators, leading to a stronger and more productive economy. In addition, the project addresses the important topic of fairness. This research has the potential to transform the way data is viewed, treated, and monetized by studying a fundamental framework in which platforms and individuals can both benefit fairly from the value of data.
This project systematically approaches the fundamental question of how to quantify the value of data in a privacy-centric game-theoretic framework, in order to explore the relationship between platforms and the users who generate data, leading to the concept of a fair and open data market. Fair payment for data is considered using the foundational game-theoretic concept of the Shapley value, extended to include the critical component of heterogeneous user privacy. Specifically, the project investigates how to quantify the cost of providing privacy, how to monetize the value of data at heterogeneous levels of privacy, and how to enable platforms to design fair incentive structures.
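As a rough illustration of the Shapley value mentioned above, the following minimal Python sketch computes exact Shapley values for a toy three-user data market. The user names, the quality scores, and the square-root utility are hypothetical assumptions made only for this example; they are not taken from the project, and the project's privacy-aware extension is not reproduced here.

```python
from itertools import combinations
from math import factorial, sqrt


def shapley_values(players, value):
    """Exact Shapley values for a small cooperative game.

    `value` maps a frozenset of players to the utility that coalition
    generates. The cost is exponential in the number of players, so this
    is only practical for toy examples.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition that i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of player i to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Hypothetical data-quality scores: a user who demands more privacy (more
# noise) is assumed to contribute a lower score. "carol" contributes no signal.
quality = {"alice": 4.0, "bob": 1.0, "carol": 0.0}


def platform_utility(coalition):
    # Assumed concave utility: diminishing returns in total data quality.
    return sqrt(sum(quality[p] for p in coalition))


payments = shapley_values(list(quality), platform_utility)
for user, payment in payments.items():
    print(f"{user}: {payment:.4f}")
```

Under these assumptions the payments sum to the platform's total utility, and a user whose data carries no signal receives nothing; both are standard properties of the Shapley value.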
The proposed investigation is highly interdisciplinary, drawing on optimization, machine learning, probability theory, and statistics and, critically, on relevant aspects of economics and game theory, such as Nash equilibria and the Shapley value, to treat concepts like fairness and value. The project aims to bridge recent advances in rigorous privacy guarantees for statistical inference and machine learning with the economics of quantifying the value of data under heterogeneous privacy requirements.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.