HNDS-I: Bringing Differential Privacy to Social Science Data Repositories
Harvard University, Cambridge MA
Investigators
Abstract
Solving many of the most important and troubling problems of human society will require the ability to interpret the very large, detailed, and highly informative sets of data collected from billions of people around the world. These data come from sources such as cell phone records, insurance records, medical records, social media, and web traffic. However, these data contain identifying information that, if made available to the public, could be dangerous or embarrassing to the people involved. It is important, therefore, to find ways for scientists to conduct the research necessary to learn what causes society’s ills and how to fix them without violating anyone’s individual privacy. This project builds open-source, community-based software tools that will let scientists safely access, analyze, and share sensitive datasets, with mathematical guarantees for the privacy of the individuals who may be represented in those datasets. The project is based on the mathematical theory of differential privacy, which adds carefully calibrated random noise to the results of computations on the data so that individual-level information is hidden. Even though the individual-level information is hidden, differential privacy still allows researchers to study the population-level patterns in the data. The project builds user-friendly, publicly available interactive differential privacy tools that can integrate with widely used data repositories. These tools can be used effectively even by those without expertise in differential privacy. The project also creates new statistical methods that ensure that the tools are useful for social, behavioral, and economic research. The infrastructure created by this project will allow private companies, government agencies, and other organizations to share data with researchers while remaining confident that privacy violations cannot occur.
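As a rough illustration of how hiding individual-level information can coexist with accurate population-level results, the sketch below applies the Laplace mechanism, a standard differential privacy technique, to a simple counting query. It is not the project's software: the function name `private_count`, the synthetic list of ages, and the privacy parameter `epsilon = 0.5` are all assumptions chosen for illustration.

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1: adding or removing one person's
    record changes the true count by at most 1. Adding Laplace noise with
    scale 1/epsilon therefore gives epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two independent Exponential(rate=epsilon) draws
    # follows a Laplace distribution with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical data: 10,000 synthetic ages (illustrative only).
rng = random.Random(0)
ages = [rng.randint(18, 89) for _ in range(10_000)]

true_count = sum(1 for a in ages if a >= 65)
noisy_count = private_count(ages, lambda a: a >= 65, epsilon=0.5, rng=rng)
```

With `epsilon = 0.5` the noise has scale 2, so the noisy count usually lands within a few units of the true count. That error is negligible against a population of thousands, yet it is large enough to mask whether any single person's record is present. Sampling noise with ordinary floating-point arithmetic, as done here, is known to be unsafe for production use; the sketch is meant only to convey the idea.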
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.