SaTC: CORE: Medium: Principles and Algorithms for Visual Data Exploration Under Differential Privacy

$1,191,106 | FY2020 | CSE | NSF

University Of Massachusetts Amherst, Amherst MA

Investigators

Gerome Miklau (contact), Ali Sarvghad Batn Moghaddam, Narges Mahyar

Abstract

A fundamental part of data science is visual data exploration, which uses visualization and user interaction to facilitate the discovery of new knowledge and actionable insights from data. Visual data exploration is well-supported by current visual analytics technology. However, in many domains, the data being explored includes personal facts about individuals. Access to the data may therefore be limited by privacy policies or regulations. This project will develop novel visual data exploration technology that can support the discovery of new knowledge while at the same time guaranteeing that individuals’ privacy is protected. These technologies will be studied in the context of healthcare data, government administrative data, and mobility data. This project will expand the safe and effective exploration of private data, allowing a broader community of data scientists to generate insights from a richer set of data sources, including those previously off-limits due to privacy concerns.
The visual exploration methods developed in this project will provide guarantees in the model of differential privacy, which is emerging as the dominant standard for protecting personal data. Enabling accurate visual exploration of data while offering a guarantee of differential privacy requires novel advances in privacy algorithms and visualization technology, as well as careful evaluation methodology and experiments with human subjects. The fundamental challenges of supporting data visualization under differential privacy stem from the complex interaction between privacy algorithms and visualization techniques. Algorithms for private data release can be designed more effectively when they are customized to visualization tasks, and noisy privatized data calls for special visualization methods, including those that communicate uncertainty and are robust to spurious visual artifacts.
The proposed research has the potential to transform the use of private data by (i) investigating how current visualization and interaction techniques should be adapted in the presence of noise introduced by differentially private algorithms, (ii) developing new algorithms that offer better visual accuracy, for both static visualizations and interactive visual exploration, and (iii) providing a benchmark and evaluation standards to accelerate innovation in private visualization. The effectiveness and value of these algorithms will be evaluated empirically through a series of human-centered evaluations. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
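
To make the interaction between privacy noise and visualization concrete, the following is a minimal illustrative sketch, not the project's own method. It assumes the standard Laplace mechanism applied to a histogram in which each individual contributes to exactly one bin and neighboring datasets differ by adding or removing one record (so the L1 sensitivity is 1). The function names, the example counts, and the choice of epsilon are all hypothetical. The computed half-width is the kind of value a chart might draw as an error bar to communicate uncertainty.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(counts, epsilon, rng=None):
    """Return noisy bin counts using the Laplace mechanism.

    Assumption: adding or removing one record changes exactly one bin
    by 1, so the L1 sensitivity is 1 and the noise scale is 1/epsilon.
    """
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def noise_half_width(epsilon, confidence=0.95):
    # For X ~ Laplace(0, b): P(|X| <= t) = 1 - exp(-t / b),
    # so t = b * ln(1 / (1 - confidence)).
    return (1.0 / epsilon) * math.log(1.0 / (1.0 - confidence))


# Hypothetical bin counts, for illustration only.
counts = [120, 45, 3, 0, 78]
epsilon = 0.5

noisy = private_histogram(counts, epsilon, rng=random.Random(0))
half_width = noise_half_width(epsilon)

for i, value in enumerate(noisy):
    # A noisy count can be negative or far from a small true count;
    # drawing the interval helps viewers avoid over-reading such bins.
    print(f"bin {i}: {value:8.2f} +/- {half_width:.2f}")
```

In this sketch small bins are dominated by noise while large bins are barely affected, which is one way spurious visual artifacts can arise when privatized counts are plotted without any indication of uncertainty.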
