
RII Track-2 FEC: The Visual Experience Database: A Large-Scale Point-of-View Video Database for Vision Research

$3,974,003 · FY2019 · O/D · NSF

Bates College, Lewiston ME

Abstract

Current artificial intelligence (AI) systems that recognize visual content require millions of training examples to achieve good performance. However, the databases used to train such systems often draw their photos and videos from the internet, and thus do not represent the content that humans see on a daily basis. This introduces substantial biases into the AI systems, which can have serious implications for AI-based applications such as self-driving cars. This project, a collaboration between Bates College, the University of Nevada, Reno, and North Dakota State University, Fargo, will create the Visual Experience Database (VED), a database of over 240 hours of video shot from the perspective of a diverse set of observers engaged in common, everyday activities such as shopping, eating, or walking. Along with these videos, we will track each observer's head and eye position in order to understand how people look at the world, and how this changes with environment, age, and task. Our goal is to make this database open and accessible to all. Having the computer skills to use the database is key to accessibility, so we will release a suite of software tools for working with the database and run a summer workshop in basic computer programming skills to grow a workforce that is prepared for a variety of scientific occupations. By making the database open to the public, we will enable scientists, historians, and even artists to benefit from this rich resource.

Progress in both human visual neuroscience and computer vision is limited by the availability of representative visual data. However, currently available image and movie databases are not representative of typical first-person visual experience. This project, a collaboration between Bates College, the University of Nevada, Reno, and North Dakota State University, Fargo, will create the Visual Experience Database (VED), a database of over 240 hours of first-person video complete with eye- and head-tracking. We will record from people of diverse ages (5-70 years) across three geographically distinct sites as they engage in common, everyday activities such as shopping, eating, or walking. With these data, we will be able to assess how observers sample their visual environments, and how gaze patterns change with environment, age, and task. Further, these data can be used as training data for next-generation computer vision systems. To develop a workforce with the skills necessary to work with big data, we will teach a Big Data Skills Summer Workshop that gives undergraduate and graduate students the basic computer programming skills and computational literacy they need to contribute to this project and to prepare for a variety of STEM occupations. The VED will be of broad use across several academic communities (cognitive science, neuroscience, computer vision, and possibly digital humanities and art). By creating a database that represents common human experiences, we bypass the many biases of extant datasets, which will increase the efficacy of computer vision algorithms. By making these data fully open, we will enable advances in these fields to be accessible to all.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
