CHS: Small: Supporting Crowdsourced Sensemaking in Big Data with Dynamic Context Slices

$532,000 | FY2015 | CSE | NSF

Virginia Polytechnic Institute And State University, Blacksburg VA

Abstract

This research will investigate how crowdsourcing and computational techniques can be combined to support the efforts of an individual analyst engaged in a complex sensemaking task, such as identifying a threat to national security or determining the names of people and places in a photograph. Currently, such complex tasks are beyond the capabilities of the most advanced machine learning techniques or crowdsourcing workflows, and even trained experts struggle to perform them. Huge quantities of data are now available online, but making sense of them is challenging because human cognition, while remarkably powerful, is nevertheless a limited resource. Visual analytics tools seek to overcome this limitation by leveraging the complementary strengths of information visualization and data mining, but these tools generally assist only with low-level tasks and still require significant effort from users.

Crowdsourcing has emerged as a promising technique for applying human intelligence to problems computers cannot easily solve, but for crowds to assist individuals with complex sensemaking tasks, two significant challenges must be addressed. First, we must understand whether crowds or computation are more useful at each phase of the sensemaking loop. Second, we must overcome the limited time and expertise of most crowd workers in order to sustain deep, complex lines of inquiry.

This research addresses both of these challenges through a series of four experiments. First, it will conduct a laboratory study in which individuals perform complex sensemaking tasks, to understand what types and amounts of context they use to make decisions and how the sensemaking loop might be decomposed into subtasks. Second, it will conduct a series of experiments comparing crowdsourcing to automated techniques for each of the most promising sensemaking subtasks. Third, it will experiment with different crowd workflows to develop a revised sensemaking loop, optimized for the relative strengths of crowds and computation, and develop a software prototype based on this approach. At the core of the software design is the novel concept of "context slices," a technique for addressing the transience of crowd workers by giving them only the information they need to complete their assigned task, allowing complex investigations to be pursued across multiple workers. The fourth experiment will evaluate this approach by comparing performance with the software to the baselines established in the first study.
