EAGER: Concurrent Data Structures

$265,044 · FY2016 · CSE · NSF

Yale University, New Haven CT

Investigators

Abstract

Most computer programming describes a sequence of steps to be carried out one by one by a single processor core. As the rate of increase in processor speed has slowed, CPU manufacturers have responded by building systems with many processor cores that together carry out multiple tasks simultaneously. These processors share a common memory that they use both for their own individual computations and for communication with other processors. Organizing this shared memory so that the processors can make progress without being delayed by other processors requires careful coordination and the design of specialized data structures and communication protocols to allow efficient cooperation without conflicts. The project will study how letting processors make random choices among different ways to accomplish the same task can improve the efficiency of these data structures and reduce the amount of memory they use, an approach that appears to be necessary given known impossibility results for non-randomized methods. This may significantly improve our ability to exploit the power of multicore machines, while simplifying the work of programmers of these machines. In addition to this impact on computer science and the practice of programming, the project will directly impact both undergraduate and graduate research. Because concurrent data structures are well-suited to undergraduate implementation projects, which avoid difficulties that often arise when involving undergraduates in more theoretical research, the project will serve as a bridge for recruiting students into cutting-edge, high-stakes research, including students from under-represented groups. At the graduate level, results from the project will feed directly into the PI's teaching, including updates to the PI's publicly available lecture notes already in use by many students at other institutions.
The main question considered by the project is: Can we remove bottlenecks in concurrent executions of programs in shared-memory systems by using data structures that avoid traditional concurrency control tools like locking? Two techniques that appear promising are the use of randomization, which avoids problems where bad timing consistently produces bad executions in which different processes interfere with each other's operations, and limited-use assumptions, where shared data structures are designed under the assumption that they will only be used for a limited number of operations, allowing for significant reductions in cost and complexity. In addition to applying these techniques individually to the design of concurrent shared-memory data structures, the project will also consider how these methods can complement each other, for example by the use of randomized helping techniques to transform short-lived limited-use data structures into long-lived unlimited-use data structures. A key element of this work will be the development of improved cost measures for shared-memory data structures, including dynamic measures of space complexity that more accurately reflect practical memory usage than the worst-case measures of maximum memory consumption found in previous work.
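As an illustration of the limited-use idea, the following is a minimal sketch of a bounded max register, a standard construction from the concurrent data structures literature rather than a result of this project. It stores the largest value written so far, assumes every value lies in a fixed range known in advance, and uses only single-field reads and writes with no locks. The class name `MaxRegister`, the `capacity` parameter, and the eager allocation of the whole tree are choices made for this sketch; the sketch also assumes that each individual field read or write is atomic.

```python
class MaxRegister:
    """Lock-free max register for values in range(capacity).

    Sketch only: assumes each read or write of a single field is atomic.
    The bound on values is what makes this a limited-use structure.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        if capacity > 1:
            # Left subtree covers [0, half); right covers [half, capacity).
            self.half = (capacity + 1) // 2
            self.switch = 0  # becomes 1 once any value >= half is written
            self.left = MaxRegister(self.half)
            self.right = MaxRegister(capacity - self.half)

    def write_max(self, value):
        if not 0 <= value < self.capacity:
            raise ValueError("value out of range")
        if self.capacity == 1:
            return  # only the value 0 is possible; nothing to record
        if value < self.half:
            # Small values matter only while no large value has been written.
            if self.switch == 0:
                self.left.write_max(value)
        else:
            # Record the value in the right subtree first, then flip the
            # switch, so a reader that sees switch == 1 finds the value.
            self.right.write_max(value - self.half)
            self.switch = 1

    def read_max(self):
        if self.capacity == 1:
            return 0
        if self.switch == 0:
            return self.left.read_max()
        return self.half + self.right.read_max()


if __name__ == "__main__":
    import threading

    reg = MaxRegister(1024)
    # Eight threads each write one value; no thread ever takes a lock.
    threads = [
        threading.Thread(target=reg.write_max, args=(v,))
        for v in (3, 700, 12, 999, 45, 512, 0, 998)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(reg.read_max())
```

Each operation touches one node per level of the tree, so its cost grows with the logarithm of `capacity` rather than with the number of processes. That saving is available only because the range of values is bounded in advance, which is the kind of trade-off the limited-use assumption is meant to exploit.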
