A Performance Model for Partitioned Global Address Space Languages
Michigan Technological University, Houghton MI
Investigators
Abstract
Writing a program to solve a large scale computational problem on a supercomputer is much more difficult than writing a program to solve the same problem on an ordinary computer. DARPA's High Productivity Computing Systems program has focused attention on a new family of parallel programming languages that reduce the effort required to develop supercomputer programs. One of the most widely used of these languages is Unified Parallel C (UPC). This research project develops a way to predict how long UPC programs will run so that programmers will know beforehand whether their programs will run efficiently. Predicting program run times reduces the need for the trial-and-error development of programs. This saves the time of the programmers and the supercomputer, both expensive commodities. This project advances work on a performance model for UPC implementations that run on clusters. A model that describes the remote reuse distance for objects in the software cache is developed. This is a natural extension of local reuse distance for hardware cache. An analysis of remote reuse distance yields functions that predict cache behavior and its impact on performance. The effects of problem size, blocking factor, and the number of threads are included in the model. Common benchmarks, such as the NAS parallel benchmarks, are used to validate the model. The addition of a model of software cache behavior provides programmers with more accurate information about overall run times and suggests ways to improve software cache management for UPC and other languages in its family.
View original record on NSF Award Search →