CRII: SHF: Optimizing Program Executions on Non-uniform Threaded Architectures

$174,973 | FY2015 | CSE | NSF

College of William and Mary, Williamsburg, VA

Investigators

Abstract

Modern computer systems employ multiple threads to exploit multiple cores with multiple functional units. High thread-level parallelism arises from having multiple threads per core (simultaneous or fine-grained multithreading, SMT), multiple cores per processor (chip multiprocessor, CMP), and multiple processors per node (non-uniform memory access, NUMA). Threads share different levels of hardware resources depending on where they execute. An architecture that combines SMT, CMP, and NUMA threads is a non-uniform threaded architecture. Most multi-socket systems today are of this form. Because operating systems and programming models present threads as a uniform abstraction, software developers often treat all threads in the system as symmetric, ignoring their non-uniformity in hardware. As a result, multithreaded code running on non-uniform threaded architectures performs far below the theoretical peak, which can degrade productivity and increase energy consumption. The main focus of this project is to investigate both static and dynamic software methods to exploit non-uniformity between hardware threads. For the static optimization, the PI aims to introduce thread non-uniformity in software via code transformations. This will also include improving existing applications by identifying opportunities in application source code to apply such transformations. For the dynamic optimization, the PI plans to study online scheduling methods to match non-uniform threads in software and hardware. An executable will be analyzed online to characterize resource sharing among its threads, and appropriate scheduling strategies will be applied to both threads and data. This project will tightly integrate static and dynamic optimization methods for programs running on non-uniform threaded architectures. This research can dramatically improve the performance of large-scale multithreaded applications running on today's and emerging parallel architectures.
More broadly, this project will have a strong impact on the design of performance tools and parallel programming frameworks for non-uniform threads. It will likely attract broad interest from industry and academia. An important part of this project is its integration with undergraduate and graduate teaching as well as student mentoring.
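
To make the idea of matching software threads to non-uniform hardware threads concrete, the following is a minimal sketch and not the project's method. It assumes a toy machine described by (socket, core, SMT slot) triples, a hypothetical communication matrix between software threads, and made-up distance weights for the SMT, CMP, and NUMA sharing levels. It then searches exhaustively for the placement with the lowest weighted communication cost, which is only feasible for very small examples.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class HwThread:
    """One hardware thread, identified by socket, core, and SMT slot."""
    socket: int
    core: int
    smt: int


# Hypothetical relative costs of communicating across each sharing level.
# Real values depend on the machine and would have to be measured.
DISTANCE = {"SMT": 1, "CMP": 2, "NUMA": 4}


def build_topology(sockets, cores_per_socket, smt_per_core):
    """Enumerate all hardware threads of a regular toy machine."""
    return [
        HwThread(s, c, t)
        for s in range(sockets)
        for c in range(cores_per_socket)
        for t in range(smt_per_core)
    ]


def sharing_level(a, b):
    """Classify the closest level at which two hardware threads share resources."""
    if a.socket != b.socket:
        return "NUMA"  # different sockets
    if a.core != b.core:
        return "CMP"   # same socket, different cores
    return "SMT"       # same core


def placement_cost(comm, placement):
    """Sum of pairwise communication volume weighted by hardware distance."""
    n = len(comm)
    return sum(
        comm[i][j] * DISTANCE[sharing_level(placement[i], placement[j])]
        for i in range(n)
        for j in range(i + 1, n)
    )


def best_placement(comm, hw_threads):
    """Exhaustively search placements of software threads onto hardware threads."""
    return min(
        permutations(hw_threads, len(comm)),
        key=lambda placement: placement_cost(comm, placement),
    )


if __name__ == "__main__":
    # Two sockets, one core each, two SMT threads per core.
    hw = build_topology(sockets=2, cores_per_socket=1, smt_per_core=2)
    # Hypothetical profile: threads 0 and 2 communicate heavily, as do 1 and 3.
    comm = [
        [0, 1, 10, 1],
        [1, 0, 1, 10],
        [10, 1, 0, 1],
        [1, 10, 1, 0],
    ]
    naive = tuple(hw)  # software thread i on hardware thread i
    tuned = best_placement(comm, hw)
    print("naive cost:", placement_cost(comm, naive))
    print("tuned cost:", placement_cost(comm, tuned))
```

In this toy case the symmetric placement (software thread i on hardware thread i) separates the heavily communicating pairs across sockets, while the searched placement puts each pair on SMT siblings of the same core. A practical scheduler would replace the exhaustive search with a heuristic and derive the communication profile from online measurement.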
