Developing an OpenMP Runtime for UVM-Capable GPUs

Workshop: LLVM-HPC2017: Fourth Workshop on the LLVM Compiler Infrastructure in HPC
Authors: Hashim Sharif (University of Illinois)

Abstract: With the emergence of new hardware architectures, programming models such as OpenMP must consider new design choices. In the GPU programming space, Unified Virtual Memory (UVM) is one of the new technologies that warrant such considerations. In particular, the new UVM capabilities supported by the Nvidia Pascal architecture introduce new optimization opportunities. While on-demand paging offered by a UVM-based system simplifies kernel programming, it can hamper performance due to excessive page faults at runtime. Accordingly, we have developed an OpenMP runtime that optimizes the data communication and kernel execution for UVM-capable systems. The runtime evaluates the different design choices by leveraging cost models that determine the communication and computation cost, given the application and hardware characteristics. Specifically, we employ static and dynamic analysis to identify application data access patterns that feed into the performance cost models. Our preliminary results demonstrate that the developed optimizations can provide significantly improved performance for OpenMP applications.

Workshop Index