PACXXv2 + RV -- An LLVM-Based Portable High-Performance Programming Model
Event Type
Workshop

Compiler Analysis and Optimization
Compilers
Debugging
Parallel Programming Languages, Libraries, Models and Notations
Program Transformation
SIGHPC Workshop
TimeMonday, November 13th11:30am - 12pm
Location710
DescriptionTo achieve high performance on today's HPC systems, multiple programming models have to be used. An example for this burden to the developer is OpenCL: the OpenCL's SPMD programming model must be used together with a host programming model, commonly C or C++. Different programming models require different compilers for code generation, which introduce challenges for the software developer, e.g, different compilers must be convinced to agree on basic properties like type layouts to avoid subtle bugs. Moreover, the resulting performance highly depends on the features of the used compilers and may vary unpredictably.
We present PACXXv2 -- an LLVM based, single-source, single-compiler programming model which integrates explicitly parallel SPMD programming into C++. Our novel CPU back-end provides portable and predictable performance on various state-of-the-art CPU architectures comprising Intel x86 architectures, IBM Power8 and ARM Cortex CPUs. We efficiently integrate the Region Vectorizer (RV) into our back-end and exploit its whole function vectorization capabilities for our kernels. PACXXv2 utilizes C++ generalized attributes to transparently propagate information about memory allocations to the PACXX back-ends to enable additional optimizations.
We demonstrate the high-performance capabilities of PACXXv2 together with RV on benchmarks from well-known benchmark suites and compare the performance of the generated code to Intel's OpenCL driver and POCL -- the portable OpenCL project based on LLVM.
We present PACXXv2 -- an LLVM based, single-source, single-compiler programming model which integrates explicitly parallel SPMD programming into C++. Our novel CPU back-end provides portable and predictable performance on various state-of-the-art CPU architectures comprising Intel x86 architectures, IBM Power8 and ARM Cortex CPUs. We efficiently integrate the Region Vectorizer (RV) into our back-end and exploit its whole function vectorization capabilities for our kernels. PACXXv2 utilizes C++ generalized attributes to transparently propagate information about memory allocations to the PACXX back-ends to enable additional optimizations.
We demonstrate the high-performance capabilities of PACXXv2 together with RV on benchmarks from well-known benchmark suites and compare the performance of the generated code to Intel's OpenCL driver and POCL -- the portable OpenCL project based on LLVM.