P76: A Compiler Agnostic and Architecture Aware Predictive Modeling Framework for Kernels
Session: Poster Reception
Event Type: ACM Student Research Competition, Poster, Reception

Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: Multi-architecture machines make it difficult to characterize programs for regression modeling. Deciding where to offload compute-dense portions of a program requires accurate prediction models for multiple architectures. To achieve portable performance productively across these diverse architectures, users are adopting portable programming models such as OpenMP and RAJA.
Once adopted, these portable programming models make traditional high-level source code analysis inadequate for program characterization. In this poster, we introduce a common microarchitecture instruction format (ComIL) for program characterization. ComIL can represent programs in an architecture-aware, compiler-agnostic manner. We evaluate feature extraction with ComIL by constructing multiple regression-objective models for performance (execution time) and correctness (maximum absolute error). These models perform better than the current state of the art, achieving a mean error rate of only 4.7% when predicting execution time. We plan to extend this work to handle multiple architectures concurrently and to evaluate it with more representative physics kernels.