P73: HPC Production Job Quality Assessment
Session: Poster Reception
Authors
Event Type
ACM Student Research Competition
Poster
Reception

Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: Users of HPC systems would benefit from more feedback about the quality of their application runs, such as whether the performance of a particular run was good, or whether the resources requested were sufficient or excessive. Providing such feedback requires keeping more information about production application runs, along with analytics to assess each new run. In this research, we assess the practicality of using job data, system data, and hardware performance counters in a near-zero-overhead manner to assess job performance, in particular whether the job runtime was in line with expectations from historical application performance. We show over four proxy applications and two real applications that our assessment is within 10% of actual performance.
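
As a rough illustration of what "in line with expectations from historical application performance" could mean in practice, the sketch below compares a new job's runtime against the median of earlier runs of the same application. This is not the authors' method: the function name `assess_runtime`, the use of a median as the expected value, the `tolerance` parameter, and the sample runtimes are all assumptions made for illustration, and the sketch ignores the system data and hardware performance counters the abstract mentions.

```python
from statistics import median


def assess_runtime(historical_runtimes, new_runtime, tolerance=0.10):
    """Compare a new runtime to the median of historical runtimes.

    Hypothetical helper: the median baseline and the tolerance value are
    illustrative assumptions, not taken from the poster.
    Returns (expected_runtime, relative_deviation, verdict).
    """
    if not historical_runtimes:
        raise ValueError("need at least one historical run")

    expected = median(historical_runtimes)
    # Positive deviation means the new run took longer than expected.
    deviation = (new_runtime - expected) / expected

    if deviation > tolerance:
        verdict = "slower than expected"
    elif deviation < -tolerance:
        verdict = "faster than expected"
    else:
        verdict = "in line with expectations"
    return expected, deviation, verdict


# Made-up runtimes in seconds for earlier runs of one application.
history = [118.0, 121.5, 120.0, 119.2, 122.3]

for runtime in (123.0, 140.0, 100.0):
    expected, deviation, verdict = assess_runtime(history, runtime)
    print(f"runtime={runtime:.1f}s expected={expected:.1f}s "
          f"deviation={deviation:+.1%} -> {verdict}")
```

A median is used here only because it is less sensitive to a few outlier runs than a mean; a real assessment would also need to account for input size, node count, and the other job and system data described above.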