Performance Portability of an Intermediate-Complexity Atmospheric Research Model in Coarray Fortran
Author/Presenters
Event Type
Workshop

Applications
Effective Application of HPC
Parallel Programming Languages, Libraries, Models and Notations
Performance
Programming Systems
SIGHPC Workshop
Scientific Computing
Time
Monday, November 13th, 12pm - 12:30pm
Location
702
Description
We present results on the scalability and performance of an open-source Coarray Fortran (CAF) mini-application (mini-app) that implements several parallel numerical algorithms known to dominate the execution time of the Intermediate Complexity Atmospheric Research (ICAR) model developed at the National Center for Atmospheric Research (NCAR). The solver employs standard Fortran 2008 features and includes several Fortran 2008 implementations of the collective subroutines defined in the Committee Draft of the upcoming Fortran 2015 standard. The ability of CAF to run atop various communication layers and the increasing compiler support for CAF facilitated initial evaluations of several compiler/runtime/hardware combinations. Results are presented for the GNU, Intel, and Cray compilers, each of which offers different parallel runtime libraries employing one or more communication layers, including MPI, OpenSHMEM, and proprietary alternatives. We studied performance on both multi- and many-core processors in distributed-memory systems. The results of our initial investigations suggest promising scaling behavior across a range of hardware, compiler, and runtime choices on platforms ranging up to 100,000 cores.