SC17 Denver, CO

OpenSHMEM in the Era of Exascale

Authors: Dr. Steve Poole (Los Alamos National Laboratory)

Abstract: OpenSHMEM is a PGAS API for single-sided, asynchronous, scalable communication in HPC applications. It is a community-driven standard for the SHMEM API across multiple architectures and implementations. This BoF brings together the OpenSHMEM community to present the latest accomplishments since the release of the 1.3 specification and to discuss future directions as we develop version 1.4. The BoF will consist of talks from end users, implementers, and middleware and tool developers, who will discuss their experiences with and plans for OpenSHMEM. We will then open the floor for discussion of the specification and our mid- to long-term goals.

Long Description: OpenSHMEM, a library-based API for the Partitioned Global Address Space (PGAS) programming model, is the result of a community effort to standardize the SHMEM API and is supported by multiple vendors and implementers. OpenSHMEM offers a simple, intuitive interface for specifying single-sided asynchronous communication in an application. It is particularly advantageous for applications at extreme scale with many small put/get operations and/or irregular communication patterns across compute nodes, since it offloads communication operations to the hardware whenever possible.

OpenSHMEM is entering a new golden era as a simple and efficient API for programming current and next-generation RDMA-capable hardware, which is becoming the de facto technology for HPC interconnects. As a result, there is demand for solutions like OpenSHMEM, and its user base is starting to expand significantly. One of the goals of this BoF is to bring together the community of users, implementers, tool providers, and academics to help position OpenSHMEM to adopt and adapt to these hardware changes.

OpenSHMEM, as a programming model, can benefit applications and algorithms that rely on small-message, low-latency communication, which is prevalent in artificial intelligence, graph traversal, sorting, and codes with dynamic behavior (e.g., recursive data structures, data analytics). During the BoF, we will talk about success stories from projects that use OpenSHMEM in these types of applications.

Since SC 2008, this BoF has served, and will continue to serve, as a forum for gathering feedback from users on OpenSHMEM extensions and for identifying challenges and opportunities that the specification needs to address. We will talk about V1.3, the latest version of the specification, and potential changes and additions for V1.4 and beyond. Topics of importance include hybrid programming (with MPI, OpenMP, etc.), memory hierarchies, asynchronous models, collectives, fault tolerance, extending OpenSHMEM for accelerators and heterogeneous devices, and next-generation network architectures such as the one being developed by the Gen-Z consortium.

The BoF will present the current state of OpenSHMEM implementations and how they will support future architectures, including next-generation leadership-class DOE supercomputers.

Lightning talks will be the main focus of the BoF. In these talks, members of the OpenSHMEM community will present quick updates on their current or planned support for the OpenSHMEM software stack and tools ecosystem, and will explain how to contact them and their organizations.

At the end of the BoF, we will open the floor for a discussion with experts from academia, laboratories, and vendors, who will answer questions from the community on today’s OpenSHMEM state of the union and talk about where we would like to go next.

Over the past years, an average of 60 people have attended the BoF. We will provide a link to a survey that attendees can complete online. All reports, presentations, and articles about this and previous SC BoFs have been posted on the official website.

Conference Presentation: pdf
