Cross-Layer Allocation and Management of Hardware Resources in Shared Memory Nodes
Authors: Dr. Emmanuel Jeannot (French Institute for Research in Computer Science and Automation (INRIA))
BP
Abstract: The goal of this BoF is to gather the community (from runtime systems to compilers) working in the area of hardware resource allocation for threads. We will discuss this problem, share visions, propose solutions, and coordinate a worldwide effort.
We will consider all the resources of a shared memory node and discuss how to coordinate resource sharing by different parts of the software stack to avoid competition for these resources.
Participants will be able to provide their own vision through discussions. The goal is to come up with a document specifying possible solutions and discuss possible implementations.
Long Description: Even though supercomputers are composed of distributed memory nodes, efficiently
managing each node is a key issue for performance. As we are facing a
significant increase in the number of cores on each node and a deep memory
hierarchy, allocating and managing the threads that are executed on the cores is
a challenge that requires cooperation and coordination between the different
components of the software stack. The goal is to consider all shareable
resources of a node: cores, memory, power, cache, etc.
For instance, at the application level some part of the application may use
pthreads to program its concurrency. However, it may also rely on computational
libraries (e.g. MKL) that are also multi-threaded. Moreover, other parts of the
application may use OpenMP. Furthermore, an MPI runtime system might have
internal parallelism (e.g. use a progress thread for communication). Currently,
each component of the application is unaware that other components are also
using threads, causing potential over-subscription and poor performance. Beyond
Linux affinity masks, there is no common mechanism to allow the different
components to be aware of each other and co-operate in their use of HW
resources.
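To make the over-subscription scenario concrete, here is a minimal sketch of what a cooperative split of the cores could look like. It is an illustration only, not a mechanism proposed by the BoF: the function names `available_cores` and `partition_cores`, the component names, and the greedy sharing policy are all hypothetical. The only real API used is Python's `os.sched_getaffinity`, which exposes the Linux affinity mask of the calling process.

```python
import os


def available_cores():
    """Return the set of cores this process may run on.

    Uses the Linux affinity mask when available; otherwise falls back to
    assuming cores 0..N-1 (an assumption made for portability of the sketch).
    """
    if hasattr(os, "sched_getaffinity"):
        return set(os.sched_getaffinity(0))
    return set(range(os.cpu_count() or 1))


def partition_cores(cores, demands):
    """Split `cores` into disjoint sets, one per component.

    `demands` maps a (hypothetical) component name to the number of threads
    it would like to run. Every component gets at least one core; spare
    cores go, one at a time, to the component with the largest unmet demand.
    No component receives more cores than it asked for.
    """
    cores = sorted(cores)
    if any(want < 1 for want in demands.values()):
        raise ValueError("each component must request at least one thread")
    if len(cores) < len(demands):
        raise ValueError("fewer cores than components")

    counts = {name: 1 for name in demands}
    spare = len(cores) - len(demands)
    while spare > 0:
        name = max(demands, key=lambda n: demands[n] - counts[n])
        if demands[name] - counts[name] <= 0:
            break  # every demand is satisfied; leave remaining cores idle
        counts[name] += 1
        spare -= 1

    # Hand out contiguous slices of the sorted core list.
    result, start = {}, 0
    for name, count in counts.items():
        result[name] = set(cores[start:start + count])
        start += count
    return result
```

With 8 cores and requests of 8 (OpenMP region), 4 (threaded library), and 1 (MPI progress thread), the components together ask for 13 threads. Without coordination all 13 would compete for the 8 cores, whereas the sketch gives each component a disjoint subset that it could then pass to `os.sched_setaffinity` on Linux. A real solution would also need to cover memory, caches, and power, and to renegotiate shares dynamically.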
Tools like hwloc provide a portable, static view of the node topology but do
not provide any strategy for sharing resources between different
application components. Therefore, even though most elements of the HPC
software stack that query topology information use hwloc, we still lack a
mechanism to allocate and manage HW resources.
Many research teams have identified this problem, which is variously called
application composition, dynamic topology management, or topology-aware core
selection. In any case, the basic problem is the same. While there are a variety of
proposed solutions, there is no agreement, and since the whole problem is one of
co-operation, a common solution is required. We think it would be of great
interest for the HPC community to meet and discuss the issue and possible
solutions. The goal is to share the different visions and ideas, and then to
coordinate a worldwide effort. The ultimate goal is to see whether we can come
up with some common way to address the problem and standardize how information
related to HW resource usage can be managed and expressed.
We think that SC is the perfect venue for hosting this first BoF on the topic as
it is the only place where all of those who need to be involved will naturally
be present. To succeed, we need to involve end users, MPI implementers, OpenMP
implementers, implementers of other parallel runtimes (C++17, ...) as well as
those implementing HW resource detection code, and others. Many people from the
community have already shown their interest in this BoF. They span the whole
community from the US (ANL, Sandia, UTK), Europe (Inria, CEA, RWTH Aachen, LMU
München) and Asia (RIKEN, Tokyo) as well as industry (Intel, Bull), etc.
For more details, see: http://www.labri.fr/perso/ejeannot/SC_BOF/SC17_BOF.html