Modeling and Simulation of Communication in HPC Systems

Authors: Dr. Misbah Mubarak (Argonne National Laboratory)

Abstract: Modeling and simulation play a key role in analyzing the impact of system design choices on applications' communication performance. The HPC community currently offers several predictive design tools with varying focus and strengths. To make the best use of these tools, it is critical that tool developers and end users interact to identify both the users' requirements and the capabilities of the tools. In this BoF, we will discuss recent advances in modeling and simulation techniques and identify opportunities for cross-pollination among various predictive design techniques. We will also discuss the requirements and challenges posed by end users.

Long Description: The HPC community increasingly relies on predictive design techniques to assess communication tradeoffs, co-design interconnects, and explore application performance in uncertain scenarios (such as the deployment of future systems). While a variety of modeling and simulation tools exist to analyze communication performance and system design, some tools are better suited than others to serving user-specific needs in the design life cycle.

In this BoF, we will look at tools that are used for HPC communication modeling and that provide functionality such as tracing HPC applications, modeling application communication, replaying communication on HPC system simulations, and deriving performance predictions. We will primarily focus on analytical modeling tools, cycle-accurate simulations, and discrete-event simulation frameworks.

The objectives of the BoF are to (i) identify the capabilities and limitations of current modeling and simulation techniques in reflecting real-world scenarios, (ii) discuss ways to address gaps in these techniques so that they can be used effectively for system procurements, (iii) identify opportunities for cross-pollination among the current approaches, and (iv) derive a roadmap toward validated, scalable, and extensible modeling and simulation tools that can make accurate performance predictions.

BoF participants will include both end users and experts from the modeling/simulation community. The presentations and any related material will be available on the website after the BoF. We will also conduct a survey of BoF attendees, asking about (i) the modeling and simulation tools that they currently use (if any) and (ii) the scenarios in which they intend to use modeling and simulation. A summary of the survey results will be made available on the website.

