Session: Performance Analysis
Authors
Event Type: Paper
Performance
Time: Thursday, November 16th, 10:30am - 11am
Location: 402-403-404
Description: The fat-tree topology is one of the most commonly used
network topologies in HPC systems. Vendors support
several options that can be configured when deploying
fat-tree networks for production systems, such as link
bandwidth, number of rails, number of planes, and
tapering. This paper showcases the use of simulations to
compare the impact of these design options on
representative production HPC applications, libraries,
and multi-job workloads. We present advances in the
TraceR-CODES simulation framework that enable this
analysis and evaluate its prediction accuracy against
experiments on a production fat-tree network. In order
to understand the impact of network configurations on
various anticipated scenarios, we study workloads with
different communication patterns,
computation-to-communication ratios, and scaling
characteristics. Using multi-job workloads, we also
study the impact of inter-job interference and different
job placement schemes.