P47: Understanding Congestion on Omni-Path Fabrics
Session: Poster Reception
Event Type
ACM Student Research Competition
Poster
Reception

Time: Tuesday, November 14th, 5:15pm - 7pm
Location: Four Seasons Ballroom
Description: High-performance computing systems require high-speed interconnects, such as InfiniBand (IB), to transmit data efficiently. Intel’s Omni-Path Architecture (OPA) is a newer interconnect, similar to IB, that is deployed on some of Los Alamos National Laboratory’s recent clusters. Under heavy network traffic, both interconnects suffer degraded performance and discard packets. Unlike IB, however, OPA reports these drops explicitly through a dedicated performance counter, congestion discards. Because OPA fabric technology is relatively immature, the correlation between performance degradation and congestion discards has not yet been fully evaluated. This research aims to improve understanding of how congestion affects cluster performance by presenting the OPA fabric with a data injection load high enough to induce performance degradation, so that the cause of the degradation can be evaluated. LA-UR-17-26341