Description
The Fabriscale Monitoring System (FMS) is cluster interconnect monitoring software that provides visual insight into the status of your InfiniBand cluster. In this presentation we cover the key features of the FMS and show you how to get an overview of performance and drill down into statistics, alerts and key metrics. With the FMS, cluster monitoring is automated and alarms are raised only when the operator's attention is required. The operator is pointed to where the problem has occurred, supported by relevant metrics and statistics. This saves time and leads to faster error recovery, less strain on operators and reduced downtime for your cluster.
The FMS integrates with job schedulers such as Slurm, MOAB HPC Suite, and PBS Works, and uses their scheduling information to present performance as a function of workload. Potential network bottlenecks can be identified per job, and a job's utilization can be shown per port.