Description: HPC file systems today work in a best-effort manner
where individual applications can flood the file system
with requests, effectively leading to a denial of
service for all other tasks. This paper presents a Token
Bucket Filter (TBF) policy for the Lustre file system.
The TBF enforces RPC rate limitations based on
(potentially complex) Quality of Service (QoS) rules.
The QoS rules are enforced in Lustre's Object Storage
Servers, where each request is assigned to an
automatically created QoS class.
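
To make the token bucket idea concrete, the following is a minimal sketch rather than Lustre's actual implementation. It assumes one bucket per QoS class, created on first use, and it only reports whether an RPC could be served now; a real server-side scheduler would queue the request instead of rejecting it. The names `TokenBucket`, `TbfScheduler`, `admit`, and the class key `"jobA"` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """One bucket per QoS class (hypothetical structure for illustration)."""
    rate: float        # tokens (RPCs) replenished per second
    capacity: float    # maximum number of tokens, i.e. the allowed burst
    tokens: float      # tokens currently available
    last: float        # timestamp of the last refill, in seconds

    def allow(self, now: float) -> bool:
        # Refill in proportion to the elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        # Serving one RPC costs one token.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TbfScheduler:
    """Assigns each request to a class bucket, creating it on first use."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def admit(self, class_key: str, now: float) -> bool:
        bucket = self.buckets.get(class_key)
        if bucket is None:
            # Assumption: a new class starts with a full bucket.
            bucket = TokenBucket(self.rate, self.capacity, self.capacity, now)
            self.buckets[class_key] = bucket
        return bucket.allow(now)
```

With `TbfScheduler(rate=2.0, capacity=2.0)`, a class can send a burst of two RPCs and is then held to two RPCs per second, while other classes keep their own independent budgets.
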
The QoS implementation for Lustre enables various features
for each class, including support for high-priority
and real-time requests even under heavy load and the
utilization of spare bandwidth by less important tasks
under light load. The framework also enables dependent
rules to change a job's RPC rate even at very small
timescales. Furthermore, we propose a Global Rate
Limiting (GRL) algorithm to enforce system-wide RPC rate