Designing and Building Efficient HPC Cloud with Modern Networking Technologies on Heterogeneous HPC Clusters
Student: Jie Zhang (Ohio State University)
Advisor: Dhabaleswar K. (DK) Panda (Ohio State University)
Abstract: Cloud computing platforms built on virtualization technologies are widely used because of their high availability and flexibility. However, HPC applications running on the cloud still suffer from fairly low performance, specifically degraded I/O performance from virtualized I/O devices. SR-IOV addresses this problem by delivering near-native I/O performance. Nevertheless, SR-IOV lacks high-performance virtualization support, such as locality-aware and NUMA-aware support, and it also prevents VM live migration. Moreover, critical HPC resources among VMs (e.g., SR-IOV-enabled Virtual Functions) need to be managed and isolated to efficiently support multiple concurrent MPI jobs on HPC clouds. Therefore, we propose a framework for designing and building efficient HPC clouds with modern networking technologies on heterogeneous HPC clusters. Through this framework, an HPC cloud can deliver near-native performance for MPI applications in different types of virtualization environments, support migration of SR-IOV-enabled VMs, and provide efficient resource sharing.