Session: ACM Gordon Bell Finalists
Event Type: ACM Gordon Bell Finalist
Time: Thursday, November 16th, 10:30am -
Description: We refactor and optimize the entire Community
Atmosphere Model (CAM) for the full system of the Sunway
TaihuLight and deliver petascale climate modeling
performance. In the first stage, we use OpenACC directives
to scale CAM to 1.5 million cores at a simulation speed of
2.81 simulated years per day. We then apply a more
aggressive, finer-grained redesign of the HOMME dynamical
core to achieve finer memory control, more efficient
vectorization, and overlap between computation and
communication. In addition, we propose a parallelization
scheme based on register communication to minimize the
data dependencies in the modules. As a result, our
optimized kernels running on a 260-core Sunway processor
outperform the established HOMME kernels on a platform
with up to 184 Intel Xeon E5-2680 v3 CPU cores. Our
implementation achieves a sustained double-precision
performance of around 2.5 Pflops for a 0.75 km global
simulation when using 8,519,680 cores.
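
To put the headline figures in perspective, the short Python sketch below derives a few quantities from the numbers quoted in the description. It is only back-of-the-envelope arithmetic, not part of the authors' work: the function names are hypothetical, and the per-processor rate assumes the 2.5 Pflops figure is spread evenly over all processors.

```python
# Back-of-the-envelope arithmetic on figures quoted in the description.
# All function names are illustrative, not taken from the paper.

CORES_PER_PROCESSOR = 260  # "260-core Sunway processor" in the description


def processors(total_cores: int) -> int:
    """Number of 260-core processors implied by a total core count."""
    return total_cores // CORES_PER_PROCESSOR


def wall_hours_per_simulated_year(sypd: float) -> float:
    """Wall-clock hours needed for one simulated year at a given
    speed in simulated years per day (SYPD)."""
    return 24.0 / sypd


def gflops_per_processor(total_pflops: float, total_cores: int) -> float:
    """Average sustained Gflops per processor, assuming the total rate
    is spread evenly across processors (1 Pflops = 1e6 Gflops)."""
    return total_pflops * 1e6 / processors(total_cores)


if __name__ == "__main__":
    print(processors(8_519_680))                       # processor count
    print(round(wall_hours_per_simulated_year(2.81), 2))  # hours per simulated year
    print(round(gflops_per_processor(2.5, 8_519_680), 2))  # Gflops per processor
```

Under these assumptions, 8,519,680 cores correspond to 32,768 processors, one simulated year at 2.81 simulated years per day takes roughly 8.5 wall-clock hours, and the 2.5 Pflops run averages about 76 Gflops per processor.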