DescriptionWith the rise of heterogeneous systems in high performance computing, how we utilize accelerators has become a critical factor in achieving the optimal performance. We explore several issues with using accelerators in Charm++, a parallel programming model that employs overdecomposition. We propose a runtime support scheme that enables concurrent execution of heterogeneous tasks and evaluate its performance. Using a synthetic benchmark that utilizes busy-waiting to simulate workload, we observe that the effectiveness of the runtime support varies with the application characteristics, with a maximum speedup of 4.79x. With a two-dimensional five-point stencil benchmark designed to represent a realistic workload, we obtain up to 2.75x speedup.