TOD-Tree: Task-Overlapped Direct send Tree Image Compositing for Hybrid MPI Parallelism
Abstract
Modern supercomputers have very powerful multi-core CPUs. The programming model on these supercomputers is switching from pure MPI to a hybrid model: MPI for inter-node communication, and shared memory and threads for intra-node communication. Consequently, the bottleneck in most systems is no longer computation but communication between nodes. In this paper, we present a new compositing algorithm for hybrid MPI parallelism that focuses on communication avoidance and overlapping communication with computation at the expense of evenly balancing the workload. The algorithm has three stages: a direct send stage, where nodes are arranged in groups and exchange regions of an image, followed by a tree compositing stage and a gather stage. We compare our algorithm with radix-k and binary-swap from the IceT library in a hybrid OpenMP/MPI setting, show strong scaling results, and explain how we generally achieve better performance than these two algorithms.
BibTeX
@inproceedings {10.2312:pgv.20151157,
booktitle = {Eurographics Symposium on Parallel Graphics and Visualization},
editor = {C. Dachsbacher and P. Navrátil},
title = {{TOD-Tree: Task-Overlapped Direct send Tree Image Compositing for Hybrid MPI Parallelism}},
author = {Grosset, A. V. Pascal and Prasad, Manasa and Christensen, Cameron and Knoll, Aaron and Hansen, Charles},
year = {2015},
publisher = {The Eurographics Association},
DOI = {10.2312/pgv.20151157}
}