Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
Abstract
The paper presents an evaluation of all-reduce collective MPI algorithms in a geographically distributed compute cluster. The testbed was split between two sites, CI TASK at Gdańsk University of Technology and ICM at the University of Warsaw, located about 300 km apart and connected by a fast optical-fiber, Ethernet-based 100 Gbps network (a 900 km part of the PIONIER backbone). Each site hosted a set of 10 compute nodes interconnected locally by InfiniBand switches, with the traffic forwarded by specialized hardware: IBEX G40, a QDR InfiniBand RDMA-based Extension Platform. A set of six all-reduce algorithms, consisting of two ring-based (including a PAP-aware pre-reduced ring), two binomial-tree-based, and two hierarchical ones, was tested for balanced and imbalanced process arrival patterns (PAPs). The results showed high and stable bandwidth but large data transmission latency on the link connecting the remote sites (about 13 ms, compared with 10 µs locally). Among the tested algorithms, the hierarchical approach had an advantage, followed by the binomial tree. We also observed some performance increase for the PAP-aware solution in comparison with its regular counterpart. The main conclusion is that distributed cluster environments with imbalanced PAPs call for new hierarchical algorithms with PAP-aware support.
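To illustrate why a hierarchical approach may benefit from the latency gap reported in the abstract, the following is a minimal latency-only sketch, not the paper's algorithms or measurements. It assumes one process per node (10 per site), two sites, and small messages for which bandwidth can be ignored; the function names and the step-count model are illustrative assumptions. Only the two latency figures come from the abstract.

```python
import math

# Latency figures quoted in the abstract; everything else is an assumption.
LOCAL_LATENCY_S = 10e-6   # about 10 us inside a site
WAN_LATENCY_S = 13e-3     # about 13 ms between the two sites


def flat_ring_latency(procs_per_site):
    """Latency term of one ring all-reduce spanning two sites.

    A ring all-reduce takes 2 * (p - 1) steps (reduce-scatter followed by
    all-gather). In every step each process sends to its ring neighbour,
    so a step is assumed to be as slow as the slowest link in the ring,
    which here is the inter-site link.
    """
    p = 2 * procs_per_site
    steps = 2 * (p - 1)
    return steps * WAN_LATENCY_S


def hierarchical_latency(procs_per_site):
    """Latency term of a hypothetical two-level all-reduce over two sites.

    Assumed phases: binomial-tree reduce to a site leader, one
    simultaneous exchange between the two leaders over the inter-site
    link, then a binomial-tree broadcast inside each site.
    """
    tree_steps = math.ceil(math.log2(procs_per_site))
    local = 2 * tree_steps * LOCAL_LATENCY_S
    return local + WAN_LATENCY_S


if __name__ == "__main__":
    ring = flat_ring_latency(10)
    hier = hierarchical_latency(10)
    print(f"flat ring:    {ring * 1e3:.2f} ms")
    print(f"hierarchical: {hier * 1e3:.2f} ms")
```

Under these assumptions the flat ring pays the inter-site latency in each of its 38 steps (roughly 494 ms), while the two-level scheme crosses the inter-site link once (about 13 ms). Real results also depend on message size, bandwidth, and process arrival times, which this sketch ignores.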
Citations
- CrossRef: 2
- Web of Science: 0
- Scopus: 4
Full text
- Publication version: Submitted Version
- License: Copyright (Springer Nature Switzerland AG 2020)
Details
- Category: Conference activity
- Type: publication in a peer-reviewed collective work (including conference proceedings)
- Published in: Advances in Intelligent Systems and Computing, no. 1151, pages 817 - 829, ISSN: 2194-5357
- Title of issue: Advanced Information Networking and Applications, pages 817 - 829
- Language: English
- Publication year: 2020
- Bibliographic description: Proficz J., Sumionka P., Skomiał J., Semeniuk M., Niedzielewski K., Walczak M.: Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns// Advanced Information Networking and Applications/ : , 2020, pp. 817-829
- DOI: 10.1007/978-3-030-44041-1_72
- Sources of funding: Statutory activity/subsidy
- Verified by: Gdańsk University of Technology