Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
Abstrakt
The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically-distributed compute cluster. The testbed was split into two sites: CI TASK in Gdansk University of Technology and ICM in University of Warsaw, located about 300 km from each other, both connected by a fast optical fiber Ethernet-based 100 Gbps network (900 km part of the PIONIER backbone). Each site hosted a set of 10~compute nodes interconnected locally by the InfiniBand switches with the traffic forwarded by specialized hardware: IBEX G40 - QDR InfiniBand RDMA based Extension Platform. A set of six all-reduce algorithms, consisting of two ring-based (including a PAP-aware pre-reduced ring), two binomial-tree based and two hierarchical ones, was tested for balanced and imbalanced process arrival patterns (PAPs). The results showed high and stable bandwidth with large data transmission latency of the branch connecting the remote sites (about 13 ms in comparison to 10 us locally), and for the tested algorithms there was an advantage of hierarchical approach, and then binomial tree. Finally, we also observed some performance increase in PAP-aware solution in comparison to its regular counterpart. The main conclusion is that for the distributed cluster environment with imbalanced PAPs, there is a need for designing new hierarchical algorithms with PAP-aware support.
Cytowania
-
2
CrossRef
-
0
Web of Science
-
4
Scopus
Autorzy (6)
Cytuj jako
Pełna treść
- Wersja publikacji
- Submitted Version
- Licencja
- Copyright (Springer Nature Switzerland AG 2020)
Słowa kluczowe
Informacje szczegółowe
- Kategoria:
- Aktywność konferencyjna
- Typ:
- publikacja w wydawnictwie zbiorowym recenzowanym (także w materiałach konferencyjnych)
- Opublikowano w:
-
Advances in Intelligent Systems and Computing
nr 1151,
strony 817 - 829,
ISSN: 2194-5357 - Tytuł wydania:
- Advanced Information Networking and Applications strony 817 - 829
- Język:
- angielski
- Rok wydania:
- 2020
- Opis bibliograficzny:
- Proficz J., Sumionka P., Skomiał J., Semeniuk M., Niedzielewski K., Walczak M.: Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns// Advanced Information Networking and Applications/ : , 2020, s.817-829
- DOI:
- Cyfrowy identyfikator dokumentu elektronicznego (otwiera się w nowej karcie) 10.1007/978-3-030-44041-1_72
- Źródła finansowania:
-
- Działalność statutowa/subwencja
- Weryfikacja:
- Politechnika Gdańska
wyświetlono 178 razy