MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems - Publication - Bridge of Knowledge


Abstract

In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment supports modeling an application in the Java language extended with methods that represent message-passing communication routines. It also offers a graphical interface for building a system model that incorporates hardware components such as CPUs, GPUs and interconnects, and makes it easy to specify formulas that model the execution and communication times of particular blocks of code. A simulator engine within the MERPSYS environment simulates execution of the application, which consists of processes running various codes, each assigned a distinct label. The simulator runs one Java thread per label and scales computation and communication times accordingly. This approach allows fast coarse-grained simulation of large applications on large-scale systems. We tested and verified the simulator's results for three real parallel applications implemented in C/MPI and run on real HPC clusters: a master-slave code computing similarity measures of points in a multidimensional space, a geometric single program multiple data (SPMD) application simulating heat distribution, and a divide-and-conquer application performing merge sort. In all cases the simulator gave results very close to the real ones for tested configurations of up to 1000 processes. Furthermore, it allowed us to predict execution times on configurations beyond the hardware resources available to us.
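
The idea of running one thread per label, rather than one per process, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the MERPSYS API: the `Label` type, the cost formulas `computeTime` and `commTime`, and the scaling rule (processes of a label compute in parallel, while their messages are assumed to be serialized at one endpoint, so communication time is multiplied by the process count).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LabelSimSketch {

    /** A group of processes running the same code (hypothetical type, not from MERPSYS). */
    static final class Label {
        final String name;
        final int processes;        // number of real processes this label stands for
        final double flopsPerProc;  // computation performed by each process
        final double bytesPerProc;  // data exchanged by each process

        Label(String name, int processes, double flopsPerProc, double bytesPerProc) {
            this.name = name;
            this.processes = processes;
            this.flopsPerProc = flopsPerProc;
            this.bytesPerProc = bytesPerProc;
        }
    }

    /** Assumed cost formula for a block of computation. */
    static double computeTime(double flops, double flopsPerSecond) {
        return flops / flopsPerSecond;
    }

    /** Assumed cost formula for one message: latency plus size over bandwidth. */
    static double commTime(double bytes, double latency, double bandwidth) {
        return latency + bytes / bandwidth;
    }

    /** Simulates each label in its own thread and returns the largest simulated time. */
    static double simulate(List<Label> labels, double flopsPerSecond,
                           double latency, double bandwidth) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, labels.size()));
        try {
            List<Future<Double>> results = new ArrayList<>();
            for (Label label : labels) {
                Callable<Double> task = () -> {
                    // Processes of one label run in parallel: one compute block is enough.
                    double compute = computeTime(label.flopsPerProc, flopsPerSecond);
                    // Assumption: messages are serialized, so scale by the process count.
                    double comm = label.processes
                            * commTime(label.bytesPerProc, latency, bandwidth);
                    return compute + comm;
                };
                results.add(pool.submit(task));
            }
            double makespan = 0.0;
            for (Future<Double> result : results) {
                makespan = Math.max(makespan, result.get());
            }
            return makespan;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Label> labels = List.of(
                new Label("master", 1, 1e9, 0),
                new Label("worker", 1000, 2e9, 1e6));
        System.out.println(simulate(labels, 1e9, 1e-3, 1e9) + " s (simulated)");
    }
}
```

In this sketch the number of threads depends on the number of labels, not on the number of simulated processes, so a 1000-process worker group costs no more to simulate than a 10-process one. The sketch has no simulated message exchange between labels, which a real simulator would need.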

Citations

  • CrossRef: 24
  • Web of Science: 0
  • Scopus: 28

Full text

Publication version
Accepted or Published Version
License
Creative Commons: CC-BY-NC-ND

Details

Category:
Articles
Type:
article in a JCR-listed journal
Published in:
Simulation Modelling Practice and Theory, vol. 77, pages 124-140
ISSN: 1569-190X
Language:
English
Publication year:
2017
Bibliographic description:
Czarnul P., Kuchta J., Matuszek M., Proficz J., Rościszewski P., Szymański J., Wójcik M.: MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems. Simulation Modelling Practice and Theory, Vol. 77 (2017), pp. 124-140
DOI:
10.1016/j.simpat.2017.05.009
Verified by:
Gdańsk University of Technology
