Single and Dual-GPU Generalized Sparse Eigenvalue Solvers for Finding a Few Low-Order Resonances of a Microwave Cavity Using the Finite-Element Method - Publication - Bridge of Knowledge

Abstract

This paper presents two fast generalized eigenvalue solvers for sparse symmetric matrices that arise when electromagnetic cavity resonances are investigated using the higher-order finite element method (FEM). To find a few low-order resonances, the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm with null-space deflation is applied. The computations are expedited by using one or two graphics processing units (GPUs) as accelerators. The performance of the solvers is tested on single- and dual-GPU hardware setups with two types of GPU: NVIDIA Kepler K40 and NVIDIA Pascal P100. The speed of the GPU-accelerated solvers is compared to that of a multithreaded implementation of the same algorithm running on a multicore central processing unit (CPU; Intel Xeon E5-2680 v3 with twelve cores). Even for the least efficient setups, the GPU-accelerated code is approximately twice as fast as the parallel CPU-only implementation.
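As a rough illustration of the approach named in the abstract, the sketch below runs LOBPCG with null-space deflation on a small generalized eigenproblem K x = λ M x. It is a CPU-only toy under stated assumptions, not the authors' GPU implementation: a 1D Neumann problem with linear elements stands in for the 3D higher-order FEM matrices, its null space is spanned by the constant vector, and the preconditioner (a sparse LU factorization of K + M) is chosen here for convenience only. The function names are hypothetical; `lobpcg` and `splu` are SciPy routines.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, lobpcg, splu


def neumann_fem_matrices(n):
    """Stiffness K and mass M for linear elements on [0, 1] with n nodes.

    With natural (Neumann) boundary conditions K is singular: its null
    space is spanned by the constant vector (toy stand-in for the
    gradient null space of the curl-curl operator).
    """
    h = 1.0 / (n - 1)
    off = np.ones(n - 1)
    diag_k = np.full(n, 2.0)
    diag_k[[0, -1]] = 1.0
    diag_m = np.full(n, 4.0)
    diag_m[[0, -1]] = 2.0
    K = sp.diags([-off, diag_k, -off], [-1, 0, 1], format="csc") * (1.0 / h)
    M = sp.diags([off, diag_m, off], [-1, 0, 1], format="csc") * (h / 6.0)
    return K, M


def lowest_nonzero_modes(n=200, k=3, seed=0):
    """Return the k smallest nonzero eigenvalues of K x = lambda M x."""
    K, M = neumann_fem_matrices(n)

    # Deflation: iterate in the M-orthogonal complement of the null space.
    Y = np.ones((n, 1))

    # Assumed preconditioner: exact solve with the shifted matrix K + M,
    # which is symmetric positive definite even though K is singular.
    lu = splu((K + M).tocsc())
    prec = LinearOperator((n, n), matvec=lu.solve, matmat=lu.solve,
                          dtype=float)

    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k))  # random initial block of k vectors

    # Note SciPy's naming: B is the mass matrix, M is the preconditioner.
    vals, vecs = lobpcg(K, X, B=M, M=prec, Y=Y,
                        largest=False, tol=1e-6, maxiter=200)
    order = np.argsort(vals)
    return vals[order], vecs[:, order]


if __name__ == "__main__":
    vals, _ = lowest_nonzero_modes()
    for i, v in enumerate(vals, start=1):
        # The continuous problem has eigenvalues (i * pi)^2.
        print(f"mode {i}: computed {v:.4f}, analytic {(i * np.pi) ** 2:.4f}")
```

Without the constraint block `Y`, the solver would spend one of its vectors on the zero eigenvalue; deflation keeps all `k` vectors on the physically meaningful modes.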

Publication version
Accepted or Published Version
License
Creative Commons: CC-BY

Details

Category:
Articles
Type:
article in a journal listed in JCR
Published in:
RADIOENGINEERING, vol. 27, no. 4, pages 930-936
ISSN: 1210-2512
Language:
English
Publication year:
2018
Bibliographic description:
Dziekoński A., Mrozowski M.: Single and Dual-GPU Generalized Sparse Eigenvalue Solvers for Finding a Few Low-Order Resonances of a Microwave Cavity Using the Finite-Element Method // RADIOENGINEERING. - Vol. 27, iss. 4 (2018), pp. 930-936
DOI:
10.13164/re.2018.0930
Verified by:
Gdańsk University of Technology