A GPU Solver for Sparse Generalized Eigenvalue Problems with Symmetric Complex-Valued Matrices Obtained Using Higher-Order FEM - Publication - MOST Wiedzy


Abstract

The paper discusses a fast implementation of the stabilized locally optimal block preconditioned conjugate gradient (sLOBPCG) method, using a hierarchical multilevel preconditioner to solve non-Hermitian sparse generalized eigenvalue problems with large symmetric complex-valued matrices obtained using the higher-order finite-element method (FEM), applied to the analysis of a microwave resonator. The resonant frequencies of the low-order modes are the eigenvalues with the smallest real part of a complex symmetric (though non-Hermitian) matrix pencil. Pencils of this type arise in the FEM analysis of resonant cavities loaded with a lossy material. To accelerate the computations, graphics processing units (GPU, NVIDIA Pascal P100) were used. Single- and dual-GPU variants are considered, and a GPU-memory-saving implementation is proposed. An efficient sliced ELLR-T sparse matrix storage format was used, and operations were performed on blocks of vectors for best performance on a GPU. As a result, significant speedups (exceeding a factor of six in some computational scenarios) were achieved over the reference parallel implementation using a multicore central processing unit (CPU, Intel Xeon E5-2680 v3, twelve cores). These results indicate that solving generalized eigenproblems needs much more GPU memory than iterative techniques for solving a sparse system of equations, and that, even for moderately large systems, a second GPU is required to store some data structures in order to reduce the memory footprint.
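
The abstract names a sliced ELLR-T storage format and operations on blocks of vectors, but this page gives no further detail. As a rough illustration only, the following pure-Python sketch shows the plain ELLPACK-R idea that ELLR-T is generally described as extending: padded per-row value and column-index arrays plus a row-length array, multiplied here by a block of vectors. The function names `to_ellpack_r` and `spmm_ellpack_r` and the small test matrix are hypothetical, and the sketch does not model slicing, multiple threads per row, or GPU execution. It is not the authors' implementation.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to ELLPACK-R style arrays.

    Returns (vals, cols, row_len): every row is padded to the same width,
    and row_len[i] records how many leading entries of row i are real data.
    """
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len) if row_len else 0
    vals, cols = [], []
    for row in dense:
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        cols.append([j for j, _ in nz] + [0] * pad)
        vals.append([v for _, v in nz] + [0] * pad)
    return vals, cols, row_len


def spmm_ellpack_r(vals, cols, row_len, X):
    """Multiply the stored matrix by a block of vectors X (n rows, k columns)."""
    k = len(X[0])
    Y = []
    for i in range(len(vals)):
        acc = [0] * k
        for p in range(row_len[i]):  # row_len lets us skip the padding
            a, j = vals[i][p], cols[i][p]
            for c in range(k):
                acc[c] += a * X[j][c]
        Y.append(acc)
    return Y


# A small complex symmetric (A == A^T) but non-Hermitian matrix, chosen
# only to mirror the kind of matrix the abstract mentions.
A = [[2 + 1j, 0, 1j],
     [0, 3, 0],
     [1j, 0, 4 - 2j]]
X = [[1, 0], [0, 1], [1, 1]]  # block of two vectors stored column-wise

vals, cols, row_len = to_ellpack_r(A)
Y = spmm_ellpack_r(vals, cols, row_len, X)
```

Processing all columns of `X` inside the inner loop means each stored matrix entry is read once and reused for every vector in the block, which is the usual motivation for block-of-vectors operations on bandwidth-limited hardware.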

Citations

  • 5

    CrossRef

  • 0

    Web of Science

  • 5

    Scopus

Full text

downloaded 187 times
Publication version
Accepted or Published Version
License
Copyright (2018 IEEE)

Details

Category:
Journal publication
Type:
article in a journal listed in JCR
Published in:
IEEE Access, vol. 6, pp. 69826-69834
ISSN: 2169-3536
Language:
English
Publication year:
2018
Bibliographic description:
Dziekoński A., Mrozowski M.: A GPU Solver for Sparse Generalized Eigenvalue Problems with Symmetric Complex-Valued Matrices Obtained Using Higher-Order FEM// IEEE Access. -Vol. 6, (2018), pp. 69826-69834
DOI:
10.1109/access.2018.2871219
ADAM DZIEKONSKI received the M.S.E.E. and Ph.D. degrees (Hons.) in microwave engineering from the Gdańsk University of Technology, Gdańsk, Poland, in 2009 and 2015, respectively. His current research interests include computational electromagnetics (mainly focused on parallelizing computations on graphics processing and central processing units). He was a recipient of the Domestic Grant for Young Scientists from the Foundation for Polish Science in 2012 and 2013. He was also a recipient of the Prime Minister's Award for his Ph.D. thesis in 2016.
MICHAL MROZOWSKI (F'08) received the M.Sc. and Ph.D. degrees (Hons.) from the Gdańsk University of Technology in 1983 and 1990, respectively. In 1986, he joined the Faculty of Electronics, Gdańsk University of Technology, where he is currently a Full Professor, the Head of the Department of Microwave and Antenna Engineering, and the Director of the Center of Excellence for Wireless Communication Engineering.
Verified by:
Politechnika Gdańska

viewed 129 times
