total: 82
Search results for: FINITE DIFFERENCE RIEMANN SOLVER MUSTA-FORCE ALGORITHM PARALLEL ALGORITHMS CUDA
-
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication: Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....
-
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
Publication: This paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA-FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...
-
Piotr Sypek, PhD, Eng.
People: Piotr Sypek received the M.S.E.E. and Ph.D. degrees (with hons.) in microwave engineering from the Gdańsk University of Technology, Gdańsk, Poland, in 2003 and 2012, respectively. He was involved in the design and implementation of parallel algorithms for the formulation and solution of electromagnetic problems executed on CPUs (workstations and clusters) and GPUs. His current research interests include parallel processing in computational...
-
OpenGL accelerated method of the material matrix generation for FDTD simulations
Publication: This paper presents an accelerated technique for generating the material matrix from CAD models used by finite-difference time-domain (FDTD) simulators. To achieve high performance in these computations, the parallel-processing power of a graphics processing unit was employed through the OpenGL library. The method was integrated with the developed FDTD solver, providing approximately five-fold speedup of the material...
-
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication: We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication: Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Performance evaluation of parallel background subtraction on GPU platforms
Publication: An implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing the parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...
-
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication: In this paper we present a modern, efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes, including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings should be preferred for various domain...
-
Single and Dual-GPU Generalized Sparse Eigenvalue Solvers for Finding a Few Low-Order Resonances of a Microwave Cavity Using the Finite-Element Method
Publication: This paper presents two fast generalized eigenvalue solvers for sparse symmetric matrices that arise when electromagnetic cavity resonances are investigated using the higher-order finite element method (FEM). To find a few low-order resonances, the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm with null-space deflation is applied. The computations are expedited by using one or two graphical processing...
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
Publication: An algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with the CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...