Search results for: cuda

total: 43

Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication
- Year 2015
We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

Full text to download in external service
Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology
Publication
- Year 2016
The discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...

Full text to download in external service
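Optymalizacja wydajności obliczeniowej metody elementów skończonych w architekturze CUDA (Optimization of the computational performance of the finite element method on the CUDA architecture)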
Publication
- A. Dziekoński
- Year 2015
The aim of this dissertation, and of the fellowship completed within the project, was to develop a numerically efficient algorithmic and hardware solution that accelerates the analysis of electromagnetic problems by the finite element method (FEM) with higher-order basis functions. The frequency-domain finite element method is an efficient and versatile tool for analyzing microwave circuits (Fig....
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
Publication
- Ł. Jarząbek
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2017
The aim of this paper is to evaluate the performance of two new CUDA mechanisms, unified memory and dynamic parallelism, for real parallel applications compared to standard CUDA API versions. In order to gain insight into the performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...

Full text available to download
High performance filtering for big datasets from Airborne Laser Scanning with CUDA technology
Publication
- W. Błaszczak-Bąk
- A. Janowski
- P. Srokosz
- SURVEY REVIEW - Year 2018
There are many studies on the problems of processing big datasets provided by Airborne Laser Scanning (ALS). The processing of point clouds is often executed in stages or on fragments of the measurement set. Therefore, solutions that enable the processing of the entire cloud at once in a simple, fast and efficient way are the subject of much research. In this paper, the authors propose to use General-Purpose computation...

Full text to download in external service
Implementation of algebraic procedures on the GPU using CUDA architecture on the example of generalized eigenvalue problem
Publication
- Ł. Syrocki
- G. Pestka
- Open Computer Science - Year 2016
Full text to download in external service
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
Publication
- P. Czarnul
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2023
In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls make it possible to manage CUDA objects and data and to launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...

Full text to download in external service
Wykorzystanie technologii CUDA do kompresji w czasie rzeczywistym danych pochodzących z sonarów wielowiązkowych (Use of CUDA technology for real-time compression of data from multibeam sonars)
Publication
- A. Chybicki
- K. Laskowski
- M. Moszyński
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
The paper presents the design and implementation of a system for compressing data from multibeam sonars using CUDA technology. Lossless data compression methods and parallel processing techniques are discussed and applied. The resulting application was tested for speed and compression ratio and compared with other solutions capable of compressing this type of data.
Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams
Publication
- P. Czarnul
- COMPUTING AND INFORMATICS - Year 2020
The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...

Full text available to download
Przetwarzanie Równoległe CUDA/Parallel processing on CUDA
e-Learning Courses
- J. Cychnerski
- P. Rościszewski
- P. Czarnul
- J. Atroszko
Performance evaluation of Unified Memory with prefetching and oversubscription for selected parallel CUDA applications on NVIDIA Pascal and Volta GPUs
Publication
- M. Knap
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2019
The paper presents an assessment of Unified Memory performance with data prefetching and memory oversubscription. Several versions of code are used with: standard memory management, standard Unified Memory and optimized Unified Memory with programmer-assisted data prefetching. Evaluation of execution times is provided for four applications: Sobel and image rotation filters, stream image processing and computational fluid dynamics simulation,...

Full text available to download
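Optymalizacja wydajności obliczeniowej metody elementów skończonych w architekturze CUDA (Optimization of the computational performance of the finite element method on the CUDA architecture)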

Projects

Project manager: dr inż. Adam Dziekoński
Financial Program Name: ETIUDA

Project carried out at the Faculty of Electronics, Telecommunications and Informatics under agreement UMO-2013/08/T/ST7/00531 of 2013-09-13
Magdalena Barbara Cudak dr hab. inż.

People

Faculty of Chemical Technology and Engineering
Multi-core and Multiprocessor Implementation of Numerical Integration in Finite Element Method
Publication
- Year 2012
The paper presents techniques for accelerating the numerical integration process that appears in the Finite Element Method. The acceleration is achieved by taking advantage of multi-core and multiprocessor devices. It is shown that a multi-core implementation with OpenMP and GPU acceleration using the CUDA architecture achieve speedups by factors of 5 and 10 on a CPU and GPUs, respectively.
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
- J. Skrzypczak
- P. Czarnul
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings should be preferred for various domain...

Full text to download in external service
Krylov Space Iterative Solvers on Graphics Processing Units
Publication
- A. Dziekoński
- M. Mrozowski
- Year 2010
CUDA architecture was introduced by Nvidia three years ago and since then there have been many promising publications demonstrating a huge potential of Graphics Processing Units (GPUs) in scientific computations. In this paper, we investigate the performance of iterative methods such as cg, minres, gmres, bicg that may be used to solve large sparse real and complex systems of equations arising in computational electromagnetics.

Full text to download in external service
Piotr Sypek dr inż.

People

Department of Microwave and Antenna Engineering

Piotr Sypek received the M.S.E.E. and Ph.D. degrees (with hons.) in microwave engineering from the Gdańsk University of Technology, Gdańsk, Poland, in 2003 and 2012, respectively. He was involved in the design and implementation of parallel algorithms for the formulation and solution of electromagnetic problems executed on CPUs (workstations and clusters) and GPUs. His current research interests include parallel processing in computational...
Paweł Czarnul dr hab. inż.

People

Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics

Paweł Czarnul obtained a D.Sc. degree in computer science in 2015 and a Ph.D. in computer science, granted by a council at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, in 2003. His research interests include: parallel and distributed processing including clusters, accelerators, coprocessors; distributed information systems; architectures of distributed systems; programming mobile devices....
GPU based implementation of Temperature-Vegetation Dryness Index for AVHRR3 Satellite Data
Publication
- T. Bieliński
- A. Chybicki
- Year 2014
The paper presents an implementation of the TVDI (Temperature-Vegetation Dryness Index) algorithm on a GPU (Graphics Processing Unit). Calculation of this index is based on LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index). The discussed results are based on multi-spectral imagery retrieved from AVHRR3 sensors for the area of Poland. All phases of the TVDI implementation on the GPU are modified with respect to the CUDA platform....
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
Publication
- J. Porter-Sobieraj
- S. Cygert
- K. Daniel
- J. Sikorski
- M. Słodkowski
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2015
This paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...

Full text to download in external service
