Search results for: cuda

Total results: 43

Latająca Kawiarenka Naukowa
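Publication
- M. Rucka
- Pismo PG - Year 2014
The article describes a meeting of the Latająca Kawiarenka Naukowa (Flying Science Café), which aims to popularize science in the field of structural mechanics and bridges. The session, titled "Mosty: cuda architektury i techniki" ("Bridges: wonders of architecture and technology"), was organized by the Akademia Młodych Uczonych PAN (Polish Young Academy of the Polish Academy of Sciences) and the Koło Naukowe Mechaniki Budowli KoMBo (KoMBo Structural Mechanics Student Research Group).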
Performance evaluation of parallel background subtraction on GPU platforms
Publication
- G. Szwoch
- Elektronika : konstrukcje, technologie, zastosowania - Year 2015
Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...

Full text available for download from an external service
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
- S. Cygert
- J. Porter-Sobieraj
- D. Kikoła
- J. Sikorski
- M. Słodkowski
- Year 2013
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

Full text available for download from an external service
Parallelization of large vector similarity computations in a hybrid CPU+GPU environment
Publication
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2018
The paper presents design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computation of similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...

Full text available for download in the portal
Performance evaluation of the parallel object tracking algorithm employing the particle filter
Publication
- G. Szwoch
- Year 2016
An algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with the CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication
- Year 2014
Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text available for download from an external service
Współczesne Środowiska Programowania [Contemporary Programming Environments] - 22/23
Online Courses
- T. Stefański
- K. Kruczkowski
Introduces students to contemporary programming environments, using Nvidia's CUDA technology as an example.
Use of ICT infrastructure for teaching HPC
Publication
- P. Czarnul
- M. Matuszek
- Year 2019
In this paper we look at modern ICT infrastructure as well as the curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, and at the node level using OpenMP and CUDA. We present...

Full text available for download from an external service
Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
Publication
- Future Generation Computer Systems-The International Journal of Grid Computing-Theory Methods and Applications - Year 2023
GPU accelerators have become essential to the recent advances in the computational power of high-performance computing (HPC) systems. Current HPC systems reach a power demand of approximately 20–30 megawatts, which has resulted in increasing CO2 emissions and energy costs and necessitates increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...

Full text available for download from an external service
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
Publication
- T. Stefański
- K. Krzyżanowska
- IEEE Antennas and Wireless Propagation Letters - Year 2012
In this letter, an implementation of the finite-difference time-domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its application in FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...

Full text available for download from an external service
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication
- T. Stefański
- Progress in Electromagnetics Research-PIER - Year 2013
This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously applies the computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text available for download from an external service
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe [Application of GPGPU technology to support engineering numerical computations, using the example of flow analysis through a two-phase fluid-solid medium]
Publication
- A. Butterweck
- M. H. Ghaemi
- Mechanik - Year 2011
After presenting basic information on GPGPU technology and the NVIDIA CUDA architecture, the article describes the conservation equations governing flows and their numerical discretization. It then examines the possibility of using GPGPU technology to optimize the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...
Modelling and simulation of GPU processing in the MERPSYS environment
Publication
- T. Gajger
- P. Czarnul
- Scalable Computing: Practice and Experience - Year 2018
In this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...

Full text available for download in the portal
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe [Application of GPGPU technology to support engineering numerical computations, using the example of flow analysis through a two-phase fluid-solid medium]
Publication
- A. Butterweck
- M. H. Ghaemi
- Year 2011
After presenting basic information on GPGPU technology and the NVIDIA CUDA architecture, the article describes the conservation equations governing flows and their numerical discretization. It then examines the possibility of using GPGPU technology to optimize the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...
Parallel Programming for Modern High Performance Computing Systems
Publication
- P. Czarnul
- Year 2018
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text available for download from an external service
Algorytmy analizy i przetwarzania danych z sonarów wielowiązkowych w rozproszonych systemach GIS [Algorithms for the analysis and processing of multibeam sonar data in distributed GIS systems]
Publication
- A. Chybicki
- Year 2011
Marine telemonitoring and marine research in the broad sense are an important part of human activity in research, science and the economy. Activities such as seabed mapping, the inspection of quays and fortifications, and the study of marine fauna help in understanding the processes taking place in the marine environment and contribute to the development of many sectors of the economy, such as maritime transport, security, port protection and others. As part of...
Benchmarking overlapping communication and computations with multiple streams for modern GPUs
Publication
- P. Czarnul
- Annals of Computer Science and Information Systems - Year 2018
The paper presents benchmarking of a multi-stream application processing a set of input data arrays. Tests have been performed and execution times measured for various numbers of streams and various compute intensities, measured as the ratio of kernel compute time to data transfer time. As such, the application and benchmarking are representative of frequently used operations such as vector weighted sum, matrix multiplication etc....

Full text available for download in the portal
Mobile Cloud computing architecture for massively parallelizable geometric computation
Publication
- V. Sánchez Ribes
- H. Mora-Mora
- A. Sobecki
- F. José Mora Gimeno
- COMPUTERS IN INDUSTRY - Year 2020
Cloud Computing is one of the most disruptive technologies of this century. This technology has been widely adopted in many areas of society. In the manufacturing industry, it can be used to provide advantages in the execution of the complex geometric computation algorithms involved in CAD/CAM processes. The idea proposed in this research consists in outsourcing part of the load to be computed in the client machines...

Full text available for download in the portal
Tuning matrix-vector multiplication on GPU
Publication
- A. Dziekoński
- M. Mrozowski
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations, such as the conjugate gradient method (cg), the minimal residual method (minres) and the generalized minimal residual method (gmres), and it influences the overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
Jacobi and Gauss-Seidel preconditioned complex conjugate gradient method with GPU acceleration for finite element method
Publication
- Year 2010
In this paper two implementations of iterative solvers for solving complex symmetric and sparse systems resulting from the finite element method applied to the wave equation are discussed. The problem under investigation is a dielectric resonator antenna (DRA) discretized by FEM with vector elements of the second order (LT/QN). The solvers use the preconditioned conjugate gradient (pcg) method implemented on a Graphics Processing Unit (GPU)...

Full text available for download from an external service
