Search results for: gpu


Total: 81


Multi-core and Multiprocessor Implementation of Numerical Integration in Finite Element Method
Publication
- Year 2012
The paper presents techniques for accelerating the numerical integration process that appears in the Finite Element Method. The acceleration is achieved by taking advantage of multi-core and multiprocessor devices. It is shown that a multi-core implementation with OpenMP and GPU acceleration using the CUDA architecture achieve speedups by factors of 5 and 10 on a CPU and GPUs, respectively.
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
Publication
- T. M. Boiński
- P. Czarnul
- COMPUTER JOURNAL - Year 2021
In the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...

Full text available for download in the portal
Programowanie równoległe na architekturach wielordzeniowych [Parallel Programming on Multi-core Architectures]
Online Courses
- A. Brzeski
- P. Czarnul
- R. Kałaska
A course on parallel programming for shared-memory machines, including multi-core CPUs and GPUs.
Programowanie równoległe na architekturach wielordzeniowych (2024-25) [Parallel Programming on Multi-core Architectures (2024-25)]
Online Courses
- H. A. Mojeed
- P. Czarnul
- R. Kałaska
A course on parallel programming for shared-memory machines, including multi-core CPUs and GPUs.
Programowanie równoległe na architekturach wielordzeniowych (2023-24) [Parallel Programming on Multi-core Architectures (2023-24)]
Online Courses
- H. A. Mojeed
- P. Czarnul
- R. Kałaska
A course on parallel programming for shared-memory machines, including multi-core CPUs and GPUs.
Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA
Publication
- M. J. Adiletta
- J. J. Tithi
- E. Farsarakis
- G. Gerogiannis
- R. Adolf
- R. Benke
- S. Kashyap
- S. Hsia
- K. Lakhotia
- F. Petrini... and 2 others
- Year 2023
Large-scale Graph Convolutional Network (GCN) inference on traditional CPU/GPU systems is challenging due to a large memory footprint, sparse computational patterns, and irregular memory accesses with poor locality. Intel’s Programmable Integrated Unified Memory Architecture (PIUMA) is designed to address these challenges for graph analytics. In this paper, a detailed characterization of GCNs is presented using the Open-Graph Benchmark...

Full text available for download from an external service
Optymalizacja wydajności obliczeniowej metody elementów skończonych w architekturze CUDA [Optimization of the Computational Performance of the Finite Element Method on the CUDA Architecture]
Publication
- A. Dziekoński
- Year 2015
The aim of this dissertation, and of the fellowship completed within the project, was to develop a numerically efficient algorithmic and hardware solution that accelerates the analysis of electromagnetic problems by the finite element method (FEM) with high-order basis functions. The finite element method in the frequency domain is an efficient and versatile tool for analyzing microwave circuits (Fig....
Implementation of FDTD-Compatible Green's Function on Graphics Processing Unit
Publication
- T. Stefański
- K. Krzyżanowska
- IEEE Antennas and Wireless Propagation Letters - Year 2012
In this letter, an implementation of the finite-difference time-domain (FDTD)-compatible Green's function on a graphics processing unit (GPU) is presented. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates its application in FDTD simulations of radiation and scattering problems. Unfortunately, implementation of the new DGF formula in software requires a multiple precision...

Full text available for download from an external service
Piotr Szczuko dr hab. inż.

People

Department of Multimedia Systems

Dr hab. inż. Piotr Szczuko graduated in 2002 from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology with a master of science in engineering degree. His diploma thesis examined the simultaneous perception of digital images and surround sound. In 2008 he defended his doctoral dissertation, titled "Zastosowanie reguł rozmytych w komputerowej animacji postaci" [Application of Fuzzy Rules in Computer Character Animation], for which he received the award of the President of the Council...
Generation of large finite-element matrices on multiple graphics processors
Publication
- INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING - Year 2013
This paper presents techniques for generating very large finite-element matrices on a multicore workstation equipped with several graphics processing units (GPUs). To overcome the low memory size limitation of the GPUs, and at the same time to accelerate the generation process, we propose to generate the large sparse linear systems arising in finite-element analysis in an iterative manner on several GPUs and to use the graphics...

Full text available for download from an external service
ZASTOSOWANIA DRONÓW I SENSORÓW WIZYJNYCH I AKUSTYCZNYCH DO ZDALNEJ DETEKCJI I LOKALIZACJI OBIEKTÓW I ZDARZEŃ [Applications of Drones and Vision and Acoustic Sensors for Remote Detection and Localization of Objects and Events]
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2016
The paper presents selected acoustic and vision sensors and proposes their use for detecting and localizing objects and events from on board a drone. The stream-analysis algorithms applied are briefly described, and test results are presented for the developed prototypes and methods, which were implemented on high-performance GPUs.
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
- S. Cygert
- J. Porter-Sobieraj
- D. Kikoła
- J. Sikorski
- M. Słodkowski
- Year 2013
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

Full text available for download from an external service
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe [Application of GPGPU Technology to Support Engineering Numerical Computations, Illustrated by the Analysis of Flow Through a Two-Phase Fluid-Solid Medium]
Publication
- A. Butterweck
- M. H. Ghaemi
- Mechanik - Year 2011
After presenting basic information on GPGPU technology and the NVIDIA CUDA framework, the article describes the conservation equations governing flows and their numerical discretization. It then examines the possibilities of using GPGPU technology to reduce the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...

Full text available for download in the portal
Nowoczesne koncepcje integracji usług w systemie BeesyCluster [Modern Concepts of Service Integration in the BeesyCluster System]
Publication
- P. Czarnul
- Year 2010
The functions of the current version of the BeesyCluster system are described as a middleware layer providing access to distributed resources, together with its subsystems for service integration, service selection, and service execution. Extensions of the service integration subsystem oriented toward green computing are presented. The problems of intelligent service discovery, the use of GPUs, cooperation with mobile devices, and processing in smart spaces are discussed. Additionally...
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe [Application of GPGPU Technology to Support Engineering Numerical Computations, Illustrated by the Analysis of Flow Through a Two-Phase Fluid-Solid Medium]
Publication
- A. Butterweck
- M. H. Ghaemi
- Year 2011
After presenting basic information on GPGPU technology and the NVIDIA CUDA framework, the article describes the conservation equations governing flows and their numerical discretization. It then examines the possibilities of using GPGPU technology to reduce the execution time of numerical computations of flow through a two-phase medium (fluid and solid particles) resembling a porous medium. To this end,...
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication
- Year 2014
Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text available for download from an external service
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication
- Year 2014
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of the hardware and software solutions they use. The problems that we wish to solve efficiently using those systems differ in complexity, not only in magnitude but also in type: computation, data, or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
Sign Language Recognition Using Convolution Neural Networks
Publication
- Year 2024
The objective of this work was to provide an app that can automatically recognize hand gestures from the American Sign Language (ASL) on mobile devices. The app employs a model based on Convolutional Neural Network (CNN) for gesture classification. Various CNN architectures and optimization strategies suitable for devices with limited resources were examined. InceptionV3 and VGG-19 models exhibited negligibly higher accuracy than...

Full text available for download in the portal
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
Publication
- P. Rościszewski
- M. Iwański
- P. Czarnul
- Year 2020
Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...

Full text available for download from an external service
Cryptocurrencies as a Speculative Asset: How Much Uncertainty is Included in Cryptocurrency Price?
Publication
- T. Ahsan
- K. Zawadzki
- K. Mubashir
- SAGE Open - Year 2024
The aim of this paper is to examine the relationship between uncertainty indices (Geopolitical Uncertainty Index and Global Economic Policy Uncertainty Index) and cryptocurrencies. This study evaluated the behavior of cryptocurrencies with the evolution of uncertainties (GPU, EPU) on returns and volatility in terms of safe haven, as in traditional speculative assets it increases their volatility and reduces risk. For this purpose,...

Full text available for download in the portal
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
Publication
- ENERGIES - Year 2023
High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...

Full text available for download in the portal
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text available for download from an external service
Neural Architecture Search for Skin Lesion Classification
Publication
- IEEE Access - Year 2020
Deep neural networks have achieved great success in many domains. However, successful deployment of such systems is determined by proper manual selection of the neural architecture. This is a tedious and time-consuming process that requires expert knowledge. Different tasks need very different architectures to obtain satisfactory results. The group of methods called the neural architecture search (NAS) helps to find effective architecture...

Full text available for download in the portal
Advanced Potential Energy Surfaces for Molecular Simulation
Publication
- A. Albaugh
- H. Boateng
- R. Bradshaw
- O. Demerdash
- J. Dziedzic
- Y. Mao
- D. Margul
- J. Swails
- Q. Zeng
- D. Case... and 10 others
- JOURNAL OF PHYSICAL CHEMISTRY B - Year 2016
Advanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models...

Full text available for download in the portal
Comparing Apples and Oranges: A Mobile User Experience Study of iOS and Android Consumer Devices
Publication
- P. Falkowski-Gilski
- T. Uhl
- Year 2023
With the rapid development of wireless networks and the spread of broadband access around the world, the number of active mobile user devices continues to grow. Each year more and more terminals are released on the market, with the smartphone being the most popular among them. They include low-end, mid-range, and of course high-end devices, with top hardware specifications. They do vary in build quality, utilized type of material,...

Full text available for download from an external service
Krzysztof Bikonis dr inż.

People

Department of Geoinformatic Systems
Mobile Cloud computing architecture for massively parallelizable geometric computation
Publication
- V. Sánchez Ribes
- H. Mora-Mora
- A. Sobecki
- F. José Mora Gimeno
- COMPUTERS IN INDUSTRY - Year 2020
Cloud Computing is one of the most disruptive technologies of this century. This technology has been widely adopted in many areas of society. In the manufacturing industry, it can be used to provide advantages in the execution of the complex geometric computation algorithms involved in CAD/CAM processes. The idea proposed in this research consists in outsourcing part of the load to be computed in the client machines...

Full text available for download in the portal
Krylov Space Iterative Solvers on Graphics Processing Units
Publication
- A. Dziekoński
- M. Mrozowski
- Year 2010
CUDA architecture was introduced by Nvidia three years ago and since then there have been many promising publications demonstrating a huge potential of Graphics Processing Units (GPUs) in scientific computations. In this paper, we investigate the performance of iterative methods such as cg, minres, gmres, bicg that may be used to solve large sparse real and complex systems of equations arising in computational electromagnetics.

Full text available for download from an external service
Wykorzystanie technologii CUDA do kompresji w czasie rzeczywistym danych pochodzących z sonarów wielowiązkowych. [Using CUDA Technology for Real-Time Compression of Data from Multibeam Sonars]
Publication
- A. Chybicki
- K. Laskowski
- M. Moszyński
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
The paper presents the design and implementation of a system for compressing multibeam sonar data using CUDA technology. Lossless data compression methods and parallel processing techniques are discussed and applied. The resulting application was tested for speed and compression ratio and compared with other solutions for compressing this type of data.
Block-based Representation of Application Execution on Modern Parallel Systems
Publication
- P. Czarnul
- Year 2013
The chapter presents how to model the execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses previously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, and dedicated accelerators such as Intel Phi.
Modeling of Performance, Reliability and Energy Efficiency in Large-Scale Computational Environment
Publication
- J. Kuchta
- Year 2016
The large scale of complexity of distributed computational systems imposes special challenges for prediction of quality in such systems. Existing quality models for lower-scale systems include functionality, performance, reliability, flexibility and usability. Among these attributes, performance and reliability have a particular significance to the large-scale systems computing quality modeling due to their strong dependence on the system...
