Wyniki wyszukiwania dla: gpu
-
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
PublikacjaRelativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....
-
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe
PublikacjaW artykule po przedstawieniu podstawowych informacji na temat technologii GPGPU oraz struktury NVIDIA CUDA opisano równania zachowania rządzące przepływami oraz ich dyskretyzację numeryczna. Następnie zbadano możliwości wykorzystania technologii GPGPU w celu zoptymalizowania czasu wykonywania obliczeń numerycznych przepływu przez ośrodek dwufazowy (płyn - cząsteczki ciała stała stałego) zbliżony do ośrodka porowatego. W tym celu,...
-
Nowoczesne koncepcje integracji usług w systemie BeesyCluster
PublikacjaOpisano funkcje aktualnej wersji systemu BeesyCluster jakowarstwy pośredniej w dostępie do rozproszonych zasobów wraz podsystemami integracji usług, wyboru usług oraz ich wykonania. Zaprezentowano rozszerzenia podsystemu integracji usług zorientowane na green computing. Omówiono problemy inteligentnego wyszukiwania usług, wykorzystanie GPU, współpracę z urządzeniami mobilnymi oraz przetwarzanie w przestrzeniach inteligentnych.Dodatkowo...
-
Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe
PublikacjaW artykule po przedstawieniu podstawowych informacji na temat technologii GPGPU oraz struktury NVIDIA CUDA opisano równania zachowania rządzące przepływami oraz ich dyskretyzację numeryczna. Następnie zbadano możliwości wykorzystania technologii GPGPU w celu zoptymalizowania czasu wykonywania obliczeń numerycznych przepływu przez ośrodek dwufazowy (płyn - cząsteczki ciała stała stałego) zbliżony do ośrodka porowatego. W tym celu,...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
PublikacjaSpectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
PublikacjaModern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems, that we wish to efficiently solve using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
-
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
PublikacjaPractical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
PublikacjaHigh-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
PublikacjaIn the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpora of annotated Polish speech data. We propose a MPI-based modification of the training program which minimizes the...
-
Neural Architecture Search for Skin Lesion Classification
PublikacjaDeep neural networks have achieved great success in many domains. However, successful deployment of such systems is determined by proper manual selection of the neural architecture. This is a tedious and time-consuming process that requires expert knowledge. Different tasks need very different architectures to obtain satisfactory results. The group of methods called the neural architecture search (NAS) helps to find effective architecture...
-
Advanced Potential Energy Surfaces for Molecular Simulation
PublikacjaAdvanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models...
-
Comparing Apples and Oranges: A Mobile User Experience Study of iOS and Android Consumer Devices
PublikacjaWith the rapid development of wireless networks and the spread of broadband access around the world, the number of active mobile user devices continues to grow. Each year more and more terminals are released on the market, with the smartphone being the most popular among them. They include low-end, mid-range, and of course high-end devices, with top hardware specifications. They do vary in build quality, utilized type of material,...
-
Krzysztof Bikonis dr inż.
Osoby -
Mobile Cloud computing architecture for massively parallelizablegeometric computation
PublikacjaCloud Computing is one of the most disruptive technologies of this century. This technology has been widely adopted in many areas of the society. In the field of manufacturing industry, it can be used to provide advantages in the execution of the complex geometric computation algorithms involved on CAD/CAM processes. The idea proposed in this research consists in outsourcing part of the load to be com- puted in the client machines...
-
Krylov Space Iterative Solvers on Graphics Processing Units
PublikacjaCUDA architecture was introduced by Nvidia three years ago and since then there have been many promising publications demonstrating a huge potential of Graphics Processing Units (GPUs) in scientific computations. In this paper, we investigate the performance of iterative methods such as cg, minres, gmres, bicg that may be used to solve large sparse real and complex systems of equations arising in computational electromagnetics.
-
Block-based Representation of Application Execution on Modern Parallel Systems
PublikacjaThe chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses pre- viously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.
-
Wykorzystanie technologii CUDA do kompresji w czasie rzeczywistym danych pochodzących z sonarów wielowiązkowych.
PublikacjaW pracy przedstawiono projekt oraz implementację systemu przeznaczonego do kompresji danych z sonarów wielowiązkowych działającego z wykorzystaniem technologii CUDA. Omówiono oraz zastosowano metody bezstratnej kompresji danych oraz techniki przetwarzania równoległego. Stworzoną aplikację przetestowano pod kątem prędkości i stopnia kompresji oraz porównano z innymi rozwiązaniami umożliwiającymi kompresję tego typu informacji.
-
Modeling of Performance, Reliability and Energy Efficiency in Large-Scale Computational Environment
PublikacjaLarge scale of complexity of distributed computational systems imposes special challanges for prediction of quality in such systems.Existing quality models for lower-scale systems include functionality,performance,reliability,flexibility and usability.Among these attributes,performance and reliability have a particular significance to the large-scale systems computing quality modeling due to their strong dependence on the system...