Search results for: MULTI GPU - Bridge of Knowledge

The impact of the AC922 Architecture on Performance of Deep Neural Network Training

Publication

- Year 2020

Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...

Full text to download in external service

A Task-Scheduling Approach for Efficient Sparse Symmetric Matrix-Vector Multiplication on a GPU

Publication

- SIAM JOURNAL ON SCIENTIFIC COMPUTING - Year 2015

In this paper, a task-scheduling approach to efficiently calculating sparse symmetric matrix-vector products and designed to run on Graphics Processing Units (GPUs) is presented. The main premise is that, for many sparse symmetric matrices occurring in common applications, it is possible to obtain significant reductions in memory usage and improvements in performance when the matrix is prepared in certain ways prior to computation....

Full text to download in external service

Energy-Aware Scheduling for High-Performance Computing Systems: A Survey

Publication

- ENERGIES - Year 2023

High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...

Full text available to download

Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe

Publication

- Mechanik - Year 2011

W artykule po przedstawieniu podstawowych informacji na temat technologii GPGPU oraz struktury NVIDIA CUDA opisano równania zachowania rządzące przepływami oraz ich dyskretyzację numeryczna. Następnie zbadano możliwości wykorzystania technologii GPGPU w celu zoptymalizowania czasu wykonywania obliczeń numerycznych przepływu przez ośrodek dwufazowy (płyn - cząsteczki ciała stała stałego) zbliżony do ośrodka porowatego. W tym celu,...

Full text available to download

GPU Power Capping for Energy-Performance Trade-Offs in Training of Deep Convolutional Neural Networks for Image Recognition

Publication

- Year 2022

In the paper we present performance-energy trade-off investigation of training Deep Convolutional Neural Networks for image recognition. Several representative and widely adopted network models, such as Alexnet, VGG-19, Inception V3, Inception V4, Resnet50 and Resnet152 were tested using systems with Nvidia Quadro RTX 6000 as well as Nvidia V100 GPUs. Using GPU power capping we found other than default configurations minimizing...

Full text available to download

Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

Publication

- Year 2017

In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpora of annotated Polish speech data. We propose a MPI-based modiﬁcation of the training program which minimizes the...

Full text to download in external service

Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe

Publication

- Year 2011

W artykule po przedstawieniu podstawowych informacji na temat technologii GPGPU oraz struktury NVIDIA CUDA opisano równania zachowania rządzące przepływami oraz ich dyskretyzację numeryczna. Następnie zbadano możliwości wykorzystania technologii GPGPU w celu zoptymalizowania czasu wykonywania obliczeń numerycznych przepływu przez ośrodek dwufazowy (płyn - cząsteczki ciała stała stałego) zbliżony do ośrodka porowatego. W tym celu,...

Modelling and simulation of GPU processing in the MERPSYS environment

Publication

- Scalable Computing: Practice and Experience - Year 2018

In this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...

Full text available to download

Neural Architecture Search for Skin Lesion Classification

Publication

- IEEE Access - Year 2020

Deep neural networks have achieved great success in many domains. However, successful deployment of such systems is determined by proper manual selection of the neural architecture. This is a tedious and time-consuming process that requires expert knowledge. Different tasks need very different architectures to obtain satisfactory results. The group of methods called the neural architecture search (NAS) helps to find effective architecture...

Full text available to download

Advanced Potential Energy Surfaces for Molecular Simulation

Publication

A. Albaugh
H. Boateng
R. Bradshaw
O. Demerdash
J. Dziedzic
Y. Mao
D. Margul
J. Swails
Q. Zeng
D. Case... and 10 others

- JOURNAL OF PHYSICAL CHEMISTRY B - Year 2016

Advanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models...

Full text available to download

Parallel multithread computing for spectroscopic analysis in optical coherence tomography

Publication

- Year 2014

Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text to download in external service

Comparing Apples and Oranges: A Mobile User Experience Study of iOS and Android Consumer Devices

Publication

P. Falkowski-Gilski
T. Uhl

- Year 2023

With the rapid development of wireless networks and the spread of broadband access around the world, the number of active mobile user devices continues to grow. Each year more and more terminals are released on the market, with the smartphone being the most popular among them. They include low-end, mid-range, and of course high-end devices, with top hardware specifications. They do vary in build quality, utilized type of material,...

Full text to download in external service

Filters

Catalog

The impact of the AC922 Architecture on Performance of Deep Neural Network Training

A Task-Scheduling Approach for Efficient Sparse Symmetric Matrix-Vector Multiplication on a GPU

Energy-Aware Scheduling for High-Performance Computing Systems: A Survey

Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn - ciało stałe

GPU Power Capping for Energy-Performance Trade-Offs in Training of Deep Convolutional Neural Networks for Image Recognition

Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

Zastosowanie technologii GPGPU do wspomagania inżynierskich obliczeń numerycznych na przykładzie analizy przepływu przez ośrodek dwufazowy płyn-ciało stałe

Modelling and simulation of GPU processing in the MERPSYS environment

Neural Architecture Search for Skin Lesion Classification

Advanced Potential Energy Surfaces for Molecular Simulation

Parallel multithread computing for spectroscopic analysis in optical coherence tomography

Comparing Apples and Oranges: A Mobile User Experience Study of iOS and Android Consumer Devices