Search results for: MULTI-GPU SCALABILITY
Total results: 1460 (displaying the 1000 best)
-
Scalability of surrogate-assisted multi-objective optimization of antenna structures exploiting variable-fidelity electromagnetic simulation models
Publication: Multi-objective optimization of antenna structures is a challenging task due to the high computational cost of evaluating the design objectives as well as the large number of adjustable parameters. Design speedup can be achieved by means of surrogate-based optimization techniques. In particular, a combination of variable-fidelity electromagnetic (EM) simulations, design space reduction techniques, response surface approximation (RSA) models,...
-
Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems
Publication: Graphics Processing Units (GPUs) are nowadays widely used in all-atom molecular simulations because of the advantage of efficient partitioning of atom pairs between the kernels that compute the contributions to energy and forces, thus enabling the treatment of very large systems. Extension of the time- and size-scale of computations is also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended...
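Below is a minimal CUDA sketch, not the actual UNRES kernels, of the pair-partitioning idea the abstract refers to: the pair list is split into contiguous slices, one slice per GPU, and each device accumulates a partial energy that is reduced on the host. A toy Lennard-Jones term stands in for the coarse-grained force field, and force accumulation is omitted for brevity.

```cpp
// Illustrative only: split a pair list across all visible GPUs and sum the
// partial energies on the host. Compile with: nvcc -arch=sm_60 pairs.cu
// (sm_60+ is needed for atomicAdd on double).
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>
#include <algorithm>

struct Pair { int i, j; };

__global__ void pairEnergy(const float3* pos, const Pair* pairs, int nPairs, double* energy)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= nPairs) return;
    float3 a = pos[pairs[p].i], b = pos[pairs[p].j];
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    float r2 = dx * dx + dy * dy + dz * dz;
    float inv6 = 1.0f / (r2 * r2 * r2);
    atomicAdd(energy, 4.0 * (double)(inv6 * inv6 - inv6));  // toy LJ term, sigma = eps = 1
}

int main()
{
    const int nAtoms = 1000;
    std::vector<float3> pos(nAtoms);
    for (int k = 0; k < nAtoms; ++k) pos[k] = make_float3(0.5f * k, 0.0f, 0.0f);

    std::vector<Pair> pairs;                     // toy pair list: all i < j pairs
    for (int i = 0; i < nAtoms; ++i)
        for (int j = i + 1; j < nAtoms; ++j) pairs.push_back({i, j});

    int nDev = 1;
    cudaGetDeviceCount(&nDev);
    if (nDev < 1) nDev = 1;                      // fall back gracefully if no GPU is detected
    int chunk = ((int)pairs.size() + nDev - 1) / nDev;

    double total = 0.0;
    for (int dev = 0; dev < nDev; ++dev) {       // one contiguous slice of pairs per GPU
        int begin = dev * chunk;
        int count = std::min(chunk, (int)pairs.size() - begin);
        if (count <= 0) break;
        cudaSetDevice(dev);
        float3* dPos; Pair* dPairs; double* dE;
        cudaMalloc(&dPos, nAtoms * sizeof(float3));
        cudaMalloc(&dPairs, count * sizeof(Pair));
        cudaMalloc(&dE, sizeof(double));
        cudaMemcpy(dPos, pos.data(), nAtoms * sizeof(float3), cudaMemcpyHostToDevice);
        cudaMemcpy(dPairs, pairs.data() + begin, count * sizeof(Pair), cudaMemcpyHostToDevice);
        cudaMemset(dE, 0, sizeof(double));
        pairEnergy<<<(count + 255) / 256, 256>>>(dPos, dPairs, count, dE);
        double e = 0.0;
        cudaMemcpy(&e, dE, sizeof(double), cudaMemcpyDeviceToHost);
        total += e;                              // host-side reduction of partial energies
        cudaFree(dPos); cudaFree(dPairs); cudaFree(dE);
    }
    printf("total pair energy: %f\n", total);
    return 0;
}
```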
-
Performance and Energy Aware Training of a Deep Neural Network in a Multi-GPU Environment with Power Capping
Publication: In this paper we demonstrate that it is possible to obtain a considerable improvement in performance- and energy-aware metrics for training deep neural networks on a modern parallel multi-GPU system by enforcing selected, non-default power caps on the GPUs. We measure the power and energy consumption of the whole node using a professional, certified hardware power meter. For a high-performance workstation with 8 GPUs, we were...
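As an illustration of mechanism only: per-GPU power caps of the kind the paper enforces can be set programmatically through NVIDIA's NVML library. The host-side C++ sketch below is a hedged example; the 200 W value is an arbitrary placeholder rather than a cap taken from the paper, changing limits usually requires root privileges, and this does not replace the external hardware power meter used for the whole-node measurements.

```cpp
// Hedged sketch: set an example 200 W power cap on every visible GPU via NVML
// and read back the instantaneous power draw.
// Compile with: g++ powercap.cpp -lnvidia-ml
#include <nvml.h>
#include <cstdio>

int main()
{
    unsigned int devCount = 0, power = 0;
    nvmlInit();
    nvmlDeviceGetCount(&devCount);
    for (unsigned int i = 0; i < devCount; ++i) {
        nvmlDevice_t dev;
        nvmlDeviceGetHandleByIndex(i, &dev);
        // NVML expects milliwatts: 200000 mW == 200 W (example value only).
        nvmlReturn_t rc = nvmlDeviceSetPowerManagementLimit(dev, 200000);
        if (rc != NVML_SUCCESS)
            fprintf(stderr, "GPU %u: could not set cap (%s)\n", i, nvmlErrorString(rc));
        nvmlDeviceGetPowerUsage(dev, &power);     // current draw, in mW
        printf("GPU %u draws %.1f W\n", i, power / 1000.0);
    }
    nvmlShutdown();
    return 0;
}
```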
-
Communication and Load Balancing Optimization for Finite Element Electromagnetic Simulations Using Multi-GPU Workstation
Publication: This paper considers a method for accelerating finite-element simulations of electromagnetic problems on a workstation using graphics processing units (GPUs). The focus is on finite-element formulations using higher-order elements and tetrahedral meshes that lead to sparse matrices too large to be dealt with on a typical workstation using direct methods. We discuss the problem of rapid matrix generation and assembly, as well as...
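For illustration only, a minimal CUDA sketch of one common GPU assembly pattern: each thread takes one element's precomputed local matrix and scatter-adds it into the nonzero array of the global sparse matrix through an index map prepared during symbolic assembly, with atomicAdd resolving entries shared by neighbouring elements. This is a generic sketch, not the formulation or the mesh data structures used in the paper.

```cpp
// Illustrative FEM assembly sketch: scatter-add local element matrices into the
// global sparse value array. Compile with: nvcc -arch=sm_60 assemble.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void assemble(const double* localMat,   // [nElem * nLoc * nLoc] local matrices
                         const int* scatterIdx,    // [nElem * nLoc * nLoc] target nonzero indices
                         double* globalVal,        // nonzero values of the global matrix
                         int nElem, int nLoc)
{
    int e = blockIdx.x * blockDim.x + threadIdx.x;
    if (e >= nElem) return;
    int n2 = nLoc * nLoc;
    for (int k = 0; k < n2; ++k)                   // atomicAdd handles entries shared by elements
        atomicAdd(&globalVal[scatterIdx[e * n2 + k]], localMat[e * n2 + k]);
}

int main()
{
    // Toy problem: 2 elements, 2 local DOFs each, global matrix with 6 nonzeros.
    const int nElem = 2, nLoc = 2, nnz = 6;
    double hLocal[nElem * nLoc * nLoc] = {1, -1, -1, 1,   2, -2, -2, 2};
    int    hIdx  [nElem * nLoc * nLoc] = {0,  1,  1, 2,   2,  3,  3, 5};  // symbolic-assembly map
    double *dLocal, *dVal; int *dIdx;
    cudaMalloc(&dLocal, sizeof(hLocal));
    cudaMalloc(&dIdx, sizeof(hIdx));
    cudaMalloc(&dVal, nnz * sizeof(double));
    cudaMemcpy(dLocal, hLocal, sizeof(hLocal), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, hIdx, sizeof(hIdx), cudaMemcpyHostToDevice);
    cudaMemset(dVal, 0, nnz * sizeof(double));
    assemble<<<1, 32>>>(dLocal, dIdx, dVal, nElem, nLoc);
    double hVal[nnz];
    cudaMemcpy(hVal, dVal, sizeof(hVal), cudaMemcpyDeviceToHost);
    for (int k = 0; k < nnz; ++k) printf("val[%d] = %g\n", k, hVal[k]);
    return 0;
}
```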
-
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
Publication: In this paper, we propose a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using OpenMP plus an extended CUDA API. OpenMP is used for launching threads responsible for the management of particular GPUs, and the extended CUDA calls allow the programmer to manage CUDA objects and data and to launch kernels. The framework hides inter-node MPI communication from the programmer, who can benefit from...
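A minimal sketch of the single-node pattern underlying this design, under the assumption of one OpenMP management thread per GPU: each thread binds to its device with cudaSetDevice and launches work on its own data slice. The framework's extended CUDA API and the hidden inter-node MPI layer are not shown.

```cpp
// One OpenMP thread per GPU, each managing its own device and data slice.
// Compile with: nvcc -Xcompiler -fopenmp multigpu.cu
#include <cuda_runtime.h>
#include <omp.h>
#include <cstdio>
#include <vector>

__global__ void scale(float* x, int n, float a)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main()
{
    const int N = 1 << 20;
    std::vector<float> data(N, 1.0f);

    int nDev = 1;
    cudaGetDeviceCount(&nDev);
    if (nDev < 1) nDev = 1;                       // fall back gracefully if no GPU is detected
    int chunk = (N + nDev - 1) / nDev;

    #pragma omp parallel num_threads(nDev)        // one management thread per GPU
    {
        int dev = omp_get_thread_num();
        cudaSetDevice(dev);                       // bind this thread to its GPU
        int begin = dev * chunk;
        int count = (begin + chunk > N) ? N - begin : chunk;
        if (count > 0) {
            float* d;
            cudaMalloc(&d, count * sizeof(float));
            cudaMemcpy(d, data.data() + begin, count * sizeof(float), cudaMemcpyHostToDevice);
            scale<<<(count + 255) / 256, 256>>>(d, count, 2.0f);
            cudaMemcpy(data.data() + begin, d, count * sizeof(float), cudaMemcpyDeviceToHost);
            cudaFree(d);
        }
    }
    printf("data[0] = %.1f, data[N-1] = %.1f\n", data[0], data[N - 1]);
    return 0;
}
```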
-
Multi-GPU-powered UNRES package for physics-based coarse-grained simulations of structure, dynamics, and thermodynamics of protein systems at biological size- and timescales
Publication: Coarse-grained models are nowadays extensively used in biomolecular simulations owing to the tremendous extension of the size- and time-scales of simulations they enable. The physics-based UNRES (UNited RESidue) model of proteins developed in our laboratory has only two interaction sites per amino-acid residue (united peptide groups and united side chains) and implicit solvent. However, owing to its rigorous physics-based derivation, which enabled...
-
Characterizing the Scalability of Graph Convolutional Networks on Intel® PIUMA
Publication: Large-scale Graph Convolutional Network (GCN) inference on traditional CPU/GPU systems is challenging due to a large memory footprint, sparse computational patterns, and irregular memory accesses with poor locality. Intel's Programmable Integrated Unified Memory Architecture (PIUMA) is designed to address these challenges for graph analytics. In this paper, a detailed characterization of GCNs is presented using the Open Graph Benchmark...
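To make the sparse, irregular access pattern concrete, here is a hedged CUDA sketch of the neighbour-aggregation step of a single GCN layer (H' = A * H, with the normalized adjacency matrix A stored in CSR); the gathers through the column-index array are the poorly localized accesses the abstract refers to. It is not the implementation characterized in the paper.

```cpp
// GCN neighbour aggregation over a CSR adjacency matrix (illustration only).
// Compile with: nvcc gcn.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void gcnAggregate(const int* rowPtr, const int* colIdx, const float* edgeVal,
                             const float* H, float* Hout, int nNodes, int featDim)
{
    int row = blockIdx.x;                          // one node (row of A) per block
    int f = threadIdx.x;                           // one feature channel per thread
    if (row >= nNodes || f >= featDim) return;
    float acc = 0.0f;
    for (int e = rowPtr[row]; e < rowPtr[row + 1]; ++e)
        acc += edgeVal[e] * H[colIdx[e] * featDim + f];   // irregular gather through colIdx
    Hout[row * featDim + f] = acc;
}

int main()
{
    // Toy graph: 3 nodes with self-loops, 2 features per node, A in CSR form.
    const int nNodes = 3, featDim = 2, nnz = 7;
    int rowPtr[nNodes + 1] = {0, 2, 5, 7};
    int colIdx[nnz] = {0, 1, 0, 1, 2, 1, 2};
    float edgeVal[nnz] = {0.5f, 0.5f, 0.33f, 0.34f, 0.33f, 0.5f, 0.5f};
    float H[nNodes * featDim] = {1, 2, 3, 4, 5, 6};

    int *dRow, *dCol; float *dVal, *dH, *dOut;
    cudaMalloc(&dRow, sizeof(rowPtr)); cudaMalloc(&dCol, sizeof(colIdx));
    cudaMalloc(&dVal, sizeof(edgeVal)); cudaMalloc(&dH, sizeof(H)); cudaMalloc(&dOut, sizeof(H));
    cudaMemcpy(dRow, rowPtr, sizeof(rowPtr), cudaMemcpyHostToDevice);
    cudaMemcpy(dCol, colIdx, sizeof(colIdx), cudaMemcpyHostToDevice);
    cudaMemcpy(dVal, edgeVal, sizeof(edgeVal), cudaMemcpyHostToDevice);
    cudaMemcpy(dH, H, sizeof(H), cudaMemcpyHostToDevice);
    gcnAggregate<<<nNodes, featDim>>>(dRow, dCol, dVal, dH, dOut, nNodes, featDim);
    float out[nNodes * featDim];
    cudaMemcpy(out, dOut, sizeof(out), cudaMemcpyDeviceToHost);
    for (int i = 0; i < nNodes * featDim; ++i) printf("%.2f ", out[i]);
    printf("\n");
    return 0;
}
```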
-
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
Publication: Practical deep learning applications require more and more computing power. New computing architectures designed specifically for artificial intelligence applications are emerging, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
Publication: High-performance computing (HPC), as its name suggests, is traditionally oriented toward performance, especially the execution time and scalability of computations. However, due to high costs and environmental concerns, energy consumption has become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
Performance evaluation of parallel background subtraction on GPU platforms
Publication: An implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing the parallel implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers, and work-group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...
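For illustration, a hedged CUDA sketch of a simple per-pixel background-subtraction kernel (running-average background model with a fixed threshold); it shows the per-pixel parallelism only and is not the specific algorithm or the memory/transfer optimizations evaluated in the work.

```cpp
// Per-pixel background subtraction with a running-average model (illustration only).
// Compile with: nvcc bgsub.cu
#include <cuda_runtime.h>
#include <cstdio>

__global__ void bgSubtract(const unsigned char* frame, float* background,
                           unsigned char* mask, int nPixels, float alpha, float thr)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nPixels) return;
    float diff = fabsf((float)frame[i] - background[i]);
    mask[i] = diff > thr ? 255 : 0;                                      // foreground pixel?
    background[i] = (1.0f - alpha) * background[i] + alpha * frame[i];   // update the model
}

int main()
{
    const int nPixels = 8;
    unsigned char frame[nPixels] = {10, 10, 200, 10, 10, 180, 10, 10};   // toy input frame
    float background[nPixels]    = {10, 10,  10, 10, 10,  10, 10, 10};   // learned background

    unsigned char *dFrame, *dMask; float *dBg;
    cudaMalloc(&dFrame, nPixels); cudaMalloc(&dMask, nPixels);
    cudaMalloc(&dBg, nPixels * sizeof(float));
    cudaMemcpy(dFrame, frame, nPixels, cudaMemcpyHostToDevice);
    cudaMemcpy(dBg, background, nPixels * sizeof(float), cudaMemcpyHostToDevice);
    bgSubtract<<<1, 32>>>(dFrame, dBg, dMask, nPixels, 0.05f, 30.0f);
    unsigned char mask[nPixels];
    cudaMemcpy(mask, dMask, nPixels, cudaMemcpyDeviceToHost);
    for (int i = 0; i < nPixels; ++i) printf("%d ", mask[i]);            // 255 marks foreground
    printf("\n");
    return 0;
}
```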