Search results for: CPU

total: 128

Evaluation the effectiveness of virtual machine integrated with CPU
Publication
- T. Bieliński
- Year 2013
The paper presents the effectiveness of an example CPU with an integrated virtual machine. The idea and implementation of the virtual machine are shown. The following sections describe the reference CPU and a sample virtual machine. Finally, the optimality of the translation process is analysed.
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication
- A. Krzywaniak
- P. Czarnul
- Advances in Intelligent Systems and Computing - Year 2017
In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Full text available to download
Parallelization of large vector similarity computations in a hybrid CPU+GPU environment
Publication
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2018
The paper presents design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computation of similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...

Full text available to download
Study on CPU and RAM Resource Consumption of Mobile Devices using Streaming Services
Publication
- P. Falkowski-Gilski
- M. Woźniak
- Year 2021
Streaming multimedia services have become very popular in recent years, due to the development of wireless networks. With the growing number of mobile devices worldwide, service providers offer dedicated applications that deliver on-demand audio and video content anytime and anywhere. The aim of this study was to compare different streaming services and investigate their impact on the CPU and RAM resources, with respect...

Full text to download in external service
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication
- T. Stefański
- Progress in Electromagnetics Research-PIER - Year 2013
This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously uses the computational power of the central processing unit (CPU) and the graphics processing unit (GPU), assigning to each the computational tasks best suited to its architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text to download in external service
The Speedup Analysis in GEM Detector Based Acquisition System Algorithms with CPU and PCIe Cards
Publication
- R. Krawczyk
- P. Linczuk
- P. Kolasinski
- A. Wojenski
- G. Kasprowicz
- K. Pozniak
- R. Romaniuk
- W. Zabolotny
- P. Zienkiewicz
- T. Czarski... and 2 others
- Acta Physica Polonica B Proceedings Supplement - Year 2016
Full text to download in external service
CPU-e Revista de Investigacion Educativa

Journals

ISSN: 1870-5308
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
- J. Skrzypczak
- P. Czarnul
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...

Full text to download in external service
Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams
Publication
- P. Czarnul
- COMPUTING AND INFORMATICS - Year 2020
The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...

Full text available to download
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication
- P. Czarnul
- P. Rościszewski
- Year 2020
Auto-tuning of configuration and application parameters makes it possible to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space, and new methodologies are needed in the face of emerging heterogeneous...

Full text available to download
Tuning a Hybrid GPU-CPU V-Cycle Multilevel Preconditioner for Solving Large Real and Complex Systems of FEM Equations
Publication
- IEEE Antennas and Wireless Propagation Letters - Year 2011
This letter presents techniques for tuning an accelerated preconditioned conjugate gradient solver with a multilevel preconditioner. The solver is optimized for a fast solution of sparse systems of equations arising in computational electromagnetics in a finite element method using higher-order elements. The goal of the tuning is to increase the throughput while at the same time reducing the memory requirements in order to allow...

Full text to download in external service
Multi Queue Approach for Network Services Implemented for Multi Core CPUs
Publication
- Journal of Telecommunications and Information Technology - Year 2011
Multiple core processors have already become the dominant design for general purpose CPUs. Incarnations of this technology are present in solutions dedicated to such areas as computer graphics, signal processing and also computer networking. Since the key functionality of network core components is fast package servicing, multicore technology, due to its multitasking ability, seems useful to support packet processing. Dedicated...

Full text available to download
Performance assessment of OpenMP constructs and benchmarks using modern compilers and multi-core CPUs
Publication
- B. Gawrych
- P. Czarnul
- Year 2023
Considering ongoing developments of modern CPUs, especially the increasing numbers of cores, cache memory and architectures, as well as of compilers, there is a constant need for benchmarking representative and frequently run workloads. The key metric is speed-up, as the computational power of modern CPUs stems mainly from using multiple cores. In this paper, we show and discuss results from running codes such as:...

Full text to download in external service
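Modelowanie wydajności, niezawodności i zużycia energii wilopoziomowych systemów równoległych wielkiej skali z uwzględnieniem CPU oraz GPU (Modeling the performance, reliability and energy consumption of multilevel large-scale parallel systems including CPUs and GPUs)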

Projects

Project manager: dr hab. inż. Paweł Czarnul
Financial Program Name: OPUS

Project carried out at the Faculty of Electronics, Telecommunications and Informatics under agreement UMO-2012/07/B/ST6/01516 dated 2013-07-17
Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs
Publication
- P. Czarnul
- P. Rościszewski
- Year 2014
The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns them to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...

Full text to download in external service
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
Publication
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016
The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text to download in external service
Preconditioners with Low Memory Requirements for Higher-Order Finite-Element Method Applied to Solving Maxwell’s Equations on Multicore CPUs and GPUs
Publication
- A. Dziekoński
- G. Fotyga
- M. Mrozowski
- IEEE Access - Year 2018
This paper discusses two fast implementations of the conjugate gradient iterative method using a hierarchical multilevel preconditioner to solve the complex-valued, sparse systems obtained using the higher order finite-element method applied to the solution of the time-harmonic Maxwell equations. In the first implementation, denoted PCG-V, a classical V-cycle is applied and the system of equations on the lowest level is solved...

Full text available to download
Programowanie równoległe na architekturach wielordzeniowych (Parallel programming on multi-core architectures)
e-Learning Courses
- A. Brzeski
- P. Czarnul
- R. Kałaska
A course devoted to parallel programming on shared-memory machines, including multi-core CPUs and GPUs.
Programowanie równoległe na architekturach wielordzeniowych (2023-24) (Parallel programming on multi-core architectures)
e-Learning Courses
- H. A. Mojeed
- P. Czarnul
- R. Kałaska
A course devoted to parallel programming on shared-memory machines, including multi-core CPUs and GPUs.
Paweł Czarnul dr hab. inż.

People

Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics

Paweł Czarnul obtained a D.Sc. degree in computer science in 2015 and a Ph.D. in computer science, granted by a council at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, in 2003. His research interests include: parallel and distributed processing including clusters, accelerators, coprocessors; distributed information systems; architectures of distributed systems; programming mobile devices....
