Search results for: PARALLAX

Simulation of parallel similarity measure computations for large data sets

Publication

- Year 2015

The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...

Full text available for download from an external service

Block-based Representation of Application Execution on Modern Parallel Systems

Publication

P. Czarnul

- Year 2013

The chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses previously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.

Parallel multithread computing for spectroscopic analysis in optical coherence tomography

Publication

- Year 2014

Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text available for download from an external service

Redundant Actuation of 3RRR over-actuated Planar Parallel Manipulator

Publication

K. Lipiński

- Year 2009

The paper addresses the modelling and actuation of parallel manipulators. A characteristic feature of parallel manipulators is the presence of one or more closed kinematic chains (parallel branches). Typically, such structures are driven only by motors mounted in the kinematic pairs that connect the kinematic chains to the base. Sometimes such structures are redundantly actuated systems (the number...

A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures

Publication

- TASK Quarterly - Year 2012

In this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from specific historical period and social background to evolve a set of face images potentially resembling the person whose image is to be found. We then use portraits of the person's relatives to judge...

Full text available for download on the portal

Sensorless predictive control of three-phase parallel active filter

Publication

- Year 2007

The paper presents the control system of parallel active power filter (APF) with predictive reference current calculation and model based predictive current control. The novel estimator and predictor of grid emf is proposed for AC voltage sensorless operation of APF, regardless of distortion of this voltage. Proposed control system provides control of APF current with high precision and dynamics limited only by filter circuit parameters....

Full text available for download from an external service

Planning optimised multi-tasking operations under the capability for parallel machining

Publication

- JOURNAL OF MANUFACTURING SYSTEMS - Year 2021

The advent of advanced multi-tasking machines (MTMs) in the metalworking industry has provided the opportunity for more efficient parallel machining as compared to traditional sequential processing. It entailed the need for developing appropriate reasoning schemes for efficient process planning to take advantage of machining capabilities inherent in these machines. This paper addresses an adequate methodical approach for a non-linear...

Full text available for download on the portal

Performance evaluation of the parallel object tracking algorithm employing the particle filter

Publication

G. Szwoch

- Year 2016

Full text available for download from an external service

Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream

Publication

G. Szwoch

- Communications in Computer and Information Science - Year 2011

A background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....

Full text available for download from an external service

Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications

Publication

P. Czarnul

- Electronics - Year 2021

The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

Full text available for download on the portal

Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines: Complexity and Algorithms

Publication

T. Pikies
K. Turowski
M. Kubale

- ARTIFICIAL INTELLIGENCE - Year 2022

In this paper, the problem of scheduling on parallel machines with a presence of incompatibilities between jobs is considered. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. The paper provides several results concerning schedules, optimal or approximate with respect to the two most popular criteria of optimality:...

Full text available for download from an external service

Effective methods for functional conformance testing of parallel and distributed programming libraries.

Publication

Ł. Garstecki

- Year 2004

The dissertation presents a complete methodology for creating Conformance Test Suites for programming languages, libraries, and APIs, with particular emphasis on parallel and distributed programming languages and libraries. The author began the research in the field of conformance testing for parallel and distributed programming libraries, but the Consecutive Confinements Method (CoCoM), created by the author,...

Towards Efficient Parallel Image Processing on Cluster Grids Using GIMP.

Publication

- Year 2004

Because few users have the knowledge needed to use low-level parallel programming libraries to speed up programs that operate on images, we propose a plugin for the well-known GIMP application that enables pipelined execution of a series of filters on images loaded by the plugin. We present implementation details, test scenarios, and results on clusters, potentially...

Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi

Publication

A. Malinowski

- International Journal of Information Technology and Computer Science - Year 2015

Parallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator:...

Full text available for download from an external service

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

Publication

- Year 2014

The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Full text available for download from an external service

Molecular Diffusion Simulation on ARUZ – Massively-parallel FPGA-based Machine

Publication

R. Kielbik
K. Halagan
K. Rudnicki
P. Polanowski
G. Jablonski
J. Jung

- Year 2021

Full text available for download from an external service

Scheduling with precedence constraints: mixed graph coloring in series-parallel graphs.

Publication

H. Furmańczyk
A. Kosowski
P. Żyliński

- Year 2008

The paper considers the mixed graph coloring problem, which models task scheduling in which the timing dependencies between tasks take the form of a partial order or mutual exclusion. For the case in which the dependency graph is series-parallel, an algorithm is given that solves the problem optimally in $O(n^{3.376} \log n)$ time.

Full text available for download from an external service

Computer experiments with a parallel clonal selection algorithm for the graph coloring problem

Publication

- Year 2008

Artificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...

Full text available for download from an external service

Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms

Publication

G. Szwoch

- Year 2014

Implementation of the background subtraction algorithm using OpenCL platform is presented. The algorithm processes live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...

Full text available for download from an external service

Performance evaluation of the parallel object tracking algorithm employing the particle filter

Publication

G. Szwoch

- Year 2016

An algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...

A New Approach for the Mitigating of Flow Maldistribution in Parallel Microchannel Heat Sink

Publication

K. Ritunesh
G. Singh
D. Mikielewicz

- JOURNAL OF HEAT TRANSFER-TRANSACTIONS OF THE ASME - Year 2018

The problem of flow maldistribution is very critical in microchannel heat sinks (MCHS). It induces temperature nonuniformity, which may ultimately lead to the breakdown of associated system. In the present communication, a novel approach for the mitigation of flow maldistribution problem in parallel MCHS has been proposed using variable width microchannels. Numerical simulation of copper made parallel MCHS consisting of 25 channels...

Full text available for download from an external service

A Workflow Application for Parallel Processing of Big Data from an Internet Portal

Publication

P. Czarnul

- Year 2014

The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...

Full text available for download from an external service

Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology

Publication

- Year 2016

The discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...

Full text available for download from an external service

Experimental Research on the Energy Efficiency of a Parallel Hybrid Drive for an Inland Ship

Publication

- ENERGIES - Year 2019

The growing requirements for limiting the negative impact of all modes of transport on the natural environment mean that clean technologies are becoming more and more important. The global trend of e-mobility also applies to sea and inland water transport. This article presents the results of experimental tests carried out on a life-size, parallel diesel-electric hybrid propulsion system. The efficiency of the propulsion system was...

Full text available for download on the portal

Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units

Publication

J. Porter-Sobieraj
S. Cygert
K. Daniel
J. Sikorski
M. Słodkowski

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2015

This paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...

Full text available for download from an external service

Parallel in vitro and in silico investigations into anti-inflammatory effects of non-prenylated stilbenoids

Publication

V. Leláková
K. Šmejkal
K. Jakubczyk
O. Veselý
P. Landa
J. Václavík
P. Bobáľ
H. Pížová
V. Temml
T. Steinacher... and 4 others

- Food Chemistry - Year 2019

Full text available for download from an external service

Generating reliable conformance test suites for parallel and distributed languages, libraries, and APIs.

Publication

Ł. Garstecki

- Year 2004

The article outlines a new methodology for creating Conformance Test Suites (CTS) for parallel and distributed programming languages, libraries, and APIs. The author began the research in conformance testing with the data-driven parallel language Athapascan, developed a methodology for designing and analysing CTSs called the Consecutive Confinements Method (CoCoM), created the CTS Designer tool,...

New user-guided and ckpt-based checkpointing libraries for parallel MPI applications

Publication

- Year 2005

The paper presents design and implementation details as well as performance results for two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API that uses MPI constants. It uses MPI-2 I/O functions or a dedicated master process...

From the Dynamic Lattice Liquid Algorithm to the Dedicated Parallel Computer – mDLL Machine

Publication

J. Jung
R. Kiełbik
K. Rudnicki
K. Hałagan
P. Polanowski
A. Sikorski

- Computational Methods in Science and Technology - Year 2018

Full text available for download from an external service

Makespan minimization of multi-slot just-in-time scheduling on single and parallel machines

Publication

D. Dereniowski
W. Kubiak

- JOURNAL OF SCHEDULING - Year 2010

The article addresses a job scheduling problem in which time is divided into slots of equal length and each job has a fixed length and a completion time that is relative to the end of a slot. Finding a schedule consists of assigning jobs to individual slots; in general, the length of a job may force a situation in which the job is executed not only in the slot in which...

Full text available for download from an external service

Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications

Publication

- JOURNAL OF SUPERCOMPUTING - Year 2017

The aim of this paper is to evaluate performance of new CUDA mechanisms—unified memory and dynamic parallelism for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...

Full text available for download on the portal

A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

Publication

- Scalable Computing: Practice and Experience - Year 2018

The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available for download on the portal

MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems

Publication

- SIMULATION MODELLING PRACTICE AND THEORY - Year 2017

In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...

Full text available for download on the portal

Performance/energy aware optimization of parallel applications on GPUs under power capping

Publication

- Year 2020

In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to the benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

Full text available for download on the portal

A Parallel Corpus-Based Approach to the Crime Event Extraction for Low-Resource Languages

Publication

N. Khairova
O. Mamyrbayev
N. Rizun
M. Razno
G. Ybytayeva

- IEEE Access - Year 2023

These days, a lot of crime-related events take place all over the world. Most of them are reported in news portals and social media. Crime-related event extraction from the published texts can allow monitoring, analysis, and comparison of police or criminal activities in different countries or regions. Existing approaches to event extraction mainly suggest processing texts in English, French, Chinese, and some other resource-rich...

Full text available for download on the portal

High power, zero ripples active filtering system with power modules operating in parallel

Publication

D. Wojciechowski
R. Strzelecki

- Year 2010

Full text available for download from an external service

Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins

Publication

A. Sieradzan
J. Sans‐Duñó
E. Lubecka
C. Czaplewski
A. Lipska
H. Leszczyński
K. Ocetkiewicz
J. Proficz
P. Czarnul
H. Krawczyk
A. Liwo

- JOURNAL OF COMPUTATIONAL CHEMISTRY - Year 2023

We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...

Full text available for download on the portal

Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors

Publication

P. Czarnul

- Year 2018

The paper presents results from benchmarking the parallel multithreaded Stockfish chess engine on selected multi- and many-core processors. It is shown how the strength of play for an n-thread version compares to 1-thread version on both Intel Xeon and latest Intel Xeon Phi x200 processors. Results such as the number of wins, losses and draws are presented and how these change for growing numbers of threads. Impact of using particular...

Full text available for download from an external service

Efficient parallel algorithms in global optimization of potential energy functions for peptides, proteins, and crystals

Publication

J. Lee
J. Pillardy
C. Czaplewski
Y. Arnautova
D. Ripoll
A. Liwo
K. Gibson
R. Wawak
H. Scheraga

- COMPUTER PHYSICS COMMUNICATIONS - Year 2000

Full text available for download from an external service

Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.

Publication

- Year 2004

The paper investigates and simulates electrophysiological phenomena in the myocardium using parallel MPI-based software developed for this purpose. Code improvements leading to good scalability were implemented and examined, and performance tests were carried out on the latest 32- and 64-bit Linux clusters. The work is an attempt at a parallel implementation of a known approach...

Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing

Publication

- Year 2005

In the article we present design and implementation details of our modular MPI-based simulation code, including overlapping of computation and communication. We emphasise the modularity of our implementation, which allows the code to be adapted easily to other applications. We present the relationship between the computational speed-up and the size and shapes of three-dimensional domains with various ratios of the number of nodes updated by a processor...

Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems

Publication

- Scientific Programming - Year 2020

This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...

Full text available for download on the portal

ARUZ — Large-scale, massively parallel FPGA-based analyzer of real complex systems

Publication

R. Kiełbik
K. Hałagan
W. Zatorski
J. Jung
J. Ulański
A. Napieralski
K. Rudnicki
P. Amrozik
G. Jabłoński
D. Stożek... and 4 others

- COMPUTER PHYSICS COMMUNICATIONS - Year 2018

Full text available for download from an external service

Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform

Publication

- Year 2017

Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...

Full text available for download from an external service

Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform

Publication

- Journal of Real-Time Image Processing - Year 2016

Results of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method, that together provide satisfactory accuracy and efficiency of a real-time processing of high resolution camera images, maintaining the cost of resources usage at a reasonable level. Two...

Full text available for download on the portal

Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system

Publication

T. Stefański

- Progress in Electromagnetics Research-PIER - Year 2013

This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text available for download from an external service

Multi-agent large-scale parallel crowd simulation with NVRAM-based distributed cache

Publication

- Journal of Computational Science - Year 2019

This paper presents the architecture, main components and performance results for a parallel and modular agent-based environment aimed at crowd simulation. The environment allows to simulate thousands or more agents on maps of square kilometers or more, features a modular design and incorporates non-volatile RAM (NVRAM) with a fail-safe mode that can be activated to allow to continue computations from a recently analyzed state in...

Full text available for download from an external service

Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware

Publication

- Applied Sciences-Basel - Year 2022

In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available for download on the portal

Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems

Publication

P. Rościszewski

- Year 2014

Rapid development of diverse computer architectures and hardware accelerators caused that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available for download from an external service

Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming

Publication

- COMPUTER JOURNAL - Year 2021

In the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...

Full text available for download on the portal
