Wyniki wyszukiwania dla: parallel computing

Sensitivity of the Baltic Sea level prediction to spatial model resolution

Publikacja

M. Kowalewski

- Rok 2017

he three-dimensional hydrodynamic model of the Baltic Sea (M3D) and...

Pełny tekst do pobrania w serwisie zewnętrznym

A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

Publikacja

- Scalable Computing: Practice and Experience - Rok 2018

The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Pełny tekst do pobrania w portalu

Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams

Publikacja

P. Czarnul

- COMPUTING AND INFORMATICS - Rok 2020

The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...

Pełny tekst do pobrania w portalu

DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing

Publikacja

- SOFTWARE-PRACTICE & EXPERIENCE - Rok 2022

In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms—linear search (LS) and golden section search (GSS), finds...

Pełny tekst do pobrania w serwisie zewnętrznym

DATABASE AND BIGDATA PROCESSING SYSTEM FOR ANALYSIS OF AIS MESSAGES IN THE NETBALTIC RESEARCH PROJECT

Publikacja

M. Lewczuk
P. Cichocki
J. Woźniak

- TASK Quarterly - Rok 2017

A specialized database and a software tool for graphical and numerical presentation of maritime measurement results has been designed and implemented as part of the research conducted under the netBaltic project (Internet over the Baltic Sea – the implementation of a multi-system, self-organizing broadband communications network over the sea for enhancing navigation safety through the development of e-navigation services.) The...

Pełny tekst do pobrania w portalu

Processing of Satellite Data in the Cloud

Publikacja

- TASK Quarterly - Rok 2017

The dynamic development of digital technologies, especially those dedicated to devices generating large data streams, such as all kinds of measurement equipment (temperature and humidity sensors, cameras, radio-telescopes and satellites – Internet of Things) enables more in-depth analysis of the surrounding reality, including better understanding of various natural phenomenon, starting from atomic level reactions, through macroscopic...

Pełny tekst do pobrania w portalu

Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors

Publikacja

P. Czarnul

- INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING - Rok 2016

The paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...

Pełny tekst do pobrania w portalu

Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment

Publikacja

- Advances in Intelligent Systems and Computing - Rok 2017

In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Pełny tekst do pobrania w portalu

Theory and implementation of a virtualisation level Future Internet defence in depth architecture

Publikacja

J. Konorski
P. Pacyna
G. Kolaczek
Z. Kotulski
K. Cabaj
P. Szalachowski

- International Journal of Trust Management in Computing and Communications - Rok 2013

An EU Future Internet Engineering project currently underway in Poland defines three parallel internets (PIs). The emerging IIP system (IIPS, abbreviating the project’s Polish name), has a four-level architecture, with level 2 responsible for creation of virtual resources of the PIs. This paper proposes a three-tier security architecture to address level 2 threats of unauthorised traffic injection and IIPS traffic manipulation...

Pełny tekst do pobrania w serwisie zewnętrznym

A Task-Scheduling Approach for Efficient Sparse Symmetric Matrix-Vector Multiplication on a GPU

Publikacja

- SIAM JOURNAL ON SCIENTIFIC COMPUTING - Rok 2015

In this paper, a task-scheduling approach to efficiently calculating sparse symmetric matrix-vector products and designed to run on Graphics Processing Units (GPUs) is presented. The main premise is that, for many sparse symmetric matrices occurring in common applications, it is possible to obtain significant reductions in memory usage and improvements in performance when the matrix is prepared in certain ways prior to computation....

Pełny tekst do pobrania w serwisie zewnętrznym

Modelling and simulation of GPU processing in the MERPSYS environment

Publikacja

- Scalable Computing: Practice and Experience - Rok 2018

In this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...

Pełny tekst do pobrania w portalu

Process arrival pattern aware algorithms for acceleration of scatter and gather operations

Publikacja

J. Proficz

- Cluster Computing-The Journal of Networks Software Tools and Applications - Rok 2020

Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...

Pełny tekst do pobrania w portalu

GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM

Publikacja

- Communications in Computational Physics - Rok 2017

This paper presents a GPU-accelerated implementation of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method with an inexact nullspace filtering approach to find eigenvalues in electromagnetics analysis with higherorder FEM. The performance of the proposed approach is verified using the Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of the implementation based on functions from...

Pełny tekst do pobrania w serwisie zewnętrznym

Communication and Load Balancing Optimization for Finite Element Electromagnetic Simulations Using Multi-GPU Workstation

Publikacja

- IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES - Rok 2017

This paper considers a method for accelerating finite-element simulations of electromagnetic problems on a workstation using graphics processing units (GPUs). The focus is on finite-element formulations using higher order elements and tetrahedral meshes that lead to sparse matrices too large to be dealt with on a typical workstation using direct methods. We discuss the problem of rapid matrix generation and assembly, as well as...

Pełny tekst do pobrania w serwisie zewnętrznym

Performance evaluation of parallel background subtraction on GPU platforms

Publikacja

G. Szwoch

- Elektronika : konstrukcje, technologie, zastosowania - Rok 2015

Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...

Pełny tekst do pobrania w serwisie zewnętrznym

Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs

Publikacja

- Rok 2014

The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns to par- ticular devices for processing using OpenCL kernels defined by the user. The sys- tem is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...

Pełny tekst do pobrania w serwisie zewnętrznym

Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming

Publikacja

- COMPUTER JOURNAL - Rok 2021

In the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...

Pełny tekst do pobrania w portalu

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

Publikacja

- Rok 2014

The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Pełny tekst do pobrania w serwisie zewnętrznym

GPU based implementation of Temperature-Vegetation Dryness Index for AVHRR3 Satellite Data

Publikacja

- Rok 2014

Paper presents an implementation of TVDI (Temperature-Vegetation-Dryness Index) algorithm on GPU (Graphics Processing Unit). Calculation of this index is based on LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index). Discussed results are based on multi-spectral imagery retrieved from AVHRR3 sensors for area of Poland. All phases of TVDI implementation on GPU are modified in respect to CUDA platform....

Evaluation of multimedia applications in a cluster oriented environment

Publikacja

- Metrology and Measurement Systems - Rok 2012

In the age of Information and Communication Technology (ICT), Web and the Internet have changed significantly the way applications are developed, deployed and used. One of recent trends is modern design of web-applications based on SOA. This process is based on the composition of existing web services into a single scenario from the point of view of a particular user or client. This allows IT companies to shorten product time-to-market....

Pełny tekst do pobrania w portalu

GPU-accelerated finite element method

Publikacja

- Rok 2016

In this paper the results of the acceleration of computations involved in analysing electromagnetic problems by means of the finite element method (FEM), obtained with graphics processors (GPU), are presented. A 4.7-fold acceleration was achieved thanks to the massive parallelization of the most time-consuming steps of FEM, namely finite-element matrix-generation and the solution of a sparse system of linear equations with the...

Pełny tekst do pobrania w serwisie zewnętrznym

Modeling energy consumption of parallel applications

Publikacja

- Annals of Computer Science and Information Systems - Rok 2016

The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. Model of an application uses the Java language with extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...

Pełny tekst do pobrania w portalu

Filtry

Katalog

Kategoria

Rok

Opcje

Sensitivity of the Baltic Sea level prediction to spatial model resolution

A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams

DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing

DATABASE AND BIGDATA PROCESSING SYSTEM FOR ANALYSIS OF AIS MESSAGES IN THE NETBALTIC RESEARCH PROJECT

Processing of Satellite Data in the Cloud

Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors

Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment

Theory and implementation of a virtualisation level Future Internet defence in depth architecture

A Task-Scheduling Approach for Efficient Sparse Symmetric Matrix-Vector Multiplication on a GPU

Modelling and simulation of GPU processing in the MERPSYS environment

Process arrival pattern aware algorithms for acceleration of scatter and gather operations

GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM

Communication and Load Balancing Optimization for Finite Element Electromagnetic Simulations Using Multi-GPU Workstation

Performance evaluation of parallel background subtraction on GPU platforms

Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs

Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

GPU based implementation of Temperature-Vegetation Dryness Index for AVHRR3 Satellite Data

Evaluation of multimedia applications in a cluster oriented environment

GPU-accelerated finite element method

Modeling energy consumption of parallel applications

Wyszukiwarka

Filtry

Katalog

Kategoria

Rok

Opcje

Wyniki wyszukiwania dla: parallel computing