Search results for: PARALLELIZATION

Parallelization Method for a Continuous Property

Publication

P. Pilarczyk

- FOUNDATIONS OF COMPUTATIONAL MATHEMATICS - Year 2010

Full text available for download from an external service

Parallelization of video stream algorithms in Kaskada platform

Publication

A. Brzeski

- Year 2011

The purpose of this work is to present different techniques of video stream algorithms parallelization provided by the Kaskada platform - a novel system working in a supercomputer environment designated for multimedia streams processing. Considered parallelization methods include frame-level concurrency, multithreading and pipeline processing. Execution performance was measured on four time-consuming image recognition algorithms,...

Parallelization of Compute Intensive Applications into Workflows based on Services in BeesyCluster

Publication

P. Czarnul

- Year 2011

The paper presents an approach for modeling, optimization and execution of workflow applications based on services that incorporates both service selection and partitioning of input data for parallel processing by parallel workflow paths. A compute-intensive workflow application for parallel integration is presented. An impact of the input data partitioning on the scalability is presented. The paper shows a comparison of the theoretical...

Full text available for download in the portal

Parallelization of large vector similarity computations in a hybrid CPU+GPU environment

Publication

P. Czarnul

- JOURNAL OF SUPERCOMPUTING - Year 2018

The paper presents design, implementation and tuning of a hybrid parallel OpenMP+CUDA code for computation of similarity between pairs of a large number of multidimensional vectors. The problem has a wide range of applications, and consequently its optimization is of high importance, especially on currently widespread hybrid CPU+GPU systems targeted in the paper. The following are presented and tested for computation of all vector...

Full text available for download in the portal

Programming, tuning and automatic parallelization of irregular divide and conquer applications in DAMPVM/DAC.

Publication

P. Czarnul

- INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS - Year 2003

The article presents a new object-oriented programming template, DAMPVM/DAC, which was implemented using the DAMPVM system and enables automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime.

Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment

Publication

- Advances in Intelligent Systems and Computing - Year 2017

In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Full text available for download in the portal

Mixed electromagnetic - circuits modeling and parallelization for rigorous characterization of cosite interference in wireless communication channels. In: UGC 2002 Homepage [online]. Department of Defense High Performance Computing Modernization Program. Users Group Conference 2002. Austin, Texas, USA. June 10-14, 2002. [Accessed: 15 December 2002]. Available on the World Wide Web: http://www.hpcmo.hpc.mil/Htdocs/UGC/UGC02/paper/ [45 slides]. Modeling of electromagnetic systems and parallelization for determining mutual interactions in wireless communication channels.

Publication

C. D. Sarris
P. Czarnul

- Year 2002

Parallel operation of neighboring transceiver modules typically leads to side effects caused by mutual interactions, which degrade network performance. To characterize such effects, a time-domain solution of Maxwell's equations with modeling of nonlinear effects is presented.

From Sequential to Parallel Implementation of NLP Using the Actor Model

Publication

- Advances in Intelligent Systems and Computing - Year 2018

The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

Full text available for download in the portal

GPU-accelerated finite element method

Publication

- Year 2016

In this paper the results of the acceleration of computations involved in analysing electromagnetic problems by means of the finite element method (FEM), obtained with graphics processors (GPU), are presented. A 4.7-fold acceleration was achieved thanks to the massive parallelization of the most time-consuming steps of FEM, namely finite-element matrix-generation and the solution of a sparse system of linear equations with the...

Full text available for download from an external service

KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs

Publication

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016

The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text available for download from an external service

Parallel tabu search for graph coloring problem

Publication

- Year 2006

Tabu search is a simple, yet powerful meta-heuristic based on local search that has been often used to solve combinatorial optimization problems like the graph coloring problem. This paper presents a current taxonomy of parallel tabu search algorithms and compares three parallelization techniques applied to Tabucol, a sequential TS algorithm for graph coloring. The experimental results are based on graphs available from the DIMACS...

Parallel Programming for Modern High Performance Computing Systems

Publication

P. Czarnul

- Year 2018

In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text available for download from an external service

Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors

Publication

P. Czarnul

- INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING - Year 2016

The paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...

Full text available for download in the portal

Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs

Publication

- Year 2014

The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...

Full text available for download from an external service

EFFICIENT LINE DETECTION METHOD BASED ON 2D CONVOLUTION FILTER

Publication

- Informatyka, Automatyka, Pomiary w Gospodarce i Ochronie Środowiska - Year 2021

The article proposes an efficient line detection method using a 2D convolution filter. The proposed method was compared with the Hough transform, the most popular method of straight-line detection. The developed method is suitable for local detection of straight lines with a slope from -45° to 45°. It can also be used to detect curves whose shape is approximated with short straight sections. The new method is characterized...

Full text available for download in the portal

FPGA Acceleration of Matrix-Assembly Phase of RWG-Based MoM

Publication

T. Topa
A. Noga
T. Stefański

- IEEE Antennas and Wireless Propagation Letters - Year 2022

In this letter, the field-programmable-gate-array accelerated implementation of matrix-assembly phase of the method of moments (MoM) is presented. The solution is based on a discretization of the frequency-domain mixed potential integral equation using the Rao-Wilton-Glisson basis functions and their extension to wire-to-surface junctions. To take advantage of the given hardware resources (i.e., Xilinx Alveo U200 accelerator card),...

Full text available for download in the portal

Computationally Efficient Solution of a 2D Diffusive Wave Equation Used for Flood Inundation Problems

Publication

- Water - Year 2019

This paper presents a study dealing with increasing the computational efficiency in modeling floodplain inundation using a two-dimensional diffusive wave equation. To this end, the domain decomposition technique was used. The resulting one-dimensional diffusion equations were approximated in space with the modified finite element scheme, whereas time integration was carried out using the implicit two-level scheme. The proposed...

Full text available for download in the portal

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

Publication

- Year 2014

The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Full text available for download from an external service

Parallel implementation of a Sailing Assistance Application in a Cloud Environment

Publication

- IEEE Access - Year 2023

Sailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...

Full text available for download in the portal

A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems

Publication

P. Czarnul

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2023

In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls make it possible to manage CUDA objects and data and to launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...

Full text available for download in the portal

Behavior Analysis and Dynamic Crowd Management in Video Surveillance System

Publication

- Year 2011

A concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. The optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...

Full text available for download from an external service

Real-Time Gastrointestinal Tract Video Analysis on a Cluster Supercomputer

Publication

- Year 2012

The article presents a novel approach to medical video data analysis and recognition. Emphasis has been put on adapting existing algorithms detecting lesions and bleedings for real time usage in a medical doctor's office during an endoscopic examination. A system for diagnosis recommendation and disease detection has been designed taking into account the limited mobility of the endoscope and the doctor's requirements. The...

Real-Time Bleeding Detection in Gastrointestinal Tract Endoscopic Examinations Video

Publication

- International Journal of Distributed and Parallel Systems - Year 2013

The article presents a novel approach to medical video data analysis and recognition of bleedings. Emphasis has been put on adapting pre-existing algorithms dedicated to the detection of bleedings for real-time usage in a medical doctor’s office during an endoscopic examination. A real-time system for analyzing endoscopic videos has been designed according to the most significant requirements of medical doctors. The main goal of...

Full text available for download in the portal

Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins

Publication

A. Sieradzan
J. Sans‐Duñó
E. Lubecka
C. Czaplewski
A. Lipska
H. Leszczyński
K. Ocetkiewicz
J. Proficz
P. Czarnul
H. Krawczyk
A. Liwo

- JOURNAL OF COMPUTATIONAL CHEMISTRY - Year 2023

We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...

Full text available for download in the portal

The need for new transport protocols on the INTERNET

Publication

- Automatyka Elektryka Zakłócenia - Year 2023

The TCP/IP protocol suite is widely used in IP networks, regardless of diverse environments and usage scenarios. Because it is the basic concept organizing the work of the Internet, it is a subject of interest and constant analysis for operators, users, network researchers, and designers. The Internet is a "living" organism in which new needs appear all the time. This is particularly important due to the emerging...

Full text available for download from an external service

Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system

Publication

- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023

In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...

Full text available for download from an external service

Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware

Publication

- Applied Sciences-Basel - Year 2022

In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available for download in the portal
