Search results for: parallel mpi i/o extension

A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

Publication

- Scalable Computing: Practice and Experience - Year 2018

The paper presents a new approach to parallel image processing using byte-addressable non-volatile memory (NVRAM). We show that our custom-built MPI I/O implementation of selected functions, which uses a distributed cache incorporating NVRAMs located in cluster nodes, can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available to download

A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache

Publication

A. Malinowski
P. Czarnul
P. Dorożyński
K. Czuryło
Ł. Dorau
M. Maciejewski
P. Skowron

- Annals of Computer Science and Information Systems - Year 2016

While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

Full text available to download

A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications

Publication

A. Malinowski
P. Czarnul
M. Maciejewski
P. Skowron

- Year 2016

The paper presents a fail-safe NVRAM based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed in it. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM based cache like mechanism to significantly speed up I/O operations and allow to preload...

Full text to download in external service

Checkpointing of Parallel MPI Applications using MPI One-sided API with Support for Byte-addressable Non-volatile RAM

Publication

P. Dorożyński
P. Czarnul
A. Malinowski
K. Czuryło
Ł. Dorau
M. Maciejewski
P. Skowron

- Year 2016

The increasing size of computational clusters results in an increasing probability of failures, which in turn requires application checkpointing in order to survive those failures. Traditional checkpointing requires data to be copied from application memory into a persistent storage medium, which increases application execution time as it is usually done in a separate step. In this paper we propose to use emerging byte-addressable...

Full text to download in external service

New user-guided and ckpt-based checkpointing libraries for parallel MPI applications

Publication

- Year 2005

The paper presents design and implementation details as well as performance results of two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API based on MPI constants. It uses MPI-2 I/O functions or a dedicated master process...

Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware

Publication

- Applied Sciences-Basel - Year 2022

In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available to download

Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing

Publication

- Year 2005

In this article we present design and implementation details of our modular MPI-based simulation code, including overlapping of computation and communication. We emphasize the modularity of our implementation, which allows the code to be adapted easily to other applications. We present the relationship between computational speed-up and the size and shape of three-dimensional domains with various ratios of the number of nodes updated by a processor...

Multi-agent large-scale parallel crowd simulation

Publication

A. Malinowski
P. Czarnul
K. Czuryło
M. Maciejewski
P. Skowron

- Year 2017

This paper presents design, implementation and performance results of a new modular, parallel, agent-based and large scale crowd simulation environment. A parallel application, written in C with MPI, was run in this parallel environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland and further in the area of districts of Gdansk. The application uses a...

Full text to download in external service

Parallelization of Compute Intensive Applications into Workflows based on Services in BeesyCluster

Publication

P. Czarnul

- Year 2011

The paper presents an approach for modeling, optimization and execution of workflow applications based on services that incorporates both service selection and partitioning of input data for parallel processing by parallel workflow paths. A compute-intensive workflow application for parallel integration is presented. An impact of the input data partitioning on the scalability is presented. The paper shows a comparison of the theoretical...

Full text to download in external service

Simulation of parallel similarity measure computations for large data sets

Publication

- Year 2015

The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...

Full text to download in external service

Performance and Power-Aware Modeling of MPI Applications for Cluster Computing

Publication

- Year 2016

The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...

Full text available to download

NVRAM as Main Storage of Parallel File System

Publication

A. Malinowski

- Journal of Computer Science and Control Systems - Year 2016

The main problem of modern cluster environments used to be a lack of computational power provided by CPUs and GPUs, but recently they have suffered more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents evaluation of using non-volatile random-access memory as a...

Full text to download in external service

An innovative method of measuring the extension of the piston rod in hydraulic cylinders, especially large ones used in the shipbuilding and offshore industry

Publication

- Polish Maritime Research - Year 2022

The article presents the results of selected works related to the wider subject of research conducted at the Faculty of Mechanical Engineering and Shipbuilding at the Gdańsk University of Technology, regarding designing various on-board devices with hydraulic drive for ships and other offshore facilities. Among the commonly used mechanisms of this kind are hydraulic actuators with measurement of the piston rod extension. The issue...

Full text available to download

Modeling energy consumption of parallel applications

Publication

- Annals of Computer Science and Information Systems - Year 2016

The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. The model of an application uses the Java language with an extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...

Full text available to download

Parallel Programming for Modern High Performance Computing Systems

Publication

P. Czarnul

- Year 2018

In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text to download in external service

BC-MPI: running an mpi application on multiple clusters with beesycluster connectivity

Publication

P. Czarnul

- Year 2007

The article proposes a new package, BC-MPI, which makes it possible to run an MPI application on multiple clusters with different MPI implementations. It uses dedicated MPI implementations for communication within clusters and the MPI_THREAD_MULTIPLE mode for communication between clusters in additional threads of the MPI application. Moreover, a BC-MPI application can be automatically compiled and launched by the BeesyCluster middleware. BeesyCluster enables...

Full text to download in external service

MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems

Publication

- SIMULATION MODELLING PRACTICE AND THEORY - Year 2017

In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...

Full text available to download

Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix

Publication

- Year 2017

In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

Full text to download in external service

Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns

Publication

J. Proficz
P. Sumionka
J. Skomiał
M. Semeniuk
K. Niedzielewski
M. Walczak

- Advances in Intelligent Systems and Computing - Year 2020

The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically-distributed compute cluster. The testbed was split into two sites: CI TASK in Gdansk University of Technology and ICM in University of Warsaw, located about 300 km from each other, both connected by a fast optical fiber Ethernet-based 100 Gbps network (900 km part of the PIONIER backbone). Each site hosted a set...

Full text available to download

Parallel multithread computing for spectroscopic analysis in optical coherence tomography

Publication

- Year 2014

Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text to download in external service

Process arrival pattern aware algorithms for acceleration of scatter and gather operations

Publication

J. Proficz

- Cluster Computing-The Journal of Networks Software Tools and Applications - Year 2020

Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...

Full text available to download

Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment

Publication

- Advances in Intelligent Systems and Computing - Year 2017

In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Full text available to download

Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

Publication

- Year 2017

In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text to download in external service

KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs

Publication

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016

The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text to download in external service

Two Stage SVM and kNN Text Documents Classifier

Publication

- Year 2015

The paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...

Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns

Publication

- JOURNAL OF SUPERCOMPUTING - Year 2021

The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented on and improved. The comments concern handling certain edge cases in the original pseudocode and description, i.e., adding another state of a process, improved cache friendliness, more precise complexity estimations and some other issues improving the robustness of the algorithm implementation....

Full text available to download

A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems

Publication

P. Czarnul

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2023

In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls make it possible to manage CUDA objects and data and to launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...

Full text to download in external service

Use of ICT infrastructure for teaching HPC

Publication

- Year 2019

In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

Full text to download in external service

Improving Effectiveness of SVM Classifier for Large Scale Data

Publication

- Year 2015

The paper presents our approach to SVM implementation in a parallel environment. We describe how the classification learning and prediction phases were parallelised. We also propose a method for limiting the number of necessary computations during classifier construction. Our method, named one-vs-near, is an extension of the typical one-vs-all approach that is used for binary classifiers to work with multiclass problems. We perform experiments...

Full text to download in external service

Locally Adaptive Cooperative Kalman Smoothing and Its Application to Identification of Nonstationary Stochastic Systems

Publication

M. Niedźwiecki

- IEEE TRANSACTIONS ON SIGNAL PROCESSING - Year 2012

One of the central problems of the stochastic approximation theory is the proper adjustment of the smoothing algorithm to the unknown, and possibly time-varying, rate and mode of variation of the estimated signals/parameters. In this paper we propose a novel locally adaptive parallel estimation scheme which can be used to solve the problem of fixed-interval Kalman smoothing in the presence of model uncertainty. The proposed solution...

Full text to download in external service

Variable-structure algorithm for identification of quasi-periodically varying systems

Publication

- Year 2008

The paper presents a variable-structure version of a generalized notch filtering (GANF) algorithm. Generalized notch filters are used for identification of quasi-periodically varying dynamic systems and can be considered an extension, to the system case, of classical adaptive notch filters. The proposed algorithm is a cascade of two GANF filters: a multiple-frequency "precise" filter bank, used for precise system tracking, and a...

FPGA Acceleration of Matrix-Assembly Phase of RWG-Based MoM

Publication

T. Topa
A. Noga
T. Stefański

- IEEE Antennas and Wireless Propagation Letters - Year 2022

In this letter, the field-programmable-gate-array accelerated implementation of matrix-assembly phase of the method of moments (MoM) is presented. The solution is based on a discretization of the frequency-domain mixed potential integral equation using the Rao-Wilton-Glisson basis functions and their extension to wire-to-surface junctions. To take advantage of the given hardware resources (i.e., Xilinx Alveo U200 accelerator card),...

Full text to download in external service

A self-optimization mechanism for generalized adaptive notch smoother

Publication

M. Meller

- SIGNAL PROCESSING - Year 2016

Tracking of nonstationary narrowband signals is often accomplished using algorithms called adaptive notch filters (ANFs). Generalized adaptive notch smoothers (GANSs) extend the concepts of adaptive notch filtering in two directions. Firstly, they are designed to estimate coefficients of nonstationary quasi-periodic systems, rather than signals. Secondly, they employ noncausal processing, which greatly improves their accuracy and...

Full text to download in external service

All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns

Publication

J. Proficz

- ACM Transactions on Architecture and Code Optimization - Year 2021

Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...

Full text available to download

Mathematical modelling of two-step nitrification-denitrification for treatment of sludge digester liquors: influence of nitrite (NO2-N) on the process kinetics

Publication

- Year 2014

Separate treatment of the sludge digester liquors is an alternative for expansion of the mainstream treatment line. In order to reduce the oxygen demand for nitrification and organic carbon demand for denitrification, a shortcut in the nitrogen conversion pathway has been promoted in recent years, i.e. nitrification-denitrification via NO2-N instead of NO3-N. Although NO2-N is a common intermediate product of nitrification and...

Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card

Publication

T. Topa
A. Noga
T. Stefański

- Year 2023

In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...

Full text to download in external service

Application of mechanistic and data-driven models for nitrogen removal in wastewater treatment systems

Publication

M. J. Mehrani

- Year 2022

In this dissertation, the application of mechanistic and data-driven models in nitrogen removal systems including nitrification and deammonification processes was evaluated. In particular, the parameters influencing Nitrospira activity were assessed using response surface methodology (RSM). Various long-term biomass washout experiments were operated in two parallel sequencing batch reactors (SBRs) with a different...

Full text available to download

Object serialization and remote exception pattern for distributed C++/MPI application

Publication

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2007

MPI is a commonly used standard in the development of scientific applications. It focuses on interlanguage operability and is not well suited to object-oriented design. The paper proposes a general pattern enabling the design of distributed, object-oriented applications. It also presents sample implementations of the pattern and performance tests.

Towards Easy-to-Use Checkpointing of MPI Applications within CLUSTERIX.

Publication

- Year 2004

The literature describes many libraries and systems, at both the kernel and the user level, that support saving and restoring process state. For parallel applications, however, this remains a difficult task. The paper presents our programmer-assisted approach to saving and restoring the state of MPI applications, which will be used in the environment of the CLUSTERIX project, i.e. an integrated group of clusters...

An extension of the method of quasilinearization

Publication

T. Jankowski

- ARCHIV DER MATHEMATIK - Year 2003

The method of quasilinearization is applied to initial value problems in which the right-hand side can be represented by a nonlinear "extension" function, under certain regularity assumptions on that function. It is shown that suitably constructed monotone sequences converge quadratically to the solution of the problem. The paper generalizes the corresponding results for the case where the right-hand side is a sum of functions that are concave and convex with respect to the last argument.

An extension to the FEEDB Multimodal Database of Facial Expressions and Emotions

Publication

M. Szwoch
L. Marco-Gimenez
M. Arevalillo-Herráez
A. Ayesh

- Year 2015

FEEDB is a multimodal database that contains recordings of people expressing different emotions, captured by using a Microsoft Kinect sensor. Data were originally provided in the device’s proprietary format (XED), requiring both the Microsoft Kinect Studio application and a Kinect sensor attached to the system to use the files. In this paper, we present an extension of the database. For a selection of recordings, we also provide...

Full text to download in external service

Modeling Parallel Applications in the MERPSYS Environment

Publication

P. Czarnul

- Year 2016

The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....

Efektywna warstwa pośrednicząca dla obliczeń typu master-slave w środowisku C++/MPI [An efficient middleware layer for master-slave computations in a C++/MPI environment]

Publication

K. Bańczyk

- Year 2006

It is shown how, for a high-performance algorithm written in C++ in the master-slave model and satisfying certain constraints, one can write and use a communication layer that completely separates the code responsible for communication from the code responsible for the problem domain. A specification is given of the requirements that a hypothetical distributed system and the communication layer should meet, as well as the requirements...

The evaluation of eGlasses eye tracking module as an extension for Scratch

Publication

- Year 2016

In this paper we present the possibility of using the eGlasses eye tracking module as an extension for the Scratch programming tool, a visual programming language that supports learning computer skills. The main concept behind this project is to set up the interface for rapid interaction design. Eye tracking is a powerful tool for hands-free communication, but it requires dedicated software. This software is rarely tailored...

Full text to download in external service

Three levels of fail-safe mode in MPI I/O NVRAM distributed cache

Publication

- Procedia Computer Science - Year 2018

The paper presents architecture and design of three versions for fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering of write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...

Full text available to download

Parallel immune system for graph coloring

Publication

J. Dąbrowski

- Year 2008

This paper presents a parallel artificial immune system designed for graph coloring. The algorithm is based on the clonal selection principle. Each processor operates on its own pool of antibodies and a migration mechanism is used to allow processors to exchange information. Experimental results show that migration improves the performance of the algorithm. The experiments were performed using a high performance cluster on a set...

Full text to download in external service

Parallel Computations of Text Similarities for Categorization Task

Publication

J. Szymański

- Year 2013

In this chapter we describe an approach to parallel computation of similarities in high-dimensional spaces. The similarity computations have been used for textual data categorization. We create test datasets from Wikipedia articles, which together with their hyper-references form a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....

Testing for conformance of parallel programming pattern languages

Publication

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2002

This paper reports on the project being run by TUG and IMAG, aimed at reducing the volume of tests required to exercise parallel programming language compilers and libraries. The idea is to use the ISO STEP standard scheme for conformance testing of software products. A detailed example illustrating the ongoing work is presented.

Bounds on the Cover Time of Parallel Rotor Walks

Publication

D. Dereniowski
A. Kosowski
D. Pająk
P. Uznański

- Year 2014

The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node maintains a cyclic ordering of its outgoing arcs, and successively propagates walkers which visit it along its outgoing arcs in...

Full text to download in external service

Sensors in River Information Services of the Odra River in Poland: Current State and Planned Extension

Publication

A. Stateczny

- Year 2017

According to the assumptions adopted in 2016 by the Polish Council of Ministers for the plans for the development of inland waterways in Poland for the years 2016-2020, with a perspective to 2030, by 2030 the Odra along its entire length and the Vistula from Warsaw to Gdansk will have become international shipping routes on which a system of River Information Services (RIS) will be implemented. Aspects of RIS sensor application...

Full text to download in external service

Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.

Publication

- Year 2004

The paper investigates and simulates electrophysiological phenomena in the myocardium using MPI-based parallel software developed for this purpose. Code improvements leading to good scalability were implemented and examined, and performance tests were carried out on the latest 32- and 64-bit Linux clusters. The work is an attempt at a parallel implementation of a known approach...

Performance evaluation of parallel background subtraction on GPU platforms

Publication

G. Szwoch

- Elektronika : konstrukcje, technologie, zastosowania - Year 2015

Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...

Full text to download in external service

Block-based Representation of Application Execution on Modern Parallel Systems

Publication

P. Czarnul

- Year 2013

The chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses previously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.

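Application of Byte-Addressable NVRAM to Increase the Performance of Selected Parallel Applications Using MPI I/O (Zastosowanie bajtowo adresowanej pamięci NVRAM do zwiększenia wydajności wybranych aplikacji równoległych wykorzystujących MPI I/O)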

Publication

A. Malinowski

- Year 2019

Currently, many studies address the growing problem of file operation performance in cluster environments. At the same time, according to recent reports on the development of computer memory technologies, byte-addressable, persistent random-access memory chips should appear on the market in the near future. This dissertation shows that such memory can be used to increase the performance of selected...

Full text available to download

Computer experiments with a parallel clonal selection algorithm for the graph coloring problem

Publication

- Year 2008

Artificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...

Full text to download in external service

Bounds on the cover time of parallel rotor walks

Publication

D. Dereniowski
A. Kosowski
D. Pająk
P. Uznański

- JOURNAL OF COMPUTER AND SYSTEM SCIENCES - Year 2016

The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node successively propagates walkers visiting it along its outgoing arcs in round-robin fashion, according to a fixed ordering. We consider...

Full text available to download

A Workflow Application for Parallel Processing of Big Data from an Internet Portal

Publication

P. Czarnul

- Year 2014

The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...

Full text to download in external service

Decentralized control of a different rated parallel UPS systems

Publication

- Year 2007

The paper presents a single-phase uninterruptible power supply (UPS) system with galvanically separated DC-AC-DC-AC converters operating in parallel. A communication system between the converters, based on the CAN physical layer, has been developed and applied, which allows decentralized master-slave control to be used, providing a high availability factor for the whole UPS system. The control system of particular converters has been developed...

Full text to download in external service

Extension management of a knowledge base migration process to IPv6

Publication

- Year 2011

There are many reasons to deploy the IPv6 protocol, with IPv4 address space depletion being the most indisputable. Unfortunately, migration to the IPv6 protocol seems slower than anticipated. To improve the pace of IPv6 deployment, the authors of the article developed two applications that support the migration process. Their main purpose is to help less experienced network administrators facilitate the migration process with a particular...

Full text to download in external service

ROLE OF AGRICULTURAL EXTENSION IN ADOPTION OF SUSTAINABLE AGRICULTURE PRACTICES

Publication

T. S.
B. Sawicka

- ANBAR JOURNAL OF AGRICULTURAL SCIENCES - Year 2023

Full text to download in external service

AffecTube — Chrome extension for YouTube video affective annotations

Publication

- SoftwareX - Year 2023

The shortage of emotion-annotated video datasets suitable for training and validating machine learning models for facial expression-based emotion recognition stems primarily from the significant effort and cost required for manual annotation. In this paper, we present AffecTube as a comprehensive solution that leverages crowdsourcing to annotate videos directly on the YouTube platform, resulting in ready-to-use emotion-annotated...

Full text available to download

Comparison of EHD devices with parallel and in series spiked electrodes

Publication

J. Podlinski
A. Berendt
J. Mizeraczyk

- Year 2012

In this paper two electrohydrodynamic (EHD) devices for gas pumping and cleaning are presented. In both cases, corona discharge was used to induce an airflow in the device. The discharge was generated between the spiked electrodes set parallel (the first case) or in series (the second case) and the plate electrodes. An asymmetric electric field and the generated discharge result in unidirectional gas flow through the EHD device....

Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi

Publication

A. Malinowski

- International Journal of Information Technology and Computer Science - Year 2015

Parallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator:...

Full text to download in external service

A distributed system for conducting chess games in parallel

Publication

- Procedia Computer Science - Year 2017

This paper proposes a distributed and scalable cloud based system designed to play chess games in parallel. Games can be played between chess engines alone or between clusters created by combined chess engines. The system has a built-in mechanism that compares engines, based on Elo ranking which finally presents the strength of each tested approach. If an approach needs more computational power, the design of the system allows...

Full text available to download

Scheduling of compatible jobs on parallel machines

Publication

T. Pikies

- Year 2021

The dissertation discusses the problems of scheduling compatible jobs on parallel machines. Some jobs are incompatible, which is modeled as a binary relation on the set of jobs; the relation is often modeled by an incompatibility graph. We consider two models of machines. The first model, more emphasized in the thesis, is a classical model of scheduling, where each machine does one job at a time. The second one is a model of p-batching...

From Sequential to Parallel Implementation of NLP Using the Actor Model

Publication

- Advances in Intelligent Systems and Computing - Year 2018

The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

Full text available to download

Parallel Cooperating A-Teams

Publication

D. Barbucha
I. Czarnowski
P. Jędrzejowicz
E. Ratajczak-Ropel
I. Wierzbowska

- Year 2011

Full text to download in external service

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

Publication

- Year 2014

The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Full text to download in external service

A New Approach for the Mitigating of Flow Maldistribution in Parallel Microchannel Heat Sink

Publication

K. Ritunesh
G. Singh
D. Mikielewicz

- JOURNAL OF HEAT TRANSFER-TRANSACTIONS OF THE ASME - Year 2018

The problem of flow maldistribution is very critical in microchannel heat sinks (MCHS). It induces temperature nonuniformity, which may ultimately lead to the breakdown of the associated system. In the present communication, a novel approach for the mitigation of the flow maldistribution problem in parallel MCHS has been proposed using variable width microchannels. Numerical simulation of copper made parallel MCHS consisting of 25 channels...

Full text to download in external service

Parallel implementation of a Sailing Assistance Application in a Cloud Environment

Publication

- IEEE Access - Year 2023

Sailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...

Full text available to download

Sensorless predictive control of three-phase parallel active filter

Publication

- Year 2007

The paper presents the control system of a parallel active power filter (APF) with predictive reference current calculation and model based predictive current control. A novel estimator and predictor of grid emf is proposed for AC voltage sensorless operation of the APF, regardless of distortion of this voltage. The proposed control system provides control of the APF current with high precision and dynamics limited only by filter circuit parameters....

Full text to download in external service

Extension of selected ADFA construction algorithms to the case of cyclic automata.

Publication

J. Daciuk

- Year 2004

In a recent article, Rafael Carrasco and Mikel Forcada present an incremental algorithm for adding words to a minimal, acyclic finite automaton. This algorithm is a generalization of the incremental algorithm for constructing acyclic deterministic finite automata (ADFAs). We present similar generalizations of two other ADFA construction algorithms. Although these generalizations were already published in May and June 2004,...

On Dynamic Extension of a Local Material Symmetry Group for Micropolar Media

Publication

- Symmetry-Basel - Year 2020

For micropolar media we present a new definition of the local material symmetry group considering invariant properties of both the kinetic energy and the strain energy density under changes of a reference placement. Unlike simple (Cauchy) materials, micropolar media can be characterized through two kinematically independent fields, namely the translation vector and the orthogonal microrotation tensor. In other words, in micropolar continua...

Full text available to download

Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform

Publication

- Journal of Real-Time Image Processing - Year 2016

Results of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high resolution camera images, while keeping the cost of resource usage at a reasonable level. Two...

Full text available to download

Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines

Publication

- Year 2021

In this paper we consider a problem of job scheduling on parallel machines in the presence of incompatibilities between jobs. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. Our research stems from the works of Bodlaender, Jansen, and Woeginger (1994) and Bodlaender and Jansen (1993). In particular, we pursue the...

Full text to download in external service

Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system

Publication

T. Stefański

- Progress in Electromagnetics Research-PIER - Year 2013

This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes the computational power of the central processing unit (CPU) and the graphics processing unit (GPU) for the computational tasks best suited to each architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text to download in external service

Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications

Publication

P. Czarnul

- Electronics - Year 2021

The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

Full text available to download

Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream

Publication

G. Szwoch

- Communications in Computer and Information Science - Year 2011

A background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....

Full text to download in external service

A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures

Publication

- TASK Quarterly - Year 2012

In this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from a specific historical period and social background to evolve a set of face images potentially resembling the person whose image is to be found. We then use portraits of the person's relatives to judge...

Full text available to download

Parallel processing of multimedia streams

Publication

- Year 2010

The article presents a new library supporting the creation of computational tasks, which is part of the KASKADA platform. The design of the library is presented, including a diagram of the main classes and a sequence diagram. The second diagram shows how the main classes cooperate in the processing of multimedia streams. Further on, the details of the inter-task communication mechanism are discussed and a graph...

Conformance testing of parallel languages

Publication

- Year 2002

A proposal is presented for formalizing the description of the process of generating, executing, and evaluating conformance tests for parallel programming languages and libraries, with respect to functional and performance conformance. The examples illustrating the proposed formalism use the Athapascan programming platform.

Parallel scheduling by graph ranking

Publication

D. Dereniowski

- Year 2006

Document no.: 73017. The work concerns one of the non-classical models of graph coloring: ordered coloring. The aim was to obtain results that can be used in practical applications of this model, which include parallel processing of queries in relational databases, parallel Cholesky factorization of matrices, and parallel assembly of a product from its component parts. The work identifies generalizations...

Parallel processing of multimedia streams

Publication

- Computer Applications in Electrical Engineering - Year 2010

The chapter presents the KASKADA platform for processing multimedia streams. Its design is described: UML class and sequence diagrams illustrating the stream processing mechanisms, and the details of communication. A specialized framework supporting the creation and execution of algorithms, as well as the definition of service scenarios, is also presented, together with an assessment of their usefulness.

Performance evaluation of the parallel object tracking algorithm employing the particle filter

Publication

G. Szwoch

- Year 2016

An algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with the CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...

Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit

Publication

- RADIOENGINEERING - Year 2014

Parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD...

Full text available to download

Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems

Publication

- Scientific Programming - Year 2020

This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...

Full text available to download

Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms

Publication

G. Szwoch

- Year 2014

Implementation of the background subtraction algorithm using the OpenCL platform is presented. The algorithm processes a live stream of video frames from a surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...

Full text to download in external service

Automatic Watercraft Recognition and Identification on Water Areas Covered by Video Monitoring as Extension for Sea and River Traffic Supervision Systems

Publication

N. Wawrzyniak
A. Stateczny

- Polish Maritime Research - Year 2018

The article presents the watercraft recognition and identification system as an extension for the presently used visual water area monitoring systems, such as VTS (Vessel Traffic Service) or RIS (River Information Service). The watercraft identification systems (AIS - Automatic Identification Systems) which are presently used in both sea and inland navigation require purchase and installation of relatively expensive transceivers...

Full text to download in external service

Representing and Managing Experiential Knowledge with Decisional DNA and its Drimos® Extension

Publication

E. Szczerbicki
C. Sanin
K. Sterling-zuluaga

- Year 2022

The Semantic Web concept proposes a future form of the World Wide Web (WWW) in which both humans and man-made systems are able to interconnect and exchange knowledge. One of the challenges of the Semantic Web is smart and trusted accommodation of knowledge in artificial systems so it can be unified, enhanced, reused, shared, communicated and distributed with added aptitude. Our research represents an important component of addressing...

Full text available to download

Radar sensors planning for the purpose of extension of River Information Services in Poland

Publication

A. Stateczny
J. Lubczonek
T. Kantak

- Year 2015

Full text to download in external service

Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform

Publication

- Year 2017

Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...

Full text to download in external service

Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs

Publication

- Year 2014

The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns them to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...

Full text to download in external service

Paweł Czarnul dr hab. inż.