Search results for: parallel mpi i/o extension
-
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication: The paper presents a new approach to parallel image processing using byte-addressable, non-volatile memory (NVRAM). We show that our custom-built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...
-
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication: While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...
-
A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications
Publication: The paper presents a fail-safe, NVRAM-based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed in it. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM-based, cache-like mechanism to significantly speed up I/O operations and allow to preload...
-
Checkpointing of Parallel MPI Applications using MPI One-sided API with Support for Byte-addressable Non-volatile RAM
Publication: The increasing size of computational clusters results in an increasing probability of failures, which in turn requires application checkpointing in order to survive those failures. Traditional checkpointing requires data to be copied from application memory into a persistent storage medium, which increases application execution time as it is usually done in a separate step. In this paper we propose to use emerging byte-addressable...
-
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
Publication: The paper presents design and implementation details as well as performance results of two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API that uses MPI constants. It uses MPI-2 I/O functions or a dedicated master process...
-
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
Publication: In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...
-
Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing
Publication: In the article we present design and implementation details of our modular simulation code that uses MPI, including overlapping of computations and communication. We emphasize the modularity of our implementation, which allows easy adaptation of the code to other applications. We present the relationship between the computational speed-up and the size and shapes of three-dimensional domains with various ratios of the number of nodes updated by a processor...
-
Paweł Czarnul dr hab. inż.
People: Paweł Czarnul obtained a D.Sc. degree in computer science in 2015 and a Ph.D. in computer science, granted by a council at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, in 2003. His research interests include: parallel and distributed processing including clusters, accelerators, coprocessors; distributed information systems; architectures of distributed systems; programming mobile devices....
-
Multi-agent large-scale parallel crowd simulation
Publication: This paper presents the design, implementation and performance results of a new modular, parallel, agent-based and large-scale crowd simulation environment. A parallel application written in C with MPI was run in this environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland and further in the area of districts of Gdansk. The application uses a...
-
Parallelization of Compute Intensive Applications into Workflows based on Services in BeesyCluster
Publication: The paper presents an approach for modeling, optimization and execution of workflow applications based on services that incorporates both service selection and partitioning of input data for parallel processing by parallel workflow paths. A compute-intensive workflow application for parallel integration is presented. The impact of input data partitioning on scalability is presented. The paper shows a comparison of the theoretical...
-
Simulation of parallel similarity measure computations for large data sets
Publication: The paper presents our approach to implementing a similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...
-
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication: The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...
-
NVRAM as Main Storage of Parallel File System
Publication: The main problem of modern cluster environments used to be a lack of computational power provided by CPUs and GPUs, but recently they have suffered more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents an evaluation of using non-volatile random-access memory as a...
-
An innovative method of measuring the extension of the piston rod in hydraulic cylinders, especially large ones used in the shipbuilding and offshore industry
Publication: The article presents the results of selected works related to the wider subject of research conducted at the Faculty of Mechanical Engineering and Shipbuilding at the Gdańsk University of Technology, regarding the design of various on-board devices with hydraulic drive for ships and other offshore facilities. Among the commonly used mechanisms of this kind are hydraulic actuators with measurement of the piston rod extension. The issue...
-
Modeling energy consumption of parallel applications
Publication: The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. The application model uses the Java language with an extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...
-
Parallel Programming for Modern High Performance Computing Systems
Publication: In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...
-
BC-MPI: running an MPI application on multiple clusters with BeesyCluster connectivity
Publication: The article proposes a new package, BC-MPI, which enables running an MPI application on multiple clusters with different MPI implementations. It uses dedicated MPI implementations for communication within clusters and the MPI_THREAD_MULTIPLE mode for communication between clusters in additional threads of the MPI application. Moreover, a BC-MPI application can be automatically compiled and run by the BeesyCluster middleware. BeesyCluster enables...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication: In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication: In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...
-
Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
Publication: The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically-distributed compute cluster. The testbed was split into two sites: CI TASK at Gdansk University of Technology and ICM at the University of Warsaw, located about 300 km from each other, both connected by a fast optical fiber Ethernet-based 100 Gbps network (900 km part of the PIONIER backbone). Each site hosted a set...
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication: Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Process arrival pattern aware algorithms for acceleration of scatter and gather operations
Publication: Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...
-
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication: In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...
-
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication: In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...
-
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
Publication: The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....
-
Two Stage SVM and kNN Text Documents Classifier
Publication: The paper presents an approach to the large-scale text document classification problem in parallel environments. A two-stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...
-
Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns
Publication: The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented and improved. The comments concern handling certain edge cases in the original pseudocode and description, i.e., adding another state of a process, improved cache friendliness, more precise complexity estimations and some other issues improving the robustness of the algorithm implementation....
-
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
Publication: In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls allow managing CUDA objects and data and launching kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...
-
Use of ICT infrastructure for teaching HPC
Publication: In this paper we look at modern ICT infrastructure as well as the curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...
-
Improving Effectiveness of SVM Classifier for Large Scale Data
Publication: The paper presents our approach to SVM implementation in a parallel environment. We describe how the classification learning and prediction phases were parallelised. We also propose a method for limiting the number of necessary computations during classifier construction. Our method, named one-vs-near, is an extension of the typical one-vs-all approach that is used for binary classifiers to work with multiclass problems. We perform experiments...
-
Locally Adaptive Cooperative Kalman Smoothing and Its Application to Identification of Nonstationary Stochastic Systems
Publication: One of the central problems of the stochastic approximation theory is the proper adjustment of the smoothing algorithm to the unknown, and possibly time-varying, rate and mode of variation of the estimated signals/parameters. In this paper we propose a novel locally adaptive parallel estimation scheme which can be used to solve the problem of fixed-interval Kalman smoothing in the presence of model uncertainty. The proposed solution...
-
Variable-structure algorithm for identification of quasi-periodically varying systems
Publication: The paper presents a variable-structure version of a generalized notch filtering (GANF) algorithm. Generalized notch filters are used for identification of quasi-periodically varying dynamic systems and can be considered an extension, to the system case, of classical adaptive notch filters. The proposed algorithm is a cascade of two GANF filters: a multiple-frequency "precise" filter bank, used for precise system tracking, and a...
-
FPGA Acceleration of Matrix-Assembly Phase of RWG-Based MoM
Publication: In this letter, the field-programmable-gate-array accelerated implementation of matrix-assembly phase of the method of moments (MoM) is presented. The solution is based on a discretization of the frequency-domain mixed potential integral equation using the Rao-Wilton-Glisson basis functions and their extension to wire-to-surface junctions. To take advantage of the given hardware resources (i.e., Xilinx Alveo U200 accelerator card),...
-
A self-optimization mechanism for generalized adaptive notch smoother
Publication: Tracking of nonstationary narrowband signals is often accomplished using algorithms called adaptive notch filters (ANFs). Generalized adaptive notch smoothers (GANSs) extend the concepts of adaptive notch filtering in two directions. Firstly, they are designed to estimate coefficients of nonstationary quasi-periodic systems, rather than signals. Secondly, they employ noncausal processing, which greatly improves their accuracy and...
-
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Publication: Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PAPs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...
-
Mathematical modelling of two-step nitrification-denitrification for treatment of sludge digester liquors: influence of nitrite (NO2-N) on the process kinetics
Publication: Separate treatment of the sludge digester liquors is an alternative for expansion of the mainstream treatment line. In order to reduce the oxygen demand for nitrification and organic carbon demand for denitrification, a shortcut in the nitrogen conversion pathway has been promoted in recent years, i.e. nitrification-denitrification via NO2-N instead of NO3-N. Although NO2-N is a common intermediate product of nitrification and...
-
Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card
Publication: In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...
-
Application of mechanistic and data-driven models for nitrogen removal in wastewater treatment systems
Publication: In this dissertation, the application of mechanistic and data-driven models in nitrogen removal systems including nitrification and deammonification processes was evaluated. In particular, the parameters influencing the activity of Nitrospira were assessed using response surface methodology (RSM). Various long-term biomass washout experiments were operated in two parallel sequencing batch reactors (SBRs) with a different...
-
Object serialization and remote exception pattern for distributed C++/MPI application
Publication: MPI is a commonly used standard in the development of scientific applications. It focuses on interlanguage operability and is not strongly object-oriented. The paper proposes a general pattern enabling the design of distributed, object-oriented applications. It also presents sample implementations of the pattern and performance tests.
-
Towards Easy-to-Use Checkpointing of MPI Applications within CLUSTERIX.
Publication: The literature lists many libraries/systems, at both the kernel and user levels, that support saving and restoring the state of processes. For parallel applications, however, this remains a difficult task. The paper presents our programmer-assisted approach to saving/restoring the state of MPI applications, which will be used in the environment of the CLUSTERIX project, i.e. an integrated group of clusters...
-
An extension of the method of quasilinearization
Publication: The method of quasilinearization is applied to initial value problems in which the right-hand side can be represented by a nonlinear "extension" function, assuming some regularity of that function. It is shown that suitably constructed monotone sequences converge quadratically to the solution of the problem. The paper generalizes the corresponding results for the case in which the right-hand side is a sum of functions that are concave and convex with respect to the last argument.
-
JOURNAL OF EXTENSION
Journals
-
An extension to the FEEDB Multimodal Database of Facial Expressions and Emotions
Publication: FEEDB is a multimodal database that contains recordings of people expressing different emotions, captured by using a Microsoft Kinect sensor. Data were originally provided in the device’s proprietary format (XED), requiring both the Microsoft Kinect Studio application and a Kinect sensor attached to the system to use the files. In this paper, we present an extension of the database. For a selection of recordings, we also provide...
-
Modeling Parallel Applications in the MERPSYS Environment
Publication: The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....
-
Efektywna warstwa pośrednicząca dla obliczeń typu master-slave w środowisku C++/MPI [An efficient intermediary layer for master-slave computations in a C++/MPI environment]
Publication: It is shown how, for a high-performance algorithm written in C++ in the master-slave model and satisfying certain constraints, one can write and use a communication layer that completely separates the code responsible for communication from the code responsible for the problem domain. A specification is presented of the requirements that a hypothetical distributed system and the communication layer should meet, as well as the requirements...
-
The evaluation of eGlasses eye tracking module as an extension for Scratch
Publication: In this paper we present the possibility of using the eGlasses eye tracking module as an extension for Scratch, a visual programming tool that supports learning computer skills. The main concept behind this project is to set up the interface for rapid interaction design. Eye tracking is a powerful tool for hands-free communication, but it requires dedicated software. This software is rarely tailored...
-
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication: The paper presents the architecture and design of three versions of fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering of write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...
-
Parallel immune system for graph coloring
Publication: This paper presents a parallel artificial immune system designed for graph coloring. The algorithm is based on the clonal selection principle. Each processor operates on its own pool of antibodies and a migration mechanism is used to allow processors to exchange information. Experimental results show that migration improves the performance of the algorithm. The experiments were performed using a high performance cluster on a set...
-
Parallel Computations of Text Similarities for Categorization Task
Publication: In this chapter we describe an approach to the parallel implementation of similarity computations in high-dimensional spaces. The similarity computations have been used for textual data categorization. We create test datasets from Wikipedia articles which, together with their hyperlinks, form a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....
-
Testing for conformance of parallel programming pattern languages
Publication: This paper reports on the project being run by TUG and IMAG, aimed at reducing the volume of tests required to exercise parallel programming language compilers and libraries. The idea is to use the ISO STEP standard scheme for conformance testing of software products. A detailed example illustrating the ongoing work is presented.
-
Bounds on the Cover Time of Parallel Rotor Walks
Publication: The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node maintains a cyclic ordering of its outgoing arcs, and successively propagates walkers which visit it along its outgoing arcs in...
-
Sensors in River Information Services of the Odra River in Poland: Current State and Planned Extension
Publication: The assumptions for the development plans of inland waterways in Poland for 2016-2020, with a perspective to 2030, adopted in 2016 by the Polish Council of Ministers, state that by 2030 the Odra along its entire length and the Vistula from Warsaw to Gdansk will have become international shipping routes on which a system of River Information Services (RIS) will be implemented. Aspects of RIS sensor application...
-
Journal of Agricultural Extension
Journals
-
Journal of Mathematical Extension
Journals
-
Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.
Publication: The paper investigates and simulates electrophysiological phenomena in the myocardium using MPI-based parallel software developed for this purpose. Code improvements leading to good scalability were implemented and examined, and performance tests were carried out on the latest 32- and 64-bit Linux clusters. The paper is an attempt at a parallel implementation of a known approach...
-
Przetwarzanie Równoległe CUDA/Parallel processing on CUDA
e-Learning Courses
-
Performance evaluation of parallel background subtraction on GPU platforms
Publication: Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...
-
Block-based Representation of Application Execution on Modern Parallel Systems
Publication: The chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses previously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.
-
Zastosowanie bajtowo adresowanej pamięci NVRAM do zwiększenia wydajności wybranych aplikacji równoległych wykorzystujących MPI I/O [Using byte-addressable NVRAM to increase the performance of selected parallel applications that use MPI I/O]
Publication: Much current research addresses the growing problem of file operation performance in cluster environments. At the same time, according to recent reports on the development of computer memory technologies, byte-addressable non-volatile random-access memory chips should appear on the market in the near future. This dissertation shows that such memory can be used to increase the performance of selected...
-
Computer experiments with a parallel clonal selection algorithm for the graph coloring problem
Publication: Artificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...
-
Bounds on the cover time of parallel rotor walks
Publication: The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node successively propagates walkers visiting it along its outgoing arcs in round-robin fashion, according to a fixed ordering. We consider...
-
DISTRIBUTED AND PARALLEL DATABASES
Journals
-
A Workflow Application for Parallel Processing of Big Data from an Internet Portal
Publication: The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...
-
Decentralized control of a different rated parallel UPS systems
Publication: The paper presents a single-phase uninterruptible power supply (UPS) system with galvanically separated DC-AC-DC-AC converters operating in parallel. A communication system between converters, based on the CAN physical layer, has been developed and applied; it allows the use of decentralized master-slave control that provides a high availability factor for the whole UPS system. The control system of particular converters has been developed...
-
Extension management of a knowledge base migration process to IPv6
Publication: There are many reasons to deploy the IPv6 protocol, with IPv4 address space depletion being the most indisputable. Unfortunately, migration to IPv6 seems slower than anticipated. To improve the pace of IPv6 deployment, the authors of the article developed two applications that support the migration process. Their main purpose is to help less experienced network administrators facilitate the migration process with a particular...
-
ROLE OF AGRICULTURAL EXTENSION IN ADOPTION OF SUSTAINABLE AGRICULTURE PRACTICES
Publication -
AffecTube — Chrome extension for YouTube video affective annotations
PublicationThe shortage of emotion-annotated video datasets suitable for training and validating machine learning models for facial expression-based emotion recognition stems primarily from the significant effort and cost required for manual annotation. In this paper, we present AffecTube as a comprehensive solution that leverages crowdsourcing to annotate videos directly on the YouTube platform, resulting in ready-to-use emotion-annotated...
-
Comparison of EHD devices with parallel and in series spiked electrodes
PublicationIn this paper two electrohydrodynamic (EHD) devices for gas pumping and cleaning are presented. In both cases, corona discharge was used to induce airflow in the EHD devices. The discharge was generated between the spiked electrodes, set in parallel (the first case) or in series (the second case), and the plate electrodes. An asymmetric electric field and generated discharge result in unidirectional gas flow through the EHD device....
-
Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi
PublicationParallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, a proof-of-concept implementation and practical experiments are often required. In order to speed up development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator:...
-
A distributed system for conducting chess games in parallel
PublicationThis paper proposes a distributed and scalable cloud based system designed to play chess games in parallel. Games can be played between chess engines alone or between clusters created by combined chess engines. The system has a built-in mechanism that compares engines, based on Elo ranking which finally presents the strength of each tested approach. If an approach needs more computational power, the design of the system allows...
-
Scheduling of compatible jobs on parallel machines
PublicationThe dissertation discusses the problems of scheduling compatible jobs on parallel machines. Some jobs are incompatible, which is modeled as a binary relation on the set of jobs; the relation is often modeled by an incompatibility graph. We consider two models of machines. The first model, more emphasized in the thesis, is a classical model of scheduling, where each machine does one job at a time. The second one is a model of p-batching...
-
From Sequential to Parallel Implementation of NLP Using the Actor Model
PublicationThe article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...
-
Parallel Cooperating A-Teams
Publication -
Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment
PublicationThe paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...
-
A New Approach for the Mitigating of Flow Maldistribution in Parallel Microchannel Heat Sink
PublicationThe problem of flow maldistribution is very critical in microchannel heat sinks (MCHS). It induces temperature nonuniformity, which may ultimately lead to the breakdown of the associated system. In the present communication, a novel approach for the mitigation of the flow maldistribution problem in parallel MCHS has been proposed using variable width microchannels. Numerical simulation of copper-made parallel MCHS consisting of 25 channels...
-
Parallel implementation of a Sailing Assistance Application in a Cloud Environment
PublicationSailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...
-
Sensorless predictive control of three-phase parallel active filter
PublicationThe paper presents the control system of parallel active power filter (APF) with predictive reference current calculation and model based predictive current control. The novel estimator and predictor of grid emf is proposed for AC voltage sensorless operation of APF, regardless of distortion of this voltage. Proposed control system provides control of APF current with high precision and dynamics limited only by filter circuit parameters....
-
Journal of Agricultural Education and Extension
Journals -
International Journal of Agricultural Extension
Journals -
Extension of selected ADFA construction algorithms to the case of cyclic automata.
PublicationIn a recent article, Rafael Carrasco and Mikel Forcada present an incremental algorithm for adding words to a minimal, acyclic finite automaton. This algorithm is a generalization of the incremental algorithm for constructing acyclic deterministic finite automata (ADFAs). We present similar generalizations of two other ADFA construction algorithms. Although these generalizations were already published in May and June 2004,...
-
On Dynamic Extension of a Local Material Symmetry Group for Micropolar Media
PublicationFor micropolar media we present a new definition of the local material symmetry group considering invariant properties of both the kinetic energy and the strain energy density under changes of a reference placement. Unlike simple (Cauchy) materials, micropolar media can be characterized through two kinematically independent fields, namely the translation vector and the orthogonal microrotation tensor. In other words, in micropolar continua...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
PublicationResults of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high resolution camera images, maintaining the cost of resource usage at a reasonable level. Two...
-
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines
PublicationIn this paper we consider a problem of job scheduling on parallel machines with a presence of incompatibilities between jobs. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. Our research stems from the works of Bodlaender, Jansen, and Woeginger (1994) and Bodlaender and Jansen (1993). In particular, we pursue the...
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
PublicationThis paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
PublicationThe paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...
-
Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream
PublicationA background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....
-
A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures
PublicationIn this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from a specific historical period and social background to evolve a set of face images potentially resembling the person whose image is to be found. We then use portraits of the person's relatives to judge...
-
Parallel processing of multimedia streams
PublicationThe article presents a new library supporting the creation of computational tasks, part of the KASKADA platform. The design of the library is presented, including a diagram of the main classes and a sequence diagram. The second diagram shows how the main classes cooperate in the processing of multimedia streams. Further on, the details of the inter-task communication mechanism are discussed and a graph...
-
Conformance testing of parallel languages
PublicationA proposal is presented for formalizing the description of the process of generating, executing, and evaluating conformance tests for parallel programming languages and libraries, with respect to functional and performance conformance. The examples illustrating the proposed formalism use the Athapascan programming platform.
-
Parallel scheduling by graph ranking
PublicationDocument no.: 73017. The work concerns one of the non-classical models of graph coloring: ordered coloring. The aim was to obtain results that can be used in practical applications of this model, which include parallel query processing in relational databases, parallel Cholesky factorization of matrices, and parallel assembly of a product from its components. The work identifies generalizations...
-
Parallel processing of multimedia streams
PublicationThe chapter presents the KASKADA platform for processing multimedia streams. Its design is described: UML class and sequence diagrams illustrating the stream processing mechanisms, and details of communication. A specialized framework supporting the creation and execution of algorithms, as well as the definition of service scenarios, is also presented, along with an assessment of their usability.
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
PublicationAn algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...
-
Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit
PublicationParallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD...
-
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
PublicationThis paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
-
Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms
PublicationImplementation of the background subtraction algorithm using OpenCL platform is presented. The algorithm processes live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...
-
Automatic Watercraft Recognition and Identification on Water Areas Covered by Video Monitoring as Extension for Sea and River Traffic Supervision Systems
PublicationThe article presents the watercraft recognition and identification system as an extension for the presently used visual water area monitoring systems, such as VTS (Vessel Traffic Service) or RIS (River Information Service). The watercraft identification systems (AIS - Automatic Identification Systems) which are presently used in both sea and inland navigation require purchase and installation of relatively expensive transceivers...
-
Representing and Managing Experiential Knowledge with Decisional DNA and its Drimos® Extension
PublicationThe Semantic Web concept is proposing a future concept of the WorldWideWeb (WWW) where both humans and man-made systems are able to interconnect and exchange knowledge. One of the challenges of Semantic Web is smart and trusted accommodation of knowledge in artificial systems so it can be unified, enhanced, reused, shared, communicated and distributed with added aptitude. Our research represents an important component of addressing...
-
Radar sensors planning for the purpose of extension of River Information Services in Poland
Publication -
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
PublicationPerformance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs
PublicationThe paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...