Search results for: parallel computational implementation

Parallel implementation of a Sailing Assistance Application in a Cloud Environment

Publication

- IEEE Access - Year 2023

Sailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...

Full text available to download

From Sequential to Parallel Implementation of NLP Using the Actor Model

Publication

- Advances in Intelligent Systems and Computing - Year 2018

The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

Full text available to download

Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology

Publication

- Year 2016

The discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...

Full text to download in external service

Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system

Publication

T. Stefański

- Progress in Electromagnetics Research-PIER - Year 2013

This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes the computational power of the central processing unit (CPU) and the graphics processing unit (GPU) for the computational tasks best suited to each architecture. Recently, a closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text to download in external service

Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform

Publication

- Journal of Real-Time Image Processing - Year 2016

Results of the evaluation of background subtraction algorithms implemented in a parallel manner on a supercomputer platform are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high-resolution camera images, while keeping the cost of resource usage at a reasonable level. Two...

Full text available to download

Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins

Publication

A. Sieradzan
J. Sans‐Duñó
E. Lubecka
C. Czaplewski
A. Lipska
H. Leszczyński
K. Ocetkiewicz
J. Proficz
P. Czarnul
H. Krawczyk
A. Liwo

- JOURNAL OF COMPUTATIONAL CHEMISTRY - Year 2023

We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...

Full text available to download

Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit

Publication

- RADIOENGINEERING - Year 2014

Parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD...

Full text available to download

Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system

Publication

- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023

In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...

Full text to download in external service

Molecular Simulations Using Boltzmann’s Thermally Activated Diffusion - Implementation on ARUZ – Massively-parallel FPGA-based Machine

Publication

G. Jablonski
P. Amrozik
K. Halagan

- Year 2021

Full text to download in external service

Improved conformational space annealing method to treat β-structure with the UNRES force-field and to enhance scalability of parallel implementation

Publication

C. Czaplewski
A. Liwo
J. Pillardy
S. Ołdziej
H. Scheraga

- POLYMER - Year 2004

Full text to download in external service

Grid Implementation of a Parallel Multiobjective Genetic Algorithm for Optimized Allocation of Chlorination Stations in Drinking Water Distribution Systems: Chojnice Case Study

Publication

- IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS - Year 2008

Solving multiobjective optimization problems requires suitable algorithms to find a satisfactory approximation of a globally optimal Pareto front. Furthermore, it is a computationally demanding task. In this paper, the grid implementation of a distributed multiobjective genetic algorithm is presented. The distributed version of the algorithm is based on the island algorithm with forgetting island elitism used instead of a genetic...

Full text to download in external service

Advanced Control With PLC—Code Generator for aMPC Controller Implementation and Cooperation With External Computational Server for Dealing With Multidimensionality, Constraints and LMI Based Robustness

Publication

- IEEE Access - Year 2022

The manufacturers of Programmable Logic Controllers (PLC) usually equip their products with extremely simple control algorithms, such as PID and on-off regulators. However, modern PLCs have much more efficient processors and extensive memory, which enables implementing more sophisticated controllers. The paper discusses issues related to the implementation of matrix operations, time limitations for code execution within one PLC...

Full text available to download

Multiprocessor Implementation of Parallel Multiobjective Genetic Algorithm for Optimized Allocation of Chlorination Stations in Drinking Water Distribution System a New Water Quality Model Approach

Publication

- Year 2013

Critical Infrastructure Systems (CISs) have received considerable attention in recent years due to their heavy impact on the sustainable development of modern societies. Most CISs may be classified as large-scale complex systems of network structure, influenced by strong interactions from the surrounding environment and by internal and external interconnections. The latter is a result of inter-CIS dependencies. The control, monitoring...

Full text to download in external service

Multiprocessor implementation of Parallel Multiobjective Genetic Algorithm for Optimized Allocation of Chlorination Stations in Drinking Water Distribution System - a new water quality model approach

Publication

G. Ewald
T. Zubowicz
M. Brdys

- IFAC Proceedings Volumes - Year 2013

Full text to download in external service

Implementation of Molecular Dynamics and Its Extensions with the Coarse-Grained UNRES Force Field on Massively Parallel Systems: Toward Millisecond-Scale Simulations of Protein Structure, Dynamics, and Thermodynamics

Publication

A. Liwo
S. Ołdziej
C. Czaplewski
D. Kleinerman
P. Blood
H. Scheraga

- Journal of Chemical Theory and Computation - Year 2010

Full text to download in external service

Architecture and implementation of distributed data storage using Web Services, CORBA and PVM. In: Proceedings. PPAM 2003. Parallel Processing and Applied Mathematics. Fifth International Conference. Częstochowa, 7-10 September 2003.

Publication

P. Czarnul

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2003

We propose an architecture and its implementation, PVMWeb Cluster I/O, intended for distributed data management. Data is written to the system through Web Services by geographically distant clients, or through CORBA calls from within a given cluster, which offers better performance.

The chapter analyses the K-Means algorithm in its parallel setting. We provide a detailed description of the algorithm as well as the way we parallelize the computations. We identified the complexity of the particular steps of the algorithm, which allows us to build the algorithm model in the MERPSYS system. Simulations with MERPSYS have been performed for different data sizes as well as for different numbers of processors used for the computations. The results obtained using the model have been compared to the results obtained from a real computational environment.

Publication

J. Szymański

- Year 2016

The chapter analyses the K-Means algorithm in its parallel setting. We provide a detailed description of the algorithm as well as the way we parallelize the computations. We identified the complexity of the particular steps of the algorithm, which allows us to build the algorithm model in the MERPSYS system. Simulations with MERPSYS have been performed for different data sizes as well as for different numbers of processors used...

Simulation of parallel similarity measure computations for large data sets

Publication

- Year 2015

The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...

Full text to download in external service

Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi

Publication

A. Malinowski

- International Journal of Information Technology and Computer Science - Year 2015

Parallel algorithms are a popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, a proof-of-concept implementation and practical experiments are often required. In order to speed up development and provide a simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with a multi-core computational accelerator:...

Full text to download in external service

Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment

Publication

- Year 2014

The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Full text to download in external service

A distributed system for conducting chess games in parallel

Publication

- Procedia Computer Science - Year 2017

This paper proposes a distributed and scalable cloud based system designed to play chess games in parallel. Games can be played between chess engines alone or between clusters created by combined chess engines. The system has a built-in mechanism that compares engines, based on Elo ranking which finally presents the strength of each tested approach. If an approach needs more computational power, the design of the system allows...

Full text available to download

GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM

Publication

- Communications in Computational Physics - Year 2017

This paper presents a GPU-accelerated implementation of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method with an inexact null-space filtering approach to find eigenvalues in electromagnetics analysis with higher-order FEM. The performance of the proposed approach is verified using the Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of the implementation based on functions from...

Full text to download in external service

Computationally Efficient Solution of a 2D Diffusive Wave Equation Used for Flood Inundation Problems

Publication

- Water - Year 2019

This paper presents a study dealing with increasing the computational efficiency in modeling floodplain inundation using a two-dimensional diffusive wave equation. To this end, the domain decomposition technique was used. The resulting one-dimensional diffusion equations were approximated in space with the modified finite element scheme, whereas time integration was carried out using the implicit two-level scheme. The proposed...

Full text available to download

Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card

Publication

T. Topa
A. Noga
T. Stefański

- Year 2023

In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...

Full text to download in external service

KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs

Publication

- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016

The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text to download in external service

A GPU Solver for Sparse Generalized Eigenvalue Problems with Symmetric Complex-Valued Matrices Obtained Using Higher-Order FEM

Publication

- IEEE Access - Year 2018

The paper discusses a fast implementation of the stabilized locally optimal block preconditioned conjugate gradient (sLOBPCG) method, using a hierarchical multilevel preconditioner to solve non-Hermitian sparse generalized eigenvalue problems with large symmetric complex-valued matrices obtained using the higher-order finite-element method (FEM), applied to the analysis of a microwave resonator. The resonant frequencies of the low-order...

Full text available to download

Modeling Parallel Applications in the MERPSYS Environment

Publication

P. Czarnul

- Year 2016

The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....

Parallel Computations of Text Similarities for Categorization Task

Publication

J. Szymański

- Year 2013

In this chapter we describe an approach to the parallel implementation of similarity computations in high-dimensional spaces. The similarity computations have been used for textual data categorization. We create test datasets from Wikipedia articles, which together with their hyper-references form a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....

Block-based Representation of Application Execution on Modern Parallel Systems

Publication

P. Czarnul

- Year 2013

The chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses previously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.

Multi-agent large-scale parallel crowd simulation

Publication

A. Malinowski
P. Czarnul
K. Czuryło
M. Maciejewski
P. Skowron

- Year 2017

This paper presents the design, implementation and performance results of a new modular, parallel, agent-based and large-scale crowd simulation environment. A parallel application, written in C and MPI, was developed and run in this parallel environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland and further in the area of districts of Gdansk. The application uses a...

Full text to download in external service

Performance evaluation of parallel background subtraction on GPU platforms

Publication

G. Szwoch

- Elektronika : konstrukcje, technologie, zastosowania - Year 2015

Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...

Full text to download in external service

A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

Publication

- Scalable Computing: Practice and Experience - Year 2018

The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available to download

Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform

Publication

- Year 2017

Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...

Full text to download in external service

NVRAM as Main Storage of Parallel File System

Publication

A. Malinowski

- Journal of Computer Science and Control Systems - Year 2016

The main trouble of modern cluster environments used to be a lack of computational power provided by CPUs and GPUs, but recently they have suffered more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents an evaluation of using non-volatile random-access memory as a...

Full text to download in external service

Modeling energy consumption of parallel applications

Publication

- Annals of Computer Science and Information Systems - Year 2016

The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. Model of an application uses the Java language with extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...

Full text available to download

Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems

Publication

P. Rościszewski

- Year 2014

The rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text to download in external service

Fast implementation of FDTD-compatible Green's function on multicore processor

Publication

T. Stefański

- IEEE Antennas and Wireless Propagation Letters - Year 2012

In this letter, numerically efficient implementation of the finite-difference time domain (FDTD)-compatible Green's function on a multicore processor is presented. Recently, closed-form expression of this discrete Green's function (DGF) was derived, which simplifies its application in the FDTD simulations of radiation and scattering problems. Unfortunately, the new DGF expression involves binomial coefficients, whose computations...

Full text to download in external service

Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms

Publication

G. Szwoch

- Year 2014

Implementation of the background subtraction algorithm using OpenCL platform is presented. The algorithm processes live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...

Full text to download in external service

Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware

Publication

- Applied Sciences-Basel - Year 2022

In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available to download

Acceleration of the DGF-FDTD method on GPU using the CUDA technology

Publication

- Year 2015

We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

Full text to download in external service

Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption

Publication

P. Rościszewski

- Year 2018

Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing full power of such systems requires programming parallel applications that are hybrid in two meanings: they can utilize parallelism on multiple levels at the same time and combine together programming interfaces...

Full text to download in external service

Numerical Study on Mitigation of Flow Maldistribution in Parallel Microchannel Heat Sink: Channels Variable Width Versus Variable Height Approach

Publication

R. Kumar
G. Singh
D. Mikielewicz

- JOURNAL OF ELECTRONIC PACKAGING - Year 2019

On the one hand, a microchannel heat sink benefits from heat transfer performance intensified severalfold, but on the other hand it suffers from an aggravated form of trifling limitations associated with imperfect hydrodynamics and heat transfer behavior. Flow maldistribution is one such limitation; it exaggerates temperature nonuniformity across parallel microchannels, leading to an increase in the maximum base temperature. Recently, variable...

Full text to download in external service

An Implementation of a Compact Smart Resistive Sensor Based on a Microcontroller with an Internal ADC

Publication

Z. Czaja

- Metrology and Measurement Systems - Year 2016

In the paper a new implementation of a compact smart resistive sensor based on a microcontroller with internal ADCs is proposed and analysed. The solution is based only on a microcontroller (already existing in the system) and a simple sensor interface circuit working as a voltage divider consisting of a reference resistor and the resistive sensor connected in parallel with an interference suppression capacitor. The measurement...

Full text available to download

A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures

Publication

- TASK Quarterly - Year 2012

In this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from a specific historical period and social background to evolve a set of face images potentially resembling the person whose image is to be found. We then use portraits of the person's relatives to judge...

Full text available to download

Energy consumption optimization in wastewater treatment plants: Machine learning for monitoring incineration of sewage sludge

Publication

- Sustainable Energy Technologies and Assessments - Year 2023

Biomass management in terms of energy consumption optimization has become a recent challenge for developed countries. Nevertheless, the multiplicity of materials and operating parameters controlling energy consumption in wastewater treatment plants necessitates the need for sophisticated well-organized disciplines in order to minimize energy consumption and dissipation. Sewage sludge (SS) disposal management is the key stage of...

Full text to download in external service

A Parallel Corpus-Based Approach to the Crime Event Extraction for Low-Resource Languages

Publication

N. Khairova
O. Mamyrbayev
N. Rizun
M. Razno
G. Ybytayeva

- IEEE Access - Year 2023

These days, a lot of crime-related events take place all over the world. Most of them are reported in news portals and social media. Crime-related event extraction from the published texts can allow monitoring, analysis, and comparison of police or criminal activities in different countries or regions. Existing approaches to event extraction mainly suggest processing texts in English, French, Chinese, and some other resource-rich...

Full text available to download

50’ Sail Catamaran with Hybrid Propulsion, Design, Theoretical and Experimental Studies

Publication

- Polish Maritime Research - Year 2022

The development of modern lithium batteries and propulsion systems now allows the use of complex propulsion systems for vessels of various sizes. As part of the research and implementation project, a parallel hybrid drive system was designed, built and then tested in the laboratory. The experimental studies conducted allowed for the measurements of power, fuel consumption and electric power distribution in various operating modes...

Full text available to download

DL_MG: A Parallel Multigrid Poisson and Poisson–Boltzmann Solver for Electronic Structure Calculations in Vacuum and Solution

Publication

J. Womack
L. Anton
J. Dziedzic
P. Hasnip
M. Probert
C. Skylaris

- Journal of Chemical Theory and Computation - Year 2018

The solution of the Poisson equation is a crucial step in electronic structure calculations, yielding the electrostatic potential -- a key component of the quantum mechanical Hamiltonian. In recent decades, theoretical advances and increases in computer performance have made it possible to simulate the electronic structure of extended systems in complex environments. This requires the solution of more complicated variants of the...

Full text available to download

Computer experiments with a parallel clonal selection algorithm for the graph coloring problem

Publication

- Year 2008

Artificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...

Full text to download in external service

Acceleration of the discrete Green's function computations

Publication

T. Stefański

- Year 2012

Results of accelerating the 3-D discrete Green's function (DGF) computations on a multicore processor are presented. The code was developed in multiple-precision arithmetic using the OpenMP parallel programming interface. As a result, a speedup of three orders of magnitude over the previous implementation was obtained, which significantly improved the applicability of the DGF in FDTD simulations.

Full text to download in external service
