Filtry
wszystkich: 752
wybranych: 497
-
Katalog
Filtry wybranego katalogu
Wyniki wyszukiwania dla: PARALLAX
-
Simulation of parallel similarity measure computations for large data sets
PublikacjaThe paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...
-
Block-based Representation of Application Execution on Modern Parallel Systems
PublikacjaThe chapter presents how to model execution of a parallel computational application that is to be executed in a large-scale parallel or distributed environment with potentially thousands to millions of execution units. The representation uses pre- viously attributes and factors representative of modern high performance systems including multicore CPUs, GPUs, dedicated accelerators such as Intel Phi.
-
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
PublikacjaSpectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...
-
Redundant Actuation of 3RRR over-actuated Planar Parallel Manipulator
PublikacjaPraca opisuje zagadnienia modelowania i napędzania manipulatorów równoległych. Cechą charakterystyczną manipulatorów równoległych jest występowanie jednego lub kilku łańcuchów kinematycznych zamkniętych (gałęzi równoległych). Standardowo, konstrukcje takie są napędzane jedynie silnikami montowanymi w parach kinematycznych łączących łańcuchy kinematyczne z podstawą. Niekiedy konstrukcje takie są układami napędzanymi nadmiarowo (liczba...
-
A Parallel Genetic Algorithm for Creating Virtual Portraits of Historical Figures
PublikacjaIn this paper we present a genetic algorithm (GA) for creating hypothetical virtual portraits of historical figures and other individuals whose facial appearance is unknown. Our algorithm uses existing portraits of random people from specific historical period and social background to evolve a set of face images potentially resembling the person whose image is to be found. We then use portraits of the person's relatives to judge...
-
Sensorless predictive control of three-phase parallel active filter
PublikacjaThe paper presents the control system of parallel active power filter (APF) with predictive reference current calculation and model based predictive current control. The novel estimator and predictor of grid emf is proposed for AC voltage sensorless operation of APF, regardless of distortion of this voltage. Proposed control system provides control of APF current with high precision and dynamics limited only by filter circuit parameters....
-
Planning optimised multi-tasking operations under the capability for parallel machining
PublikacjaThe advent of advanced multi-tasking machines (MTMs) in the metalworking industry has provided the opportunity for more efficient parallel machining as compared to traditional sequential processing. It entailed the need for developing appropriate reasoning schemes for efficient process planning to take advantage of machining capabilities inherent in these machines. This paper addresses an adequate methodical approach for a non-linear...
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
Publikacja -
Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream
PublikacjaA background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....
-
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
PublikacjaThe paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...
-
Scheduling with Complete Multipartite Incompatibility Graph on Parallel Machines: Complexity and Algorithms
PublikacjaIn this paper, the problem of scheduling on parallel machines with a presence of incompatibilities between jobs is considered. The incompatibility relation can be modeled as a complete multipartite graph in which each edge denotes a pair of jobs that cannot be scheduled on the same machine. The paper provides several results concerning schedules, optimal or approximate with respect to the two most popular criteria of optimality:...
-
Effective methods for functional confermance testing of parallel and distributed programming libraries.
PublikacjaRozprawa przedstawia kompletna metodykę tworzenia Zestawów Testów Zgodności dla języków programowania, bibliotek i API, ze szczególnym uwzględnieniem języków i bibliotek programowania równoleglego i rozproszonego. Autor rozpoczął badania w dziedzinie testowania zgodności dla bibliotek programowania równoleglego i rozproszonego, ale Metodyka Kolejnych zawężeń (ang. Consecutive Confinenments Method -CoCoM, stworzona przez Autora,...
-
Towards Efficient Parallel Image Processing on Cluster Grids Using GIMP.
PublikacjaZe względu na fakt, iż niewielu użytkowników posiada wiedzę niezbędną do wykorzystania niskopoziomowych bibliotek programowania równoległego w celu przyspieszenia działania programów operujących na obrazach, proponujemy plugin do znanej aplikacji GIMP, który umożliwia potokowe wykonanie szeregu filtrów na obrazach załadowanych przez plugin. Prezentujemy szczegóły implementacyjne, scenariusze testowe i wyniki na klastrach, potencjalnie...
-
Modern Platform for Parallel Algorithms Testing: Java on Intel Xeon Phi
PublikacjaParallel algorithms are popular method of increasing system performance. Apart from showing their properties using asymptotic analysis, proof-of-concept implementation and practical experiments are often required. In order to speed up the development and provide simple and easily accessible testing environment that enables execution of reliable experiments, the paper proposes a platform with multi-core computational accelerator:...
-
Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment
PublikacjaThe paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...
-
Molecular Diffusion Simulation on ARUZ – Massively-parallel FPGA-based Machine
Publikacja -
Scheduling with precedence constraints: mixed graph coloring in series-parallel graphs.
PublikacjaW pracy rozważono problem kolorowania grafów mieszanych, opisujący zagadnienie szeregowania zadań, w którym zależności czasowe zadań mają charakter częściowego porządku lub wzajemnego wykluczania. Dla przypadku, w którym graf zależności jest szeregowo-równoległy, podano algorytm rozwiązujący problem optymalnie w czasie $O(n^3.376 * log n)$.
-
Computer experiments with a parallel clonal selection algorithm for the graph coloring problem
PublikacjaArtificial immune systems (AIS) are algorithms that are based on the structure and mechanisms of the vertebrate immune system. Clonal selection is a process that allows lymphocytes to launch a quick response to known pathogens and to adapt to new, previously unencountered ones. This paper presents a parallel island model algorithm based on the clonal selection principles for solving the Graph Coloring Problem. The performance of...
-
Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms
PublikacjaImplementation of the background subtraction algorithm using OpenCL platform is presented. The algorithm processes live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
PublikacjaAn algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...
-
A New Approach for the Mitigating of Flow Maldistribution in Parallel Microchannel Heat Sink
PublikacjaThe problem of flow maldistribution is very critical in microchannel heat sinks (MCHS). It induces temperature nonuniformity, which may ultimately lead to the breakdown of associated system. In the present communication, a novel approach for the mitigation of flow maldistribution problem in parallel MCHS has been proposed using variable width microchannels. Numerical simulation of copper made parallel MCHS consisting of 25 channels...
-
A Workflow Application for Parallel Processing of Big Data from an Internet Portal
PublikacjaThe paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...
-
Parallel implementation of the DGF-FDTD method on GPU Using the CUDA technology
PublikacjaThe discrete Green's function (DGF) formulation of the finite-difference time-domain method (FDTD) is accelerated on a graphics processing unit (GPU) by means of the Compute Unified Device Architecture (CUDA) technology. In the developed implementation of the DGF-FDTD method, a new analytic expression for dyadic DGF derived based on scalar DGF is employed in computations. The DGF-FDTD method on GPU returns solutions that are compatible...
-
Experimental Research on the Energy Efficiency of a Parallel Hybrid Drive for an Inland Ship
PublikacjaThe growing requirements for limiting the negative impact of all modes of transport on the natural environment mean that clean technologies are becoming more and more important. The global trend of e-mobility also applies to sea and inland water transport. This article presents the results of experimental tests carried out on a life-size, parallel diesel-electric hybrid propulsion system. The eciency of the propulsion system was...
-
Optimizing the computation of a parallel 3D finite difference algorithm for graphics processing units
PublikacjaThis paper explores the possibilities of using a graphics processing unit for complex 3D finite difference computation via MUSTA‐FORCE and WENO algorithms. We propose a novel algorithm based on the new properties of CUDA surface memory optimized for 2D spatial locality and compare it with 3D stencil computations carried out via shared memory, which is currently considered to be the best approach. A case study was performed for...
-
Parallel in vitro and in silico investigations into anti-inflammatory effects of non-prenylated stilbenoids
Publikacja -
Generating reliable conformance test suites for parallel and distributed languages, libraries, and APIs.
PublikacjaArtykuł nakreśla nową metodykę dla tworzenia Zestawów Testów Zgodności (ZTG) dla języków, bibliotek i API programowania równoległego i rozproszonego. Autor rozpoczął swoje badania w zakresie testowania zgodności dla języka równoległego sterowanego danymi Athapascan, opracował metodykę dla projektowania i analizowania ZTG nazwaną Metodą Kolejnych Zawężeń (ang. Consecutive Confinements Methods - CoCoM), stworzył narzędzie CTS Designer,...
-
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
PublikacjaPraca prezentuje szczególy projektowe i implementacyjne jak również wyniki wydajnościowe dwóch nowych bibliotek checkpointingu opracowanych przez autorów dla równoległych aplikacji MPI. Pierwsz biblioteka, tzw. user-guided wymaga od programisty dostarczenia funkcji pakujących i rozpakowujących stan procesu, ale dostarcza łatwego w użyciu API z wykorzystaniem stałych MPI. Wykorzystuje funkcje I/O MPI-2 lub dedykowany proces master...
-
From the Dynamic Lattice Liquid Algorithm to the Dedicated Parallel Computer – mDLL Machine
Publikacja -
Makespan minimization of multi-slot just-in-time scheduling on single and parallel machines
PublikacjaArtykuł podejmuje problem szeregowania zadań przy założeniu podziału czasu na sloty jednakowej długości, gdzie każde z zadań ma ustaloną długość oraz czas jego zakończenia, który jest relatywny do końca slotu. Problem znalezienia uszeregowania polega na dokonaniu przydziału zadań do poszczególnych slotów, przy czym w ogólności długość zadania może wymuszać sytuację, w której zadańie jest realizowane nie tylko w slocie, w którym...
-
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
PublikacjaThe aim of this paper is to evaluate performance of new CUDA mechanisms—unified memory and dynamic parallelism for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...
-
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
PublikacjaThe paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
PublikacjaIn this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
Performance/energy aware optimization of parallel applications on GPUs under power capping
PublikacjaIn the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to the bench- marks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm- benchmark which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...
-
A Parallel Corpus-Based Approach to the Crime Event Extraction for Low-Resource Languages
PublikacjaThese days, a lot of crime-related events take place all over the world. Most of them are reported in news portals and social media. Crime-related event extraction from the published texts can allow monitoring, analysis, and comparison of police or criminal activities in different countries or regions. Existing approaches to event extraction mainly suggest processing texts in English, French, Chinese, and some other resource-rich...
-
High power, zero ripples active filtering system with power modules operating in parallel
Publikacja -
Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
PublikacjaWe report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...
-
Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors
PublikacjaThe paper presents results from benchmarking the parallel multithreaded Stockfish chess engine on selected multi- and many-core processors. It is shown how the strength of play for an n-thread version compares to 1-thread version on both Intel Xeon and latest Intel Xeon Phi x200 processors. Results such as the number of wins, losses and draws are presented and how these change for growing numbers of threads. Impact of using particular...
-
Efficient parallel algorithms in global optimization of potential energy functions for peptides, proteins, and crystals
Publikacja -
Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.
PublikacjaW pracy podjęto badania i przeprowadzono symulacje zjawisk elektrofizjologicznych w mięśniu sercowym z wykorzystaniem wytworzonego w tym celu oprogramowania równoległego opartego na MPI. Zaimplementowano i zbadano ulepszenia kodu prowadzące do uzyskania dobrej skalowalności oraz przeprowadzono testy wydajności na najnowszych 32 i 64-bitowych klastrach linuksowych. Praca stanowi próbę równoległej implementacji znanego podejścia...
-
Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing
PublikacjaW artykule prezentujemy szczegóły projektowo-implementacyjne naszego modularnego kodu symulacyjnego z wykorzystaniem MPI, w tym nakładaniem obliczeń i komunikacji. Podkreślamy modularność naszej implementacji pozwalającą na łatwą adaptację kodu dla innych zasotosowań. Prezentujemy związek pomiędzy przyspieszeniem obliczeń, rozmiarem i kształtami trójwymiarowych domen z różnymi stosunkami liczby węzłów aktualizowanych przez procesor...
-
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
PublikacjaThis paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
-
ARUZ — Large-scale, massively parallel FPGA-based analyzer of real complex systems
Publikacja -
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
PublikacjaPerformance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
PublikacjaResults of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to chose an algorithm, a number of threads and a task scheduling method, that together provide satisfactory accuracy and efficiency of a real-time processing of high resolution camera images, maintaining the cost of resources usage at a reasonable level. Two...
-
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
PublikacjaThis paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...
-
Multi-agent large-scale parallel crowd simulation with NVRAM-based distributed cache
PublikacjaThis paper presents the architecture, main components and performance results for a parallel and modu-lar agent-based environment aimed at crowd simulation. The environment allows to simulate thousandsor more agents on maps of square kilometers or more, features a modular design and incorporates non-volatile RAM (NVRAM) with a fail-safe mode that can be activated to allow to continue computationsfrom a recently analyzed state in...
-
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
PublikacjaIn the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...
-
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
PublikacjaRapid development of diverse computer architectures and hardware accelerators caused that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
PublikacjaIn the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...