Search results for: PARALLEL APPLICATIONS


Total: 497

Modeling Parallel Applications in the MERPSYS Environment
Publication
- P. Czarnul
- Year 2016
The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....
Modeling energy consumption of parallel applications
Publication
- Annals of Computer Science and Information Systems - Year 2016
The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. Model of an application uses the Java language with extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...

Full text available for download in the portal
Simulation of Parallel Applications on Large-scale Distributed Systems
Publication
- P. Rościszewski
- P. Sidorczak
- Year 2014
This chapter has a form of a review article in the field of simulating High-Performance Computing systems. We justify the need for a new versatile simulator considering heterogeneity, energy efficiency and reliability of HPC systems. We sketch the problems that need to be solved by such simulator and rationalize using discrete-event simulation for this purpose. Based on a review of existing discrete-event HPC simulation solutions...
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
Publication
- P. Czarnul
- Electronics - Year 2021
The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

Full text available for download in the portal
Performance/energy aware optimization of parallel applications on GPUs under power capping
Publication
- A. Krzywaniak
- P. Czarnul
- Year 2020
In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to the benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

Full text available for download in the portal
Analyzing energy/performance trade-offs with power capping for parallel applications on modern multi and many core processors
Publication
- Annals of Computer Science and Information Systems - Year 2018
In the paper we present extensive results from analyzing energy/performance trade-offs with power capping observed on four different modern CPUs, for three different parallel applications such as 2D heat distribution, numerical integration and Fast Fourier Transform. The CPUs tested represent both multi-core type CPUs such as Intel® Xeon® E5, desktop and mobile i7 as well as many-core Intel® Xeon Phi™ x200 but also server, desktop...

Full text available for download in the portal
Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications
Publication
- Ł. Jarząbek
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2017
The aim of this paper is to evaluate performance of new CUDA mechanisms—unified memory and dynamic parallelism for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically,...

Full text available for download in the portal
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
Publication
- P. Czarnul
- M. Frączak
- Year 2005
The paper presents design and implementation details as well as performance results of two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API based on MPI constants. It uses MPI-2 I/O functions or a dedicated master process...
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
Publication
- T. Kononowicz
- P. Czarnul
- Applied Sciences-Basel - Year 2022
In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available for download in the portal
A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications
Publication
- A. Malinowski
- P. Czarnul
- M. Maciejewski
- P. Skowron
- Year 2016
The paper presents a fail-safe NVRAM based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed in it. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM based cache like mechanism to significantly speed up I/O operations and allow to preload...

Full text available for download from an external service
Performance evaluation of Unified Memory with prefetching and oversubscription for selected parallel CUDA applications on NVIDIA Pascal and Volta GPUs
Publication
- M. Knap
- P. Czarnul
- JOURNAL OF SUPERCOMPUTING - Year 2019
The paper presents assessment of Unified Memory performance with data prefetching and memory oversubscription. Several versions of code are used with: standard memory management, standard Unified Memory and optimized Unified Memory with programmer-assisted data prefetching. Evaluation of execution times is provided for four applications: Sobel and image rotation filters, stream image processing and computational fluid dynamic simulation,...

Full text available for download in the portal
Checkpointing of Parallel MPI Applications using MPI One-sided API with Support for Byte-addressable Non-volatile RAM
Publication
- P. Dorożyński
- P. Czarnul
- A. Malinowski
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Year 2016
The increasing size of computational clusters results in an increasing probability of failures, which in turn requires application checkpointing in order to survive those failures. Traditional checkpointing requires data to be copied from application memory into persistent storage medium, which increases application execution time as it is usually done in a separate step. In this paper we propose to use emerging byte-addressable...

Full text available for download from an external service
International Conference on Parallel and Distributed Computing, Applications and Technologies

Conferences
IEEE International Symposium on Parallel and Distributed Processing with Applications

Conferences
International Conference on Parallel and Distributed Processing Techniques and Applications

Conferences
International workshop on High-Level Parallel Programming and Applications

Conferences
International Workshop on Formal Methods for Parallel Programming: Theory and Applications

Conferences
Parallel processing of multimedia streams
Publication
- Computer Applications in Electrical Engineering - Year 2010
The chapter presents the KASKADA platform for processing multimedia streams. Its design is described: UML class and sequence diagrams illustrating the stream processing mechanisms, and communication details. A specialized framework supporting the creation and execution of algorithms, as well as the definition of service scenarios, is also presented, together with an assessment of their usability.
Parallel implementation of a Sailing Assistance Application in a Cloud Environment
Publication
- IEEE Access - Year 2023
Sailboat weather routing is a highly complex problem in terms of both the computational time and memory. The reason for this is a large search resulting in a multitude of possible routes and a variety of user preferences. Analysing all possible routes is only feasible for small sailing regions, low-resolution maps, or sailboat movements on a grid. Therefore, various heuristic approaches are often applied, which can find solutions...

Full text available for download in the portal
Parallel Programming for Modern High Performance Computing Systems
Publication
- P. Czarnul
- Year 2018
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text available for download from an external service
Paweł Czarnul dr hab. inż.

People

Cloud Services Division, Faculty of Electronics, Telecommunications and Informatics, Department of Computer Systems Architecture

Paweł Czarnul obtained the degree of doktor habilitowany (habilitation) in technical sciences in the discipline of computer science in 2015, and the degree of doctor of technical sciences in the field of computer science (with distinction), awarded by the Council of the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, in 2003. His areas of interest include: parallel and distributed processing, including parallel programming on computing clusters,...
Simulation of parallel similarity measure computations for large data sets
Publication
- Year 2015
The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...

Full text available for download from an external service
A Workflow Application for Parallel Processing of Big Data from an Internet Portal
Publication
- P. Czarnul
- Year 2014
The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first ready subdirectories can start fast and is handled by services implemented as parallel multithreaded applications using multiple...

Full text available for download from an external service
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2017
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...

Full text available for download in the portal
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Publication
- P. Rościszewski
- Year 2014
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive makes it possible to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available for download from an external service
Implementation of FDTD-compatible Green's function on heterogeneous CPU-GPU parallel processing system
Publication
- T. Stefański
- Progress in Electromagnetics Research-PIER - Year 2013
This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The developed implementation simultaneously utilizes computational power of the central processing unit (CPU) and the graphics processing unit (GPU) to the computational tasks best suited to each architecture. Recently, closed-form expression for this discrete Green's function (DGF) was derived, which facilitates...

Full text available for download from an external service
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
Publication
- P. Rościszewski
- Procedia Computer Science - Year 2017
In the paper we tackle bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...

Full text available for download in the portal
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication
- A. Malinowski
- P. Czarnul
- P. Dorożyński
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Annals of Computer Science and Information Systems - Year 2016
While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

Full text available for download in the portal
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication
- P. Czarnul
- P. Rościszewski
- Year 2020
Auto-tuning of configuration and application parameters allows to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...

Full text available for download in the portal
Development and tuning of irregular divide-and-conquer applications in DAMPVM/DAC
Publication
- P. Czarnul
- Year 2002
This work presents implementations and tuning experiences with parallel irregular applications developed using the object oriented framework DAMPVM/DAC. It is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime and dynamic mapping to processors taking into account their speeds and even loads by other user processes. New implementations of parallel applications...

Full text available for download from an external service
Parallelization of Compute Intensive Applications into Workflows based on Services in BeesyCluster
Publication
- P. Czarnul
- Year 2011
The paper presents an approach for modeling, optimization and execution of workflow applications based on services that incorporates both service selection and partitioning of input data for parallel processing by parallel workflow paths. A compute-intensive workflow application for parallel integration is presented. An impact of the input data partitioning on the scalability is presented. The paper shows a comparison of the theoretical...

Full text available for download in the portal
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
Publication
- K. Łopatka
- Year 2015
A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Publication
- P. Rościszewski
- Year 2018
Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing full power of such systems requires programming parallel applications that are hybrid in two meanings: they can utilize parallelism on multiple levels at the same time and combine together programming interfaces...

Full text available for download from an external service
Parallel Cooperating A-Teams
Publication
- D. Barbucha
- I. Czarnowski
- P. Jędrzejowicz
- E. Ratajczak-Ropel
- I. Wierzbowska
- Year 2011
Full text available for download from an external service
Integration of Services into Workflow Applications
Publication
- P. Czarnul
- Year 2015
Describing state-of-the-art solutions in distributed system architectures, Integration of Services into Workflow Applications presents a concise approach to the integration of loosely coupled services into workflow applications. It discusses key challenges related to the integration of distributed systems and proposes solutions, both in terms of theoretical aspects such as models and workflow scheduling algorithms, and technical...

Full text available for download from an external service
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication
- J. Proficz
- P. Czarnul
- Year 2016
The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...

Full text available for download in the portal
Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors
Publication
- P. Czarnul
- INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING - Year 2016
The paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...

Full text available for download in the portal
A Concept of Modeling and Optimization of Applications in Large Scale Systems
Publication
- P. Czarnul
- Year 2013
The chapter presents the idea that includes modeling and subsequent optimization of application execution on large scale parallel and distributed systems. The model considers performance, reliability and power consumption. It should allow easy modeling of various classes of applications while reflecting key parameters of both the applications and two classes of target systems: clusters and volunteer based systems. The chapter presents...
Parallel scheduling by graph ranking
Publication
- D. Dereniowski
- Year 2006
Document no.: 73017. The work concerns one of the non-classical models of graph coloring: ordered coloring. The aim was to obtain results that can be used in practical applications of this model, which include parallel processing of queries in relational databases, parallel Cholesky factorization of matrices, and parallel assembly of a product from its component parts. The work indicates generalizations...
Conformance testing of parallel languages
Publication
- Year 2002
A proposal is presented for formalizing the description of the process of generating, executing and evaluating conformance tests for parallel programming languages and libraries, with respect to functional and performance conformance. The examples illustrating the proposed formalism use the Athapascan programming platform.
Parallel processing of multimedia streams
Publication
- Year 2010
The article presents a new library supporting the creation of computational tasks, a part of the KASKADA platform. The design of the library is presented, including a diagram of the main classes and a sequence diagram. The latter shows the cooperation of the main classes in the processing of multimedia streams. Further on, the details of the inter-task communication mechanism are discussed, and the article presents a graph...
Parallel immune system for graph coloring
Publication
- J. Dąbrowski
- Year 2008
This paper presents a parallel artificial immune system designed for graph coloring. The algorithm is based on the clonal selection principle. Each processor operates on its own pool of antibodies and a migration mechanism is used to allow processors to exchange information. Experimental results show that migration improves the performance of the algorithm. The experiments were performed using a high performance cluster on a set...

Full text available for download from an external service
Scheduling of compatible jobs on parallel machines
Publication
- T. Pikies
- Year 2021
The dissertation discusses the problems of scheduling compatible jobs on parallel machines. Some jobs are incompatible, which is modeled as a binary relation on the set of jobs; the relation is often modeled by an incompatibility graph. We consider two models of machines. The first model, more emphasized in the thesis, is a classical model of scheduling, where each machine does one job at a time. The second one is a model of p-batching...
Parallel Computations of Text Similarities for Categorization Task
Publication
- J. Szymański
- Year 2013
In this chapter we describe the approach to parallel implementation of similarities in high dimensional spaces. The similarities computation has been used for textual data categorization. We create the test datasets from Wikipedia articles which, together with their hyper references, formed a graph used in our experiments. The similarities based on Euclidean distance and Cosine measure have been used to process the data using k-means algorithm....
NVRAM as Main Storage of Parallel File System
Publication
- A. Malinowski
- Journal of Computer Science and Control Systems - Year 2016
Modern cluster environments' main trouble used to be lack of computational power provided by CPUs and GPUs, but recently they suffer more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents evaluation of using non-volatile random-access memory as a...

Full text available for download from an external service
Testing for conformance of parallel programming pattern languages
Publication
- Ł. Garstecki
- P. Kaczmarek
- J. C. D. Kergommeaux
- H. Krawczyk
- B. Wiszniewski
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2002
This paper reports on the project being run by TUG and IMAG, aimed at reducing the volume of tests required to exercise parallel programming language compilers and libraries. The idea is to use the ISO STEP standard scheme for conformance testing of software products. A detailed example illustrating the ongoing work is presented.
Bounds on the Cover Time of Parallel Rotor Walks
Publication
- D. Dereniowski
- A. Kosowski
- D. Pająk
- P. Uznański
- Year 2014
The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node maintains a cyclic ordering of its outgoing arcs, and successively propagates walkers which visit it along its outgoing arcs in...

Full text available for download from an external service
Coordination in serial-parallel image processing
Publication
- W. Wójcik
- V. Dubovoi
- M. Duda
- R. Romaniuk
- L. Yesmakhanova
- A. Kozbakova
- R. S. Romaniuk
- Year 2015
Full text available for download from an external service
The parallel environment for endoscopic image analysis
Publication
- H. Krawczyk
- A. Neyman
- M. Nowikowski
- J. Saif
- Year 2002
The jPVM-oriented environment to support high performance computing required for the Endoscopy Recommender System (ERS) is defined. SPMD model of image matching is considered and its two implementations are proposed: Lexicographical Searching Algorithm (LSA) and Gradient Searching Algorithm (GSA). Three classes of experiments are considered and the relative degree of similarity and execution time of each algorithm are analysed....

Full text available for download from an external service
Bounds on the cover time of parallel rotor walks
Publication
- D. Dereniowski
- A. Kosowski
- D. Pająk
- P. Uznański
- JOURNAL OF COMPUTER AND SYSTEM SCIENCES - Year 2016
The rotor-router mechanism was introduced as a deterministic alternative to the random walk in undirected graphs. In this model, a set of k identical walkers is deployed in parallel, starting from a chosen subset of nodes, and moving around the graph in synchronous steps. During the process, each node successively propagates walkers visiting it along its outgoing arcs in round-robin fashion, according to a fixed ordering. We consider...

Full text available for download in the portal
