Search results for: MPI

Total results: 51

BC-MPI: running an MPI application on multiple clusters with BeesyCluster connectivity
Publication
- P. Czarnul
- Year 2007
The article proposes a new package, BC-MPI, which enables running an MPI application on multiple clusters with different MPI implementations. It uses dedicated MPI implementations for communication within clusters and the MPI_THREAD_MULTIPLE mode for communication between clusters in additional threads of the MPI application. Furthermore, a BC-MPI application can be automatically compiled and run by the BeesyCluster middleware. BeesyCluster enables...

Full text available for download from an external service
Checkpointing of Parallel MPI Applications using MPI One-sided API with Support for Byte-addressable Non-volatile RAM
Publication
- P. Dorożyński
- P. Czarnul
- A. Malinowski
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Year 2016
The increasing size of computational clusters results in an increasing probability of failures, which in turn requires application checkpointing in order to survive those failures. Traditional checkpointing requires data to be copied from application memory into persistent storage medium, which increases application execution time as it is usually done in a separate step. In this paper we propose to use emerging byte-addressable...

Full text available for download from an external service
Towards Easy-to-Use Checkpointing of MPI Applications within CLUSTERIX.
Publication
- Year 2004
The literature lists many libraries and systems, at both kernel and user level, that support saving and restoring process state. For parallel applications, however, this remains a difficult task. The paper presents our programmer-assisted approach to saving and restoring the state of MPI applications, which will be used in the environment of the CLUSTERIX project, i.e. an integrated group of clusters...
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication
- J. Proficz
- P. Czarnul
- Year 2016
The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...

Full text available for download in the portal
Object serialization and remote exception pattern for distributed C++/MPI application
Publication
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2007
MPI is a commonly used standard in the development of scientific applications. It focuses on interlanguage operability and is not very object-oriented. The paper proposes a general pattern enabling the design of distributed, object-oriented applications. It also presents sample implementations of the pattern and performance tests.
New user-guided and ckpt-based checkpointing libraries for parallel MPI applications
Publication
- P. Czarnul
- M. Frączak
- Year 2005
The paper presents design and implementation details as well as performance results of two new checkpointing libraries developed by the authors for parallel MPI applications. The first library, called user-guided, requires the programmer to supply functions that pack and unpack the process state, but provides an easy-to-use API that uses MPI constants. It uses MPI-2 I/O functions or a dedicated master process...
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication
- A. Malinowski
- P. Czarnul
- Scalable Computing: Practice and Experience - Year 2018
The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available for download in the portal
Portable parallel simulator using MPI for 2D and 3D domains: design and performance testing
Publication
- P. Czarnul
- K. Grzęda
- Year 2005
In the article we present design and implementation details of our modular MPI-based simulation code, including overlapping of computations and communication. We emphasize the modularity of our implementation, which allows the code to be easily adapted to other applications. We present the relationship between computational speed-up and the sizes and shapes of three-dimensional domains with various ratios of the number of nodes updated by a processor...
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication
- A. Malinowski
- P. Czarnul
- Procedia Computer Science - Year 2018
The paper presents architecture and design of three versions for fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...

Full text available for download in the portal
Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
Publication
- J. Proficz
- P. Sumionka
- J. Skomiał
- M. Semeniuk
- K. Niedzielewski
- M. Walczak
- Advances in Intelligent Systems and Computing - Year 2020
The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically-distributed compute cluster. The testbed was split into two sites: CI TASK in Gdansk University of Technology and ICM in University of Warsaw, located about 300 km from each other, both connected by a fast optical fiber Ethernet-based 100 Gbps network (900 km part of the PIONIER backbone). Each site hosted a set...

Full text available for download in the portal
Efficient middleware layer for master-slave computations in a C++/MPI environment [Efektywna warstwa pośrednicząca dla obliczeń typu master-slave w środowisku C++/MPI]
Publication
- K. Bańczyk
- Year 2006
The paper shows how, for a high-performance algorithm written in C++ in the master-slave model and satisfying certain constraints, one can write and use a communication layer that completely separates the code responsible for communication from the code responsible for the problem domain. It presents a specification of the requirements that a hypothetical distributed system and the communication layer should meet, as well as the requirements...
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
Publication
- T. Kononowicz
- P. Czarnul
- Applied Sciences-Basel - Year 2022
In the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...

Full text available for download in the portal
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication
- A. Malinowski
- P. Czarnul
- P. Dorożyński
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Annals of Computer Science and Information Systems - Year 2016
While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

Full text available for download in the portal
A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications
Publication
- A. Malinowski
- P. Czarnul
- M. Maciejewski
- P. Skowron
- Year 2016
The paper presents a fail-safe NVRAM based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed in it. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM based cache like mechanism to significantly speed up I/O operations and allow to preload...

Full text available for download from an external service
Application of byte-addressable NVRAM to increase the performance of selected parallel applications using MPI I/O [Zastosowanie bajtowo adresowanej pamięci NVRAM do zwiększenia wydajności wybranych aplikacji równoległych wykorzystujących MPI I/O]
Publication
- A. Malinowski
- Year 2019
Many current studies address the growing problem of file operation performance in cluster environments. At the same time, according to recent reports on the development of computer memory technologies, byte-addressable non-volatile random-access memory devices should appear on the market in the near future. This dissertation shows that such memory can be used to increase the performance of selected...

Full text available for download in the portal
European MPI Users' Group Conference (European PVM/MPI Users' Group Conference)

Conferences
Analyses of wave records from the Southern Baltic Sea with the emphases on large wave events
Publication
- B. Paplińska-Swerpel
- M. Paprota
- J. Przewłócki
- W. Sulisz
- Year 2005
The paper presents results of an analysis of wave time-series measurements obtained from Waverider measurement buoys placed at several locations along the Baltic Sea coastline. The aim of the study was to identify periods and regions in which maximum waves may occur. The influence of storm duration and wind direction on the formation of single extreme waves and groups of extreme waves was considered.
Paweł Czarnul, DSc, PhD, Eng.

People

Cloud Services Division, Faculty of Electronics, Telecommunications and Informatics, Department of Computer Architecture

Paweł Czarnul obtained the degree of doktor habilitowany (DSc) in technical sciences in the discipline of computer science in 2015, and a PhD in technical sciences in computer science (with distinction), awarded by the Council of the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, in 2003. His areas of interest include parallel and distributed processing, including parallel programming on computing clusters,...
Process arrival pattern aware algorithms for acceleration of scatter and gather operations
Publication
- J. Proficz
- Cluster Computing-The Journal of Networks Software Tools and Applications - Year 2020
Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...

Full text available for download in the portal
Dissociative multi-photon ionization of isolated uracil and uracil-adenine complexes
Publication
- M. Ryszka
- R. Pandey
- C. Rizk
- J. Tabet
- B. Barc
- M. Dampc
- N. Mason
- S. Eden
- INTERNATIONAL JOURNAL OF MASS SPECTROMETRY - Year 2016
Recent multi-photon ionization (MPI) experiments on uracil revealed a fragment ion at m/z 84 that was proposed as a potential marker for ring opening in the electronically excited neutral molecule. The present MPI measurements on deuterated uracil identify the fragment as C3H4N2O+ (uracil+ less CO), a plausible dissociative ionization product from the theoretically predicted open-ring isomer. Equivalent measurements on thymine...

Full text available for download from an external service
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
- A. Malinowski
- P. Czarnul
- Year 2017
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

Full text available for download from an external service
Parallelization of Compute Intensive Applications into Workflows based on Services in BeesyCluster
Publication
- P. Czarnul
- Year 2011
The paper presents an approach for modeling, optimization and execution of workflow applications based on services that incorporates both service selection and partitioning of input data for parallel processing by parallel workflow paths. A compute-intensive workflow application for parallel integration is presented. An impact of the input data partitioning on the scalability is presented. The paper shows a comparison of the theoretical...

Full text available for download in the portal
Parallel simulations of electrophysiological phenomena in myocardium on large 32 and 64-bit Linux clusters.
Publication
- P. Czarnul
- K. Grzęda
- Year 2004
The paper investigates and simulates electrophysiological phenomena in the myocardium using MPI-based parallel software developed for this purpose. Code improvements leading to good scalability were implemented and examined, and performance tests were carried out on the latest 32- and 64-bit Linux clusters. The work is an attempt at a parallel implementation of a known approach...
Exception handling strategies in distributed applications [Strategie obsługi wyjątków w aplikacjach rozproszonych]
Publication
- P. Kaczmarek
- H. Krawczyk
- Studia Informatica Pomerania - Year 2003
The use of the exception handling mechanism in distributed systems is considered. Various exception handling strategies are presented for different processing models and their corresponding programming environments. A new concept of a remote exception receiver is adopted, and its implementation using the MPI library and RMI is presented.
Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns
Publication
- J. Proficz
- K. Ocetkiewicz
- JOURNAL OF SUPERCOMPUTING - Year 2021
The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented on and improved. The comments concern handling certain edge cases in the original pseudocode and description, i.e., adding another state of a process, improved cache friendliness, more precise complexity estimations and some other issues improving the robustness of the algorithm implementation....

Full text available for download in the portal
Communication protocols for transmission of multimedia streams on the KASKADA platform [Protokoły łączności do transmisji strumieni multimedialnych na platformie KASKADA]
Publication
- Year 2013
The KASKADA platform, understood as a system for processing multimedia streams, provides a range of services that support public safety and the assessment of medical examinations. The performance of the KASKADA platform depends to a significant degree on the efficiency of communication methods, including the exchange of multimedia data, which form the basis of processing. The aim of the work was to design a communication subsystem...
Multi-agent large-scale parallel crowd simulation
Publication
- A. Malinowski
- P. Czarnul
- K. Czuryło
- M. Maciejewski
- P. Skowron
- Year 2017
This paper presents design, implementation and performance results of a new modular, parallel, agent-based and large scale crowd simulation environment. A parallel application, implemented with C and MPI, was developed and run in this parallel environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland and further in the area of districts of Gdansk. The application uses a...

Full text available for download from an external service
Space applications of advanced information technologies [Kosmiczne zastosowania zaawansowanych technologii informatycznych]
Online Courses
- J. Proficz
- A. Królicka-Gałązka
Modern technologies for using high-performance computing systems: cluster-architecture supercomputers, illustrated with environments for large-scale data processing (Big Data), cloud computing, and the classic message-passing approach (MPI: Message Passing Interface) for batch processing.
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Publication
- J. Proficz
- ACM Transactions on Architecture and Code Optimization - Year 2021
Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...

Full text available for download in the portal
Parallelisation of genetic algorithms for solving university timetabling problems
Publication
- Year 2006
Genetic algorithms are an important method for solving optimization problems. The article focuses on the design of a parallel genetic algorithm that produces university timetables satisfying both hard and soft constraints. The reader is introduced to some known parallelization methods, and the authors' approach, which uses MPI, is also presented. A management structure was adopted that is based...
Multidimensional characteristic of an internal combustion engine as a component of a hybrid vehicle powertrain [Charakterystyka wielowymiarowa silnika spalinowego jako elementu hybrydowego układu napędowego pojazdu]
Publication
- J. Kropiwnicki
- S. Makowski
- Year 2005
A multidimensional characteristic assigns to each engine operating point a vector whose components are, in the case considered, the specific fuel consumption ge and the specific emissions of toxic exhaust components: carbon monoxide (CO), hydrocarbons (HC) and nitrogen oxides (NOx). The paper describes a test stand that makes it possible to perform the measurements needed to determine the multidimensional characteristic of an internal combustion engine intended...
Simulation of parallel similarity measure computations for large data sets
Publication
- Year 2015
The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...

Full text available for download from an external service
Use of ICT infrastructure for teaching HPC
Publication
- P. Czarnul
- M. Matuszek
- Year 2019
In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

Full text available for download from an external service
Workflow application for detection of unwanted events
Publication
- P. Czarnul
- W. Kicior
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A distributed application for detecting potentially dangerous events in input video streams is presented. Recognition of unwanted events triggers alarms, sends notifications to the appropriate services and causes the video to be recorded. The application model consists of nodes with cameras and nodes that fetch data streams, process data, send notifications and store data. The implemented application...
Current state of work on hybrid vehicle drives at Gdańsk University of Technology [Aktualny stan prac nad napędem hybrydowym pojazdów na Politechnice Gdańskiej]
Publication
- S. Makowski
- Year 2004
The paper describes work on modernizing the experimental hybrid vehicle PH-MAK and the test stand, whose main component is a chassis dynamometer. The vehicle modernization included, among other things, replacing the carburetted internal combustion engine with a fuel-injected Lombardini LGW 523 MPI engine, using new-generation gel batteries, and purchasing a Badicheq battery controller, which enables accurate measurement...
BeesyCluster as Front-End for High Performance Computing Services
Publication
- P. Czarnul
- TASK Quarterly - Year 2015
The paper presents the BeesyCluster system as a middleware allowing invocation of services on high performance computing resources within the NIWA Centre of Competence project. Access is possible through both WWW and SOAP Web Service interfaces. The former allows non-experienced users to invoke both simple and complex services exposed through easy-to-use servlets. The latter is meant for integration of external applications with...

Full text available for download in the portal
NVRAM as Main Storage of Parallel File System
Publication
- A. Malinowski
- Journal of Computer Science and Control Systems - Year 2016
The main problem of modern cluster environments used to be a lack of computational power provided by CPUs and GPUs, but recently they have suffered more and more from insufficient performance of input and output operations. Apart from better network infrastructure and more sophisticated processing algorithms, many solutions are based on emerging memory technologies. This paper presents evaluation of using non-volatile random-access memory as a...

Full text available for download from an external service
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication
- A. Krzywaniak
- P. Czarnul
- Advances in Intelligent Systems and Computing - Year 2017
In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Full text available for download in the portal
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
Publication
- P. Czarnul
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2023
In the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs and extended CUDA calls allow to manage CUDA objects, data and launch kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...

Full text available for download in the portal
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
Publication
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016
The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text available for download from an external service
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text available for download from an external service
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2017
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...

Full text available for download in the portal
Parallel Programming for Modern High Performance Computing Systems
Publication
- P. Czarnul
- Year 2018
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text available for download from an external service
Multi-GPU-powered UNRES package for physics-based coarse-grained simulations of structure, dynamics, and thermodynamics of protein systems at biological size- and timescales
Publication
- C. Czaplewski
- P. Czarnul
- H. Krawczyk
- A. Lipska
- E. Lubecka
- K. Ocetkiewicz
- J. Proficz
- A. Sieradzan
- R. Ślusarz
- J. Liwo
- BIOPHYSICAL JOURNAL - Year 2024
Coarse-grained models are nowadays extensively used in biomolecular simulations owing to the tremendous extension of size- and time-scale of simulations. The physics-based UNRES (UNited RESidue) model of proteins developed in our laboratory has only two interaction sites per amino-acid residue (united peptide groups and united side chains) and implicit solvent. However, owing to rigorous physics-based derivation, which enabled...

Full text available for download from an external service
Trace Metal Contamination of Bottom Sediments: A Review of Assessment Measures and Geochemical Background Determination Methods
Publication
- N. Nawrot
- E. Wojciechowska
- M. Mohsin
- S. Kuittinen
- A. Pappinen
- S. Rezania
- Minerals - Year 2021
This paper provides an overview of different methods of assessing the trace metal (TM) contamination status of sediments affected by anthropogenic interference. The geochemical background determination methods are also described. A total of 25 papers covering rivers, lakes, and retention tanks sediments in areas subjected to anthropogenic pressure from the last three years (2019, 2020, and 2021) were analysed to support our examination...

Full text available for download in the portal
Energy Consumption Modeling in SPMD and DAC Applications
Publication
- J. Kuchta
- Year 2016
In this chapter, we present a study of energy consumption during execution of SPMD and DAC applications, the same applications whose execution time we modeled in the previous two chapters. We measured the average power usage at a single node of the GALERA+ cluster during application execution and then modeled the total energy consumed by the application. Next we simulated the applications using MERPSYS and we compared the...
Modeling SPMD Application Execution Time
Publication
- J. Kuchta
- Year 2016
Parallel applications in the Single Program Multiple Data paradigm split huge amounts of data among multiple processors working in parallel on small data packets. As the individual data packets are not independent, the processors must interact with each other to exchange results of the calculations with their adjacent partners and take these results into account in their own computations. An example of SPMD is geometric parallelism...
Testing for conformance of parallel programming pattern languages
Publication
- Ł. Garstecki
- P. Kaczmarek
- J. C. D. Kergommeaux
- H. Krawczyk
- B. Wiszniewski
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2002
This paper reports on the project being run by TUG and IMAG, aimed at reducing the volume of tests required to exercise parallel programming language compilers and libraries. The idea is to use the ISO STEP standard scheme for conformance testing of software products. A detailed example illustrating the ongoing work is presented.
Improving all-reduce collective operations for imbalanced process arrival patterns
Publication
- J. Proficz
- JOURNAL OF SUPERCOMPUTING - Year 2018
Two new algorithms for the all-reduce operation optimized for imbalanced process arrival patterns (PAPs) are presented: (1) sorted linear tree and (2) pre-reduced ring. A new way of online PAP detection, including process arrival time estimations and their distribution between cooperating processes, is also introduced. The idea, pseudo-code, implementation details, benchmark for performance evaluation and a real case example...

Full text available for download in the portal
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication
- Scientific Programming - Year 2020
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...

Full text available for download in the portal
