Search results for: parallel computing

Total results: 97

Review of parallel computing methods and tools for FPGA technology
Publication
- R. Cieszewski
- M. Linczuk
- K. Pozniak
- R. Romaniuk
- R. S. Romaniuk
- Year 2013
Full text available for download from an external service
ACM Transactions on Parallel Computing

Journals

ISSN: 2329-4949 , eISSN: 2329-4957
Parallel Programming for Modern High Performance Computing Systems
Publication
- P. Czarnul
- Year 2018
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text available for download from an external service
Highly parallel distributed computing systems with optical interconnections
Publication
- J. Just
- R. Romaniuk
- R. S. Romaniuk
- Microprocessing and Microprogramming - Year 1989
Full text available for download from an external service
Highly Parallel Distributed Computing System With Optical Interconnections
Publication
- J. Just
- R. Romaniuk
- R. S. Romaniuk
- Year 1990
Full text available for download from an external service
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication
- Year 2014
Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text available for download from an external service
PARALLEL COMPUTING

Journals

ISSN: 0167-8191 , eISSN: 1872-7336
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication
- Scientific Programming - Year 2020
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...

Full text available for download in the portal
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
- J. Skrzypczak
- P. Czarnul
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...

Full text available for download from an external service
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Publication
- P. Rościszewski
- Year 2018
Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing full power of such systems requires programming parallel applications that are hybrid in two meanings: they can utilize parallelism on multiple levels at the same time and combine together programming interfaces...

Full text available for download from an external service
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING

Journals

ISSN: 0743-7315 , eISSN: 1096-0848
Machine Learning in Multi-Agent Systems using Associative Arrays
Publication
- P. Spychalski
- R. Arendt
- PARALLEL COMPUTING - Year 2018
In this paper, a new machine learning algorithm for multi-agent systems is introduced. The algorithm is based on associative arrays, thus it becomes a less complex and more efficient substitute for artificial neural networks and Bayesian networks, which is confirmed by performance measurements. Implementation of the machine learning algorithm in a multi-agent system for aided design of selected control systems made it possible to improve the performance...

Full text available for download in the portal
Parallel Computing

Conferences
International Parallel Computing Workshop

Conferences
IFIP International Conference on Network and Parallel Computing

Conferences
International Conference on Massively Parallel Computing Systems

Conferences
Drawing maps with advice
Publication
- D. Dereniowski
- A. Pelc
- JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING - Year 2012
We consider the following computational problem. An agent is placed at a node of a graph that is unknown to it. The nodes of the graph are indistinguishable, whereas the edges have port numbers. The agent's task is to determine a map, i.e., to compute an isomorphic copy of the graph, or to compute an arbitrary spanning tree of the graph. Without additional information, these tasks cannot be accomplished. In the paper we establish bounds on the minimum number...

Full text available for download from an external service
International Symposium on Parallel and Distributed Computing

Conferences
International European Conference on Parallel and Distributed Computing

Conferences
Australasian Symposium on Parallel and Distributed Computing (was AusGrid)

Conferences
International Conference on Parallel and Distributed Computing, Applications and Technologies

Conferences
Euro-Par: International European Conference on Parallel and Distributed Computing

Conferences
International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing

Conferences
Paweł Czarnul dr hab. inż.

People

Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics

Paweł Czarnul obtained the degree of doktor habilitowany (habilitation) in technical sciences in the discipline of computer science in 2015, and the degree of doctor of technical sciences in computer science (with distinction), awarded by the Council of the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, in 2003. His areas of interest include: parallel and distributed processing, including parallel programming on computing clusters,...
Jerzy Konorski dr hab. inż.

People

Department of Teleinformatics

Jerzy Konorski received the MSc Eng. degree in telecommunications from Gdańsk University of Technology, and the PhD degree in technical sciences in the discipline of computer science from the Institute of Computer Science of the Polish Academy of Sciences. In 2007 he defended his habilitation thesis at the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology. He is the author of over 150 scientific publications and has led research projects funded by the State Committee for Scientific Research, the EU, the US Air Force Office of Scientific...
Paweł Rościszewski dr inż.

People

Paweł Rościszewski received his PhD in Computer Science at Gdańsk University of Technology in 2018 based on PhD thesis entitled: "Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption". Currently, he is an Assistant Professor at the Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Poland....
General Provisioning Strategy for Local Specialized Cloud Computing Environments
Publication
- P. Orzechowski
- H. Krawczyk
- Year 2023
The well-known management strategies in cloud computing based on SLA requirements are considered. A deterministic parallel provisioning algorithm has been prepared and used to show its behavior for three different requirements: load balancing, consolidation, and fault tolerance. The impact of these strategies on the total execution time of different sets of services is analyzed for randomly chosen sets of data. This makes it possible...

Full text available for download from an external service
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Publication
- P. Rościszewski
- Year 2014
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive makes it possible to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available for download from an external service
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication
- P. Czarnul
- P. Rościszewski
- Year 2020
Auto-tuning of configuration and application parameters makes it possible to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...

Full text available for download in the portal
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2017
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...

Full text available for download in the portal
PPAM 2022

Events

11-09-2022 07:00 - 14-09-2022 13:56

The PPAM 2022 conference will cover topics in parallel and distributed computing, including theory and applications, as well as applied mathematics.
Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors
Publication
- P. Czarnul
- Year 2018
The paper presents results from benchmarking the parallel multithreaded Stockfish chess engine on selected multi- and many-core processors. It is shown how the strength of play for an n-thread version compares to 1-thread version on both Intel Xeon and latest Intel Xeon Phi x200 processors. Results such as the number of wins, losses and draws are presented and how these change for growing numbers of threads. Impact of using particular...

Full text available for download from an external service
Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
Publication
- A. Sieradzan
- J. Sans‐Duñó
- E. Lubecka
- C. Czaplewski
- A. Lipska
- H. Leszczyński
- K. Ocetkiewicz
- J. Proficz
- P. Czarnul
- H. Krawczyk
- A. Liwo
- JOURNAL OF COMPUTATIONAL CHEMISTRY - Year 2023
We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...

Full text available for download in the portal
Jerzy Proficz dr hab. inż.

People

Informatics Centre of the Tricity Academic Computer Network (CI TASK), Department of Computer Architecture

Jerzy Proficz is the director of the Informatics Centre of the Tricity Academic Computer Network (CI TASK) at Gdańsk University of Technology. He obtained the degree of doktor habilitowany (2022) in the discipline of technical informatics and telecommunications. He is the author or co-author of over 50 papers in journals and at scientific conferences, mainly related to parallel data processing on high-performance computers (HPC, cloud computing). Participation...
Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms
Publication
- G. Szwoch
- Year 2014
Implementation of the background subtraction algorithm using OpenCL platform is presented. The algorithm processes live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...

Full text available for download from an external service
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster. Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym
Publication
- K. Łopatka
- A. Czyżewski
- Year 2015
A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
- S. Cygert
- J. Porter-Sobieraj
- D. Kikoła
- J. Sikorski
- M. Słodkowski
- Year 2013
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

Full text available for download from an external service
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
Publication
- P. Czarnul
- Electronics - Year 2021
The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

Full text available for download in the portal
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication
- Year 2015
We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

Full text available for download from an external service
Performance/energy aware optimization of parallel applications on GPUs under power capping
Publication
- A. Krzywaniak
- P. Czarnul
- Year 2020
In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark, which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

Full text available for download in the portal
Development and tuning of irregular divide-and-conquer applications in DAMPVM/DAC
Publication
- P. Czarnul
- Year 2002
This work presents implementations and tuning experiences with parallel irregular applications developed using the object oriented framework DAMPVM/DAC. It is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime and dynamic mapping to processors taking into account their speeds and even loads by other user processes. New implementations of parallel applications...

Full text available for download from an external service
Mechanism of recognition of parallel G-quadruplexes by DEAH/RHAU helicase DHX36 explored by molecular dynamics simulations
Publication
- K. A. Hossain
- M. Jurkowski
- J. Czub
- M. Kogut
- Computational and Structural Biotechnology Journal - Year 2021
Because of high stability and slow unfolding rates of G-quadruplexes (G4), cells have evolved specialized helicases that disrupt these non-canonical DNA and RNA structures in an ATP-dependent manner. One example is DHX36, a DEAH-box helicase, which participates in gene expression and replication by recognizing and unwinding parallel G4s. Here, we studied the molecular basis for the high affinity and specificity of DHX36 for parallel-type...

Full text available for download in the portal
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Publication
- P. Rościszewski
- International Journal of Computer Networks & Communications (IJCNC) - Year 2014
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive makes it possible to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available for download from an external service
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication
- A. Malinowski
- P. Czarnul
- P. Dorożyński
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Annals of Computer Science and Information Systems - Year 2016
While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

Full text available for download in the portal
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication
- P. Falkowski-Gilski
- M. Plewka
- Year 2022
Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text available for download from an external service
Use of ICT infrastructure for teaching HPC
Publication
- P. Czarnul
- M. Matuszek
- Year 2019
In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

Full text available for download from an external service
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text available for download from an external service
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication
- A. Malinowski
- P. Czarnul
- Procedia Computer Science - Year 2018
The paper presents architecture and design of three versions for fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...

Full text available for download in the portal
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication
- Year 2014
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems that we wish to efficiently solve using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
Kamil Andrzej Rybacki mgr inż.

People

Large Data Sets Department

Born on 23 October 1993 in Gdańsk. In 2017, I received the M.Sc. degree at the Faculty of Applied Physics and Mathematics, Gdańsk University of Technology, Poland. My main fields of interest include computer simulations of molecular systems, parallel computing applied to computational physics methods, and development of various simulation software. Currently, my research is focused on the development of hybrid Molecular...
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
Publication
- M. Szczodrak
- A. Czyżewski
- Year 2013
A concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages is presented. An example of such a situation is a crush which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed on line is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm which uses...

Full text available for download from an external service
Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors
Publication
- P. Czarnul
- K. Jabłońska
- International Journal of Computer Information Systems and Industrial Management Applications - Year 2020
In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...

Full text available for download from an external service
Tuning matrix-vector multiplication on GPU
Publication
- A. Dziekoński
- M. Mrozowski
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), the generalized residual method (gmres) and exerts an influence on overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
Parallel computations in the volunteer based Comcute system
Publication
- Year 2014
The paper presents Comcute which is a novel multi-level implementation of the volunteer based computing paradigm. Comcute was designed to let users donate the computing power of their PCs in a simplified manner, requiring only pointing their web browser at a specific web address and clicking a mouse. The server side appoints several servers to be in charge of execution of particular tasks. Thanks to that the system can survive...

Full text available for download from an external service
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
- A. Malinowski
- P. Czarnul
- Year 2017
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

Full text available for download from an external service
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
Publication
- Year 2011
A concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. The optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...

Full text available for download from an external service
The parallel environment for endoscopic image analysis
Publication
- H. Krawczyk
- A. Neyman
- M. Nowikowski
- J. Saif
- Year 2002
The jPVM-oriented environment to support high performance computing required for the Endoscopy Recommender System (ERS) is defined. SPMD model of image matching is considered and its two implementations are proposed: Lexicographical Searching Algorithm (LSA) and Gradient Searching Algorithm (GSA). Three classes of experiments are considered and the relative degree of similarity and execution time of each algorithm are analysed....

Full text available for download from an external service
KernelHive: a new workflow-based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs
Publication
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE - Year 2016
The paper presents a new open-source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available....

Full text available for download from an external service
Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card
Publication
- T. Topa
- A. Noga
- T. Stefański
- Year 2023
In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...

Full text available for download from an external service
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication
- J. Proficz
- P. Czarnul
- Year 2016
The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...

Full text available for download in the portal
Genetic Positioning of Fire Stations Utilizing Grid-computing Platform
Publication
- Year 2012
The chapter presents a model for determining near-optimal locations of fire stations based on the topography of a given area and the location of forests, rivers, lakes and other elements of the site. The model is based on principles of genetic algorithms and utilizes the power of the grid to distribute and execute in parallel the most performance-demanding computations involved in the algorithm.
Simulation of Parallel Applications on Large-scale Distributed Systems
Publication
- P. Rościszewski
- P. Sidorczak
- Year 2014
This chapter takes the form of a review article in the field of simulating High-Performance Computing systems. We justify the need for a new versatile simulator considering heterogeneity, energy efficiency and reliability of HPC systems. We sketch the problems that need to be solved by such a simulator and rationalize using discrete-event simulation for this purpose. Based on a review of existing discrete-event HPC simulation solutions...
Online sound restoration system for digital library applications
Publication
- Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Full text available for download from an external service
Molecular dynamics simulations reveal the balance of forces governing the formation of a guanine tetrad—a common structural unit of G-quadruplex DNA
Publication
- NUCLEIC ACIDS RESEARCH - Year 2016
G-quadruplexes (G4) are nucleic acid conformations of guanine-rich sequences, in which guanines are arranged in the square-planar G-tetrads, stacked on one another. G4 motifs form in vivo and are implicated in regulation of such processes as gene expression and chromosome maintenance. The structure and stability of various G4 topologies were determined experimentally; however, the driving forces for their formation are not fully...

Full text available for download in the portal
An MOR Algorithm Based on the Immittance Zero and Pole Eigenvectors for Fast FEM Simulations of Two-Port Microwave Structures
Publication
- G. Fotyga
- D. Szypulski
- A. Lamęcki
- P. Sypek
- M. Rewieński
- V. de la Rubia
- M. Mrozowski
- IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES - Year 2022
The aim of this article is to present a novel model-order reduction (MOR) algorithm for fast finite-element frequency-domain simulations of microwave two-port structures. The projection basis used to construct the reduced-order model (ROM) comprises two sets: singular vectors and regular vectors. The first set is composed of the eigenvectors associated with the poles of the finite-element method (FEM) state-space system, while...

Full text available for download from an external service
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication
- Year 2014
New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

Full text available for download from an external service
Online sound restoration system for digital library applications.
Publication
- Journal of the Acoustical Society of America - Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
Wpływ kontekstu na efektywność wykonania interaktywnych aplikacji iteracyjnych w dedykowanej przestrzeni usług [The influence of context on the execution efficiency of interactive iterative applications in a dedicated service space]
Publication
- S. Nasiadka
- Year 2013
The dissertation concerns context-aware applications executed in a real-time *pervasive computing* environment. This environment is called a smart space, and the applications executed in it are referred to as Interactive Iterative Applications (IAI). An IAI continuously analyzes the situations (expressed by context) occurring in the space and takes specific actions depending on the current context...
From Sequential to Parallel Implementation of NLP Using the Actor Model
Publication
- Advances in Intelligent Systems and Computing - Year 2018
The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

Full text available for download in the portal
Sensitivity of the Baltic Sea level prediction to spatial model resolution
Publication
- M. Kowalewski
- Year 2017
The three-dimensional hydrodynamic model of the Baltic Sea (M3D) and...

Full text available for download from an external service
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication
- A. Malinowski
- P. Czarnul
- Scalable Computing: Practice and Experience - Year 2018
The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available for download in the portal
Investigation of Parallel Data Processing Using Hybrid High Performance CPU + GPU Systems and CUDA Streams
Publication
- P. Czarnul
- COMPUTING AND INFORMATICS - Year 2020
The paper investigates parallel data processing in a hybrid CPU+GPU(s) system using multiple CUDA streams for overlapping communication and computations. This is crucial for efficient processing of data, in particular incoming data stream processing that would naturally be forwarded using multiple CUDA streams to GPUs. Performance is evaluated for various compute time to host-device communication time ratios, numbers of CUDA streams,...

Full text available for download in the portal
DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
Publication
- SOFTWARE-PRACTICE & EXPERIENCE - Year 2022
In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms—linear search (LS) and golden section search (GSS), finds...

Full text available for download from an external service
DATABASE AND BIGDATA PROCESSING SYSTEM FOR ANALYSIS OF AIS MESSAGES IN THE NETBALTIC RESEARCH PROJECT
Publication
- M. Lewczuk
- P. Cichocki
- J. Woźniak
- TASK Quarterly - Year 2017
A specialized database and a software tool for graphical and numerical presentation of maritime measurement results have been designed and implemented as part of the research conducted under the netBaltic project (Internet over the Baltic Sea – the implementation of a multi-system, self-organizing broadband communications network over the sea for enhancing navigation safety through the development of e-navigation services). The...

Full text available for download in the portal
Processing of Satellite Data in the Cloud
Publication
- J. Proficz
- K. Drypczewski
- TASK Quarterly - Year 2017
The dynamic development of digital technologies, especially those dedicated to devices generating large data streams, such as all kinds of measurement equipment (temperature and humidity sensors, cameras, radio-telescopes and satellites – Internet of Things) enables more in-depth analysis of the surrounding reality, including better understanding of various natural phenomena, starting from atomic level reactions, through macroscopic...

Full text available for download in the portal
Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors
Publication
- P. Czarnul
- INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING - Year 2016
The paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...

Full text available for download in the portal
Parallelization of Selected Algorithms on Multi-core CPUs, a Cluster and in a Hybrid CPU+Xeon Phi Environment
Publication
- A. Krzywaniak
- P. Czarnul
- Advances in Intelligent Systems and Computing - Year 2017
In the paper we present parallel implementations as well as execution times and speed-ups of three different algorithms run in various environments such as on a workstation with multi-core CPUs and a cluster. The parallel codes, implementing the master-slave model in C+MPI, differ in computation to communication ratios. The considered problems include: a genetic algorithm with various ratios of master processing time to communication...

Full text available for download in the portal
Theory and implementation of a virtualisation level Future Internet defence in depth architecture
Publication
- J. Konorski
- P. Pacyna
- G. Kolaczek
- Z. Kotulski
- K. Cabaj
- P. Szalachowski
- International Journal of Trust Management in Computing and Communications - Year 2013
An EU Future Internet Engineering project currently underway in Poland defines three parallel internets (PIs). The emerging IIP system (IIPS, abbreviating the project’s Polish name), has a four-level architecture, with level 2 responsible for creation of virtual resources of the PIs. This paper proposes a three-tier security architecture to address level 2 threats of unauthorised traffic injection and IIPS traffic manipulation...

Full text available for download from an external service
A Task-Scheduling Approach for Efficient Sparse Symmetric Matrix-Vector Multiplication on a GPU
Publication
- SIAM JOURNAL ON SCIENTIFIC COMPUTING - Year 2015
In this paper, a task-scheduling approach to efficiently calculating sparse symmetric matrix-vector products and designed to run on Graphics Processing Units (GPUs) is presented. The main premise is that, for many sparse symmetric matrices occurring in common applications, it is possible to obtain significant reductions in memory usage and improvements in performance when the matrix is prepared in certain ways prior to computation....

Full text available for download from an external service
Modelling and simulation of GPU processing in the MERPSYS environment
Publication
- T. Gajger
- P. Czarnul
- Scalable Computing: Practice and Experience - Year 2018
In this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...

Full text available for download in the portal
Process arrival pattern aware algorithms for acceleration of scatter and gather operations
Publication
- J. Proficz
- Cluster Computing-The Journal of Networks Software Tools and Applications - Year 2020
Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...

Full text available for download in the portal
Justyna Zander dr inż.

People
Computer controlled systems - 2022/2023
Online courses
- P. Raczyński
Supporting materials for the second-cycle (master's level) lecture "komputerowe systemy automatyki" (computer controlled systems) in the ACR programme. 1. Computer system to controlled plant interfacing techniques: simple interfacing and interfacing with two-sided acknowledgement; ideas, algorithms, acknowledgement passing. 2. Methods of acknowledgement passing: software checking and passing, interrupt techniques, readiness checking (ready/wait lines). The best solution...
CCS-lecture-2023-2024
Online courses
- P. Raczyński
Supporting materials for the second-cycle (master's level) lecture "komputerowe systemy automatyki" (computer controlled systems) in the ACR programme. 1. Computer system to controlled plant interfacing techniques: simple interfacing and interfacing with two-sided acknowledgement; ideas, algorithms, acknowledgement passing. 2. Methods of acknowledgement passing: software checking and passing, interrupt techniques, readiness checking (ready/wait lines). The best solution optimization...
Piotr Sypek dr inż.

People

Katedra Inżynierii Mikrofalowej i Antenowej (Department of Microwave and Antenna Engineering)

Piotr Sypek received the M.Sc. degree in engineering in 2003 and the Ph.D. degree in technical sciences (with distinction) in 2012, both from Politechnika Gdańska (Gdańsk University of Technology). He currently works in the Department of Microwave and Antenna Engineering at the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology. His research includes the design and implementation of parallel algorithms used for building and solving...
Tomasz Bieliński mgr inż.

People

Katedra Systemów Geoinformatycznych (Department of Geoinformatics)
Optimization of Data Assignment for Parallel Processing in a Hybrid Heterogeneous Environment Using Integer Linear Programming
Publication
- T. M. Boiński
- P. Czarnul
- COMPUTER JOURNAL - Year 2021
In the paper we investigate a practical approach to application of integer linear programming for optimization of data assignment to compute units in a multi-level heterogeneous environment with various compute devices, including CPUs, GPUs and Intel Xeon Phis. The model considers an application that processes a large number of data chunks in parallel on various compute units and takes into account computations, communication including...

Full text available for download in the portal
Evaluation of multimedia applications in a cluster oriented environment
Publication
- Metrology and Measurement Systems - Year 2012
In the age of Information and Communication Technology (ICT), Web and the Internet have changed significantly the way applications are developed, deployed and used. One of recent trends is modern design of web-applications based on SOA. This process is based on the composition of existing web services into a single scenario from the point of view of a particular user or client. This allows IT companies to shorten product time-to-market....

Full text available for download in the portal
Performance evaluation of parallel background subtraction on GPU platforms
Publication
- G. Szwoch
- Elektronika : konstrukcje, technologie, zastosowania - Year 2015
Implementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...

Full text available for download from an external service
GPU based implementation of Temperature-Vegetation Dryness Index for AVHRR3 Satellite Data
Publication
- T. Bieliński
- A. Chybicki
- Year 2014
The paper presents an implementation of the TVDI (Temperature-Vegetation Dryness Index) algorithm on a GPU (Graphics Processing Unit). Calculation of this index is based on LST (Land Surface Temperature) and NDVI (Normalized Difference Vegetation Index). The discussed results are based on multi-spectral imagery retrieved from AVHRR3 sensors for the area of Poland. All phases of the TVDI implementation on the GPU are modified with respect to the CUDA platform....
Optimization of Execution Time under Power Consumption Constraints in a Heterogeneous Parallel System with GPUs and CPUs
Publication
- P. Czarnul
- P. Rościszewski
- Year 2014
The paper proposes an approach for parallelization of computations across a collection of clusters with heterogeneous nodes with both GPUs and CPUs. The proposed system partitions input data into chunks and assigns them to particular devices for processing using OpenCL kernels defined by the user. The system is able to minimize the execution time of the application while maintaining the power consumption of the utilized GPUs and...

Full text available for download from an external service
Runtime Visualization of Application Progress and Monitoring of a GPU-enabled Parallel Environment
Publication
- Year 2014
The paper presents design, implementation and real life uses of a visualization subsystem for a distributed framework for parallelization of workflow-based computations among clusters with nodes that feature both CPUs and GPUs. Firstly, the proposed system presents a graphical view of the infrastructure with clusters, nodes and compute devices along with parameters and runtime graphs of load, memory available, fan speeds etc. Secondly,...

Full text available for download from an external service
Communication and Load Balancing Optimization for Finite Element Electromagnetic Simulations Using Multi-GPU Workstation
Publication
- IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES - Year 2017
This paper considers a method for accelerating finite-element simulations of electromagnetic problems on a workstation using graphics processing units (GPUs). The focus is on finite-element formulations using higher order elements and tetrahedral meshes that lead to sparse matrices too large to be dealt with on a typical workstation using direct methods. We discuss the problem of rapid matrix generation and assembly, as well as...

Full text available for download from an external service
GPU-Accelerated LOBPCG Method with Inexact Null-Space Filtering for Solving Generalized Eigenvalue Problems in Computational Electromagnetics Analysis with Higher-Order FEM
Publication
- Communications in Computational Physics - Year 2017
This paper presents a GPU-accelerated implementation of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method with an inexact null-space filtering approach to find eigenvalues in electromagnetics analysis with higher-order FEM. The performance of the proposed approach is verified using the Kepler (Tesla K40c) graphics accelerator, and is compared to the performance of the implementation based on functions from...

Full text available for download from an external service
GPU-accelerated finite element method
Publication
- Year 2016
In this paper the results of the acceleration of computations involved in analysing electromagnetic problems by means of the finite element method (FEM), obtained with graphics processors (GPU), are presented. A 4.7-fold acceleration was achieved thanks to the massive parallelization of the most time-consuming steps of FEM, namely finite-element matrix-generation and the solution of a sparse system of linear equations with the...

Full text available for download from an external service
Modeling energy consumption of parallel applications
Publication
- Annals of Computer Science and Information Systems - Year 2016
The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. Model of an application uses the Java language with extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...

Full text available for download in the portal
Piotr Kopa Ostrowski

People

Paweł Czarnul dr hab. inż.

Jerzy Konorski dr hab. inż.

Paweł Rościszewski dr inż.

Jerzy Proficz dr hab. inż.

Kamil Andrzej Rybacki mgr inż.

Justyna Zander dr inż.

Piotr Sypek dr inż.

Tomasz Bieliński mgr inż.