Search results for: massively parallel computing

total: 1499

displaying 1000 best results

Fixed Pattern Noise Reduction and Linearity Improvement in Time-Mode CMOS Image Sensors
Publication
- M. Kłosowski
- Y. Sun
- SENSORS - Year 2020
In the paper, a digital clock stopping technique for gain and offset correction in time-mode analog-to-digital converters (ADCs) has been proposed. The technique is dedicated to imagers with massively parallel image acquisition working in the time mode where compensation of dark signal non-uniformity (DSNU) as well as photo-response non-uniformity (PRNU) is critical. Fixed pattern noise (FPN) reduction has been experimentally validated...

Full text available to download
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
- S. Cygert
- J. Porter-Sobieraj
- D. Kikoła
- J. Sikorski
- M. Słodkowski
- Year 2013
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

Full text to download in external service
Mechanism of recognition of parallel G-quadruplexes by DEAH/RHAU helicase DHX36 explored by molecular dynamics simulations
Publication
- K. A. Hossain
- M. Jurkowski
- J. Czub
- M. Kogut
- Computational and Structural Biotechnology Journal - Year 2021
Because of high stability and slow unfolding rates of G-quadruplexes (G4), cells have evolved specialized helicases that disrupt these non-canonical DNA and RNA structures in an ATP-dependent manner. One example is DHX36, a DEAH-box helicase, which participates in gene expression and replication by recognizing and unwinding parallel G4s. Here, we studied the molecular basis for the high affinity and specificity of DHX36 for parallel-type...

Full text available to download
Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications
Publication
- P. Czarnul
- Electronics - Year 2021
The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

Full text available to download
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication
- Year 2015
We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

Full text to download in external service
A Power-Efficient Digital Technique for Gain and Offset Correction in Slope ADCs
Publication
- M. Kłosowski
- IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS - Year 2020
In this brief, a power-efficient digital technique for gain and offset correction in slope analog-to-digital converters (ADCs) has been proposed. The technique is especially useful for imaging arrays with massively parallel image acquisition where simultaneous compensation of dark signal non-uniformity (DSNU) as well as photo-response non-uniformity (PRNU) is critical. The presented approach is based on stopping the ADC clock by...

Full text available to download
Development and tuning of irregular divide-and-conquer applications in DAMPVM/DAC
Publication
- P. Czarnul
- Year 2002
This work presents implementations and tuning experiences with parallel irregular applications developed using the object-oriented framework DAMPVM/DAC. It is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime and dynamic mapping to processors taking into account their speeds and even loads by other user processes. New implementations of parallel applications...

Full text to download in external service
Performance/energy aware optimization of parallel applications on GPUs under power capping
Publication
- A. Krzywaniak
- P. Czarnul
- Year 2020
In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark, which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

Full text available to download
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication
- P. Falkowski-Gilski
- M. Plewka
- Year 2022
Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text to download in external service
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication
- A. Malinowski
- P. Czarnul
- P. Dorożyński
- K. Czuryło
- Ł. Dorau
- M. Maciejewski
- P. Skowron
- Annals of Computer Science and Information Systems - Year 2016
While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

Full text available to download
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text to download in external service
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication
- A. Malinowski
- P. Czarnul
- Procedia Computer Science - Year 2018
The paper presents the architecture and design of three versions of fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering of write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...

Full text available to download
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Publication
- P. Rościszewski
- International Journal of Computer Networks & Communications (IJCNC) - Year 2014
Because of the rapid development of diverse computer architectures and hardware accelerators, designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive makes it possible to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available to download
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
Publication
- M. Szczodrak
- A. Czyżewski
- Year 2013
A concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages is presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed online is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of implemented algorithm which uses...

Full text to download in external service
Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors
Publication
- P. Czarnul
- K. Jabłońska
- International Journal of Computer Information Systems and Industrial Management Applications - Year 2020
In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...

Full text to download in external service
Surface diffusion and cluster formation of gold on the silicon (111)
Publication
- W. Pleczysty
- I. Shtablavyi
- K. A. Rybacki
- S. Winczewski
- S. Mudry
- J. Rybicki
- Journal of Achievements in Materials and Manufacturing Engineering - Year 2020
Purpose: Investigation of the gold atoms behaviour on the surface of silicon by molecular dynamics simulation method. The studies were performed for the case of one, two and four atoms, as well as incomplete and complete filling of gold atoms on the silicon surface. Design/methodology/approach: Investigations were performed by the method of molecular dynamics simulation using the Large-scale Atomic/Molecular Massively Parallel...

Full text available to download
Use of ICT infrastructure for teaching HPC
Publication
- P. Czarnul
- M. Matuszek
- Year 2019
In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

Full text to download in external service
Tuning matrix-vector multiplication on GPU
Publication
- A. Dziekoński
- M. Mrozowski
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), the generalized residual method (gmres) and exerts an influence on overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication
- Year 2014
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems that we wish to efficiently solve using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
Publication
- Year 2011
A concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. Optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...

Full text to download in external service
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
- A. Malinowski
- P. Czarnul
- Year 2017
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

Full text to download in external service
Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card
Publication
- T. Topa
- A. Noga
- T. Stefański
- Year 2023
In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...

Full text to download in external service
Online sound restoration system for digital library applications
Publication
- Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Full text to download in external service
Molecular dynamics simulations reveal the balance of forces governing the formation of a guanine tetrad—a common structural unit of G-quadruplex DNA
Publication
- NUCLEIC ACIDS RESEARCH - Year 2016
G-quadruplexes (G4) are nucleic acid conformations of guanine-rich sequences, in which guanines are arranged in the square-planar G-tetrads, stacked on one another. G4 motifs form in vivo and are implicated in regulation of such processes as gene expression and chromosome maintenance. The structure and stability of various G4 topologies were determined experimentally; however, the driving forces for their formation are not fully...

Full text available to download
An MOR Algorithm Based on the Immittance Zero and Pole Eigenvectors for Fast FEM Simulations of Two-Port Microwave Structures
Publication
- G. Fotyga
- D. Szypulski
- A. Lamęcki
- P. Sypek
- M. Rewieński
- V. de la Rubia
- M. Mrozowski
- IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES - Year 2022
The aim of this article is to present a novel model-order reduction (MOR) algorithm for fast finite-element frequency-domain simulations of microwave two-port structures. The projection basis used to construct the reduced-order model (ROM) comprises two sets: singular vectors and regular vectors. The first set is composed of the eigenvectors associated with the poles of the finite-element method (FEM) state-space system, while...

Full text available to download
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication
- Year 2014
New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

Full text to download in external service
Impact of context on the execution efficiency of interactive iterative applications in a dedicated service space
Publication
- S. Nasiadka
- Year 2013
The dissertation concerns context-aware applications executed in a real-time environment of the *pervasive computing* type. This environment is called a smart space, and the applications executed in it are referred to as Interactive Iterative Applications (IAI). An IAI continuously analyzes the situations (expressed by context) occurring in the space and takes specific actions depending on the current context. It comprises...
Online sound restoration system for digital library applications.
Publication
- Journal of the Acoustical Society of America - Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
Sensitivity of the Baltic Sea level prediction to spatial model resolution
Publication
- M. Kowalewski
- Year 2017
The three-dimensional hydrodynamic model of the Baltic Sea (M3D) and...

Full text to download in external service
Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream
Publication
- G. Szwoch
- Communications in Computer and Information Science - Year 2011
A background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....

Full text to download in external service
Affective Computing
e-Learning Courses
- G. Brodny
- A. Kołakowska
- M. Sowiński
- M. Wróbel
- A. Landowska
Agnieszka Landowska dr hab. inż.

People

Department of Software Engineering

Agnieszka Landowska works for Gdansk University of Technology, FETI, Department of Software Engineering. Her research concentrates on usability, accessibility and technology adoption, as well as affective computing methods. She initiated the Emotions in HCI Research Group and conducts research on User eXperience evaluation of applications and other technologies.
Advances in Intelligent Systems and Computing

Journals

ISSN: 2194-5357
DATABASE AND BIGDATA PROCESSING SYSTEM FOR ANALYSIS OF AIS MESSAGES IN THE NETBALTIC RESEARCH PROJECT
Publication
- M. Lewczuk
- P. Cichocki
- J. Woźniak
- TASK Quarterly - Year 2017
A specialized database and a software tool for graphical and numerical presentation of maritime measurement results has been designed and implemented as part of the research conducted under the netBaltic project (Internet over the Baltic Sea – the implementation of a multi-system, self-organizing broadband communications network over the sea for enhancing navigation safety through the development of e-navigation services.) The...

Full text available to download
Processing of Satellite Data in the Cloud
Publication
- J. Proficz
- K. Drypczewski
- TASK Quarterly - Year 2017
The dynamic development of digital technologies, especially those dedicated to devices generating large data streams, such as all kinds of measurement equipment (temperature and humidity sensors, cameras, radio-telescopes and satellites – Internet of Things) enables more in-depth analysis of the surrounding reality, including better understanding of various natural phenomena, starting from atomic level reactions, through macroscopic...

Full text available to download
Considerations of Computational Efficiency in Volunteer and Cluster Computing
Publication
- P. Czarnul
- M. Matuszek
- Year 2016
In the paper we focus on analysis of performance and power consumption statistics for two modern environments used for computing – volunteer and cluster based systems. The former integrate computational power donated by volunteers from their own locations, often towards social oriented or targeted initiatives, be it of medical, mathematical or space nature. The latter is meant for high performance computing and is typically installed...

Full text to download in external service
Optimization issues in distributed computing systems design
Publication
- K. Walkowiak
- J. Rak
- Year 2014
In recent years, we have observed a growing interest in distributed computing systems. Both industry and academia require increasing computational power to process and analyze large amounts of data, including significant areas like analysis of medical data, earthquake, or weather forecast. Since distributed computing systems – similar to computer networks – are vulnerable to failures, survivability mechanisms are indispensable...
Crowdsourcing and Volunteer Computing as Distributed Approach for Problem Solving
Publication
- Year 2014
In this paper, a combination of volunteer computing and crowdsourcing is presented. Two paradigms of web computing are described, analyzed and compared in detail: grid computing and volunteer computing. Characteristics of BOINC and its contribution to global Internet processing are shown, with emphasis on the applications the system can facilitate and the problems it can solve. An alternative instance of a grid computing...

Full text to download in external service
Quality Modeling in Grid and Volunteer-Computing Systems
Publication
- J. Kuchta
- Year 2013
A model of computational quality in large-scale computing systems was presented in the previous chapter of this book. This model describes three quality attributes: performance, reliability and energy efficiency. We assumed that all processes in the system are incessantly ready to perform calculations and that communication between the processes occurs immediately. These assumptions are not true for grid and volunteer computing...
Modeling Parallel Applications in the MERPSYS Environment
Publication
- P. Czarnul
- Year 2016
The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....
Long Distance Geographically Distributed InfiniBand Based Computing
Publication
- K. Niedzielewski
- M. Semeniuk
- J. Skomiał
- J. Proficz
- P. Sumionka
- B. Pliszka
- M. Michalewicz
- Supercomputing Frontiers and Innovations - Year 2020
Collaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of future collaboration using a 100 Gb optic fiber link (the connection was 900 km in length with a 9 ms RTT), we prepared two scenarios of operation. In the first one, Interdisciplinary Centre for Mathematical...

Full text available to download
On Computing Curlicues Generated by Circle Homeomorphisms
Publication
- J. Signerska-Rynkowska
- Year 2022
The dataset entitled Computing dynamical curlicues contains values of consecutive points on a curlicue generated, respectively, by rotation on the circle by different angles, the Arnold circle map (with various parameter values) and an exemplary sequence as well as corresponding diameters and Birkhoff averages of these curves. We additionally provide source codes of the Matlab programs which can be used to generate and plot the...

Full text available to download
Metaheuristic algorithms for optimization of resilient overlay computing systems
Publication
- K. Walkowiak
- W. Charewicz
- M. Donajski
- J. Rak
- Logic journal of the IGPL - Year 2014
The idea of distributed computing systems has been gaining much interest in recent years owing to the growing amount of data to be processed for both industrial and academic purposes. However, like other systems, distributed computing systems are also vulnerable to failures. Due to strict QoS requirements, survivability guarantees are necessary for provisioning of uninterrupted service. In this article, we focus on reliability...

Full text to download in external service
From Sequential to Parallel Implementation of NLP Using the Actor Model
Publication
- Advances in Intelligent Systems and Computing - Year 2018
The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

Full text available to download
Berkeley Open Infrastructure for Network Computing
Publication
- P. Brudło
- Year 2012
The BOINC system (Berkeley Open Infrastructure for Network Computing) is presented as an interesting solution that integrates the distributed computing power of personal computers (PCs) on the Internet. The operating principle of the platform is described. Several selected scientific projects using BOINC are then presented that are representative of the system's application within the assumed paradigm...

Full text to download in external service
Edge-Computing based Secure E-learning Platforms
Publication
- S. A. Bhat
- D. Alyahya
- M. A. Dar
- S. Shah
- Year 2022
Implementation of Information and Communication Technologies (ICT) in E-Learning environments has brought about dramatic changes in the current educational sector. Distance learning, online learning, and networked learning are a few examples that promote educational interaction between students, lecturers and learning communities. Although being an efficient form of real learning resource, online electronic resources are subject to...

Full text available to download
Complementary oriented allocation algorithm for cloud computing
Publication
- P. Orzechowski
- TASK Quarterly - Year 2017
Nowadays cloud computing is one of the most popular processing models. More and more different kinds of workloads have been migrated to clouds. This trend obliges the community to design algorithms which could optimize the usage of cloud resources and be more efficient and effective. The paper proposes a new model of workload allocation based on the complementarity relation and analyzes it. An example of a use case is shown...

Full text available to download
Massively bleeding trauma patient: intervene but not too late
Publication
- T. Czarnik
- R. Gawda
- Minerva Anestesiologica - Year 2022
Full text to download in external service
Affective computing and affective learning – methods, tools and prospects
Publication
- A. Landowska
- EduAkcja. Magazyn Edukacji Elektronicznej - Year 2013
Every teacher knows that interest, active participation and motivation are important factors in the learning process. At the same time e-learning environments almost always address only the cognitive aspects of education. This paper provides a brief review of methods used for affect recognition, representation and processing as well as investigates how these methods may be used to address affective aspect of e-education. The paper...

Full text available to download
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication
- A. Malinowski
- P. Czarnul
- Scalable Computing: Practice and Experience - Year 2018
The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

Full text available to download
