Search results for: FPGAS, MULTIPLE-PRECISION ARITHMETIC, SCIENTIFIC COMPUTING, PARALLEL PROCESSING, EMBEDDED SYSTEMS
-
Tuning matrix-vector multiplication on GPU
PublicationA matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), the generalized minimal residual method (gmres), and it influences the overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
-
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
PublicationAuto-tuning of configuration and application parameters makes it possible to achieve significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
PublicationPerformance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Network-assisted processing of advanced IoT applications: challenges and proof-of-concept application
PublicationRecent advances in the area of the Internet of Things show that devices are usually resource-constrained. To enable advanced applications on these devices, it is necessary to enhance their performance by leveraging external computing resources available in the network. This work presents a study of computational platforms to increase the performance of these devices based on the Mobile Cloud Computing (MCC) paradigm. The main...
-
Parallel Background Subtraction in Video Streams Using OpenCL on GPU Platforms
PublicationImplementation of the background subtraction algorithm using the OpenCL platform is presented. The algorithm processes a live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access,...
-
Online sound restoration system for digital library applications
PublicationAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Analytical Expression for the Time-Domain Green's Function of a Discrete Plane Wave Propagating in the 3-D FDTD Grid
PublicationIn this paper, a closed-form expression for the time-domain dyadic Green’s function of a discrete plane wave (DPW) propagating in a 3-D finite-difference time-domain (FDTD) grid is derived. In order to verify our findings, the time-domain implementation of the DPW-injection technique is developed with the use of the derived expression for 3-D total-field/scattered-field (TFSF) FDTD simulations. This implementation requires computations...
-
Analysis of radiation and scattering problems with the use of hybrid techniques based on the discrete Green's function formulation of the FDTD method
PublicationIn this contribution, simulation scenarios are presented which take advantage of the hybrid techniques based on the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method. DGF-FDTD solutions are compatible with the finite-difference grid and can be applied for perfect hybridization of the FDTD method. The following techniques are considered: (i) DGF-FDTD for antenna simulations, (ii) DGF-based...
-
Analytical Expression for the Time-Domain Discrete Green's Function of a Plane Wave Propagating in the 2-D FDTD Grid
PublicationIn this letter, a new closed-form expression for the time-domain discrete Green's function (DGF) of a plane wave propagating in the 2-D finite-difference time-domain (FDTD) grid is derived. For the sake of its verification, the time-domain implementation of the analytic field propagator (AFP) technique was developed for the plane wave injection in 2-D total-field/scattered-field (TFSF) FDTD simulations. Such an implementation of...
-
International Conference on Massively Parallel Computing Systems
Conferences
-
Image Processing Techniques for Distributed Grid Applications
PublicationParallel approaches to 2D and 3D convolution processing of series of images have been presented. A distributed, practically oriented, 2D spatial convolution scheme has been elaborated and extended into the temporal domain. Complexity of the scheme has been determined and analysed with respect to coefficients in convolution kernels. Possibilities of parallelisation of the convolution operations have been analysed and the results...
-
Developing Methods for Building Intelligent Systems of Information Resources Processing Using an Ontological Approach
PublicationThe problem of developing methods of information resource processing is investigated. A formal procedure description of processing text content is developed. A new ontological approach to the implementation of business processes is proposed. The aim of our work is to develop methods and tools for building intelligent systems of information resource processing, the core of knowledge bases of which are ontologies, and...
-
CNN-CLFFA: Support Mobile Edge Computing in Transportation Cyber Physical System
PublicationIn the present scenario, the transportation Cyber Physical System (CPS) improves the reliability and efficiency of the transportation systems by enhancing the interactions between the physical and cyber systems. With the provision of better storage ability and enhanced computing, cloud computing extends transportation CPS in Mobile Edge Computing (MEC). According to the existing literature, cloud computing cannot fulfill...
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
PublicationNew signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
Scaling of numbers in residue arithmetic with the flexible selection of scaling factor
PublicationA scaling technique of numbers in residue arithmetic with the flexible selection of the scaling factor is presented. The required scaling factor can be selected from the set of moduli products of the Residue Number System (RNS) base. By permutation of moduli of the number system base it is possible to create many auxiliary Mixed-Radix Systems associated with the given RNS with respect to the base, but they have different sets...
-
Grzegorz Szwoch dr hab. inż.
PeopleGrzegorz Szwoch was born in 1972 in Gdansk. In 1991-1996 he studied at the Technical University of Gdansk. In 1996 he graduated as a student from the Sound Engineering Department. His thesis was related to physical modeling of musical instruments. Since that time he has been a member of the research staff at the Multimedia Systems Department as a PhD student (1996-2001), Assistant (2001-2004), Assistant professor (2004-2020) and...
-
A memory efficient and fast sparse matrix vector product on a Gpu
PublicationThis paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats it has both low memory footprint and good throughput. The new format, which we call Sliced ELLR-T has been designed specifically for accelerating the iterative solution of a large sparse and complex-valued system of linear equations arising...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
PublicationResults of evaluation of the background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high resolution camera images, maintaining the cost of resource usage at a reasonable level. Two...
-
An facile Fortran-95 algorithm to simulate complex instabilities in three-dimensional hyperbolic systems
Open Research DataIt is well known that the simulation of fractional systems is a difficult task from all points of view. In particular, the computer implementation of numerical algorithms to simulate fractional systems of partial differential equations in three dimensions is a hard task which has not been solved satisfactorily. Here, we provide a Fortran-95 code to solve...
-
Graph Representation Integrating Signals for Emotion Recognition and Analysis
PublicationData reusability is an important feature of current research in just about every field of science. Modern research in Affective Computing often relies on datasets containing experiment-originated data such as biosignals, video clips, or images. Moreover, conducting experiments with a vast number of participants to build datasets for Affective Computing research is time-consuming and expensive. Therefore, it is extremely important to...
-
Thermal Image Processing for Respiratory Estimation from Cubical Data with Expandable Depth
PublicationAs healthcare costs continue to rise, finding affordable and non-invasive ways to monitor vital signs is increasingly important. One of the key metrics for assessing overall health and identifying potential issues early on is respiratory rate (RR). Most of the existing methods require multiple steps that consist of image and signal processing. This might be difficult to deploy on edge devices that often do not have specialized...
-
Performance evaluation of the parallel object tracking algorithm employing the particle filter
PublicationAn algorithm based on particle filters is employed to track moving objects in video streams from fixed and non-fixed cameras. Particle weighting is based on color histograms computed in the iHLS color space. Particle computations are parallelized with CUDA framework. The algorithm was tested on various GPU devices: a desktop GPU card, a mobile chipset and two embedded GPU platforms. The processing speed depending on the number...
-
Big Data Processing by Volunteer Computing Supported by Intelligent Agents
PublicationIn this paper, volunteer computing systems have been proposed for big data processing. Moreover, intelligent agents have been developed to improve the efficiency of a grid middleware layer. In consequence, an intelligent volunteer grid has been equipped with agents that belong to five sets. The first one consists of some user tasks. Furthermore, two kinds of semi-intelligent tasks have been introduced to implement a middleware...
-
Integration of cloud-based services into distributed workflow systems: challenges and solutions
PublicationThe paper introduces the challenges in modern workflow management in distributed environments spanning multiple cluster, grid and cloud systems. Recent developments in cloud computing infrastructures are presented, with reference to how clouds can be incorporated into distributed workflow management, aside from local and grid systems considered so far. Several challenges concerning workflow definition, optimisation and execution are...
-
Deep learning in the fog
PublicationIn the era of a ubiquitous Internet of Things and fast artificial intelligence advance, especially thanks to deep learning networks and hardware acceleration, we face rapid growth of highly decentralized and intelligent solutions that offer functionality of data processing closer to the end user. The Internet of Things usually produces a huge amount of data that, to be effectively analyzed, especially with neural networks, demands high...
-
Krylov Space Iterative Solvers on Graphics Processing Units
PublicationCUDA architecture was introduced by Nvidia three years ago and since then there have been many promising publications demonstrating a huge potential of Graphics Processing Units (GPUs) in scientific computations. In this paper, we investigate the performance of iterative methods such as cg, minres, gmres, bicg that may be used to solve large sparse real and complex systems of equations arising in computational electromagnetics.
-
Measurements of Spectral Spatial Distribution of Scattering Materials for Rear Projection Screens used in Virtual Reality Systems
PublicationRapid development of computing and visualisation systems has resulted in an unprecedented capability to display, in real time, realistic computer-generated worlds. Advanced techniques, including three-dimensional (3D) projection, supplemented by multi-channel surround sound, create immersive environments whose applications range from entertainment to military to scientific. Among the most advanced virtual reality systems are CAVE-type...
-
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
PublicationA concept and practical implementation of the algorithm for detecting potentially dangerous situations of crowding in passages is presented. An example of such a situation is a crush which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed on line is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm which uses...
-
An IoT-Based Computational Framework for Healthcare Monitoring in Mobile Environments
PublicationThe new Internet of Things paradigm allows for small devices with sensing, processing and communication capabilities to be designed, which enable the development of sensors, embedded devices and other ‘things’ ready to understand the environment. In this paper, a distributed framework based on the internet of things paradigm is proposed for monitoring human biomedical signals in activities involving physical exertion. The main...
-
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
PublicationA concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns the crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. The optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...
-
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
PublicationThe paper presents the architecture and design of three versions of fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering of write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...
-
Long Distance Geographically Distributed InfiniBand Based Computing
PublicationCollaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of future collaboration using a 100 Gb optic fiber link (the connection was 900 km in length with a 9 ms RTT) we prepared two scenarios of operation. In the first one, Interdisciplinary Centre for Mathematical...
-
Semantics for an Interdisciplinary Computation
PublicationSemantics for an interdisciplinary computation is becoming increasingly difficult to capture while dealing with multi-domain problems. Expertise from Computer Science, Computer Engineering, Electrical Engineering, and other disciplines merges as engineering challenges in modern systems, such as Cyber-Physical Systems, Smart Cities, and Bionic Systems, must be tackled in a methodological manner. In this paper, a paradigm for formalization...
-
Future research directions in design of reliable communication systems
PublicationIn this position paper on reliable networks, we discuss new trends in the design of reliable communication systems. We focus on a wide range of research directions including protection against software failures as well as failures of communication systems equipment. In particular, we outline future research trends in software failure mitigation, reliability of wireless communications, robust optimization and network design, multilevel...
-
Performance Assessment of Using Docker for Selected MPI Applications in a Parallel Environment Based on Commodity Hardware
PublicationIn the paper, we perform detailed performance analysis of three parallel MPI applications run in a parallel environment based on commodity hardware, using Docker and bare-metal configurations. The testbed applications are representative of the most typical parallel processing paradigms: master–slave, geometric Single Program Multiple Data (SPMD) as well as divide-and-conquer and feature characteristic computational and communication...
-
On the impact of Big Data and Cloud Computing on a scalable multimedia archiving system
PublicationMultimedia Archiver (MA) is a system built upon the promise and fascination of the possibilities emerging from cloud computing and big data. We aim to present and describe how the Multimedia Archiving system works for us to record, put in context and allow swift access to large amounts of data. We introduce the architecture, identified goals and needs taken into account while designing a system processing data with Big Data...
-
Software tool for modelling of mechatronic systems with elastic continua
PublicationThe paper presents a systematic computational package for modelling and analysis of complex systems composed of multiple lumped and distributed parameter subsystems. The constructed computer program enables frequency-domain analysis of a class of linear systems and yields a reduced-order model in the form of a bond graph. The obtained modal bond graph can be directly exported into the 20-Sim package for further processing including nonlinear...
-
Grzegorz Lentka dr hab. inż.
PeopleGrzegorz Lentka obtained his MSc title in electronics, specialization Measurement Systems, at Gdańsk University of Technology, Faculty of Electronics, Telecommunications and Informatics in 1996. He obtained the PhD title in 2003 and habilitation in 2014. Currently he is a professor in the Department of Metrology and Optoelectronics. His main scientific interests are focused on digital signal processing for metrology,...
-
Preferred Benchmarking Criteria for Systematic Taxonomy of Embedded Platforms (STEP) in Human System Interaction Systems
PublicationThe rate of progress in the field of Artificial Intelligence (AI) and Machine Learning (ML) has significantly increased over the past ten years and continues to accelerate. Since then, AI has made the leap from research case studies to real production ready applications. The significance of this growth cannot be undermined as it catalyzed the very nature of computing. Conventional platforms struggle to achieve greater performance...
-
VISUALIZATION OF SCANTER AND ARPA RADAR DATA IN THE DISTRIBUTED TELEINFORMATION SYSTEM FOR THE BORDER GUARD
PublicationMonitoring of the country's maritime border is an important task of the Border Guard. This activity can be enhanced with the use of technology enabling the gathering of information from distributed sources, processing of that information and its visualization. The paper presents the next stage of development of the STRADAR project (Streaming of real-time data transmission in distributed dispatching and teleinformation systems of the Border...
-
Michał Wróbel dr inż.
PeopleMichał Wróbel, Assistant Professor of Gdańsk University of Technology, computer scientist, a specialist in software engineering. I graduated from the Faculty of Electronics Technical University of Gdansk in 2002 with a degree in Computer Science, with specialization in Software Engineering and Databases. Until 2006 I worked as a system administrator in several companies, including CI TASK. Since 2006 I have been working at the Faculty...
-
Efficiency Evaluation of High Performance Computing Systems Using Data Envelopment Analysis
PublicationThe paper presents an evaluation method of high performance computing (HPC) systems using multicriteria efficiency analysis. The Data Envelopment Analysis approach was applied and adapted to the specifics of HPC, which enabled us to compare relative efficiency of systems considering simultaneously multiple parameters. The analysis is based on the TOP500 list of the world's largest supercomputers and their parameters such as: the number...
-
Modelling and simulation of GPU processing in the MERPSYS environment
PublicationIn this work, we evaluate an analytical GPU performance model based on Little's law, that expresses the kernel execution time in terms of latency bound, throughput bound, and achieved occupancy. We then combine it with the results of several research papers, introduce equations for data transfer time estimation, and finally incorporate it into the MERPSYS framework, which is a general-purpose simulator for parallel and distributed...
-
A multithreaded CUDA and OpenMP based power‐aware programming framework for multi‐node GPU systems
PublicationIn the paper, we have proposed a framework that allows programming a parallel application for a multi-node system, with one or more GPUs per node, using an OpenMP+extended CUDA API. OpenMP is used for launching threads responsible for management of particular GPUs, and extended CUDA calls allow managing CUDA objects and data and launching kernels. The framework hides inter-node MPI communication from the programmer who can benefit from...
-
Sensors and System for Vehicle Navigation
PublicationIn recent years, vehicle navigation, in particular autonomous navigation, has been at the center of several major developments, both in civilian and defense applications. New technologies, such as multisensory data fusion, big data processing, or deep learning, are changing the quality of areas of applications, improving the sensors and systems used. Recently, the influence of artificial intelligence on sensor data processing and...
-
Self-optimizing generalized adaptive notch filters - comparison of three optimization strategies
PublicationThe paper provides a comparison of three different approaches to on-line tuning of generalized adaptive notch filters (GANFs), the algorithms used for identification/tracking of quasi-periodically varying dynamic systems. Tuning is needed to adjust adaptation gains, which control tracking performance of ANF algorithms, to the unknown and/or time-varying rate of system nonstationarity. Two out of three compared approaches are classical...
-
Service-based Resilience via Shared Protection in Mission-critical Embedded Networks
PublicationMission-critical networks, which for example can be found in autonomous cars and avionics, are complex systems with a multitude of interconnected embedded nodes and various service demands. Their resilience against failures and attacks is a crucial property and has to be already considered in their design phase. In this paper, we introduce a novel approach for optimal joint service allocation and routing, leveraging virtualized...
-
A Comprehensive Analysis of Deep Neural-Based Cerebral Microbleeds Detection System
PublicationMachine learning-based systems are gaining interest in the field of medicine, mostly in medical imaging and diagnosis. In this paper, we address the problem of automatic cerebral microbleeds (CMB) detection in magnetic resonance images. It is challenging due to difficulty in distinguishing a true CMB from its mimics; however, if successfully solved, it would streamline the radiologists' work. To deal with this complex three-dimensional...
-
Modelling of Mechatronic Systems with Distributed Parameter Components
PublicationThe paper presents a uniform, port-based approach to modelling of both lumped and distributed parameter systems. A port-based model of a distributed system has been defined by application of the bond graph methodology and the distributed transfer function method (DTFM). The proposed method of modelling makes it possible to formulate input data for computer analysis by application of the DTFM. The computational package for the analysis of complex...