Search results for: massively parallel computing - MOST Wiedzy

  • Fixed Pattern Noise Reduction and Linearity Improvement in Time-Mode CMOS Image Sensors

    Publication

    - SENSORS - Year 2020

    In the paper, a digital clock stopping technique for gain and offset correction in time-mode analog-to-digital converters (ADCs) has been proposed. The technique is dedicated to imagers with massively parallel image acquisition working in the time mode where compensation of dark signal non-uniformity (DSNU) as well as photo-response non-uniformity (PRNU) is critical. Fixed pattern noise (FPN) reduction has been experimentally validated...

    Full text available for download in the portal

  • Towards an efficient multi-stage Riemann solver for nuclear physics simulations

    Publication
    • S. Cygert
    • J. Porter-Sobieraj
    • D. Kikoła
    • J. Sikorski
    • M. Słodkowski

    - Year 2013

    Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

    Full text available for download from an external service

  • Mechanism of recognition of parallel G-quadruplexes by DEAH/RHAU helicase DHX36 explored by molecular dynamics simulations

    Because of high stability and slow unfolding rates of G-quadruplexes (G4), cells have evolved specialized helicases that disrupt these non-canonical DNA and RNA structures in an ATP-dependent manner. One example is DHX36, a DEAH-box helicase, which participates in gene expression and replication by recognizing and unwinding parallel G4s. Here, we studied the molecular basis for the high affinity and specificity of DHX36 for parallel-type...

    Full text available for download in the portal

  • Assessment of OpenMP Master–Slave Implementations for Selected Irregular Parallel Applications

    Publication

    - Electronics - Year 2021

    The paper investigates various implementations of a master–slave paradigm using the popular OpenMP API and relative performance of the former using modern multi-core workstation CPUs. It is assumed that a master partitions available input into a batch of predefined number of data chunks which are then processed in parallel by a set of slaves and the procedure is repeated until all input data has been processed. The paper experimentally...

    Full text available for download in the portal

  • Acceleration of the DGF-FDTD method on GPU using the CUDA technology

    We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

    Full text available for download from an external service

  • Development and tuning of irregular divide-and-conquer applications in DAMPVM/DAC

    Publication

    - Year 2002

    This work presents implementations and tuning experiences with parallel irregular applications developed using the object-oriented framework DAMPVM/DAC. It is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime and dynamic mapping to processors taking into account their speeds and even loads by other user processes. New implementations of parallel applications...

    Full text available for download from an external service

  • A Power-Efficient Digital Technique for Gain and Offset Correction in Slope ADCs

    In this brief, a power-efficient digital technique for gain and offset correction in slope analog-to-digital converters (ADCs) has been proposed. The technique is especially useful for imaging arrays with massively parallel image acquisition where simultaneous compensation of dark signal non-uniformity (DSNU) as well as photo-response non-uniformity (PRNU) is critical. The presented approach is based on stopping the ADC clock by...

    Full text available for download in the portal

  • Performance/energy aware optimization of parallel applications on GPUs under power capping

    Publication

    In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark, which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

    Full text available for download in the portal

  • Performance Analysis of the OpenCL Environment on Mobile Platforms

    Publication

    Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

    Full text available for download from an external service

  • A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache

    Publication

    - Annals of Computer Science and Information Systems - Year 2016

    While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...

    Full text available for download in the portal

  • Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

    Publication

    In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

    Full text available for download from an external service

  • Three levels of fail-safe mode in MPI I/O NVRAM distributed cache

    The paper presents architecture and design of three versions for fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...

    Full text available for download in the portal

  • Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework

    Rapid development of diverse computer architectures and hardware accelerators means that the design of parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive makes it possible to efficiently run applications in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

    Full text available for download in the portal

  • Video Analytics-Based Algorithm for Monitoring Egress from Buildings

    Publication

    A concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages is presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Online analysis of the surveillance video camera signal is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm, which uses...

    Full text available for download from an external service

  • Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors

    In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...

    Full text available for download from an external service

  • Surface diffusion and cluster formation of gold on the silicon (111)

    Purpose: Investigation of the gold atoms behaviour on the surface of silicon by molecular dynamics simulation method. The studies were performed for the case of one, two and four atoms, as well as incomplete and complete filling of gold atoms on the silicon surface. Design/methodology/approach: Investigations were performed by the method of molecular dynamics simulation using the Large-scale Atomic/Molecular Massively Parallel...

    Full text available for download in the portal

  • Use of ICT infrastructure for teaching HPC

    Publication

    - Year 2019

    In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

    Full text available for download from an external service

  • Tuning matrix-vector multiplication on GPU

    A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), the generalized residual method (gmres) and exerts an influence on overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...

  • A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems

    Publication

    Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems that we wish to efficiently solve using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...

  • Behavior Analysis and Dynamic Crowd Management in Video Surveillance System

    A concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. Optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...

    Full text available for download from an external service

  • Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix

    Publication

    In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

    Full text available for download from an external service

  • Acceleration of Electromagnetic Simulations on Reconfigurable FPGA Card

    Publication

    - Year 2023

    In this contribution, the hardware acceleration of electromagnetic simulations on the reconfigurable field-programmable-gate-array (FPGA) card is presented. In the developed implementation of scientific computations, the matrix-assembly phase of the method of moments (MoM) is accelerated on the Xilinx Alveo U200 card. The computational method involves discretization of the frequency-domain mixed potential integral equation using...

    Full text available for download from an external service

  • Molecular dynamics simulations reveal the balance of forces governing the formation of a guanine tetrad—a common structural unit of G-quadruplex DNA

    G-quadruplexes (G4) are nucleic acid conformations of guanine-rich sequences, in which guanines are arranged in the square-planar G-tetrads, stacked on one another. G4 motifs form in vivo and are implicated in regulation of such processes as gene expression and chromosome maintenance. The structure and stability of various G4 topologies were determined experimentally; however, the driving forces for their formation are not fully...

    Full text available for download in the portal

  • Online sound restoration system for digital library applications

    Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

    Full text available for download from an external service

  • An MOR Algorithm Based on the Immittance Zero and Pole Eigenvectors for Fast FEM Simulations of Two-Port Microwave Structures

    The aim of this article is to present a novel model-order reduction (MOR) algorithm for fast finite-element frequency-domain simulations of microwave two-port structures. The projection basis used to construct the reduced-order model (ROM) comprises two sets: singular vectors and regular vectors. The first set is composed of the eigenvectors associated with the poles of the finite-element method (FEM) state-space system, while...

    Full text available for download in the portal

  • Further Developments of the Online Sound Restoration System for Digital Library Applications

    Publication

    New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Jannsen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

    Full text available for download from an external service

  • Online sound restoration system for digital library applications.

    Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

  • Wpływ kontekstu na efektywność wykonania interaktywnych aplikacji iteracyjnych w dedykowanej przestrzeni usług [Impact of context on the execution efficiency of interactive iterative applications in a dedicated service space]

    Publication

    - Year 2013

    The dissertation concerns context-aware applications executed in a real-time *pervasive computing* environment. This environment is called a smart space, and the applications executed in it are referred to as Interactive Iterative Applications (IAI). An IAI continuously analyzes the situations (expressed by context) occurring in the space and takes specific actions depending on the current context...

  • Sensitivity of the Baltic Sea level prediction to spatial model resolution

    Publication

    - Year 2017

    The three-dimensional hydrodynamic model of the Baltic Sea (M3D) and...

    Full text available for download from an external service

  • Performance Evaluation of the Parallel Codebook Algorithm for Background Subtraction in Video Stream

    A background subtraction algorithm based on the codebook approach was implemented on a multi-core processor in a parallel form, using the OpenMP system. The aim of the experiments was to evaluate performance of the multithreaded algorithm in processing video streams recorded from monitoring cameras, depending on a number of computer cores used, method of task scheduling, image resolution and degree of image content variability....

    Full text available for download from an external service

  • Affective Computing

    Online Courses
    • G. Brodny
    • A. Kołakowska
    • M. Sowiński
    • M. Wróbel
    • A. Landowska

  • Advances in Intelligent Systems and Computing

    Journals

    ISSN: 2194-5357

  • Agnieszka Landowska dr hab. inż.

    She graduated in two fields of study: Finance and Banking at the University of Gdańsk and Computer Science at the WETI faculty of Gdańsk University of Technology. She has been affiliated with Gdańsk University of Technology since 2000. In 2006 she received a doctoral degree in technical sciences, and in 2019 a habilitation degree (doktor habilitowany). Her current research concerns human-computer interaction and affective computing, which...

  • DATABASE AND BIGDATA PROCESSING SYSTEM FOR ANALYSIS OF AIS MESSAGES IN THE NETBALTIC RESEARCH PROJECT

    Publication

    - TASK Quarterly - Year 2017

    A specialized database and a software tool for graphical and numerical presentation of maritime measurement results has been designed and implemented as part of the research conducted under the netBaltic project (Internet over the Baltic Sea – the implementation of a multi-system, self-organizing broadband communications network over the sea for enhancing navigation safety through the development of e-navigation services.) The...

    Full text available for download in the portal

  • Processing of Satellite Data in the Cloud

    Publication

    The dynamic development of digital technologies, especially those dedicated to devices generating large data streams, such as all kinds of measurement equipment (temperature and humidity sensors, cameras, radio-telescopes and satellites – Internet of Things) enables more in-depth analysis of the surrounding reality, including better understanding of various natural phenomena, starting from atomic level reactions, through macroscopic...

    Full text available for download in the portal

  • Considerations of Computational Efficiency in Volunteer and Cluster Computing

    Publication

    - Year 2016

    In the paper we focus on analysis of performance and power consumption statistics for two modern environments used for computing – volunteer and cluster based systems. The former integrate computational power donated by volunteers from their own locations, often towards social oriented or targeted initiatives, be it of medical, mathematical or space nature. The latter is meant for high performance computing and is typically installed...

    Full text available for download from an external service

  • Optimization issues in distributed computing systems design

    Publication

    - Year 2014

    In recent years, we have observed a growing interest in distributed computing systems. Both industry and academia require increasing computational power to process and analyze large amounts of data, including significant areas like analysis of medical data, earthquake, or weather forecast. Since distributed computing systems – similar to computer networks – are vulnerable to failures, survivability mechanisms are indispensable...

  • Crowdsourcing and Volunteer Computing as Distributed Approach for Problem Solving

    Publication

    In this paper, a combination between volunteer computing and crowdsourcing is presented. Two paradigms of the web computing are described, analyzed and compared in detail: grid computing and volunteer computing. Characteristics of BOINC and its contribution to global Internet processing are shown with the stress put onto applications the system can facilitate and problems it can solve. An alternative instance of a grid computing...

    Full text available for download from an external service

  • Quality Modeling in Grid and Volunteer-Computing Systems

    Publication

    - Year 2013

    A model of computational quality in large-scale computing systems was presented in the previous chapter of this book. This model describes three quality attributes: performance, reliability and energy efficiency. We assumed that all processes in the system are incessantly ready to perform calculations and that communication between the processes occurs immediately. These assumptions are not true for grid and volunteer computing...

  • Modeling Parallel Applications in the MERPSYS Environment

    Publication

    - Year 2016

    The chapter presents how to model parallel computational applications for which simulation of execution in a large-scale parallel or distributed environment is performed within the MERPSYS environment. Specifically, it is shown what approaches can be adopted to model key paradigms often used for parallel applications: master-slave, geometric parallelism (single program multiple data), pipelined and divide-and-conquer applications....

  • Long Distance Geographically Distributed InfiniBand Based Computing

    Publication

    - Supercomputing Frontiers and Innovations - Year 2020

    Collaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of future collaboration using a 100 Gb optical fiber link (the connection was 900 km in length with a 9 ms RTT), we prepared two scenarios of operation. In the first one, Interdisciplinary Centre for Mathematical...

    Full text available for download in the portal

  • On Computing Curlicues Generated by Circle Homeomorphisms

    Publication

    The dataset entitled Computing dynamical curlicues contains values of consecutive points on a curlicue generated, respectively, by rotation on the circle by different angles, the Arnold circle map (with various parameter values) and an exemplary sequence as well as corresponding diameters and Birkhoff averages of these curves. We additionally provide source codes of the Matlab programs which can be used to generate and plot the...

    Full text available for download in the portal

  • Metaheuristic algorithms for optimization of resilient overlay computing systems

    Publication
    • K. Walkowiak
    • W. Charewicz
    • M. Donajski
    • J. Rak

    - Logic journal of the IGPL - Year 2014

    The idea of distributed computing systems has been gaining much interest in recent years owing to the growing amount of data to be processed for both industrial and academic purposes. However, like other systems, distributed computing systems are also vulnerable to failures. Due to strict QoS requirements, survivability guarantees are necessary for provisioning of uninterrupted service. In this article, we focus on reliability...

    Full text available for download from an external service

  • From Sequential to Parallel Implementation of NLP Using the Actor Model

    The article focuses on presenting methods allowing easy parallelization of an existing, sequential Natural Language Processing (NLP) application within a multi-core system. The actor-based solution implemented with the Akka framework has been applied and compared to an application based on Task Parallel Library (TPL) and to the original sequential application. Architectures, data and control flows are described along with execution...

    Full text available for download in the portal

  • Berkeley Open Infrastructure for Network Computing

    Publication

    - Year 2012

    The BOINC system (Berkeley Open Infrastructure for Network Computing) is presented as an interesting solution that integrates the distributed computing power of personal computers (PCs) on the Internet. The operating principle of the platform is described. Next, several selected scientific projects using BOINC are presented, which are representative of the system's application in terms of the assumed paradigm...

    Full text available for download from an external service

  • Edge-Computing based Secure E-learning Platforms

    Publication

    - Year 2022

    Implementation of Information and Communication Technologies (ICT) in E-Learning environments has brought about dramatic changes in the current educational sector. Distance learning, online learning, and networked learning are a few examples that promote educational interaction between students, lecturers and learning communities. Although being an efficient form of real learning resource, online electronic resources are subject to...

    Full text available for download in the portal

  • Complementary oriented allocation algorithm for cloud computing

    Publication

    Nowadays cloud computing is one of the most popular processing models. More and more different kinds of workloads have been migrated to clouds. This trend obliges the community to design algorithms which could optimize the usage of cloud resources and be more efficient and effective. The paper proposes and analyzes a new model of workload allocation based on the complementarity relation. An example of a use case is shown...

    Full text available for download in the portal

  • Affective computing and affective learning – methods, tools and prospects

    Every teacher knows that interest, active participation and motivation are important factors in the learning process. At the same time e-learning environments almost always address only the cognitive aspects of education. This paper provides a brief review of methods used for affect recognition, representation and processing as well as investigates how these methods may be used to address affective aspect of e-education. The paper...

    Full text available for download in the portal

  • A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache

    The paper presents a new approach to parallel image processing using byte addressable, non-volatile memory (NVRAM). We show that our custom built MPI I/O implementation of selected functions that use a distributed cache that incorporates NVRAMs located in cluster nodes can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...

    Full text available for download in the portal

  • Performance of Noise Map Service Working in Cloud Computing Environment

    In the paper a noise map service designated for the user interested in environmental noise subject is presented. It is based on cloud computing. Noise prediction algorithm and source model, developed for creating acoustic maps, are working in cloud computing environment. In the study issues related to noise modeling of sound propagation in urban spaces are discussed with a special focus on road noise. Examples of results obtained...

    Full text available for download in the portal