total: 18
Search results for: CACHE
-
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication: In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...
-
Cache service for maps presentation in distributed information data exchange system
Publication: The paper proposes a cache implementation for map presentation in a distributed information data exchange system. The concept of the cache service is described in the context of the elements of this system, which control and present on maps the positions and other identification data of vessels and other suspicious objects on the territorial sea, the sea coast and internal sea waters. The proposed...
-
A Solution to Image Processing with Parallel MPI I/O and Distributed NVRAM Cache
Publication: The paper presents a new approach to parallel image processing using byte-addressable, non-volatile memory (NVRAM). We show that our custom-built MPI I/O implementation of selected functions, which uses a distributed cache incorporating NVRAMs located in cluster nodes, can be used for efficient processing of large images. We demonstrate performance benefits of such a solution compared to a traditional implementation without NVRAM...
-
Three levels of fail-safe mode in MPI I/O NVRAM distributed cache
Publication: The paper presents the architecture and design of three versions of fail-safe data storage in a distributed cache using NVRAM in cluster nodes. In the first one, cache consistency is assured through additional buffering of write requests. The second one is based on additional write log managers running on different nodes. The third one benefits from synchronization with a Parallel File System (PFS) for saving data into a new file which...
-
Multi-agent large-scale parallel crowd simulation with NVRAM-based distributed cache
Publication: This paper presents the architecture, main components and performance results for a parallel and modular agent-based environment aimed at crowd simulation. The environment allows simulating thousands or more agents on maps of square kilometers or more, features a modular design and incorporates non-volatile RAM (NVRAM) with a fail-safe mode that can be activated to allow computations to continue from a recently analyzed state in...
-
Taking advantage of the shared explicit cache system based critical sections in the shared memory parallel architectures
Publication: The article presents a new method of implementing critical sections in parallel shared-memory architectures, such as integrated multithreaded multiprocessor systems. The method is a modification and extension of a method called Folding, available in network processors, and is similar in its assumptions to a technique called cache-based locking. Compared to the available methods, the new method removes scalability problems and...
-
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication: While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...
-
Journal of Cachexia Sarcopenia and Muscle
Journals
-
Raman spectroscopy in investigation of rheometric processes
Publication: The paper addresses the problem of analysing the thin near-wall oil layer using Raman spectroscopy. The parameters of the thin near-wall layer (thickness, chemical structure) provide important information for assessing the course of the extrusion process of model pastes. Measurement results and the designs of Raman measurement setups are presented.
-
Optical monitoring of thin oil film thickness in extrusion processes
Publication: The paper addresses the problem of measuring the thickness of the thin near-wall oil layer during the extrusion of model ceramics. It describes in detail the physical phenomena occurring in the near-wall layer and the dynamic parameters used to assess its thickness. The authors also describe the measurement method and the design of the measurement setups.
-
Integrating SHECS-based critical sections with hardware SMP scheduler in TLP-CMPs
Publication: The article presents the concept of integrating critical sections based on SHECS (shared explicit cache system) with a hardware SMP scheduler in integrated multiprocessor architectures with hardware multithreading (TLP-CMPs). It compares the performance of integrating SHECS critical sections with a software SMP scheduler against using a hardware SMP scheduler. The execution environment...
-
Performance Analysis of Convolutional Neural Networks on Embedded Systems
Publication: Machine learning is no longer confined to cloud and high-end server systems and has been successfully deployed on devices that are part of the Internet of Things. This paper presents an analysis of the performance of convolutional neural networks deployed on an ARM microcontroller. Inference time is measured for different core frequencies, with and without DSP instructions and disabled access to cache. Networks use both real-valued and...
-
A Fail-Safe NVRAM Based Mechanism for Efficient Creation and Recovery of Data Copies in Parallel MPI Applications
Publication: The paper presents a fail-safe NVRAM-based mechanism for creation and recovery of data copies during parallel MPI application runtime. Specifically, we target a cluster environment in which each node has an NVRAM installed. Our previously developed extension to the MPI I/O API can take advantage of NVRAM regions in order to provide an NVRAM-based cache-like mechanism to significantly speed up I/O operations and allow to preload...
-
Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns
Publication: The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented on and improved. The comments concern handling certain edge cases in the original pseudocode and description, i.e., adding another state of a process, improved cache friendliness, more precise complexity estimations and some other issues improving the robustness of the algorithm implementation....
-
Performance assessment of OpenMP constructs and benchmarks using modern compilers and multi-core CPUs
Publication: Considering ongoing developments of both modern CPUs, especially in the context of increasing numbers of cores, cache memory and architectures, as well as compilers, there is a constant need for benchmarking representative and frequently run workloads. The key metric is speed-up, as the computational power of modern CPUs stems mainly from using multiple cores. In this paper, we show and discuss results from running codes such as:...
-
A highly-efficient technique for evaluating bond-orientational order parameters
Publication: We propose a novel, highly-efficient approach for the evaluation of bond-orientational order parameters (BOPs). Our approach exploits the properties of spherical harmonics and Wigner 3j-symbols to reduce the number of terms in the expressions for BOPs, and employs simultaneous interpolation of normalised associated Legendre polynomials and trigonometric functions to dramatically reduce the total number of arithmetic operations....
-
Multi-agent large-scale parallel crowd simulation
Publication: This paper presents the design, implementation and performance results of a new modular, parallel, agent-based and large-scale crowd simulation environment. A parallel application, written in C with MPI, was implemented and run in this parallel environment for simulation and visualization of an evacuation scenario at Gdansk University of Technology, Poland and further in the area of districts of Gdansk. The application uses a...
-
Improving web user experience with caching user interface
Publication: In human-computer interaction, response time is generally assumed not to significantly exceed 1-2 seconds. While natural competition among public Web services on the Internet ensures that such limits are widely adhered to, some Web environments are less competitive and offer a much worse user experience in terms of response time. This paper describes a solution to significantly improve user experience in terms of response time with only modification...