Search results for: hpc
-
Use of ICT infrastructure for teaching HPC
Publication
In this paper we look at modern ICT infrastructure as well as the curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI and at the node level using OpenMP and CUDA. We present...
-
Using Redis supported by NVRAM in HPC applications
Publication
Nowadays, the efficiency of storage systems is a bottleneck in many modern HPC clusters. High performance in the traditional approach (processing using files) is often difficult to obtain because of a model’s complexity and its read/write patterns. An alternative approach is to apply a key-value database, which usually has low latency and scales well. On the other hand, many key-value stores suffer from a limitation of memory...
-
BalticLSC: A low-code HPC platform for small and medium research teams
Publication
-
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Publication
The rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
Publication
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
-
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of the hardware used, as well as software solutions. The problems that we wish to solve efficiently using those systems differ in complexity, not only in magnitude but also in type: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
-
Extended investigation of performance-energy trade-offs under power capping in HPC environments
Publication
In this paper we present an investigation of performance-energy trade-offs under power capping using modern processors. The results are presented for systems targeted at both server and client markets and were collected from Intel Xeon E5 and Intel Xeon Phi server processors as well as from desktop and mobile Intel Core i7 processors. The results, when using power capping, show that we can find various interesting combinations of...
-
Pre‐exascale HPC approaches for molecular dynamics simulations. Covid‐19 research: A use case
Publication
Exascale computing has been a dream for ages and is close to becoming a reality that will impact how molecular simulations are performed, as well as the quantity and quality of the information derived from them. We review how the biomolecular simulations field is anticipating these new architectures, with emphasis on recent work from groups in the BioExcel Center of Excellence for High Performance Computing. We exemplified...
-
Dynamic Data Management Among Multiple Databases for Optimization of Parallel Computations in Heterogeneous HPC Systems
Publication
The rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...
-
Simulation of Parallel Applications on Large-scale Distributed Systems
Publication
This chapter takes the form of a review article in the field of simulating High-Performance Computing systems. We justify the need for a new versatile simulator that considers the heterogeneity, energy efficiency and reliability of HPC systems. We sketch the problems that need to be solved by such a simulator and justify the use of discrete-event simulation for this purpose. Based on a review of existing discrete-event HPC simulation solutions...
-
Higher platelet counts correlate to tumour progression and can be induced by intratumoural stroma in non-metastatic breast carcinomas
Publication
Background: Platelets support tumour progression. However, their prognostic significance and relation to circulating tumour cells (CTCs) in operable breast cancer (BrCa) are still scarcely known and, thus, merit further investigation. Methods: Preoperative platelet counts (PCs) were compared with clinical data, CTCs, 65 serum cytokines and 770 immune-related transcripts obtained using the NanoString technology. Results: High normal...
-
Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments
Publication
The paper presents the state of the art of energy-aware high-performance computing (HPC), in particular the identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single devices, clusters, grids, and clouds, while the device types considered include CPUs, GPUs, multiprocessor, and hybrid systems. Optimization goals include various combinations of...
-
Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
Publication
High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
-
Three dimensional simulations of FRC beams and panels with explicit definition of fibres-concrete interaction
Publication
High performance concrete (HPC) is a fairly novel material which has been developed rapidly in the last few decades. It exhibits superior mechanical properties and durability compared to normal concrete. HPC can also achieve superior tensile performance if strong fibres (steel or carbon) are embedded in the matrix. Thus, there is unabated interest in studying how the addition of different types of fibres modifies the...
-
Efficiency Evaluation of High Performance Computing Systems Using Data Envelopment Analysis
Publication
The paper presents an evaluation method for high performance computing (HPC) systems using multicriteria efficiency analysis. The Data Envelopment Analysis approach was applied and adapted to the specifics of HPC, which enabled us to compare the relative efficiency of systems while considering multiple parameters simultaneously. The analysis is based on the TOP500 list of the world's largest supercomputers and their parameters such as: the number...
-
Task Allocation and Scalability Evaluation for Real-Time Multimedia Processing in a Cluster Environment
Publication
An allocation algorithm for stream processing tasks is proposed (Modified best Fit Descendent, MBFD). A comparison with another solution (BFD) is provided. Tests of the algorithms in an HPC environment are described and the results are presented. A proper scalability metric is proposed and used for the evaluation of the allocation algorithm.
-
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Publication
Many important computational problems require the use of high performance computing (HPC) systems that consist of multi-level structures combining ever higher numbers of devices with various characteristics. Utilizing the full power of such systems requires programming parallel applications that are hybrid in two senses: they can utilize parallelism on multiple levels at the same time and combine programming interfaces...
-
Process arrival pattern aware algorithms for acceleration of scatter and gather operations
Publication
Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...
-
Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
Publication
GPU accelerators have become essential to the recent advance in computational power of high-performance computing (HPC) systems. Current HPC systems, reaching a power demand of approximately 20–30 megawatts, have resulted in increasing CO2 emissions and energy costs and necessitate increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...
-
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Publication
Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PAPs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted...
-
Superkomputery do wspomagania procesów gospodarczych ze szczególnym uwzględnieniem sektora bankowego [Supercomputers for supporting economic processes, with particular emphasis on the banking sector]
Publication
The article discusses the use of supercomputers to support economic processes, with particular emphasis on the banking sector. It refers to selected projects that support economic development based on supercomputers. In particular, it proposes applying HPC to implement selected artificial intelligence methods in banking, including risk assessment of selected undertakings. The proposed approach enables...
-
Modifiers for Medical Grade Polymeric Systems used in FDM 3D Printing - Short Review
Publication
FDM 3D printing could find use in a wide range of biomedical applications. Unfortunately, the number of polymeric biomaterials suitable for processing into filaments is limited. The most frequently used biomaterials for medical constructs such as bone grafts, soft tissue scaffolds or other DDS include PCL, PLA, PVA, HPC, EVA copolymer, EC and TPUs. Various modifiers such as TCP, HA, TEC, MMC could be applied...
-
Long Distance Geographically Distributed InfiniBand Based Computing
Publication
Collaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of future collaboration using a 100 Gb optical fiber link (the connection was 900 km in length with a 9 ms RTT), we prepared two scenarios of operation. In the first one, Interdisciplinary Centre for Mathematical...
-
Teaching High Performance Computing Using BeesyCluster and Relevant Usage Statistics
Publication
The paper presents motivations and experiences from using the BeesyCluster middleware for teaching high performance computing at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology. Features of BeesyCluster well suited for conducting courses are discussed, including: an easy-to-use WWW interface for developing and running applications that hides queuing systems, publishing applications as services...
-
An experimental study of self-sensing concrete enhanced with multi-wall carbon nanotubes in wedge splitting test and DIC
Publication
Concrete is the most widely used construction material worldwide because of its very good performance, formability, long-term durability, and low cost. Concrete is a brittle material prone to cracking. Extensive cracking may considerably impact durability and performance over time. The addition of a small amount of carbon nanotubes (CNT) increases the concrete’s overall electrical conductivity, enabling internal structure...
-
Improving Clairvoyant: reduction algorithm resilient to imbalanced process arrival patterns
Publication
The Clairvoyant algorithm proposed in “A novel MPI reduction algorithm resilient to imbalances in process arrival times” was analyzed, commented on and improved. The comments concern handling certain edge cases in the original pseudocode and description, i.e., adding another state of a process, improved cache friendliness, more precise complexity estimations and some other issues improving the robustness of the algorithm implementation....
-
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...
-
Full scale CFD seakeeping simulations for case study ship redesigned from V-shaped bulbous bow to X-bow hull form
Publication
Increasing propulsion efficiency, safety, comfort and operability is of great importance, especially for small ships operating in windy areas like the North Sea and the Baltic Sea. The seakeeping performance of ships and offshore structures can be analysed by different methods, and one that is becoming increasingly important is CFD RANS. The recent development of simulation techniques together with rising HPC accessibility...
-
A Parallel MPI I/O Solution Supported by Byte-addressable Non-volatile RAM Distributed Cache
Publication
While many scientific, large-scale applications are data-intensive, fast and efficient I/O operations have become of key importance for HPC environments. We propose an MPI I/O extension based on in-system distributed cache with data located in Non-volatile Random Access Memory (NVRAM) available in each cluster node. The presented architecture makes effective use of NVRAM properties such as persistence and byte-level access behind...
-
Parallel Programming for Modern High Performance Computing Systems
Publication
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...
-
BeesyCluster: Architektura systemu dostepu do sieci klastrów przez WWW/Web Services. [BeesyCluster: Architecture of a system for accessing a network of clusters via WWW/Web Services]
Publication
This paper presents the BeesyCluster system, which integrates distributed clusters through an easy-to-use WWW interface and Web Services. In its pilot version, the system will be deployed as an access portal to the clusters of the Gdańsk TASK network, using the 128-processor cluster galera and the 256-processor 64-bit cluster holk, as well as the research laboratories of the ETI Faculty of Gdańsk University of Technology. The system, based on technology...
-
Advanced Potential Energy Surfaces for Molecular Simulation
Publication
Advanced potential energy surfaces are defined as theoretical models that explicitly include many-body effects that transcend the standard fixed-charge, pairwise-additive paradigm typically used in molecular simulation. However, several factors relating to their software implementation have precluded their widespread use in condensed-phase simulations: the computational cost of the theoretical models, a paucity of approximate models...
-
DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
Publication
In this article we propose DEPO, an automatic power capping software tool that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning using one of two exploration algorithms, linear search (LS) or golden section search (GSS), finds...
-
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
-
Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
Publication
The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically distributed compute cluster. The testbed was split into two sites: CI TASK at Gdansk University of Technology and ICM at the University of Warsaw, located about 300 km from each other and connected by a fast optical fiber Ethernet-based 100 Gbps network (a 900 km part of the PIONIER backbone). Each site hosted a set...
-
Towards Scalable Simulation of Federated Learning
Publication
Federated learning (FL) allows models to be trained on decentralized data while maintaining data privacy, which unlocks the availability of large and diverse datasets for many practical applications. The ongoing development of aggregation algorithms, distribution architectures and software implementations aims to enable federated setups employing thousands of distributed devices, selected from millions. Since the availability of...