dr hab. inż. Jerzy Proficz
Employment
- Director of Centrum Informatyczne TASK at Centrum Informatyczne Trójmiejskiej Akademickiej Sieci Komputerowej
- University professor at Katedra Architektury Systemów Komputerowych
Research areas
Publications
Total: 56
Publication Catalog
- Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments
  The paper presents the state of the art of energy-aware high-performance computing (HPC), in particular identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single device, clusters, grids, and clouds, while considered device types include CPUs, GPUs, multiprocessor, and hybrid systems. Optimization goals include various combinations of...
- MERPSYS: An environment for simulation of parallel application execution on large scale HPC systems
  In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling application using the Java language extended with methods representing message passing type communication routines. It also offers a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs, interconnects...
- Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
  This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...
- Analyzing energy/performance trade-offs with power capping for parallel applications on modern multi and many core processors
  In the paper we present extensive results from analyzing energy/performance trade-offs with power capping observed on four different modern CPUs, for three different parallel applications: 2D heat distribution, numerical integration and Fast Fourier Transform. The CPUs tested represent both multi-core type CPUs such as Intel® Xeon® E5, desktop and mobile i7 as well as many-core Intel® Xeon Phi™ x200 but also server, desktop...
- GPU Power Capping for Energy-Performance Trade-Offs in Training of Deep Convolutional Neural Networks for Image Recognition
  In the paper we present performance-energy trade-off investigation of training Deep Convolutional Neural Networks for image recognition. Several representative and widely adopted network models, such as Alexnet, VGG-19, Inception V3, Inception V4, Resnet50 and Resnet152 were tested using systems with Nvidia Quadro RTX 6000 as well as Nvidia V100 GPUs. Using GPU power capping we found other than default configurations minimizing...
- Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
  High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...
- Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
  We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...
- Extended investigation of performance-energy trade-offs under power capping in HPC environments
  In the paper we present an investigation of performance-energy trade-offs under power capping using modern processors. The results are presented for systems targeted at both server and client markets and were collected from Intel Xeon E5 and Intel Xeon Phi server processors as well as from desktop and mobile Intel Core i7 processors. The results, when using power capping, show that we can find various interesting combinations of...
- Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
  GPU accelerators have become essential to the recent advance in computational power of high-performance computing (HPC) systems. Current HPC systems reaching a power demand of approximately 20–30 megawatts has resulted in increasing CO2 emissions and energy costs, and necessitates increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...
- Tryton Supercomputer Capabilities for Analysis of Massive Data Streams
  The recently deployed supercomputer Tryton, located in the Academic Computer Center of Gdansk University of Technology, provides great means for massive parallel processing. Moreover, the status of the Center as one of the main network nodes in the PIONIER network enables the fast and reliable transfer of data produced by miscellaneous devices scattered in the area of the whole country. The typical examples of such data are streams...
- Modeling energy consumption of parallel applications
  The paper presents modeling and simulation of energy consumption of two types of parallel applications: geometric Single Program Multiple Data (SPMD) and divide-and-conquer (DAC). Simulation is performed in a new MERPSYS environment. The model of an application uses the Java language with an extension representing message exchange between processes working in parallel. Simulation is performed by running threads representing distinct process...
- Improving all-reduce collective operations for imbalanced process arrival patterns
  Two new algorithms for the all-reduce operation optimized for imbalanced process arrival patterns (PAPs) are presented: (1) sorted linear tree and (2) pre-reduced ring. A new way of online PAP detection, including process arrival time estimations and their distribution between cooperating processes, is also introduced. The idea, pseudo-code, implementation details, benchmark for performance evaluation and a real case example...
- Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
  The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...
- DEPO: A dynamic energy‐performance optimizer tool for automatic power capping for energy efficient high‐performance computing
  In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms, linear search (LS) and golden section search (GSS), finds...
- Categorization of Cloud Workload Types with Clustering
  The paper presents a new classification schema of IaaS cloud workload types, based on their functional characteristics. We show the results of an experiment of automatic categorization performed with different benchmarks that represent particular workload types. Monitoring of resource utilization allowed us to construct workload models that can be processed with machine learning algorithms. The direct connection between the functional...
- Mobile Offloading Framework: Solution for Optimizing Mobile Applications Using Cloud Computing
  The number of mobile devices and applications has been growing rapidly in recent years. The capabilities and performance of these devices can be tremendously extended with the integration of cloud computing. However, multiple challenges regarding the implementation of this type of mobile application are known, such as differences in architecture, optimization and operating system support. This paper summarizes issues with mobile cloud computing and...
- Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns
  The paper presents an evaluation of all-reduce collective MPI algorithms for an environment based on a geographically-distributed compute cluster. The testbed was split into two sites: CI TASK at Gdansk University of Technology and ICM at University of Warsaw, located about 300 km from each other, both connected by a fast optical fiber Ethernet-based 100 Gbps network (900 km part of the PIONIER backbone). Each site hosted a set...
- UNRES-GPU for Physics-Based Coarse-Grained Simulations of Protein Systems at Biological Time- and Size-Scales
  The dynamics of the virus-like particles (VLPs) corresponding to the GII.4 Houston, GII.2 SMV, and GI.1 Norwalk strains of human noroviruses (HuNoV) that cause gastroenteritis was investigated by means of long-time (about 30 μs in the laboratory timescale) molecular dynamics simulations with the coarse-grained UNRES force field. The main motion of VLP units turned out to be the bending at the junction between the P1 subdomain (that...
- Long Distance Geographically Distributed InfiniBand Based Computing
  Collaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of future collaboration using a 100 Gb optical fiber link (the connection was 900 km in length with a 9 ms RTT), we prepared two scenarios of operation. In the first one, Interdisciplinary Centre for Mathematical...
- Process arrival pattern aware algorithms for acceleration of scatter and gather operations
  Imbalanced process arrival patterns (PAPs) are ubiquitous in many parallel and distributed systems, especially in HPC ones. The collective operations, e.g. in MPI, are designed for equal process arrival times (PATs), and are not optimized for deviations in their appearance. We propose eight new PAP-aware algorithms for the scatter and gather operations. They are binomial or linear tree adaptations introducing additional process...