Search results for: energy-aware computing, high performance computing, green computing, nas parallel benchmark

total: 1015

displaying 1000 best results

Energy-Aware Scheduling for High-Performance Computing Systems: A Survey
Publication
- ENERGIES - Year 2023
High-performance computing (HPC), according to its name, is traditionally oriented toward performance, especially the execution time and scalability of the computations. However, due to the high cost and environmental issues, energy consumption has already become a very important factor that needs to be considered. The paper presents a survey of energy-aware scheduling methods used in a modern HPC environment, starting with the...

Full text available to download
Parallel Programming for Modern High Performance Computing Systems
Publication
- P. Czarnul
- Year 2018
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge to be able to exploit the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected and...

Full text to download in external service
Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments
Publication
- Scientific Programming - Year 2019
The paper presents state of the art of energy-aware high-performance computing (HPC), in particular identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single device, clusters, grids, and clouds while considered device types include CPUs, GPUs, multiprocessor, and hybrid systems. Optimization goals include various combinations of...

Full text available to download
Survey of Methodologies, Approaches, and Challenges in Parallel Programming Using High-Performance Computing Systems
Publication
- Scientific Programming - Year 2020
This paper provides a review of contemporary methodologies and APIs for parallel programming, with representative technologies selected in terms of target system type (shared memory, distributed, and hybrid), communication patterns (one-sided and two-sided), and programming abstraction level. We analyze representatives in terms of many aspects including programming model, languages, supported platforms, license, optimization goals,...

Full text available to download
Performance and Power-Aware Modeling of MPI Applications for Cluster Computing
Publication
- J. Proficz
- P. Czarnul
- Year 2016
The paper presents modeling of performance and power consumption when running parallel applications on modern cluster-based systems. The model includes basic so-called blocks representing either computations or communication. The latter includes both point-to-point and collective communication. Real measurements were performed using MPI applications and routines run on three different clusters with both Infiniband and Gigabit Ethernet...

Full text available to download
Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption
Publication
- P. Rościszewski
- Year 2018
Many important computational problems require utilization of high performance computing (HPC) systems that consist of multi-level structures combining higher and higher numbers of devices with various characteristics. Utilizing full power of such systems requires programming parallel applications that are hybrid in two meanings: they can utilize parallelism on multiple levels at the same time and combine together programming interfaces...

Full text to download in external service
Efficient parallel implementation of crowd simulation using a hybrid CPU+GPU high performance computing system
Publication
- J. Skrzypczak
- P. Czarnul
- SIMULATION MODELLING PRACTICE AND THEORY - Year 2023
In the paper we present a modern efficient parallel OpenMP+CUDA implementation of crowd simulation for hybrid CPU+GPU systems and demonstrate its higher performance over CPU-only and GPU-only implementations for several problem sizes including 10 000, 50 000, 100 000, 500 000 and 1 000 000 agents. We show how performance varies for various tile sizes and what CPU–GPU load balancing settings shall be preferred for various domain...

Full text to download in external service
Parallel multithread computing for spectroscopic analysis in optical coherence tomography
Publication
- Year 2014
Spectroscopic Optical Coherence Tomography (SOCT) is an extension of Optical Coherence Tomography (OCT). It allows gathering spectroscopic information from individual scattering points inside the sample. It is based on time-frequency analysis of interferometric signals. Such analysis requires calculating hundreds of Fourier transforms while performing a single A-scan. Additionally, further processing of acquired spectroscopic information...

Full text to download in external service
Highly parallel distributed computing systems with optical interconnections
Publication
- J. Just
- R. Romaniuk
- R. S. Romaniuk
- Microprocessing and Microprogramming - Year 1989
Full text to download in external service
Highly Parallel Distributed Computing System With Optical Interconnections
Publication
- J. Just
- R. Romaniuk
- R. S. Romaniuk
- Year 1990
Full text to download in external service
Review of parallel computing methods and tools for FPGA technology
Publication
- R. Cieszewski
- M. Linczuk
- K. Pozniak
- R. Romaniuk
- R. S. Romaniuk
- Year 2013
Full text to download in external service
Journal of High Performance Computing

Journals

ISSN: 2230-7192
CCF Transactions on High Performance Computing

Journals

ISSN: 2524-4922 , eISSN: 2524-4930
International Journal of Grid and High Performance Computing

Journals

ISSN: 1938-0259 , eISSN: 1938-0267
International Journal of High Performance Computing and Networking

Journals

ISSN: 1740-0562
Paweł Czarnul dr hab. inż.

People

Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics

Paweł Czarnul obtained a D.Sc. degree in computer science in 2015 and a Ph.D. in computer science, granted by a council at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, in 2003. His research interests include: parallel and distributed processing including clusters, accelerators, coprocessors; distributed information systems; architectures of distributed systems; programming mobile devices....
High Performance Computing Systems Lab / Systemy Obliczeniowe Wysokiej Wydajności - laboratorium
e-Learning Courses
- M. Rewieński
Systemy Obliczeniowe Wysokiej Wydajności (SOWW) - Laboratorium
Systemy obliczeniowe wysokiej wydajności/High performance computing systems / L-22/23
e-Learning Courses
- A. Krzywaniak
- P. Rościszewski
- M. Matuszek
- P. Januszewski
- S. Olewniczak
- P. Czarnul
- R. Kałaska
Systemy obliczeniowe wysokiej wydajności/High performance computing systems / L-23/24
e-Learning Courses
- M. Matuszek
- A. Brzeski
- P. Czarnul
- R. Kałaska
Systemy obliczeniowe wysokiej wydajności/High performance computing >> systems - Nowy - Nowy
e-Learning Courses
- A. Krzywaniak
- P. Rościszewski
- M. Matuszek
- K. Draszawka
- P. Januszewski
- S. Olewniczak
- P. Czarnul
- R. Kałaska
Paweł Rościszewski dr inż.

People

Paweł Rościszewski received his PhD in Computer Science at Gdańsk University of Technology in 2018, based on his PhD thesis entitled "Optimization of hybrid parallel application execution in heterogeneous high performance computing systems considering execution time and power consumption". Currently, he is an Assistant Professor at the Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Poland....
Performance/energy aware optimization of parallel applications on GPUs under power capping
Publication
- A. Krzywaniak
- P. Czarnul
- Year 2020
In the paper we present an approach and results from application of the modern power capping mechanism available for NVIDIA GPUs to benchmarks such as NAS Parallel Benchmarks BT, SP and LU as well as cublasgemm-benchmark, which are widely used for assessment of high performance computing systems’ performance. Specifically, depending on the benchmarks, various power cap configurations are best for desired trade-off of performance...

Full text available to download
Jerzy Proficz dr hab. inż.

People

Academic Computer Centre TASK, Department of Computer Architecture

Jerzy Proficz, Ph.D., is the director of the Centre of Informatics – Tricity Academic Supercomputer & networK (CI TASK) at Gdansk University of Technology, Poland. He earned his Ph.D. (2012) in HPC (High Performance Computing) in the subject of supercomputer resource provisioning and management for on-line data processing and his D.Sc. (2022) in the discipline: Information and Communication Technology. Author and co-author of over 50...
INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS

Journals

ISSN: 1094-3420 , eISSN: 1741-2846
Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
Publication
- Future Generation Computer Systems-The International Journal of Grid Computing-Theory Methods and Applications - Year 2023
GPU accelerators have become essential to the recent advance in computational power of high-performance computing (HPC) systems. Current HPC systems’ reaching an approximately 20–30 megawatt power demand has resulted in increasing CO2 emissions and energy costs, and necessitates increasingly complex cooling systems. This is a very real challenge. To address this, new mechanisms of software power control could be employed. In this...

Full text to download in external service
CNN-CLFFA: Support Mobile Edge Computing in Transportation Cyber Physical System
Publication
- A. Bhansali
- R. Kumar Patra
- P. Bidare Divakarachari
- P. Falkowski-Gilski
- G. Shivakanth
- S. N. Patil
- IEEE Access - Year 2024
In the present scenario, the transportation Cyber Physical System (CPS) improves the reliability and efficiency of the transportation systems by enhancing the interactions between the physical and cyber systems. With the provision of better storage ability and enhanced computing, cloud computing extends transportation CPS in Mobile Edge Computing (MEC). By inspecting the existing literatures, the cloud computing cannot fulfill...

Full text available to download
General Provisioning Strategy for Local Specialized Cloud Computing Environments
Publication
- P. Orzechowski
- H. Krawczyk
- Year 2023
The well-known management strategies in cloud computing based on SLA requirements are considered. A deterministic parallel provisioning algorithm has been prepared and used to show its behavior for three different requirements: load balancing, consolidation, and fault tolerance. The impact of these strategies on the total execution time of different sets of services is analyzed for randomly chosen sets of data. This makes it possible...

Full text to download in external service
Towards an efficient multi-stage Riemann solver for nuclear physics simulations
Publication
- S. Cygert
- J. Porter-Sobieraj
- D. Kikoła
- J. Sikorski
- M. Słodkowski
- Year 2013
Relativistic numerical hydrodynamics is an important tool in high energy nuclear science. However, such simulations are extremely demanding in terms of computing power. This paper focuses on improving the speed of solving the Riemann problem with the MUSTA-FORCE algorithm by employing the CUDA parallel programming model. We also propose a new approach to 3D finite difference algorithms, which employ a GPU that uses surface memory....

Full text to download in external service
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication
- P. Falkowski-Gilski
- M. Plewka
- Year 2022
Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text to download in external service
Karol Zdzisław Zalewski mgr inż.

People
Optimization of parallel implementation of UNRES package for coarse‐grained simulations to treat large proteins
Publication
- A. Sieradzan
- J. Sans‐Duñó
- E. Lubecka
- C. Czaplewski
- A. Lipska
- H. Leszczyński
- K. Ocetkiewicz
- J. Proficz
- P. Czarnul
- H. Krawczyk
- A. Liwo
- JOURNAL OF COMPUTATIONAL CHEMISTRY - Year 2023
We report major algorithmic improvements of the UNRES package for physics-based coarse-grained simulations of proteins. These include (i) introduction of interaction lists to optimize computations, (ii) transforming the inertia matrix to a pentadiagonal form to reduce computing and memory requirements, (iii) removing explicit angles and dihedral angles from energy expressions and recoding the most time-consuming energy/force terms...

Full text available to download
Auto-tuning methodology for configuration and application parameters of hybrid CPU + GPU parallel systems based on expert knowledge
Publication
- P. Czarnul
- P. Rościszewski
- Year 2020
Auto-tuning of configuration and application parameters allows achieving significant performance gains in many contemporary compute-intensive applications. Feasible search spaces of parameters tend to become too big to allow for exhaustive search in the auto-tuning process. Expert knowledge about the utilized computing systems becomes useful to prune the search space and new methodologies are needed in the face of emerging heterogeneous...

Full text available to download
Use of ICT infrastructure for teaching HPC
Publication
- P. Czarnul
- M. Matuszek
- Year 2019
In this paper we look at modern ICT infrastructure as well as curriculum used for conducting a contemporary course on high performance computing taught over several years at the Faculty of Electronics Telecommunications and Informatics, Gdansk University of Technology, Poland. We describe the infrastructure in the context of teaching parallel programming at the cluster level using MPI, node level using OpenMP and CUDA. We present...

Full text to download in external service
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster
Publication
- K. Łopatka
- A. Czyżewski
- Year 2015
A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
Toward Intelligent Recommendations Using the Neural Knowledge DNA
Publication
- G. Ning
- C. Wu
- H. Zhang
- E. Szczerbicki
- CYBERNETICS AND SYSTEMS - Year 2021
In this paper we propose a novel recommendation approach using past news click data and the Neural Knowledge DNA (NK-DNA). The Neural Knowledge DNA is a novel knowledge representation method designed to support discovering, storing, reusing, improving, and sharing knowledge among machines and computing systems. We examine our approach for news recommendation tasks on the MIND benchmark dataset. By taking advantage of NK-DNA, deep...

Full text available to download
A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors
Publication
- SENSORS - Year 2020
In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...

Full text available to download
Benchmarking Deep Neural Network Training Using Multi- and Many-Core Processors
Publication
- P. Czarnul
- K. Jabłońska
- International Journal of Computer Information Systems and Industrial Management Applications - Year 2020
In the paper we provide thorough benchmarking of deep neural network (DNN) training on modern multi- and many-core Intel processors in order to assess performance differences for various deep learning as well as parallel computing parameters. We present performance of DNN training for Alexnet, Googlenet, Googlenet_v2 as well as Resnet_50 for various engines used by the deep learning framework, for various batch sizes. Furthermore,...

Full text to download in external service
Influence of YARN Schedulers on Power Consumption and Processing Time for Various Big Data Benchmarks
Publication
- TASK Quarterly - Year 2019
Climate change caused by human activities can influence the lives of everybody on the planet. The environmental concerns must be taken into consideration by all fields of study including ICT. Green Computing aims to reduce negative effects of IT on the environment while, at the same time, maintaining all of the possible benefits it provides. Several Big Data platforms like Apache Spark or YARN have become widely used in analytics and...

Full text available to download
Network-aware Data Prefetching Optimization of Computations in a Heterogeneous HPC Framework
Publication
- P. Rościszewski
- International Journal of Computer Networks & Communications (IJCNC) - Year 2014
Rapid development of diverse computer architectures and hardware accelerators means that designing parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to be run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open for...

Full text available to download
Multi-criterion decision making in distributed systems by quantum evolutionary algorithms
Publication
- J. Balicki
- H. Balicka
- J. Masiejczyk
- A. Zacniewski
- Year 2010
Decision making by the AQMEA (Adaptive Quantum-based Multi-criterion Evolutionary Algorithm) has been considered for distributed computer systems. AQMEA has been extended by a chromosome representation with the registry of the smallest units of quantum information. Evolutionary computing with Q-bit chromosomes has been shown to be characterized by greater population diversity than other representations, since individuals represent...
Jerzy Konorski dr hab. inż.

People

Department of Computer Communications

Jerzy Konorski received his M. Sc. degree in telecommunications from Gdansk University of Technology, Poland, and his Ph. D. degree in computer science from the Polish Academy of Sciences, Warsaw, Poland. In 2007, he defended his D. Sc. thesis at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology. He has authored over 150 papers, led scientific projects funded by the European Union,...
Tuning matrix-vector multiplication on GPU
Publication
- A. Dziekoński
- M. Mrozowski
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
A matrix times vector multiplication (matvec) is a cornerstone operation in iterative methods of solving large sparse systems of equations such as the conjugate gradients method (cg), the minimal residual method (minres), the generalized residual method (gmres) and exerts an influence on overall performance of those methods. An implementation of matvec is particularly demanding when one executes computations on a GPU (Graphics...
Programming, tuning and automatic parallelization of irregular divide and conquer applications in DAMPVM/DAC.
Publication
- P. Czarnul
- INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS - Year 2003
The article presents a new, object-oriented programming template, DAMPVM/DAC, which was implemented using the DAMPVM system and enables automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime.
A Regular Expression Matching Application with Configurable Data Intensity for Testing Heterogeneous HPC Systems
Publication
- Year 2014
Modern High Performance Computing (HPC) systems are becoming increasingly heterogeneous in terms of utilized hardware, as well as software solutions. The problems that we wish to solve efficiently using those systems have different complexity, not only considering magnitude, but also the type of complexity: computation, data or communication intensity. Developing new mechanisms for dealing with those complexities or choosing an...
Distributed NVRAM Cache – Optimization and Evaluation with Power of Adjacency Matrix
Publication
- A. Malinowski
- P. Czarnul
- Year 2017
In this paper we build on our previously proposed MPI I/O NVRAM distributed cache for high performance computing. In each cluster node it incorporates NVRAMs which are used as an intermediate cache layer between an application and a file for fast read/write operations supported through wrappers of MPI I/O functions. In this paper we propose optimizations of the solution including handling of write requests with a synchronous mode,...

Full text to download in external service
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
Publication
- M. Szczodrak
- A. Czyżewski
- Year 2013
A concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages is presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed online is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of implemented algorithm which uses...

Full text to download in external service
BeesyCluster as Front-End for High Performance Computing Services
Publication
- P. Czarnul
- TASK Quarterly - Year 2015
The paper presents the BeesyCluster system as a middleware allowing invocation of services on high performance computing resources within the NIWA Centre of Competence project. Access is possible through both WWW and SOAP Web Service interfaces. The former allows non-experienced users to invoke both simple and complex services exposed through easy-to-use servlets. The latter is meant for integration of external applications with...

Full text available to download
Teaching High Performance Computing Using BeesyCluster and Relevant Usage Statistics
Publication
- P. Czarnul
- Year 2014
The paper presents motivations and experiences from using the BeesyCluster middleware for teaching high performance computing at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology. Features of BeesyCluster well suited for conducting courses are discussed including: easy-to-use WWW interface for application development and running hiding queuing systems, publishing applications as services...

Full text to download in external service
Acceleration of the DGF-FDTD method on GPU using the CUDA technology
Publication
- Year 2015
We present a parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD) method on a graphics processing unit (GPU). The compute unified device architecture (CUDA) parallel computing platform is applied in the developed implementation. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of DGF-FDTD on GPU. The efficiency of parallel computations...

Full text to download in external service
Benchmarking Parallel Chess Search in Stockfish on Intel Xeon and Intel Xeon Phi Processors
Publication
- P. Czarnul
- Year 2018
The paper presents results from benchmarking the parallel multithreaded Stockfish chess engine on selected multi- and many-core processors. It is shown how the strength of play for an n-thread version compares to 1-thread version on both Intel Xeon and latest Intel Xeon Phi x200 processors. Results such as the number of wins, losses and draws are presented and how these change for growing numbers of threads. Impact of using particular...

Full text to download in external service
