Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool
Abstract
GPU accelerators have become essential to the recent advances in the computational power of high-performance computing (HPC) systems. Current HPC systems have reached a power demand of approximately 20–30 megawatts, which has resulted in increasing CO2 emissions and energy costs and necessitates increasingly complex cooling systems. This is a very real challenge. To address it, new mechanisms of software power control could be employed. In this paper, a new dynamic method of software power limiting is introduced for recent NVIDIA GPUs, implemented as a software tool called the Dynamic Energy-Performance Optimiser (DEPO). DEPO minimizes the energy consumption of CUDA-based GPU workloads with respect to one of three given metrics: minimum energy (E), Energy-Delay Product (EDP) and Energy-Delay Sum (EDS). The tool gathers power measurements from the NVIDIA Management Library (NVML). Application progress is measured at runtime by kernel counting through the CUDA Profiling Tools Interface (CUPTI). We have evaluated the DEPO tool on the NVIDIA RTX A4500 and A100 GPUs with machine learning workloads. Depending on the application (training of neural networks: ResNet152, DenseNet161, VGG-19, or a GEMM benchmark), for the E target metric we obtained energy savings exceeding 22% on both the NVIDIA A100 and RTX A4500 GPUs, while the performance drop never exceeded 20%. Using one of the bi-objective metrics, EDP or EDS, allowed finding configurations that saved 15% or 18% of energy with only 8% performance loss. For most of the experiments, the percentage performance penalty is lower than the energy savings. This demonstrates the tool's potential for reducing energy consumption in HPC systems with GPU accelerators.
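The three target metrics named in the abstract can be illustrated with a minimal sketch. This is not DEPO's implementation: the `Sample` type, the `pick_power_cap` helper and the measurements are hypothetical, EDP is taken in its usual form (energy times runtime), and EDS is assumed here to be a weighted sum of energy and runtime, each normalized to the run at the highest power cap. The weight `alpha` is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hypothetical measurement of a fixed amount of work under a power cap."""
    cap_w: float        # power limit applied to the GPU, in watts
    avg_power_w: float  # mean measured power under that limit, in watts
    runtime_s: float    # time needed to finish the work, in seconds

    @property
    def energy_j(self) -> float:
        # Energy in joules = average power * time.
        return self.avg_power_w * self.runtime_s


def score(s: Sample, ref: Sample, metric: str, alpha: float = 0.5) -> float:
    """Lower is better. `ref` is the baseline run used to normalize EDS."""
    if metric == "E":
        return s.energy_j
    if metric == "EDP":
        return s.energy_j * s.runtime_s
    if metric == "EDS":
        # Assumed form: weighted sum of normalized energy and normalized delay.
        return (alpha * s.energy_j / ref.energy_j
                + (1.0 - alpha) * s.runtime_s / ref.runtime_s)
    raise ValueError(f"unknown metric: {metric}")


def pick_power_cap(samples: list[Sample], metric: str, alpha: float = 0.5) -> Sample:
    """Return the sample whose power cap minimizes the chosen metric."""
    ref = max(samples, key=lambda s: s.cap_w)  # highest cap acts as the baseline
    return min(samples, key=lambda s: score(s, ref, metric, alpha))


# Made-up numbers for illustration only; not results from the paper.
samples = [
    Sample(cap_w=300, avg_power_w=290, runtime_s=100),
    Sample(cap_w=250, avg_power_w=245, runtime_s=104),
    Sample(cap_w=200, avg_power_w=198, runtime_s=115),
    Sample(cap_w=150, avg_power_w=149, runtime_s=150),
]

for metric in ("E", "EDP", "EDS"):
    best = pick_power_cap(samples, metric)
    print(metric, best.cap_w)
```

With these made-up samples the three metrics select different caps: minimizing E alone accepts the largest slowdown, while EDP and EDS trade some energy savings for a smaller performance penalty, which matches the qualitative behaviour the abstract reports.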
Citations
- CrossRef: 17
- Web of Science: 0
- Scopus: 13
Details
- Category: Articles
- Type: journal articles
- Published in: Future Generation Computer Systems-The International Journal of Grid Computing-Theory Methods and Applications, no. 145, pages 396 - 414, ISSN: 0167-739X
- Language: English
- Publication year: 2023
- Bibliographic description: Krzywaniak A., Czarnul P., Proficz J.: Dynamic GPU power capping with online performance tracing for energy efficient GPU computing using DEPO tool // Future Generation Computer Systems-The International Journal of Grid Computing-Theory Methods and Applications, Vol. 145, (2023), pp. 396-414
- DOI: 10.1016/j.future.2023.03.041
- Sources of funding: Statutory activity/subsidy
- Verified by: Gdańsk University of Technology