Deep Learning Optimization for Edge Devices: Analysis of Training Quantization Parameters

Alicja Kwaśniewska; Maciej Szankin; Mateusz Ozga; Jason Wolfe; Arun Das; Adam Zajac; Jacek Rumiński; Paul Rad

Deep Learning Optimization for Edge Devices: Analysis of Training Quantization Parameters

Abstrakt

This paper focuses on convolution neural network quantization problem. The quantization has a distinct stage of data conversion from floating-point into integer-point numbers. In general, the process of quantization is associated with the reduction of the matrix dimension via limited precision of the numbers. However, the training and inference stages of deep learning neural network are limited by the space of the memory and a variety of factors including programming complexity and even reliability of the system. On the whole the process of quantization becomes more and more popular due to significant impact on performance and minimal accuracy loss. Various techniques for networks quantization have been already proposed, including quantization aware training and integer arithmetic-only inference. Yet, a detailed comparison of various quantization configurations, combining all proposed methods haven’t been presented yet. This comparison is important to understand selection of quantization hyperparameters during training to optimize networks for inference while preserving their robustness. In this work, we perform in-depth analysis of parameters in the quantization aware training, the process of simulating precision loss in the forward pass by quantizing and dequantizing tensors. Specifically, we modify rounding modes, input preprocessing, output data signedness, bitwidth of the quantization and locations of precision loss simulation to evaluate how they affect accuracy of deep neural network aimed at performing efficient calculations on resource-constrained devices.

Autorzy (8)

Cytuj jako

Pełna treść

pobierz publikację

pobrano 358 razy

Wersja publikacji: Accepted albo Published Version
Licencja: Copyright (2019 IEEE)

Słowa kluczowe

Informacje szczegółowe

Kategoria:: Aktywność konferencyjna
Typ:: publikacja w wydawnictwie zbiorowym recenzowanym (także w materiałach konferencyjnych)
Język:: angielski
Rok wydania:: 2019
Opis bibliograficzny:: Kwaśniewska A., Szankin M., Ozga M., Wolfe J., Das A., Zajac A., Rumiński J., Rad P.: Deep Learning Optimization for Edge Devices: Analysis of Training Quantization Parameters// / : , 2019,
Weryfikacja:: Politechnika Gdańska

wyświetlono 197 razy

M. Wang,
T. Sirlapu,
A. Kwaśniewska
+ 3 autorów

2018

Meta Tagi