The Impact of 8- and 4-Bit Quantization on the Accuracy and Silicon Area Footprint of Tiny Neural Networks
Abstract
In the field of embedded and edge devices, efforts have been made to make deep neural network models smaller because of the limited memory and computational resources available. Typical model footprints are under 100 KB. However, for some applications, even models of this size are too large. In low-voltage sensors, signals must be processed, classified, or predicted with an order of magnitude less memory. Models can be downsized by limiting the number of parameters or by quantizing their weights, but both operations degrade the accuracy of the network. This study tested the effect of such model downscaling techniques on accuracy. The main idea was to reduce neural network models to 3 k parameters or fewer. Tests were conducted on three different neural network architectures in the context of three separate research problems that model realistic tasks for small networks. The impact of the reduction on accuracy depends mainly on the network's initial size: a network reduced from 40 k parameters lost 16 percentage points of accuracy, while a network reduced from 20 k parameters lost 8 percentage points. To obtain the best results, knowledge distillation and quantization-aware training were used during training. As a result, the accuracy of the 4-bit networks did not differ significantly from that of the 8-bit ones, and both were approximately four percentage points worse than the full-precision networks. For the fully connected network, synthesis to an ASIC (application-specific integrated circuit) was also performed to demonstrate the reduction in the silicon area occupied by the model. The 4-bit quantization reduces the silicon area footprint by 90%.
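The abstract mentions quantization-aware training of very small networks with 4-bit weights. As a rough illustration only (not the authors' implementation), the sketch below shows one common way to simulate low-bit weights during training: symmetric per-tensor fake quantization with a straight-through estimator, applied to a tiny fully connected model in the ~3 k parameter range. The layer sizes, bit width, and the QuantLinear/fake_quantize names are illustrative assumptions.

```python
# Minimal sketch of 4-bit quantization-aware training for a tiny MLP.
# Assumptions: symmetric per-tensor scaling, straight-through estimator,
# layer sizes chosen only to land near the paper's ~3 k parameter target.
import torch
import torch.nn as nn


def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Simulate low-bit weights in float (QAT style)."""
    qmax = 2 ** (bits - 1) - 1                    # 7 for signed 4-bit
    scale = w.abs().max().clamp(min=1e-8) / qmax  # map max |w| onto qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    dequant = q * scale
    # Straight-through estimator: forward pass uses quantized weights,
    # backward pass lets gradients flow through unchanged.
    return w + (dequant - w).detach()


class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized on every forward pass."""

    def __init__(self, in_features, out_features, bits: int = 4):
        super().__init__(in_features, out_features)
        self.bits = bits

    def forward(self, x):
        w_q = fake_quantize(self.weight, self.bits)
        return nn.functional.linear(x, w_q, self.bias)


# A tiny network of roughly 2.8 k parameters (illustrative sizes).
model = nn.Sequential(
    QuantLinear(32, 64, bits=4),
    nn.ReLU(),
    QuantLinear(64, 10, bits=4),
)
print(sum(p.numel() for p in model.parameters()))  # ~2.8 k parameters
```

After training with this kind of simulated quantization, the rounded integer weights and per-tensor scales would be exported for the fixed-point hardware implementation; the ASIC synthesis step described in the abstract is outside the scope of this sketch.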
Authors (4)
- P. Tumialis
- M. Skierkowski
- J. Przychodny
- P. Obszarski
Full text
The full text of the publication is not available in the portal.
Detailed information
- Category: journal publication
- Type: journal article
- Published in: Electronics, no. 14, ISSN: 2079-9292
- Language: English
- Year of publication: 2025
- Bibliographic description: Tumialis P., Skierkowski M., Przychodny J., Obszarski P.: The Impact of 8- and 4-Bit Quantization on the Accuracy and Silicon Area Footprint of Tiny Neural Networks // Electronics, vol. 14, iss. 1 (2024), p. 14
- DOI: 10.3390/electronics14010014
- Sources of funding: statutory activity/subsidy
- Verification: Politechnika Gdańska