Abstract
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last of these applications needs a significant amount of research before real musical instruments can be recognized reliably in recordings. In this paper we focus primarily on how to obtain data for efficiently training, validating, and testing a deep-learning model by using a data augmentation technique. These data are transformed into 2D feature spaces, i.e., mel-scale spectrograms. The neural network used in the experiments consists of a single-block DenseNet architecture and a multi-head softmax classifier for efficient learning with mixup augmentation. For automatic labeling of noisy data, batch-wise loss masking, which is robust to corrupting outliers in the data, was applied. To train the models, various audio sample rates and different audio representations were used. The method provides promising recognition scores even with real-world recordings that contain noisy data.
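The abstract names mixup augmentation without giving details. As a rough illustration only, the general mixup idea (blending two training examples and their labels with a random weight drawn from a Beta distribution) can be sketched as follows. The function name `mixup_pair`, the value `alpha=0.2`, and the toy spectrogram shapes are assumptions made for this sketch, not settings taken from the paper.

```python
import random


def mixup_pair(spec_a, spec_b, tags_a, tags_b, alpha=0.2, rng=random):
    """Blend two spectrograms and their tag vectors with one random weight.

    spec_a, spec_b: 2D lists (mel bands x frames) of equal shape.
    tags_a, tags_b: label vectors of equal length (e.g., one-hot tags).
    alpha: Beta distribution parameter (assumed value, not from the paper).
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    mixed_tags = [lam * a + (1.0 - lam) * b for a, b in zip(tags_a, tags_b)]
    return mixed_spec, mixed_tags, lam


# Hypothetical toy data: 3 mel bands x 4 frames, two instrument classes.
rng = random.Random(0)
spec_a = [[rng.random() for _ in range(4)] for _ in range(3)]
spec_b = [[rng.random() for _ in range(4)] for _ in range(3)]
tags_a = [1.0, 0.0]  # e.g., first instrument present
tags_b = [0.0, 1.0]  # e.g., second instrument present

mixed_spec, mixed_tags, lam = mixup_pair(spec_a, spec_b, tags_a, tags_b, rng=rng)
```

The mixed label vector keeps the same total weight as the inputs, so a classifier trained on it sees soft targets rather than hard one-hot tags.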
Citations
- CrossRef: 7
- Web of Science: 0
- Scopus: 7
Authors (2): Koszewski D., Kostek B.
Full text
- Publication version: Accepted or Published Version
- License: Copyright (2020 Audio Eng. Society)
Details
- Category: Journal publication
- Type: journal articles
- Published in: JOURNAL OF THE AUDIO ENGINEERING SOCIETY, vol. 68, pages 57-65, ISSN: 1549-4950
- Language: English
- Year of publication: 2020
- Bibliographic description: Koszewski D., Kostek B.: Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing // JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Vol. 68, iss. 1/2 (2020), pp. 57-65
- DOI: 10.17743/jaes.2019.0050
- Funding sources: Statutory activity/subsidy
- Verified by: Politechnika Gdańska