Abstract
Developing signal processing methods that extract information automatically has potential in several applications, for example searching multimedia by audio content, building context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. The last of these, however, still requires significant research before real musical instruments can be recognized reliably in recordings. In this paper we focus primarily on how to obtain data for efficiently training, validating, and testing a deep-learning model by means of data augmentation. The data are transformed into 2D feature spaces, i.e., mel-scale spectrograms. The neural network used in the experiments consists of a single-block DenseNet architecture and a multi-head softmax classifier for efficient learning with mixup augmentation. For automatic labeling of noisy data, batch-wise loss masking, which is robust to corrupting outliers in the data, was applied. The models were trained with various audio sample rates and different audio representations. The method yields promising recognition scores even on real-world recordings that contain noisy data.
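The mixup augmentation named in the abstract can be illustrated with a minimal sketch. It follows the generic mixup recipe (a convex blend of two examples and their labels with a Beta-distributed weight), not necessarily the authors' exact implementation. The function name `mixup_pair`, the value `alpha=0.2`, and the toy 2x3 "spectrograms" are assumptions made for illustration only.

```python
import random


def mixup_pair(spec_a, label_a, spec_b, label_b, alpha=0.2, rng=None):
    """Blend two spectrograms and their label vectors with one shared weight.

    alpha is an assumed Beta-distribution parameter, not a value from the paper.
    """
    rng = rng or random.Random()
    # Mixing weight drawn from Beta(alpha, alpha), as in the generic mixup recipe.
    lam = rng.betavariate(alpha, alpha)
    # Element-wise convex combination of the two 2D feature maps.
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    # The labels are blended with the same weight, giving a soft target.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_spec, mixed_label, lam


# Toy 2x3 "spectrograms" (mel bands x frames) with one-hot instrument labels.
piano = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
violin = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
spec, label, lam = mixup_pair(piano, [1.0, 0.0], violin, [0.0, 1.0],
                              rng=random.Random(0))
```

Because the same weight is applied to the features and to the labels, the blended label stays a valid probability vector, which is what lets a softmax classifier train directly on the mixed examples.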
Citations
- CrossRef: 7
- Web of Science: 0
- Scopus: 8
Full text
- Publication version: Accepted or Published Version
- License: Copyright (2020 Audio Eng. Society)
Details
- Category: Articles
- Type: journal articles
- Published in: JOURNAL OF THE AUDIO ENGINEERING SOCIETY, vol. 68, pages 57 - 65, ISSN: 1549-4950
- Language: English
- Publication year: 2020
- Bibliographic description: Koszewski D., Kostek B.: Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing // JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Vol. 68, iss. 1/2 (2020), pp. 57-65
- DOI: 10.17743/jaes.2019.0050
- Sources of funding: Statutory activity/subsidy
- Verified by: Gdańsk University of Technology