Speech Analytics Based on Machine Learning

Grazina Korvel; Adam Kurowski; Bożena Kostek; Andrzej Czyżewski

doi:10.1007/978-3-319-94030-4_6

Abstract

In this chapter, the process of preparing speech data for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Next, an approach to automatic phoneme recognition is discussed that combines optimized parametrization with a machine learning classifier. Feature vectors are built from descriptors that come from the music information retrieval (MIR) domain. Phoneme classification is then extended beyond the typically used techniques to Deep Neural Networks (DNNs). This is done by combining Convolutional Neural Networks (CNNs) with audio data converted to the time-frequency domain (i.e., spectrograms) and then exported as images. In this way, a two-dimensional representation of the speech feature space is employed. When preparing the phoneme dataset for CNNs, zero padding and interpolation techniques are used. The obtained results show an improvement in classification accuracy for allophones of the phoneme /l/ when CNNs coupled with a spectrogram representation are employed. In contrast, for vowel classification the results are better for the approach based on pre-selected features and a conventional machine learning algorithm.
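The abstract names zero padding and interpolation as the two ways phoneme spectrograms are brought to a common size before being fed to a CNN. The sketch below illustrates the general idea only; it is not the authors' implementation, since the chapter text is not available here. The function names `zero_pad` and `interpolate`, the list-of-rows spectrogram layout, and the choice of right-side padding and linear interpolation along the time axis are all assumptions made for illustration.

```python
from typing import List

# Assumed layout: rows are frequency bins, columns are time frames.
Spectrogram = List[List[float]]


def zero_pad(spec: Spectrogram, target_frames: int) -> Spectrogram:
    """Right-pad every frequency row with zeros up to target_frames columns."""
    padded = []
    for row in spec:
        if len(row) > target_frames:
            raise ValueError("segment is longer than the target width")
        padded.append(list(row) + [0.0] * (target_frames - len(row)))
    return padded


def interpolate(spec: Spectrogram, target_frames: int) -> Spectrogram:
    """Linearly resample every (non-empty) frequency row to target_frames columns."""
    resized = []
    for row in spec:
        n = len(row)
        if n == 1 or target_frames == 1:
            # Nothing to interpolate between: repeat the first value.
            resized.append([row[0]] * target_frames)
            continue
        new_row = []
        for j in range(target_frames):
            # Position of output frame j on the original time axis.
            pos = j * (n - 1) / (target_frames - 1)
            left = int(pos)
            right = min(left + 1, n - 1)
            frac = pos - left
            new_row.append(row[left] * (1.0 - frac) + row[right] * frac)
        resized.append(new_row)
    return resized


if __name__ == "__main__":
    short_phoneme = [[1.0, 3.0], [2.0, 4.0]]  # toy 2-bin, 2-frame spectrogram
    print(zero_pad(short_phoneme, 4))
    print(interpolate(short_phoneme, 3))
```

Zero padding keeps the original frames untouched and leaves the rest of the image empty, while interpolation stretches the segment so that it fills the whole image width. Which of the two suits a given phoneme class better is an empirical question.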

Details

Category: Monographic publication

Type: chapter or article in a book (collective work or textbook) in a language of international reach

Title of issue: Machine Learning Paradigms: Advances in Data Analytics, pages 129-157

Language: English

Publication year: 2019

Bibliographic description: Korvel G., Kurowski A., Kostek B., Czyżewski A.: Speech Analytics Based on Machine Learning // Machine Learning Paradigms / ed. George A. Tsihrintzis, Dionisios N. Sotiropoulos, Lakhmi C. Jain. Cham: Springer, 2019, pp. 129-157

DOI: 10.1007/978-3-319-94030-4_6

Sources of funding: Project "Methodology and technology for the polymodal allophonic speech transcription"

Verified by: Gdańsk University of Technology

