Dr Grazina Korvel
Employment
Keywords
- speech recognition, allophone, phonology, foreign language, audio features
- 2d space feature, speech analysis, deep learning, spectrogram, cepstrogram, chromagram
- kNN algorithm
- phonemic analysis
- automatic speech recognition, convolutional networks, deep learning, chromagrams, fractal dimension
- convolutional neural networks
- data preparation
- deep learning
- Lombard effect, speech signal analysis
- facial motion capture
Business contact
- Location
- Al. Zwycięstwa 27, 80-219 Gdańsk
- Phone
- +48 58 348 62 62
- biznes@pg.edu.pl
Contact
- grazina.korvel@pg.edu.pl
Selected publications
- Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
convolutional neural network (CNN), which is a class of deep, feed-forward artificial neural networks. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. This choice was based on the fact that CNNs perform well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
- Speech Analytics Based on Machine Learning
In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...
- Voiceless Stop Consonant Modelling and Synthesis Framework Based on MISO Dynamic System
A modelling and synthesis framework for voiceless stop consonant phonemes is proposed, in which the phoneme is modelled separately in the low-frequency and high-frequency ranges. The phoneme signal is decomposed into sums of simpler basic components and described as the output of a linear multiple-input and single-output (MISO) system. The impulse response of each channel is a third-order quasi-polynomial. Using this framework, the...