Search results for: CEPSTROGRAM

Application of dynamic time warping and cepstrograms to text-dependent speaker verification

Publication

A. Kaczmarek
M. Staworko

- Year 2009

This work provides a description of an automatic speaker verification (ASV) system. In particular, it documents the evolution of all individual stages of the proposed ASV system design, from the preprocessing phase to an operational decision-making system. The aim of this research was to obtain a system offering the best safety and ease of use from the users' point of view. The objective estimation of this target has been accomplished by assessing...

Automatic labeling of traffic sound recordings using autoencoder-derived features

Publication

- Year 2019

An approach to detection of events occurring in road traffic using autoencoders is presented. Extensions of existing algorithms of acoustic road events detection employing Mel Frequency Cepstral Coefficients combined with classifiers based on k nearest neighbors, Support Vector Machines, and random forests are used. In our research, the acoustic signal gathered from the microphone placed near the road is split into frames and converted...

Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition

Publication

G. Korvel
P. Treigys
G. Tamulevicus
J. Bernataviciene
B. Kostek

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018

convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...

A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces

Publication

G. Tamulevicius
G. Korvel
A. B. Yayak
P. Treigys
J. Bernataviciene
B. Kostek

- Electronics - Year 2020

In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available for download in the portal

Music information retrieval—The impact of technology, crowdsourcing, big data, and the cloud in art.

Publication

B. Kostek

- Journal of the Acoustical Society of America - Year 2019

The exponential growth of computer processing power, cloud data storage, and the crowdsourcing model of gathering data bring new possibilities to the music information retrieval (MIR) field. MIR is no longer music content retrieval only; the area also comprises the discovery of feelings and emotions expressed in music, incorporating modalities other than hearing to help with this issue, users’ profiling, merging music with social...

Full text available for download in the portal

Analysis-by-synthesis paradigm evolved into a new concept

Publication

B. Kostek

- Journal of the Acoustical Society of America - Year 2022

This work aims at showing how the well-known analysis-by-synthesis paradigm has recently evolved into a new concept. However, in contrast to the original idea stating that the created sound should not fail to pass the foolproof synthesis test, the recent development is a consequence of the need to create new data. Deep learning models are greedy algorithms requiring a vast amount of data that, in addition, should be correctly...

Full text available for download from an external service

System przetwarzania i wizualizacji sygnału mowy dla potrzeb lingwistycznych = System of speech signal processing and visualisation of the results

Publication

Z. Wojan
W. Lis
K. Wojan

- Year 2005

The article presents a method for processing and visualizing the speech signal in the form of an easy-to-use and relatively inexpensive device for recording the acoustic signal, digitally processing selected fragments, and visualizing the results of the transformations. A computer with a sound card was used for this purpose. Digital processing and visualization were carried out using MATLAB directly...

Sravnitel'no-sopostavitel'nyj analiz cifrovoj reprezentacii leksem s differencirovannoj akcentuaciej

Publication

Z. Wojan
K. Wojan

- Year 2007

The article is devoted to a contrastive linguistic analysis of speech sounds in language systems characterized by "fluid" stress placement in homographic lexemes. Russian is a thoroughly representative example of such a system. In the analysis method presented here, the source material consists of digital recordings of live speech articulated by Russian-language instructors. The acoustic (digital) representation of lexemes with identical...
