Multimodal English corpus for automatic speech recognition

Bartosz Kunka; Adam Kupryjanow; Piotr Dalka; Piotr Bratoszewski; Maciej Szczodrak; Paweł Spaleniak; Marcin Szykulski; Andrzej Czyżewski


Abstract

A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by the different modalities. Based on the recordings, it is also possible to develop a speech recognition system that analyzes several modalities at the same time. The paper describes the process of collecting the multimodal material and the post-processing procedure applied to it. Parameterization methods for signals belonging to the different modalities are also proposed.


Details

Category: Conference activity
Type: conference proceedings indexed in Web of Science
Published in: Signal Processing Algorithms, Architectures, Arrangements and Applications, pages 106-111
ISSN: 2326-0262
Language: English
Year of publication: 2013
Bibliographic description: Kunka B., Kupryjanow A., Dalka P., Bratoszewski P., Szczodrak M., Spaleniak P., Szykulski M., Czyżewski A.: Multimodal English corpus for automatic speech recognition, In: Signal Processing Algorithms, Architectures, Arrangements and Applications, 2013, IEEE.
Verified by: Politechnika Gdańska


