Abstract
A method is developed for automatic audiovisual transcription of speech that employs acoustic, electromagnetic articulography, and visual speech representations. It combines the audio and visual modalities, which produces a synergy effect in speech recognition accuracy. To establish a robust solution, basic research is carried out on the relation between the allophonic variation of speech (i.e., the changes in the articulatory setting of the speech organs for the same phoneme produced in different phonetic environments) and the objective signal parameters, both audio and video. The method is sensitive to minute allophonic detail as well as to accentual differences. It is shown that analyzing video signals together with the acoustic signal makes speech transcription more accurate and robust than using the acoustic modality alone. In particular, various features extracted from the visual signal are tested for their ability to encode allophonic variations in pronunciation. New methods for modeling the accentual and allophonic variation of speech are developed.
Citations
- CrossRef: 1
- Web of Science: 0
- Scopus: 0
Details
- Category: Journal publication
- Type: article in a journal listed in JCR
- Published in: Journal of the Acoustical Society of America, vol. 139, pages 1 - 16, ISSN: 0001-4966
- Language: English
- Publication year: 2016
- Bibliographic description: Czyżewski A., Ciszewski T., Kostek B.: Methodology and technology for the polymodal allophonic speech transcription // Journal of the Acoustical Society of America. Vol. 139, no. 4 (2016), pp. 1-16
- DOI: 10.1121/1.4949947
- Verified by: Politechnika Gdańska