Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
Abstract
A method for automatic transcription of English speech into the International Phonetic Alphabet (IPA) is developed and studied. The principal objective of the study is to evaluate to what extent visual data related to lip reading can enhance the accuracy of transcribing English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip tracking data synchronized with the audio signal. In total, 32 markers were used, 20 of which were placed on the speaker's inner lips and 4 on a special cap, which served as the point of reference and stabilized the facial motion capture (FMC) image during post-processing. Speech samples were recorded simultaneously as a list of approximately 300 words in which all English consonantal and vocalic allophones were represented. Different parameterization strategies were tested and the accuracy of vocalic segments
Citations
- CrossRef: 2
- Web of Science: 0
- Scopus: 0
Details
- Category: journal publication
- Type: article in a JCR-listed journal
- Published in: Journal of the Acoustical Society of America, vol. 144, issue 3, pages 1801-1802, ISSN: 0001-4966
- Language: English
- Publication year: 2018
- Bibliographic description: Czyżewski A., Zaporowski S., Kostek B.: Bimodal classification of English allophones employing acoustic speech signal and facial motion capture // Journal of the Acoustical Society of America. Vol. 144, iss. 3 (2018), pp. 1801-1802
- DOI: 10.1121/1.5067951
- Verified by: Politechnika Gdańska