Abstract
This paper considers the viseme, an elementary visual unit of speech, in the context of preparing the feature vector that serves as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is to review various approaches to the problem, implement algorithms proposed in the literature, and compare their effectiveness. In the course of the study, an optimal feature vector construction and an appropriate choice of classifier were sought. The experiments were conducted on a spoken corpus in which speech is represented both acoustically and visually. The extracted features were of three types: geometrical, textural and mixed. The features were processed with classification algorithms based on Hidden Markov Models and Sequential Minimal Optimization. Tests were carried out on processed video material recorded with native English speakers who read a specially prepared list of commands. The obtained results are discussed in the paper.
Citations
- CrossRef: 13
- Web of Science: 0
- Scopus: 12
Authors (3)
Full text
- Publication version: Accepted or Published Version
- DOI: 10.1007/s11042-017-5217-5
Details
- Category: Journal publication
- Type: article in a journal listed in JCR
- Published in: MULTIMEDIA TOOLS AND APPLICATIONS, vol. 77, iss. 13, pages 16495-16532, ISSN: 1380-7501
- Language: English
- Publication year: 2018
- Bibliographic description: JACHIMSKI D., Czyżewski A., Ciszewski T.: A comparative study of English viseme recognition methods and algorithms // MULTIMEDIA TOOLS AND APPLICATIONS. - Vol. 77, iss. 13 (2018), pp. 16495-16532
- Verified by: Politechnika Gdańska
Related datasets
- research data MODALITY corpus - SPEAKER 35 - COMMANDS C1
- research data MODALITY corpus - SPEAKER 21 - SEQUENCE S6
- research data MODALITY corpus - SPEAKER 21 - COMMANDS C5
- research data MODALITY corpus - SPEAKER 21 - SEQUENCE S4
- research data MODALITY corpus - SPEAKER 10 - SEQUENCE S1
- research data MODALITY corpus - SPEAKER 01 - SEQUENCE S2
- research data MODALITY corpus - SPEAKER 39 - COMMANDS C1
- research data MODALITY corpus - SPEAKER 01 - SEQUENCE S3
- research data MODALITY corpus - SPEAKER 01 - COMMANDS C3
- research data MODALITY corpus - SPEAKER 21 - SEQUENCE S2