Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention

Daniel Korzekwa; Roberto Barra-Chicote; Szymon Zaporowski; Grzegorz Beringer; Jaime Lorenzo-Trueba; Alicja Serafinowicz; Jasha Droppo; Thomas Drugman; Bożena Kostek

doi:10.21437/interspeech.2021-86

Abstract

This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically derives optimal syllable-level representation from frame-level and phoneme-level audio features. Training this model is challenging because of the limited amount of incorrect stress patterns. To solve this problem, we propose to augment the training set with incorrectly stressed words generated with Neural TTS. Combining both techniques achieves 94.8% precision and 49.2% recall for the detection of incorrectly stressed words in L2 English speech of Slavic and Baltic speakers.
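The idea of deriving a syllable-level representation from frame-level features can be illustrated with a generic attention-pooling step. The sketch below is not the authors' model: it assumes simple dot-product scoring against a single query vector, and the names `attention_pool`, `syllable_frames`, and `query`, as well as the feature values, are hypothetical. It only shows how attention weights replace a fixed region (such as the syllable nucleus) with a learned weighting over all frames.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse frame-level vectors into one syllable-level vector.

    Each frame is scored by its dot product with `query` (assumption:
    a trained model would learn this vector; here it is fixed), the
    scores are normalised with softmax, and the frames are averaged
    using those weights.
    """
    if not frames:
        raise ValueError("a syllable needs at least one frame")
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


# Hypothetical 2-dimensional frame features for one syllable.
syllable_frames = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
query = [1.0, 1.0]  # assumed value, for illustration only
syllable_vector = attention_pool(syllable_frames, query)
```

With an all-zero query every frame receives the same weight and the result is the plain mean of the frames; a non-zero query shifts the result toward the frames that score highest.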

Citations

CrossRef: 6
Web of Science: 0
Scopus: 5

Full text

Downloaded 57 times

Publication version: Accepted or Published Version
DOI: 10.21437/Interspeech.2021-86
License: Copyright (2021 ISCA)

Details

Category: Conference activity
Type: publication in a peer-reviewed collective work (including conference proceedings)
Language: English
Year of publication: 2021
Bibliographic description: Korzekwa D., Barra-Chicote R., Zaporowski S., Beringer G., Lorenzo-Trueba J., Serafinowicz A., Droppo J., Drugman T., Kostek B.: Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention, 2021
DOI: 10.21437/interspeech.2021-86
Verified by: Politechnika Gdańska (Gdańsk University of Technology)

Viewed 140 times

