Search results for: VOICE CONVERSION

total: 3


Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech
Publication
- D. Piotrowski
- R. Korzeniowski
- A. Falai
- S. Cygert
- K. Pokora
- G. Tinchev
- Z. Zhang
- K. Yanagisawa
Year: 2023
In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...
Full text available for download from an external service

Creating new voices using normalizing flows
Publication
- P. Biliński
- T. Merritt
- A. Ezzerg
- K. Pokora
- S. Cygert
- K. Yanagisawa
- R. Barra-Chicote
- D. Korzekwa
Year: 2022
Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
Full text available to download

Automated detection of pronunciation errors in non-native English speech employing deep learning
Publication
- D. Korzekwa
Year: 2023
Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Full text available to download
