Search results for: asr system

total: 18

Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
Publication
- Year 2016
The problem of accurately differentiating between the speaker's utterance and the noise parts in a speech signal is considered. The influence of voice activity detection in speech signals on the accuracy of an automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
Publication
- S. Zaporowski
- Year 2024
The article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...

Full text available to download
Examining Influence of Distance to Microphone on Accuracy of Speech Recognition
Publication
- Year 2015
The problem of controlling a machine by a distant-talking speaker without the need for handheld or body-worn equipment is considered. A laboratory setup is introduced for examining the performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...

Full text to download in external service
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publication
- Year 2016
Automatic speech recognition (ASR) is under constant development, especially for cases in which speech is casually produced, acquired in various environmental conditions, or recorded in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is the main focus of the presented work. ASR is widely implemented in mobile device technology, but...

Full text to download in external service
Cost-effective methods of fabricating thin rare-earth element layers on SOC interconnects based on low-chromium ferritic stainless steel and exposed to air, humidified air or humidified hydrogen atmospheres
Publication
- Ł. Mazur
- P. Winiarski
- B. Kamecki
- J. Ignaczak
- S. Molin
- T. Brylewski
- INTERNATIONAL JOURNAL OF HYDROGEN ENERGY - Year 2024
Most oxidation studies involving interconnects are conducted in air under isothermal conditions, but during real-life solid oxide cell (SOC) operation, cells are also exposed to a mixture of hydrogen and water vapor. For this study, an Fe–16Cr low-chromium ferritic stainless steel was coated with different reactive element oxides – Gd2O3, CeO2, Ce0.9Y0.1O2 – using an array of methods: dip coating, electrodeposition and spray pyrolysis....

Full text to download in external service
Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning
Publication
- A. Czyżewski
- Journal of the Acoustical Society of America - Year 2023
Text-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition (ASR) models. Access to medical speech data is limited because the data are sensitive and difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...

Full text available to download
An audio-visual corpus for multimodal automatic speech recognition
Publication
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017
A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus, containing 31 hours of recordings, was created specifically to assist the development of audio-visual speech recognition (AVSR) systems. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available to download
Enhanced voice user interface employing spatial filtration of signals from acoustic vector sensor
Publication
- Year 2015
Spatial filtration of sound is introduced to enhance speech recognition accuracy in noisy conditions. An acoustic vector sensor (AVS) is employed. The signals from the AVS probe are processed in order to attenuate the surrounding noise. As a result the signal to noise ratio is increased. An experiment is featured in which speech signals are disturbed by babble noise. The signals before and after spatial filtration are processed...

Full text to download in external service
La0.6Sr0.4Co0.2Fe0.8O3-δ oxygen electrodes for solid oxide cells prepared by polymer precursor and nitrates solution infiltration into gadolinium doped ceria backbone
Publication
- JOURNAL OF THE EUROPEAN CERAMIC SOCIETY - Year 2017
Infiltration is a method that can be applied to electrode preparation. In this paper, an oxygen electrode is prepared solely by the infiltration of La0.6Sr0.4Co0.2Fe0.8O3‐δ (LSCF) into a Ce0.8Gd0.2O2-δ (CGO) backbone. The use of a polymer precursor as an infiltrating medium, instead of an aqueous nitrate salts solution, is presented. It is shown that the polymer forms the single-phase perovskite at 600 °C, contrary to the nitrates...

Full text available to download
Vocalic Segments Classification Assisted by Mouth Motion Capture
Publication
- Year 2018
Visual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose, motion capture markers were placed on speakers' faces to obtain lip-tracking data during speaking. Different parameterization strategies were tested...

Full text to download in external service
Low temperature processed MnCo2O4 and MnCo1.8Fe0.2O4 as effective protective coatings for solid oxide fuel cell interconnects at 750 °C
Publication
- S. Molin
- P. Jasiński
- L. Mikkelsen
- W. Zhang
- M. Chen
- P. V. Hendriksen
- JOURNAL OF POWER SOURCES - Year 2016
In this study, two materials, MnCo2O4 and MnCo1.8Fe0.2O4, are studied as potential protective coatings for Solid Oxide Fuel Cell interconnects working at 750 °C. First, powder fabrication by a modified Pechini method is described, followed by a description of the coating procedure. The protective action of the coating applied on Crofer 22 APU is evaluated by following the area specific resistance (ASR) of the scale/coating for 5500...

Full text to download in external service
High temperature oxidation behavior of SUS430 SOFC interconnects with Mn-Co spinel coating in air
Publication
- C. Jia
- Y. Wang
- S. Molin
- Y. Zhang
- M. Chen
- M. Han
- JOURNAL OF ALLOYS AND COMPOUNDS - Year 2019
In this study, SUS430 alloy is evaluated for its high temperature corrosion properties as a possible material for interconnects of solid oxide fuel cells (SOFCs). Samples are coated with Mn-Co by commercial physical vapor deposition (PVD) process and oxidized in air for 1250 h at 800 °C. A dense cubic Mn-Co-Fe spinel layer is formed on the surface, showing great effect on corrosion reduction compared with the samples without coating....

Full text to download in external service
Effectiveness of a dual surface modification of metallic interconnects for application in energy conversion devices
Publication
- Ł. Mazur
- J. Ignaczak
- M. Bik
- S. Molin
- M. Sitarz
- A. Gil
- T. Brylewski
- INTERNATIONAL JOURNAL OF HYDROGEN ENERGY - Year 2022
A dual surface modification of an SOFC metallic interconnect with a Gd2O3 layer and an MnCo2O4 coating was evaluated. The tested samples were oxidized for 7000 h in air at 1073 K. Oxidation products were characterized using XRD, SEM-EDS, and confocal Raman imaging, and ASR was measured. The effect of gadolinium segregation at grain boundaries in Cr2O3 was evaluated using S/TEM-EDS. Area-specific resistance was measured and fuel...

Full text available to download
Experimental review of the performances of protective coatings for interconnects in solid oxide fuel cells
Publication
- M. J. Reddy
- B. Kamecki
- B. Talic
- E. Zanchi
- F. Smeacetto
- J. S. Hardy
- J. C. Choi
- Ł. Mazur
- R. Vaßen
- S. N. Basu... and 3 others
- JOURNAL OF POWER SOURCES - Year 2023
Ferritic stainless steel interconnects are used in solid oxide fuel cells; however, coatings are required to improve their performance. Although several types of coatings have been proposed, they have been scarcely investigated under similar conditions. This study compares the characteristics of uncoated Crofer 22 APU and eight different coatings on Crofer 22 APU for up to 3000 h at 800 °C. The coatings were deposited at various...

Full text available to download
Investigating Feature Spaces for Isolated Word Recognition
Publication
- G. Korvel
- G. Tamulevicus
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Year 2018
Researchers have given much attention to the speech processing task in automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
Meeting of the university's artificial intelligence club

Events

24-10-2019 17:30 - 24-10-2019 19:15

The first meeting this academic year of the AI Bay club (Zatoka Sztucznej Inteligencji, "Artificial Intelligence Bay"), which operates at Politechnika Gdańska (Gdańsk University of Technology), will take place in Building B of the Faculty of Electronics, Telecommunications and Informatics (Auditorium 1P).
CSR at HEIs: Between Ignorance, Awareness and Knowledge
Publication
- J. Wasilczuk
- M. M. Popowska
- Problemy Zarządzania - Year 2022
The paper focuses on CSR education in Higher Education Institutions. It analyzes current approaches to this education and the enhancements already deployed in the international perspective. The main aim is to conceptualize CSRS education forms within the context of technology-oriented HEIs and propose the model for this education. This model has also been partially verified using the cases of four technical universities. This research...

Full text available to download
Multimodal English corpus for automatic speech recognition
Publication
- Year 2013
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously; therefore, the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
