Search results for: viseme · parameterization of mouth region · support vector machine · hidden markov model · pattern recognition · audiovisual speech recognition

A comparative study of English viseme recognition methods and algorithms

Publication

- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2018

The paper considers the viseme, an elementary visual unit, in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is to review various approaches to the problem, implement algorithms proposed in the literature, and compare their effectiveness. In the course of the study an optimal feature vector construction...

Full text available for download on the portal

Examining Influence of Distance to Microphone on Accuracy of Speech Recognition

Publication

- Year 2015

The problem of controlling a machine by a distant-talking speaker without the need for handheld or body-worn equipment is considered. A laboratory setup is introduced for examining the performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...

Full text available for download from an external service

Hybrid of Neural Networks and Hidden Markov Models as a modern approach to speech recognition systems

Publication

- Pomiary Automatyka Robotyka - Year 2013

The aim of this paper is to present a hybrid algorithm that combines the advantages of artificial neural networks and hidden Markov models in speech recognition for control purposes. The scope of the paper includes a review of currently used solutions and a description and analysis of the implementation of selected artificial neural network (NN) structures and hidden Markov models (HMM). The main part of the paper consists of a description...

Full text available for download on the portal

Intracranial hemorrhage detection in 3D computed tomography images using a bi-directional long short-term memory network-based modified genetic algorithm

Publication

J. Sengupta
R. Alzbutas
P. Falkowski-Gilski
B. Falkowska-Gilska

- Frontiers in Neuroscience - Year 2023

Introduction: Intracranial hemorrhage detection in 3D Computed Tomography (CT) brain images has gained more attention in the research community. The major issue in dealing with 3D CT brain images is that the labelled data needed for better recognition results are scarce and hard to obtain. Methods: To overcome the aforementioned problem, a new model has been implemented in this research manuscript. After acquiring the images from the Radiological...

Full text available for download on the portal

Dangerous sound event recognition using Support Vector Machine classifiers

Publication

- Year 2010

A method of recognizing danger-related events based on their acoustic representation through Support Vector Machine classification is presented. The proposed method is particularly useful in an automatic surveillance system. The set of 28 parameters used in the classifier consists of dedicated parameters and MPEG-7 features. Methods for parameter calculation are presented, as well as the design of the SVM model used for classification...

Language Models in Speech Recognition

Publication

J. Daciuk

- Year 2022

This chapter describes language models used in speech recognition. It starts by indicating the role and the place of language models in speech recognition. Measures used to compare language models follow. An overview of n-gram, syntactic, semantic, and neural models is given. It is accompanied by a list of popular software.

Full text available for download from an external service

Audiovisual speech recognition for training hearing impaired patients

Publication

- Year 2006

The paper presents a system for recognizing isolated speech sounds using visual and acoustic data. Active Shape Models were used to determine visual parameters based on an analysis of the shape and movement of the lips in video recordings. The acoustic parameters are based on mel-cepstral coefficients. A neural network was used to recognize the spoken sounds based on a feature vector containing both types...

Multimodal English corpus for automatic speech recognition

Publication

- Year 2013

A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...

Visual Lip Contour Detection for the Purpose of Speech Recognition

Publication

- Year 2014

A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from the mouth region. Different Active Appearance Models are employed for finding lips in video frames and for the statistical description of lip shape and texture. A search initialization procedure is proposed and error measure values are...

Examining Feature Vector for Phoneme Recognition

Publication

G. Korvel
B. Kostek

- Year 2018

The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

Language material for English audiovisual speech recognition system development (Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej)

Publication

A. Czyżewski
B. Kostek
T. Ciszewski
D. Majewicz

- Year 2013

The bi-modal speech recognition system requires a 2-sample language input, for training and for testing algorithms, which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...

Support Vector Machine Applied to Road Traffic Event Classification

Publication

M. Blaszke
B. Kostek

- MATEC Web of Conferences - Year 2018

The aim of this paper is to present results of road traffic event signal recognition. First, several types of systems for road traffic monitoring, including the Intelligent Transport System (ITS), are briefly described. Then, the assumptions behind creating a database of vehicle signals recorded in different weather and road conditions are outlined. The recorded signals were edited into single vehicle pass-bys. Using the Matlab-based application...

Full text available for download on the portal

Scoreboard Architectural Pattern and Integration of Emotion Recognition Results

Publication

- IEEE Access - Year 2019

This paper proposes a new design pattern, named Scoreboard, dedicated to applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of the Blackboard design pattern and comes under...

Full text available for download on the portal

Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning

Publication

A. Czyżewski

- Journal of the Acoustical Society of America - Year 2023

Text-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition models (ASR). Access to medical speech data is limited because it is sensitive data that is difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...

Full text available for download on the portal

An audio-visual corpus for multimodal automatic speech recognition

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available for download on the portal

EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY

Publication

- Year 2014

The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...

A survey of automatic speech recognition deep models performance for Polish medical terms

Publication

- Year 2023

Among the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....

Full text available for download from an external service

Robust and Efficient Machine Learning Algorithms for Visual Recognition

Publication

S. Cygert

- Year 2022

In visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...

Full text available for download on the portal
