Search results for: SPEECH EMOTION RECOGNITION
-
Comparison of selected off-the-shelf solutions for emotion recognition based on facial expressions
Publication: The paper concerns the accuracy of emotion recognition from facial expressions. As several off-the-shelf solutions are available on the market today, this study aims at a practical evaluation of selected solutions in order to provide some insight into what potential buyers might expect. Two solutions were compared: FaceReader by Noldus and Xpress Engine by QuantumLab. The performed evaluation revealed that the recognition...
-
A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors
Publication: In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...
-
Ontological Model for Contextual Data Defining Time Series for Emotion Recognition and Analysis
Publication: One of the major challenges facing the field of Affective Computing is the reusability of datasets. Existing affective-related datasets are not consistent with each other: they store a variety of information in different forms and formats, and the terms used to describe them are not unified. This paper proposes a new ontology, ROAD, as a solution to this problem, by formally describing the datasets and unifying the terms...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication: Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. The study investigates which of the two feature extraction methods better represents emotions and how large the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
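The mel scale underlying the mel-spectrograms compared in this entry is a standard frequency warping; as a minimal illustration (not code from the paper), the commonly used O'Shaughnessy formula can be sketched as:

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (2595 * log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse mapping: mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# The mel scale compresses high frequencies: equal mel steps
# correspond to ever-wider bands in Hz.
print(round(hz_to_mel(1000.0), 1))  # → 1000.0 (1 kHz maps to ~1000 mel)
```

A mel-spectrogram is obtained by pooling an ordinary spectrogram's frequency bins into bands spaced evenly on this scale, which is the difference the study evaluates.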
-
Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
Publication: The problem of accurately differentiating between the speaker's utterance and the noise parts of a speech signal is considered. The influence of utilizing voice activity detection in speech signals on the accuracy of the automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...
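The simplest acoustic voice activity detectors of the kind this entry evaluates threshold short-time frame energy; a toy sketch of that idea (illustrative only, with an arbitrary threshold, not the paper's method) could look like:

```python
import math

def frame_energies(samples, frame_len):
    """Split a signal into non-overlapping frames and compute mean-square energy."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def energy_vad(samples, frame_len=160, threshold=0.01):
    """Return a True/False voice-activity flag per frame."""
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Toy signal at 16 kHz: one frame of silence, one frame of a 440 Hz tone, one of silence.
sil = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
print(energy_vad(sil + tone + sil))  # → [False, True, False]
```

Such an energy gate fails in noisy conditions, which is precisely why the paper compares acoustic detectors against visual (lip-motion) ones.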
-
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publication: ...convolutional neural network (CNN), which is a class of deep, feed-forward artificial neural networks. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was motivated by the fact that CNNs perform well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
-
A survey of automatic speech recognition deep models performance for Polish medical terms
Publication: Among the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....
-
Combining visual and acoustic modalities to ease speech recognition by hearing impaired people
Publication: The article presents a system whose purpose is to facilitate the training of correct pronunciation for people with severe hearing impairments. Acoustic and visual parameters were used in the speech analysis. Active Shape Models were used to determine the visual parameters from the shape and movement of the lips. The acoustic parameters are based on mel-cepstral coefficients. For the classification of the uttered phonemes, a...
-
Hybrid of Neural Networks and Hidden Markov Models as a modern approach to speech recognition systems
Publication: The aim of this paper is to present a hybrid algorithm that combines the advantages of artificial neural networks and hidden Markov models in speech recognition for control purposes. The scope of the paper includes a review of currently used solutions, and a description and analysis of the implementation of selected artificial neural network (NN) structures and hidden Markov models (HMM). The main part of the paper consists of a description...
-
Ontological Modeling for Contextual Data Describing Signals Obtained from Electrodermal Activity for Emotion Recognition and Analysis
Publication: Most of the research in the field of emotion recognition is based on datasets that contain data obtained during affective computing experiments. However, each dataset is described by different metadata, stored in various structures and formats. This research can be counted among those whose aim is to provide a structural and semantic pattern for affective computing datasets, which is an important step to solve the problem of data...
-
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publication: The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
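The feature-fusion approach mentioned in this abstract concatenates the per-frame acoustic and visual feature vectors before classification; a minimal sketch of that idea (function names and the toy values are illustrative, not from the paper) could be:

```python
def fuse_features(acoustic, visual):
    """Early (feature-level) fusion: concatenate per-frame feature vectors.

    Assumes both streams have already been resampled/synchronized to a
    common frame rate, which is exactly what the framerate study examines.
    """
    if len(acoustic) != len(visual):
        raise ValueError("streams must have the same number of frames")
    return [a + v for a, v in zip(acoustic, visual)]

# Toy example: 2 frames, 3 MFCC-like values + 2 lip-shape values per frame.
mfcc = [[1.0, 2.0, 3.0], [1.1, 2.1, 3.1]]
lips = [[0.4, 0.5], [0.6, 0.7]]
fused = fuse_features(mfcc, lips)
print(fused[0])  # → [1.0, 2.0, 3.0, 0.4, 0.5]
```

Because the fused vector pairs features frame by frame, any audio/video desynchronization directly corrupts the input, which motivates the paper's experiments.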
-
Intra-subject class-incremental deep learning approach for EEG-based imagined speech recognition
Publication: Brain–computer interfaces (BCIs) aim to decode brain signals and transform them into commands for device operation. The present study aimed to decode the brain activity during imagined speech. The BCI must identify imagined words within a given vocabulary and thus perform the requested action. A possible scenario when using this approach is the gradual addition of new words to the vocabulary using incremental learning methods....
-
Language material for English audiovisual speech recognition system development
Publication: The bi-modal speech recognition system requires two language samples, for training and for testing the algorithms, that precisely depict natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
IEEE Automatic Speech Recognition and Understanding Workshop
Conferences
-
ISCA Tutorial and Research Workshop Automatic Speech Recognition
Conferences
-
Introduction to the special issue on machine learning in acoustics
Publication: When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. They were listed, but not limited to, as follows: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...
-
Artur Gańcza mgr inż.
People: I received the M.Sc. degree from the Gdańsk University of Technology (GUT), Gdańsk, Poland, in 2019. I am currently a Ph.D. student at GUT, with the Department of Automatic Control, Faculty of Electronics, Telecommunications and Informatics. My professional interests include speech recognition, system identification, adaptive signal processing and linear algebra.
-
USING NEURAL NETWORKS FOR THE SYNTHESIS OF SPEECH EXPRESSING EMOTIONS
People: This article presents an analysis of speech-based emotion recognition solutions and the possibilities of using them in emotional speech synthesis, employing neural networks for this purpose. Current solutions for recognizing emotions in speech and methods of speech synthesis using neural networks are presented. A significant increase in interest in, and use of, deep learning is currently observed in applications related...
-
Andrzej Czyżewski prof. dr hab. inż.
People: Prof. dr hab. inż. Andrzej Czyżewski is a graduate of the Faculty of Electronics at Gdańsk University of Technology (he completed his M.Sc. studies in 1982). In 1987 he defended, with distinction, his doctoral dissertation on a topic related to digital audio at the Faculty of Electronics of GUT. In 1992 he presented his habilitation thesis entitled “Digital Operations on Audio Signals” (“Cyfrowe operacje na sygnałach fonicznych”). His habilitation colloquium was accepted unanimously in June 1992 at the AGH University of Science and Technology...