Search results for: SPEECH EMOTION RECOGNITION

total: 1040

Marking the Allophones Boundaries Based on the DTW Algorithm
Publication
- J. Rafałko
- Year 2018
The paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking of allophones boundaries in continuous speech is a difficult issue due to the mutual influence of adjacent phonemes on each other. It is this neighborhood on the one hand that creates variants of phonemes that is allophones, and on the other hand it affects that the border...
MACHINE LEARNING APPLICATIONS IN RECOGNIZING HUMAN EMOTIONS BASED ON THE EEG
Publication
- A. Kastrau
- M. Koronowski
- M. Liksza
- P. Jasik
- Year 2021
This study examined the machine learning-based approach allowing the recognition of human emotional states with the use of EEG signals. After a short introduction to the fundamentals of electroencephalography and neural oscillations, the two-dimensional valence-arousal Russell’s model of emotion was described. Next, we present the assumptions of the performed EEG experiment. Detail aspects of the data sanitization including preprocessing,...
Deep neural networks for data analysis
e-Learning Courses
- K. Draszawka
The aim of the course is to familiarize students with the methods of deep learning for advanced data analysis. Typical areas of application of these types of methods include: image classification, speech recognition and natural language understanding.
The Innovative Faculty for Innovative Technologies
Publication
- Year 2013
A leaflet describing Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. Multimedia Systems Department described laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...

Full text to download in external service
Recognizing emotions on the basis of keystroke dynamics
Publication
- A. Kołakowska
- Year 2015
The article describes a research on recognizing emotional states on the basis of keystroke dynamics. An overview of various studies and applications of emotion recognition based on data coming from keyboard is presented. Then, the idea of an experiment is presented, i.e. the way of collecting and labeling training data, extracting features and finally training classifiers. Different classification approaches are proposed to be...

Full text to download in external service
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
Publication
- D. Korzekwa
- J. Lorenzo-trueba
- S. Zaporowski
- S. Calamaro
- T. Drugman
- B. Kostek
- Year 2021
A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare it to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...

Full text to download in external service
Analysis of human behavioral patterns
Publication
- A. Kołakowska
- Year 2022
Widespread usage of Internet and mobile devices entailed growing requirements concerning security which in turn brought about development of biometric methods. However, a specially designed biometric system may infer more about users than just verifying their identity. Proper analysis of users’ characteristics may also tell much about their skills, preferences, feelings. This chapter presents biometric methods applied in several...

Full text to download in external service
Selection of Features for Multimodal Vocalic Segments Classification
Publication
- S. Zaporowski
- A. Czyżewski
- Year 2018
English speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

Full text to download in external service
Agnieszka Landowska dr hab. inż.

People

Department of Software Engineering

Agnieszka Landowska works for Gdansk University of Technology, FETI, Department of Software Engineering. Her research concentrates on usability, accessibility and technology adoption, as well as affective computing methods. She initiated the Emotions in HCI Research Group and conducts research on User eXperience evaluation of applications and other technologies.
Discovering Rule-Based Learning Systems for the Purpose of Music Analysis
Publication
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Year 2019
Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

Full text available to download
Affect-awareness framework for intelligent tutoring systems
Publication
- A. Landowska
- Year 2013
The paper proposes a framework for construction of Intelligent Tutoring Systems (ITS), that take into consideration student emotional states and make affective interventions. The paper provides definitions of `affect-aware systems' and `affective interventions' and describes the concept of the affect-awareness framework. The proposed framework separates emotion recognition from its definition, processing and making decisions on...

Full text to download in external service
Music Mood Visualization Using Self-Organizing Maps
Publication
- M. Piotrowska
- B. Kostek
- Archives of Acoustics - Year 2015
Due to an increasing amount of music being made available in digital form in the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

Full text available to download
Orken Mamyrbayev Professor

People

1. Education: Higher. In 2001, graduated from the Abay Almaty State University (now Abay Kazakh National Pedagogical University), in the specialty: Computer science and computerization manager. 2. Academic degree: Ph.D. in the specialty "6D070300-Information systems". The dissertation was defended in 2014 on the topic: "Kazakh soileulerin tanudyn kupmodaldy zhuyesin kuru". Under my supervision, 16 masters, 1 dissertation...
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
Publication
- P. Rościszewski
- Procedia Computer Science - Year 2017
In the paper we tackle bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...

Full text available to download
Methodology of Affective Intervention Design for Intelligent Systems
Publication
- INTERACTING WITH COMPUTERS - Year 2016
This paper concerns how intelligent systems should be designed to make adequate, valuable and natural affective interventions. The article proposes a process for choosing an affective intervention model for an intelligent system. The process consists of 10 activities that allow for step-by-step design of an affective feedback loop and takes into account the following factors: expected and desired emotional states, characteristics...

Full text to download in external service
Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review
Publication
- K. D. Bartl-Pokorny
- P. Uluer
- D. E. Barkana
- A. Baird
- H. Kose
- T. Zorcec
- B. Robins
- B. Schuller
- A. Landowska
- M. Pykała
- IEEE Access - Year 2021
Children with autism spectrum disorder (ASD) have deficits in the socio-communicative domain and frequently face severe difficulties in the recognition and expression of emotions. Existing literature suggested that children with ASD benefit from robot-based interventions. However, studies varied considerably in participant characteristics, applied robots, and trained skills. Here, we reviewed robot-based interventions targeting...

Full text available to download
Detection of Face Position and Orientation Using Depth Data
Publication
- M. Szwoch
- P. Pieniążek
- Advances in Intelligent Systems and Computing - Year 2015
In this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...

Full text to download in external service
Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging
Publication
- P. Rościszewski
- J. Kaliski
- Year 2017
In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpus of annotated Polish speech data. We propose an MPI-based modification of the training program which minimizes the...

Full text to download in external service
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication
- P. Falkowski-Gilski
- M. Plewka
- Year 2022
Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text to download in external service
Separability Assessment of Selected Types of Vehicle-Associated Noise
Publication
- Advances in Intelligent Systems and Computing - Year 2016
Music Information Retrieval (MIR) area as well as development of speech and environmental information recognition techniques brought various tools intended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

Full text to download in external service
Identification of Emotional States Using Phantom Miro M310 Camera
Publication
- M. Przyborski
- Internal Security - Year 2013
The purpose of this paper is to present the possibilities associated with the use of remote sensing methods in identifying human emotional states, and to present the results of the research conducted by the authors in this field. The studies presented involved the use of advanced image analysis to identify areas on the human face that change their activity along with emotional expression. Most of the research carried out in laboratories...
Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model
Publication
- K. Leckey
- R. Neininger
- W. Szpankowski
- Year 2013
Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...
IMAGE CORRELATION AS A TOLL FOR TRACKING FACIAL CHANGES CAUSING BY EXTERNAL STIMULI
Publication
- K. Bobkowska
- A. Janowski
- M. Przyborski
- Year 2015
Expressions of the human face bring a lot of information, which are a valuable source in the areas of computer vision, remote sensing and affective computing. For years, by analyzing the movement of the skin and facial muscles scientists are trying to create the perfect tool, based on image analysis, allowing the recognition of emotional states of human beings. To create a reliable algorithm, it is necessary to explore and examine...

Full text to download in external service
Computer-assisted pronunciation training—Speech synthesis is almost all you need
Publication
- D. Korzekwa
- J. Lorenzo-trueba
- T. Drugman
- B. Kostek
- SPEECH COMMUNICATION - Year 2022
The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

Full text available to download
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
Publication
- G. Korvel
- K. Kąkol
- O. Kurasova
- B. Kostek
- IEEE Access - Year 2020
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available to download
EMOTION

Journals

ISSN: 1528-3542 , eISSN: 1931-1516
Emotion Monitor - Concept, Construction and Lessons Learned
Publication
- A. Landowska
- Annals of Computer Science and Information Systems - Year 2015
This paper concerns the design and physical construction of an emotion monitor stand for tracking human emotions in Human-Computer Interaction using multi-modal approach. The concept of the stand using cameras, behavioral analysis tools and a set of physiological sensors such as galvanic skin response, blood-volume pulse, temperature, breath and electromyography is presented and followed...

Full text available to download
Speech Intelligibility Measurements in Auditorium
Publication
- K. Leo
- ACTA PHYSICA POLONICA A - Year 2010
Speech intelligibility was measured in Auditorium Novum on Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time EDT coefficients were measured. Negative noise contribution to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in auditorium. Correlation was found between...

Full text available to download
Transient detection for speech coding applications
Publication
- International Journal of Computer Science and Network Security - Year 2006
Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

Full text to download in external service
Improving the quality of speech in the conditions of noise and interference
Publication
- B. Kostek
- K. Kąkol
- Journal of the Acoustical Society of America - Year 2018
The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...

Full text to download in external service
Emotion Monitoring – Verification of Physiological Characteristics Measurement Procedures
Publication
- A. Landowska
- Metrology and Measurement Systems - Year 2014
This paper concerns measurement procedures on an emotion monitoring stand designed for tracking human emotions in the Human-Computer Interaction with physiological characteristics. The paper addresses the key problem of physiological measurements being disturbed by a motion typical for human-computer interaction such as keyboard typing or mouse movements. An original experiment...

Full text available to download
Applying the Lombard Effect to Speech-in-Noise Communication
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Electronics - Year 2023
This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

Full text available to download
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication
- D. Weber
- S. Zaporowski
- D. Korzekwa
- Year 2020
The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
Improved method for real-time speech stretching
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2012
An algorithm for real-time speech stretching is presented. It was designed to modify input signal dependently on its content and on its relation with the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

Full text to download in external service
The Impact of Lexicon Adaptation on the Emotion Mining From Software Engineering Artifacts
Publication
- M. Wróbel
- IEEE Access - Year 2020
Sentiment analysis and emotion mining techniques are increasingly being used in the field of software engineering. However, the experiments conducted so far have not yielded high accuracy results. Researchers indicate a lack of adaptation of the methods of emotion mining to the specific context of the domain as the main cause of this situation. The article describes research aimed at examining whether the adaptation of the lexicon...

Full text available to download
Real-time speech-rate modification experiments
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2010
An algorithm designed for real-time speech time scale modification (stretching) is proposed, providing a combination of typical synchronous overlap and add based time scale modification algorithm and signal redundancy detection algorithms that allow to remove parts of the speech signal and replace them with the stretched speech signal fragments. Effectiveness of signal processing algorithms are examined experimentally together...

Full text to download in external service
Improving Objective Speech Quality Indicators in Noise Conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2020
This work aims at modifying speech signal samples and test them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Full text to download in external service
Emotion work
Publication
- M. Noon
- P. Blyton
- A. Klimczuk
- Year 2002
Full text to download in external service
Emotion Work
Publication
- A. Klimczuk
- M. Klimczuk-Kochańska
- Year 2016
Full text to download in external service
Detecting Lombard Speech Using Deep Learning Approach
Publication
- K. Kąkol
- G. Korvel
- G. Tamulevicius
- B. Kostek
- SENSORS - Year 2023
Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available to download
Speech synthesis controlled by eye gazing
Publication
- A. Czyżewski
- K. Łopatka
- B. Kunka
- R. Rybacki
- B. Kostek
- Year 2010
A method of communication based on eye gaze controlling is presented. Investigations of using gaze tracking have been carried out in various context applications. The solution proposed in the paper could be referred to as ''talking by eyes'' providing an innovative approach in the domain of speech synthesis. The application proposed is dedicated to disabled people, especially to persons in a so-called locked-in syndrome who cannot...
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
Publication
- SENSORS - Year 2021
The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

Full text available to download
Time-domain prosodic modifications for text-to-speech synthesizer
Publication
- J. Łopatka
- P. Suchomski
- A. Czyżewski
- Year 2010
An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.
A Method of Real-Time Non-uniform Speech Stretching
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2012
Developed method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by the adjustment of time scaling factor values in accordance with the signal content. Dependently on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...

Full text to download in external service
Emotion Review

Journals

ISSN: 1754-0739 , eISSN: 1754-0747
Cognition and Emotion

Journals

ISSN: 0269-9931 , eISSN: 1464-0600
MOTIVATION AND EMOTION

Journals

ISSN: 0146-7239 , eISSN: 1573-6644
Predicting emotion from color present in images and video excerpts by machine learning
Publication
- IEEE Access - Year 2023
This work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...

Full text available to download
Compulsive sexual behavior and dysregulation of emotion
Publication
- M. Lew-Starowicz
- K. Lewczuk
- I. Nowakowska
- S. Kraus
- M. Gola
- Sexual Medicine Reviews - Year 2020
Introduction Dysregulation of emotion (DE) is commonly seen in individuals suffering from compulsive sexual behavior (CSB), as well as represents a crucial element of its common comorbidities like mood, anxiety, and substance use disorders. Aim To investigate the links between CSB and DE. Methods A review of pertinent literature on CSB and DE was performed using EBSCO, PubMed, and Google Scholar databases. Main Outcome Measure...

Full text to download in external service
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available to download
