Search results for: SPEECH EMOTION RECOGNITION - Bridge of Knowledge

  • Marking the Allophones Boundaries Based on the DTW Algorithm

    Publication

    - Year 2018

    The paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking of allophones boundaries in continuous speech is a difficult issue due to the mutual influence of adjacent phonemes on each other. It is this neighborhood on the one hand that creates variants of phonemes that is allophones, and on the other hand it affects that the border...

  • MACHINE LEARNING APPLICATIONS IN RECOGNIZING HUMAN EMOTIONS BASED ON THE EEG

    Publication
    • A. Kastrau
    • M. Koronowski
    • M. Liksza
    • P. Jasik

    - Year 2021

    This study examined the machine learning-based approach allowing the recognition of human emotional states with the use of EEG signals. After a short introduction to the fundamentals of electroencephalography and neural oscillations, the two-dimensional valence-arousal Russell’s model of emotion was described. Next, we present the assumptions of the performed EEG experiment. Detail aspects of the data sanitization including preprocessing,...

  • Deep neural networks for data analysis

    e-Learning Courses
    • K. Draszawka

    The aim of the course is to familiarize students with the methods of deep learning for advanced data analysis. Typical areas of application of these types of methods include: image classification, speech recognition and natural language understanding.

  • The Innovative Faculty for Innovative Technologies

    A leaflet describing Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. Multimedia Systems Department described laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...

    Full text to download in external service

  • Recognizing emotions on the basis of keystroke dynamics

    Publication

    - Year 2015

    The article describes a research on recognizing emotional states on the basis of keystroke dynamics. An overview of various studies and applications of emotion recognition based on data coming from keyboard is presented. Then, the idea of an experiment is presented, i.e. the way of collecting and labeling training data, extracting features and finally training classifiers. Different classification approaches are proposed to be...

    Full text to download in external service

  • Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling

    Publication

    - Year 2021

    A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare it to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...

    Full text to download in external service

  • Analysis of human behavioral patterns

    Publication

    - Year 2022

    Widespread usage of Internet and mobile devices entailed growing requirements concerning security which in turn brought about development of biometric methods. However, a specially designed biometric system may infer more about users than just verifying their identity. Proper analysis of users’ characteristics may also tell much about their skills, preferences, feelings. This chapter presents biometric methods applied in several...

    Full text to download in external service

  • Selection of Features for Multimodal Vocalic Segments Classification

    Publication

    English speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

    Full text to download in external service

  • Agnieszka Landowska dr hab. inż.

    Agnieszka Landowska works for Gdansk University of Technology, FETI, Department of Software Engineering. Her research concentrates on usability, accessibility and technology adoption, as well as affective computing methods. She initiated Emotions in HCI Research Group and conducts research on User eXperience evaluation of applications and other technologies.

  • Discovering Rule-Based Learning Systems for the Purpose of Music Analysis

    Publication

    Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

    Full text available to download

  • Affect-awareness framework for intelligent tutoring systems

    Publication

    - Year 2013

    The paper proposes a framework for construction of Intelligent Tutoring Systems (ITS) that take into consideration student emotional states and make affective interventions. The paper provides definitions of 'affect-aware systems' and 'affective interventions' and describes the concept of the affect-awareness framework. The proposed framework separates emotion recognition from its definition, processing and making decisions on...

    Full text to download in external service

  • Music Mood Visualization Using Self-Organizing Maps

    Publication

    Due to an increasing amount of music being made available in digital form in the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

    Full text available to download

  • Orken Mamyrbayev Professor

    People

    1.  Education: Higher. In 2001, graduated from the Abay Almaty State University (now Abay Kazakh National Pedagogical University), in the specialty: Computer science and computerization manager. 2.  Academic degree: Ph.D. in the specialty "6D070300-Information systems". The dissertation was defended in 2014 on the topic: "Kazakh soileulerin tanudyn kupmodaldy zhuyesin kuru". Under my supervision, 16 masters, 1 dissertation...

  • Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training

    Publication

    In the paper we tackle bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...

    Full text available to download

  • Methodology of Affective Intervention Design for Intelligent Systems

    This paper concerns how intelligent systems should be designed to make adequate, valuable and natural affective interventions. The article proposes a process for choosing an affective intervention model for an intelligent system. The process consists of 10 activities that allow for step-by-step design of an affective feedback loop and takes into account the following factors: expected and desired emotional states, characteristics...

    Full text to download in external service

  • Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review

    Publication
    • K. D. Bartl-Pokorny
    • P. Uluer
    • D. E. Barkana
    • A. Baird
    • H. Kose
    • T. Zorcec
    • B. Robins
    • B. Schuller
    • A. Landowska
    • M. Pykała

    - IEEE Access - Year 2021

    Children with autism spectrum disorder (ASD) have deficits in the socio-communicative domain and frequently face severe difficulties in the recognition and expression of emotions. Existing literature suggested that children with ASD benefit from robot-based interventions. However, studies varied considerably in participant characteristics, applied robots, and trained skills. Here, we reviewed robot-based interventions targeting...

    Full text available to download

  • Detection of Face Position and Orientation Using Depth Data

    Publication

    In this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...

    Full text to download in external service

  • Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

    Publication

    In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpora of annotated Polish speech data. We propose a MPI-based modification of the training program which minimizes the...

    Full text to download in external service

  • Performance Analysis of the OpenCL Environment on Mobile Platforms

    Publication

    Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

    Full text to download in external service

  • Separability Assessment of Selected Types of Vehicle-Associated Noise

    Music Information Retrieval (MIR) area as well as development of speech and environmental information recognition techniques brought various tools intended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

    Full text to download in external service

  • Identification of Emotional States Using Phantom Miro M310 Camera

    Publication

    The purpose of this paper is to present the possibilities associated with the use of remote sensing methods in identifying human emotional states, and to present the results of the research conducted by the authors in this field. The studies presented involved the use of advanced image analysis to identify areas on the human face that change their activity along with emotional expression. Most of the research carried out in laboratories...

  • Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

    Publication

    - Year 2013

    Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

  • IMAGE CORRELATION AS A TOLL FOR TRACKING FACIAL CHANGES CAUSING BY EXTERNAL STIMULI

    Publication

    - Year 2015

    Expressions of the human face bring a lot of information, which are a valuable source in the areas of computer vision, remote sensing and affective computing. For years, by analyzing the movement of the skin and facial muscles scientists are trying to create the perfect tool, based on image analysis, allowing the recognition of emotional states of human beings. To create a reliable algorithm, it is necessary to explore and examine...

    Full text to download in external service

  • Computer-assisted pronunciation training—Speech synthesis is almost all you need

    Publication

    - SPEECH COMMUNICATION - Year 2022

    The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

    Full text available to download

  • Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement

    Publication

    - IEEE Access - Year 2020

    The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

    Full text available to download

  • EMOTION

    Journals

    ISSN: 1528-3542 , eISSN: 1931-1516

  • Emotion Monitor - Concept, Construction and Lessons Learned

    This paper concerns the design and physical construction of an emotion monitor stand for tracking human emotions in Human-Computer Interaction using multi-modal approach. The concept of the stand using cameras, behavioral analysis tools and a set of physiological sensors such as galvanic skin response, blood-volume pulse, temperature, breath and electromyography is presented and followed...

    Full text available to download

  • Speech Intelligibility Measurements in Auditorium

    Publication

    Speech intelligibility was measured in Auditorium Novum on Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time EDT coefficients were measured. Negative noise contribution to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in auditorium. Correlation was found between...

    Full text available to download

  • Transient detection for speech coding applications

    Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

    Full text to download in external service

  • Improving the quality of speech in the conditions of noise and interference

    Publication

    The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...

    Full text to download in external service

  • Emotion Monitoring – Verification of Physiological Characteristics Measurement Procedures

    This paper concerns measurement procedures on an emotion monitoring stand designed for tracking human emotions in the Human-Computer Interaction with physiological characteristics. The paper addresses the key problem of physiological measurements being disturbed by a motion typical for human-computer interaction such as keyboard typing or mouse movements. An original experiment...

    Full text available to download

  • Applying the Lombard Effect to Speech-in-Noise Communication

    Publication

    - Electronics - Year 2023

    This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

    Full text available to download

  • Constructing a Dataset of Speech Recordings with Lombard Effect

    Publication

    - Year 2020

    The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...

  • Improved method for real-time speech stretching

    Publication

    An algorithm for real-time speech stretching is presented. It was designed to modify input signal dependently on its content and on its relation with the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

    Full text to download in external service

  • The Impact of Lexicon Adaptation on the Emotion Mining From Software Engineering Artifacts

    Publication

    - IEEE Access - Year 2020

    Sentiment analysis and emotion mining techniques are increasingly being used in the field of software engineering. However, the experiments conducted so far have not yielded high accuracy results. Researchers indicate a lack of adaptation of the methods of emotion mining to the specific context of the domain as the main cause of this situation. The article describes research aimed at examining whether the adaptation of the lexicon...

    Full text available to download

  • Real-time speech-rate modification experiments

    Publication

    An algorithm designed for real-time speech time scale modification (stretching) is proposed, providing a combination of typical synchronous overlap and add based time scale modification algorithm and signal redundancy detection algorithms that allow to remove parts of the speech signal and replace them with the stretched speech signal fragments. Effectiveness of signal processing algorithms are examined experimentally together...

    Full text to download in external service

  • Improving Objective Speech Quality Indicators in Noise Conditions

    Publication

    - Year 2020

    This work aims at modifying speech signal samples and test them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

    Full text to download in external service

  • Emotion work

    Publication

    - Year 2002

    Full text to download in external service

  • Emotion Work

    Publication

    - Year 2016

    Full text to download in external service

  • Detecting Lombard Speech Using Deep Learning Approach

    Publication
    • K. Kąkol
    • G. Korvel
    • G. Tamulevicius
    • B. Kostek

    - SENSORS - Year 2023

    Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

    Full text available to download

  • Speech synthesis controlled by eye gazing

    Publication

    - Year 2010

    A method of communication based on eye gaze controlling is presented. Investigations of using gaze tracking have been carried out in various context applications. The solution proposed in the paper could be referred to as "talking by eyes" providing an innovative approach in the domain of speech synthesis. The application proposed is dedicated to disabled people, especially to persons in a so-called locked-in syndrome who cannot...

  • Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions

    Publication

    The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

    Full text available to download

  • Time-domain prosodic modifications for text-to-speech synthesizer

    Publication

    - Year 2010

    An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.

  • A Method of Real-Time Non-uniform Speech Stretching

    Publication

    Developed method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by the adjustment of time scaling factor values in accordance with the signal content. Dependently on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...

    Full text to download in external service

  • Emotion Review

    Journals

    ISSN: 1754-0739 , eISSN: 1754-0747

  • Cognition and Emotion

    Journals

    ISSN: 0269-9931 , eISSN: 1464-0600

  • MOTIVATION AND EMOTION

    Journals

    ISSN: 0146-7239 , eISSN: 1573-6644

  • Predicting emotion from color present in images and video excerpts by machine learning

    Publication

    This work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...

    Full text available to download

  • Compulsive sexual behavior and dysregulation of emotion

    Publication

    - Sexual Medicine Reviews - Year 2020

    Introduction: Dysregulation of emotion (DE) is commonly seen in individuals suffering from compulsive sexual behavior (CSB), as well as represents a crucial element of its common comorbidities like mood, anxiety, and substance use disorders. Aim: To investigate the links between CSB and DE. Methods: A review of pertinent literature on CSB and DE was performed using EBSCO, PubMed, and Google Scholar databases. Main Outcome Measure...

    Full text to download in external service

  • Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

    Publication
    • D. Korzekwa
    • R. Barra-Chicote
    • B. Kostek
    • T. Drugman
    • M. Łajszczak

    - Year 2019

    We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

    Full text available to download