displaying 1000 best results Help
Search results for: SPEECH EMOTION RECOGNITION
-
International Conference on Artificial Intelligence and Pattern Recognition
Conferences -
IEEE International Conference on Document Analysis and Recognition
Conferences -
Instantaneous complex frequency for pipeline pitch estimation
PublicationIn the paper a pipeline algorithm for estimating the pitch of speech signal is proposed. The algorithm uses instantaneous complex frequencies estimated for four waveforms obtained by filtering the original speech signal through four bandpass complex Hilbert filters. The imaginary parts of ICFs from each channel give four candidates for pitch estimates. The decision regarding the final estimate is made based on the real parts of...
-
Engineering Candida albicans glucosamine-6-phosphate synthase for efficient enzyme purification
PublicationRationally designed muteins of Candida albicans glucosamine-6-phosphate synthase, an enzyme known as a promising target for antifungal chemotherapy, were constructed, overexpressed in Escherichia coli and purified to near homogeneity. To facilitate and to optimize the purification of the enzyme, three recombinant versionscontaining internal oligoHis fragments were constructed: (i) by substituting residues 343 - 348...
-
Bridging challenges of clinical decision support systems with a semantic approach. A case study on breast cancer
PublicationThe integration of Clinical Decision Support Systems (CDSS) in nowadays clinical environments has not been fully achieved yet. Although numerous approaches and technologies have been proposed since 1960, there are still open gaps that need to be bridged. In this work we present advances from the established state of the art, overcoming some of the most notorious reported difficulties in: (i) automating CDSS, (ii) clinical workflow...
-
Digital fingerprinting for color images based on the quaternion encryption scheme
PublicationIn this paper we present a new quaternion-based encryption technique for color images. In the proposed encryption method, images are written as quaternions and are rotated in a three-dimensional space around another quaternion, which is an encryption key. The encryption process uses the cipher block chaining (CBC) mode. Further, this paper shows that our encryption algorithm enables digital fingerprinting as an additional feature....
-
Simultaneous determination of thermodynamic and kinetic parameters of aminopolycarbonate complexes of cobalt(II) and nickel(II) based on isothermal titration calorimetry data
Publication -
Zinc(II) complexation by some biologically relevant pH buffers
Publication -
XVIII Międzynarodowe Sympozjum Inżynierii i Reżyserii Dźwięku
PublicationThe subjective assessment of speech signals takes into account previous experiences and habits of an individual. Since the perception process deteriorates with age, differences should be noticeable among people from dissimilar age groups. In this work, we investigated the difference of speech quality assessment between high school students and university students. The study involved 60 participants, with 30 people in both the adolescents...
-
Creating new voices using normalizing flows
PublicationCreating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Human voice modification using instantaneous complex frequency
PublicationThe paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...
-
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek
PublicationW wielu zastosowaniach telekomunikacyjnych pojawia się problem przetwarzania lub analizy sygnału mowy, w ramach którego, często w obszarze podstawowych algorytmów, stosuje się estymator częstotliwości tonu krtaniowego. Estymator rozpatrywany w tej pracy bazuje na neuronowym klasyfikatorze podejmującym decyzje na podstawie częstotliwości oraz mocy chwilowej wyznaczanych w podpasmach analizowanego sygnału mowy. W pracy rozważamy...
-
Auditory-visual attention stimulator
PublicationNew approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, displayed text is modified using several techniques i.e. zooming, highlighting etc. In the experimental part of...
-
International Conference on Advances in Pattern Recognition and Digital Techniques
Conferences -
IEEE International Conference on Automatic Face and Gesture Recognition
Conferences -
INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
PublicationThe Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
PublicationIn this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, audio signal is analyzed indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video , i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Emotional distress, burnout and sense of safety during the COVID-19 pandemic in teachers after the reopening of schools
PublicationThe COVID-19 pandemic is having a significant impact on people's psychological well-being and mental health. This study aimed to identify factors linked to emotional distress, burnout and sense of safety in teachers related to the reopening of Polish schools after lockdown, remote work, and the holiday period between March and August 2020. A total of 1,286 teachers from different educational institutions participated in the...
-
Auditory Brainstem Responses recorded employing Audio ABR device
Open Research DataThe dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:
-
Pracujący w czasie rzeczywistym system detekcji gazów wykorzystujący przenośny komputer Raspberry PI oraz matrycę półprzewodnikowych czujników gazu
PublicationThe gas-analyzing systems based on the array of partially selective gas sensors and pattern-recognition techniques are potentially fast and lowcost alternative for other devices, like gas‑analysers. They give the possibility of recognition the type and the concentration of measured volatile compounds in their working environment. In this work we present the implementation of gas recognition system, in which the signals from an...
-
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
PublicationIn this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...
-
Interactions with recognized patients using smart glasses
PublicationRecently, different smart glasses solutions have been proposed on the market. The rapid development of this wearable technology has led to several research projects related to applications of smart glasses in healthcare. In this paper we propose a general architecture of the system enabling data integration for the recognized person. In the proposed system smart glasses integrates data obtained for the recognized patient from health...
-
Prof. Haitham Abu-Rub - A Visit to Poland's Gdansk University of Technology
PublicationReport on visit of Prof. Haitham Abu-Rub in Gdansk University of Technology. Speech on the Smart Grid Centre. Visit in the new smart grid laboratory of the GUT, the Laboratory for Innovative Power Technologies and Integration of Renewable Energy Sources (LINTE^2).
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
PublicationThis paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
PublicationThis work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...
-
Gesture-based computer control system
PublicationIn the paper a system for controlling computer applications by hand gestures is presented. First, selected methods used for gesture recognition are described. The system hardware and a way of controlling a computer by gestures are described. The architecture of the software along with hand gesture recognition methods and algorithms used are presented. Examples of basic and complex gestures recognized by the system are given.
-
Automatic Classification of Polish Sign Language Words
PublicationIn the article we present the approach to automatic recognition of hand gestures using eGlove device. We present the research results of the system for detection and classification of static and dynamic words of Polish language. The results indicate the usage of eGlove allows to gain good recognition quality that additionally can be improved using additional data sources such as RGB cameras.
-
Comparative analysis of various transformation techniques for voiceless consonants modeling
PublicationIn this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT) are performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....
-
Modeling and Designing Acoustical Conditions of the Interior – Case Study
PublicationThe primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...
-
Automatic music set organizatio based on mood of music / Automatyczna organizacja bazy muzycznej na podstawie nastroju muzyki
PublicationThis work is focused on an approach based on the emotional content of music and its automatic recognition. A vector of features describing emotional content of music was proposed. Additionally, a graphical model dedicated to the subjective evaluation of mood of music was created. A series of listening tests was carried out, and results were compared with automatic mood recognition employing SOM (Self Organizing Maps) and ANN (Artificial...
-
The role of time perspectives and impulsivity dimensions in coping styles
PublicationBoth time perspectives and impulsivity dimensions are groups of traits that are connected to self-control abilities and might be important for coping styles. However, to date, no study has systematically investigated their utility in predicting coping styles with regard to their multidimensional nature. The current study was correlational and exploratory, aiming to discover what amount of variance in each of the three coping...
-
Employing a biofeedback method based on hemispheric synchronization in effective learning
PublicationIn this paper an approach to build a brain computer-based hemispheric synchronization system is presented. The concept utilizes the wireless EEG signal registration and acquisition as well as advanced pre-processing methods. The influence of various filtration techniques of EOG artifacts on brain state recognition is examined. The emphasis is put on brain state recognition using band pass filtration for separation of individual...
-
Krzysztof Goczyła prof. dr hab. inż.
PeopleKrzysztof Goczyła, full professor of Gdańsk University of Technology, computer scientist, a specialist in software engineering, knowledge engineering and databases. He graduated from the Faculty of Electronics Technical University of Gdansk in 1976 with a degree in electronic engineering, specializing in automation. Since then he has been working at Gdańsk University of Technology. In 1982 he obtained a doctorate in computer science...
-
Endoscopic Video Classification with the Consideration of Temporal Patterns
PublicationThe article describes a novel approach to automatic recognition and classification of diseases in endoscopic videos. Current directions of research in this field are discussed. Most presented methods focus on processing single frames and do not take into consideration the temporal relationship between continuous classifications. Existing approaches that consider the temporal structure of an incoming frame sequence are focused on...
-
Wykorzystanie sztucznych sieci neuronowych do wykrywania i rozpoznawania tablic rejestracyjnych na zdjęciach pojazdów
PublicationW artykule przedstawiono koncepcję algorytmu wykrywania i rozpoznawania tablic rejestracyjnych (AWiRTR) na obrazach cyfrowych pojazdów. Detekcja i lokalizacja tablic rejestracyjnych oraz wyodrębnienie z obrazu tablicy rejestracyjnej poszczególnych znaków odbywa się z wykorzystaniem podstawowych technik przetwarzania obrazu (przekształcenia morfologiczne, wykrywanie krawędzi) jak i podstawowych danych statystycznych obiektów wykrytych...
-
A video monitoring system using ontology-driven identification of threats
PublicationIn this paper, we present a video monitoring systemthat leverages image recognition and ontological reasoningabout threats. In the solution, an image processing subsystemuses video recording of a monitored area and recognizesknown concepts in scenes. Then, a reasoning subsystem uses anontological description of security conditions and informationfrom image recognition to check if a violation of a conditionhas occurred. If a threat...
-
FEEDB: A multimodal database of facial expressions and emotions
PublicationIn this paper a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...
-
Interpretation and modeling of emotions in the management of autonomous robots using a control paradigm based on a scheduling variable
PublicationThe paper presents a technical introduction to psychological theories of emotions. It highlights a usable ideaimplemented in a number of recently developed computational systems of emotions, and the hypothesis thatemotion can play the role of a scheduling variable in controlling autonomous robots. In the main part ofthis study, we outline our own computational system of emotion – xEmotion – designed as a key structuralelement in...
-
Playback detection using machine learning with spectrogram features approach
PublicationThis paper presents 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), HAAR wavelets with AdaBoost classifier and deep convolutional neural networks (CNN) were compared on different data partitions in respect...
-
Endoscopic Videos Deinterlacing and On-Screen Text and Light Flashes Removal and Its Influence on Image Analysis Algorithms' Efficiency
PublicationIn this article, deinterlacing and removing on- screen text and light flashes methods on endoscopic video images are discussed. The research is intended to improve disease recognition algorithms' performance. In the article, four configurations of deinterlacing methods and another four configurations of text and flashes removal methods are described and examined. The efficiency of endoscopic video analysis algorithms is measured...
-
An electronic nose for quantitative determination of gas concentrations
PublicationThe practical application of human nose for fragrance recognition is severely limited by the fact that our sense of smell is subjective and gets tired easily. Consequen tly, there is considerable need for an instrument that can be a substitution of the human sense of smell. Electronic nose devices from the mid 1980s are used in growing number of applications. They comprise an array of several electrochemical gas sensors...
-
Video Semantic Analysis Framework based on Run-time Production Rules - Towards Cognitive Vision
PublicationThis paper proposes a service-oriented architecture for video analysis which separates object detection from event recognition. Our aim is to introduce new tools to be considered in the pathway towards Cognitive Vision as a support for classical Computer Vision techniques that have been broadly used by the scientific community. In the article, we particularly focus in solving some of the reported scalability issues found in current...
-
Intelligent multimedia solutions supporting special education needs.
PublicationThe role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....
-
Intelligent video and audio applications for learning enhancement
PublicationThe role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....
-
Evaluation Criteria for Affect-Annotated Databases
PublicationIn this paper a set of comprehensive evaluation criteria for affect-annotated databases is proposed. These criteria can be used for evaluation of the quality of a database on the stage of its creation as well as for evaluation and comparison of existing databases. The usefulness of these criteria is demonstrated on several databases selected from affect computing domain. The databases contain different kind of data: video or still...
-
Geometric Algebra Model of Distributed Representations
PublicationFormalism based on GA is an alternative to distributed representation models developed so far-Smolensky's tensor product, Holographic Reduced Representations (HRR) and Binary Spatter Code (BSC). Convolutions are replaced by geometric products, interpretable in terms of geometry which seems to be the most natural language for visualization of higher concepts. This paper recalls the main ideas behind the GA model and investigates...
-
The project IDENT: Multimodal biometric system for bank client identity verification
PublicationBiometric identity verification methods are implemented inside a real banking environment comprising: dynamic handwritten signature verification, face recognition, bank cli-ent voice recognition and hand vein distribution verification. A secure communication system based on an intra-bank client-server architecture was designed for this purpose. Hitherto achieved progress within the project is reported in this paper with a focus...
-
Methodology for Text Classification using Manually Created Corpora-based Sentiment Dictionary
PublicationThis paper presents the methodology of Textual Content Classification, which is based on a combination of algorithms: preliminary formation of a contextual framework for the texts in particular problem area; manual creation of the Hierarchical Sentiment Dictionary (HSD) on the basis of a topically-oriented Corpus; tonality texts recognition via using HSD for analysing the documents as a collection of topically completed fragments...
-
Magdalena Szuflita-Żurawska
PeopleHead of the Scientific and Technical Information Services at the Gdansk University of Technology Library and the Leader of the Open Science Competence Center. She is also a Plenipotentiary of the Rector of the Gdańsk University of Technology for open science. She is a PhD Candidate. Her main areas of research and interests include research productivity, motivation, management of HEs, Open Access, Open Research Data, information...
-
Szymon Andrzejewski dr
PeopleMaster’s degree at the University of Gdańsk in 2008 Major in political system and self-government. Overgraduate studies at the Gdańsk University of Technology „Management and evaluation of projects financed from EU funds” and at AGH University of Science and Technology Noise protection against noise and vibration. Student of sociology PhD studies at the University of Gdańsk from 2016. The research scope is democracy and institutions...