Search results for: SPEECH EMOTION RECOGNITION

Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech

Publication

D. Piotrowski
R. Korzeniowski
A. Falai
S. Cygert
K. Pokora
G. Tinchev
Z. Zhang
K. Yanagisawa

- Year 2023

In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...

Full text to download in external service

Human-Computer Interface Based on Visual Lip Movement and Gesture Recognition

Publication

- International Journal of Computing Science and Mathematics - Year 2010

The multimodal human-computer interface (HCI) called LipMouse is presented, allowing a user to work on a computer using movements and gestures made with his/her mouth only. Algorithms for lip movement tracking and lip gesture recognition are presented in detail. User face images are captured with a standard webcam. Face detection is based on a cascade of boosted classifiers using Haar-like features. A mouth region is located in...

Full text to download in external service

Automatic recognition of therapy progress among children with autism

Publication

A. Kołakowska
A. Landowska
A. Anzulewicz
K. Sobota

- Scientific Reports - Year 2017

The article presents a research study on recognizing therapy progress among children with autism spectrum disorder. The progress is recognized on the basis of behavioural data gathered via five specially designed tablet games. Over 180 distinct parameters are calculated on the basis of raw data delivered via the game flow and tablet sensors - i.e. touch screen, accelerometer and gyroscope. The results obtained confirm the possibility...

Full text available to download

Local Texture Pattern Selection for Efficient Face Recognition and Tracking

Publication

- Advances in Intelligent Systems and Computing - Year 2015

This paper describes research aimed at finding the optimal configuration of a face recognition algorithm based on local texture descriptors (binary and ternary patterns). Since the identification module was supposed to be a part of the face tracking system developed for an interactive wearable computer, proper feature selection, allowing for real-time operation, became particularly important. Our experiments showed that it is...

Full text to download in external service

A Concept of Automatic Film Color Grading Based on Music Recognition and Evoked Emotions

Publication

- Year 2019

The article presents the aspects of the final selection of the color of shots in film production based on the psychology of color. First of all, the elements of color processing, contrast, saturation or white balance in the film shots were presented and the definition of color grading was given. In the second part of the article the analysis of film music was conducted in the context of stimulating appropriate emotions while watching...

Feasibility Study for Food Intake Tasks Recognition Based on Smart Glasses

Publication

M. Biallas
A. Andrushevich
R. Kistler
A. Klapproth
K. Czuszyński
A. Bujnowski

- Journal of Medical Imaging and Health Informatics - Year 2015

In this exploratory study, 13 adult test subjects performed different food intake tasks while wearing a three-axis accelerometer mounted at a temple of glasses. Two different algorithms for task recognition were applied and compared. The retrospective data processing leads to better task recognition results when the frequency range of 50 Hz to 100 Hz is analysed within accelerometer signal recordings. A straightforward...

Full text to download in external service

Fuzzy rule-based dynamic gesture recognition employing camera & multimedia projector

Publication

- Year 2010

The paper presents a system based on a camera and a multimedia projector that enables a user to control computer applications with dynamic hand gestures. The main objective is to present the gesture recognition methodology, which is based on representing the hand movement trajectory by motion vectors analyzed using fuzzy rule-based inference. The approach was engineered in the system developed with J2SE and C++ / OpenCV technology. OpenCV...

Full text to download in external service

Sign Language Recognition Using Convolution Neural Networks

Publication

- Year 2024

The objective of this work was to provide an app that can automatically recognize hand gestures from the American Sign Language (ASL) on mobile devices. The app employs a model based on Convolutional Neural Network (CNN) for gesture classification. Various CNN architectures and optimization strategies suitable for devices with limited resources were examined. InceptionV3 and VGG-19 models exhibited negligibly higher accuracy than...

Full text to download in external service

Integration of speech enhancement and coding techniques

Publication

M. Kuropatwinski
D. Leckschat
K. Kroschel
A. Czyzewski
M. Kuropatwiński

- Year 1999

Full text to download in external service

Broadband interference in speech reinforcement systems

Publication

- Year 2008

The article addresses the underappreciated problem of how the number and distribution of loudspeakers in sound reinforcement systems affect the quality of voice transmission, that is, speech intelligibility in auditoria. The superposition of time-shifted broadband signals of the same shape and slightly different magnitudes, which reach the listener from numerous coherent sources, is accompanied by interference that leads to a deep modification of the received signals...

Novel approaches to wideband speech coding

Publication

- Year 2008

Two methods of wideband speech coding are presented. The first method uses an algorithm for time-scale compression and expansion of the speech signal, which allows wideband coding of the speech signal with standardized codecs. This method is intended for use in adaptive speech coding algorithms. The second proposed solution concerns a new method for estimating the spectral envelope of the speech signal...

Full text to download in external service

A system for multitask noisy speech enhancement.

Publication

A. Czyżewski
A. Kaczmarek
J. Kotus
A. Pawlik
A. Rypulak
P. Żwan

- Year 2004

The article presents a general description of the developed speech recording and reconstruction system. It describes the components of the system, which is software containing advanced tools for improving speech intelligibility. The implemented tools make it possible to search sound recordings and process them with the implemented plug-ins. The article presents the algorithms used in the system...

Multitask Noisy Speech Enhancement System

Publication

- Year 2005

The paper describes the Multitask Speech Signal Quality Enhancement System. It is a specialized software package designed for recording the speech signal and for improving its quality and intelligibility using advanced digital signal processing procedures. The package consists of the programs Rejestrator (Recorder), Przeglądarka (Browser), and Rekonstruktor (Reconstructor). The software can be used in cases where intelligibility...

Difference in Perceived Speech Signal Quality Assessment Among Monolingual and Bilingual Teenage Students

Publication

P. Falkowski-Gilski

- Year 2021

User-perceived quality depends on a mixture of factors, including the background of the individual. The process of auditory perception is discussed in a wide variety of fields, ranging from engineering to medicine. Many studies examine the difference between musicians and non-musicians. Since musical training develops musical hearing and various other auditory capabilities, similar enhancements should be observable in the case of bilingual...

Full text to download in external service

Contextual Knowledge to Enhance Workplace Hazard Recognition and Interpretation in a Cognitive Vision Platform

Publication

C. De
C. Sanin
E. Szczerbicki

- Year 2018

The combination of vision and sensor data, together with the resulting necessity for formal representations, forms a central component of an autonomous Cyber Physical System for detection and tracking of laborers in workplace environments. This system must be adaptable and perceive the environment as automatically as possible, performing in a variety of plants and scenes without the necessity of recoding the application for each...

Full text available to download

Theory of recognition in a historical perspective. Axel Honneth's Anerkennung: Eine europäische Ideengeschichte

Publication

A. Karalus

- Archiwum Historii Filozofii i Myśli Społecznej - Year 2019

The article discusses Honneth's excursion into the realm of the history of ideas. This time Honneth focuses on the notion of "recognition" in three different cultural areas and three different traditions: French, English, and German. The article discusses Honneth's perspective and attempts to find the common thread that would link the three aforementioned traditions.

Full text available to download

Speaker Recognition Using Convolutional Neural Network with Minimal Training Data for Smart Home Solutions

Publication

M. Wang
T. Sirlapu
A. Kwaśniewska
M. Szankin
M. Bartscherer
R. Nicolas

- Year 2018

With technological advancements in the smart home sector, voice control and automation are key components that can make a real difference in people's lives. The voice recognition technology market continues to evolve rapidly, as almost all smart home devices provide speaker recognition capability today. However, most of them provide cloud-based solutions or use very deep Neural Networks for the speaker recognition task, which are...

Full text to download in external service

Examining Classifiers Applied to Static Hand Gesture Recognition in Novel Sound Mixing System

Publication

- Advances in Intelligent Systems and Computing - Year 2013

The main objective of the chapter is to present the methodology and results of examining various classifiers (Nearest Neighbor-like algorithm with non-nested generalization (NNge), Naive Bayes, C4.5 (J48), Random Tree, Random Forests, Artificial Neural Networks (Multilayer Perceptron), Support Vector Machine (SVM)) used for static gesture recognition. A problem of effective gesture recognition is outlined in the context of the system...

Full text to download in external service

Estimation of time-frequency complex phase-based speech attributes using narrow band filter banks

Publication

K. Abratkiewicz
K. Czarnecki
D. Fourer
F. Auger

- Year 2017

In this paper, we present nonlinear estimators of nonstationary and multicomponent signal attributes (parameters, properties), namely instantaneous frequency, spectral (or group) delay, and chirp-rate (also known as instantaneous frequency slope). We estimate all of these distributions in the time-frequency domain using both finite and infinite impulse response (FIR and IIR) narrow band filters for speech analysis. Then, we present...

Full text available to download

Recognition of environmentally important ions

Publication

N. Łukasik
E. Wagner-Wysiecka
V. Hubscher-Bruder
M. Bocheńska
S. Michel

- Logistyka - Year 2013

Influence of Thermal Imagery Resolution on Accuracy of Deep Learning based Face Recognition

Publication

- Year 2019

Human-system interactions frequently require retrieval of key context information about the user and the environment. Image processing techniques have been widely applied in this area, providing details about recognized objects, people and actions. Considering remote diagnostics solutions, e.g. non-contact vital signs estimation and smart home monitoring systems that utilize a person's identity, security is a very important factor....

Full text available to download

From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition

Publication

P. Rościszewski

- Computer Science - Year 2017

Recently gathered image datasets and the new capabilities of high-performance computing systems have allowed developing new artificial neural network models and training algorithms. Using the new machine learning models, computer vision tasks can be accomplished based on the raw values of image pixels instead of specific features. The principle of operation of deep neural networks resembles more and more what we believe to be happening...

Full text available to download

Deep Learning: A Case Study for Image Recognition Using Transfer Learning

Publication

S. Erpolat Tasabat
O. Aydin

- Year 2021

Deep learning (DL) is a rising star of machine learning (ML) and artificial intelligence (AI) domains. Until 2006, many researchers had attempted to build deep neural networks (DNN), but most of them failed. In 2006, it was proven that deep neural networks are one of the most crucial inventions for the 21st century. Nowadays, DNN are being used as a key technology for many different domains: self-driven vehicles, smart cities,...

Full text to download in external service

Transient detection algorithms for speech coding applications

Publication

G. Szwoch
M. Kulesza
A. Czyzewski

- Journal of the Acoustical Society of America - Year 2006

Full text to download in external service

Comprehensive Evaluation of Statistical Speech Waveform Synthesis

Publication

T. Merritt
B. Putrycz
A. Nadolski
T. Ye
D. Korzekwa
W. Dolecki
T. Drugman
V. Klimkov
A. Moinet
A. Breen... and 3 others

- Year 2018

Full text to download in external service

New generation speech aid for stuttering people

Publication

- Year 2008

Modern digital signal processors (DSPs) are small, yet they are able to execute complex algorithms. An additional advantage is the ease of replacing their software, and hence the ease of changing the application domain. By exploiting the capabilities of these processors, it has become possible to build miniature hearing and speech aids. The paper focuses on issues related to the design and implementation of algorithms...

Full text available to download

New generation speech aid for stuttering people

Publication

- Archives of Acoustics - Year 2008

Modern digital signal processors (DSPs) are small, yet they are able to execute complex algorithms. An additional advantage is the ease of replacing their software, and hence the ease of changing the application domain. By exploiting the capabilities of these processors, it has become possible to build miniature hearing and speech aids. The paper focuses on issues related to the design and implementation of algorithms...

Full text available to download

Influence of modulation detection threshold on speech intelligibility

Publication

K. Leo

- ACTA PHYSICA POLONICA A - Year 2011

Full text available to download

Automatic singing quality recognition employing artificial neural networks

Publication

P. Żwan

- Archives of Acoustics - Year 2008

The aim of the article is to demonstrate that the technical quality of singing voices can be assessed automatically. It briefly presents the singing voice database that was created and the implemented parameters. A decision system was designed using artificial neural networks; it rated the technical quality of the voice on a five-point scale. Statistical methods were used to show that the results generated by this system...

Full text available to download

A Novel IoT-Perceptive Human Activity Recognition (HAR) Approach Using Multi-Head Convolutional Attention

Publication

H. Zhang
Z. Xiao
J. Wang
F. Li
E. Szczerbicki

- IEEE Internet of Things Journal - Year 2019

With the rapid advancement of the Internet of Things (IoT), smart healthcare applications and systems are equipped with more and more wearable sensors and mobile devices. These sensors are used not only to collect data but also, more importantly, to assist in tracking and analyzing the daily activities of their users. Various human activity recognition (HAR) approaches are used to enhance such tracking. Most of the existing...

Full text available to download

Real-time working gas recognition system based on the array of semiconductor gas sensors and portable computer Raspberry PI

Publication

- Year 2013

Gas-analyzing systems based on an array of partially selective gas sensors and pattern-recognition techniques are a potentially fast and low-cost alternative to other devices, such as gas analysers. They make it possible to recognize the type and concentration of the measured volatile compounds in their working environment. In this work we present the implementation of a gas recognition system, in which the signals from an...

Improving Traffic Light Recognition Methods using Shifting Time-Windows

Publication

A. Blokus
H. Krawczyk

- Year 2018

We propose a novel method of improving algorithms that recognize traffic lights in video sequences. Our focus is on algorithms for applications which notify the driver of a light in sight. Many existing methods process images in the recording separately. Our method is based on the observation that real-life videos depict underlying continuous processes. We named our method FSA (Frame Sequence Analyzed). It is applicable for any underlying...

Full text to download in external service

Unraveling the Interplay between DNA and Proteins: A Computational Exploration of Sequence and Structure-Specific Recognition Mechanisms

Publication

K. A. Hossain

- Year 2023

My PhD dissertation focused on DNA-protein interactions and the recognition of specific DNA sequences and structures. I discovered that acidic amino acid residues (Asp/Glu) play a crucial role by exhibiting a preference for cytosine. Their contribution to binding affinity depends on nearby cytosines, balancing electrostatic repulsion with specific interactions. Acidic residues act as negative selectors, discouraging non-cytosine...

Full text available to download

Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams

Publication

K. Łopatka

- Year 2015

A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...

Viruses, cancer and non-self recognition

Publication

M. Padariya
U. Kalathiya
S. Mikac
K. Dziubek
M. Tovar
E. Sroka
R. Fahraeus
A. Sznarkowska

- Open Biology - Year 2021

Full text to download in external service

Face Recognition: Shape versus Texture

Publication

M. Smiatacz

- Year 2015

This paper describes experiments related to the application of well-known texture feature extraction techniques (Local Binary Patterns and Gabor filtering) to the problem of automatic face verification. Results of the tests show that a simple image normalization strategy based on eye center detection and a regular grid of fiducial points outperforms the more complicated approach, employing active models that are able to...

Full text to download in external service

Balance recognition on the basis of EEG measurement.

Publication

- Annals of Computer Science and Information Systems - Year 2016

Although electroencephalography (EEG) is not typically used for verifying the sense of balance, it can be used for analysing cortical signals responsible for this phenomenon. Simple balance tasks can be proposed as a good indicator of whether the sense of balance is acting more or less actively. This article presents preliminary results on the potential of using EEG for balance sensing....

Full text available to download

Role of cholesterol in substrate recognition by γ-secretase

Publication

- Scientific Reports - Year 2021

γ-Secretase is an enzyme known to cleave multiple substrates within their transmembrane domains, with the amyloid precursor protein of Alzheimer’s Disease among the most prominent examples. The activity of γ-secretase strictly depends on the membrane cholesterol content, yet the mechanistic role of cholesterol in the substrate binding and cleavage remains unclear. In this work, we used all-atom molecular dynamics simulations to examine...

Full text available to download

System of speech signal processing and visualisation for linguistic purposes

Publication

K. Wojan

- Archives of Acoustics - Year 2005

Digital analysis of ethnic speech – extraction of information code

Publication

K. Wojan

- Archives of Acoustics - Year 2003

On the EM algorithm for the estimation of speech AR parameters in noise

Publication

M. Kuropatwinski
B. Kleijn
M. Kuropatwiński

- Year 2014

Full text to download in external service

Evaluation and Irony in Text in the Light of Speech Act Theory

Publication

K. Kukowicz-Zarska

- Forum Filologiczne Ateneum - Year 2020

Full text to download in external service

Investigations of speech signal parameters with regard to articulation influences

Publication

A. Kaczmarek

- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2008

The work addresses the parameterization of the speech signal in the context of biometric feature extraction. The analyzed parameters are cepstral parameters (linear cepstrum and mel-cepstrum, i.e. MFCC), linear prediction (LPC) parameters, spectral moments, and the F0 parameter. The analysis was carried out in short, fixed-length signal segments with large overlap, so-called "implicit segmentation". This made it possible to observe...

New approach to localization of clicks in archive speech signals.

Publication

- Year 2004

The problem of localizing impulsive distortions in archive speech signals is presented. It is shown that detection based on a two-range autoregressive model combined with bidirectional processing yields a significant improvement over existing methods of localizing distortions.

Advanced speech archiving and restoration system for aviation applications

Publication

A. Czyżewski
J. Kotus
A. Kaczmarek
A. Rypulak
A. Pawlik

- Year 2005

The paper presents the Speech Recording and Reconstruction System developed for aviation applications. The system enables simultaneous recording, archiving, and intelligibility enhancement of speech signals coming from many different radio communication channels. The main purpose of the system is to record and reconstruct verbal messages exchanged by radio between the aircraft pilot and the flight control station; this is extremely important in...

Application of hybrid signals processors to speech and hearing aids

Publication

- Year 2005

Thanks to progress in digital signal processor (DSP) technology, it has become possible to build miniature hearing and speech aids. Despite their small size, these processors are able to execute complex algorithms. An additional advantage is the ease of changing their software, and hence the ease of changing the application domain. The work focuses on issues related to the design and implementation of algorithms applicable...

Detection of dialogue in movie soundtrack for speech intelligibility enhancement

Publication

K. Łopatka

- Year 2014

A method for detecting dialogue in 5.1 movie soundtrack based on interchannel spectral disparity is presented. The front channel signals (left, right, center) are analyzed in the frequency domain. The selected partials in the center channel signal, which yield high disparity with left and right channels, are detected as dialogue. Subsequently, the dialogue frequency components are boosted to achieve increased dialogue intelligibility....

Full text to download in external service

The Influence of Selecting Regions from Endoscopic Video Frames on The Efficiency of Large Bowel Disease Recognition Algorithms

Publication

- Year 2012

The article presents our research in the field of the automatic diagnosis of large intestine diseases on endoscopic video. It focuses on the methods of selecting regions of interest from endoscopic video frames for further analysis by specialized disease recognition algorithms. Four methods of selecting regions of interest have been discussed: a. trivial, b. with the deletion of characteristic, endoscope specific additions to the...

Determination of toxic gases based on the responses of a single electrocatalytic sensor and pattern recognition techniques

Publication

- MEASUREMENT SCIENCE & TECHNOLOGY - Year 2014

A response from an electrocatalytic gas sensor contains fingerprint information about the type of gas and its concentration. As a result, a single gas sensor can be used for the determination of different gases. However, information about the type of gas and its concentration is hidden in the unique shape of the current–voltage response and it is quite difficult to explore. One of the ways to get precise information about the measured...

Full text to download in external service

1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type

Publication

- Year 2020

The paper proposes a network architecture that may be employed for sensing and recognizing the type of a vehicle on the basis of audio recordings made in the proximity of a road. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain sounds of passing vehicles are also taken into account and marked as containing silence....
