Wyniki wyszukiwania dla: audiovisual speech recognition

Rough Sets Applied to Mood of Music Recognition

Publikacja

- Rok 2016

With the growth of accessible digital music libraries over the past decade, there is a need for research into automated systems for searching, organizing and recommending music. Mood of music is considered as one of the most intuitive criteria for listeners, thus this work is focused on the emotional content of music and its automatic recognition. The research study presented in this work contains an attempt to music emotion recognition...

Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency

Publikacja

- Elektronika : konstrukcje, technologie, zastosowania - Rok 2008

In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). New iterative algorithm for analysis of ICF of speech signal in presented. Obtained results are compared with commonly used methods to prove its accuracy and connection between ICF and pitch, particularly for narrowband-filtered speech signal.

Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency

Publikacja

- Rok 2007

In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). New iterative algorithm for analysis of ICF of speech signal in presented. Obtained results are compared with commonly used methods to prove its accuracy and connection between ICF and pitch, particularly for narrowband-filtered speech signal.

Automated detection of pronunciation errors in non-native English speech employing deep learning

Publikacja

D. Korzekwa

- Rok 2023

Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Pełny tekst do pobrania w portalu

Emotion Recognition Using Physiological Signals

Publikacja

W. Szwoch

- Rok 2015

In this paper the problem of emotion recognition using physiological signals is presented. Firstly the problems with acquisition of physiological signals related to specific human emotions are described. It is not a trivial problem to elicit real emotions and to choose stimuli that always, and for all people, elicit the same emotion. Also different kinds of physiological signals for emotion recognition are considered. A set of...

Pełny tekst do pobrania w serwisie zewnętrznym

Facial emotion recognition using depth data

Publikacja

M. Szwoch
P. Pieniazek

- Rok 2015

In this paper an original approach is presented for facial expression and emotion recognition based only on depth channel from Microsoft Kinect sensor. The emotional user model contains nine emotions including the neutral one. The proposed recognition algorithm uses local movements detection within the face area in order to recognize actual facial expression. This approach has been validated on Facial Expressions and Emotions Database...

Pełny tekst do pobrania w serwisie zewnętrznym

Emotion recognition and its application in software engineering

Publikacja

- Rok 2013

In this paper a novel application of multimodal emotion recognition algorithms in software engineering is described. Several application scenarios are proposed concerning program usability testing and software process improvement. Also a set of emotional states relevant in that application area is identified. The multimodal emotion recognition method that integrates video and depth channels, physiological signals and input devices...

Pełny tekst do pobrania w serwisie zewnętrznym

A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times

Publikacja

- SENSORS - Rok 2022

Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors such as different perception of each speech sub-bands by the human hearing sense or different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...

Pełny tekst do pobrania w portalu

Dependable Integration of Medical Image Recognition Components

Publikacja

- Rok 2012

Computer driven medical image recognition may support medical doctors in the diagnosis process, but requires high dependability considering potential consequences of incorrect results. The paper presentsa system that improves dependability of medical image recognition by integration of results from redundant components. The components implement alternative recognition algorithms of diseases in thefield of gastrointestinal endoscopy....

Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech

Publikacja

D. Korzekwa
J. Lorenzo-trueba
T. Drugman
S. Calamaro
B. Kostek

- Rok 2021

We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. To train this model, phonetically transcribed L2 speech is not required and we only need to mark mispronounced words. The lack of phonetic transcriptions for L2 speech means that the model has to learn only from a weak signal of word-level mispronunciations. Because of that and due to the limited amount of mispronounced...

Pełny tekst do pobrania w portalu

Feature extraction in detection and recognition of graphical objects

Publikacja

J. Dembski

- Rok 2022

Detection and recognition of graphic objects in images are of great and growing importance in many areas, such as medical and industrial diagnostics, control systems in automation and robotics, or various types of security systems, including biometric security systems related to the recognition of the face or iris of the eye. In addition, there are all systems that facilitate the personal life of the blind people, visually impaired...

Mining inconsistent emotion recognition results with the multidimensional model

Publikacja

- IEEE Access - Rok 2021

The paper deals with the challenge of inconsistency in multichannel emotion recognition. The focus of the paper is to explore factors that might influence the inconsistency. The paper reports an experiment that used multi-camera facial expression analysis with multiple recognition systems. The data were analyzed using a multidimensional approach and data mining techniques. The study allowed us to explore camera location, occlusions...

Pełny tekst do pobrania w portalu

Guido: a musical score recognition system

Publikacja

M. Szwoch

- Rok 2007

This paper presents an optical music recognition system Guido that can automatically recognize the main musical symbols of music scores that were scanned or taken by a digital camera. The application is based on object model of musical notation and uses linguistic approach for symbol interpretation and error correction. The system offers musical editor with a partially automatic error correction.

Mowa nienawiści (hate speech) a odpowiedzialność dostawców usług internetowych w orzecznictwie sądów europejskich

Publikacja

K. Kowalik-Bańczyk

- Rok 2015

The article analyses the phenomenon of hate speech in the Internet contrasted with the problem of responsability of Internet Service Providers for cases of such abuses of freedom of expression. The text provides an analysis of jurisprudence of two European Courts. On the one hand it presents the position of the European Court of Human Rights on the problem of hate speech: its definition and the liability for it as an exception...

Multiclass AdaBoost Classifier Parameter Adaptation for Pattern Recognition

Publikacja

J. Dembski

- Advances in Intelligent Systems and Computing - Rok 2017

The article presents the problem of parameter value selection of the multiclass ``one against all'' approach of an AdaBoost algorithm in tasks of object recognition based on two-dimensional graphical images. AdaBoost classifier with Haar features is still used in mobile devices due to the processing speed in contrast to other methods like deep learning or SVM but its main drawback is the need to assembly the results of binary...

Pełny tekst do pobrania w serwisie zewnętrznym

Objectivization of phonological evaluation of speech elements by means of audio parametrization

Publikacja

- Rok 2018

This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...

Anion recognition by n,n'-diarylalkanediamides

Publikacja

- Rok 2012

The preparation of N,N'-diarylalkanediamides from respective aliphatic dicarboxylic acidesand 4-nitroaniline via microwave-promoted reactions is presented. The most positive effect of microwave irradiation was observed for N,N'-bis(4-nitrophenyl)butanediamide. Anion binding studies on the obtained diamides were carried out in DMSO and acetonitrile using UV-vis and 1H NMR spectroscopy. A mechanism for selective fluoride recognition...

Pełny tekst do pobrania w serwisie zewnętrznym

Elimination of clicks from archive speech signals using sparse autoregressive modeling

Publikacja

- Rok 2012

This paper presents a new approach to elimination of impulsivedisturbances from archive speech signals. The proposedsparse autoregressive (SAR) signal representation is given ina factorized form - the model is a cascade of the so-called formantfilter and pitch filter. Such a technique has been widelyused in code-excited linear prediction (CELP) systems, as itguarantees model stability. After detection of noise pulses usinglinear...

Pełny tekst do pobrania w serwisie zewnętrznym

Robust and Efficient Machine Learning Algorithms for Visual Recognition

Publikacja

S. Cygert

- Rok 2022

In visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...

Pełny tekst do pobrania w portalu

AN ALGORITHM FOR PORTAL HYPERTENSIVE GASTROPATHY RECOGNITION ON THE ENDOSCOPIC RECORDINGS

Publikacja

- Rok 2014

Symptoms recognition of portal hypertensive gastropathy (PHG) can be done by analysing endoscopic recordings, but manual analysis done by physician may take a long time. This increases probability of missing some symptoms and automated methods may be applied to prevent that. In this paper a novel hybrid algorithm for recognition of early stage of portal hypertensive gastropathy is proposed. First image preprocessing is described....

Human-computer interactions in speech therapy using a blowing interface

Publikacja

- Rok 2014

In this paper we present a new human-computer interface for the quantitative measurement of blowing activities. The interface can measure the air flow and air pressure during the blowing activity. The measured values are stored and used to control the state of the graphical objects in the graphical user interface. In speech therapy children will find easier to play attractive therapeutic games than to perform repetitive and tedious,...

Pełny tekst do pobrania w serwisie zewnętrznym

Limitations of Emotion Recognition in Software User Experience Evaluation Context

Publikacja

- Annals of Computer Science and Information Systems - Rok 2016

This paper concerns how an affective-behavioural- cognitive approach applies to the evaluation of the software user experience. Although it may seem that affect recognition solutions are accurate in determining the user experience, there are several challenges in practice. This paper aims to explore the limitations of the automatic affect recognition applied in the usability context as well as...

Pełny tekst do pobrania w portalu

Accelerometer signal pre-processing influence on human activity recognition

Publikacja

- Rok 2009

A study of data pre-processing influence on accelerometer-based human activity recognition algorithms is presented. The frequency band used to filter-out the accelerometer signals and the number of accelerometers involved were considered in terms of their influence on the recognition accuracy.

Bimodal Emotion Recognition Based on Vocal and Facial Features

Publikacja

- Rok 2023

Emotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...

Pełny tekst do pobrania w portalu

Music Genre Recognition in the Rough Set-Based Environment

Publikacja

- Rok 2015

The aim of this paper is to investigate music genre recognition in the rough set-based environment. Experiments involve a parameterized music data-base containing 1100 music excerpts. The database is divided into 11 classes cor-responding to music genres. Tests are conducted using the Rough Set Exploration System (RSES), a toolset for analyzing data with the use of methods based on the rough set theory. Classification effectiveness...

Pełny tekst do pobrania w serwisie zewnętrznym

Emotion Recognition from Physiological Channels Using Graph Neural Network

Publikacja

- SENSORS - Rok 2022

In recent years, a number of new research papers have emerged on the application of neural networks in affective computing. One of the newest trends observed is the utilization of graph neural networks (GNNs) to recognize emotions. The study presented in the paper follows this trend. Within the work, GraphSleepNet (a GNN for classifying the stages of sleep) was adjusted for emotion recognition and validated for this purpose. The...

Pełny tekst do pobrania w portalu

Limitations of Emotion Recognition from Facial Expressions in e-Learning Context

Publikacja

- Rok 2017

The paper concerns technology of automatic emotion recognition applied in e-learning environment. During a study of e-learning process the authors applied facial expressions observation via multiple video cameras. Preliminary analysis of the facial expressions using automatic emotion recognition tools revealed several unexpected results, including unavailability of recognition due to face coverage and significant inconsistency...

Pełny tekst do pobrania w serwisie zewnętrznym

Hand gesture recognition supported by fuzzy rules and Kalman filters

Publikacja

- International Journal of Intelligent Information and Database Systems - Rok 2012

The paper presents a system based on camera and multimediaprojector enabling a user to control computer applications by dynamic hand gestures. Gesture recognition methodology based on representing hand movement trajectory by motion vectors analysed using fuzzy rule-based inference is first given. For effective hand position tracking Kalman filters are employed. The system engineered is developed using J2SE and C++/OpenCV technology....

Database of speech and facial expressions recorded with optimized face motion capture settings

Publikacja

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Rok 2019

The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Pełny tekst do pobrania w portalu

Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System

Publikacja

M. Zamłyńska
P. Falkowski-Gilski
G. Debita
B. Miedziński

- Rok 2021

Although there has been an outbreak of multiple multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...

Pełny tekst do pobrania w serwisie zewnętrznym

Transfer learning in imagined speech EEG-based BCIs

Publikacja

J. S. Garcia Salinas
L. Villaseñor-Pineda
C. A. Reyes-Garćia
A. A. Torres-García

- Biomedical Signal Processing and Control - Rok 2019

The Brain–Computer Interfaces (BCI) based on electroencephalograms (EEG) are systems which aim is to provide a communication channel to any person with a computer, initially it was proposed to aid people with disabilities, but actually wider applications have been proposed. These devices allow to send messages or to control devices using the brain signals. There are different neuro-paradigms which evoke brain signals of interest...

Pełny tekst do pobrania w portalu

Estimation of the short-term predictor parameters of speech under noisy conditions

Publikacja

M. Kuropatwinski
W. Kleijn
M. Kuropatwiński

- IEEE Transactions on Audio Speech and Language Processing - Rok 2006

Pełny tekst do pobrania w serwisie zewnętrznym

Emotion Recognition Based on Facial Expressions of Gamers

Publikacja

- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Rok 2014

This article presents an approach to emotion recognition based on facial expressions of gamers. With application of certain methods crucial features of an analyzed face like eyebrows' shape, eyes and mouth width, height were extracted. Afterwards a group of artificial intelligence methods was applied to classify a given feature set as one of the following emotions: happiness, sadness, anger and fear. The approach presented in this...

Emotion Recognition Based on Facial Expressions of Gamers

Publikacja

- Rok 2012

This article presents an approach to emotion recognition based on facial expressions of gamers. With application of certain methods crucial features of an analysed face like eyebrows' shape, eyes and mouth width, height were extracted. Afterwards a group of artificial intelligence methods was applied to classify a given feature set as one of the following emotions: happiness, sadness, anger and fear.The approach presented in this...

Adversarial attack algorithm for traffic sign recognition

Publikacja

J. Wang
L. Shi
Y. Zhao
H. Zhang
E. Szczerbicki

- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2022

Deep learning suffers from the threat of adversarial attacks, and its defense methods have become a research hotspot. In all applications of deep learning, intelligent driving is an important and promising one, facing serious threat of adversarial attack in the meanwhile. To address the adversarial attack, this paper takes the traffic sign recognition as a typical object, for it is the core function of intelligent driving. Considering...

Pełny tekst do pobrania w serwisie zewnętrznym

Estimation of the excitation variances of speech and noise AR-models for enhanced speech coding

Publikacja

M. Kuropatwinski
W. Kleijn
M. Kuropatwiński

- Rok 2001

Pełny tekst do pobrania w serwisie zewnętrznym

Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning

Publikacja

K. Kąkol

- Rok 2023

The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...

Pełny tekst do pobrania w portalu

Topology recognition and leader election in colored networks

Publikacja

D. Dereniowski
A. Pelc

- THEORETICAL COMPUTER SCIENCE - Rok 2016

Topology recognition and leader election are fundamental tasks in distributed computing in networks. The first of them requires each node to find a labeled isomorphic copy of the network, while the result of the second one consists in a single node adopting the label 1 (leader), with all other nodes adopting the label 0 and learning a path to the leader. We consider both these problems in networks whose nodes are equipped with...

Pełny tekst do pobrania w portalu

Gesture recognition framework for multimedia content viewer controlling

Publikacja

- Rok 2009

In the paper a system for controlling a multimedia content viewer by hand gestures is presented. First, selected methods used for gesture recognition are described. Two different application cases of the system, i.e. for multimedia presentation purposes and for multimedia content viewing are outlined. Moreover, a proposal of improvement of the system combining these approaches is also given. The system work cycle is reviewed. The...

Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System

Publikacja

P. Falkowski-Gilski
G. Debita
M. Habrych
B. Miedziński
P. Jedlikowski
B. Polnik
J. Wandzio
X. Wang

- Rok 2020

The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...

Pełny tekst do pobrania w serwisie zewnętrznym

Comparison of edge detection algorithms for electric wire recognition

Publikacja

- Rok 2018

Edge detection is the preliminary step in image processing for object detection and recognition procedure. It allows to remove useless information and reduce amount of data before further analysis. The paper contains the comparison of edge detection algorithms optimized for detection of horizontal edges. For comparison purposes the algorithms were implemented in the developed application dedicated to detection of electric line...

Pełny tekst do pobrania w serwisie zewnętrznym

Optical recognition elements: macrocyclic imidazole chromoionophores entrapped in silica xerogel

Publikacja

M. Jamrógiewicz
K. Kledzik
M. Gwiazda
E. Wagner-Wysiecka
J. Jezierska
J. Biernat
A. Kłonkowski

- MATERIALS SCIENCE-POLAND - Rok 2007

Materials containing new chromoionophores consisting of crown residue and azole moiety as partsof macrocycles were encapsulated by the sol-gel procedure in silica xerogel matrices and proposed aschemical recognition elements especially for such metal ions as Li+, Cs+ and Cu2+. Action of these recognition elements is in principle based on changes of reflectance. The recognition elements containing 21-membered chromogenic...

Acceleration of decision making in sound event recognition employing supercomputing cluster

Publikacja

- INFORMATION SCIENCES - Rok 2014

Parallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...

Pełny tekst do pobrania w serwisie zewnętrznym

Gesture Recognition With the Linear Optical Sensor and Recurrent Neural Networks

Publikacja

- IEEE SENSORS JOURNAL - Rok 2018

In this paper, the optical linear sensor, a representative of low-resolution sensors, was investigated in the multiclass recognition of near-field hand gestures. The recurrent neural network (RNN) with a gated recurrent unit (GRU) memory cell was utilized as a gestures classifier. A set of 27 gestures was collected from a group of volunteers. The 27 000 sequences obtained were divided into training, validation, and test subsets....

Pełny tekst do pobrania w portalu

Digits Recognition with Quadrant Photodiode and Convolutional Neural Network

Publikacja

- Rok 2018

In this paper we have investigated the capabilities of a quadrant photodiode based gesture sensor in the recognition of digits drawn in the air. The sensor consisting of 4 active elements, 4 LEDs and a pinhole was considered as input interface for both discrete and continuous gestures. Index finger and a round pointer were used as navigating mediums for the sensor. Experiments performed with 5 volunteers...

Pełny tekst do pobrania w serwisie zewnętrznym

Camera angle invariant shape recognition in surveillance systems

Publikacja

- Rok 2010

A method for human action recognition in surveillance systems is described. Problems within this task are discussed and a solution based on 3D object models is proposed. The idea is shown and some of its limitations are talked over. Shape description methods are introduced along with their main features. Utilized parameterization algorithm is presented. Classification problem, restricted to bi-nary cases is discussed. Support vector...

Intelligent processing of stuttered speech.

Publikacja

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Rok 2003

W artykule zaprezentowano kilka metod analizy i automatycznego zliczania potknięć artykulacyjnych, związanych z jąkaniem się, opartych na wykorzystaniu algorytmów uczących się sztucznych sieci neuronowych i zbiorów przybliżonych.

Scoreboard Architectural Pattern and Integration of Emotion Recognition Results

Publikacja

- IEEE Access - Rok 2019

This paper proposes a new design pattern, named Scoreboard , dedicated for applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of Blackboard design pattern and comes under...

Pełny tekst do pobrania w portalu

Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine

Publikacja

P. Falkowski-Gilski
G. Debita

- Archives of Acoustics - Rok 2023

In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Pełny tekst do pobrania w portalu

System for automatic singing voice recognition

Publikacja

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Rok 2008

W artykule przedstawiono system automatycznego rozpoznawania jakości i typu głosu śpiewaczego. Przedstawiono bazę danych oraz zaimplementowane parametry. Algorytmem decyzyjnym jest algorytm sztucznych sieci neuronowych. Wytrenowany system decyzyjny osiąga skuteczność ok. 90% w obydwu kategoriach rozpoznawania. Dodatkowo wykazano przy pomocy metod statystycznych, że wyniki działania systemu automatycznej oceny jakości technicznej...

Wyszukiwarka

Filtry

Katalog

Kategoria

Rok

Opcje

Wyniki wyszukiwania dla: audiovisual speech recognition