Search results for: AUTOMATIC SPEECH RECOGNITION, WHISPER, MEDICAL LANGUAGE RECOGNITION, SPEECH PROCESSING

Feature extraction in detection and recognition of graphical objects

Publication

J. Dembski

- Year 2022

Detection and recognition of graphic objects in images are of great and growing importance in many areas, such as medical and industrial diagnostics, control systems in automation and robotics, or various types of security systems, including biometric security systems related to the recognition of the face or iris of the eye. In addition, there are systems that facilitate the personal life of blind people, visually impaired...

Communication Platform for Evaluation of Transmitted Speech Quality

Publication

- Journal of Telecommunications and Information Technology - Year 2011

A voice communication system that was designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessment by listening tests or objective evaluation employing...

Full text available for download in the portal

System of speech signal processing and visualisation for linguistic purposes

Publication

K. Wojan

- Archives of Acoustics - Year 2005

Limitations of Emotion Recognition from Facial Expressions in e-Learning Context

Publication

- Year 2017

The paper concerns the technology of automatic emotion recognition applied in an e-learning environment. During a study of the e-learning process, the authors observed facial expressions via multiple video cameras. Preliminary analysis of the facial expressions using automatic emotion recognition tools revealed several unexpected results, including unavailability of recognition due to face coverage and significant inconsistency...

Full text available for download from an external service

Ranking Speech Features for Their Usage in Singing Emotion Classification

Publication

- Year 2020

This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available for download in the portal

Human emotion recognition with biosignals

Publication

W. Szwoch

- Year 2022

This chapter presents issues in the field of affective computing. Basic preliminary information for the recognition of emotions is given and models of emotions, various ways of evoking emotions, as well as their theoretical foundations are discussed. Particular attention is given to the use of physiological signals in recognizing emotions. This subject is outlined further below by presenting selected biosignals, their relationship...

Full text available for download from an external service

High quality speech codec employing sines+noise+transients model

Publication

- Archives of Acoustics - Year 2006

A method of high quality wideband speech signal representation employing sines+transients+noise model is presented. The need for a wideband speech coding approach as well as various methods for analysis and synthesis of sines, residual and transient states of speech signal is discussed. The perceptual criterion is applied in the proposed approach during encoding of sines amplitudes in order to reduce bandwidth requirements and...

Full text available for download in the portal

Virtual keyboard controlled by eye gaze employing speech synthesis

Publication

- Year 2010

The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...

Virtual Keyboard controlled by eye gaze employing speech synthesis

Publication

- Elektronika : konstrukcje, technologie, zastosowania - Year 2011

The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...

Full text available for download from an external service

Mining inconsistent emotion recognition results with the multidimensional model

Publication

- IEEE Access - Year 2021

The paper deals with the challenge of inconsistency in multichannel emotion recognition. The focus of the paper is to explore factors that might influence the inconsistency. The paper reports an experiment that used multi-camera facial expression analysis with multiple recognition systems. The data were analyzed using a multidimensional approach and data mining techniques. The study allowed us to explore camera location, occlusions...

Full text available for download in the portal

Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions

Publication

K. Kąkol
G. Korvel
B. Kostek

- Year 2018

The aim of the work is to analyze the Lombard speech effect in recordings and then modify the speech signal in order to improve objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...

Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster. Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym

Publication

- Year 2015

A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...

Camera-based Automatic System for Tool Measurements and Recognition

Publication

T. Mikołajczyk
A. Kłodowski
A. Mroziński

- Procedia Technology - Year 2016

Full text available for download from an external service

Automatic recognition of the arterial input function in MRI studies

Publication

- Year 2005

The article describes an automatic method for detecting the arterial input function (AIF). The method was compared against clinically measured DSC-MRI image series.

Corrupted speech intelligibility improvement using adaptive filter based algorithm

Publication

- Year 2010

A technique for improving the quality of speech signals recorded in strong noise is presented. The proposed algorithm employing adaptive filtration is described and additional possibilities of speech intelligibility improvement are discussed. Results of the tests are presented.

Distortion of speech signals in the listening area: its mechanism and measurements

Publication

- Year 2014

The paper deals with the problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of the same form and slightly different magnitudes that reach a listener from numerous coherent sources is accompanied by interference effects...

Full text available for download from an external service

A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times

Publication

- SENSORS - Year 2022

Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by the human sense of hearing or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...

Full text available for download in the portal

Limitations of Emotion Recognition in Software User Experience Evaluation Context

Publication

- Annals of Computer Science and Information Systems - Year 2016

This paper concerns how an affective-behavioural-cognitive approach applies to the evaluation of the software user experience. Although it may seem that affect recognition solutions are accurate in determining the user experience, there are several challenges in practice. This paper aims to explore the limitations of the automatic affect recognition applied in the usability context as well as...

Full text available for download in the portal

Scoreboard Architectural Pattern and Integration of Emotion Recognition Results

Publication

- IEEE Access - Year 2019

This paper proposes a new design pattern, named Scoreboard, dedicated to applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of the Blackboard design pattern and comes under...

Full text available for download in the portal

A non-uniform real-time speech time-scale stretching method

Publication

- Year 2011

An algorithm for non-uniform real-time speech stretching is presented. It combines the typical SOLA (Synchronous Overlap and Add) algorithm with vowel, consonant, and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...

Multiclass AdaBoost Classifier Parameter Adaptation for Pattern Recognition

Publication

J. Dembski

- Advances in Intelligent Systems and Computing - Year 2017

The article presents the problem of parameter value selection of the multiclass "one against all" approach of an AdaBoost algorithm in tasks of object recognition based on two-dimensional graphical images. The AdaBoost classifier with Haar features is still used in mobile devices due to its processing speed, in contrast to other methods like deep learning or SVM, but its main drawback is the need to assemble the results of binary...

Full text available for download from an external service

Extracting concepts from the software requirements specification using natural language processing

Publication

- Year 2018

Extracting concepts from the software requirements is one of the first steps on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....

Full text available for download from an external service

Recognition of Hand Drawn Flowcharts

Publication

W. Szwoch
M. Mucha

- Year 2013

In this paper the problem of hand-drawn flowchart recognition is presented. Two approaches to this problem are described: on-line and off-line. A concept of FCE, a system for recognizing and understanding freehand-drawn on-line flowcharts on desktop computers and mobile devices, is presented. The first experiments with the FCE system and the plans for the future are also described.

Semantic Integration of Heterogeneous Recognition Systems

Publication

P. Kaczmarek
P. Raszkowski

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011

Computer perception of real-life situations is performed using a variety of recognition techniques, including video-based computer vision, biometric systems, RFID devices and others. The proliferation of recognition modules enables development of complex systems by integration of existing components, analogously to the Service Oriented Architecture technology. In the paper, we propose a method that enables integration of information...

Using Physiological Signals for Emotion Recognition

Publication

W. Szwoch

- Year 2013

Recognizing a user’s emotions is a promising area of research in the field of human-computer interaction. It is possible to recognize emotions using facial expressions, audio signals, body poses, gestures, etc., but physiological signals are very useful in this field because they are spontaneous and not controllable. In this paper the problem of using physiological signals for emotion recognition is presented. The kinds of physiological...

Full text available for download from an external service

Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System

Publication

M. Zamłyńska
P. Falkowski-Gilski
G. Debita
B. Miedziński

- Year 2021

Although there has been an outbreak of multiple multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...

Full text available for download from an external service

Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine

Publication

P. Falkowski-Gilski
G. Debita

- Archives of Acoustics - Year 2023

In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Full text available for download in the portal

Emotion Recognition and Its Applications

Publication

- Advances in Intelligent Systems and Computing - Year 2014

The paper proposes a set of research scenarios to be applied in four domains: software engineering, website customization, education and gaming. The goal of applying the scenarios is to assess the possibility of using emotion recognition methods in these areas. It also points out the problems of defining sets of emotions to be recognized in different applications, representing the defined emotional states, gathering the data and...

Full text available for download from an external service

Preliminary Study on Automatic Recognition of Spatial Expressions in Polish Texts

Publication

M. Marcińczuk
M. Oleksy
J. Wieczorek

- Year 2016

Full text available for download from an external service

Soft computing based automatic recognition of musical instrument classes.

Publication

B. Kostek

- Journal of ITC Sangeet Research Academy - Year 2002

The article presents the results of experiments on automatic recognition of musical instrument classes. The classification was carried out using artificial neural networks, while the feature vector was based on parameters computed from wavelet analysis of musical instrument sounds.

Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency

Publication

- Year 2007

In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). A new iterative algorithm for analysis of the ICF of a speech signal is presented. The obtained results are compared with commonly used methods to demonstrate its accuracy and the connection between ICF and pitch, particularly for narrowband-filtered speech signals.

Pitch estimation of narrowband-filtered speech signal using instantaneous complex frequency

Publication

- Elektronika : konstrukcje, technologie, zastosowania - Year 2008

In this paper we propose a novel method of pitch estimation, based on instantaneous complex frequency (ICF). A new iterative algorithm for analysis of the ICF of a speech signal is presented. The obtained results are compared with commonly used methods to demonstrate its accuracy and the connection between ICF and pitch, particularly for narrowband-filtered speech signals.

Automated detection of pronunciation errors in non-native English speech employing deep learning

Publication

D. Korzekwa

- Year 2023

Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Full text available for download in the portal

Emotion Recognition Using Physiological Signals

Publication

W. Szwoch

- Year 2015

In this paper the problem of emotion recognition using physiological signals is presented. Firstly the problems with acquisition of physiological signals related to specific human emotions are described. It is not a trivial problem to elicit real emotions and to choose stimuli that always, and for all people, elicit the same emotion. Also different kinds of physiological signals for emotion recognition are considered. A set of...

Full text available for download from an external service

Acceleration of decision making in sound event recognition employing supercomputing cluster

Publication

- INFORMATION SCIENCES - Year 2014

Parallel processing of audio data streams is introduced to shorten the decision making time in hazardous sound event recognition. A supercomputing cluster environment with a framework dedicated to processing multimedia data streams in real time is used. The sound event recognition algorithms employed are based on detecting foreground events, calculating their features in short time frames, and classifying the events with Support...

Full text available for download from an external service

Hand gesture recognition supported by fuzzy rules and Kalman filters

Publication

- International Journal of Intelligent Information and Database Systems - Year 2012

The paper presents a system based on a camera and a multimedia projector enabling a user to control computer applications by dynamic hand gestures. Gesture recognition methodology based on representing hand movement trajectory by motion vectors analysed using fuzzy rule-based inference is first given. For effective hand position tracking Kalman filters are employed. The system engineered is developed using J2SE and C++/OpenCV technology....

Facial emotion recognition using depth data

Publication

M. Szwoch
P. Pieniazek

- Year 2015

In this paper an original approach is presented for facial expression and emotion recognition based only on the depth channel from a Microsoft Kinect sensor. The emotional user model contains nine emotions including the neutral one. The proposed recognition algorithm uses local movement detection within the face area in order to recognize the actual facial expression. This approach has been validated on Facial Expressions and Emotions Database...

Full text available for download from an external service

Emotion recognition and its application in software engineering

Publication

- Year 2013

In this paper a novel application of multimodal emotion recognition algorithms in software engineering is described. Several application scenarios are proposed concerning program usability testing and software process improvement. Also a set of emotional states relevant in that application area is identified. The multimodal emotion recognition method that integrates video and depth channels, physiological signals and input devices...

Full text available for download from an external service

Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech

Publication

D. Korzekwa
J. Lorenzo-Trueba
T. Drugman
S. Calamaro
B. Kostek

- Year 2021

We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. To train this model, phonetically transcribed L2 speech is not required and we only need to mark mispronounced words. The lack of phonetic transcriptions for L2 speech means that the model has to learn only from a weak signal of word-level mispronunciations. Because of that and due to the limited amount of mispronounced...

Full text available for download in the portal

Parameters optimization in medicine supporting image recognition algorithms

Publication

A. Brzeski

- Year 2011

In this paper, a procedure for the automatic setup of image recognition algorithms' parameters is proposed, for the purpose of reducing the time needed for algorithm development. The procedure is presented on two medicine-supporting algorithms performing bleeding detection in endoscopic images. Since the algorithms contain multiple parameters which must be specified, empirical testing is usually required to optimise the algorithm's...

Mowa nienawiści (hate speech) a odpowiedzialność dostawców usług internetowych w orzecznictwie sądów europejskich [Hate speech and the liability of Internet service providers in the case law of European courts]

Publication

K. Kowalik-Bańczyk

- Year 2015

The article analyses the phenomenon of hate speech on the Internet contrasted with the problem of the responsibility of Internet Service Providers for cases of such abuses of freedom of expression. The text provides an analysis of the jurisprudence of two European Courts. On the one hand it presents the position of the European Court of Human Rights on the problem of hate speech: its definition and the liability for it as an exception...

Comparison of edge detection algorithms for electric wire recognition

Publication

- Year 2018

Edge detection is the preliminary step in image processing for object detection and recognition procedures. It makes it possible to remove useless information and reduce the amount of data before further analysis. The paper contains a comparison of edge detection algorithms optimized for detection of horizontal edges. For comparison purposes the algorithms were implemented in the developed application dedicated to detection of electric line...

Full text available for download from an external service

Objectivization of phonological evaluation of speech elements by means of audio parametrization

Publication

- Year 2018

This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...

Automatic singing voice recognition employing neural networks and rough sets

Publication

- Year 2007

The aim of the work described in the paper is automatic recognition of singing voices. For this purpose, a database of recordings of professional and amateur singing samples was created. The samples were parameterized using parameters proposed by the authors specifically for this purpose. The method of determining the parameters and their physical interpretation are presented in the paper. The parameters are fed into decision systems, classifiers based...

Verification of the Parameterization Methods in the Context of Automatic Recognition of Sounds Related to Danger

Publication

- Journal of Digital Forensic Practice - Year 2010

The article describes an application that automatically detects sound events such as glass breaking, a gunshot, an explosion, and a scream. The described system consists of a parameterization block and a classifier. The article compares parameters dedicated to this application with standard MPEG-7 descriptors. Two classifiers are also compared: one based on a perceptron (neural networks) and the other based on a support vector machine....

Full text available for download from an external service

Anion recognition by N,N'-diarylalkanediamides

Publication

- Year 2012

The preparation of N,N'-diarylalkanediamides from the respective aliphatic dicarboxylic acids and 4-nitroaniline via microwave-promoted reactions is presented. The most positive effect of microwave irradiation was observed for N,N'-bis(4-nitrophenyl)butanediamide. Anion binding studies on the obtained diamides were carried out in DMSO and acetonitrile using UV-vis and 1H NMR spectroscopy. A mechanism for selective fluoride recognition...

Full text available for download from an external service

Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów

Publication

- Year 2017

The aim of this paper is to analyze the usability of descriptors coming from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

Elimination of clicks from archive speech signals using sparse autoregressive modeling

Publication

- Year 2012

This paper presents a new approach to elimination of impulsive disturbances from archive speech signals. The proposed sparse autoregressive (SAR) signal representation is given in a factorized form - the model is a cascade of the so-called formant filter and pitch filter. Such a technique has been widely used in code-excited linear prediction (CELP) systems, as it guarantees model stability. After detection of noise pulses using linear...

Full text available for download from an external service

Robust and Efficient Machine Learning Algorithms for Visual Recognition

Publication

S. Cygert

- Year 2022

In visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...

Full text available for download in the portal

AN ALGORITHM FOR PORTAL HYPERTENSIVE GASTROPATHY RECOGNITION ON THE ENDOSCOPIC RECORDINGS

Publication

- Year 2014

Symptoms of portal hypertensive gastropathy (PHG) can be recognized by analysing endoscopic recordings, but manual analysis by a physician may take a long time. This increases the probability of missing some symptoms, and automated methods may be applied to prevent that. In this paper a novel hybrid algorithm for recognizing the early stage of portal hypertensive gastropathy is proposed. First, image preprocessing is described....
