Search results for: AUTOMATIC SPEECH RECOGNITION, WHISPER, MEDICAL LANGUAGE RECOGNITION, SPEECH PROCESSING

Total results: 1234

Showing the top 1000 results

Building Knowledge for the Purpose of Lip Speech Identification
Publication
- Advances in Intelligent Systems and Computing - Year 2017
Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

Full text available for download from an external service
A Method of Real-Time Non-uniform Speech Stretching
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2012
A method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by adjusting the time scaling factor values in accordance with the signal content. Depending on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...

Full text available for download from an external service
A comparative study of English viseme recognition methods and algorithms
Publication
- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2018
The paper considers an elementary visual unit, the viseme, in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector construction...

Full text available for download in the portal
A comparative study of English viseme recognition methods and algorithm
Publication
- D. Jachimski
- A. Czyżewski
- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2018
The paper considers an elementary visual unit, the viseme, in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector...

Full text available for download in the portal
Automatic recognition of males and females among web browser users based on behavioural patterns of peripherals usage
Publication
- A. Kołakowska
- A. Landowska
- P. Jarmolkowicz
- M. Jarmolkowicz
- K. Sobota
- Internet Research - Year 2016
Purpose: The purpose of this paper is to answer the question whether it is possible to recognise the gender of a web browser user on the basis of keystroke dynamics and mouse movements. Design/methodology/approach: An experiment was organised in order to track mouse and keyboard usage using a special web browser plug-in. After collecting the data, a number of parameters describing the users’ keystrokes, mouse movements and clicks...

Full text available for download from an external service
Emotion Recognition for Affect Aware Video Games
Publication
- M. Szwoch
- W. Szwoch
- Advances in Intelligent Systems and Computing - Year 2015
In this paper the idea of affect-aware video games is presented. A brief review of automatic multimodal affect recognition of facial expressions and emotions is given. The first results of emotion recognition using depth data, as well as a prototype affect-aware video game, are presented.

Full text available for download from an external service
COMPUTER SPEECH AND LANGUAGE

Journals

ISSN: 0885-2308, eISSN: 1095-8363
SEMINARS IN SPEECH AND LANGUAGE

Journals

ISSN: 0734-0478, eISSN: 1098-9056
Speech and Language Technology

Journals

ISSN: 1895-0434
Speech Language and Hearing

Journals

ISSN: 1361-3286, eISSN: 2050-5728
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
Publication
- K. Łopatka
- Year 2015
A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with an adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The key feature of the model is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available for download in the portal
JOURNAL OF MEDICAL SPEECH-LANGUAGE PATHOLOGY

Journals

ISSN: 1065-1438
Comparison of various speech time-scale modification methods
Publication
- A. Kupryjanow
- A. Czyżewski
- Archives of Acoustics - Year 2011
The objective of this work is to investigate the influence of different time-scale modification (TSM) methods on the quality of speech stretched using the designed non-uniform real-time speech time-scale modification algorithm (NU-RTSM). The algorithm combines a typical TSM algorithm with vowel, consonant, stutter, transient and silence detectors. Based on the information about the content and...
Noise profiling for speech enhancement employing machine learning models
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Year 2022
This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available for download in the portal
Speech codec enhancements utilizing time compression and perceptual coding
Publication
- M. Kulesza
- A. Czyżewski
- Year 2007
A method for encoding a wideband speech signal employing standardized narrowband speech codecs is presented, as well as experimental results concerning detection of tonal spectral components. The speech signal, sampled at a higher rate than is suitable for a narrowband coding algorithm, is compressed in order to decrease the number of samples. Next, the time-compressed representation of the signal is encoded using a narrowband...
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
Publication
- A. Kupryjanow
- A. Czyżewski
- Diagnostic Pathology - Year 2012
Methods developed for real-time time-scale modification (TSM) of speech signals are presented. They are based on the non-uniform, speech-rate-dependent SOLA algorithm (Synchronous Overlap and Add). The influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing-impaired children and elderly listeners. It was shown that for the speech with average rate equal to or...

Full text available for download in the portal
Recognition and sensing of anions
Publication
- N. Łukasik
- E. Wagner-Wysiecka
- Year 2013
Molecular ion recognition is one of the most intensively studied areas of supramolecular technology. The reason for this is the essential role that ions play in many biological as well as industrial processes. On the other hand, however, it has been proved that ions can have a negative impact on human health and the environment. For these reasons, it is extremely important to develop rapid and simple methods allowing the determination...
Guido: a musical score recognition system
Publication
- M. Szwoch
- Year 2007
This paper presents Guido, an optical music recognition system that can automatically recognize the main musical symbols of music scores that were scanned or photographed with a digital camera. The application is based on an object model of musical notation and uses a linguistic approach for symbol interpretation and error correction. The system offers a music editor with partially automatic error correction.
Feature extraction in detection and recognition of graphical objects
Publication
- J. Dembski
- Year 2022
Detection and recognition of graphic objects in images are of great and growing importance in many areas, such as medical and industrial diagnostics, control systems in automation and robotics, or various types of security systems, including biometric security systems related to the recognition of the face or iris of the eye. In addition, there are all systems that facilitate the personal life of the blind people, visually impaired...
Communication Platform for Evaluation of Transmitted Speech Quality
Publication
- A. Ciarkowski
- A. Czyżewski
- Journal of Telecommunications and Information Technology - Year 2011
A voice communication system that was designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessment by listening tests or to objective evaluation employing...

Full text available for download in the portal
System of speech signal processing and visualisation for linguistic purposes
Publication
- K. Wojan
- Archives of Acoustics - Year 2005
Limitations of Emotion Recognition from Facial Expressions in e-Learning Context
Publication
- Year 2017
The paper concerns the technology of automatic emotion recognition applied in an e-learning environment. During a study of the e-learning process, the authors applied facial expression observation via multiple video cameras. Preliminary analysis of the facial expressions using automatic emotion recognition tools revealed several unexpected results, including unavailability of recognition due to face coverage and significant inconsistency...

Full text available for download from an external service
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication
- S. Zaporowski
- B. Kostek
- Year 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available for download in the portal
Human emotion recognition with biosignals
Publication
- W. Szwoch
- Year 2022
This chapter presents issues in the field of affective computing. Basic preliminary information for the recognition of emotions is given and models of emotions, various ways of evoking emotions, as well as their theoretical foundations are discussed. Particular attention is given to the use of physiological signals in recognizing emotions. This subject is outlined further below by presenting selected biosignals, their relationship...

Full text available for download from an external service
Mining inconsistent emotion recognition results with the multidimensional model
Publication
- A. Landowska
- T. Zawadzka
- M. Zawadzki
- IEEE Access - Year 2021
The paper deals with the challenge of inconsistency in multichannel emotion recognition. The focus of the paper is to explore factors that might influence the inconsistency. The paper reports an experiment that used multi-camera facial expression analysis with multiple recognition systems. The data were analyzed using a multidimensional approach and data mining techniques. The study allowed us to explore camera location, occlusions...

Full text available for download in the portal
High quality speech codec employing sines+noise+transients model
Publication
- Archives of Acoustics - Year 2006
A method of high-quality wideband speech signal representation employing a sines+transients+noise model is presented. The need for a wideband speech coding approach, as well as various methods for analysis and synthesis of the sine, residual and transient states of the speech signal, are discussed. A perceptual criterion is applied in the proposed approach during encoding of sine amplitudes in order to reduce bandwidth requirements and...

Full text available for download in the portal
Virtual keyboard controlled by eye gaze employing speech synthesis
Publication
- B. Kunka
- R. Rybacki
- K. Łopatka
- A. Czyżewski
- B. Kostek
- Year 2010
The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...
Virtual Keyboard controlled by eye gaze employing speech synthesis
Publication
- K. Łopatka
- R. Rybacki
- B. Kunka
- A. Czyżewski
- B. Kostek
- Elektronika : konstrukcje, technologie, zastosowania - Year 2011
The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...

Full text available for download from an external service
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2018
The aim of the work is to analyze the Lombard speech effect in recordings and then modify the speech signal in order to improve objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of Lombard speech, and in particular on the effect of increasing the fundamental frequency...
Recognition of hazardous acoustic events employing parallel processing on a supercomputing cluster (Rozpoznawanie niebezpiecznych zdarzeń dźwiękowych z wykorzystaniem równoległego przetwarzania na klastrze superkomputerowym)
Publication
- K. Łopatka
- A. Czyżewski
- Year 2015
A method for automatic recognition of hazardous acoustic events operating on a supercomputing cluster is introduced. The methods employed for detecting and classifying the acoustic events are outlined. The evaluation of the recognition engine is provided: both on the training set and using real-life signals. The algorithms yield sufficient performance in practical conditions to be employed in security surveillance systems. The...
Camera-based Automatic System for Tool Measurements and Recognition
Publication
- T. Mikołajczyk
- A. Kłodowski
- A. Mrozinski
- A. Mroziński
- Procedia Technology - Year 2016
Full text available for download from an external service
Automatic recognition of the arterial input function in MRI studies
Publication
- J. Rumiński
- B. Karczewski
- Year 2005
The article describes an automatic method for detecting the arterial input function (AIF). The method was compared against clinically measured DSC-MRI image series.
Corrupted speech intelligibility improvement using adaptive filter based algorithm
Publication
- D. Ellwart
- A. Czyżewski
- Year 2010
A technique for improving the quality of speech signals recorded in strong noise is presented. The proposed algorithm employing adaptive filtration is described and additional possibilities of speech intelligibility improvement are discussed. Results of the tests are presented.
Distortion of speech signals in the listening area: its mechanism and measurements
Publication
- H. Lasota
- R. Mazurek
- I. Kochańska
- Year 2014
The paper deals with the problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of the same form and slightly different magnitudes that reach a listener from numerous coherent sources is accompanied by interference effects...

Full text available for download from an external service
Limitations of Emotion Recognition in Software User Experience Evaluation Context
Publication
- A. Landowska
- J. Miler
- Annals of Computer Science and Information Systems - Year 2016
This paper concerns how an affective-behavioural-cognitive approach applies to the evaluation of the software user experience. Although it may seem that affect recognition solutions are accurate in determining the user experience, there are several challenges in practice. This paper aims to explore the limitations of automatic affect recognition applied in the usability context as well as...

Full text available for download in the portal
Scoreboard Architectural Pattern and Integration of Emotion Recognition Results
Publication
- A. Landowska
- G. Brodny
- IEEE Access - Year 2019
This paper proposes a new design pattern, named Scoreboard, dedicated to applications solving complex, multi-stage, non-deterministic problems. The pattern provides a computational framework for the design and implementation of systems that integrate a large number of diverse specialized modules that may vary in accuracy, solution level, and modality. The Scoreboard is an extension of the Blackboard design pattern and comes under...

Full text available for download in the portal
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Publication
- SENSORS - Year 2022
Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by the human hearing sense or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...

Full text available for download in the portal
A non-uniform real-time speech time-scale stretching method
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2011
An algorithm for non-uniform real-time speech stretching is presented. It combines the typical SOLA algorithm (Synchronous Overlap and Add) with vowel, consonant and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
Multiclass AdaBoost Classifier Parameter Adaptation for Pattern Recognition
Publication
- J. Dembski
- Advances in Intelligent Systems and Computing - Year 2017
The article presents the problem of parameter value selection for the multiclass "one against all" approach of the AdaBoost algorithm in object recognition tasks based on two-dimensional graphical images. The AdaBoost classifier with Haar features is still used in mobile devices due to its processing speed, in contrast to other methods like deep learning or SVM, but its main drawback is the need to assemble the results of binary...

Full text available for download from an external service
Extracting concepts from the software requirements specification using natural language processing
Publication
- J. Kuchta
- P. Padhiyar
- Year 2018
Extracting concepts from the software requirements is one of the first steps on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....

Full text available for download from an external service
Recognition of Hand Drawn Flowcharts
Publication
- W. Szwoch
- M. Mucha
- Year 2013
In this paper the problem of hand-drawn flowchart recognition is presented. Two approaches to this problem are described: on-line and off-line. A concept of FCE, a system for recognizing and understanding freehand on-line flowcharts drawn on desktop computers and mobile devices, is presented. The first experiments with the FCE system and plans for the future are also described.
Semantic Integration of Heterogeneous Recognition Systems
Publication
- P. Kaczmarek
- P. Raszkowski
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011
Computer perception of real-life situations is performed using a variety of recognition techniques, including video-based computer vision, biometric systems, RFID devices and others. The proliferation of recognition modules enables development of complex systems by integration of existing components, analogously to the Service Oriented Architecture technology. In the paper, we propose a method that enables integration of information...
Using Physiological Signals for Emotion Recognition
Publication
- W. Szwoch
- Year 2013
Recognizing users’ emotions is a promising area of research in the field of human-computer interaction. It is possible to recognize emotions using facial expressions, audio signals, body poses, gestures etc., but physiological signals are very useful in this field because they are spontaneous and not controllable. In this paper the problem of using physiological signals for emotion recognition is presented. The kinds of physiological...

Full text available for download from an external service
Emotions in Polish speech recordings
Research data
open access
- M. Mięsikowska
- D. Świsulski
The data set presents emotions recorded in sound files containing utterances of Polish speech. The utterances were made by five young men aged 21-23. Each person said the following words: nie (no), oddaj (give back), podaj (pass), stop (stop), tak (yes), trzymaj (hold), five times, representing a specific emotion, one of three: anger (a),...
Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System
Publication
- M. Zamłyńska
- P. Falkowski-Gilski
- G. Debita
- B. Miedziński
- Year 2021
Although there has been an outbreak of multiple multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...

Full text available for download from an external service
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
Publication
- P. Falkowski-Gilski
- G. Debita
- Archives of Acoustics - Year 2023
In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Full text available for download in the portal
Emotion Recognition and Its Applications
Publication
- Advances in Intelligent Systems and Computing - Year 2014
The paper proposes a set of research scenarios to be applied in four domains: software engineering, website customization, education and gaming. The goal of applying the scenarios is to assess the possibility of using emotion recognition methods in these areas. It also points out the problems of defining sets of emotions to be recognized in different applications, representing the defined emotional states, gathering the data and...

Full text available for download from an external service
Soft computing based automatic recognition of musical instrument classes.
Publication
- B. Kostek
- Journal of ITC Sangeet Research Academy - Year 2002
The article presents the results of experiments on the automatic recognition of musical instrument classes. Classification was carried out using artificial neural networks, and the feature vector was based on parameters computed from wavelet analysis of musical instrument sounds.
Preliminary Study on Automatic Recognition of Spatial Expressions in Polish Texts
Publication
- M. Marcińczuk
- M. Oleksy
- J. Wieczorek
- Year 2016
Full text available for download from an external service
