Filters
total: 329
Search results for: speech rate
-
Automatic Emotion Recognition in Children with Autism: A Systematic Literature Review
PublicationThe automatic emotion recognition domain brings new methods and technologies that might be used to enhance therapy of children with autism. The paper aims at the exploration of methods and tools used to recognize emotions in children. It presents a literature review study that was performed using a systematic approach and PRISMA methodology for reporting quantitative and qualitative results. Diverse observation channels and modalities...
-
Separability Assessment of Selected Types of Vehicle-Associated Noise
PublicationMusic Information Retrieval (MIR) area as well as development of speech and environmental information recognition techniques brought various tools in-tended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...
-
The influence of time of hearing aid use on auditory perception in various acoustic situations
PublicationThe assessment of sound perception in hearing aids, especially in the context of benefits that a prosthesis can bring, is a complex issue. The objective parameters of the hearing aids can easily be determined. These parameters, however, do not always have a direct and decisive influence on the subjective assessment of quality of the patient’s hearing while using a hearing aid. The paper presents the development of a method for...
-
Noise profiling for speech enhancement employing machine learning models
PublicationThis paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...
-
Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model
PublicationTries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel- Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...
-
Study Analysis of Transmission Efficiency in DAB+ Broadcasting System
PublicationDAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Broadcasting analog and digital radio services does vary, concerning devices on both transmitting and receiving side, as well as content processing mechanisms. However, the biggest difference is...
-
A palatal prosthesis from archaeological research in the St Francis of Assisi church in Cracow (Poland)
PublicationThe hard palate is a septum that not only prevents food from entering between the oral and nasal cavity, but also plays an important role during breathing or speech. The presence of cavities within it negatively affects the comfort of life of people with this type of impairment. Hence, in the literature one can find examples of the use of hard palate prostheses to restore the separation between the nasal and oral cavity. During...
-
Contactless hearing aid designed for infants
PublicationIt is a well known fact that language development through home intervention for a hearing-impaired infant should start in the early months of a newborn baby's life. The aim of this paper is to present a concept of a contactless digital hearing aid designed especially for infants. In contrast to all typical wearable hearing aid solutions (ITC, ITE, BTE), the proposed device is mounted in the infant's bed with any parts of its set-up...
-
Assessment of hearing in coma patients employing auditory brainstem response, electroencephalography, and eye-gaze-tracking
PublicationThe results of the study conducted by Tagliaferri et al. in 12 European countries indicate that the ratio of registered brain injury cases in Europe amounts to 150-300 per 100 000 people, with the European mean value of 235 cases per 100 000 people. The project presented in the paper assumes development of a combined metric of patients’ state remaining in coma by intelligent fusion of GCS (subjective Glasgow Coma Scale or its derivatives)...
-
Voiceless Stop Consonant Modelling and Synthesis Framework Based on MISO Dynamic System
PublicationA voiceless stop consonant phoneme modelling and synthesis framework based on a phoneme modelling in low-frequency range and high-frequency range separately is proposed. The phoneme signal is decomposed into the sums of simpler basic components and described as the output of a linear multiple-input and single-output (MISO) system. The impulse response of each channel is a third order quasi-polynomial. Using this framework, the...
-
Clinical situations text database for Polish language
Open Research DataDataset contains a database of anonymized texts in Polish for the purposes of building a medical speech corpus, for clinical situations in the following areas: medical interview, interview and description of the result of an oncological examination, description of a radiological examination, description of a pathomorphological examination, description...
-
Digital Transformation of Terrestrial Radio: An Analysis of Simulcasted Broadcasts in FM and DAB+ for a Smart and Successful Switchover
PublicationThe process of digitizing radio is far from over. It is an important interdisciplinary aspect, involving Big Data and AI (Artificial Intelligence) when it comes to classifying and handling content, and an organizational challenge in the Industry 4.0 concept. There exist several methods for delivering audio signals, including terrestrial broadcasting and internet streaming. Among them, the DAB+ (Digital Audio Broadcasting plus)...
-
Transfer learning in imagined speech EEG-based BCIs
PublicationThe Brain–Computer Interfaces (BCI) based on electroencephalograms (EEG) are systems which aim is to provide a communication channel to any person with a computer, initially it was proposed to aid people with disabilities, but actually wider applications have been proposed. These devices allow to send messages or to control devices using the brain signals. There are different neuro-paradigms which evoke brain signals of interest...
-
Multimodal human-computer interfaces based on advanced video and audio analysis
PublicationMultimodal interfaces development history is reviewed briefly in the introduction. Examples of applications of multimodal interfaces to education software and for the disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and the audio interface for speech stretching for hearing impaired and stuttering people. The Smart...
-
BPL-PLC Voice Communication System for the Oil and Mining Industry
PublicationApplication of a high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant mean of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...
-
Waveguide model of the hearing aid earmold system
PublicationBackground The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing physician to examine various earmold system configurations and choose the optimum one...
-
Waveguide model of the hearing aid earmold system
PublicationBackground The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing physician to examine various earmold system configurations and choose the optimum one...
-
Multimodal learning application with interactive animated character. [Multimodalna aplikacja edukacyjna wykorzystująca interaktywną animowaną postać]
PublicationThe aim of this study is to design a computer application that may assist teachers and therapists in multimodal manner in their work with impaired or disabled children. The application can be operated in many different ways, giving to a child with special educational needs a possibility to learn and train many skills or treat speech disorders. The main stress in this research is on the creation of animated character that will serve...
-
Trzej prorocy: Sołżenicyn, Friedman, Dugin. Część pierwsza: Sołżenicyn
PublicationArtykuł przedstawia na tle biograficznym dzieło i myśl profetyczną Aleksandra Sołżenicyna. Podstawą jej analizy jest mowa z okazji przyznania autorowi Oddziału chorych na raka literackiej Nagrody Nobla oraz jego wykład na temat stanu cywilizacji Zachodu wygłoszony na Uniwersytecie Harvarda – zatytułowany Zmierzch odwagi. Proroctwa Sołżenicyna dotyczące Zachodu pokazane są w kontekście jego pracy Jak odbudować Rosję? W artykule...
-
New Applications of Multimodal Human-Computer Interfaces
PublicationMultimodal computer interfaces and examples of their applications to education software and for the disabled people are presented. The proposed interfaces include the interactive electronic whiteboard based on video image analysis, application for controlling computers with gestures and the audio interface for speech stretching for hearing impaired and stuttering people. Application of the eye-gaze tracking system to awareness...
-
MODALITY corpus - SPEAKER 35 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S4
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 10 - SEQUENCE S1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S2
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 39 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S2
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - SEQUENCE S1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C2
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S4
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C4
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C4
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 35 - SEQUENCE S1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 39 - SEQUENCE S1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 17 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C2
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 05 - SEQUENCE S1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...