Search results for: SPEECH PROCESSING
-
Building Knowledge for the Purpose of Lip Speech Identification
Publication: Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared, taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...
-
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
Publication: Automatic classification methods, such as artificial neural networks (ANNs), k-nearest neighbors (kNN), and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers was audio-video recorded, from which seven native speakers’...
-
Vocalic Segments Classification Assisted by Mouth Motion Capture
Publication: Visual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e., lip reading) can enhance recognition accuracy in a multi-modal approach. For that purpose, motion capture markers were placed on speakers' faces to obtain lip-tracking data during speech. Different parameterization strategies were tested...
-
Human-computer interactions in speech therapy using a blowing interface
Publication: In this paper we present a new human-computer interface for the quantitative measurement of blowing activities. The interface can measure the air flow and air pressure during the blowing activity. The measured values are stored and used to control the state of graphical objects in the graphical user interface. In speech therapy, children will find it easier to play attractive therapeutic games than to perform repetitive and tedious,...
-
A Device for Measuring Auditory Brainstem Responses to Audio
Publication: Standard auditory brainstem response (ABR) devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer for collecting and analyzing ABRs. The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...
-
Secured wired BPL voice transmission system
Publication: Designing a secured voice transmission system is not a trivial task. Wired media, thanks to their reliability and resistance to mechanical damage, seem an ideal solution. The BPL (Broadband over Power Line) cable is resistant to power outages and partial damage of phase conductors, ensuring continuity of transmission in case of an emergency. It seems an appropriate tool for delivering critical data, mostly clear and understandable...
-
Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów
Publication: The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time- and frequency-domain parameters is selected and discussed in the context of the acoustic characteristics of stop consonants. A toolbox created for this purpose...
-
Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training
Publication: In this paper we tackle the bi-objective problem of optimizing execution time and power consumption of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...
-
Estimation of time-frequency complex phase-based speech attributes using narrow band filter banks
Publication: In this paper, we present nonlinear estimators of nonstationary and multicomponent signal attributes (parameters, properties), namely instantaneous frequency, spectral (or group) delay, and chirp-rate (also known as instantaneous frequency slope). We estimate all of these distributions in the time-frequency domain using both finite and infinite impulse response (FIR and IIR) narrow-band filters for speech analysis. Then, we present...
-
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
Publication: The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...
-
Multimedia industrial and medical applications supported by machine learning
Publication: This article outlines a keynote paper presented at the Intelligent Decision Technologies conference, part of the KES Multi-theme Conference “Smart Digital Futures” organized in Rome on June 14–16, 2023. It briefly discusses projects related to traffic control using developed intelligent traffic signs, diagnosing the health of wind turbine mechanisms, and multimodal biometric authentication for banking branches to provide...
-
The Innovative Faculty for Innovative Technologies
Publication: A leaflet describing the Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. The Multimedia Systems Department described its laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...
-
Visual Lip Contour Detection for the Purpose of Speech Recognition
Publication: A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from the mouth region. Different Active Appearance Models are employed for finding lips in video frames and for statistical description of lip shape and texture. A search initialization procedure is proposed and error measure values are...
-
Ultrawideband transmission in physical channels: a broadband interference view
Publication: The superposition of multipath components (MPC) of an emitted wave, formed by reflections from limiting surfaces and obstacles in the propagation area, strongly affects communication signals. In the case of modern wideband systems, the effect should be seen as a broadband counterpart of classical interference, which is the cause of fading in narrowband systems. This paper shows that in wideband communications, the time- and frequency-domain...
-
Examining Feature Vector for Phoneme Recognition
Publication: The aim of this paper is to analyze the usability of descriptors from music information retrieval for phoneme analysis. The case study presented consists of several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time- and frequency-domain parameters is selected and discussed in the context of the acoustic characteristics of stop consonants. A toolbox created for this purpose...
-
Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications
Publication: A method for automatically determining the position of selected sound events, such as speech signals and impulse sounds, in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on a modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...
-
Quality Evaluation of Novel DTD Algorithm Based on Audio Watermarking
Publication: Echo cancellers typically employ a doubletalk detection (DTD) algorithm in order to keep the adaptive filter from diverging in the presence of a near-end speech signal or other disruptive sounds in the microphone signal. A novel doubletalk detection algorithm based on techniques similar to those used for audio signal watermarking was introduced by the authors. The application of the described DTD algorithm within acoustic echo cancellation...
-
Detection and localization of selected acoustic events in acoustic field for smart surveillance applications
Publication: A method for automatically determining the position of selected sound events, such as speech signals and impulse sounds, in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on a modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...
-
Cross-domain applications of multimodal human-computer interfaces
Publication: Multimodal interfaces developed for educational applications and for disabled people are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with mouth gestures, an audio interface for speech stretching for hearing-impaired and stuttering people, and an intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...
-
Subjective and Objective Comparative Study of DAB+ Broadcast System
Publication: Broadcasting services seek to optimize their use of bandwidth in order to maximize users’ quality of experience. They aim to transmit high-quality digital speech and music signals at the lowest bitrate. They intend to offer the best quality under available conditions. Due to bandwidth limitations, audio quality is in conflict with the number of transmitted radio programs. This paper analyzes whether the quality of real-time digital...