Search results for: SPEECH REINFORCEMENT SYSTEMS - Bridge of Knowledge

Search

Search results for: SPEECH REINFORCEMENT SYSTEMS

Search results for: SPEECH REINFORCEMENT SYSTEMS

  • Broadband interference in speech reinforcement systems

    Publication

    - Year 2008

    Artykuł podejmuje niedoceniany problem wpływu liczby i rozkładu głośników w systemach nagłośnienia, na jakość przekazu głosowego, czyli na zrozumiałość mowy w audytoriach. Superpozycji przesuniętych w czasie szerokopasmowych sygnałów o tym samym kształcie i lekko różnych wielkościach, które docierają do słuchacza z licznych spójnych źródeł, towarzyszy zjawisko interferencji prowadzące do głębokiej modyfikacji odbieranych sygnałów...

  • Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning

    Text-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition models (ASR). Access to medical speech data is because it is sensitive data that is difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...

    Full text available to download

  • A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems

    Publication

    This paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...

    Full text to download in external service

  • Hybrid of Neural Networks and Hidden Markov Models as a modern approach to speech recognition systems

    The aim of this paper is to present a hybrid algorithm that combines the advantages ofartificial neural networks and hidden Markov models in speech recognition for control purpos-es. The scope of the paper includes review of currently used solutions, description and analysis of implementation of selected artificial neural network (NN) structures and hidden Markov mod-els (HMM). The main part of the paper consists of a description...

    Full text available to download

  • Badanie rozkładów parametrów sygnału mowy w zastosowaniach do prognozowania prawdopodobieństwa popełnienia błędów w systemach identyfikacji mówców = Examining distribution of speech signal parameters for the prognosis of error probability in speaker verification systems

    Publication

    - Year 2010

    Przedmiotem pracy jest system identyfikacji mówców w sposób zależny od tekstu ("text dependent''). Dokonano analizy wielu różnych wypowiedzi kilkudziesięciu mówców. Zastosowana metoda parametryzacji to metoda oparta na wynikach analizy cepstralnej sygnału mowy. Zdefiniowane zostały nowe parametry skojarzone z elementarnymi zdarzeniami w procesie weryfikacji mówców. Na tej podstawie dokonano estymacji funkcji gęstości prawdopodobieństwa...

  • PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS

    Publication

    - Year 2015

    The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...

  • Distortion of speech signals in the listening area: its mechanism and measurements

    Publication

    - Year 2014

    The paper deals with a problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of a same form and slightly different magnitudes that reach a listener from numerous coherent sources, is accompanied by interference effects...

    Full text to download in external service

  • Modeling and Designing Acoustical Conditions of the Interior – Case Study

    The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...

    Full text available to download

  • KORPUS MOWY ANGIELSKIEJ DO CELÓW MULTIMODALNEGO AUTOMATYCZNEGO ROZPOZNAWANIA MOWY

    W referacie zaprezentowano audiowizualny korpus mowy zawierający 31 godzin nagrań mowy w języku angielskim. Korpus dedykowany jest do celów automatycznego audiowizualnego rozpoznawania mowy. Korpus zawiera nagrania wideo pochodzące z szybkoklatkowej kamery stereowizyjnej oraz dźwięk zarejestrowany przez matrycę mikrofonową i mikrofon komputera przenośnego. Dzięki uwzględnieniu nagrań zarejestrowanych w warunkach szumowych korpus...

  • Hossein Nejatbakhsh Esfahani PhD

    People

    Since 2012 when I graduated in master of mechatronics engineering I've been dealing with kinds of control theory problems in both theoretical and practical perspective. I have five years of work experience in industrial automation area in Iran where I was swamped with some industrial-based control algorithms such as PID and MPC algorithms which were adopted to control some processes including steam turbine, gas turbine, casting...

  • Koncepcja systemu wspomagania decyzji nawigatora statku opartego na ewolucyjnym planowaniu manewrów antykolizyjnych

    Publication

    Artykuł przedstawia koncepcję systemu wspomagania decyzji nawigatora statku opartego na wątkach badań prowadzonych wcześniej przez autora. System będzie rozszerzał funkcjonalność systemów dotychczasowych o możliwość szczegółowego planowania bezpiecznej trajektorii statku na wodach zamkniętych, z dużą liczbą statków obcych i ograniczeniami toru wodnego. Artykuł zawiera dyskusję możliwych podejść do planowania manewrów, optymalizacji...

  • Orken Mamyrbayev Professor

    People

    1.  Education: Higher. In 2001, graduated from the Abay Almaty State University (now Abay Kazakh National Pedagogical University), in the specialty: Computer science and computerization manager. 2.  Academic degree: Ph.D. in the specialty "6D070300-Information systems". The dissertation was defended in 2014 on the topic: "Kazakh soileulerin tanudyn kupmodaldy zhuyesin kuru". Under my supervision, 16 masters, 1 dissertation...

  • Computer-assisted pronunciation training—Speech synthesis is almost all you need

    Publication

    - SPEECH COMMUNICATION - Year 2022

    The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

    Full text available to download

  • Zdzisław Kowalczuk prof. dr hab. inż.

    Zdzislaw Kowalczuk received his M.Sc. degree in 1978 and Ph.D. degree in 1986, both in Automatic Control from Technical University of Gdańsk (TUG), Gdańsk, Poland. In 1993 he received his D.Sc. degree (Dr Habilitus) in Automatic Control from Silesian Technical University, Gliwice, Poland, and the title of Professor from the President of Poland in 2003. Since 1978 he has been with Faculty of Electronics, Telecommunications and Informatics...

  • Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement

    Publication

    - IEEE Access - Year 2020

    The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

    Full text available to download

  • An audio-visual corpus for multimodal automatic speech recognition

    review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

    Full text available to download

  • Stan graniczny nośności dźwigara żelbetowego mostu na zginanie według norm PN-EN 1992-2 oraz PN-S-10042:1991

    Publication

    - Year 2016

    Praca włącza się w bogaty w ostatnich latach w krajowym piśmiennictwie nurt porównań dwóch generacji norm projektowania mostów z betonu: polskiej - wycofanej, aczkolwiek powszechnie stosowanej oraz europejskiej – wciąż jeszcze wdrażanej do praktyki projektowej. Nowością w stosunku do dotychczasowych publikacji polskich jest szersze ujęcie różnic pomiędzy obydwiema generacjami norm. Poza rozpatrywanymi przez wielu autorów różnicami...

  • A note on total reinforcement in graphs

    Publication

    - DISCRETE APPLIED MATHEMATICS - Year 2011

    In this note we prove a conjecture and inprove some results presendet in a recent paper of N. Sridharan, M.D. Elias, V.S.A. Subramanian, Total reinforcement number of a graph, AKCE Int. J. Graphs Comb. 4 (2) (2007) 197-202.

    Full text available to download

  • CYBERNETICS AND SYSTEMS

    Journals

    ISSN: 0196-9722 , eISSN: 1087-6553

  • Andrzej Czyżewski prof. dr hab. inż.

    Prof. zw. dr hab. inż. Andrzej Czyżewski jest absolwentem Wydziału Elektroniki PG (studia magisterskie ukończył w 1982 r.). Pracę doktorską na temat związany z dźwiękiem cyfrowym obronił z wyróżnieniem na Wydziale Elektroniki PG w roku 1987. W 1992 r. przedstawił rozprawę habilitacyjną pt.: „Cyfrowe operacje na sygnałach fonicznych”. Jego kolokwium habilitacyjne zostało przyjęte jednomyślnie w czerwcu 1992 r. w Akademii Górniczo-Hutniczej...

  • Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions

    Publication

    Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

    Full text to download in external service

  • Speech Intelligibility Measurements in Auditorium

    Publication

    Speech intelligibility was measured in Auditorium Novum on Technical University of Gdansk (seating capacity 408, volume 3300 m3). Articulation tests were conducted; STI and Early Decay Time EDT coefficients were measured. Negative noise contribution to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in auditorium. Correlation was found between...

    Full text available to download

  • Language Models in Speech Recognition

    Publication

    - Year 2022

    This chapter describes language models used in speech recognition, It starts by indicating the role and the place of language models in speech recognition. Mesures used to compare language models follow. An overview of n-gram, syntactic, semantic, and neural models is given. It is accompanied by a list of popular software.

    Full text to download in external service

  • Detecting Lombard Speech Using Deep Learning Approach

    Publication
    • K. Kąkol
    • G. Korvel
    • G. Tamulevicius
    • B. Kostek

    - SENSORS - Year 2023

    Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

    Full text available to download

  • Transient detection for speech coding applications

    Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

    Full text to download in external service

  • Improving the quality of speech in the conditions of noise and interference

    Publication

    The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...

    Full text to download in external service

  • A survey of automatic speech recognition deep models performance for Polish medical terms

    Among the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....

    Full text to download in external service

  • Applying the Lombard Effect to Speech-in-Noise Communication

    Publication

    - Electronics - Year 2023

    This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

    Full text available to download

  • Constructing a Dataset of Speech Recordingswith Lombard Effect

    Publication

    - Year 2020

    Thepurpose of therecordings was to create a speech corpus based on the ISLEdataset, extended with video and Lombard speech. Selected from a set of 165sentences, 10, evaluatedas having thehighest possibility to occur in the context ofthe Lombard effect,were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether,15speakers were recorded, and speech parameterswere...

  • Virtual keyboard controlled by eye gaze employing speech synthesis

    Publication

    - Year 2010

    The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...

  • Virtual Keyboard controlled by eye gaze employing speech synthesis

    The article presents the speech synthesis integrated into the eye gaze tracking system. This approach can significantly improve the quality of life of physically disabled people who are unable to communicate. The virtual keyboard (QWERTY) is an interface which allows for entering the text for the speech synthesizer. First, this article describes a methodology of determining the fixation point on a computer screen. Then it presents...

    Full text to download in external service

  • Improved method for real-time speech stretching

    Publication

    n algorithm for real-time speech stretching is presented. It was designed to modify input signal dependently on its content and on its relation with the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

    Full text to download in external service

  • Methodology and technology for the polymodal allophonic speech transcription

    A method for automatic audiovisual transcription of speech employing: acoustic and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

    Full text to download in external service

  • Methodology and technology for the polymodal allophonic speech transcription

    A method for automatic audiovisual transcription of speech employing: acoustic, electromagnetical articulography and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

    Full text to download in external service

  • Model-free and Model-based Reinforcement Learning, the Intersection of Learning and Planning

    Publication

    - Year 2022

    My doctoral dissertation is intended as the compound of four publications considering: structure and randomness in planning and reinforcement learning, continuous control with ensemble deep deterministic policy gradients, toddler-inspired active representation learning, and large-scale deep reinforcement learning costs.

    Full text to download in external service

  • Building Knowledge for the Purpose of Lip Speech Identification

    Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

    Full text to download in external service

  • The influence of reinforcement on load carrying capacity and cracking of the reinforced concrete deep beam joint

    The paper presents the results of experimental research of the spatial reinforced concrete deep beam systems orthogonally reinforced and with additional inclined bars. Joint of the deep beams in this research was composed of the longitudinal deep beam with a cantilever suspended at the transversal deep beam. The cantilever deep beam was loaded throughout the depth and the transversal deep beam was loaded at the mid-span by longitudinal...

    Full text to download in external service

  • Real-time speech-rate modification experiments

    Publication

    An algorithm designed for real-time speech time scale modification (stretching) is proposed, providing a combination of typical synchronous overlap and add based time scale modification algorithm and signal redundancy detection algorithms that allow to remove parts of the speech signal and replace them with the stretched speech signal fragments. Effectiveness of signal processing algorithms are examined experimentally together...

    Full text to download in external service

  • Alternative way of the main reinforcement anchorage in reinforced short cantilevers

    Publication

    The results of the experimental investigation of the reinforced corbels and dapped end beams with steel studs were introduced in the work. The efficiency of this type of the reinforcement was compared with the typical reinforcenet applied in reinforced conctrete construcions.

  • Improving Objective Speech Quality Indicators in Noise Conditions

    Publication

    - Year 2020

    This work aims at modifying speech signal samples and test them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

    Full text to download in external service

  • Speech Analytics Based on Machine Learning

    Publication

    In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

    Full text to download in external service

  • Speech synthesis controlled by eye gazing

    Publication

    - Year 2010

    A method of communication based on eye gaze controlling is presented. Investigations of using gaze tracking have been carried out in various context applications. The solution proposed in the paper could be referred to as ''talking by eyes'' providing an innovative approach in the domain of speech synthesis. The application proposed is dedicated to disabled people, especially to persons in a so-called locked-in syndrome who cannot...

  • Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions

    Publication

    The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

    Full text available to download

  • Time-domain prosodic modifications for text-to-speech synthesizer

    Publication

    - Year 2010

    An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.

  • A Method of Real-Time Non-uniform Speech Stretching

    Publication

    Developed method of real-time non-uniform speech stretching is presented.The proposed solution is based on the well-known SOLA algorithm(Synchronous Overlap and Add). Non-uniform time-scale modification isachieved by the adjustment of time scaling factor values in accordance with thesignal content. Dependently on the speech unit (vowels/consonants), instantaneousrate of speech (ROS), and speech signal presence, values of the scalingfactor...

    Full text to download in external service

  • Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

    Publication
    • D. Korzekwa
    • R. Barra-Chicote
    • B. Kostek
    • T. Drugman
    • M. Łajszczak

    - Year 2019

    We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

    Full text available to download

  • Examining Influence of Distance to Microphone on Accuracy of Speech Recognition

    Publication

    The problem of controlling a machine by the distant-talking speaker without a necessity of handheld or body-worn equipment usage is considered. A laboratory setup is introduced for examination of performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...

    Full text to download in external service

  • Comparison of various speech time-scale modificartion methods

    The objective of this work is to investigate the influence of the different time-scale modification (TSM) methods on the quality of the speech stretched up using the designed non-uniform real-time speech time-scale modification algorithm (NU-RTSM). The algorithm provides a combination of the typical TSM algorithm with the vowels, consonants, stutter, transients and silence detectors. Based on the information about the content and...

  • Speech codec enhancements utilizing time compression and perceptual coding

    Publication

    A method for encoding wideband speech signal employing standardized narrowband speech codecs is presented as well as experimental results concerning detection of tonal spectral components. The speech signal sampled with a higher sampling rate than it is suitable for narrowband coding algorithm is compressed in order to decrease the amount of samples. Next, the time-compressed representation of a signal is encoded using a narrowband...

  • Tensor Decomposition for Imagined Speech Discrimination in EEG

    Publication

    - LECTURE NOTES IN COMPUTER SCIENCE - Year 2018

    Most of the researches in Electroencephalogram(EEG)-based Brain-Computer Interfaces (BCI) are focused on the use of motor imagery. As an attempt to improve the control of these interfaces, the use of language instead of movement has been recently explored, in the form of imagined speech. This work aims for the discrimination of imagined words in electroencephalogram signals. For this purpose, the analysis of multiple variables...

    Full text to download in external service