Wyniki wyszukiwania dla: speech synthesis - MOST Wiedzy

Wyszukiwarka

Wyniki wyszukiwania dla: speech synthesis

Wyniki wyszukiwania dla: speech synthesis

  • A comparative study of English viseme recognition methods and algorithm

    An elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...

    Pełny tekst do pobrania w portalu

  • Modeling and Designing Acoustical Conditions of the Interior – Case Study

    The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...

    Pełny tekst do pobrania w portalu

  • POPRAWA OBIEKTYWNYCH WSKAŹNIKÓW JAKOŚCI MOWY W WARUNKACH HAŁASU

    Celem pracy jest modyfikacja sygnału mowy, aby uzyskać zwiększenie poprawy obiektywnych wskaźników jakości mowy po zmiksowaniu sygnału użytecznego z szumem bądź z sygnałem zakłócającym. Wykonane modyfikacje sygnału bazują na cechach mowy lombardzkiej, a w szczególności na efekcie podniesienia częstotliwości podstawowej F0. Sesja nagraniowa obejmowała zestawy słów i zdań w języku polskim, nagrane w warunkach ciszy, jak również w...

    Pełny tekst do pobrania w portalu

  • Playback detection using machine learning with spectrogram features approach

    Publikacja

    This paper presents 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), HAAR wavelets with AdaBoost classifier and deep convolutional neural networks (CNN) were compared on different data partitions in respect...

    Pełny tekst do pobrania w portalu

  • Intelligent multimedia solutions supporting special education needs.

    The role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....

  • Intelligent video and audio applications for learning enhancement

    The role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....

    Pełny tekst do pobrania w portalu

  • Evaluation Criteria for Affect-Annotated Databases

    In this paper a set of comprehensive evaluation criteria for affect-annotated databases is proposed. These criteria can be used for evaluation of the quality of a database on the stage of its creation as well as for evaluation and comparison of existing databases. The usefulness of these criteria is demonstrated on several databases selected from affect computing domain. The databases contain different kind of data: video or still...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Minimizing Distribution and Data Loading Overheads in Parallel Training of DNN Acoustic Models with Frequent Parameter Averaging

    Publikacja

    In the paper we investigate the performance of parallel deep neural network training with parameter averaging for acoustic modeling in Kaldi, a popular automatic speech recognition toolkit. We describe experiments based on training a recurrent neural network with 4 layers of 800 LSTM hidden states on a 100-hour corpora of annotated Polish speech data. We propose a MPI-based modification of the training program which minimizes the...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING

    The algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...

    Pełny tekst do pobrania w portalu

  • Selection of Features for Multimodal Vocalic Segments Classification

    Publikacja

    English speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • A study on signal processing methods applied to hearing aids

    Publikacja

    - Rok 2016

    This paper presents a short survey on current technology available in hearing aids with a focus on digital signal processing techniques used. First, factors influencing the hearing aid effectiveness are introduced. Then, examples of the present DSP methods and strategies are provided. Also, a description of current limitations of hearing aids and future trends of development are shown. Finally, the notion of computational auditory...

  • MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES

    Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and selforganizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

    Pełny tekst do pobrania w portalu

  • Vocalic Segments Classification Assisted by Mouth Motion Capture

    Visual features convey important information for automatic speech recognition (ASR), especially in noisy environment. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose motion capture markers were placed on speakers' faces to obtain lips tracking data during speaking. Different parameterizations strategies were tested...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Modeling and Simulation for Exploring Power/Time Trade-off of Parallel Deep Neural Network Training

    In the paper we tackle bi-objective execution time and power consumption optimization problem concerning execution of parallel applications. We propose using a discrete-event simulation environment for exploring this power/time trade-off in the form of a Pareto front. The solution is verified by a case study based on a real deep neural network training application for automatic speech recognition. A simulation lasting over 2 hours...

    Pełny tekst do pobrania w portalu

  • Examining Feature Vector for Phoneme Recognition / Analiza parametrów w kontekście automatycznej klasyfikacji fonemów

    Publikacja

    - Rok 2017

    The aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists in several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

  • A Device for Measuring Auditory Brainstem Responses to Audio

    Standard ABR devices use clicks and tone bursts to assess subjects’ hearing in an objective way. A new device was developed that extends the functionality of a standard ABR audiometer by collecting and analyzing auditory brainstem responses (ABR). The developed accessory allows for the use of complex sounds (e.g., speech or music excerpts) as stimuli. Therefore, it is possible to find out how efficiently different types of sounds...

    Pełny tekst do pobrania w portalu

  • Secured wired BPL voice transmission system

    Publikacja

    - Scientific Journal of the Military University of Land Forces - Rok 2020

    Designing a secured voice transmission system is not a trivial task. Wired media, thanks to their reliability and resistance to mechanical damage, seem an ideal solution. The BPL (Broadband over Power Line) cable is resistant to electricity stoppage and partial damage of phase conductors, ensuring continuity of transmission in case of an emergency. It seems an appropriate tool for delivering critical data, mostly clear and understandable...

    Pełny tekst do pobrania w portalu

  • Multimedia industrial and medical applications supported by machine learning

    Publikacja

    - Rok 2023

    This article outlines a keynote paper presented at the Intelligent DecisionTechnologies conference providing a part of the KES Multi-theme Conference “Smart Digital Futures” organized in Rome on June 14–16, 2023. It briefly discusses projects related to traffic control using developed intelligent traffic signs and diagnosing the health of wind turbine mechanisms and multimodal biometric authentication for banking branches to provide...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results

    Publikacja

    - Archives of Acoustics - Rok 2019

    The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...

    Pełny tekst do pobrania w portalu

  • Chirp Rate and Instantaneous Frequency Estimation: Application to Recursive Vertical Synchrosqueezing

    Publikacja

    - IEEE SIGNAL PROCESSING LETTERS - Rok 2017

    This letter introduces new chirp rate and instantaneous frequency estimators designed for frequency-modulated signals. These estimators are first investigated from a deterministic point of view, then compared together in terms of statistical efficiency. They are also used to design new recursive versions of the vertically synchrosqueezed short-time Fourier transform, using a previously published method (D. Fourer, F. Auger, and...

    Pełny tekst do pobrania w portalu

  • The Innovative Faculty for Innovative Technologies

    A leaflet describing Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. Multimedia Systems Department described laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization

    Publikacja

    - Rok 2017

    An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...

  • Ultrawideband transmission in physical channels: a broadband interference view

    The superposition of multipath components (MPC) of an emitted wave, formed by reflections from limiting surfaces and obstacles in the propagation area, strongly affects communication signals. In the case of modern wideband systems, the effect should be seen as a broadband counterpart of classical interference which is the cause of fading in narrowband systems. This paper shows that in wideband communications, the time- and frequency-domain...

    Pełny tekst do pobrania w portalu

  • Examining Feature Vector for Phoneme Recognition

    Publikacja

    - Rok 2018

    The aim of this paper is to analyze usability of descriptors coming from music information retrieval to the phoneme analysis. The case study presented consists in several steps. First, a short overview of parameters utilized in speech analysis is given. Then, a set of time and frequency domain-based parameters is selected and discussed in the context of stop consonant acoustical characteristics. A toolbox created for this purpose...

  • Detection and localization of selected acoustic events in acoustic field for smart surveillance applications

    A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The evens are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

    Pełny tekst do pobrania w portalu

  • Quality Evaluation of Novel DTD Algorithm Based on Audio Watermarking

    Publikacja

    Echo cancellers typically employ a doubletalk detection (DTD) algorithm in order to keep the adaptive filter from diverging in the presence of near-end speech signal or other disruptive sounds in the microphone signal. A novel doubletalk detection algorithm based on techniques similar to those used for audio signal watermarking was introduced by the authors. The application of the described DTD algorithm within acoustic echo cancellation...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications

    A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • A Novel Approach to the Assessment of Cough Incidence

    Publikacja

    In this paper we consider the problem of identication of cough events in patients suffering from chronic respiratory diseases. The information about frequency of cough events is necessary to medical treatment. The proposed approach is based on bidirectional processing of a measured vibration signal - cough events are localized by combining the results of forward-time and backward-time analysis. The signal is at rst transformed...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Extracting concepts from the software requirements specification using natural language processing

    Publikacja

    - Rok 2018

    Extracting concepts from the software require¬ments is one of the first step on the way to automating the software development process. This task is difficult due to the ambiguity of the natural language used to express the requirements specification. The methods used so far consist mainly of statistical analysis of words and matching expressions with a specific ontology of the domain in which the planned software will be applicable....

    Pełny tekst do pobrania w serwisie zewnętrznym

  • New approach for determining the QoS of MP3-coded voice signals in IP networks

    Publikacja

    Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

    Pełny tekst do pobrania w portalu

  • Cross-domain applications of multimodal human-computer interfaces

    Publikacja

    - Rok 2015

    Developed multimodal interfaces for education applications and for disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and audio interface for speech stretching for hearing impaired and stuttering people and intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...

  • Subjective and Objective Comparative Study of DAB+ Broadcast System

    Broadcasting services seek to optimize their use of bandwidth in order to maximize user’s quality of experience. They aim to transmit high-quality digital speech and music signals at the lowest bitrate. They intend to offer the best quality under available conditions. Due to bandwidth limitations, audio quality is in conflict with the number of transmitted radio programs. This paper analyzes whether the quality of real-time digital...

    Pełny tekst do pobrania w portalu

  • Analysis of a caustic formed by a spherical reflector: Impact of a caustic on architectural acoustics

    Publikacja

    Focusing sound in rooms intended for listening to music or speech is an acoustic defect. Design recommendations provide remedial steps to effectively prevent this. However, there is a category of objects of high historical or architectural value in which the sound focus correction is limited or even abandoned. This also applies to indoor or outdoor concert shells, installations for teaching and acoustic presentations, etc. The...

    Pełny tekst do pobrania w portalu

  • Impact of the glazed roof on acoustics of historic interiors

    Publikacja

    - Rok 2018

    The paper discusses the adverse acoustic phenomena occurring in the semi-open interiors (courtyards, yards) covered with a glass roof. Particularly negative is the rever-beration noise, which leads to the degradation of the utility functions of the resulting spaces. It involves the drastically reducing the intelligibility of speech, loss of natural sounding of music, problems with the sound system, as well as disturbances in the...

  • Pursuing Analytically the Influence of Hearing Aid Use on Auditory Perception in Various Acoustic Situations

    Publikacja

    - Vibrations in Physical Systems - Rok 2022

    The paper presents the development of a method for assessing auditory perception and the effectiveness of applying hearing aids for hard-of-hearing people during short-term (up to 7 days) and longer-term (up to 3 months) use. The method consists of a survey based on the APHAB questionnaire. Additional criteria such as the degree of hearing loss, technological level of hearing aids used, as well as the user experience are taken...

    Pełny tekst do pobrania w portalu

  • Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network

    Publikacja

    - Journal of the Acoustical Society of America - Rok 2021

    The goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...

    Pełny tekst do pobrania w portalu

  • ANALIZA PARAMETRÓW SYGNAŁU MOWY W KONTEKŚCIE ICH PRZYDATNOŚCI W AUTOMATYCZNEJ OCENIE JAKOŚCI EKSPRESJI ŚPIEWU

    Praca dotyczy podejścia do parametryzacji w przypadku klasyfikacji emocji w śpiewie oraz porównania z klasyfikacją emocji w mowie. Do tego celu wykorzystano bazę mowy i śpiewu nacechowanego emocjonalnie RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), zawierającą nagrania profesjonalnych aktorów prezentujących sześć różnych emocji. Następnie obliczono współczynniki mel-cepstralne (MFCC) oraz wybrane deskryptory...

    Pełny tekst do pobrania w portalu

  • Audio Content and Crowdsourcing: A Subjective Quality Evaluation of Radio Programs Streamed Online

    Publikacja

    - Rok 2023

    Radio broadcasting has been present in our lives for over 100 years. The transmission of speech and music signals accompanies us from an early age. Broadcasts provide the latest information from home and abroad. They also shape musical tastes and allow many artists to share their creativity. Modern distribution involves transmission over a number of terrestrial systems. The most popular are analog FM (Frequency Modulation) and...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • A low complexity double-talk detector based on the signal envelope

    A new algorithm for double-talk detection, intended for use in the acoustic echo canceller for voice communication applications, is proposed. The communication system developed by the authors required the use of a double-talk detection algorithm with low complexity and good accuracy. The authors propose an approach to doubletalk detection based on the signal envelopes. For each of three signals: the far-end speech, the microphone...

    Pełny tekst do pobrania w portalu

  • Engineering Challenges in the Design of Cochlear Implants

    Publikacja

    - Rok 2021

    Hearing aids such as cochlear implants have been used by both adults and children for a long time. In addition, cochlear implants are used by patients who have severe hearing loss either by birth or after an accident. This paper aims to investigate the engineering challenges bounding the design of cochlear implants and present its possible solution...

  • English Language Learning Employing Developments in Multimedia IS

    Publikacja

    In the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Analysis of allophones based on audio signal recordings and parameterization

    The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Discovery of Stylistic Patterns in Business Process Textual Descriptions: IT Ticket Case

    Publikacja

    - Rok 2019

    Growing IT complexity and related problems, which are reflected in IT tickets,create a need for new qualitative approaches. The goal isto automate the extraction of main topics described in tickets in order to provide high quality support for the IT process workers and enablea smooth service delivery to the end user. Present paper proposes a method of knowledge extraction in a form of stylistic patterns in business...

    Pełny tekst do pobrania w portalu

  • The influence of time of hearing aid use on auditory perception in various acoustic situations

    Publikacja

    - Journal of the Acoustical Society of America - Rok 2018

    The assessment of sound perception in hearing aids, especially in the context of benefits that a prosthesis can bring, is a complex issue. The objective parameters of the hearing aids can easily be determined. These parameters, however, do not always have a direct and decisive influence on the subjective assessment of quality of the patient’s hearing while using a hearing aid. The paper presents the development of a method for...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Separability Assessment of Selected Types of Vehicle-Associated Noise

    Music Information Retrieval (MIR) area as well as development of speech and environmental information recognition techniques brought various tools in-tended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Automatic Emotion Recognition in Children with Autism: A Systematic Literature Review

    Publikacja

    - SENSORS - Rok 2022

    The automatic emotion recognition domain brings new methods and technologies that might be used to enhance therapy of children with autism. The paper aims at the exploration of methods and tools used to recognize emotions in children. It presents a literature review study that was performed using a systematic approach and PRISMA methodology for reporting quantitative and qualitative results. Diverse observation channels and modalities...

    Pełny tekst do pobrania w portalu

  • Performance Analysis of the OpenCL Environment on Mobile Platforms

    Publikacja

    Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

    Publikacja

    - Rok 2013

    Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel- Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

  • Study Analysis of Transmission Efficiency in DAB+ Broadcasting System

    Publikacja

    - Rok 2018

    DAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Broadcasting analog and digital radio services does vary, concerning devices on both transmitting and receiving side, as well as content processing mechanisms. However, the biggest difference is...

    Pełny tekst do pobrania w portalu

  • Contactless hearing aid designed for infants

    It is a well known fact that language development through home intervention for a hearing-impaired infant should start in the early months of a newborn baby's life. The aim of this paper is to present a concept of a contactless digital hearing aid designed especially for infants. In contrast to all typical wearable hearing aid solutions (ITC, ITE, BTE), the proposed device is mounted in the infant's bed with any parts of its set-up...

    Pełny tekst do pobrania w portalu