Search results for: automatic speech recognition

The Influence of Selecting Regions from Endoscopic Video Frames on The Efficiency of Large Bowel Disease Recognition Algorithms

Publication

- Year 2012

The article presents our research in the field of the automatic diagnosis of large intestine diseases on endoscopic video. It focuses on the methods of selecting regions of interest from endoscopic video frames for further analysis by specialized disease recognition algorithms. Four methods of selecting regions of interest have been discussed: a. trivial, b. with the deletion of characteristic, endoscope specific additions to the...

Robot Eye Perspective in Perceiving Facial Expressions in Interaction with Children with Autism

Publication

A. Landowska
B. Robins

- Advances in Intelligent Systems and Computing - Year 2020

The paper concerns automatic facial expression analysis applied in a study of natural “in the wild” interaction between children with autism and a social robot. The paper reports a study that analyzed the recordings captured via a camera located in the eye of a robot. Children with autism exhibit a diverse level of deficits, including ones in social interaction and emotional expression. The aim of the study was to explore the possibility...

Full text available for download from an external service

Trustworthy Applications of ML Algorithms in Medicine - Discussion and Preliminary Results for a Problem of Small Vessels Disease Diagnosis.

Publication

M. Ferlin
Z. Klawikowska
J. Niemierko
M. Grzywińska
A. Kwasigroch
E. Szurowska
M. Grochowski

- Year 2022

ML algorithms are very effective tools for medical data analysis, especially in image recognition. Although they cannot be considered stand-alone diagnostic tools, because they are black boxes, they can certainly provide medical support that minimizes the negative effects of human factors. In high-risk domains, not only the correct diagnosis is important, but also the reasoning behind it. Therefore, it is important to focus on trustworthiness...

Full text available for download on the portal

Towards Emotion Acquisition in IT Usability Evaluation Context

Publication

A. Landowska

- Year 2015

The paper concerns the extension of IT usability studies with automatic analysis of the emotional state of a user. Affect recognition methods and emotion representation models are reviewed and evaluated for applicability in usability testing procedures. Accuracy of emotion recognition, susceptibility to disturbances, independence of human will and interference with usability testing procedures are...

Full text available for download from an external service

Potential and Use of the GoogLeNet ANN for the Purposes of Inland Water Ships Classification

Publication

K. Bobkowska
I. Bodus-Olkowska

- Polish Maritime Research - Year 2020

This article presents an analysis of the possibilities of using the pre-degraded GoogLeNet artificial neural network to classify inland vessels. Inland water authorities monitor the intensity of the vessels via CCTV. Such classification seems to be an improvement in their statutory tasks. The automatic classification of the inland vessels from video recording is one of the main objectives of the Automatic Ship Recognition and...

Full text available for download on the portal

Analysis of Image Preprocessing and Binarization Methods for OCR-Based Detection and Classification of Electronic Integrated Circuit Labeling

Publication

- Electronics - Year 2023

Automatic recognition and classification of electronic integrated circuits based on optical character recognition combined with the analysis of the shape of their housings are essential to machine vision methods supporting the production of electronic parts, especially small-volume ones in the through-hole technology, characteristic of printed circuit boards. Since such methods utilize binary images, applying appropriate image...

Full text available for download from an external service

Classifying type of vehicles on the basis of data extracted from audio signal characteristics

Publication

- Journal of the Acoustical Society of America - Year 2017

The aim of this study is to find and optimize a feature vector for automatic recognition of the type of vehicles, extracted from an audio signal. First, the influence of weather-based conditions of road surface on spectral characteristics of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...

Full text available for download from an external service

Playback detection using machine learning with spectrogram features approach

Publication

- Year 2017

This paper presents a 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), Haar wavelets with an AdaBoost classifier, and deep convolutional neural networks (CNN), were compared on different data partitions in respect...

Full text available for download on the portal

Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results

Publication

G. Korvel
O. Kurasova
B. Kostek

- Archives of Acoustics - Year 2019

The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...

Full text available for download on the portal

Analysis of road surface condition and vehicle classes based on parameters extracted from an audio signal

Publication

- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2016

The aim of the research is to search for parameters of a feature vector extracted from an audio signal in the context of automatic recognition of road surface condition and vehicle type. First, the influence of weather conditions on the spectral characteristics of the audio signal recorded near passing vehicles is presented. Next, the audio signal was parameterized and a correlation analysis was carried out in order to...

Full text available for download on the portal

Detection and localization of selected acoustic events in 3D acoustic field for smart surveillance applications

Publication

- Communications in Computer and Information Science - Year 2011

A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

Full text available for download from an external service

Detection and localization of selected acoustic events in acoustic field for smart surveillance applications

Publication

- Multimedia Tools and Applications - Year 2014

A method for automatic determination of position of chosen sound events such as speech signals and impulse sounds in 3-dimensional space is presented. The events are localized in the presence of sound reflections employing acoustic vector sensors. Human voice and impulsive sounds are detected using adaptive detectors based on modified peak-valley difference (PVD) parameter and sound pressure level. Localization based on signals...

Full text available for download on the portal

Multi-Stage Video Analysis Framework

Publication

- Year 2011

The chapter is organized as follows. Section 2 presents the general structure of the proposed framework and a method of data exchange between system elements. Section 3 describes the low-level analysis modules for detection and tracking of moving objects. In Section 4 we present the object classification module. Sections 5 and 6 describe specialized modules for detection and recognition of faces and license plates, respectively....

Full text available for download from an external service

Quality of graphical markers for the needs of eyewear devices

Publication

A. Kwaśniewska
J. Rumiński
J. Klimiuk-Myszk
F. Jérôme
M. Benoît
P. Isabelle

- Year 2015

In this paper we propose to cast the problem of identification of people, objects or places into an application for smart glasses that decodes information from graphical markers. We focus on analyzing different factors that can influence the automatic recognition of information from a code. The research we present aims at reviewing recognition performance as a function of: size of a marker, distance from/to...

Full text available for download from an external service

Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing

Publication

- Journal of the Audio Engineering Society - Year 2020

Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available for download on the portal

Music Mood Visualization Using Self-Organizing Maps

Publication

- Archives of Acoustics - Year 2015

Due to an increasing amount of music being made available in digital form on the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

Full text available for download on the portal

Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review

Publication

K. D. Bartl-Pokorny
P. Uluer
D. E. Barkana
A. Baird
H. Kose
T. Zorcec
B. Robins
B. Schuller
A. Landowska
M. Pykała

- IEEE Access - Year 2021

Children with autism spectrum disorder (ASD) have deficits in the socio-communicative domain and frequently face severe difficulties in the recognition and expression of emotions. Existing literature suggested that children with ASD benefit from robot-based interventions. However, studies varied considerably in participant characteristics, applied robots, and trained skills. Here, we reviewed robot-based interventions targeting...

Full text available for download on the portal

Identification of acoustic event of selected noise sources in a long-term environmental monitoring systems

Publication

M. Kłaczyński
W. Cioch
T. Wszołek
W. Wszołek
D. Mleczko
P. Pawlik
A. Grzeczka

- Year 2014

Undertaking long-term acoustic measurements on sites located near an airport is related to a problem of large quantities of recorded data, which very often represent information not related to flight operations. In such areas, usually defined as zones of limited use, other sources of noise often exist, such as roads or railway lines, treated in this context as acoustic background. Manual verification of such recorded data...

Semi complex navigation with an active optical gesture sensor

Publication

- Year 2016

This paper presents the methods of diversified touchless interactions between a user and a mobile platform utilizing the optical gesture sensor. The sensor uses 8 photodiodes to measure the reflected light in the active mode (using embedded LEDs) or it measures shadows caused by fingers in the passive mode. Several algorithms were implemented: automatic mode switching, adaptive illumination level compensation, resolution improvements...

Full text available for download from an external service

Development and tuning of irregular divide-and-conquer applications in DAMPVM/DAC

Publication

P. Czarnul

- Year 2002

This work presents implementations and tuning experiences with parallel irregular applications developed using the object oriented framework DAMPVM/DAC. It is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime and dynamic mapping to processors taking into account their speeds and even loads by other user processes. New implementations of parallel applications...

Full text available for download from an external service

Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification

Publication

A. Rosner
F. Weninger
B. Schuller
M. Michalak
B. Kostek

- Year 2013

We present a comprehensive evaluation of the influence of 'harmonic' and rhythmic sections contained in an audio file on automatic music genre classification. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...

Performance Analysis of the OpenCL Environment on Mobile Platforms

Publication

- Year 2022

Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text available for download from an external service

Separability Assessment of Selected Types of Vehicle-Associated Noise

Publication

- Advances in Intelligent Systems and Computing - Year 2016

The Music Information Retrieval (MIR) area, as well as the development of speech and environmental information recognition techniques, has brought various tools intended for recognizing low-level features of acoustic signals based on a set of calculated parameters. In this study, the MIRtoolbox MATLAB tool, designed for music parameter extraction, is used to obtain a vector of parameters to check whether they are suitable for separation of...

Full text available for download from an external service

The Hough transform in the classification process of inland ships

Publication

K. Bobkowska
N. Wawrzyniak

- Zeszyty Naukowe Akademii Morskiej w Szczecinie - Year 2019

This article presents an analysis of the possibilities of using image processing methods for feature extraction that allows kNN classification based on a ship’s image delivered from an on-water video surveillance system. The subject of the analysis is the Hough transform which enables the detection of straight lines in an image. The recognized straight lines and the information about them serve as features in the classification...

Full text available for download on the portal

Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

Publication

K. Leckey
R. Neininger
W. Szpankowski

- Year 2013

Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election algorithms to distributed hashing...

Detecting Apples in the Wild: Potential for Harvest Quantity Estimation

Publication

A. Janowski
R. Kaźmierczak
C. Kowalczyk
J. Szulwic

- Sustainability - Year 2021

Knowing the exact number of fruits and trees helps farmers to make better decisions in their orchard production management. The current practice of crop estimation often involves manual counting of fruits (before harvesting), which is an extremely time-consuming and costly process. Additionally, this is not practicable for large orchards. Thanks to the changes that have taken place in recent years in the field of image...

Full text available for download on the portal

Audio content analysis in the urban area telemonitoring system

Publication

- Year 2010

The article presents the possibilities of extending urban monitoring with automatic audio analysis. Sound parameterization methods that can be applied in such a system are presented, and technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. This system recognizes dangerous sounds (gunshot, broken glass, scream) among the recorded sounds...

Full text available for download from an external service

Smart Virtual Bass Synthesis Algorithm Based on Music Genre Classification

Publication

- Year 2014

The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) algorithms applied to portable computers. The proposed algorithm employed automatic music genre recognition to determine the optimum parameters for the synthesis of additional frequencies. The synthesis was carried out using the non-linear device (NLD) and phase vocoder (PV) methods depending on the music excerpt genre. Classification of musical...

Selection of an artificial pre-training neural network for the classification of inland vessels based on their images

Publication

K. Bobkowska
I. Bodus-Olkowska

- Zeszyty Naukowe Akademii Morskiej w Szczecinie - Year 2021

Artificial neural networks (ANN) are the most commonly used algorithms for image classification problems. An image classifier takes an image or video as input and classifies it into one of the possible categories that it was trained to identify. They are applied in various areas such as security, defense, healthcare, biology, forensics, communication, etc. There is no need to create one’s own ANN because there are several pre-trained...

Full text available for download on the portal

Video content analysis in the urban area telemonitoring system

Publication

- Year 2010

The task of constant monitoring of video streams from a large number of cameras and reviewing the recordings in order to find a specified event requires a considerable amount of time and effort from the system operators and it is prone to errors. A solution to this problem is an automatic system for constant analysis of camera images being able to raise an alarm if a predefined event is detected. The chapter presents various aspects...

Full text available for download from an external service

Computer-assisted pronunciation training—Speech synthesis is almost all you need

Publication

D. Korzekwa
J. Lorenzo-Trueba
T. Drugman
B. Kostek

- Speech Communication - Year 2022

The research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...

Full text available for download on the portal

Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement

Publication

G. Korvel
K. Kąkol
O. Kurasova
B. Kostek

- IEEE Access - Year 2020

The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available for download on the portal

A Hybrid Method for Objective Quality Assessment of Binary Images

Publication

- IEEE Access - Year 2023

In the paper, a novel hybrid method for an automatic quality assessment of binary images is proposed that may be useful, e.g., for computationally limited embedded systems or Optical Character Recognition applications. Since the quality of binary images used as the input for further image analysis strongly influences the obtained results, a reliable evaluation of their quality is a crucial element for the validation of such...

Full text available for download from an external service

Using Convolutional Neural Networks for Corneal Arcus Detection Towards Familial Hypercholesterolemia Screening

Publication

T. Kocejko
J. Rumiński
M. Mazur-Milecka
M. Romanowska-Kocejko
K. Chlebus
J. Kang-Hyun

- Journal of King Saud University-Computer and Information Sciences - Year 2022

Familial hypercholesterolemia (FH) is a highly undiagnosed disease. Among FH patients, the onset of premature coronary artery disease is 13 times higher than in the general population. Early diagnosis and treatment is essential to prevent cardiovascular diseases and their complications, and to prolong life. One of the clinical criteria of FH is the occurrence of a corneal arcus (CA) among patients, especially those under 45 years...

Full text available for download on the portal

Visual Features for Endoscopic Bleeding Detection

Publication

A. Brzeski

- Current Journal of Applied Science and Technology (British Journal of Applied Science & Technology) - Year 2014

Aims: To define a set of high-level visual features of endoscopic bleeding and evaluate their capabilities for potential use in automatic bleeding detection. Study Design: Experimental study. Place and Duration of Study: Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, between March 2014 and May 2014. Methodology: The features have...

Full text available for download on the portal

Diagnosis of Malignant Melanoma by Neural Network Ensemble-Based System Utilising Hand-Crafted Skin Lesion Features

Publication

- Metrology and Measurement Systems - Year 2019

Malignant melanomas are the most deadly type of skin cancer but, when detected early, have high chances of successful treatment. In the last twenty years, interest in automated melanoma recognition, detection, and classification has increased dynamically, partially because of the appearance of public datasets with dermatoscopic images of skin lesions. Automated computer-aided skin cancer detection in dermatoscopic images is a very challenging task...

Full text available for download on the portal

Speech Intelligibility Measurements in Auditorium

Publication

K. Leo

- Acta Physica Polonica A - Year 2010

Speech intelligibility was measured in Auditorium Novum at the Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time (EDT) coefficients were measured. The negative contribution of noise to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in the auditorium. Correlation was found between...

Full text available for download on the portal

Transient detection for speech coding applications

Publication

- International Journal of Computer Science and Network Security - Year 2006

Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

Full text available for download from an external service

Improving the quality of speech in the conditions of noise and interference

Publication

B. Kostek
K. Kąkol

- Journal of the Acoustical Society of America - Year 2018

The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...

Full text available for download from an external service

Constructing a Dataset of Speech Recordings with Lombard Effect

Publication

D. Weber
S. Zaporowski
D. Korzekwa

- Year 2020

The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...

Improved method for real-time speech stretching

Publication

- Year 2012

An algorithm for real-time speech stretching is presented. It was designed to modify the input signal depending on its content and on its relation to the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

Full text available for download from an external service

The Impact of Weather on Traffic Speed in Urban Area

Publication

J. Chmielewski
M. Budzyński

- IOP Conference Series: Materials Science and Engineering - Year 2019

The impact of weather conditions on the trip speed of vehicles has been studied for a long time and is still the subject of many scientific studies. The impact of atmospheric conditions on the speed at which drivers drive their vehicles seems to be obvious. Good weather conditions, sunny weather with good visibility, surely provoke higher speeds, while rainfall, wind...

Full text available for download on the portal

Real-time speech-rate modification experiments

Publication

- Year 2010

An algorithm designed for real-time speech time scale modification (stretching) is proposed, providing a combination of a typical synchronous overlap and add based time scale modification algorithm and signal redundancy detection algorithms that allow parts of the speech signal to be removed and replaced with the stretched speech signal fragments. The effectiveness of the signal processing algorithms is examined experimentally together...

Full text available for download from an external service

Improving Objective Speech Quality Indicators in Noise Conditions

Publication

K. Kąkol
G. Korvel
B. Kostek

- Year 2020

This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Full text available for download from an external service

Detecting Lombard Speech Using Deep Learning Approach

Publication

K. Kąkol
G. Korvel
G. Tamulevicius
B. Kostek

- Sensors - Year 2023

Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available for download on the portal

Speech synthesis controlled by eye gazing

Publication

- Year 2010

A method of communication based on eye gaze control is presented. Investigations of using gaze tracking have been carried out in various application contexts. The solution proposed in the paper could be referred to as "talking by eyes", providing an innovative approach in the domain of speech synthesis. The proposed application is dedicated to disabled people, especially to persons with so-called locked-in syndrome who cannot...

Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions

Publication

- Sensors - Year 2021

The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

Full text available for download on the portal

Time-domain prosodic modifications for text-to-speech synthesizer

Publication

- Year 2010

An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.

A Method of Real-Time Non-uniform Speech Stretching

Publication

- Year 2012

A developed method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by the adjustment of time scaling factor values in accordance with the signal content. Depending on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...

Full text available for download from an external service
