Search results for: speech-to-text technology

Total results: 6623

Showing the top 1000 results

Language Models in Speech Recognition
Publication
- J. Daciuk
- Year 2022
This chapter describes language models used in speech recognition. It starts by indicating the role and the place of language models in speech recognition. Measures used to compare language models follow. An overview of n-gram, syntactic, semantic, and neural models is given. It is accompanied by a list of popular software.

Full text available for download from an external service
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publication
- Year 2016
Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Full text available for download from an external service
Transient detection for speech coding applications
Publication
- International Journal of Computer Science and Network Security - Year 2006
Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

Full text available for download from an external service
Development and Research of the Text Messages Semantic Clustering Methodology
Publication
- N. Rizun
- P. Kapłański
- Y. Taranenko
- Year 2016
The methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...

Full text available for download in the portal
Generating actionable evidence from free-text feedback to improve maternity and acute hospital experiences: A computational text analytics & predictive modelling approach
Publication
- A. Ojo
- N. Rizun
- M. Isazad Mashinchi
- G. Walsh
- J. Gruda
- M. N. Narayana
- M. Venosa
- C. Foley
- D. Rohde
- R. Flynn
- EUROPEAN JOURNAL OF PUBLIC HEALTH - Year 2023
Background Patient experience surveys are a key source of evidence for supporting decision-making and quality improvement in healthcare services. These surveys contain two main types of questions: closed and open-ended, asking about patients’ care experiences. Apart from the knowledge obtained from analysing closed-ended questions, invaluable insights can be gleaned from free-text data. Advanced analytics techniques are increasingly...

Full text available for download from an external service
Improving the quality of speech in the conditions of noise and interference
Publication
- B. Kostek
- K. Kąkol
- Journal of the Acoustical Society of America - Year 2018
The aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...

Full text available for download from an external service
Text Categorization Improvement via User Interaction
Publication
- J. Atroszko
- J. Szymański
- D. Gil
- H. Mora
- Year 2018
In this paper, we propose an approach to improvement of text categorization using interaction with the user. The quality of categorization has been defined in terms of a distribution of objects related to the classes and projected on the self-organizing maps. For the experiments, we use the articles and categories from the subset of Simple Wikipedia. We test three different approaches for text representation. As a baseline we use...

Full text available for download from an external service
Comparative Analysis of Text Representation Methods Using Classification
Publication
- J. Szymański
- CYBERNETICS AND SYSTEMS - Year 2014
In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article—evaluation of approaches to text representation for machine learning tasks—indicates that the text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...

Full text available for download from an external service
Applying the Lombard Effect to Speech-in-Noise Communication
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Electronics - Year 2023
This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

Full text available for download in the portal
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication
- D. Weber
- S. Zaporowski
- D. Korzekwa
- Year 2020
The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
Improved method for real-time speech stretching
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2012
An algorithm for real-time speech stretching is presented. It was designed to modify the input signal depending on its content and on its relation with the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowels/consonants, stuttering detection and SOLA (Synchronous-Overlap-and-Add) based speech stretching algorithm. This approach enables stretching input speech signal...

Full text available for download from an external service
Semantic Analysis and Text Summarization in Socio-Technical Systems
Publication
- N. Rizun
- Year 2018
In this chapter the authors present the results of the development of the methodology for increasing the reliability of the functioning of the Socio-Technical System. The existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of Discriminant and Probabilistic approaches of Latent Semantic Relations analysis in of the summarization projection...

Full text available for download from an external service
Real-time speech-rate modification experiments
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2010
An algorithm designed for real-time speech time scale modification (stretching) is proposed, providing a combination of a typical synchronous overlap-and-add based time scale modification algorithm and signal redundancy detection algorithms that make it possible to remove parts of the speech signal and replace them with the stretched speech signal fragments. The effectiveness of the signal processing algorithms is examined experimentally together...

Full text available for download from an external service
Evaluation of Path Based Methods for Conceptual Representation of the Text
Publication
- Ł. Kucharczyk
- J. Szymański
- Year 2014
Typical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...

Full text available for download from an external service
Interactive Information Search in Text Data Collections
Publication
- Year 2013
This article presents a new idea for retrieval in text repositories and describes the general infrastructure of a system created to implement and test it. The implemented system differs from today’s standard search engines by introducing a process of interactive search with users and data clustering. We present the basic algorithms behind our system and the measures we used for results evaluation. The achieved results...

Full text available for download from an external service
Promocja zasobów Pomorskiej Biblioteki Cyfrowej na przykładzie XVIII-wiecznego rękopisu [Promoting the resources of the Pomeranian Digital Library: the example of an 18th-century manuscript]
Publication
- K. Kokot-Kanikuła
- A. Sobolewska
- Z Badań nad Książką i Księgozbiorami Historycznymi - Year 2022
The aim of the article is to present a way of sharing and promoting manuscript collections, using the example of an 18th-century manuscript by Christian Gabriel Fisher available in the Pomeranian Digital Library (Pomorska Biblioteka Cyfrowa, hereafter: PBC). The manuscript inspired a collaboration between the Gdańsk University of Technology Library and the City Culture Institute (Instytut Kultury Miejskiej) in Gdańsk. Thanks to this joint initiative, work began on the transcription of the German text...

Full text available for download in the portal
Improving Objective Speech Quality Indicators in Noise Conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2020
This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Full text available for download from an external service
Selection of Relevant Features for Text Classification with K-NN
Publication
- Year 2013
In this paper, we describe five feature selection techniques used for text classification. Information gain, independent significance feature test, chi-squared test, odds ratio test, and frequency filtering have been compared according to the text benchmarks based on Wikipedia. For each method we present the results of classification quality obtained on the test datasets using a K-NN based approach. A main advantage of evaluated...

Full text available for download from an external service
Speech Analytics Based on Machine Learning
Publication
- Year 2019
In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Full text available for download from an external service
Speech synthesis controlled by eye gazing
Publication
- A. Czyżewski
- K. Łopatka
- B. Kunka
- R. Rybacki
- B. Kostek
- Year 2010
A method of communication based on eye gaze controlling is presented. Investigations of using gaze tracking have been carried out in various context applications. The solution proposed in the paper could be referred to as ''talking by eyes'' providing an innovative approach in the domain of speech synthesis. The application proposed is dedicated to disabled people, especially to persons in a so-called locked-in syndrome who cannot...
Detecting Lombard Speech Using Deep Learning Approach
Publication
- K. Kąkol
- G. Korvel
- G. Tamulevicius
- B. Kostek
- SENSORS - Year 2023
Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available for download in the portal
A Method of Real-Time Non-uniform Speech Stretching
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2012
A method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by the adjustment of time scaling factor values in accordance with the signal content. Depending on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...

Full text available for download from an external service
Text

Journals

eISSN: 1327-9556
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model's key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available for download in the portal
Examining Influence of Distance to Microphone on Accuracy of Speech Recognition
Publication
- Year 2015
The problem of controlling a machine by the distant-talking speaker without a necessity of handheld or body-worn equipment usage is considered. A laboratory setup is introduced for examination of performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...

Full text available for download from an external service
Comparison of various speech time-scale modification methods
Publication
- A. Kupryjanow
- A. Czyżewski
- Archives of Acoustics - Year 2011
The objective of this work is to investigate the influence of the different time-scale modification (TSM) methods on the quality of the speech stretched up using the designed non-uniform real-time speech time-scale modification algorithm (NU-RTSM). The algorithm provides a combination of the typical TSM algorithm with the vowels, consonants, stutter, transients and silence detectors. Based on the information about the content and...
Speech codec enhancements utilizing time compression and perceptual coding
Publication
- M. Kulesza
- A. Czyżewski
- Year 2007
A method for encoding wideband speech signal employing standardized narrowband speech codecs is presented as well as experimental results concerning detection of tonal spectral components. The speech signal sampled with a higher sampling rate than it is suitable for narrowband coding algorithm is compressed in order to decrease the amount of samples. Next, the time-compressed representation of a signal is encoded using a narrowband...
Tensor Decomposition for Imagined Speech Discrimination in EEG
Publication
- J. S. Garcia Salinas
- L. Villaseñor-Pineda
- C. A. Reyes-García
- A. A. Torres-García
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2018
Most research on electroencephalogram (EEG)-based Brain-Computer Interfaces (BCI) is focused on the use of motor imagery. As an attempt to improve the control of these interfaces, the use of language instead of movement has been recently explored, in the form of imagined speech. This work aims for the discrimination of imagined words in electroencephalogram signals. For this purpose, the analysis of multiple variables...

Full text available for download from an external service
Multimodal English corpus for automatic speech recognition
Publication
- Year 2013
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
Publication
- A. Kupryjanow
- A. Czyżewski
- Diagnostic Pathology - Year 2012
Methods developed for real-time time scale modification (TSM) of speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA algorithm (Synchronous Overlap and Add). Influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing impaired children and elderly listeners. It was shown that for the speech with average rate equal to or...

Full text available for download in the portal
An audio-visual corpus for multimodal automatic speech recognition
Publication
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017
A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available for download in the portal
Machine Learning and Text Analysis in an Artificial Intelligent System for the Training of Air Traffic Controllers
Publication
- T. Shmelova
- Y. Sikirda
- N. Rizun
- V. Lazorenko
- V. Kharchenko
- Year 2020
This chapter presents the application of new information technology in education for the training of air traffic controllers (ATCs). Machine learning, multi-criteria decision analysis, and text analysis as the methods of artificial intelligence for ATCs training have been described. The authors have made an analysis of the International Civil Aviation Organization documents for modern principles of ATCs education. The prototype...

Full text available for download in the portal
Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network
Publication
- A. Wawrzyński
- J. Szymański
- Applied Sciences-Basel - Year 2021
To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...

Full text available for download in the portal
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Year 2022
The aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...

Full text available for download in the portal
Decoding imagined speech for EEG-based BCI
Publication
- C. A. Reyes-García
- A. A. Torres-García
- T. Hernández-del-Toro
- J. S. Garcia Salinas
- L. Villaseñor-Pineda
- Year 2024
Brain–computer interfaces (BCIs) are systems that transform the brain's electrical activity into commands to control a device. To create a BCI, it is necessary to establish the relationship between a certain stimulus, internal or external, and the brain activity it provokes. A common approach in BCIs is motor imagery, which involves imagining limb movement. Unfortunately, this approach allows few commands. As an alternative, this...

Full text available for download from an external service
Two Stage SVM and kNN Text Documents Classifier
Publication
- M. Kępa
- J. Szymański
- Year 2015
The paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...
The Method of a Two-Level Text-Meaning Similarity Approximation of the Customers’ Opinions
Publication
- N. Rizun
- P. Kapłański
- Y. Taranenko
- Studia Ekonomiczne. Zeszyty Naukowe Uniwersytetu Ekonomicznego w Katowicach - Year 2016
The method of two-level text-meaning similarity approximation, consisting in the implementation of the classification of the stages of text opinions of customers and identifying their rank quality level was developed. Proposed and proved the significance of major hypotheses, put as the basis of the developed methodology, notably about the significance of suggestions about the existence of analogies between mathematical bases of...

Full text available for download in the portal
Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
Publication
- Year 2016
The problem of accurately differentiating between the speaker utterance and the noise parts in a speech signal is considered. The influence of utilizing voice activity detection in speech signals on the accuracy of the automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication
- S. Zaporowski
- B. Kostek
- Year 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available for download in the portal
System Supporting Speech Perception in Special Educational Needs Schoolchildren
Publication
- A. Kupryjanow
- P. Suchomski
- P. Odya
- A. Czyżewski
- Year 2012
A system supporting speech perception during classes is presented in the paper. The system is a combination of a portable device, which enables real-time speech stretching, with a workstation designed to perform hearing tests. The system was designed to help children suffering from Central Auditory Processing Disorders.

Full text available for download from an external service
Silence/noise detection for speech and music signals
Publication
- M. Papaj
- Year 2008
This paper introduces a novel off-line algorithm for silence/noise detection in noisy signals. The main concept of the proposed algorithm is to provide noise patterns for further signal processing, i.e., noise reduction for speech enhancement. The algorithm is based on frequency domain characteristics of signals. Examples of different types of noisy signals are presented.
High quality speech codec employing sines+noise+transients model
Publication
- Archives of Acoustics - Year 2006
A method of high quality wideband speech signal representation employing sines+transients+noise model is presented. The need for a wideband speech coding approach as well as various methods for analysis and synthesis of sines, residual and transient states of speech signal is discussed. The perceptual criterion is applied in the proposed approach during encoding of sines amplitudes in order to reduce bandwidth requirements and...

Full text available for download in the portal
Nina Rizun, PhD

People

Department of Informatics in Management

Nina Rizun is an assistant professor at the Faculty of Management and Economics of Gdańsk University of Technology. In October 1999 she obtained a doctoral degree in technical sciences, specializing in enterprise economics and production organization. From 1993 to 2000 she worked at the Faculty of Economic Informatics of the Metallurgical Academy in Dnipro, Ukraine. From 2000 to 2016 she worked at the Faculty of Economic Cybernetics and Mathematical Methods at the Alfred...
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2018
The aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
Thresholding Strategies for Large Scale Multi-Label Text Classifier
Publication
- K. Draszawka
- J. Szymański
- Year 2013
This article presents an overview of thresholding methods for labeling objects given a list of candidate classes’ scores. These methods are essential to multi-label classification tasks, especially when there are a lot of classes which are organized in a hierarchy. Presented techniques are evaluated using the state-of-the-art dedicated classifier on medium scale text corpora extracted from Wikipedia. Obtained results show that the...

Full text available for download from an external service
Text-mining Similarity Approximation Operators for Opinion Mining in BI tools
Publication
- N. Rizun
- P. Kapłański
- Y. Taranenko
- S. Alessandro
- Year 2016
The concept of the Text-mining Similarity Approximation Operators for Opinion Mining as extensions to Natural Language Interface Database is defined. The new operators: “keywords of” dimension; subsetting operator “about C is q”; aggregation operator “by similar C” are proposed. These operators are based on the Latent Semantic Analysis and Social Network Analysis.

Full text available for download in the portal
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publication
- G. Korvel
- O. Kurasova
- B. Kostek
- Year 2019
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...

Full text available for download in the portal
Corrupted speech intelligibility improvement using adaptive filter based algorithm
Publication
- D. Ellwart
- A. Czyżewski
- Year 2010
A technique for improving the quality of speech signals recorded in strong noise is presented. The proposed algorithm employing adaptive filtration is described and additional possibilities of speech intelligibility improvement are discussed. Results of the tests are presented.
Distortion of speech signals in the listening area: its mechanism and measurements
Publication
- H. Lasota
- R. Mazurek
- I. Kochańska
- Year 2014
The paper deals with a problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of the same form and slightly different magnitudes that reach a listener from numerous coherent sources is accompanied by interference effects...

Full text available for download from an external service
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication
- A. Czyżewski
- M. Kawaler
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2019
The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Full text available for download in the portal
