Search results for: text-to-speech transcription


Total: 1419 (showing the top 1000 results)

Visual Lip Contour Detection for the Purpose of Speech Recognition
Publication
- Year 2014
A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from a mouth region. Different Active Appearance Models are employed for finding lips in video frames and for lip shape and texture statistical description. Search initialization procedure is proposed and error measure values are...
Objectivization of phonological evaluation of speech elements by means of audio parametrization
Publication
- Year 2018
This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
Third Text

Journals

ISSN: 0952-8822 , eISSN: 1475-5297
Social Text

Journals

ISSN: 0164-2472 , eISSN: 1527-1951
Word and Text

Journals

ISSN: 2069-9271
Text & Talk

Journals

ISSN: 1860-7330 , eISSN: 1860-7349
Text Mining Algorithms for Extracting Brand Knowledge; The fashion Industry Case
Publication
- N. Rizun
- W. Kucharska
- Year 2018
Brand knowledge is determined by customer knowledge. The opportunity to develop brands based on customer knowledge management has never been greater. Social media as a set of leading communication platforms enable peer to peer interplays between customers and brands. A large stream of such interactions is a great source of information which, when thoroughly analyzed, can become a source of innovation and lead to competitive advantage....

Full text available for download in the portal
Elimination of clicks from archive speech signals using sparse autoregressive modeling
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2012
This paper presents a new approach to elimination of impulsive disturbances from archive speech signals. The proposed sparse autoregressive (SAR) signal representation is given in a factorized form - the model is a cascade of the so-called formant filter and pitch filter. Such a technique has been widely used in code-excited linear prediction (CELP) systems, as it guarantees model stability. After detection of noise pulses using linear...

Full text available for download from an external service
A Text as a Set of Research Data. A Number of Aspects of Data Acquisition and Creation of Datasets in Neo-Latin Studies
Publication
- Year 2022
In this paper, the authors, who specialise in part in neo-Latin studies and the history of early modern education, share their experiences of collecting sources for Open Research Data sets under the Bridge of Data project. On the basis of inscription texts from St. Mary’s Church in Gdańsk, they created 29 Open Research Data sets. In turn, the text of the lectures of the Gdańsk scholar Michael Christoph Hanow, Praecepta de arte...

Full text available for download in the portal
Endoscopic Videos Deinterlacing and On-Screen Text and Light Flashes Removal and Its Influence on Image Analysis Algorithms' Efficiency
Publication
- International Journal of Image Processing and Visual Communication - Year 2013
In this article, methods for deinterlacing endoscopic video images and for removing on-screen text and light flashes from them are discussed. The research is intended to improve disease recognition algorithms' performance. In the article, four configurations of deinterlacing methods and another four configurations of text and flashes removal methods are described and examined. The efficiency of endoscopic video analysis algorithms is measured...

Full text available for download from an external service
Hybrid of Neural Networks and Hidden Markov Models as a modern approach to speech recognition systems
Publication
- P. Sokólski
- T. A. Rutkowski
- Pomiary Automatyka Robotyka - Year 2013
The aim of this paper is to present a hybrid algorithm that combines the advantages of artificial neural networks and hidden Markov models in speech recognition for control purposes. The scope of the paper includes review of currently used solutions, description and analysis of implementation of selected artificial neural network (NN) structures and hidden Markov models (HMM). The main part of the paper consists of a description...

Full text available for download in the portal
Overview of the Virtual Transcription Laboratory Usage Scenarios and Architecture
Publication
- A. Dudczak
- M. Dudziński
- C. Mazurek
- P. Smoczyk
- Year 2014
Full text available for download from an external service
Human-computer interactions in speech therapy using a blowing interface
Publication
- Year 2014
In this paper we present a new human-computer interface for the quantitative measurement of blowing activities. The interface can measure the air flow and air pressure during the blowing activity. The measured values are stored and used to control the state of the graphical objects in the graphical user interface. In speech therapy children will find easier to play attractive therapeutic games than to perform repetitive and tedious,...

Full text available for download from an external service
Enabling Deeper Linguistic-based Text Analytics – Construct Development for the Criticality of Negative Service Experience
Publication
- A. Ojo
- N. Rizun
- IEEE Access - Year 2019
Significant progress has been made in linguistic-based text analytics particularly with the increasing availability of data and deep learning computational models for more accurate opinion analysis and domain-specific entity recognition. In understanding customer service experience from texts, analysis of sentiments associated with different stages of the service lifecycle is a useful starting point. However, when richer insights...

Full text available for download in the portal
Speech and Drama

Journals

ISSN: 0038-7142
LANGUAGE AND SPEECH

Journals

ISSN: 0023-8309 , eISSN: 1756-6053
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication
- Electronics - Year 2022
Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...

Full text available for download in the portal
Clinical situations text database for Polish language
Research Data
open access
- A. Czyżewski
- D. Szplit
- J. Bogdan
- B. Graff
- K. Narkiewicz
- K. Marciniuk
- A. Harasimiuk
- P. Odya
- series: ADMEDVOICE
Dataset contains a database of anonymized texts in Polish for the purposes of building a medical speech corpus, for clinical situations in the following areas: medical interview, interview and description of the result of an oncological examination, description of a radiological examination, description of a pathomorphological examination, description...
Towards Effective Processing of Large Text Collections
Publication
- J. Szymański
- H. Krawczyk
- Year 2012
In the article we describe the approach to parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles related with links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...
Text Documents Classification with Support Vector Machines
Publication
- P. Majewski
- Year 2008
Parallel Computations of Text Similarities for Categorization Task
Publication
- J. Szymański
- Year 2013
In this chapter we describe an approach to the parallel computation of similarities in high-dimensional spaces. The similarity computations have been used for textual data categorization. We created the test datasets from Wikipedia articles, which together with their hyperlinks formed a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data with the k-means algorithm....
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication
- A. Czyżewski
- M. Kawaler
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2019
The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Full text available for download in the portal
Study on Speech Transmission under Varying QoS Parameters in a OFDM Communication System
Publication
- M. Zamłyńska
- P. Falkowski-Gilski
- G. Debita
- B. Miedziński
- Year 2021
Although there has been an outbreak of multiple multimedia platforms worldwide, speech communication is still the most essential and important type of service. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission, based most often on multi-valued modulations, multiple...

Full text available for download from an external service
Machine Learning and Text Analysis in an Artificial Intelligent System for the Training of Air Traffic Controllers
Publication
- T. Shmelova
- Y. Sikirda
- N. Rizun
- V. Lazorenko
- V. Kharchenko
- Year 2020
This chapter presents the application of new information technology in education for the training of air traffic controllers (ATCs). Machine learning, multi-criteria decision analysis, and text analysis as the methods of artificial intelligence for ATCs training have been described. The authors have made an analysis of the International Civil Aviation Organization documents for modern principles of ATCs education. The prototype...

Full text available for download in the portal
Artur Gańcza dr inż.

People

Department of Signals and Systems (Katedra Sygnałów i Systemów), WETI

I received the M.Sc. degree from the Gdańsk University of Technology (GUT), Gdańsk, Poland, in 2019. I am currently a Ph.D. student at GUT, with the Department of Automatic Control, Faculty of Electronics, Telecommunications and Informatics. My professional interests include speech recognition, system identification, adaptive signal processing and linear algebra.
Transfer learning in imagined speech EEG-based BCIs
Publication
- J. S. Garcia Salinas
- L. Villaseñor-Pineda
- C. A. Reyes-García
- A. A. Torres-García
- Biomedical Signal Processing and Control - Year 2019
The Brain–Computer Interfaces (BCI) based on electroencephalograms (EEG) are systems which aim is to provide a communication channel to any person with a computer, initially it was proposed to aid people with disabilities, but actually wider applications have been proposed. These devices allow to send messages or to control devices using the brain signals. There are different neuro-paradigms which evoke brain signals of interest...

Full text available for download in the portal
Estimation of the short-term predictor parameters of speech under noisy conditions
Publication
- M. Kuropatwinski
- W. Kleijn
- M. Kuropatwiński
- IEEE Transactions on Audio Speech and Language Processing - Year 2006
Full text available for download from an external service
Estimation of the excitation variances of speech and noise AR-models for enhanced speech coding
Publication
- M. Kuropatwinski
- W. Kleijn
- M. Kuropatwiński
- Year 2001
Full text available for download from an external service
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication
- K. Kąkol
- Year 2023
The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...

Full text available for download in the portal
Text Technology: A Journal of computer Text Processing

Journals

ISSN: 1496-0958
The role of Snail1 transcription factor in colorectal cancer progression and metastasis
Publication
- M. Brzozowa
- M. Michalski
- G. Wyrobiec
- A. Piecuch
- A. Dittfeld
- M. Harabin-Słowińska
- D. Boroń
- R. Wojnicz
- Contemporary Oncology/Współczesna Onkologia - Year 2015
Full text available for download from an external service
Multiple transcriptional factors regulate transcription of the rpoE gene in Escherichia coli under different growth conditions and when the lipopolysaccharide biosynthesis is defective.
Publication
- G. Klein-Raina
- A. Stupak
- D. Biernacka
- P. Wojtkiewicz
- B. Lindner
- S. Raina
- JOURNAL OF BIOLOGICAL CHEMISTRY - Year 2016
The RpoE sigma factor is essential for the viability of Escherichia coli. RpoE regulates extracytoplasmic functions including lipopolysaccharide (LPS) translocation and some of its non-stoichiometric modifications. Transcription of the rpoE gene is positively autoregulated by EσE and by unknown mechanisms that control the expression of its distally located promoter(s). Mapping of 5′ ends of rpoE mRNA identified five new transcriptional...

Full text available for download in the portal
Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System
Publication
- P. Falkowski-Gilski
- G. Debita
- M. Habrych
- B. Miedziński
- P. Jedlikowski
- B. Polnik
- J. Wandzio
- X. Wang
- Year 2020
The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...

Full text available for download from an external service
Noise profiling for speech enhancement employing machine learning models
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Year 2022
This paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...

Full text available for download in the portal
Developing a Low SNR Resistant, Text Independent Speaker Recognition System for Intercom Solutions - A Case Study
Publication
- Year 2024
This article presents a case study on the development of a biometric voice verification system for an intercom solution, utilizing the DeepSpeaker neural network architecture. Despite the variety of solutions available in the literature, there is a noted lack of evaluations for "text-independent" systems under real conditions and with varying distances between the speaker and the microphone. This article aims to bridge this gap....

Full text available for download in the portal
Intra-subject class-incremental deep learning approach for EEG-based imagined speech recognition
Publication
- J. S. Garcia Salinas
- A. A. Torres-García
- C. A. Reyes-García
- L. Villaseñor-Pineda
- Biomedical Signal Processing and Control - Year 2023
Brain–computer interfaces (BCIs) aim to decode brain signals and transform them into commands for device operation. The present study aimed to decode the brain activity during imagined speech. The BCI must identify imagined words within a given vocabulary and thus perform the requested action. A possible scenario when using this approach is the gradual addition of new words to the vocabulary using incremental learning methods....

Full text available for download from an external service
Intelligent processing of stuttered speech.
Publication
- A. Czyżewski
- A. Kaczmarek
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2003
The article presents several methods for analyzing and automatically counting the articulation errors associated with stuttering, based on learning algorithms: artificial neural networks and rough sets.
External Validation Measures for Nested Clustering of Text Documents
Publication
- K. Draszawka
- J. Szymański
- Year 2011
Abstract. This article handles the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like Rand statistics, Hubert Γ statistic or F-measure are not applicable in nested clustering cases. Additionally to the work, where F-measure was adopted to hierarchical classification as hF-measure, here some methods to...
Towards facts extraction from text in Polish language
Publication
- T. M. Boiński
- A. Chojnowski
- Year 2017
Natural Language Processing (NLP) finds many usages in different fields of endeavor. Many tools exists allowing analysis of English language. For Polish language the situation is different as the language itself is more complicated. In this paper we show differences between NLP of Polish and English language. Existing solutions are presented and TEAMS software for facts extraction is described. The paper shows also evaluation of...

Full text available for download in the portal
Text categorization with semantic commonsense knowledge: First results
Publication
- P. Majewski
- J. Szymański
- Year 2008
Text processing typically relies on the bag-of-words (BOW) representation. This approach does not give good results, however, when similar documents share no words. The article presents an approach to constructing a kernel function for SVM classifiers based on an external knowledge base of linguistic concepts.
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
Publication
- P. Falkowski-Gilski
- G. Debita
- Archives of Acoustics - Year 2023
In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Full text available for download in the portal
Comparison of Language Models Trained on Written Texts and Speech Transcripts in the Context of Automatic Speech Recognition
Publication
- S. Dziadzio
- A. Nabożny
- A. Smywiński-Pohl
- B. Ziółko
- Year 2015
Full text available for download from an external service
Expression of Selected Epithelial-Mesenchymal Transition Transcription Factors in Endometrial Cancer
Publication
- P. Sadłecki
- J. Jóźwicki
- P. Antosik
- M. Walentowicz-Sadłecka
- Biomed Research International - Year 2020
Full text available for download from an external service
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publication
- Year 2014
The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
Text und Kontext

Journals

ISSN: 0105-7014
Post-Colonial Text

Journals

ISSN: 1705-9100
Text & Performance Quarterly

Journals

ISSN: 1046-2937 , eISSN: 1479-5760
Text: Kritische Beiträge

Journals

ISSN: 1420-1496
Text und Kritik

Journals

ISSN: 0040-5329
