Search results for: NEURAL TEXT-TO-SPEECH MULTILINGUAL SYNTHESIS VOICE CONVERSION SYNTHETIC DATA NORMALISING FLOWS

total: 11094

displaying 1000 best results

Agile Commerce in the light of Text Mining
Publication
- A. Baj-Rogowska
- Przedsiębiorczość i Zarządzanie - Year 2017
The survey conducted for this study reveals that more than 84% of respondents have never encountered the term “agile commerce” and do not understand its meaning. At the same time, they are active participants of this strategy. Using digital channels as customers more often than ever before, they have already been included in the agile philosophy. Based on the above, the purpose of the study is to analyse major text sets containing...

Full text available to download
Ontology-based text convolution neural network (TextCNN) for prediction of construction accidents
Publication
- S. Donghui
- L. Zhigang
- J. Zurada
- A. Manikas
- J. Guan
- P. Weichbroth
- KNOWLEDGE AND INFORMATION SYSTEMS - Year 2024
The construction industry suffers from workplace accidents, including injuries and fatalities, which represent a significant economic and social burden for employers, workers, and society as a whole. The existing research on construction accidents heavily relies on expert evaluations, which often suffer from issues such as low efficiency, insufficient intelligence, and subjectivity. However, expert opinions provided in construction...

Full text to download in external service
Deep neural networks for data analysis 27/28
e-Learning Courses
- K. Draszawka
Deep neural networks for data analysis 25/26
e-Learning Courses
- K. Draszawka
Deep neural networks for data analysis 26/27
e-Learning Courses
- K. Draszawka
Automated detection of pronunciation errors in non-native English speech employing deep learning
Publication
- D. Korzekwa
- Year 2023
Despite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...

Full text available to download
BPL-PLC Voice Communication System for the Oil and Mining Industry
Publication
- G. Debita
- P. Falkowski-Gilski
- M. Habrych
- G. Wiśniewski
- B. Miedziński
- P. Jedlikowski
- A. Waniewska
- J. Wandzio
- B. Polnik
- ENERGIES - Year 2020
Application of high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant means of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...

Full text available to download
EU-Turkey Customs Union and Bilateral Foreign Direct Investment Flows
Publication
- A. Marszk
- Year 2016
The main aim of this text is to present the effects of the customs union between the European Union and Turkey on bilateral FDI flows in light of the theory of linkages between economic integration and FDI flows. The first section of the text is a survey of the main theoretical links between economic integration and FDI flows. The second section focuses on the history and scope of the customs union. The third and fourth sections are empirical and...

Full text to download in external service
Knowledge Flows in Cluster Organizations
Publication
- A. Lis
- M. Zięba
- Year 2019
This paper aims to identify knowledge flows in cluster organizations (COs). On the basis of a literature analysis on knowledge flows and cluster organizations, the following research question was formulated: What kind of knowledge flows can be identified in cluster organizations and what are their main characteristics? The paper is based on a literature analysis and Grounded Theory methodology, examining four purposefully selected...

Full text to download in external service
INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY
Publication
- K. Marciniuk
- B. Kostek
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2018
In recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...
Bożena Kostek prof. dr hab. inż.

People

Laboratorium Akustyki Fonicznej
Comprehensive Evaluation of Statistical Speech Waveform Synthesis
Publication
- T. Merritt
- B. Putrycz
- A. Nadolski
- T. Ye
- D. Korzekwa
- W. Dolecki
- T. Drugman
- V. Klimkov
- A. Moinet
- A. Breen... and 3 others
- Year 2018
Full text to download in external service
Text Categorization Improvement via User Interaction
Publication
- J. Atroszko
- J. Szymański
- D. Gil
- H. Mora
- Year 2018
In this paper, we propose an approach to improvement of text categorization using interaction with the user. The quality of categorization has been defined in terms of a distribution of objects related to the classes and projected on the self-organizing maps. For the experiments, we use the articles and categories from the subset of Simple Wikipedia. We test three different approaches for text representation. As a baseline we use...

Full text to download in external service
Training of Deep Learning Models Using Synthetic Datasets
Publication
- Z. Kowalczuk
- J. Glinko
- Year 2022
In order to solve increasingly complex problems, the complexity of Deep Neural Networks also needs to be constantly increased, and therefore training such networks requires more and more data. Unfortunately, obtaining such massive real world training data to optimize neural networks parameters is a challenging and time-consuming task. To solve this problem, we propose an easy-to-use and general approach to training deep learning...

Full text to download in external service
Communication Platform for Evaluation of Transmitted Speech Quality
Publication
- A. Ciarkowski
- A. Czyżewski
- Journal of Telecommunications and Information Technology - Year 2011
A voice communication system designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessments by listening tests or, objective evaluation employing...

Full text available to download
Examining Influence of Distance to Microphone on Accuracy of Speech Recognition
Publication
- Year 2015
The problem of controlling a machine by the distant-talking speaker without a necessity of handheld or body-worn equipment usage is considered. A laboratory setup is introduced for examination of performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...

Full text to download in external service
Detecting Lombard Speech Using Deep Learning Approach
Publication
- K. Kąkol
- G. Korvel
- G. Tamulevicius
- B. Kostek
- SENSORS - Year 2023
Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available to download
Developing a Low SNR Resistant, Text Independent Speaker Recognition System for Intercom Solutions - A Case Study
Publication
- Year 2024
This article presents a case study on the development of a biometric voice verification system for an intercom solution, utilizing the DeepSpeaker neural network architecture. Despite the variety of solutions available in the literature, there is a noted lack of evaluations for "text-independent" systems under real conditions and with varying distances between the speaker and the microphone. This article aims to bridge this gap....

Full text available to download
Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network
Publication
- A. Wawrzyński
- J. Szymański
- Applied Sciences-Basel - Year 2021
To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...

Full text available to download
Comparative Analysis of Text Representation Methods Using Classification
Publication
- J. Szymański
- CYBERNETICS AND SYSTEMS - Year 2014
In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article—evaluation of approaches to text representation for machine learning tasks—indicates that the text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...

Full text to download in external service
Neural network training with limited precision and asymmetric exponent
Publication
- M. Blok
- M. Pietrołaj
- Journal of Big Data - Year 2022
Along with an extremely increasing number of mobile devices, sensors and other smart utilities, an unprecedented growth of data can be observed in today’s world. In order to address multiple challenges facing the big data domain, machine learning techniques are often leveraged for data analysis, filtering and classification. Wide usage of artificial intelligence with large amounts of data creates growing demand not only for storage...

Full text available to download
Prioritising national healthcare service issues from free text feedback – A computational text analysis & predictive modelling approach
Publication
- A. Ojo
- N. Rizun
- G. Walsh
- M. I. Mashinchi
- M. Venosa
- M. N. Rao
- DECISION SUPPORT SYSTEMS - Year 2024
Patient experience surveys have become a key source of evidence for supporting decision-making and continuous quality improvement within healthcare services. To harness free-text feedback collected as part of these surveys for additional insights, text analytics methods are increasingly employed when the data collected is not amenable to traditional qualitative analysis due to volume. However, while text analytics techniques offer...

Full text available to download
Applying the Lombard Effect to Speech-in-Noise Communication
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Electronics - Year 2023
This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

Full text available to download
Generating actionable evidence from free-text feedback to improve maternity and acute hospital experiences: A computational text analytics & predictive modelling approach
Publication
- A. Ojo
- N. Rizun
- M. Isazad Mashinchi
- G. Walsh
- J. Gruda
- M. N. Narayana
- M. Venosa
- C. Foley
- D. Rohde
- R. Flynn
- EUROPEAN JOURNAL OF PUBLIC HEALTH - Year 2023
Background Patient experience surveys are a key source of evidence for supporting decision-making and quality improvement in healthcare services. These surveys contain two main types of questions: closed and open-ended, asking about patients’ care experiences. Apart from the knowledge obtained from analysing closed-ended questions, invaluable insights can be gleaned from free-text data. Advanced analytics techniques are increasingly...

Full text to download in external service
A non-uniform real-time speech time-scale stretching method
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2011
An algorithm for non-uniform real-time speech stretching is presented. It combines the typical SOLA algorithm (Synchronous Overlap and Add) with vowel, consonant and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets
Publication
- Year 2008
The aim of the research is the automatic recognition of singing voices with respect to voice type and the technical quality of singing. The article describes the voice database that was created, which contains voice samples of professional and amateur singers. It then describes parameters defined on the basis of biomechanical phenomena occurring in the vocal organ during singing. Based on the resulting parameter matrices, automatic classifiers were trained and compared...
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
Publication
- D. Korzekwa
- R. Barra-Chicote
- S. Zaporowski
- G. Beringer
- J. Lorenzo-trueba
- A. Serafinowicz
- J. Droppo
- T. Drugman
- B. Kostek
- Year 2021
This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available to download
Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System
Publication
- P. Falkowski-Gilski
- G. Debita
- M. Habrych
- B. Miedziński
- P. Jedlikowski
- B. Polnik
- J. Wandzio
- X. Wang
- Year 2020
The broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...

Full text to download in external service
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication
- D. Weber
- S. Zaporowski
- D. Korzekwa
- Year 2020
The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model's key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available to download
Two-step Conversion of Crude Glycerol Generated by Biodiesel Production into Biopolyols: Synthesis, Structural and Physical Chemical Characterization
Publication
- A. Hejna
- P. Kosmela
- M. Klein
- K. Formela
- M. Kopczyńska
- J. T. Haponiuk
- Ł. Piszczyk
- JOURNAL OF POLYMERS AND THE ENVIRONMENT - Year 2018
In this work biopolyols were synthesized via two-step process from crude glycerol and castor oil. For better evaluation of analyzed process, the impact of its time and temperature on the structure and properties of biopolyols was determined. Obtained results fully justified conducting of synthesis in two steps. Prepared materials were characterized by hydroxyl value and water content comparable to polyols industrially applied in...

Full text to download in external service
Mowa nienawiści (hate speech) a odpowiedzialność dostawców usług internetowych w orzecznictwie sądów europejskich
Publication
- K. Kowalik-Bańczyk
- Year 2015
The article analyses the phenomenon of hate speech on the Internet contrasted with the problem of the responsibility of Internet Service Providers for cases of such abuses of freedom of expression. The text provides an analysis of the jurisprudence of two European Courts. On the one hand it presents the position of the European Court of Human Rights on the problem of hate speech: its definition and the liability for it as an exception...
Subjective Quality Evaluation of Underground BPL-PLC Voice Communication System
Publication
- G. Debita
- P. Falkowski-Gilski
- M. Habrych
- B. Miedziński
- B. Polnik
- J. Wandzio
- P. Jedlikowski
- Year 2020
Designing a reliable voice transmission system is not a trivial task. Wired media, thanks to their resistance to mechanical damage, seem an ideal solution. The BPL-PLC (Broadband over Power Line – Power Line Communication) cable is resilient to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency situation, including paramedic rescue operations. These features...

Full text to download in external service
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publication
- G. Tamulevicius
- G. Korvel
- A. B. Yayak
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Electronics - Year 2020
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available to download
A review of amide bond formation in microwave organic synthesis
Publication
- N. Łukasik
- E. Wagner-Wysiecka
- CURRENT ORGANIC SYNTHESIS - Year 2014
Microwave-Assisted Organic Synthesis (MAOS) is one of the most current trends in organic chemistry. Herein, both the most popular and new approaches to the microwave synthesis of a very important linkage in nature, the amide bond, are overviewed and compared with conventional synthetic routes.

Full text to download in external service
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
Publication
- S. Zaporowski
- Year 2024
The article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...

Full text available to download
Evaluation of Path Based Methods for Conceptual Representation of the Text
Publication
- Ł. Kucharczyk
- J. Szymański
- Year 2014
Typical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...

Full text to download in external service
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Year 2022
The aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...

Full text available to download
Speech Intelligibility Measurements in Auditorium
Publication
- K. Leo
- ACTA PHYSICA POLONICA A - Year 2010
Speech intelligibility was measured in Auditorium Novum at the Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time (EDT) coefficients were measured. The negative contribution of noise to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in the auditorium. Correlation was found between...

Full text available to download
Distortion of speech signals in the listening area: its mechanism and measurements
Publication
- H. Lasota
- R. Mazurek
- I. Kochańska
- Year 2014
The paper deals with the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of the same form and slightly different magnitudes that reach a listener from numerous coherent sources is accompanied by interference effects...

Full text to download in external service
Self-Organising map neural network in the analysis of electromyography data of muscles acting at temporomandibular joint.
Publication
- Year 2021
The temporomandibular joint (TMJ) is the joint that via muscle action and jaw motion allows for necessary physiological performances such as mastication. Whereas mandible translates and rotates [1]. Estimation of activity of muscles acting at the TMJ provides a knowledge of activation pattern solely of a specific patient that an electromyography (EMG) examination was carried out [2]. In this work, a Self-Organising Maps (SOMs)...

Full text to download in external service
High quality speech codec employing sines+noise+transients model
Publication
- Archives of Acoustics - Year 2006
A method of high quality wideband speech signal representation employing sines+transients+noise model is presented. The need for a wideband speech coding approach as well as various methods for analysis and synthesis of sines, residual and transient states of speech signal is discussed. The perceptual criterion is applied in the proposed approach during encoding of sines amplitudes in order to reduce bandwidth requirements and...

Full text available to download
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication
- S. Zaporowski
- B. Kostek
- Year 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available to download
Impact of Clustering on a Synthetic Instance Generation in Imbalanced Data Streams Classification
Publication
- I. Czarnowski
- D. Martins
- Year 2022
Full text to download in external service
Biosynthetic and synthetic access to amino sugars.
Publication
- K. Skarbek
- M. J. Milewska
- CARBOHYDRATE RESEARCH - Year 2016
Amino sugars are important constituents of a number of biomacromolecules and products of microbial secondary metabolism, including antibiotics. For most of them, the amino group is located at the positions C1, C2 or C3 of the hexose or pentose ring. In biological systems, amino sugars are formed due to the catalytic activity of specific aminotransferases or amidotransferases by introducing an amino functionality derived from L-glutamate...

Full text available to download
Transient detection for speech coding applications
Publication
- International Journal of Computer Science and Network Security - Year 2006
Signal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...

Full text to download in external service
Synthesis methods of nanomaterials
e-Learning Courses
- M. S. Łapiński
Development and Research of the Text Messages Semantic Clustering Methodology
Publication
- N. Rizun
- P. Kapłański
- Y. Taranenko
- Year 2016
The methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...

Full text available to download
REAL-TIME VOICE QUALITY MONITORING TOOL FOR VOIP OVER IPV6 NETWORKS
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2013
The primary aim of this paper is to present a new application which is at this moment the only open source real-time VoIP quality monitoring tool that supports IPv6 networks. The application can keep VoIP system administrators provided at any time with up-to-date voice quality information. Multiple quality scores that are automatically obtained throughout each call reflect influence of variable packet losses and delays on voice...
Computer vision techniques applied for reconstruction of seafloor 3D images from side scan and synthetic aperture sonars data
Publication
- Year 2008
The Side Scan Sonar and Synthetic Aperture Sonar are well known echo signal processing technologies that produce 2D images of the seafloor. Both systems combine a number of acoustic pings to form a high resolution image of the seafloor. It was shown in numerous papers that 2D images acquired by such systems can be transformed into 3D models of the seafloor surface by an algorithmic approach using intensity information, contained in a grayscaled...
