Filters
total: 11094
-
Catalog
- Publications 7181 available results
- Journals 204 available results
- Conferences 107 available results
- People 168 available results
- Inventions 1 available results
- Projects 8 available results
- Laboratories 1 available results
- e-Learning Courses 496 available results
- Events 29 available results
- Open Research Data 2899 available results
displaying 1000 best results Help
Search results for: NEURAL TEXT-TO-SPEECH MULTILINGUAL SYNTHESIS VOICE CONVERSION SYNTHETIC DATA NORMALISING FLOWS
-
Agile Commerce in the light of Text Mining
PublicationThe survey conducted for this study reveals that more than 84% of respondents have never encountered the term “agile commerce” and do not understand its meaning. At the same time, they are active participants of this strategy. Using digital channels as customers more often than ever before, they have already been included in the agile philosophy. Based on the above, the purpose of the study is to analyse major text sets containing...
-
Ontology-based text convolution neural network (TextCNN) for prediction of construction accidents
PublicationThe construction industry suffers from workplace accidents, including injuries and fatalities, which represent a significant economic and social burden for employers, workers, and society as a whole.The existing research on construction accidents heavily relies on expert evaluations,which often suffer from issues such as low efficiency, insufficient intelligence, and subjectivity.However, expert opinions provided in construction...
-
Deep neural networks for data analysis 27/28
e-Learning Courses -
Deep neural networks for data analysis 25/26
e-Learning Courses -
Deep neural networks for data analysis 26/27
e-Learning Courses -
Automated detection of pronunciation errors in non-native English speech employing deep learning
PublicationDespite significant advances in recent years, the existing Computer-Assisted Pronunciation Training (CAPT) methods detect pronunciation errors with a relatively low accuracy (precision of 60% at 40%-80% recall). This Ph.D. work proposes novel deep learning methods for detecting pronunciation errors in non-native (L2) English speech, outperforming the state-of-the-art method in AUC metric (Area under the Curve) by 41%, i.e., from...
-
BPL-PLC Voice Communication System for the Oil and Mining Industry
PublicationApplication of a high-efficiency voice communication systems based on broadband over power line-power line communication (BPL-PLC) technology in medium voltage networks, including hazardous areas (like the oil and mining industry), as a redundant mean of wired communication (apart from traditional fiber optics and electrical wires) can be beneficial. Due to the possibility of utilizing existing electrical infrastructure, it can...
-
EU-Turkey Customs Union and Bilateral Foreign Direct Investment Flows
PublicationMain aim of this text is presentation of the effects of customs union between the European Union and turkey on bilateral FDI flows in light of the theory of linkages between economic integration and FDI flows. First section of the text is a survey of main theoretical links between economic integration and FDI flows. Second section focuses on the history and scope of the customs union. Third and fourth sections are empirical and...
-
Knowledge Flows in Cluster Organizations
PublicationThis paper aims to identify knowledge flows in cluster organizations (COs). On the basis of a literature analysis on knowledge flows and cluster organizations, the following research question was formulated: What kind of knowledge flows can be identified in cluster organizations and what is their main characteristics? The paper is based on a literature analysis and Grounded Theory methodology, examining four purposefully selected...
-
INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY
PublicationIn recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...
-
Bożena Kostek prof. dr hab. inż.
People -
Comprehensive Evaluation of Statistical Speech Waveform Synthesis
Publication -
Text Categorization Improvement via User Interaction
PublicationIn this paper, we propose an approach to improvement of text categorization using interaction with the user. The quality of categorization has been defined in terms of a distribution of objects related to the classes and projected on the self-organizing maps. For the experiments, we use the articles and categories from the subset of Simple Wikipedia. We test three different approaches for text representation. As a baseline we use...
-
Training of Deep Learning Models Using Synthetic Datasets
PublicationIn order to solve increasingly complex problems, the complexity of Deep Neural Networks also needs to be constantly increased, and therefore training such networks requires more and more data. Unfortunately, obtaining such massive real world training data to optimize neural networks parameters is a challenging and time-consuming task. To solve this problem, we propose an easy-touse and general approach to training deep learning...
-
Communication Platform for Evaluation of Transmitted Speech Quality
PublicationA voice communication system designed and implemented is described. The purpose of the presented platform was to enable a series of experiments related to the quality assessment of algorithms used in the coding and transmitting of speech. The system is equipped with tools for recording signals at each stage of processing, making it possible to subject them to subjective assessments by listening tests or, objective evaluation employing...
-
Examining Influence of Distance to Microphone on Accuracy of Speech Recognition
PublicationThe problem of controlling a machine by the distant-talking speaker without a necessity of handheld or body-worn equipment usage is considered. A laboratory setup is introduced for examination of performance of the developed automatic speech recognition system fed by direct and by distant speech acquired by microphones placed at three different distances from the speaker (0.5 m to 1.5 m). For feature extraction from the voice signal...
-
Detecting Lombard Speech Using Deep Learning Approach
PublicationRobust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
Developing a Low SNR Resistant, Text Independent Speaker Recognition System for Intercom Solutions - A Case Study
PublicationThis article presents a case study on the development of a biometric voice verification system for an intercom solution, utilizing the DeepSpeaker neural network architecture. Despite the variety of solutions available in the literature, there is a noted lack of evaluations for "text-independent" systems under real conditions and with varying distances between the speaker and the microphone. This article aims to bridge this gap....
-
Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network
PublicationTo effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...
-
Comparative Analysis of Text Representation Methods Using Classification
PublicationIn our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article—evaluation of approaches to text representation for machine learning tasks—indicates that the text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...
-
Neural network training with limited precision and asymmetric exponent
PublicationAlong with an extremely increasing number of mobile devices, sensors and other smart utilities, an unprecedented growth of data can be observed in today’s world. In order to address multiple challenges facing the big data domain, machine learning techniques are often leveraged for data analysis, filtering and classification. Wide usage of artificial intelligence with large amounts of data creates growing demand not only for storage...
-
Prioritising national healthcare service issues from free text feedback – A computational text analysis & predictive modelling approach
PublicationPatient experience surveys have become a key source of evidence for supporting decision-making and continuous quality improvement within healthcare services. To harness free-text feedback collected as part of these surveys for additional insights, text analytics methods are increasingly employed when the data collected is not amenable to traditional qualitative analysis due to volume. However, while text analytics techniques offer...
-
Applying the Lombard Effect to Speech-in-Noise Communication
PublicationThis study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...
-
Generating actionable evidence from free-text feedback to improve maternity and acute hospital experiences: A computational text analytics & predictive modelling approach
PublicationBackground Patient experience surveys are a key source of evidence for supporting decision-making and quality improvement in healthcare services. These surveys contain two main types of questions: closed and open-ended, asking about patients’ care experiences. Apart from the knowledge obtained from analysing closed-ended questions, invaluable insights can be gleaned from free-text data. Advanced analytics techniques are increasingly...
-
A non-uniform real-time speech time-scale stretching method
PublicationAn algorithm for non-uniform real-time speech stretching is presented. It provides a combination of typical SOLA algorithm (Synchronous Overlap and Add ) with the vowels, consonants and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
-
Automatic Singing Voice Recognition EmployingNeural Networks and Rough Sets
PublicationCelem badań jest automatyczne rozpoznawanie głosów śpiewaczych w kategorii rodzaju i jakości technicznej śpiewu. W artykule opisano stworzoną bazę danych głosów, która zawiera próbki głosu śpiewaków profesjonalnych i amatorskich. W dalszej części opisano parametry zdefiniowane w oparciu o zjawiska biomechaniczne w narządzie głosu podczas śpiewania. W oparciu o stworzone macierze parametrów wytrenowano i porównano automatyczne klasyfikatory...
-
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
PublicationThis paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...
-
Subjective Quality Evaluation of Speech Signals Transmitted via BPL-PLC Wired System
PublicationThe broadband over power line – power line communication (BPL-PLC) cable is resistant to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency. These features make it an ideal solution for delivering data, e.g. in an underground mine environment, especially clear and easily understandable voice messages. This paper describes a subjective quality evaluation of...
-
Constructing a Dataset of Speech Recordingswith Lombard Effect
PublicationThepurpose of therecordings was to create a speech corpus based on the ISLEdataset, extended with video and Lombard speech. Selected from a set of 165sentences, 10, evaluatedas having thehighest possibility to occur in the context ofthe Lombard effect,were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether,15speakers were recorded, and speech parameterswere...
-
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
PublicationWe present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...
-
Two-step Conversion of Crude Glycerol Generated by Biodiesel Production into Biopolyols: Synthesis, Structural and Physical Chemical Characterization
PublicationIn this work biopolyols were synthesized via two-step process from crude glycerol and castor oil. For better evaluation of analyzed process, the impact of its time and temperature on the structure and properties of biopolyols was determined. Obtained results fully justified conducting of synthesis in two steps. Prepared materials were characterized by hydroxyl value and water content comparable to polyols industrially applied in...
-
Mowa nienawiści (hate speech) a odpowiedzialność dostawców usług internetowych w orzecznictwie sądów europejskich
PublicationThe article analyses the phenomenon of hate speech in the Internet contrasted with the problem of responsability of Internet Service Providers for cases of such abuses of freedom of expression. The text provides an analysis of jurisprudence of two European Courts. On the one hand it presents the position of the European Court of Human Rights on the problem of hate speech: its definition and the liability for it as an exception...
-
Subjective Quality Evaluation of Underground BPL-PLC Voice Communication System
PublicationDesigning a reliable voice transmission system is not a trivial task. Wired media, thanks to their resistance to mechanical damage, seem an ideal solution. The BPL-PLC (Broadband over Power Line – Power Line Communication) cable is resilient to electricity stoppage and partial damage of phase conductors. It maintains continuity of transmission in case of an emergency situation, including paramedic rescue operations. These features...
-
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
PublicationIn this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10.000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...
-
A review of amide bond formation in microwave organic synthesis
PublicationMicrowave-Assisted Organic Synthesis (MAOS) is one of the most current trends in organic chemistry. Herein, both the most popular and new approaches in microwave-syntheses of very important linkage in Nature - amide bond - are overviewed and compared with conventional synthetic routes.
-
The Impact of Foreign Accents on the Performance of Whisper Family Models Using Medical Speech in Polish
PublicationThe article presents preliminary experiments investigating the impact of accent on the performance of the Whisper automatic speech recognition (ASR) system, specifically for the Polish language and medical data. The literature review revealed a scarcity of studies on the influence of accents on speech recognition systems in Polish, especially concerning medical terminology. The experiments involved voice cloning of selected individuals...
-
Evaluation of Path Based Methods for Conceptual Representation of the Text
PublicationTypical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...
-
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
PublicationThe aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...
-
Speech Intelligibility Measurements in Auditorium
PublicationSpeech intelligibility was measured in Auditorium Novum on Technical University of Gdansk (seating capacity 408, volume 3300 m3). Articulation tests were conducted; STI and Early Decay Time EDT coefficients were measured. Negative noise contribution to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in auditorium. Correlation was found between...
-
Distortion of speech signals in the listening area: its mechanism and measurements
PublicationThe paper deals with a problem of the influence of the number and distribution of loudspeakers in speech reinforcement systems on the quality of publicly addressed voice messages, namely on speech intelligibility in the listening area. Linear superposition of time-shifted broadband waves of a same form and slightly different magnitudes that reach a listener from numerous coherent sources, is accompanied by interference effects...
-
Self-Organising map neural network in the analysis of electromyography data of muscles acting at temporomandibular joint.
PublicationThe temporomandibular joint (TMJ) is the joint that via muscle action and jaw motion allows for necessary physiological performances such as mastication. Whereas mandible translates and rotates [1]. Estimation of activity of muscles acting at the TMJ provides a knowledge of activation pattern solely of a specific patient that an electromyography (EMG) examination was carried out [2]. In this work, a Self-Organising Maps (SOMs)...
-
High quality speech codec employing sines+noise+transients model
PublicationA method of high quality wideband speech signal representation employing sines+transients+noise model is presented. The need for a wideband speech coding approach as well as various methods for analysis and synthesis of sines, residual and transient states of speech signal is discussed. The perceptual criterion is applied in the proposed approach during encoding of sines amplitudes in order to reduce bandwidth requirements and...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
PublicationThis paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Impact of Clustering on a Synthetic Instance Generation in Imbalanced Data Streams Classification
Publication -
Biosynthetic and synthetic access to amino sugars.
PublicationAmino sugars are important constituents of a number of biomacromolecules and products of mi crobial secondary metabolism, including antibiotics. For most of them, the amino group is located at the positions C1, C2 or C3 of the hexose or pentose ring. In biological systems, amino sugars are formed due to the catalytic activity of specific aminotransferases or amidotransferases by introducing an amino functionality derived from L-glutamate...
-
Transient detection for speech coding applications
PublicationSignal quality in speech codecs may be improved by selecting transients from speech signal and encoding them using a suitable method. This paper presents an algorithm for transient detection in speech signal. This algorithm operates in several frequency bands. Transient detection functions are calculated from energy measured in short frames of the signal. The final selection of transient frames is based on results of detection...
-
Synthesis metods of nanomaterials
e-Learning Courses -
Development and Research of the Text Messages Semantic Clustering Methodology
PublicationThe methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...
-
REAL-TIME VOICE QUALITY MONITORING TOOL FOR VOIP OVER IPV6 NETWORKS
PublicationThe primary aim of this paper is to present a new application which is at this moment the only open source real-time VoIP quality monitoring tool that supports IPv6 networks. The application can keep VoIP system administrators provided at any time with up-to-date voice quality information. Multiple quality scores that are automatically obtained throughout each call reflect influence of variable packet losses and delays on voice...
-
Computer vision techniques applied for reconstruction of seafloor 3D images from side scan and synthetic aperture sonars data
PublicationThe Side Scan Sonar and Synthetic Aperture Sonar are well known echo signal processing technologies that produce 2D images of the seafloor. Both systems combines a number of acoustic pings to form a high resolution image of seafloor. It was shown in numerous papers that 2D images acquired by such systems can be transformed into 3D models of seafloor surface by algorithmic approach using intensity information, contained in a grayscaled...