Search results for: AUTOMATIC SPEECH RECOGNITION
-
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
Publication: This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...
-
Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole / Application of speech slowing to improve speech comprehension by children at school
Publication: This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. The OLA-based algorithms and the Phase Vocoder are described. The experimental part discusses the usability of those algorithms for real-time speech stretching.
-
Wsparcie rzeczywistości wirtualnej dla projektów realizowanych w ASP w Gdańsku / Virtual reality support for the projects carried out in the AFA in Gdańsk
Publication: The Laboratorium Zanurzonej Wizualizacji Przestrzennej (LZWP, Immersive 3D Visualization Lab) is a place, unique in the country, where a group of several people can jointly explore a world of illusion. This experience is made possible by the virtual reality caves located there (Cave Automatic Virtual Environment, CAVE), which allow the participants of an experiment to enter an environment that is either computer-generated or a digital copy of a real...
-
IEEE Conference on Computer Vision and Pattern Recognition
Conferences -
International Workshop on Pattern Recognition in Information Systems
Conferences -
International Conference on Pattern Recognition Applications and Methods
Conferences -
International Conference on Artificial Intelligence and Pattern Recognition
Conferences -
IEEE International Conference on Document Analysis and Recognition
Conferences -
Instantaneous complex frequency for pipeline pitch estimation
Publication: In the paper, a pipeline algorithm for estimating the pitch of a speech signal is proposed. The algorithm uses instantaneous complex frequencies (ICFs) estimated for four waveforms obtained by filtering the original speech signal through four bandpass complex Hilbert filters. The imaginary parts of the ICFs from each channel give four candidates for pitch estimates. The decision regarding the final estimate is made based on the real parts of...
-
XVIII Międzynarodowe Sympozjum Inżynierii i Reżyserii Dźwięku
Publication: The subjective assessment of speech signals takes into account previous experiences and habits of an individual. Since the perception process deteriorates with age, differences should be noticeable among people from dissimilar age groups. In this work, we investigated the difference in speech quality assessment between high school students and university students. The study involved 60 participants, with 30 people in both the adolescents...
-
Engineering Candida albicans glucosamine-6-phosphate synthase for efficient enzyme purification
Publication: Rationally designed muteins of Candida albicans glucosamine-6-phosphate synthase, an enzyme known as a promising target for antifungal chemotherapy, were constructed, overexpressed in Escherichia coli and purified to near homogeneity. To facilitate and optimize the purification of the enzyme, three recombinant versions containing internal oligoHis fragments were constructed: (i) by substituting residues 343 - 348...
-
Simultaneous determination of thermodynamic and kinetic parameters of aminopolycarbonate complexes of cobalt(II) and nickel(II) based on isothermal titration calorimetry data
Publication -
Zinc(II) complexation by some biologically relevant pH buffers
Publication -
Digital fingerprinting for color images based on the quaternion encryption scheme
Publication: In this paper we present a new quaternion-based encryption technique for color images. In the proposed encryption method, images are written as quaternions and are rotated in a three-dimensional space around another quaternion, which is an encryption key. The encryption process uses the cipher block chaining (CBC) mode. Further, this paper shows that our encryption algorithm enables digital fingerprinting as an additional feature....
-
Bridging challenges of clinical decision support systems with a semantic approach. A case study on breast cancer
Publication: The integration of Clinical Decision Support Systems (CDSS) into today's clinical environments has not yet been fully achieved. Although numerous approaches and technologies have been proposed since 1960, there are still open gaps that need to be bridged. In this work we present advances from the established state of the art, overcoming some of the most notorious reported difficulties in: (i) automating CDSS, (ii) clinical workflow...
-
Creating new voices using normalizing flows
Publication: Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Human voice modification using instantaneous complex frequency
Publication: The paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...
-
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek / Training strategies for a neural pitch estimator using a synthetic vowel generator
Publication: In many telecommunications applications there arises the problem of processing or analysing a speech signal, within which, often among the basic algorithms, a pitch estimator is used. The estimator considered in this work is based on a neural classifier that makes decisions on the basis of the instantaneous frequency and power determined in subbands of the analysed speech signal. In this work we consider...
-
New Approach to Noncausal Identification of Nonstationary Stochastic FIR Systems Subject to Both Smooth and Abrupt Parameter Changes
Publication: In this technical note, we consider the problem of finite-interval parameter smoothing for a class of nonstationary linear stochastic systems subject to both smooth and abrupt parameter changes. The proposed parallel estimation scheme combines the estimates yielded by several exponentially weighted basis function algorithms. The resulting smoother automatically adjusts its smoothing bandwidth to the type and rate of nonstationarity...
-
Lattice filter based multivariate autoregressive spectral estimation with joint model order and estimation bandwidth adaptation
Publication: The problem of parametric, autoregressive model based estimation of a time-varying spectral density function of a multivariate nonstationary process is considered. It is shown that estimation results can be considerably improved if identification of the autoregressive model is carried out using the two-sided doubly exponentially weighted lattice algorithm which combines results yielded by two one-sided lattice algorithms running...
-
Auditory-visual attention stimulator
Publication: A new approach to lateralization irregularities formation is proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach, hearing is stimulated using time-scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, e.g. zooming and highlighting. In the experimental part of...
-
International Conference on Advances in Pattern Recognition and Digital Techniques
Conferences -
INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
Publication: The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
Publication: In this paper, an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect that are present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Auditory Brainstem Responses recorded employing Audio ABR device
Open Research Data: The dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:
-
Pracujący w czasie rzeczywistym system detekcji gazów wykorzystujący przenośny komputer Raspberry PI oraz matrycę półprzewodnikowych czujników gazu / Real-time gas detection system using a portable Raspberry PI computer and an array of semiconductor gas sensors
Publication: Gas-analyzing systems based on an array of partially selective gas sensors and pattern-recognition techniques are a potentially fast and low-cost alternative to other devices, such as gas analysers. They make it possible to recognize the type and the concentration of measured volatile compounds in their working environment. In this work we present the implementation of a gas recognition system, in which the signals from an...
-
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
Publication: In this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...
-
Interactions with recognized patients using smart glasses
Publication: Recently, different smart glasses solutions have been proposed on the market. The rapid development of this wearable technology has led to several research projects related to applications of smart glasses in healthcare. In this paper we propose a general architecture of a system enabling data integration for the recognized person. In the proposed system, smart glasses integrate data obtained for the recognized patient from health...
-
Prof. Haitham Abu-Rub - A Visit to Poland's Gdansk University of Technology
Publication: Report on the visit of Prof. Haitham Abu-Rub to Gdansk University of Technology. Speech on the Smart Grid Centre. Visit to the new smart grid laboratory of the GUT, the Laboratory for Innovative Power Technologies and Integration of Renewable Energy Sources (LINTE^2).
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
Publication: This paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
Investigation of educational processes with affective computing methods
Publication: This paper concerns the monitoring of educational processes with the use of new technologies for the recognition of human emotions. This paper summarizes results from three experiments, aimed at the validation of applying emotion recognition to e-learning. An analysis of the experiments’ executions provides an evaluation of the emotion elicitation methods used to monitor learners. The comparison of affect recognition algorithms...
-
High-resolution wind wave parameters in the area of the Gulf of Gdańsk during 21 extreme storms
Open Research Data: This dataset contains the results of wind-wave parameter modelling in the area of the Gulf of Gdańsk (Southern Baltic). For the simulations, a high resolution SWAN model was used. The dataset consists of the significant wave height, the direction of the wave approaching the shore and the wave period during 21 historical, extreme storms. The storms were...
-
Gesture-based computer control system
Publication: In the paper a system for controlling computer applications by hand gestures is presented. First, selected methods used for gesture recognition are described. The system hardware and a way of controlling a computer by gestures are described. The architecture of the software along with hand gesture recognition methods and algorithms used are presented. Examples of basic and complex gestures recognized by the system are given.
-
High-resolution wind wave parameters in the area of the Gulf of Gdańsk during 21 extreme storms (GIS dataset)
Open Research Data: This GIS dataset contains the results of wind-wave parameter modelling in the area of the Gulf of Gdańsk (Southern Baltic). For the simulations, a high resolution SWAN model was used. The dataset consists of the significant wave height, the direction of the wave approaching the shore and the wave period during 21 historical, extreme storms (rasters)....
-
Modeling and Designing Acoustical Conditions of the Interior – Case Study
Publication: The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...
-
Comparative analysis of various transformation techniques for voiceless consonants modeling
Publication: In this paper, a comparison of various transformation techniques, namely the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....
-
Stanowisko do diagnostyki i regulacji wyzwalaczy przekaźników w obwodach lokomotyw elektrycznych / Test stand for diagnostics and adjustment of relay trips in electric locomotive circuits
Publication: Fast circuit breakers and protective relays play an important role in protecting a locomotive’s electrical circuits from damage caused by short circuit or overload. Undisturbed operation and proper circuit protection depend on the technical condition and fine calibration of the over-current trips. Therefore the relays should be subject to periodic inspection and adjustment according to the locomotive maintenance schedule. A test stand for...
-
Design of synchronous generator voltage regulator based on H∞ control theory
Publication: The power system is a highly nonlinear system whose dynamics depend on the system network configuration, system loading, etc. To overcome these difficulties and to fulfill the performance requirements, different control methods are considered and tested for application to the generating unit control. Application of the H∞ optimization method to an automatic voltage regulator is studied in this paper. Simulation results have...
-
A Study on Audio Signal Processed by "Instant Mastering"
Publication: An increasing amount of music produced in home- and project-studios results in development and growth of "automatic mastering services". The presented investigation explores changes introduced to audio signal by various online mastering platforms. A music set consisting of 10 songs produced in small facilities was processed by eight on-line automatic mastering services. Additionally, some laboratory-constructed signals were tested....
-
Employing a biofeedback method based on hemispheric synchronization in effective learning
Publication: In this paper an approach to build a brain computer-based hemispheric synchronization system is presented. The concept utilizes the wireless EEG signal registration and acquisition as well as advanced pre-processing methods. The influence of various filtration techniques of EOG artifacts on brain state recognition is examined. The emphasis is put on brain state recognition using band pass filtration for separation of individual...
-
Wykorzystanie sztucznych sieci neuronowych do wykrywania i rozpoznawania tablic rejestracyjnych na zdjęciach pojazdów / Use of artificial neural networks for detecting and recognizing license plates in vehicle photographs
Publication: The article presents the concept of a license plate detection and recognition algorithm (AWiRTR) for digital images of vehicles. Detection and localization of license plates, and the extraction of individual characters from the plate image, are carried out using basic image processing techniques (morphological transformations, edge detection) as well as basic statistical data of the objects detected...
-
A video monitoring system using ontology-driven identification of threats
Publication: In this paper, we present a video monitoring system that leverages image recognition and ontological reasoning about threats. In the solution, an image processing subsystem uses video recording of a monitored area and recognizes known concepts in scenes. Then, a reasoning subsystem uses an ontological description of security conditions and information from image recognition to check if a violation of a condition has occurred. If a threat...
-
FEEDB: A multimodal database of facial expressions and emotions
Publication: In this paper, a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using a Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...
-
The Effect of Welding Conditions on Mechanical Properties of Superduplex Stainless Steel Welded Joints
Publication: Test results for superduplex stainless steel welded joints made with different heat inputs, using automatic submerged arc welding (SAW) and semi-automatic flux-cored arc welding (FCAW), are presented. Metallographic examinations and measurements of the ferrite content, the width of the heat affected zone (HAZ) and the hardness of the welds in characteristic areas have been performed. Significant differences in the amount of...
-
Towards New Mappings between Emotion Representation Models
Publication: There are several models for representing emotions in affect-aware applications, and available emotion recognition solutions provide results using diverse emotion models. As multimodal fusion is beneficial in terms of both accuracy and reliability of emotion recognition, one of the challenges is mapping between the models of affect representation. This paper addresses this issue by: proposing a procedure to elaborate new mappings,...
-
Endoscopic Videos Deinterlacing and On-Screen Text and Light Flashes Removal and Its Influence on Image Analysis Algorithms' Efficiency
Publication: In this article, methods for deinterlacing endoscopic video images and removing on-screen text and light flashes from them are discussed. The research is intended to improve disease recognition algorithms' performance. In the article, four configurations of deinterlacing methods and another four configurations of text and flashes removal methods are described and examined. The efficiency of endoscopic video analysis algorithms is measured...
-
Deduplication of Position Data and Global Identification of Objects Tracked in Distributed Vessel Monitoring System
Publication: Vessel monitoring systems (VMS) play a very important role in safe navigation. In most cases, their structure is distributed and they are based on two data sources, namely Automatic Identification System (AIS) and Automatic Radar Plotting Aids (ARPA). Such an approach results in several object identification and position data duplication problems, which need to be solved in order to ensure the correct performance of a given VMS....
-
The generalization by simplification operator with the Simplify Building tool of objects representing groups of buildings in Gdańsk district - scale 1:10000. Data from OSM
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the Open Street Map databases (OSM) [1].
-
A New Adaptive Method for the Extraction of Steel Design Structures from an Integrated Point Cloud
Open Research Data: A new automatic and adaptive algorithm for edge extraction from a random point cloud was developed and presented herein. The proposed algorithm was tested using real measurement data. The developed algorithm is able to realistically reduce the amount of redundant data and correctly extract stable edges representing the geometric structures of a studied...
-
An electronic nose for quantitative determination of gas concentrations
Publication: The practical application of the human nose for fragrance recognition is severely limited by the fact that our sense of smell is subjective and gets tired easily. Consequently, there is considerable need for an instrument that can substitute for the human sense of smell. Electronic nose devices from the mid 1980s are used in a growing number of applications. They comprise an array of several electrochemical gas sensors...