Search results for: RECONSTRUCTION OF SPEECH SIGNALS - Bridge of Knowledge

Search results for: RECONSTRUCTION OF SPEECH SIGNALS

  • WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE [Using Neural Networks for Emotionally Expressive Speech Synthesis]

    Publication

    - Year 2018

    This article presents an analysis of speech-based emotion recognition solutions and of the possibility of using them in emotional speech synthesis with neural networks. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. A significant increase in interest in, and use of, deep learning is currently observed in applications related to...

  • Investigating Feature Spaces for Isolated Word Recognition

    Publication

    - Year 2018

    Researchers have paid much attention to speech processing for automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...

  • Krzysztof Kutt dr inż.

    People

    Computer scientist and psychologist trying to combine expertise from both disciplines into something cool. My research activity focuses on the development of affective HCI/BCI interfaces (based on multimodal fusion of signals and contextual data), methods for processing sensory data (including semantization of such data) and the development of knowledge-based systems (in particular knowledge graphs and semantic web systems).

  • Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention

    Publication

    - Year 2021

    This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

    Full text available to download

  • Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole [Application of speech slowing to improve speech comprehension by children at school]

    Publication

    This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. OLA-based algorithms and the Phase Vocoder are described. The experimental part discusses the usability of those algorithms for real-time speech stretching.

  • KORPUS MOWY ANGIELSKIEJ DO CELÓW MULTIMODALNEGO AUTOMATYCZNEGO ROZPOZNAWANIA MOWY [English Speech Corpus for Multimodal Automatic Speech Recognition]

    The paper presents an audiovisual speech corpus containing 31 hours of English speech recordings. The corpus is intended for automatic audiovisual speech recognition. It contains video recordings from a high-speed stereo camera and audio recorded by a microphone array and a laptop microphone. Thanks to the inclusion of recordings made in noisy conditions, the corpus...

  • Instantaneous complex frequency for pipeline pitch estimation

    Publication
    • M. Kaniewska

    - Year 2010

    In the paper a pipeline algorithm for estimating the pitch of a speech signal is proposed. The algorithm uses instantaneous complex frequencies (ICFs) estimated for four waveforms obtained by filtering the original speech signal through four bandpass complex Hilbert filters. The imaginary parts of the ICFs from each channel give four candidates for pitch estimates. The decision regarding the final estimate is made based on the real parts of...

  • Creating new voices using normalizing flows

    Publication
    • P. Biliński
    • T. Merritt
    • A. Ezzerg
    • K. Pokora
    • S. Cygert
    • K. Yanagisawa
    • R. Barra-Chicote
    • D. Korzekwa

    - Year 2022

    Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...

    Full text available to download

  • Janusz Smulko prof. dr hab. inż.

    He was born on April 25, 1964 in Kolno. He graduated in 1989 with honors from the Faculty of Electronics at Gdańsk University of Technology, specialising in measuring instruments. In 1989 he took second place in the Red Rose competition for the best student in the Pomerania Region. Since the beginning of his career he has been associated with Gdańsk University of Technology: research assistant (1989-1996), Assistant Professor (1996-2012),...

  • PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS

    Publication

    - Year 2015

    The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...

  • Human voice modification using instantaneous complex frequency

    Publication
    • M. Kaniewska

    - Year 2010

    The paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...

  • PI observer stability and application in an induction motor control system

    Publication

    - Bulletin of the Polish Academy of Sciences-Technical Sciences - Year 2013

    The paper discusses the problem of stability of a proportional-integral Luenberger observer, designated for the state variables reconstruction of a linear, time-invariant dynamical system.

    Full text available to download

  • Investigating Feature Spaces for Isolated Word Recognition

    Publication
    • P. Treigys
    • G. Korvel
    • G. Tamulevicius
    • J. Bernataviciene
    • B. Kostek

    - Year 2020

    The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

    Full text to download in external service

  • Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek [Training strategies for a neural pitch estimator using a synthetic vowel generator]

    Many telecommunication applications involve processing or analysing a speech signal, and a pitch estimator is often used there as one of the basic algorithms. The estimator considered in this work is based on a neural classifier that makes decisions on the basis of the frequency and instantaneous power determined in subbands of the analysed speech signal. In this work we consider...

    Full text available to download

  • Auditory-visual attention stimulator

    A new approach to lateralization irregularities formation is proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time-scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, e.g. zooming and highlighting. In the experimental part of...

    Full text to download in external service

  • INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH

    Publication

    The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...

    Full text available to download

  • Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.

    In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect that are present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...

    Full text to download in external service

  • Comparison of three methods of EPR retrospective dosimetry in watch glass

    Publication
    • A. Marciniak
    • M. Juniewicz
    • B. Ciesielski
    • A. Prawdzik-Dampc
    • J. Karczewski

    - Frontiers in Public Health - Year 2022

    In this article we present the results of our follow-up studies of samples of watch glass obtained and examined within the framework of the international intercomparison dosimetry project RENEB ILC 2021. We present three methods of dose reconstruction based on EPR measurements of these samples: calibration method (CM), added dose method (ADM) and added dose & heating method (ADHM). The study showed that the three methods of dose reconstruction...

    Full text available to download

  • Auditory Brainstem Responses recorded employing Audio ABR device

    Open Research Data
    open access

    The dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:

  • Surgical Site Infection after Breast Surgery: A Retrospective Analysis of 5-Year Postoperative Data from a Single Center in Poland

    Publication

    - Medicina-Lithuania - Year 2019

    Background and Objectives: Surgical site infection (SSI) is a significant complication of non-reconstructive and reconstructive breast surgery. This study aimed to assess SSI after breast surgery over five years in a single center in Poland. The microorganisms responsible for SSI and their antibiotic susceptibilities were determined. Materials and Methods: Data from 2129 patients acquired over five years postoperatively by the...

    Full text available to download

  • Ryzyko realizacji prac projektowych i prowadzenia przebudowy obiektów prywatnych [Risk in carrying out design work and reconstruction of private buildings]

    Publication

    The article presents an example of the reconstruction of a private residential building, during which it turned out that the building had not been constructed in accordance with its design or with good building practice. The prepared and approved building design for the reconstruction had little in common with the building it concerned. Carrying out the work according to this design was therefore highly risky and, as a consequence, threatened a construction disaster.

    Full text to download in external service

  • Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter

    Publication

    - Archives of Acoustics - Year 2014

    In this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...

    Full text available to download

  • Method of reconstructing two-dimensional velocity fields on the basis of temperature field values measured with a thermal imaging camera

    This paper describes a novel numerical reconstruction procedure (NRP) of the velocity field during natural convective heat transfer from a two-sided, isothermal, heated vertical plate based only on the known temperature field obtained, e.g. with a thermal imaging camera. It has been demonstrated that with a knowledge of temperature distributions, the NRP enables the reconstruction of velocity fields by solving the Navier-Stokes...

    Full text available to download

  • Prof. Haitham Abu-Rub - A Visit to Poland's Gdansk University of Technology

    Report on the visit of Prof. Haitham Abu-Rub to Gdansk University of Technology. Speech on the Smart Grid Centre. Visit to the new smart grid laboratory of the GUT, the Laboratory for Innovative Power Technologies and Integration of Renewable Energy Sources (LINTE^2).

    Full text available to download

  • Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling

    Publication

    - Year 2021

    A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...

    Full text to download in external service

  • Wikipedia Articles Representation with Matrix'u

    Publication

    - Year 2013

    In the article we evaluate different text representation methods used for the task of Wikipedia article categorization. We present the Matrix’u application used for creating computational datasets of Wikipedia articles. The representations have been evaluated with SVM classifiers used to reconstruct human-made categories.

    Full text to download in external service

  • Digital microcontroller for sonar waveform generator

    Publication

    Generating sounding signals is essential for the operation of active sonar. The system should be highly reliable. This can be achieved through architecture, communication between the devices, and well-designed, self-testing software. The system presented in the article is responsible for the generation of hydroacoustic sounding signals, and ensures proper interaction between power amplifiers and power supplies. Thanks to its...

    Full text available to download

  • Surface EMG-based signal acquisition for decoding hand movements

    Open Research Data
    open access

    Biosignal processing plays a crucial role in modern hand prosthetics. The challenge is to restore the functionality of a lost limb based on the signals acquired from the surface of the stump. The number of sensors (EMG channels) used for signal acquisition influences the quality of a prosthetic hand. Modern algorithms (including neural networks) can significantly...

  • Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set

    Publication

    - Applied Sciences-Basel - Year 2023

    This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...

    Full text available to download

  • A comparative study of English viseme recognition methods and algorithm

    An elementary visual unit, the viseme, is considered in the paper in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector...

    Full text available to download

  • Modeling and Designing Acoustical Conditions of the Interior – Case Study

    The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...

    Full text available to download

  • A comparative study of English viseme recognition methods and algorithms

    An elementary visual unit, the viseme, is considered in the paper in the context of preparing the feature vector as the main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative study of their effectiveness. In the course of the study an optimal feature vector construction...

    Full text available to download

  • Comparative analysis of various transformation techniques for voiceless consonants modeling

    Publication

    In this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....

    Full text available to download

  • Multistatyczny, Dopplerowski System określania położenia i prędkości ruchomych celów w wodzie [Multistatic Doppler system for determining the position and velocity of moving targets in water]

    Publication

    - Year 2015

    In the multistatic Doppler system for determining the position and velocity of moving targets in water discussed in this work, the signal source consists of two transmitters emitting sinusoidal continuous acoustic waves at different frequencies, which, after reflection from a moving target, are received by four hydrophones. The article presents a theoretical analysis of the Doppler effect, on which the operation of the system is based, and a method for solving the main...

  • Applicability of null-steering for spoofing mitigation in civilian GPS

    Publication

    - Year 2014

    Civilian GPS signals are currently used in many critical applications, such as precise timing for power grids and telecommunication networks. Spoofing may cause their improper functioning. It is a threat which emerges with the growing availability of GPS constellation simulators and other devices which may be used to perform such an attack. Development of effective countermeasures, covering detection and mitigation, is necessary...

  • Szybka identyfikacja harmonicznych na podstawie oszczędnego próbkowania [Fast identification of harmonics based on compressed sensing]

    Publication

    The paper presents an implementation of a fast signal reconstruction algorithm, based on compressed sensing theory, that can detect harmonics in the input signal. Signal reconstruction is an optimisation problem solved with a linear programming algorithm. In addition, to speed up convergence of the solution, a K-rank-order filter was applied in the sparse domain of the signal. The conducted...

    Full text available to download

  • Comparison of near infrared spectroscopy (NIRS) and near-infrared transillumination-backscattering sounding (NIR-T/BSS) methods

    Publication

    - Scientific Reports - Year 2020

    The aim of the study was to compare simultaneously recorded NIR-T/BSS and NIRS signals from healthy volunteers. NIR-T/BSS is a device which gives the ability to non-invasively detect and monitor changes in the subarachnoid space width (SAS). Experiments were performed on a group of 30 healthy volunteers (28 males and 2 females, age 30.8 ± 13.4 years, BMI = 24.5 ± 2.3 kg/m²). We analysed recorded signals using analysis methods based...

    Full text available to download

  • A Wearable System Developed to Monitor People Suffering from Vasovagal Syncope

    A wearable system for non-invasive monitoring of signals that are invaluable when examining a person suffering from vasovagal syncope is presented in the paper. The following signals are continuously recorded: electrocardiogram, photoplethysmogram, impedance cardiogram and electrodermal resistance.

    Full text available to download

  • Rzadka reprezentacja sygnału niestacjonarnego w technice oszczędnego próbkowania [Sparse representation of a non-stationary signal in compressed sensing]

    The application of compressed sensing to the reconstruction of a non-stationary signal from compressed samples in the time-frequency domain is presented. A redundant (overcomplete) algorithm with various dictionaries was used to find a sparse representation of the signal. Simulation results confirm that compressed sensing allows a non-stationary signal to be reconstructed from a small number of randomly taken samples, with a small...

    Full text available to download

  • Reliability of Pulse Measurements in Videoplethysmography

    Reliable, remote pulse rate measurement is potentially very important for medical diagnostics and screening. In this paper videoplethysmography is analyzed, in particular to verify whether signals obtained for the YUV color model can be used to estimate the pulse rate, to examine which pulse estimation method is best for short video sequences and, finally, to analyze how potential PPG signals can be distinguished from...

    Full text available to download

  • Self diagnostics using smart glasses - preliminary study

    In this preliminary study we analyzed the possibility of the reliable measurement of biomedical signals with some potential hardware extensions of smart glasses. Using specially designed experimental prototypes, four categories of biomedical signals were measured: electrocardiograms, electromyograms, electroencephalograms and respiration waveforms. Experiments with volunteers proved that using even simple construction of sensors...

    Full text to download in external service

  • Receiver of Doppler multistatic system for moving target detection and tracking

    Publication

    The article presents a method for solving major structural problems that occur in the receiver used in the multistatic Doppler system, aimed at determining the trajectory and velocity of a moving target. In the system two transmitters emit continuous sinusoidal acoustic waves at different frequencies. The signals scattered from a moving target are received by four hydrophones. Besides the echoes, much larger signals coming...

    Full text available to download

  • The dynamic signature verification using population-based vertical partitioning

    Publication

    - Year 2020

    The dynamic signature is an attribute used in behavioral biometrics for verifying the identity of an individual. This attribute, apart from the shape of the signature, also contains information about the dynamics of the signing process described by the signals which tend to change over time. It is possible to process those signals in order to obtain descriptors of the signature characteristic of an individual user. One of the methods...

    Full text to download in external service

  • A multisensor detector of a sleep apnea for using at home

    Diagnosis of obstructive sleep apnea usually involves polysomnographic analysis, which unfortunately requires overnight stay in a specialized clinic and is very uncomfortable for a patient. This paper describes the method and apparatus for recording a set of signals to detect sleep apnea. The device records the following signals simultaneously: three-channel ECG, respiratory functions, signals from the accelerometer, and snoring...

    Full text to download in external service

  • Effectiveness of the robust PSS design

    Publication

    The paper discusses the synthesis of an optimal PSS for a synchronous generator. The optimal controller is an Hinf controller, which means that it minimises the Hinf norm of the transfer function between the exogenous signals, such as reference inputs and disturbances, and the error signals which are to be minimised to meet the control objective. The dynamic properties of the plant are shaped by choosing an appropriate weighting function applied to the plant...

    Full text to download in external service

  • SkinDepth - synthetic 3D skin lesion database

    Open Research Data
    version 1.0 open access

    SkinDepth is the first synthetic 3D skin lesion database. The release of SkinDepth dataset intends to contribute to the development of algorithms for:

  • Aleksander Mroziński dr inż.

    Aleksander Mroziński is a PhD student at the Faculty of Electronics, Telecommunications and Informatics of the Gdańsk University of Technology. Since June 2019 he has been employed as a metadata editor in the project THE BRIDGE OF DATA. He also participates in three other projects implemented at the Gdańsk University of Technology: InterPhD II, OPUS 13 and PROM. His desire to discover the world motivates him to acquire multidisciplinary...

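  • Zastosowanie sygnałów o projektowanych kształtach do diagnostyki obiektów wysoko-impedancyjnych metodą spektroskopii impedancyjnej [Application of signals with designed shapes to the diagnostics of high-impedance objects by impedance spectroscopy]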

    Publication

    The article presents a method of fast impedance spectroscopy of high-impedance objects (|Zx| > 1 GOhm) using signals with designed shapes. The excitation signal is generated in the U2531A DAQ module and applied to the input of the tested object via a digital-to-analogue converter (DAC). The response signals, proportional to the voltage across the measured impedance Zx and to the current flowing through Zx, are...

    Full text available to download

  • Analiza techniczno-ekonomiczna rozwiązań rewitalizacyjnych zabytkowego mostu w Tczewie [Technical and economic analysis of revitalisation solutions for the historic bridge in Tczew]

    Examples of concepts for rebuilding the western abutment of the historic bridge over the Vistula River in Tczew, together with its gate complex. Evaluation of three proposed revitalisation solutions for the bridge and selection of the one that is optimal in financial and technological terms. Construction time and cost and the fidelity of the reconstruction of the bridge's gate complex as optimisation criteria. Analysis aimed at obtaining the funds needed for the reconstruction, as well as the adaptation...

    Full text available to download

  • The accuracy of pulse rate estimation from the sequence of face images

    Publication

    - Year 2016

    The goal of this paper is to analyze the accuracy of pulse rate estimation from the sequence of face images. Simulated and real signals were used to evaluate two pulse rate estimators: one for the frequency domain and the second for the time domain, using the autocorrelation function. The results show that the mean difference between the reference measurements and the estimated pulse rate values is about 2 bpm. In the analysis of short...

    Full text to download in external service