Search results for: ALOPHONEME ANALISYS, SPEECH PROCESSING, DYNAMIC TIME WARPING

total: 831


Dynamic mass measurement in checkweighers using a discrete time-variant low-pass filter
Publication
- P. Pietrzak
- M. Meller
- M. Niedźwiecki
- MECHANICAL SYSTEMS AND SIGNAL PROCESSING - Year 2014
Conveyor belt type checkweighers are complex mechanical systems consisting of a weighing sensor (strain gauge load cell, electrodynamically compensated load cell), packages (of different shapes, made of different materials) and a transport system (motors, gears, rollers). Disturbances generated by the vibrating parts of such a system are reflected in the signal power spectra in the form of strong spectral peaks, located usually in...

Full text available to download
Fixed final time and free final state optimal control problem for fractional dynamic systems – linear quadratic discrete-time case
Publication
- A. Dzieliński
- P. Czyronis
- Bulletin of the Polish Academy of Sciences-Technical Sciences - Year 2013
Full text to download in external service
Investigation of the 16-year and 18-year ZTD Time Series Derived from GPS Data Processing
Publication
- Z. Bałdysz
- G. Nykiel
- M. Figurski
- K. Szafranek
- K. Kroszczyński
- Acta Geophysica - Year 2015
The GPS system can play an important role in activities related to the monitoring of climate. Long time series, a coherent strategy, and the very high quality of the tropospheric parameter Zenith Tropospheric Delay (ZTD), estimated on the basis of GPS data analysis, make it possible to investigate its usefulness for climate research as a direct GPS product. This paper presents results of analysis of 16-year time series derived from EUREF Permanent Network...

Full text available to download
EURASIP Journal on Audio Speech and Music Processing

Journals

ISSN: 1687-4714, eISSN: 1687-4722
Statistical Data Pre-Processing and Time Series Incorporation for High-Efficacy Calibration of Low-Cost NO2 Sensor Using Machine Learning
Publication
- Scientific Reports - Year 2024
Air pollution stands as a significant modern-day challenge impacting life quality, the environment, and the economy. It comprises various pollutants like gases, particulate matter, biological molecules, and more, stemming from sources such as vehicle emissions, industrial operations, agriculture, and natural events. Nitrogen dioxide (NO2), among these harmful gases, is notably prevalent in densely populated urban regions. Given...

Full text available to download
Machine-learning-based precise cost-efficient NO2 sensor calibration by means of time series matching and global data pre-processing
Publication
- Engineering Science and Technology-An International Journal-JESTECH - Year 2024
Air pollution remains a considerable contemporary challenge affecting life quality, the environment, and economic well-being. It encompasses an array of pollutants—gases, particulate matter, biological molecules—emanating from sources such as vehicle emissions, industrial activities, agriculture, and natural occurrences. Nitrogen dioxide (NO2), a harmful gas, is particularly abundant in densely populated urban areas. Given its...

Full text available to download
Metoda i algorytmy modyfikacji sygnału do celu wspomagania rozumienia mowy przez osoby z pogorszoną rozdzielczością czasową słuchu
Publication
- A. Kupryjanow
- Year 2013
The research carried out in this dissertation concerns real-time methods for Time Scale Modification (TSM) of the speech signal and an assessment of their effect on speech comprehension by people with impaired temporal resolution of hearing. Impaired hearing resolution is one of the symptoms associated with Central Auditory Processing Disorder (CAPD). In contrast to...
Journal of Real-Time Image Processing

Journals

ISSN: 1861-8200, eISSN: 1861-8219
Marek Czachor prof. dr hab.

People

Instytut Fizyki i Informatyki Stosowanej
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication
- M. Niedźwiecki
- M. Ciołek
- IEEE Transactions on Audio Speech and Language Processing - Year 2013
In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

Full text available to download
Jan Daciuk dr hab. inż.

People

Faculty of Electronics, Telecommunications and Informatics, Department of Intelligent Interactive Systems

Jan Daciuk received his M.Sc. from the Faculty of Electronics of Gdańsk University of Technology in 1986, and his Ph.D. from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology in 1999. He has been working at the Faculty since 1988. His research interests include finite state methods in natural language processing and computational linguistics including speech processing. Dr. Daciuk...
System Supporting Speech Perception in Special Educational Needs Schoolchildren
Publication
- A. Kupryjanow
- P. Suchomski
- P. Odya
- A. Czyżewski
- Year 2012
A system supporting speech perception during classes is presented in the paper. The system combines a portable device, which enables real-time speech stretching, with a workstation designed for performing hearing tests. The system was designed to help children suffering from Central Auditory Processing Disorders.

Full text to download in external service
Michał Mazur dr inż.

People

Institute of Mechanics and Machine Design

Current interests: mechanical engineering, robotics, mechanical vibrations, modal analysis, control, real-time systems. Selected publications: Kaliński K., Galewski M., Mazur M., Chodnicki M., 2017, Modelling and Simulation Of A New Variable Stiffness Holder for Milling Of Flexible Details, Polish Maritime Research, vol. 24, pp. 115-124. Kaliński K. J., Mazur M.: Optimal control at energy performance index of the mobile...
A handwritten signature verification method employing a tablet
Publication
- M. Lech
- A. Czyżewski
- Year 2016
A signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method's efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
Krzysztof Goczyła prof. dr hab. inż.

People

Department of Software Engineering

Krzysztof Goczyła is a full professor at Gdańsk University of Technology, a computer scientist, and a specialist in software engineering, knowledge engineering and databases. He graduated from the Faculty of Electronics of the Technical University of Gdansk in 1976 with a degree in electronic engineering, specializing in automation. Since then he has been working at Gdańsk University of Technology. In 1982 he obtained a doctorate in computer science...
IEEE International Conference on Acoustics, Speech and Signal Processing

Conferences
Investigating Feature Spaces for Isolated Word Recognition
Publication
- G. Korvel
- G. Tamulevicus
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Year 2018
Researchers have given much attention to the speech processing task in automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
Publication
- SENSORS - Year 2021
The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that make speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

Full text available to download
Handwritten signature verification system employing wireless biometric pen
Publication
- M. Lech
- A. Czyżewski
- Year 2017
A handwritten signature verification system that is part of the developed multimodal biometric banking stand is presented. The hardware component of the solution is described with a focus on the signature acquisition and on verification procedures. The signature is acquired employing an accelerometer and a gyroscope built into the biometric pen plus pressure sensors for the assessment of the proper pen grip and then the signature...
Introduction to the special issue on machine learning in acoustics
Publication
- Z. Michalopoulou
- P. Gerstoft
- B. Kostek
- M. A. Roch
- Journal of the Acoustical Society of America - Year 2021
When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. The areas listed included, but were not limited to, the following: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...

Full text available to download
Multimodal English corpus for automatic speech recognition
Publication
- Year 2013
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
Zdzisław Kowalczuk prof. dr hab. inż.

People

Department of Decision Systems and Robotics

Zdzisław Kowalczuk received his M.Sc. degree in 1978 and Ph.D. degree in 1986, both in Automatic Control from Technical University of Gdańsk (TUG), Gdańsk, Poland. In 1993 he received his D.Sc. degree (Dr Habilitus) in Automatic Control from Silesian Technical University, Gliwice, Poland, and the title of Professor from the President of Poland in 2003. Since 1978 he has been with the Faculty of Electronics, Telecommunications and Informatics...
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication
- K. Kąkol
- Year 2023
The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...

Full text available to download
Low-Level Music Feature Vectors Embedded as Watermarks
Publication
- Year 2013
In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...

Full text to download in external service
An audio-visual corpus for multimodal automatic speech recognition
Publication
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017
A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus, containing 31 hours of recordings, was created specifically to assist the development of audio-visual speech recognition (AVSR) systems. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available to download
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
Publication
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Year 2022
The aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...

Full text available to download
Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole
Publication
- A. Kupryjanow
- A. Czyżewski
- Year 2009
This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. The OLA-based algorithms and the Phase Vocoder are described. In the experimental part, the usability of those algorithms for real-time speech stretching is discussed.
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
Publication
- A. Kupryjanow
- A. Czyżewski
- Diagnostic Pathology - Year 2012
Methods developed for real-time time scale modification (TSM) of the speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA algorithm (Synchronous Overlap and Add). The influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing impaired children and elderly listeners. It was shown that for the speech with average rate equal to or...

Full text available to download
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE
Publication
- S. Zaporowski
- B. Kostek
- Year 2018
This article presents an analysis of speech-based emotion recognition solutions and of the possibility of using them, by means of neural networks, in the synthesis of emotional speech. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. A significant increase in the interest in and use of deep learning can currently be observed in applications related to...
Detecting Lombard Speech Using Deep Learning Approach
Publication
- K. Kąkol
- G. Korvel
- G. Tamulevicius
- B. Kostek
- SENSORS - Year 2023
Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

Full text available to download
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2022
Many telecommunication applications involve processing or analysis of the speech signal, within which, often at the level of the basic algorithms, a fundamental frequency (pitch) estimator is used. The estimator considered in this work is based on a neural classifier that makes decisions on the basis of the instantaneous frequency and power determined in subbands of the analyzed speech signal. In this work we consider...

Full text available to download
Thin-walled frames and grids - statics and dynamics
Publication
- Year 2012
Frames and grids assembled with thin-walled beams of open cross-section are widely applied in various civil engineering and vehicle or machine structures. Static and dynamic analysis of these structures may be carried out by means of different models, ranging from the classical models made of beam elements satisfying the Kirchhoff assumptions to the FE discretization of the whole frame into plane elements. The former model is very...
Uniwersalny system RPG do zastosowań w przestrzeniach inteligentnych
Publication
- K. Draszawka
- Year 2009
The article concerns voice command recognition systems (Polish abbreviation: RPG). Two basic types of RPG systems are presented, and the choice of an architecture suitable for use in intelligent spaces is discussed. The Dynamic Time Warping (DTW) algorithm for time alignment of signals is presented, together with the structure of the decision element of the implemented system. Results of the evaluation of this system are presented.
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
Publication
- M. Blok
- P. Drózda
- Archives of Acoustics - Year 2014
In this paper a sample rate conversion algorithm which allows for a continuously changing resampling ratio is presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...

Full text available to download
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Publication
- SENSORS - Year 2022
Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by human hearing or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...

Full text available to download
A Novel Approach to the Assessment of Cough Incidence
Publication
- Year 2013
In this paper we consider the problem of identification of cough events in patients suffering from chronic respiratory diseases. Information about the frequency of cough events is necessary for medical treatment. The proposed approach is based on bidirectional processing of a measured vibration signal: cough events are localized by combining the results of forward-time and backward-time analysis. The signal is at first transformed...

Full text to download in external service
Grzegorz Szwoch dr hab. inż.

People

Department of Multimedia Systems

Grzegorz Szwoch was born in 1972 in Gdansk. In 1991-1996 he studied at the Technical University of Gdansk. In 1996 he graduated as a student from the Sound Engineering Department. His thesis was related to physical modeling of musical instruments. Since that time he has been a member of the research staff at the Multimedia Systems Department as a PhD student (1996-2001), Assistant (2001-2004), Assistant professor (2004-2020) and...
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
Publication
- B. Kostek
- B. Szyca
- Journal of the Acoustical Society of America - Year 2023
The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...

Full text available to download
Identity verification using complex representations of handwritten signature
Publication
- M. Papaj
- E. Hermanowicz
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
This paper is devoted to handwritten signature verification using the cross-correlation approach (adopted by the authors from telecommunications) and dynamic time warping. The following invariants of the handwritten signature: the net signature, the instantaneous complex frequency and the complex cepstrum are analyzed. The problem of setting the threshold for deciding whether the current signature is authentic or forged is discussed....
Speech Intelligibility Measurements in Auditorium
Publication
- K. Leo
- ACTA PHYSICA POLONICA A - Year 2010
Speech intelligibility was measured in Auditorium Novum at the Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time (EDT) coefficients were measured. The negative contribution of noise to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in the auditorium. Correlation was found between...

Full text available to download
Time variable gain for long range sonar with chirp sounding signal
Publication
- HYDROACOUSTICS - Year 2007
The main purpose of applying Time Variable Gain (TVG) in active sonars with digital signal processing is to reduce the dynamic range of the echo signal and adapt it to the dynamic range of the analogue-to-digital conversion. With high transmission loss levels, the dynamic range of the input signal in long range sonars can be very high and even exceed 200 dB. When chirp sounding signals with matched filtration are used, sonars can reach...

Full text available to download
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication
- P. Falkowski-Gilski
- M. Plewka
- Year 2022
Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...

Full text to download in external service
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publication
- G. Korvel
- O. Kurasova
- B. Kostek
- Year 2019
Speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...

Full text available to download
Silence/noise detection for speech and music signals
Publication
- M. Papaj
- Year 2008
This paper introduces a novel off-line algorithm for silence/noise detection in noisy signals. The main concept of the proposed algorithm is to provide noise patterns for further signal processing, i.e. noise reduction for speech enhancement. The algorithm is based on frequency domain characteristics of signals. Examples of different types of noisy signals are presented.
Vibration signals collected for concrete beams with GFRP reinforcement subjected to elevated temperatures (120C-240C)
Open Research Data
open access
- B. Zima
The dataset contains the time domain signals obtained during dynamic tests of concrete beams reinforced with GFRP bars. The vibrations were induced using a modal hammer, while the signals were collected by accelerometers attached to the beam surface. The signals were captured before and after subjecting the concrete beams to elevated temperatures.
Investigating Feature Spaces for Isolated Word Recognition
Publication
- P. Treigys
- G. Korvel
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- Year 2020
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text to download in external service
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
Publication
- P. Falkowski-Gilski
- G. Debita
- Archives of Acoustics - Year 2023
In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...

Full text available to download
Auditory-visual attention stimulator
Publication
- Year 2013
A new approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, i.e. zooming, highlighting etc. In the experimental part of...

Full text to download in external service
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
Publication
- Journal of the Acoustical Society of America - Year 2018
A method for automatic transcription of English speech into the International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...

Full text to download in external service
Comparative analysis of various transformation techniques for voiceless consonants modeling
Publication
- G. Korvel
- B. Kostek
- O. Kurasova
- International Journal of Computers Communications & Control - Year 2018
In this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....

Full text available to download
