All results: 812
Search results for: ALOPHONEME ANALISYS, SPEECH PROCESSING, DYNAMIC TIME WARPING
-
Modified dynamic time warping method applied to handwritten signature authenticity verification
Publication: A signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method’s efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
Real-time speech-rate modification experiments
Publication: An algorithm designed for real-time speech time scale modification (stretching) is proposed. It combines a typical synchronous overlap and add based time scale modification algorithm with signal redundancy detection algorithms that allow parts of the speech signal to be removed and replaced with stretched speech signal fragments. The effectiveness of the signal processing algorithms is examined experimentally together...
-
Application of dynamic time warping and cepstrograms to text-dependent speaker verification
Publication: This work provides a description of an automatic speaker verification (ASV) system. In particular, it documents the evolution of all individual stages of the proposed ASV system design, from the preprocessing phase to an operational decision-making system. The aim of this research was to obtain a system offering the best safety and ease of use from the users' point of view. The objective estimation of this target has been accomplished by assessing...
-
Improved method for real-time speech stretching
Publication: An algorithm for real-time speech stretching is presented. It was designed to modify the input signal depending on its content and on its relation to the historical input data. The proposed algorithm is a combination of speech signal analysis algorithms, i.e. voice, vowel/consonant and stuttering detection, and a SOLA (Synchronous Overlap and Add) based speech stretching algorithm. This approach enables stretching input speech signal...
-
Time-domain prosodic modifications for text-to-speech synthesizer
Publication: An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.
-
A non-uniform real-time speech time-scale stretching method
Publication: An algorithm for non-uniform real-time speech stretching is presented. It combines a typical SOLA (Synchronous Overlap and Add) algorithm with vowel, consonant and silence detectors. Based on the information about the content and the estimated value of the rate of speech (ROS), the algorithm adapts the scaling factor value. The ability of real-time speech stretching and the resultant quality of voice were...
-
A Method of Real-Time Non-uniform Speech Stretching
Publication: A developed method of real-time non-uniform speech stretching is presented. The proposed solution is based on the well-known SOLA algorithm (Synchronous Overlap and Add). Non-uniform time-scale modification is achieved by the adjustment of time scaling factor values in accordance with the signal content. Depending on the speech unit (vowels/consonants), instantaneous rate of speech (ROS), and speech signal presence, values of the scaling factor...
-
Comparison of various speech time-scale modification methods
Publication: The objective of this work is to investigate the influence of different time-scale modification (TSM) methods on the quality of speech stretched using the designed non-uniform real-time speech time-scale modification algorithm (NU-RTSM). The algorithm combines a typical TSM algorithm with vowel, consonant, stutter, transient and silence detectors. Based on the information about the content and...
-
Speech codec enhancements utilizing time compression and perceptual coding
Publication: A method for encoding a wideband speech signal employing standardized narrowband speech codecs is presented, as well as experimental results concerning detection of tonal spectral components. The speech signal, sampled at a higher rate than is suitable for a narrowband coding algorithm, is compressed in order to decrease the number of samples. Next, the time-compressed representation of a signal is encoded using a narrowband...
-
Parallel implementation of background subtraction algorithms for real-time video processing on a supercomputer platform
Publication: Results of evaluation of background subtraction algorithms implemented on a supercomputer platform in a parallel manner are presented in the paper. The aim of the work is to choose an algorithm, a number of threads and a task scheduling method that together provide satisfactory accuracy and efficiency of real-time processing of high-resolution camera images, while keeping the cost of resource usage at a reasonable level. Two...
-
Intelligent processing of stuttered speech.
Publication: The article presents several methods for the analysis and automatic counting of articulation disfluencies related to stuttering, based on learning algorithms: artificial neural networks and rough sets.
-
Real-time speech stretching for supporting hearing impaired schoolchildren
Publication: A study of time scale modification algorithms applied to support hearing impaired schoolchildren is presented. A variety of algorithms is considered, namely overlap-and-add, two variations of synchronous overlap-and-add, and the phase vocoder. Their effectiveness as well as real-time processing capabilities are examined.
-
Time-scale modification of speech signals for supporting hearing impaired schoolchildren
Publication: A study of time scale modification algorithms applied to supporting hearing impaired schoolchildren is presented. A variety of algorithms is considered, namely overlap and add, two variations of synchronized overlap and add, and the phase vocoder. Their effectiveness as well as real-time processing capabilities are examined.
-
Overhead wires detection by FPGA real-time image processing
Publication: The paper presents the design and hardware implementation of real-time image filtering for overhead wires detection, divided into image processing and results presentation blocks. The image processing block was separated from the whole implementation, and its delay and hardware complexity were analysed. The maximum frequency of image processing of the proposed implementation was also estimated.
-
Linear Time-Varying Dynamic-Algebraic Equations of Index One on Time Scales
Publication: In this paper, we introduce a class of linear time-varying dynamic-algebraic equations (LTVDAE) of tractability index one on arbitrary time scales. We propose a procedure for the decoupling of the considered class of LTVDAE. Explicit formulae are written down both for the transfer operator and the obtained decoupled system. A projector approach is used to prove the main statement of the paper and sufficient conditions of decoupling...
-
Prediction of Processor Utilization for Real-Time Multimedia Stream Processing Tasks
Publication: Utilization of MPUs in a computing cluster node for multimedia stream processing is considered. Non-linear increase of processor utilization is described and a related class of algorithms for multimedia real-time processing tasks is defined. For such conditions, experiments measuring the processor utilization and output data loss were proposed and their results presented. A new formula for prediction of utilization was proposed...
-
On time-dependent nonlinear dynamic response of micro-elastic solids
Publication: A new approach to the mechanical response of micro-mechanical problems is presented using the modified couple stress theory. This model captures micro-turns due to micro-particles' rotations, which could be essential for microstructural materials and/or at small scales. In a micro medium with small rotations, sub-particles can also turn, apart from the rotation of the whole domain. However, this framework is competent for a static medium....
-
Artur Gańcza mgr inż.
Person: I received the M.Sc. degree from the Gdańsk University of Technology (GUT), Gdańsk, Poland, in 2019. I am currently a Ph.D. student at GUT, with the Department of Automatic Control, Faculty of Electronics, Telecommunications and Informatics. My professional interests include speech recognition, system identification, adaptive signal processing and linear algebra.
-
System of speech signal processing and visualisation for linguistic purposes
Publication
-
Estimation of time-frequency complex phase-based speech attributes using narrow band filter banks
Publication: In this paper, we present nonlinear estimators of nonstationary and multicomponent signal attributes (parameters, properties), namely instantaneous frequency, spectral (or group) delay, and chirp-rate (also known as instantaneous frequency slope). We estimate all of these distributions in the time-frequency domain using both finite and infinite impulse response (FIR and IIR) narrow band filters for speech analysis. Then, we present...
-
Multi-core processing system for real-time image processing in embedded computer vision applications
Publication: The article describes the architecture of a multi-core programmable system for real-time image processing. Image data are processed simultaneously by all processors. The system enables low-level image processing, e.g. background subtraction, moving object detection, geometric transformations, indexing of detected objects, assessment of their shape and basic analysis of motion trajectories. English version: This paper...
-
Neural modelling of dynamic systems with time delays based on an adjusted NEAT algorithm
Publication: A problem related to the development of an algorithm designed to find an architecture of an artificial neural network used for black-box modelling of dynamic systems with time delays has been addressed in this paper. The proposed algorithm is based on the well-known NeuroEvolution of Augmenting Topologies (NEAT) algorithm. The NEAT algorithm has been adjusted by allowing additional connections within an artificial neural network and...
-
Marking the Allophones Boundaries Based on the DTW Algorithm
Publication: The paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking allophone boundaries in continuous speech is a difficult issue due to the mutual influence of adjacent phonemes on each other. It is this neighborhood that on the one hand creates variants of phonemes, that is, allophones, and on the other hand it affects that the border...
-
Data processing methods for dynamic medical thermography.
Publication: The article presents the application of a new image synthesis method in thermography for the quantitative description of the thermal properties of tissues. Such a description makes it possible to differentiate medical cases. The method was applied to numerous phantom and in vitro measurements in experiments on animals (domestic pig). The results of the work are presented and discussed.
-
Impact of Shifting Time-Window Post-Processing on the Quality of Face Detection Algorithms
Publication: We consider binary classification algorithms, which operate on single frames from video sequences. Such a class of algorithms is named OFA (One Frame Analyzed). Two such algorithms for facial detection are compared in terms of their susceptibility to the FSA (Frame Sequence Analysis) method. It introduces a shifting time-window improvement, which includes the temporal context of frames in a post-processing step that improves the...
-
Dynamic fracture of brittle shells in a space-time adaptive isogeometric phase field framework
Publication: Phase field models for fracture prediction gained popularity as the formulation does not require the specification of ad-hoc criteria and no discontinuities are inserted in the body. This work focuses on dynamic crack evolution of brittle shell structures considering large deformations. The energy contributions from in-plane and out-of-plane deformations are separately split into tensile and compressive components and the resulting...
-
Influence of YARN Schedulers on Power Consumption and Processing Time for Various Big Data Benchmarks
Publication: Climate change caused by human activities can influence the lives of everybody on the planet. The environmental concerns must be taken into consideration by all fields of study including ICT. Green Computing aims to reduce negative effects of IT on the environment while, at the same time, maintaining all of the possible benefits it provides. Several Big Data platforms like Apache Spark or YARN have become widely used in analytics and...
-
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
Journal
-
Mariusz Kaczmarek dr hab. inż.
Person: Received M.Sc., Eng. in Electronics in 1995 from Gdansk University of Technology, Ph.D. in Medical Electronics in 2003 and habilitation in Biocybernetics and Biomedical Engineering in 2017. He was an investigator in about 13 projects receiving a number of awards, including four best papers, practical innovations (7 medals and awards) and also the Andronicos G. Kantsios Award and Siemens Award. Main research activities: the issues...
-
IEEE Transactions on Audio Speech and Language Processing
Journal
-
Adaptive Optimal Discrete-Time Output-Feedback Using an Internal Model Principle and Adaptive Dynamic Programming
Publication: In order to address the output feedback issue for linear discrete-time systems, this work suggests a brand-new adaptive dynamic programming (ADP) technique based on the internal model principle (IMP). The proposed method, termed as IMP-ADP, does not require complete state feedback, merely the measurement of input and output data. More specifically, based on the IMP, the output control problem can first be converted into a stabilization...
-
Assessment of the Impact of GNSS Processing Strategies on the Long-Term Parameters of 20 Years IWV Time Series
Publication: Advanced processing of collected global navigation satellite systems (GNSS) observations allows for the estimation of zenith tropospheric delay (ZTD), which in turn can be converted to the integrated water vapour (IWV). The proper estimation of GNSS IWV can be affected by the adopted GNSS processing strategy. To verify which of its elements cause deterioration and which improve the estimated GNSS IWV, we conducted eight reprocessings...
-
Dynamic inequalities and equations of Volterra type on time scales
Publication: The paper concerns integro-differential dynamic equations of Volterra type with initial conditions. Using the Banach fixed point theorem, the existence of a unique solution of the linear dynamic equation is shown. Using the monotone iterative method, the existence of extremal solutions for nonlinear problems is shown. Dynamic inequalities are also investigated. The paper also contains remarks on differential and difference problems.
-
Stability of softly switched multiregional dynamic output controllers with a static antiwindup filter: A discrete-time case
Publication: This paper addresses the problem of model-based global stability analysis of discrete-time Takagi–Sugeno multiregional dynamic output controllers with static antiwindup filters. The presented analyses are reduced to the problem of a feasibility study of the Linear Matrix Inequalities (LMIs), derived based on Lyapunov stability theory. Two sets of LMIs are considered candidate derived from the classical common quadratic Lyapunov...
-
Real-Time Multimedia Stream data Processing in a Supercomputer Environment
Publication: The chapter describes the experience gained by the authors while working on the MAYDAY EURO 2012 project. The main goal of the project is presented: creating a system that enables the development and parallel execution of multimedia services in a high-performance computing cluster environment. The subject of processing a large number of multimedia streams on high-performance computers is described. Then the capabilities of the KASKADA platform are presented: creating...
-
The influence of different time duration of thermal processing on berries quality
Publication: The content of bioactive compounds (polyphenols, flavonoids, tannins, anthocyanins and ascorbic acid) and the level of antioxidant activity were determined in samples of extracts (water, hexane and acetone) obtained from various species of berries. Tests such as ABTS, DPPH, FRAP and CUPRAC were used to measure the level of antioxidant activity. The influence of the duration of thermal processing on the content of bioactive...
-
The influence of different time durations of thermal processing on berries quality
Publication: Bioactive compounds (polyphenols, flavonoids, flavanols, tannins, anthocyanins and ascorbic acid) and the level of antioxidant activity by ABTS, DPPH, FRAP and CUPRAC of water, acetone and hexane extracts of Chilean 'Murtilla' (Ugni molinae Turcz) and 'Myrteola' berries (Myrtaceae, Myrteola nummularia (Poiret) Berg.), Chilean and Polish blueberries (Vaccinium corymbosum), Chilean raspberries (Rubus idaeus), and Polish black chokeberry...
-
A nine-input 1.25 mW, 34 ns CMOS analog median filter for image processing in real time
Publication: In this paper an analog voltage-mode median filter, which operates on a 3 × 3 kernel is presented. The filter is implemented in a 0.35 μm CMOS technology. The proposed solution is based on voltage comparators and a bubble sort configuration. As a result, a fast (34 ns) time response with low power consumption (1.25 mW for 3.3 V) is achieved. The key advantage of the configuration is relatively high accuracy of signal processing,...
-
On-line Parameter and Delay Estimation of Continuous-Time Dynamic Systems
Publication: The problem of on-line identification of non-stationary delay systems is considered. The dynamics of supervised industrial processes are usually modeled by ordinary differential equations. Discrete-time mechanizations of continuous-time process models are implemented with the use of dedicated finite-horizon integrating filters. Least-squares and instrumental variable procedures mechanized in recursive forms are applied for simultaneous...
-
Application of time-frequency methods for analysis of dynamic silo flow
Publication: The article presents the possibility of applying time-frequency methods to the analysis of the dynamic flow of granular material in a silo. The results of the FT (Fourier Transform), STFT (Short Time Fourier Transform) and WT (Wavelet Transform) are discussed.
-
CMOS implementation of an analogue median filter for image processing in real time
Publication: An analogue median filter, realised in a 0.35 μm CMOS technology, is presented in this paper. The key advantages of the filter are: high speed of image processing (50 frames per second), low-power operation (below 1.25 mW under 3.3 V supply) and relatively high accuracy of signal processing. The presented filter is a part of an integrated circuit for image processing (a vision chip), containing: a photo-sensor matrix, a set of...
-
Robust-adaptive dynamic programming-based time-delay control of autonomous ships under stochastic disturbances using an actor-critic learning algorithm
Publication: This paper proposes a hybrid robust-adaptive learning-based control scheme based on Approximate Dynamic Programming (ADP) for the tracking control of autonomous ship maneuvering. We adopt a Time-Delay Control (TDC) approach, which is known as a simple, practical, model-free and roughly robust strategy, combined with an Actor-Critic Approximate Dynamic Programming (ACADP) algorithm as an adaptive part in the proposed hybrid control...
-
Boundary value problems for dynamic equations with advanced arguments on time scales
Publication: The paper concerns dynamic equations and inequalities with advanced arguments. The subject of the research was the existence of solutions of dynamic equations. Sufficient conditions are formulated for the existence of a unique solution in a suitable region bounded by upper and lower solutions.
-
Boundary value problems for dynamic equations of Volterra type on time scales
Publication: The paper concerns equations and inequalities for dynamic problems of Volterra type. Sufficient conditions are given for the existence of extremal solutions in a region bounded by lower and upper solutions. The paper also contains some remarks on specific differential and discrete problems.
-
Task Allocation and Scalability Evaluation for Real-Time Multimedia Processing in a Cluster Environment
Publication: An allocation algorithm for stream processing tasks is proposed (Modified best Fit Descendent, MBFD). A comparison with another solution (BFD) is provided. Tests of the algorithms in an HPC environment are described and the results are presented. A proper scalability metric is proposed and used for the evaluation of the allocation algorithm.
-
Digital processing of pulse signal from light-to-frequency converter under dynamic condition
Publication
-
IEEE-ACM Transactions on Audio Speech and Language Processing
Journal
-
System przetwarzania i wizualizacji sygnału mowy dla potrzeb lingwistycznych = System of speech signal processing and visualisation of the results
Publication: The article presents a method of processing and visualising the speech signal in the form of an easy-to-use and relatively inexpensive device for recording the acoustic signal, digitally processing selected fragments and visualising the results of the transformations. A computer with a sound card was used for this purpose. Digital processing and visualisation were performed using MATLAB directly...
-
System przetwarzania i wizualizacji sygnału mowy dla potrzeb lingwistycznych [A system of speech signal processing and visualisation for linguistic purposes]
Publication
-
Dynamic mass measurement in checkweighers using a discrete time-variant low-pass filter
Publication: Conveyor belt type checkweighers are complex mechanical systems consisting of a weighing sensor (strain gauge load cell, electrodynamically compensated load cell), packages (of different shapes, made of different materials) and a transport system (motors, gears, rollers). Disturbances generated by the vibrating parts of such a system are reflected in the signal power spectra in the form of strong spectral peaks, located usually in...
-
Fixed final time and free final state optimal control problem for fractional dynamic systems – linear quadratic discrete-time case
Publication
-
Investigation of the 16-year and 18-year ZTD Time Series Derived from GPS Data Processing
Publication: The GPS system can play an important role in activities related to the monitoring of climate. Long time series, a coherent strategy, and the very high quality of the tropospheric parameter Zenith Tropospheric Delay (ZTD), estimated on the basis of GPS data analysis, make it possible to investigate its usefulness for climate research as a direct GPS product. This paper presents results of analysis of 16-year time series derived from EUREF Permanent Network...
-
EURASIP Journal on Audio Speech and Music Processing
Journal
-
Statistical Data Pre-Processing and Time Series Incorporation for High-Efficacy Calibration of Low-Cost NO2 Sensor Using Machine Learning
Publication: Air pollution stands as a significant modern-day challenge impacting life quality, the environment, and the economy. It comprises various pollutants like gases, particulate matter, biological molecules, and more, stemming from sources such as vehicle emissions, industrial operations, agriculture, and natural events. Nitrogen dioxide (NO2), among these harmful gases, is notably prevalent in densely populated urban regions. Given...
-
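Metoda i algorytmy modyfikacji sygnału do celu wspomagania rozumienia mowy przez osoby z pogorszoną rozdzielczością czasową słuchu [A method and algorithms of signal modification for supporting speech understanding by persons with degraded temporal resolution of hearing]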
Publication: The subject of the research carried out in the dissertation is real-time methods of Time Scale Modification (TSM) of the speech signal and the assessment of their influence on the understanding of utterances by persons with degraded temporal resolution of hearing. Degraded hearing resolution is one of the symptoms associated with Central Auditory Processing Disorder (CAPD). In contrast...
-
Journal of Real-Time Image Processing
Journal
-
Marek Czachor prof. dr hab.
Person
-
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication: In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing: noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...
-
Jan Daciuk dr hab. inż.
Person: Jan Daciuk received his M.Sc. degree from the Faculty of Electronics of Gdańsk University of Technology in 1986, and his Ph.D. from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology in 1999. He has worked at the Faculty since 1988. His research interests include applications of finite-state automata in natural language processing and speech processing. He has spent over four years at European universities and research institutes, such as...
-
System Supporting Speech Perception in Special Educational Needs Schoolchildren
Publication: A system supporting speech perception during classes is presented in the paper. The system combines a portable device, which enables real-time speech stretching, with a workstation designed to perform hearing tests. The system was designed to help children suffering from Central Auditory Processing Disorders.
-
Michał Mazur dr inż.
Person: Current interests: mechanical engineering, robotics, mechanical vibrations, modal analysis, control, real-time systems. Selected publications: Kaliński K., Galewski M., Mazur M., Chodnicki M., 2017, Modelling and Simulation Of A New Variable Stiffness Holder for Milling Of Flexible Details, Polish Maritime Research, vol. 24, pp. 115-124; Kaliński K. J., Mazur M.: Optimal control at energy performance index of the mobile...
-
Krzysztof Goczyła prof. dr hab. inż.
Person: Krzysztof Goczyła, full professor at Gdańsk University of Technology, computer scientist, specialist in software engineering, knowledge engineering and databases. He graduated from the Faculty of Electronics of Gdańsk University of Technology in 1976 with an M.Sc. Eng. degree in electronics, specialising in automatic control. He has worked at Gdańsk University of Technology since 1976. At the Faculty of Electronics he obtained his Ph.D. in computer science in 1982 and his habilitation in 1999. In 2012...
-
A handwritten signature verification method employing a tablet
Publication: A signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method's efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
IEEE International Conference on Acoustics, Speech and Signal Processing
Conference
-
Investigating Feature Spaces for Isolated Word Recognition
Publication: Much attention has been given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study investigates the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
-
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
Publication: The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that make speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...
-
Handwritten signature verification system employing wireless biometric pen
Publication: A handwritten signature verification system, part of the developed multimodal biometric banking stand, is presented. The hardware component of the solution is described with a focus on the signature acquisition and on verification procedures. The signature is acquired employing an accelerometer and a gyroscope built into the biometric pen plus pressure sensors for the assessment of the proper pen grip and then the signature...
-
Introduction to the special issue on machine learning in acoustics
Publication: When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. They were listed, but not limited to, as follows: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...
-
Multimodal English corpus for automatic speech recognition
Publication: A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Zdzisław Kowalczuk prof. dr hab. inż.
Person: In 1978 he graduated in automatic control and computer science from the Faculty of Electronics of Gdańsk University of Technology, and then began working at his alma mater. In 1986 he defended his doctoral thesis, and in 1993 he obtained his habilitation at the Silesian University of Technology on the basis of the work Dyskretne modele w projektowaniu układów sterowania [Discrete models in the design of control systems]. In 1996 he was appointed associate professor, and in 2003 he received the title of professor of technical sciences. In 2006 he founded and since then...
-
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication: The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that are recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real time, as measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...
-
Low-Level Music Feature Vectors Embedded as Watermarks
Publication: In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
-
An audio-visual corpus for multimodal automatic speech recognition
PublikacjaA review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus, containing 31 hours of recordings, was created specifically to assist the development of audio-visual speech recognition (AVSR) systems. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole (Application of slowed-down speech to improve speech comprehension by children at school)
PublikacjaThis paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. OLA-based algorithms and the Phase Vocoder are described. The experimental part discusses the usability of those algorithms for real-time speech stretching.
-
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
PublikacjaThe aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...
-
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
PublikacjaMethods developed for real-time time scale modification (TSM) of speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA algorithm (Synchronous Overlap and Add). The influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing-impaired children and elderly listeners. It was shown that for the speech with average rate equal to or...
-
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE (USING NEURAL NETWORKS FOR THE SYNTHESIS OF SPEECH EXPRESSING EMOTIONS)
PublikacjaThis article presents an analysis of speech-based emotion recognition solutions and of the possibility of using them, by means of neural networks, in the synthesis of emotional speech. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. A significant growth in the interest in and use of deep learning is currently observed in applications related...
-
Detecting Lombard Speech Using Deep Learning Approach
PublikacjaRobust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek (Training strategies for a neural pitch estimator using a synthetic vowel generator)
PublikacjaMany telecommunication applications involve processing or analysing a speech signal, and a pitch (glottal tone frequency) estimator is often used there among the basic algorithms. The estimator considered in this work is based on a neural classifier that makes decisions on the basis of the frequency and instantaneous power determined in subbands of the analysed speech signal. In this work we consider...
-
Thin-walled frames and grids - statics and dynamics
PublikacjaFrames and grids assembled with thin-walled beams of open cross-section are widely applied in various civil engineering and vehicle or machine structures. Static and dynamic analysis of these structures may be carried out by means of different models, starting from the classical models made of beam elements satisfying the Kirchhoff assumptions to the FE discretization of the whole frame into plane elements. The former model is very...
-
Uniwersalny system RPG do zastosowań w przestrzeniach inteligentnych (A universal voice command recognition system for use in smart spaces)
PublikacjaThe article concerns voice command recognition systems (Polish abbreviation: RPG). Two basic types of RPG systems are presented, and the choice of an architecture suitable for use in smart spaces (Polish abbreviation: PI) is discussed. The Dynamic Time Warping (DTW) algorithm and the structure of the decision element of the implemented system are presented. Results of the evaluation of this system are reported.
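The abstract above names DTW and a decision element without giving details. The following is a minimal sketch of textbook DTW on one-dimensional sequences, paired with a nearest-template decision rule. It is not the implementation from the publication: the absolute-difference local cost, the function names `dtw_distance` and `classify`, the template dictionary, and the rejection threshold are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences.

    Local cost is |a[i] - b[j]| (an assumption; real systems usually
    compare feature vectors such as cepstral frames).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allowed steps: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates, threshold):
    """Hypothetical decision element: choose the nearest template,
    and reject (return None) if its distance exceeds the threshold."""
    best_label, best_dist = None, inf
    for label, template in templates.items():
        dist = dtw_distance(sample, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist > threshold:
        return None, best_dist
    return best_label, best_dist
```

For example, `classify([1, 2, 3, 3], {"on": [1, 2, 3], "off": [3, 2, 1]}, 1.0)` selects `"on"`, because the repeated final sample is absorbed by the warping path at zero cost.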
-
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
PublikacjaIn this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...
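As a rough illustration of the idea in the abstract above, the sketch below resamples a signal with a ratio that may change from one output sample to the next, using a Farrow structure. Note the swap: the Farrow coefficients here come from a cubic Lagrange interpolator, not from fractional delay filters designed with the offset window method used in the paper. The names `farrow_cubic`, `resample_variable`, and `ratio_fn` are hypothetical.

```python
def farrow_cubic(xm1, x0, x1, x2, mu):
    """Cubic Lagrange interpolation in Farrow form.

    xm1, x0, x1, x2 are four consecutive input samples; mu in [0, 1)
    is the fractional position between x0 and x1. The polynomial
    coefficients depend only on the samples, so mu can change freely.
    """
    c0 = x0
    c1 = x1 - xm1 / 3.0 - x0 / 2.0 - x2 / 6.0
    c2 = (xm1 + x1) / 2.0 - x0
    c3 = (x2 - xm1) / 6.0 + (x0 - x1) / 2.0
    # Horner evaluation of c0 + c1*mu + c2*mu^2 + c3*mu^3
    return ((c3 * mu + c2) * mu + c1) * mu + c0


def resample_variable(x, ratio_fn):
    """Resample x with a time-varying ratio.

    ratio_fn(k) returns the output/input rate ratio (> 0) to use
    after producing the k-th output sample (hypothetical interface).
    """
    out = []
    t = 1.0  # start at index 1 so that a left neighbour exists
    k = 0
    while t < len(x) - 2:  # two right neighbours are needed
        n = int(t)
        mu = t - n
        out.append(farrow_cubic(x[n - 1], x[n], x[n + 1], x[n + 2], mu))
        t += 1.0 / ratio_fn(k)  # advance by the current input step
        k += 1
    return out
```

Because the fractional position enters only through the polynomial evaluation, the ratio can be changed continuously without redesigning any filter, which is the property the abstract highlights.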
-
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
PublikacjaObjective assessment of speech intelligibility is a complex task that requires taking into account a number of factors, such as the different perception of each speech sub-band by the human hearing sense or the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...
-
A Novel Approach to the Assessment of Cough Incidence
PublikacjaIn this paper we consider the problem of identification of cough events in patients suffering from chronic respiratory diseases. The information about frequency of cough events is necessary for medical treatment. The proposed approach is based on bidirectional processing of a measured vibration signal - cough events are localized by combining the results of forward-time and backward-time analysis. The signal is at first transformed...
-
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
PublikacjaThe main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...
-
Identity verification using complex representations of handwritten signature
PublikacjaThis paper is devoted to handwritten signature verification using the cross-correlation approach (adopted by the authors from telecommunications) and dynamic time warping. The following invariants of the handwritten signature: the net signature, the instantaneous complex frequency and the complex cepstrum are analyzed. The problem of setting the threshold for deciding whether the current signature is authentic or forged is discussed....
-
Grzegorz Szwoch dr hab. inż.
OsobyGrzegorz Szwoch was born in 1972 in Gdańsk. In 1991-1996 he studied at the Faculty of Electronics of Politechnika Gdańska (Gdańsk University of Technology). In 1996 he graduated from the Sound Engineering Department (now the Department of Multimedia Systems), defending a diploma thesis entitled "Modelowanie fizyczne wybranych instrumentów muzycznych" (Physical modelling of selected musical instruments). In the same year he joined the Department's research team as a doctoral student. Since January 2001...
-
Speech Intelligibility Measurements in Auditorium
PublikacjaSpeech intelligibility was measured in the Auditorium Novum at the Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time (EDT) coefficients were measured. The negative contribution of noise to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in the auditorium. Correlation was found between...
-
Time variable gain for long range sonar with chirp sounding signal
PublikacjaThe main purpose of applying Time Variable Gain (TVG) in active sonars with digital signal processing is to reduce the dynamic range of the echo signal and adapt it to the dynamic range of the analogue-to-digital conversion. With high transmission losses, the dynamic range of the input signal in long range sonars can be very high and even exceed 200 dB. When chirp sounding signals with matched filtration are used, sonars can reach...
-
Performance Analysis of the OpenCL Environment on Mobile Platforms
PublikacjaToday’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...
-
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
PublikacjaThe speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...
-
Silence/noise detection for speech and music signals
PublikacjaThis paper introduces a novel off-line algorithm for silence/noise detection in noisy signals. The main concept of the proposed algorithm is to provide noise patterns for further signal processing, i.e. noise reduction for speech enhancement. The algorithm is based on frequency domain characteristics of signals. Examples of different types of noisy signals are presented.
-
Vibration signals collected for concrete beams with GFRP reinforcement subjected to elevated temperatures (120C-240C)
Dane BadawczeThe dataset contains the time domain signals obtained during dynamic tests of concrete beams reinforced with GFRP bars. The vibrations were induced with a modal hammer, while the signals were collected by accelerometers attached to the beam surface. The signals were captured before and after subjecting the concrete beams to elevated temperatures.
-
Investigating Feature Spaces for Isolated Word Recognition
PublikacjaThe study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
-
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
PublikacjaIn order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...
-
Auditory-visual attention stimulator
PublikacjaA new approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, i.e. zooming, highlighting etc. In the experimental part of...
-
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
PublikacjaA method for automatic transcription of English speech into the International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...
-
Comparative analysis of various transformation techniques for voiceless consonants modeling
PublikacjaIn this paper, a comparison of various transformation techniques, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Discrete Walsh Hadamard Transform (DWHT), is performed in the context of their application to voiceless consonant modeling. Speech features based on these transformation techniques are extracted. These features are mean and derivative values of cepstrum coefficients, derived from each transformation....
-
A study on signal processing methods applied to hearing aids
PublikacjaThis paper presents a short survey on current technology available in hearing aids with a focus on digital signal processing techniques used. First, factors influencing the hearing aid effectiveness are introduced. Then, examples of the present DSP methods and strategies are provided. Also, a description of current limitations of hearing aids and future trends of development are shown. Finally, the notion of computational auditory...
-
Simulation of signal acquisition from a rotary flowmeter
Dane BadawczeThe dataset contains results of a simulation of measuring the flow of homogeneous substances with a rotational flow meter: the moment of an impulse at the flow meter output, the time between successive pulses, the number of pulses counted from the standard generator and the relative measurement error.