All results: 837
Search results for: ALOPHONEME ANALISYS, SPEECH PROCESSING, DYNAMIC TIME WARPING
-
System przetwarzania i wizualizacji sygnału mowy dla potrzeb lingwistycznych [A system of speech signal processing and visualisation for linguistic purposes]
Publication -
Dynamic mass measurement in checkweighers using a discrete time-variant low-pass filter
Publication: Conveyor belt type checkweighers are complex mechanical systems consisting of a weighing sensor (strain gauge load cell, electrodynamically compensated load cell), packages (of different shapes, made of different materials) and a transport system (motors, gears, rollers). Disturbances generated by the vibrating parts of such a system are reflected in the signal power spectra in the form of strong spectral peaks, located usually in...
-
Fixed final time and free final state optimal control problem for fractional dynamic systems – linear quadratic discrete-time case
Publication -
Investigation of the 16-year and 18-year ZTD Time Series Derived from GPS Data Processing
Publication: The GPS system can play an important role in activities related to the monitoring of climate. Long time series, a coherent strategy, and the very high quality of the tropospheric parameter Zenith Tropospheric Delay (ZTD) estimated on the basis of GPS data analysis make it possible to investigate its usefulness for climate research as a direct GPS product. This paper presents results of analysis of 16-year time series derived from EUREF Permanent Network...
-
EURASIP Journal on Audio Speech and Music Processing
Journals -
Machine-learning-based precise cost-efficient NO2 sensor calibration by means of time series matching and global data pre-processing
Publication: Air pollution remains a considerable contemporary challenge affecting life quality, the environment, and economic well-being. It encompasses an array of pollutants (gases, particulate matter, biological molecules) emanating from sources such as vehicle emissions, industrial activities, agriculture, and natural occurrences. Nitrogen dioxide (NO2), a harmful gas, is particularly abundant in densely populated urban areas. Given its...
-
Statistical Data Pre-Processing and Time Series Incorporation for High-Efficacy Calibration of Low-Cost NO2 Sensor Using Machine Learning
Publication: Air pollution stands as a significant modern-day challenge impacting life quality, the environment, and the economy. It comprises various pollutants like gases, particulate matter, biological molecules, and more, stemming from sources such as vehicle emissions, industrial operations, agriculture, and natural events. Nitrogen dioxide (NO2), among these harmful gases, is notably prevalent in densely populated urban regions. Given...
-
Metoda i algorytmy modyfikacji sygnału do celu wspomagania rozumienia mowy przez osoby z pogorszoną rozdzielczością czasową słuchu [A method and algorithms of signal modification for supporting speech understanding by people with impaired temporal resolution of hearing]
Publication: The subject of the research carried out in this dissertation is real-time methods of Time Scale Modification (TSM) of speech and the assessment of their influence on the understanding of utterances by people with impaired temporal resolution of hearing. Impaired hearing resolution is one of the symptoms associated with Central Auditory Processing Disorder (CAPD). In contrast to...
-
Journal of Real-Time Image Processing
Journals -
Marek Czachor prof. dr hab.
People -
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication: In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing: noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...
-
Jan Daciuk dr hab. inż.
People: Jan Daciuk obtained his MSc degree at the Faculty of Electronics of Gdańsk University of Technology in 1986, and his PhD at the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology in 1999. He has worked at the Faculty since 1988. His research interests include applications of finite-state automata in natural language processing and speech processing. He has spent more than four years at European universities and research institutes, such as...
-
System Supporting Speech Perception in Special Educational Needs Schoolchildren
Publication: A system supporting speech perception during classes is presented in the paper. The system combines a portable device, which enables real-time speech stretching, with a workstation designed to perform hearing tests. The system was designed to help children suffering from Central Auditory Processing Disorders.
-
Michał Mazur dr inż.
People: Current interests: mechanical engineering, robotics, mechanical vibrations, modal analysis, control, real-time systems. Selected publications: Kaliński K., Galewski M., Mazur M., Chodnicki M, 2017, Modelling and Simulation Of A New Variable Stiffness Holder for Milling Of Flexible Details, Polish Maritime Research, vol 24, pp. 115-124. Kaliński K. J., Mazur M.: Optimal control at energy performance index of the mobile...
-
Krzysztof Goczyła prof. dr hab. inż.
People: Krzysztof Goczyła, full professor at Gdańsk University of Technology, is a computer scientist specializing in software engineering, knowledge engineering and databases. He graduated from the Faculty of Electronics of Gdańsk University of Technology in 1976 with an MSc Eng. degree in electronics, specializing in automatic control. He has worked at Gdańsk University of Technology since 1976. At the Faculty of Electronics he obtained a PhD in computer science in 1982 and a habilitation in 1999. In 2012...
-
A handwritten signature verification method employing a tablet
Publication: A signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method's efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
IEEE International Conference on Acoustics, Speech and Signal Processing
Conferences -
Investigating Feature Spaces for Isolated Word Recognition
Publication: Researchers have given much attention to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation...
-
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
Publication: The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...
-
Handwritten signature verification system employing wireless biometric pen
Publication: The handwritten signature verification system that is part of the developed multimodal biometric banking stand is presented. The hardware component of the solution is described with a focus on the signature acquisition and on verification procedures. The signature is acquired employing an accelerometer and a gyroscope built into the biometric pen plus pressure sensors for the assessment of the proper pen grip and then the signature...
-
Introduction to the special issue on machine learning in acoustics
Publication: When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. The areas included, but were not limited to, the following: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...
-
Zdzisław Kowalczuk prof. dr hab. inż.
People: In 1978 he graduated in automatic control and computer science from the Faculty of Electronics of Gdańsk University of Technology, and then began working at his alma mater. He defended his doctoral thesis in 1986 and obtained his habilitation in 1993 at the Silesian University of Technology on the basis of the work Dyskretne modele w projektowaniu układów sterowania [Discrete models in control system design]. In 1996 he was appointed associate professor, and in 2003 he received the title of professor of technical sciences. In 2006 he founded and has since...
-
Multimodal English corpus for automatic speech recognition
Publication: A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Improvement of speech intelligibility in the presence of noise interference using the Lombard effect and an automatic noise interference profiling based on deep learning
Publication: The Lombard effect is a phenomenon that results in speech intelligibility improvement when applied to noise. There are many distinctive features of Lombard speech that were recalled in this dissertation. This work proposes the creation of a system capable of improving speech quality and intelligibility in real-time measured by objective metrics and subjective tests. This system consists of three main components: speech type detection,...
-
Low-Level Music Feature Vectors Embedded as Watermarks
Publication: In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
-
An audio-visual corpus for multimodal automatic speech recognition
Publication: A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically
Publication: The aim of this study is two-fold. First, we perform a series of experiments to examine the interference of different noises on speech processing. For that purpose, we concentrate on the Lombard effect, an involuntary tendency to raise speech level in the presence of background noise. Then, we apply this knowledge to detecting speech with the Lombard effect. This is for preparing a dataset for training a machine learning-based...
-
Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole [Application of speech slowing to improve speech understanding by children at school]
Publication: This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. The OLA-based algorithms and the Phase Vocoder are described. In the experimental part, the usability of those algorithms for real-time speech stretching is discussed.
-
English Language Learning Employing Developments in Multimedia IS
Publication: In the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...
-
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
Publication: Methods developed for real-time time scale modification (TSM) of speech signal are presented. They are based on the non-uniform, speech rate dependent SOLA algorithm (Synchronous Overlap and Add). Influence of the proposed method on the intelligibility of speech was investigated for two separate groups of listeners, i.e. hearing impaired children and elderly listeners. It was shown that for the speech with average rate equal to or...
-
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE [Using neural networks for the synthesis of speech expressing emotions]
Publication: This article presents an analysis of speech-based emotion recognition solutions and the possibility of using them in emotional speech synthesis by means of neural networks. Current solutions for emotion recognition in speech and methods of speech synthesis using neural networks are presented. Currently, a significant increase in the interest in and use of deep learning is observed in applications related to...
-
Detecting Lombard Speech Using Deep Learning Approach
Publication: Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
Strategie treningu neuronowego estymatora częstotliwości tonu krtaniowego z użyciem generatora syntetycznych samogłosek [Training strategies for a neural pitch estimator using a synthetic vowel generator]
Publication: In many telecommunication applications there arises the problem of speech signal processing or analysis in which a pitch (fundamental frequency) estimator is used, often among the basic algorithms. The estimator considered in this work is based on a neural classifier that makes decisions on the basis of the frequency and instantaneous power determined in subbands of the analysed speech signal. In this work we consider...
-
Variable Ratio Sample Rate Conversion Based on Fractional Delay Filter
Publication: In this paper a sample rate conversion algorithm which allows for continuously changing resampling ratio has been presented. The proposed implementation is based on a variable fractional delay filter which is implemented by means of a Farrow structure. Coefficients of this structure are computed on the basis of fractional delay filters which are designed using the offset window method. The proposed approach allows us to freely...
-
Thin-walled frames and grids - statics and dynamics
Publication: Frames and grids assembled with thin-walled beams of open cross-section are widely applied in various civil engineering and vehicle or machine structures. Static and dynamic analysis of these structures may be carried out by means of different models, starting from the classical models made of beam elements undergoing the Kirchhoff assumptions to the FE discretization of whole frame into plane elements. The former model is very...
-
Uniwersalny system RPG do zastosowań w przestrzeniach inteligentnych [A universal voice command recognition system for applications in intelligent spaces]
Publication: The article concerns voice command recognition (RPG) systems. Two basic types of RPG systems are presented, and the choice of an architecture suitable for applications in intelligent spaces (PI) is discussed. The Dynamic Time Warping (DTW) algorithm and the structure of the decision element of the implemented system are presented. The results of the evaluation of this system are presented.
-
Grzegorz Szwoch dr hab. inż.
People: Grzegorz Szwoch was born in 1972 in Gdańsk. From 1991 to 1996 he studied at the Faculty of Electronics of Gdańsk University of Technology. In 1996 he graduated from the Sound Engineering Department (now the Department of Multimedia Systems), defending a diploma thesis entitled Modelowanie fizyczne wybranych instrumentów muzycznych [Physical modelling of selected musical instruments]. In the same year he joined the Department's research team as a doctoral student. Since January 2001...
-
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Publication: Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors such as different perception of each speech sub-band by the human hearing sense or different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...
-
A Novel Approach to the Assessment of Cough Incidence
Publication: In this paper we consider the problem of identification of cough events in patients suffering from chronic respiratory diseases. The information about frequency of cough events is necessary for medical treatment. The proposed approach is based on bidirectional processing of a measured vibration signal: cough events are localized by combining the results of forward-time and backward-time analysis. The signal is at first transformed...
-
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
Publication: The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...
-
Identity verification using complex representations of handwritten signature
Publication: This paper is devoted to handwritten signature verification using the cross-correlation approach (adopted by the authors from telecommunications) and dynamic time warping. The following invariants of the handwritten signature: the net signature, the instantaneous complex frequency and the complex cepstrum are analyzed. The problem of setting the threshold for deciding whether the current signature is authentic or forged is discussed....
-
Speech Intelligibility Measurements in Auditorium
Publication: Speech intelligibility was measured in Auditorium Novum at the Technical University of Gdansk (seating capacity 408, volume 3300 m³). Articulation tests were conducted; STI and Early Decay Time (EDT) coefficients were measured. The negative contribution of noise to speech intelligibility was taken into account. Subjective measurements and objective tests reveal high speech intelligibility at most seats in the auditorium. Correlation was found between...
-
Performance Analysis of the OpenCL Environment on Mobile Platforms
Publication: Today’s smartphones have more and more features that so far were only assigned to personal computers. Every year these devices are composed of better and more efficient components. Everything indicates that modern smartphones are replacing ordinary computers in various activities. High computing power is required for tasks such as image processing, speech recognition and object detection. This paper analyses the performance of...
-
Time variable gain for long range sonar with chirp sounding signal
Publication: The main purpose of applying Time Variable Gain (TVG) in active sonars with digital signal processing is to reduce the dynamic range of the echo signal and adapt it to the dynamic range of the analogue-to-digital conversion. With a high level of transmission losses, the dynamic range of the input signal in long range sonars can be very high and even exceed 200 dB. When chirp sounding signals with matched filtration are used, sonars can reach...
-
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publication: Speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create mathematical models that allow for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...
-
Silence/noise detection for speech and music signals
Publication: This paper introduces a novel off-line algorithm for silence/noise detection in noisy signals. The main concept of the proposed algorithm is to provide noise patterns for further signal processing, i.e. noise reduction for speech enhancement. The algorithm is based on frequency domain characteristics of signals. Examples of different types of noisy signals are presented.
-
Vibration signals collected for concrete beams with GFRP reinforcement subjected to elevated temperatures (120C-240C)
Research Data: The dataset contains the time domain signals obtained during dynamic tests of concrete beams reinforced with GFRP bars. The vibrations were induced with a modal hammer, while the signals were collected by accelerometers attached to the beam surface. The signals were captured before and after subjecting the concrete beams to elevated temperatures.
-
Investigating Feature Spaces for Isolated Word Recognition
Publication: The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
-
Quality Evaluation of Speech Transmission via Two-way BPL-PLC Voice Communication System in an Underground Mine
Publication: In order to design a stable and reliable voice communication system, it is essential to know how many resources are necessary for conveying quality content. These parameters may include objective quality of service (QoS) metrics, such as: available bandwidth, bit error rate (BER), delay, latency as well as subjective quality of experience (QoE) related to user expectations. QoE is expressed as clarity of speech and the ability...
-
Auditory-visual attention stimulator
Publication: A new approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, such as zooming and highlighting. In the experimental part of...