Search results for: SPECH PROCESSING

Playback detection using machine learning with spectrogram features approach

Publication

- Year 2017

This paper presents a 2D image processing approach to playback detection in automatic speaker verification (ASV) systems using spectrograms as the speech signal representation. Three feature extraction and classification methods: histograms of oriented gradients (HOG) with support vector machines (SVM), Haar wavelets with an AdaBoost classifier, and deep convolutional neural networks (CNN) were compared on different data partitions in respect...

Full text available for download in the portal

Badanie rozkładów parametrów sygnału mowy w zastosowaniach do prognozowania prawdopodobieństwa popełnienia błędów w systemach identyfikacji mówców = Examining distribution of speech signal parameters for the prognosis of error probability in speaker verification systems

Publication

A. Kaczmarek

- Year 2010

The subject of this work is a text-dependent speaker identification system. Many different utterances from several dozen speakers were analyzed. The parameterization method used is based on the results of cepstral analysis of the speech signal. New parameters associated with elementary events in the speaker verification process were defined. On this basis, the probability density functions were estimated...

Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation

Publication

S. Raczyński
E. Vincent

- IEEE Transactions on Audio Speech and Language Processing - Year 2014

In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...

Full text available for download from an external service

Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering

Publication

- IEEE Transactions on Audio Speech and Language Processing - Year 2015

This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...

Full text available for download in the portal

Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling

Publication

S. Raczyński
E. Vincent
S. Sagayama

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text available for download from an external service

New approach for determining the QoS of MP3-coded voice signals in IP networks

Publication

T. Uhl
S. Paulsen
K. Nowicki

- EURASIP Journal on Audio Speech and Music Processing - Year 2017

Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

Full text available for download in the portal

Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders

Publication

D. Koszewski
T. Görne
G. Korvel
B. Kostek

- EURASIP Journal on Audio Speech and Music Processing - Year 2023

The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Full text available for download in the portal

The effect of current signal filtering method on the value of cutting power while sawing wood

Publication

K. Orłowski
J. Sandak
T. Ochrymiuk
M. Lackowski
A. Sandak

- Annals of WULS, Forestry and Wood Technology - Year 2015

The goal of this work was to investigate the effect of various signal pre-processing methods on the outline of the electrical power curve and their influence on the estimation of the measured cutting force. Two signal processing methods were selected for the experiment: a digital filter and a wavelet transform. The filter was a 3rd-order Butterworth band-stop filter with a stop band from 45 Hz to 55 Hz. The second approach...

Full text available for download in the portal

AUTOMATYCZNA KLASYFIKACJA MOWY PATOLOGICZNEJ = AUTOMATIC CLASSIFICATION OF PATHOLOGICAL SPEECH

Publication

- Year 2023

The application presented in this chapter is used for automatic detection of pathological speech based on a database of recordings. First, the assumptions underlying the research are presented, along with the choice of the pathological speech database. The algorithms used and the speech signal features that make it possible to distinguish undisturbed speech from pathological speech are also presented. The trained neural networks were then...

Full text available for download from an external service

A Novel Approach to the Assessment of Cough Incidence

Publication

- Year 2013

In this paper we consider the problem of identification of cough events in patients suffering from chronic respiratory diseases. The information about frequency of cough events is necessary for medical treatment. The proposed approach is based on bidirectional processing of a measured vibration signal - cough events are localized by combining the results of forward-time and backward-time analysis. The signal is at first transformed...

Full text available for download from an external service

Exception handling model influence factors for distributed systems. In: Proceedings. PPAM 2003. Parallel Processing and Applied Mathematics. 5th International Conference. Częstochowa, 7-10 September 2003. Model obsługi wyjątków uwzględniający wpływ czynników systemu rozproszonego.

Publication

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2003

The program specification is clearly defined in sequential systems, where it has standard and exceptional transitions. The paper presents an extended system specification model for distributed environments that takes into account a number of specific factors. The model includes an analysis of the specification in terms of exception handling for distributed data and interprocessor communication. The general model was implemented in the environment...

POPRAWA OBIEKTYWNYCH WSKAŹNIKÓW JAKOŚCI MOWY W WARUNKACH HAŁASU = IMPROVING OBJECTIVE SPEECH QUALITY INDICATORS IN NOISE CONDITIONS

Publication

K. Kąkol
B. Kostek

- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2018

The aim of this work is to modify the speech signal so as to improve objective speech quality indicators after mixing the useful signal with noise or an interfering signal. The modifications are based on features of Lombard speech, in particular on the effect of raising the fundamental frequency F0. The recording session covered sets of words and sentences in Polish, recorded in quiet conditions as well as in...

Full text available for download in the portal

Enhanced voice user interface employing spatial filtration of signals from acoustic vector sensor

Publication

- Year 2015

Spatial filtration of sound is introduced to enhance speech recognition accuracy in noisy conditions. An acoustic vector sensor (AVS) is employed. The signals from the AVS probe are processed in order to attenuate the surrounding noise. As a result the signal to noise ratio is increased. An experiment is featured in which speech signals are disturbed by babble noise. The signals before and after spatial filtration are processed...

Full text available for download from an external service

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention

Publication

D. Korzekwa
R. Barra-Chicote
S. Zaporowski
G. Beringer
J. Lorenzo-Trueba
A. Serafinowicz
J. Droppo
T. Drugman
B. Kostek

- Year 2021

This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available for download in the portal

Zastosowanie spowalniania wypowiedzi w celu poprawy rozumienia mowy przez dzieci w szkole = Application of speech slowing to improve speech understanding by children at school

Publication

- Year 2009

This paper presents time-scale modification algorithms that could be used for hearing impairment therapy supported by real-time speech stretching. The OLA-based algorithms and the Phase Vocoder are described. In the experimental part, the usability of those algorithms for real-time speech stretching is discussed.

KORPUS MOWY ANGIELSKIEJ DO CELÓW MULTIMODALNEGO AUTOMATYCZNEGO ROZPOZNAWANIA MOWY = ENGLISH SPEECH CORPUS FOR MULTIMODAL AUTOMATIC SPEECH RECOGNITION

Publication

- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2016

The paper presents an audiovisual speech corpus containing 31 hours of English speech recordings. The corpus is intended for automatic audiovisual speech recognition. It contains video recordings from a high-frame-rate stereo vision camera and audio recorded by a microphone array and a laptop microphone. Thanks to the inclusion of recordings made in noisy conditions, the corpus...

Engineering Challenges in the Design of Cochlear Implants

Publication

K. Ullah
M. Ishaq

- Year 2021

Hearing aids such as cochlear implants have been used by both adults and children for a long time. In addition, cochlear implants are used by patients who have severe hearing loss either from birth or after an accident. This paper aims to investigate the engineering challenges bounding the design of cochlear implants and present possible solutions...

HYDROGRAPHIC SURVEY PLANNING FOR THE DETERMINATION OF TERRITORIAL SEA BASELINE ON THE EXAMPLE OF SELECTED POLISH SEA AREAS

Publication

M. Specht
C. Specht

- Year 2018

Full text available for download from an external service

THE USE OF GNSS GEODETIC NETWORKS ON THE APPROACH TO THE PORTS - GULF OF GDANSK STUDY

Publication

M. Specht
C. Specht

- Year 2018

Full text available for download from an external service

Concept of an Innovative System for Dimensioning and Predicting Changes in the Coastal Zone Topography Using UAVs and USVs (4DBatMap System)

Publication

O. Specht
M. Specht
A. Stateczny
C. Specht

- Electronics - Year 2023

This publication is aimed at developing a concept of an innovative system for dimensioning and predicting changes in the coastal zone topography using Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs). The 4DBatMap system will consist of four components: 1. Measurement data acquisition module. Bathymetric and photogrammetric measurements will be carried out with a specific frequency in the coastal zone using...

Full text available for download in the portal

Instantaneous complex frequency for pipeline pitch estimation

Publication

M. Kaniewska

- Year 2010

In the paper, a pipeline algorithm for estimating the pitch of a speech signal is proposed. The algorithm uses instantaneous complex frequencies (ICFs) estimated for four waveforms obtained by filtering the original speech signal through four bandpass complex Hilbert filters. The imaginary parts of ICFs from each channel give four candidates for pitch estimates. The decision regarding the final estimate is made based on the real parts of...

XVIII Międzynarodowe Sympozjum Inżynierii i Reżyserii Dźwięku

Publication

P. Falkowski-Gilski
S. Brachmański
A. Dobrucki
M. Kin

- Year 2021

The subjective assessment of speech signals takes into account previous experiences and habits of an individual. Since the perception process deteriorates with age, differences should be noticeable among people from dissimilar age groups. In this work, we investigated the difference of speech quality assessment between high school students and university students. The study involved 60 participants, with 30 people in both the adolescents...

Full text available for download from an external service

Creating new voices using normalizing flows

Publication

P. Biliński
T. Merritt
A. Ezzerg
K. Pokora
S. Cygert
K. Yanagisawa
R. Barra-Chicote
D. Korzekwa

- Year 2022

Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...

Full text available for download in the portal

PHONEME DISTORTION IN PUBLIC ADDRESS SYSTEMS

Publication

- Year 2015

The quality of voice messages in speech reinforcement and public address systems is often poor. The sound engineering projects of such systems take care of sound intensity and possible reverberation phenomena in public space without, however, considering the influence of acoustic interference related to the number and distribution of loudspeakers. This paper presents the results of measurements and numerical simulations of the...

Study Analysis of Transmission Efficiency in DAB+ Broadcasting System

Publication

P. Falkowski-Gilski

- Year 2018

DAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Broadcasting analog and digital radio services does vary, concerning devices on both transmitting and receiving side, as well as content processing mechanisms. However, the biggest difference is...

Full text available for download in the portal

Human voice modification using instantaneous complex frequency

Publication

M. Kaniewska

- Year 2010

The paper presents the possibilities of changing human voice by modifying instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding fundamental frequency and formants' positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully...

Innovative strategies: Combining treatments for advanced wastewater purification

Publication

R. A. de Jesus
N. Łukasik
A. Kumar
L. F. R. Ferreira

- Year 2024

Water scarcity is a pressing global challenge, driving the urgent need for effective wastewater treatment solutions. With untreated wastewater extensively employed, particularly in agriculture, the significance of proper treatment becomes evident, as it presents a more practical and ecologically responsible alternative. This chapter explores the diverse treatment approaches encompassing chemical, physical, and biological methods,...

Full text available for download from an external service

Architecture and implementation of distributed data storage using Web Services, CORBA and PVM. In: Proceedings. PPAM 2003. Parallel Processing and Applied Mathematics. Fifth International Conference. Częstochowa, 7-10 September 2003. Architektura i implementacja rozproszonego zarządzania danymi używając systemów Web Services, CORBA i PVM.

Publication

P. Czarnul

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2003

We propose an architecture and its implementation, PVMWeb Cluster I/O, designed for distributed data management. Data are stored in the system through Web Services from geographically distant clients, or through CORBA calls from within a given cluster, which offers better performance.

A History of Maritime Radio-Navigation Positioning Systems used in Poland

Publication

C. Specht
A. Weintrit
M. Specht

- JOURNAL OF NAVIGATION - Year 2016

Full text available for download from an external service

Application of an Autonomous/Unmanned Survey Vessel (ASV/USV) in Bathymetric Measurements

Publication

C. Specht
E. Świtalski
M. Specht

- Polish Maritime Research - Year 2017

Full text available for download from an external service

Assessment of the Steering Precision of a Hydrographic USV along Sounding Profiles Using a High-Precision GNSS RTK Receiver Supported Autopilot

Publication

Ł. Marchel
C. Specht
M. Specht

- ENERGIES - Year 2020

Full text available for download from an external service

Testing the Accuracy of the Modified ICP Algorithm with Multimodal Weighting Factors

Publication

Ł. Marchel
C. Specht
M. Specht

- ENERGIES - Year 2020

Full text available for download from an external service

Investigating Feature Spaces for Isolated Word Recognition

Publication

P. Treigys
G. Korvel
G. Tamulevicius
J. Bernataviciene
B. Kostek

- Year 2020

The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text available for download from an external service

Sprzętowa implementacja transformacji Hougha w czasie rzeczywistym = Real-time hardware implementation of the Hough transform

Publication

- Poznan University of Technology Academic Journals. Electrical Engineering - Year 2021

The article presents an FPGA hardware implementation of an algorithm for detecting shapes approximated by a set of straight lines during real-time digital image processing. In the developed hardware structure, processing efficiency was increased by using pipelined processing, lookup tables, integer-only arithmetic, and distributed voting memory. Experimentally...

Full text available for download in the portal

Auditory-visual attention stimulator

Publication

- Year 2013

A new approach to lateralization irregularities formation was proposed. The emphasis is put on the relationship between visual and auditory attention stimulation. In this approach hearing is stimulated using time-scale modified speech and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, e.g. zooming and highlighting. In the experimental part of...

Full text available for download from an external service

INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH

Publication

G. Korvel
P. Treigys
K. Kąkol
B. Kostek

- International Journal of Applied Mathematics and Computer Science - Year 2023

The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...

Full text available for download in the portal

Adaptive Personal Tuning of Sound in Mobile Computers

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2016

An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...

Full text available for download in the portal

Depth Images Filtering In Distributed Streaming

Publication

- Polish Maritime Research - Year 2016

In this paper, we propose a distributed system for processing point clouds and transferring them over a computer network with regard to effectiveness-related requirements. We discuss the comparison of point cloud filters focusing on their usage for streaming optimization. For the filtering step of the stream processing pipeline we evaluate four filters: Voxel Grid, Radial Outliner Remover, Statistical Outlier Removal and Pass Through....

Full text available for download in the portal

High performance filtering for big datasets from Airborne Laser Scanning with CUDA technology

Publication

W. Błaszczak-Bąk
A. Janowski
P. Srokosz

- SURVEY REVIEW - Year 2018

There are many studies on the problems of processing big datasets provided by Airborne Laser Scanning (ALS). The processing of point clouds is often executed in stages or on fragments of the measurement set. Therefore, solutions that enable the processing of the entire cloud at the same time in a simple, fast, and efficient way are the subject of much research. In this paper, the authors propose to use General-Purpose computation...

Full text available for download from an external service

Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.

Publication

- Year 2018

In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...

Full text available for download from an external service

Accuracy and coverage of the modernized Polish Maritime differential GPS system

Publication

C. Specht

- ADVANCES IN SPACE RESEARCH - Year 2011

Full text available for download from an external service

THE CONCEPT OF INTEGRATED SYSTEM FOR COLLECTING GEOGRAPHIC AND HYDROGRAPHIC DATA FOR NAVIGATION PURPOSES IN RIS

Publication

C. Specht

- Year 2018

Full text available for download from an external service

NAVIGATION USERS OF MULTI-GNSS CODE RECEIVERS

Publication

C. Specht

- Year 2019

Full text available for download from an external service

DEPTH IMAGES FILTERING IN DISTRIBUTED STREAMING

Publication

- Polish Maritime Research - Year 2016

In this paper we discuss the comparison of point cloud filters focusing on their applicability for streaming optimization. For the filtering stage within a stream processing pipeline we evaluate three filters: Voxel Grid, Pass Through and Statistical Outlier Removal. For the filters we perform a series of tests aimed at evaluating changes in point cloud size and transmission frequency (various fps ratios). We propose a distributed...

Full text available for download in the portal

Personal adaptive tuning of mobile computer audio

Publication

- Year 2015

An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of the acoustic track to the changing conditions and to the user's individual preferences. Original signal processing algorithms are introduced, which concern: linearization of frequency response, dialogue intelligibility enhancement and dynamics processing tuned up to the user's preferences....

Parallelization of video stream algorithms in Kaskada platform

Publication

A. Brzeski

- Year 2011

The purpose of this work is to present different techniques of video stream algorithms parallelization provided by the Kaskada platform - a novel system working in a supercomputer environment designated for multimedia streams processing. Considered parallelization methods include frame-level concurrency, multithreading and pipeline processing. Execution performance was measured on four time-consuming image recognition algorithms,...

OpenGL accelerated method of the material matrix generation for FDTD simulations

Publication

- Year 2014

This paper presents the accelerated technique of the material matrix generation from CAD models utilized by the finite-difference time-domain (FDTD) simulators. To achieve high performance of these computations, the parallel-processing power of a graphics processing unit was employed with the use of the OpenGL library. The method was integrated with the developed FDTD solver, providing approximately five-fold speedup of the material...

Full text available for download from an external service