Search results for: audio

Audiosfera środowiska pracy w przestrzeni biurowej na planie otwartym. Wyniki zwiadu badawczego

Publication

P. Mizera-Pęczek

- e-mentor - Year 2021

Full text to download in external service

Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders

Publication

D. Koszewski
T. Görne
G. Korvel
B. Kostek

- EURASIP Journal on Audio Speech and Music Processing - Year 2023

The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Full text available to download

Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling

Publication

S. Raczyński
E. Vincent
S. Sagayama

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text to download in external service

Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation

Publication

S. Raczyński
E. Vincent

- IEEE Transactions on Audio Speech and Language Processing - Year 2014

In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...

Full text to download in external service

New approach for determining the QoS of MP3-coded voice signals in IP networks

Publication

T. Uhl
S. Paulsen
K. Nowicki

- EURASIP Journal on Audio Speech and Music Processing - Year 2017

Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

Full text available to download

Estimation of the short-term predictor parameters of speech under noisy conditions

Publication

M. Kuropatwiński
W. Kleijn

- IEEE Transactions on Audio Speech and Language Processing - Year 2006

Full text to download in external service

Automatic sound recognition for security purposes

Publication

P. Żwan

- Year 2008

In the paper an automatic sound recognition system is presented. It forms part of a larger security system developed to monitor outdoor places for atypical audio-visual events. The analyzed audio signal is recorded by a microphone mounted outdoors, so it contains non-stationary noise of significant energy. In the paper a specially designed algorithm for outdoor noise reduction is presented,...

QoS/QoE in the Heterogeneous Internet of Things (IoT)

Publication

K. Nowicki
T. Uhl

- Year 2017

Applications provided in the Internet of Things can generally be divided into three categories: audio, video and data. This has given rise to the popular term Triple Play Services. The most important audio applications are VoIP and audio streaming. The most notable video applications are VToIP, IPTV, and video streaming, and the service WWW is the most prominent example of data-type services. This chapter elaborates on the most...

Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions

Publication

- Year 2016

Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Full text to download in external service

Analiza stanu nawierzchni i klas pojazdów na podstawie parametrów ekstrahowanych z sygnału fonicznego

Publication

- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2016

The aim of the research is to find parameters of a feature vector extracted from the audio signal in the context of automatic recognition of road surface condition and vehicle type. First, the influence of weather conditions on the spectral characteristics of the audio signal recorded near passing vehicles is presented. Then, the audio signal was parameterized and a correlation analysis was carried out in order to...

Full text available to download

Digital Transformation of Terrestrial Radio: An Analysis of Simulcasted Broadcasts in FM and DAB+ for a Smart and Successful Switchover

Publication

P. Falkowski-Gilski

- Applied Sciences-Basel - Year 2021

The process of digitizing radio is far from over. It is an important interdisciplinary aspect, involving Big Data and AI (Artificial Intelligence) when it comes to classifying and handling content, and an organizational challenge in the Industry 4.0 concept. There exist several methods for delivering audio signals, including terrestrial broadcasting and internet streaming. Among them, the DAB+ (Digital Audio Broadcasting plus)...

Full text available to download

Examining Acoustic Emission of Engineered Ultrasound Loudspeakers

Publication

- Year 2014

Measurement results of the sound emitted from a custom-made ultrasound system with high spatial directivity are presented. The proposed system uses modulated ultrasound waves, which demodulate in a nonlinear medium, resulting in audible sound. The system is aimed at enhancing the users’ personal audio space; therefore, the measurements are performed using the Head and Torso Simulator, which provides the realistic reproduction of...

A concept of Signal Equalization Method Based on Music Genre and the Listener's Room Characteristics

Publication

- Year 2016

A research study that investigates the influence of the room acoustics environment on the frequency characteristic of the audio signal playback is presented. First, a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the frequency response of the room, a system for room acoustics compensation based on eight-band equalizer is proposed. The system settings depend on music genre. In...

Measurements and Simulations of Engineered Ultrasound Loudspeakers

Publication

- Computational Methods in Science and Technology - Year 2015

Simulation and measurement results of the sound emitted from a custom-made ultrasound system with high spatial directivity are presented. The proposed system uses modulated ultrasound waves, which demodulate in a nonlinear medium, resulting in audible sound. The system is aimed at enhancing the users’ personal audio space; therefore, the measurements are performed using the Head and Torso Simulator, which provides realistic reproduction...

Full text to download in external service

Intelligent multimedia solutions supporting special education needs.

Publication

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011

The role of computers in school education is briefly discussed. The development history of multimodal interfaces is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expression and a speech stretching audio interface representing audio modality....

Quality Aspects in Digital Broadcasting and Webcasting Systems: Bitrate versus Loudness

Publication

- Journal of Telecommunications and Information Technology - Year 2017

In this paper the quality aspects of bitrate and loudness in digital broadcasting and webcasting systems are examined. The authors discuss a survey concerning user preferences related with processing and managing audio content. The coding efficiency of a popular audio format is analyzed in the context of storing media. An objective study on a representative group of signal samples, as well as a subjective study of the perceived...

Full text available to download

Bimodal deep learning model for subjectively enhanced emotion classification in films

Publication

D. Weber
B. Kostek

- INFORMATION SCIENCES - Year 2024

This research delves into the concept of color grading in film, focusing on how color influences the emotional response of the audience. The study commenced by recalling state-of-the-art works that process audio-video signals and associated emotions by machine learning. Then, assumptions of subjective tests for refining and validating an emotion model for assigning specific emotional labels to selected film excerpts were presented....

Full text to download in external service

Online sound restoration system for digital library applications

Publication

- Year 2013

Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Full text to download in external service

Wow defect reduction based on interpolation techniques

Publication

P. Maziewski

- Year 2005

The paper presents the results of a study of various interpolation techniques used in wow reduction. The study used linear interpolation, two polynomial interpolation techniques (Hermite and spline), and a technique of summing windowed sinc functions. The reconstruction was carried out on an artificially prepared audio signal, reconstructed with the listed interpolation methods. The reconstruction quality was assessed using...

Creating a Reliable Music Discovery and Recommendation System

Publication

- Year 2014

The aim of this paper is to show problems related to creating a reliable music discovery system. The SYNAT database that contains audio files is used for the purpose of experiments. The files are divided into 22 classes corresponding to music genres with different cardinality. Of utmost importance for a reliable music recommendation system are the assignment of audio files to their appropriate genres and optimum parameterization...

Full text to download in external service

Transmitting Alarm Information in DAB+ Broadcasting System

Publication

P. Falkowski-Gilski

- Year 2018

The main goal of digital broadcasting is to deliver high-quality content with the lowest possible bitrate. This paper is focused on transmitting alarm information, such as emergency warning and alerting, in the DAB+ (Digital Audio Broadcasting plus) broadcasting system. These additional services should be available at the lowest possible bitrate, in order to provide a clear and understandable voice message to people. Furthermore, additional...

Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification

Publication

A. Rosner
F. Weninger
B. Schuller
M. Michalak
B. Kostek

- Year 2013

We present a comprehensive evaluation of the influence of 'harmonic' and rhythmic sections contained in an audio file on automatic music genre classification. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...

EVENTS VISUALIZATION POST IN A DISTRIBUTED TELEINFORMATION SYSTEM FOR THE BORDER GUARD

Publication

- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2017

Events Visualization Post is a part of the STRADAR project, which is dedicated to streaming real-time data in distributed dispatcher and teleinformation systems of the Border Guard. Events Visualization Post is a software designed for simultaneous visualization of data of different types. In the paper, the structure of the software is presented, the process of generation of tasks is described, and the visualization of audio, files,...

Online sound restoration system for digital library applications.

Publication

- Journal of the Acoustical Society of America - Year 2013

Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Building Knowledge for the Purpose of Lip Speech Identification

Publication

- Advances in Intelligent Systems and Computing - Year 2017

Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

Full text to download in external service

Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning

Publication

B. Kostek

- Year 2023

In this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learning-based audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIKW pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....

Full text to download in external service

Fitting the mobile device characteristics to the user's hearing preferences

Publication

- Year 2014

A method for fitting the mobile computer audio characteristics to the user's hearing preferences is proposed. The process consists of two stages: calibration and dynamics processing. During the calibration phase the user performs a loudness scaling test, giving their response regarding the perceived loudness. The dynamics processing performed on this basis sets the loudness to the most comfortable level. The processing accounts both...

Full text to download in external service

Reduction of parasitic pitch variations in archival musical recordings

Publication

- SIGNAL PROCESSING - Year 2010

A new method for reducing parasitic pitch variations in archival audio recordings is presented. The method is intended for analyzing movie soundtracks recorded in optical films. It utilizes image processing for calculating and reducing effects of tape shrinkage being one of the main reasons for parasitic pitch variations in audio accompanying moving images. As long as the film tape characteristics are known the new method can be...

Full text available to download

Postprodukcja nagrania wideo z dzwiekiem dookolnym

Publication

- Year 2009

One of the aims of this paper is to present issues related to audio-video correlation. This is presented on the basis of a short film production employing surround microphone techniques. First, some related works in the domain of sound and vision correlation are presented. Then assumptions concerning scene creation related to both audio and video are briefly described. Another objective is to discuss results of subjective tests...

1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type

Publication

- Year 2020

A network architecture that may be employed for sensing and recognizing the type of a vehicle on the basis of audio recordings made in the proximity of a road is proposed in the paper. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain sounds of passing vehicles are also taken into account and marked as containing silence....

Evaluation of a Novel Approach to Virtual Bass Synthesis Strategy

Publication

- Year 2015

The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) strategy applied to portable computers. The developed algorithms involve intelligent, rule-based settings of bass synthesis parameters with regard to music genre of an audio excerpt and the type of a portable device in use. The Smart VBS algorithm performs the synthesis based on a nonlinear device (NLD) with artificial controlling synthesis...

Full text to download in external service

Classification of Music Genres by Means of Listening Tests and Decision Algorithms

Publication

- Year 2018

The paper compares the results of audio excerpt assignment to a music genre obtained in listening tests and classification by means of decision algorithms. A short review on music description employing music styles and genres is given. Then, assumptions of listening tests to be carried out along with an online survey for assigning audio samples to selected music genres are presented. A framework for music parametrization is created...

Full text to download in external service

Music genre classification applied to bass enhancement for mobile technology

Publication

- Elektronika : konstrukcje, technologie, zastosowania - Year 2015

The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) algorithms applied to portable computers. The proposed algorithm is related to intelligent, rule-based setting of synthesis parameters according to music genre of an audio excerpt. The classification of music genres is automatically executed employing MPEG 7 parameters and the Principal Component Analysis method applied to reduce information...

Full text to download in external service

Machine learning applied to acoustic-based road traffic monitoring

Publication

- Procedia Computer Science - Year 2022

The motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...

Full text available to download

Machine learning applied to acoustic-based road traffic monitoring

Publication

- Year 2022

The motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...

Full text available to download

Music Data Processing and Mining in Large Databases for Active Media

Publication

- Year 2014

The aim of this paper was to investigate the problem of music data processing and mining in large databases. Tests were performed on a large database that included approximately 30000 audio files divided into 11 classes corresponding to music genres with different cardinalities. Every audio file was described by a 173-element feature vector. To reduce the dimensionality of data the Principal Component Analysis (PCA) with variable...

Full text to download in external service

Further Developments of the Online Sound Restoration System for Digital Library Applications

Publication

- Year 2014

New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

Full text to download in external service

Sparse autoregressive modeling

Publication

M. Ciołek

- Year 2012

In the paper the comparison of the popular pitch determination (PD) algorithms for the purpose of elimination of clicks from archive audio signals using sparse autoregressive (SAR) modeling is presented. The SAR signal representation has been widely used in code-excited linear prediction (CELP) systems. The appropriate construction of the SAR model is required to guarantee model stability. For this reason the signal representation...

An Approach to Bass Enhancement in Portable Computers Employing Smart Virtual Bass Synthesis Algorithms

Publication

- Year 2014

The aim of this paper is to present a novel approach to the Virtual Bass Synthesis (VBS) algorithms applied to portable computers. The developed algorithms are related to intelligent, rule-based setting of synthesis parameters according to music genre of an audio excerpt and to the type of a portable device in use. To find optimum synthesis parameters of the VBS algorithms, subjective listening tests based on a parametric procedure...

Full text to download in external service

Innovative method of localization airplanes in VCS (VCS-MLAT) distributed system

Publication

S. Wiszniewski

- Year 2019

The article presents the concept and the structure of the localization module. The prototype module is part of the VCS (VCS-MLAT) distributed localization system. The device receives the audio signal transmitted in the aircraft band (118 MHz – 136 MHz). Received data with timestamps are sent to the main server. The data from multiple devices are used to estimate the location of the airplane. The main aim of the project is the analysis...

Cross-domain applications of multimodal human-computer interfaces

Publication

A. Czyżewski

- Year 2015

Developed multimodal interfaces for education applications and for disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and audio interface for speech stretching for hearing impaired and stuttering people and intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...

Subjective and Objective Comparative Study of DAB+ Broadcast System

Publication

- Archives of Acoustics - Year 2017

Broadcasting services seek to optimize their use of bandwidth in order to maximize user’s quality of experience. They aim to transmit high-quality digital speech and music signals at the lowest bitrate. They intend to offer the best quality under available conditions. Due to bandwidth limitations, audio quality is in conflict with the number of transmitted radio programs. This paper analyzes whether the quality of real-time digital...

Full text available to download

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention

Publication

D. Korzekwa
R. Barra-Chicote
S. Zaporowski
G. Beringer
J. Lorenzo-Trueba
A. Serafinowicz
J. Droppo
T. Drugman
B. Kostek

- Year 2021

This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available to download

Methodology and technology for the polymodal allophonic speech transcription

Publication

- Journal of the Acoustical Society of America - Year 2016

A method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

Full text to download in external service

Methodology and technology for the polymodal allophonic speech transcription

Publication

- Journal of the Acoustical Society of America - Year 2016

A method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

Full text to download in external service

Sound engineering as our commitment to its creators in Poland

Publication

- Archives of Acoustics - Year 2019

Sound engineering is an interdisciplinary and rapidly expanding domain. It covers many aspects, such as sound perception, studio and sound mastering technology, music information retrieval including content-based search systems and automatic music transcription frameworks, sound synthesis, sound restoration, electroacoustics, and other ones constituting multimedia technology. Moreover, machine learning methods applied to the topics...

Full text to download in external service

Measurements of QoS/QoE parameters for media streaming in a PMIPv6 testbed with 802.11 b/g/n WLANs

Publication

- Metrology and Measurement Systems - Year 2012

A growing number of mobile devices and the increasing popularity of multimedia services result in a new challenge of providing mobility in access networks. The paper describes experimental research on media (audio and video) streaming in a mobile IEEE 802.11 b/g/n environment realizing network-based mobility. It is an approach to mobility that requires little or no modification of the mobile terminal. Assessment of relevant parameters...

Full text available to download

Data Analysis in Bridge of Data

Publication

- Year 2022

The chapter presents the data analysis aspects of the Bridge of Data project. The software framework used, Jupyter, and its configuration are presented. The solution’s architecture, including the TRYTON supercomputer as the underlying infrastructure, is described. The use case templates provided by the Stat-reducer application are presented, including data analysis related to spatial points’ cloud-, audio- and wind-related research.

Full text available to download

Subiektywny pomiar jakości sygnałów mowy i muzyki w lokalnych multipleksach radiofonii DAB+ w Gdańsku i Wrocławiu

Publication

P. Falkowski-Gilski
S. Brachmański

- Year 2021

DAB+ (Digital Audio Broadcasting plus) digital radio has been available to listeners in Poland since 2013. The standard offers broad possibilities for configuring local multiplexes, in terms of not only the number but also the quality of the broadcast radio programs. This makes it possible to adjust the parameters of the emitted signals to meet the expectations of end users. In contrast to analog FM radio...

Full text to download in external service

On the Consumption of Multimedia Content Using Mobile Devices: a Year to Year User Case Study

Publication

P. Falkowski-Gilski

- Archives of Acoustics - Year 2020

In the early days, consumption of multimedia content related with audio signals was only possible in a stationary manner. The music player was located at home, with a necessary physical drive. An alternative way for an individual was to attend a live performance at a concert hall or host a private concert at home. To sum up, audio-visual effects were only reserved for a narrow group of recipients. Today, thanks to portable players,...

Full text available to download
