Search results for: audio processing objects

Wow detection and compensation employing spectral processing of audio.

Publication

- Year 2004

The paper describes the developed algorithms for detecting and compensating parasitic frequency modulations caused by irregular transport of the sound carrier. The proposed methods were developed with particular regard to the random wow distortions present in archival film soundtracks. In addition, the algorithms examine the influence of the distortions on the formant structure of signals. Analysis of changes in formant positions...

Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing

Publication

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

Full text available to download

RENOVATION OF ARCHIVE AUDIO RECORDINGS USING SPARSE AUTOREGRESSIVE MODELING AND BIDIRECTIONAL PROCESSING

Publication

- Year 2013

The paper presents a new approach to elimination of broadband noise and impulsive disturbances from archive audio recordings. The proposed adaptive Kalman-like algorithm, based on a sparse autoregressive model of the audio signal, simultaneously detects noise pulses, interpolates the irrevocably distorted samples and performs signal smoothing. It is shown that bidirectional (forward-backward) processing of the archive signal improves...

Full text to download in external service

Intelligent Audio Signal Processing − Do We Still Need Annotated Datasets?

Publication

B. Kostek

- Year 2022

In this paper, intelligent audio signal processing examples are shortly described. The focus is, however, on the machine learning approach and datasets needed, especially for deep learning models. Years of intense research produced many important results in this area; however, the goal of fully intelligent signal processing, characterized by its autonomous acting, is not yet achieved. Therefore, a review of state-of-the-art concerning...

Full text available to download

Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams

Publication

K. Łopatka

- Year 2015

A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...

Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering

Publication

- IEEE Transactions on Audio Speech and Language Processing - Year 2015

This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...

Full text available to download

Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders

Publication

D. Koszewski
T. Görne
G. Korvel
B. Kostek

- EURASIP Journal on Audio Speech and Music Processing - Year 2023

The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Full text available to download

Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling

Publication

S. Raczyński
E. Vincent
S. Sagayama

- IEEE Transactions on Audio Speech and Language Processing - Year 2013

Symbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of analyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch structure. These models are formulated as linear or log-linear interpolations of up to five sub-models, each of which is...

Full text to download in external service

Estimation of the short-term predictor parameters of speech under noisy conditions

Publication

M. Kuropatwiński
W. Kleijn

- IEEE Transactions on Audio Speech and Language Processing - Year 2006

Full text to download in external service

Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation

Publication

S. Raczyński
E. Vincent

- IEEE Transactions on Audio Speech and Language Processing - Year 2014

In this work we present a new Bayesian topic model: latent hierarchical Pitman-Yor process allocation (LHPYA), which uses hierarchical Pitman-Yor process priors for both word and topic distributions, and generalizes a few of the existing topic models, including the latent Dirichlet allocation (LDA), the bigram topic model and the hierarchical Pitman-Yor topic model. Using such priors allows for integration of n-grams with a topic model,...

Full text to download in external service

New approach for determining the QoS of MP3-coded voice signals in IP networks

Publication

T. Uhl
S. Paulsen
K. Nowicki

- EURASIP Journal on Audio Speech and Music Processing - Year 2017

Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...

Full text available to download

Personal adaptive tuning of mobile computer audio

Publication

- Year 2015

An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of the acoustic track to the changing conditions and to the user's individual preferences. Original signal processing algorithms are introduced, which concern: linearization of frequency response, dialogue intelligibility enhancement and dynamics processing tuned up to the user's preferences....

Adaptive Personal Tuning of Sound in Mobile Computers

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2016

An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...

Full text available to download

Audio content analysis in the urban area telemonitoring system

Publication

- Year 2010

The article presents the possibilities of extending urban monitoring with automatic sound analysis. Sound parameterization methods that can be applied in such a system are presented, and technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. This system recognizes dangerous sounds (gunshot, broken glass, scream) among the recorded sounds...

Full text to download in external service

Discovering Rule-Based Learning Systems for the Purpose of Music Analysis

Publication

G. Korvel
B. Kostek

- Journal of the Acoustical Society of America - Year 2019

Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

Full text available to download

Moving object detection and tracking for the purpose of multimodal surveillance system in urban areas

Publication

- Year 2008

Background subtraction method based on mixture of Gaussians was employed to detect all regions in a video frame denoting moving objects. Kalman filters were used for establishing relations between the regions and real moving objects in a scene and for tracking them continuously. The objects were represented by rectangles. The objects coupling with adequate regions including the relation of many-to-many was studied experimentally...

Automatic system for audio-video material reconstruction and archiving

Publication

- Year 2008

The paper presents a proposed model of a system for automatic archiving and reconstruction of audio-video recordings. The premise of this solution is to make the reconstruction process less dependent on human involvement. This is intended to reduce the cost of reconstructing the processed recordings. Owing to the large number of archival audio-video recordings, there is a need for a system that enables automatic indexing of their content. This will help...

Multimodal Surveillance Based Personal Protection System

Publication

- Year 2013

A novel, multimodal approach for automatic detection of abduction of a protected individual, employing dedicated personal protection device and a city monitoring system is proposed and overviewed. The solution is based on combining four modalities (signals coming from: Bluetooth, fixed and PTZ cameras, thermal camera, acoustic sensors). The Bluetooth signal is used continuously to monitor the protected person presence, and in case...

Algorytmy wykrywania krawędzi w obrazie (Edge detection algorithms in images)

Publication

- Poznan University of Technology Academic Journals. Electrical Engineering - Year 2018

Edge detection is the first stage in digital image processing. The operation consists of removing information such as color or brightness and leaving only the edges. The result is a significant reduction in the amount of data for further analysis. This makes it possible to use, in subsequent stages, more complex algorithms for recognizing objects based on shape. The article presents the application of algorithms...

Full text available to download

Measurement of Latency in the Android Audio Path

Publication

- Year 2018

This paper provides a description of experimental investigations concerning comparison between the audio path characteristics of various Android versions. First, information about the changes in each system version in the context of latency caused by them is presented. Then, a measurement procedure employing available applications to measure latency is described and compared with results published on the Internet. Finally, a comparison...

Full text to download in external service

Localization of impulsive disturbances in audio signals using template matching

Publication

- DIGITAL SIGNAL PROCESSING - Year 2015

In this paper, a new solution to the problem of elimination of impulsive disturbances from audio signals, based on the matched filtering technique, is proposed. The new approach stems from the observation that a large proportion of noise pulses corrupting audio recordings have highly repetitive shapes that match several typical “patterns”. In many cases a representative set of exemplary pulse waveforms can be extracted from the...

Full text available to download

A Study on Audio Signal Processed by "Instant Mastering"

Publication

M. Piotrowska
S. Piotrowski
B. Kostek

- Year 2018

An increasing amount of music produced in home- and project-studios results in development and growth of "automatic mastering services". The presented investigation explores changes introduced to audio signal by various online mastering platforms. A music set consisting of 10 songs produced in small facilities was processed by eight on-line automatic mastering services. Additionally, some laboratory-constructed signals were tested....

EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY

Publication

- Year 2014

The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...

Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020

Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available to download

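Analiza kosztów i czasu budowy domu jednorodzinnego w technologii drewna krzyżowo klejonego CLT (Analysis of the cost and construction time of a single-family house built with cross-laminated timber, CLT)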

Publication

- Materiały Budowlane - Year 2021

Full text to download in external service

An audio-visual corpus for multimodal automatic speech recognition

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available to download

Interactions with recognized objects

Publication

J. Rumiński
A. Bujnowski
J. Wtorek
A. Andrushevich
M. Biallas
R. Kistler

- Year 2014

Implicit interaction combined with object recognition techniques opens a new possibility for gathering data and analyzing user behavior for activity and context recognition. The electronic eyewear platform, eGlasses, is being developed, as an integrated and autonomous system to provide interactions with smart environment. In this paper we present a method for the interactions with the recognized objects that can be used for electronic...

Full text to download in external service

Wykorzystanie sztucznych sieci neuronowych do wykrywania i rozpoznawania tablic rejestracyjnych na zdjęciach pojazdów (Using artificial neural networks to detect and recognize license plates in vehicle photographs)

Publication

- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2015

The article presents the concept of a license plate detection and recognition algorithm (AWiRTR) for digital images of vehicles. Detection and localization of license plates, and extraction of individual characters from the plate image, are carried out using basic image processing techniques (morphological transformations, edge detection) as well as basic statistical data of the detected objects...

Full text available to download

Retrospecting Polish Audio Engineering Society Membership on 20th Anniversary of the Polish Section of the Audio Engineering Society

Publication

B. Kostek
M. Sankiewicz

- Archives of Acoustics - Year 2011

In this article some key events concerning the founding of the Polish Section of the Audio Engineering Society are presented. In addition, the history of the International Symposia on Sound Engineering and Mastering is outlined. The papers contained in this issue are also briefly reviewed.

Full text available to download

A new method of audio-visual correlation analysis

Publication

- Year 2009

This paper presents a new methodology of conducting the audio-visual correlation analysis employing the gaze tracking system. Interaction between two perceptual modalities, seeing and hearing, their interaction and mutual reinforcement in a complex relationship was a subject of many research studies. Earlier stage of the carried out experiments at the Multimedia Systems Department (MSD) showed that there exists a relationship between...

Full text to download in external service

Objectivization of audio-video correlation assessment experiments

Publication

- Year 2010

The purpose of this paper is to present a new method of conducting an audio-visual correlation analysis employing a head-motion-free gaze tracking system. First, a review of related works in the domain of sound and vision correlation is presented. Then assumptions concerning audio-visual scene creation are shortly described. The objectivization process of carrying out correlation tests employing gaze-tracking system is outlined....

Full text to download in external service

Intelligent video and audio applications for learning enhancement

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2011

The role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....

Full text available to download

Detection of impulsive disturbances in archive audio signals

Publication

- Year 2017

In this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors, allow one to noticeably improve detection results compared to the prediction-only based solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...

Full text available to download

Localization and identification of ferromagnetic objects

Publication

M. Wołoszyn

- Year 2008

A compact ferromagnetic object placed in the Earth's magnetic field causes a disturbance of this field. This disturbance is associated with magnetization of the object. Ferromagnetic objects have induced magnetization and can also have permanent magnetization. Methods of locating and identifying ferromagnetic objects usually use the dipole moment model. Determination of the position and values of the extremes of the magnetic field...

Exploiting audio-visual correlation by means of gaze tracking

Publication

- International Journal of Computer Science and Applications - Year 2010

This paper presents a novel means for increasing audio-visual correlation analysis reliability. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, the past history and current research in the area of audio-visual perception analysis are shortly reviewed. Then the methodology employing gaze tracking is presented along with the...

Full text available to download

Elimination of impulsive disturbances from stereo audio recordings

Publication

- Year 2014

This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. On-line tracking of signal model parameters is performed using the stability-preserving Whittle-Wiggins-Robinson algorithm with exponential data weighting. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples...

Full text to download in external service

Surveillance camera tracking of GEO positioned objects

Publication

- Year 2009

The chapter describes a control system for movable PTZ cameras that tracks a moving object with a known GPS position. The idea of the system and its possible uses are presented. The following are described: the procedure for calibrating the camera's field of view and the way it is linked with location data, and the motion prediction procedure used to compensate for time delays. The implemented modular system is discussed, which comprises: terminals...

Full text to download in external service

Digital Audio Broadcasting or Webcasting: A Network Quality Perspective

Publication

- Journal of Telecommunications and Information Technology - Year 2016

In recent years, many alternative technologies of delivering audio content have emerged, with different advantages and disadvantages. In this paper pros and cons of digital audio broadcasting and webcasting transmission techniques in a network quality perspective are described. A case study of user expectations with respect to currently available services is analyzed, and the perceived quality of real digital broadcasted and webcasted...

Full text available to download

Layered background modeling for automatic detection of unattended objects in camera images

Publication

- Year 2011

An algorithm for automatic detection of unattended objects in video camera images is presented. First, background subtraction is performed, using an approach based on the codebook method. Results of the detection are then processed by assigning the background pixels to time slots, based on the codeword age. Using this data, moving objects detected during a chosen period may be extracted from the background model. The proposed approach...

Full text to download in external service

System do prototypowania bezprzewodowych inteligentnych urządzeń monitoringu audio-video (A system for prototyping wireless intelligent audio-video monitoring devices)

Publication

M. Kłosowski

- Year 2013

The communication presents a system for prototyping wireless audio-video monitoring devices. The system is based on Virtex6 FPGA devices and many additional supporting components, such as fast DDR3 memory, a small HD camera, a microphone with an A/D converter, a WiFi radio module, etc. The functionality of the system is described in detail in the communication. The system has been optimized to run under the Linux operating system, ...

Testing Watermark Robustness against Application of Audio Restoration Algorithms

Publication

- Year 2013

The purpose of this study was to test to what extent watermarks embedded in distorted audio signals are immune to the application of audio restoration algorithms. Several restoration routines such as noise reduction, spectrum expansion, clipping or clicks reduction were applied in the online website system. The online service was extended with some copyright protection mechanisms proposed by the authors. They contain low-level music features...

Full text to download in external service

A double-talk detector using audio watermarking

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2009

A novel approach to double-talk detection in the acoustic echo canceler is proposed. A hidden signature is embedded into the arriving signal, using the echo-hiding method. Next, detection of the presence of this signature in the microphone signal is performed. The results of the signature detection may be used by the acoustic echo canceler to stop or restart the adaptation process.

Full text to download in external service

Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders

Publication

D. Koszewski

- Year 2023

The purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work recalls first the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...

Full text available to download

Reconstruction Methods for 3D Underwater Objects Using Point Cloud Data

Publication

- HYDROACOUSTICS - Year 2015

Existing methods for visualizing underwater objects in three dimensions are usually based on displaying the imaged objects either as unorganised point sets or in the form of edges connecting the points in a trivial way. To allow the researcher to recognise more details and characteristic features of an investigated object, the visualization quality may be improved by transforming the unordered point clouds into higher order structures....

Full text available to download

Exploring contexts of use of cultural objects in virtual museums

Publication

A. Kaczmarek

- Year 2008

This paper presents a system which facilitates discovering knowledge about cultural objects. The system is based on semantic modeling of a virtual museum which consists of cultural objects placed in a virtual 3D space. The article describes an extension to the concept of cultural objects which includes information on the use of these objects. This extension enables to place objects in an appropriate context in a virtual museum....

Automatic audio-visual threat detection

Publication

- Year 2010

The concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of a new kind of multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...

Objectivization of Audio-Visual Correlation analysis

Publication

- Archives of Acoustics - Year 2012

Simultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the "image proximity effect" or the "ventriloquism effect" in literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The Authors of this paper propose a methodology based on both subjective...

Full text available to download

Pose-Configurable Generic Tracking of Elongated Objects

Publication

D. Węsierski
P. Horain

- Year 2013

Elongated objects have various shapes and can shift, rotate, change scale, and be rigid or deform by flexing, articulating, and vibrating, with examples as varied as a glass bottle, a robotic arm, a surgical suture, a finger pair, a tram, and a guitar string. This generally makes tracking of poses of elongated objects very challenging. We describe a unified, configurable framework for tracking the pose of elongated objects, which...

Integrated acoustical-optical system for inventory of hydrotechnical objects

Publication

- HYDROACOUSTICS - Year 2017

The knowledge of the location, shape and other characteristics of spatial objects in the coastal areas has a significant impact on the functioning of ports, shipyards, and other water infrastructure facilities, both offshore and inland. Therefore, measurements of the underwater part of the waterside zone are taken, which means the bottom of the water and other underwater objects (e.g. breakwaters, docks, etc.), and objects above...

Full text available to download
