Search results for: audio
-
Intelligent Audio Signal Processing − Do We Still Need Annotated Datasets?
Publication: In this paper, examples of intelligent audio signal processing are briefly described. The focus is, however, on the machine learning approach and the datasets needed, especially for deep learning models. Years of intense research produced many important results in this area; however, the goal of fully intelligent signal processing, characterized by its autonomous acting, is not yet achieved. Therefore, a review of state-of-the-art concerning...
-
RENOVATION OF ARCHIVE AUDIO RECORDINGS USING SPARSE AUTOREGRESSIVE MODELING AND BIDIRECTIONAL PROCESSING
Publication: The paper presents a new approach to elimination of broadband noise and impulsive disturbances from archive audio recordings. The proposed adaptive Kalman-like algorithm, based on a sparse autoregressive model of the audio signal, simultaneously detects noise pulses, interpolates the irrevocably distorted samples and performs signal smoothing. It is shown that bidirectional (forward-backward) processing of the archive signal improves...
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
Publication: In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
A study of music features derived from audio recordings examples – a quantitative analysis
Publication: The paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician whose style evolved over time. Firstly, the origin and the background of the division of music genres were briefly presented. Then, several objective parameters of an audio signal were recalled that...
-
Localization of impulsive disturbances in archive audio signals using predictive matched filtering
Publication: The problem of elimination of impulsive disturbances from archive audio signals is considered and its new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...
-
Audio Content and Crowdsourcing: A Subjective Quality Evaluation of Radio Programs Streamed Online
Publication: Radio broadcasting has been present in our lives for over 100 years. The transmission of speech and music signals accompanies us from an early age. Broadcasts provide the latest information from home and abroad. They also shape musical tastes and allow many artists to share their creativity. Modern distribution involves transmission over a number of terrestrial systems. The most popular are analog FM (Frequency Modulation) and...
-
Pomiary wartości opóźnień w torze audio urządzeń z systemem Android [Measurements of latency in the audio path of Android devices]
Publication: This article describes methods for measuring latency in the audio path of devices running various versions of the Android system. The first part of the article gives a brief characterization of the Android environment in the context of audio-path latency. Next, a method of measuring audio-path latency using the SuperPowered Latency and Dr. Rick O’Rang Loopback applications is presented. In the final...
-
Intelligent acquisition of audio signals, employing neural networks and rough set algorithms
Publication: Algorithms based on artificial neural networks and the rough set method were applied to the localization of audio signals affected by parasitic noise and reverberation. Information about the direction of sound arrival was obtained at the outputs of these algorithms on the basis of a parametric representation. Experimental results are presented and discussed.
-
Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones
Publication: This paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...
-
Parametric impulsive noise detector for corrupted audio signals based on hidden Markov model
Publication: The paper addresses the problem of impulsive noise detection for audio signals. A structure of threshold parameter detectors using modeling of signals is introduced. The noise detection algorithm, based on a discrete-time hidden Markov model (HMM) of the whitened audio signal, is elaborated.
-
Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances
Publication: Archive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...
-
Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology
Publication: This paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, having roots in Human-Computer Interaction, borrows the relevance feedback through gaze-tracking and applies it to a new area of interest, which is Quality of Experience. Results of subjective...
-
AUDIO SIGNAL EQUALIZATION BASED ON IMPULSE RESPONSE OF A LISTENING ROOM AND MUSIC CONTENT REPRODUCED
Publication: This research study investigates the influence of room acoustics on the frequency characteristic of audio signal playback. First, a concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on a designed equalizer is proposed. The system settings depend on the automatically recognized music genre....
-
Energy Efficiency Study of Audio-video Content Consumption on Selected Android Mobile Terminals
Publication: Mobile devices are widely used by billions of users worldwide. Thanks to their main advantage, which is portability, they should be fully operational as long as possible, without the need to recharge or connect them to external power sources. This paper describes a study, carried out on four different mobile devices, with different hardware and software parameters, running the Android operating system. The research campaign involved...
-
Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions - Headphones vs Loudspeakers: A Case Study
Publication: This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are presented to the listeners with or without visual cues. The subjective test participants are asked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology...
-
Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication: The purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work recalls first the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...
-
In Memoriam Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering
Publication: Biography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.
-
Towards Audio Signal Equalization Based on Spectral Characteristics of a Listening Room and Music Content Reproduced
Publication: This study investigates the influence of room acoustics on the frequency characteristic of audio signal playback. First, the concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on a designed equalizer is proposed. The system settings depend on the automatically recognized music genre....
-
New semi-causal and noncausal techniques for detection of impulsive disturbances in multivariate signals with audio applications
Publication: This paper deals with the problem of localization of impulsive disturbances in nonstationary multivariate signals. Both unidirectional and bidirectional (noncausal) detection schemes are proposed. It is shown that the strengthened pulse detection rule, which combines analysis of one-step-ahead signal prediction errors with critical evaluation of leave-one-out signal interpolation errors, allows one to noticeably improve detection results...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication: The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Elimination of impulsive disturbances from archive audio files – comparison of three noise pulse detection schemes
Publication: The problem of elimination of impulsive disturbances (such as clicks, pops, ticks, crackles, and record scratches) from archive audio recordings is considered and solved using autoregressive modeling. Three classical noise pulse detection schemes are examined and compared: the approach based on open-loop multi-step-ahead signal prediction, the approach based on decision-feedback signal prediction, and the double threshold approach,...
-
Evaluation of Six Degrees of Freedom 3D Audio Orchestra Recording and Playback using multi-point Ambisonic interpolation
Publication: This paper describes a strategy for recording sound and enabling six-degrees-of-freedom playback, making use of multiple simultaneous and synchronized Higher Order Ambisonics (HOA) recordings. Such a strategy enables users to navigate in a simulated 3D space and listen to the six-degrees-of-freedom recordings from different perspectives. For the evaluation of the proposed approach, an Unreal Engine-based navigable 3D audiovisual...
-
Analysis of impact of lossy audio compression on the robustness of watermark embedded in the DWT domain for non-blind copyright protection
Publication: A methodology of non-blind watermarking of audio content is proposed. An outline of the audio copyright problem and the motivation for practical applications are discussed. The algorithmic theory pertaining to watermarking techniques is briefly introduced. The system architecture together with the employed workflows for embedding and extracting the watermarks are described. The implemented approach is described and the obtained results are reported....
-
Analiza jakości transmisji treści audio-wideo w symulowanym łączu telekomunikacyjnym z wykorzystaniem techniki OFDM [Quality analysis of audio-video content transmission over a simulated telecommunication link using the OFDM technique]
Publication: Deploying a reliable audio-video communication system brings many benefits. Because the amount of available bandwidth is steadily shrinking, researchers are focusing on novel transmission methods. Currently, the OFDM (Orthogonal Frequency Division Multiplexing) technique is widely used in both wired and wireless media. The paper presents a study of the QoS (Quality of Service) of a simulated transmission link...
-
A commonly-accessible toolchain for live streaming music events with higher-order ambisonic audio and 4k 360 vision
Publication: An immersive live stream is especially interesting in the ongoing development of telepresence tools, particularly in the virtual reality (VR) or mixed reality (MR) domain. This paper explores a remote and immersive way of enabling telepresence for the audience to high-fidelity music performance using freely-available and easily-accessible tools. A functional VR live-streaming toolchain, comprising 360 vision and higher-order ambisonic...
-
Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering
Publication: This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...
-
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
Publication: A system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classification of acoustic events are introduced. The recognition engine is based on threshold-based detection with adaptive threshold and Support Vector Machine classification. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
-
Measuring and Analyzing Audio Levels in Film, Commercials, and Movie Trailers Using Leq(A) Values and the LUFS Loudness Model. Analiza pomiarów dźwięku w filmie oraz w reklamach filmowych z wykorzystaniem modelu głośności
Publication: The purpose of this paper is to describe the measurement of loudness levels in movies, movie trailers, and commercials displayed before feature films at movie theaters. In the initial section, the paper discusses the issues related to measurement of loudness levels, provides recommendations regarding permissible loudness levels during movie screenings, and mentions the applied units of measurement. The following section of the...
-
Language material for English audiovisual speech recognition system development. Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
Publication: The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Technika komputerowa w audiologii, foniatrii i logopedii [Computer technology in audiology, phoniatrics and speech therapy]
Publication: The book presents studies resulting from several years of cooperation among researchers in computer science, telecommunications, otolaryngology, audiology, psychology, pedagogy, speech therapy and phoniatrics. It presents applications of computer technology in the fields named in its title.
-
Computer modeling of perceptual masking and its audiology applications
Publication: The paper presents the perceptual foundations of hearing that make it possible to create new sound coding models, in particular for use in hearing prostheses.
-
Koncepcja kształtowania audiosfery miejsca pracy. Między sztuką a zarządzaniem [The concept of shaping the workplace audiosphere. Between art and management]
Publication -
Development of an AI-based audiogram classification method for patient referral
Publication: Hearing loss is one of the most significant sensory disabilities. It can have various negative effects on a person's quality of life, ranging from impeded school and academic performance to total social isolation in severe cases. It is therefore vital that early symptoms of hearing loss are diagnosed quickly and accurately. Audiology tests are commonly performed with the use of tonal audiometry, which measures a patient's hearing...
-
Audiovisual speech recognition for training hearing impaired patients
Publication: The work presents a system for recognizing isolated speech sounds that uses visual and acoustic data. Active Shape Models were used to determine visual parameters based on the analysis of the shape and movement of the lips in video recordings. The acoustic parameters are based on mel-cepstral coefficients. A neural network was used to recognize the spoken sounds on the basis of a feature vector containing both types...
-
Automated hearing loss type classification based on pure tone audiometry data
Publication: Hearing problems are commonly diagnosed with the use of tonal audiometry, which measures a patient’s hearing threshold in both air and bone conduction at various frequencies. Results of audiometry tests, usually represented graphically in the form of an audiogram, need to be interpreted by a professional audiologist in order to determine the exact type of hearing loss and administer proper treatment. However, the small number of...
-
Comparing noise levels and audiometric testing results employing IT-based diagnostic systems.
Publication: The paper presents an Internet-based system designed for conducting hearing screening tests. An information system designed for monitoring environmental noise is also presented. Both Internet applications may help reduce the incidence of hearing disorders caused by environmental and industrial noise. The results of audiometric tests were compared with noise measurements on the basis of the content...
-
Testing A Novel Gesture-Based Mixing Interface
Publication: With a digital audio workstation, in contrast to the traditional mouse-keyboard computer interface, hand gestures can be used to mix audio with eyes closed. Mixing with a visual representation of audio parameters during experiments led to broadening the panorama and a more intensive use of shelving equalizers. Listening tests proved that the use of hand gestures produces mixes that are aesthetically as good as those obtained using...
-
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication: Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....
-
Adaptive Personal Tuning of Sound in Mobile Computers
Publication: An integrated methodology for enhancing audio quality in mobile computers is presented. The key features are adaptation of the characteristics of their acoustic track to changing acoustic conditions of the environment and to users’ individual preferences. Signal processing algorithms are introduced that concern: linearization of frequency response, dialogue intelligibility enhancement, and dynamics processing tuned up to the users’...
-
Editor's note and 2018 reviewers
Publication: The subject of this work is a reference to papers published in 2018, as well as to a series of articles in the special issue: Special Issue on Augmented and Participatory Sound and Music Interaction Using Semantic Audio.
-
Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition
Publication: convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...
-
Bass Enhancement Settings in Portable Devices Based on Music Genre Recognition
Publication: The paper presents a novel approach to the Virtual Bass Synthesis (VBS) applied to mobile devices, called Smart VBS (SVBS). The proposed algorithm uses an intelligent, rule-based setting of bass synthesis parameters adjusted to the particular music genre. Harmonic generation is based on a nonlinear device (NLD) method with the intelligent controlling system adapting to the recognized music genre. To automatically classify music...
-
DSP techniques for determining ''Wow'' distortions
Publication: The article describes algorithms for determining the characteristics of wow distortion in sound. These are algorithms for: tracking mains hum, tracking the magnetic remnant of the high-frequency bias current, and adaptive analysis of the spectral centroid for a selected part of the distorted signal. The presented algorithms allow for both software and hardware implementation.
-
Expert system for automatic classification and quality assessment of singing voices
Publication.
-
Tonality Estimation and Frequency Tracking of Modulated Tonal Components
Publication: A novel method for tonality estimation and frequency tracking of tonal components modulated in frequency and amplitude is presented. The algorithm detects the local maxima of magnitude spectra corresponding to three contiguous frames of a signal and matches them into the tonal track candidates. The magnitude-based and phase-based methods are used to estimate the frequency jumps between spectrum maxima belonging to the tonal track...
-
System for automatic singing voice recognition
Publication: The article presents a system for automatic recognition of the quality and type of singing voice. The database and the implemented parameters are presented. The decision algorithm is an artificial neural network algorithm. The trained decision system achieves an effectiveness of approx. 90% in both recognition categories. Additionally, it was shown by means of statistical methods that the results of the system for automatic assessment of technical quality...
-
New Aspects of Virtual Sound Source Localization Research—Impact of Visual Angle and 3-D Video Content on Sound Perception
Publication: The influence of image on virtual sound source localization, called the “image proximity effect” or the “ventriloquism effect”, is a well known phenomenon. This paper focuses on other aspects related to this effect, namely the impact of the visual angle of the presented object and 3D video content on sound perception. The research conducted confirmed that the visual angle of the presented object determines the image proximity effect...
-
Measurements and Visualization of Sound Intensity Around the Human Head in Free Field Using Acoustic Vector Sensor
Publication: This paper presents measurements and visualization of sound intensity around the human head simulator in a free field. A Cartesian robot, applied for precise positioning of the acoustic vector sensor, was used to measure sound intensity. Measurements were performed in a free field using a head and torso simulator and the setup consisting of four different loudspeaker configurations. The acoustic vector sensor was positioned around...
-
Phraseological Units in Audiovisual Translation. A Case Study of Polish Dubbing of Disney’s 'The Little Mermaid'
Publication: The paper aims to discuss phraseological units as the object of audiovisual translation in the Polish dubbing of Disney’s 'The Little Mermaid', to discuss the role of phraseological translation techniques, and to present possible translation inconsistencies. A theoretical introduction presents definitions for crucial terms. It is followed by the analysis of the corpus of phraseological units in Disney’s The Little Mermaid and...
-
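Audiosfera środowiska pracy w przestrzeni biurowej na planie otwartym. Wyniki zwiadu badawczego [The audiosphere of the work environment in an open-plan office space. Results of an exploratory study]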
Publication