Wyniki wyszukiwania dla: AUDIO PROCESSING - MOST Wiedzy

Wyszukiwarka

Wyniki wyszukiwania dla: AUDIO PROCESSING

Wyniki wyszukiwania dla: AUDIO PROCESSING

  • Automatic music genre classification based on musical instrument track separation / Automatyczna klasyfikacja gatunku muzycznego wykorzystująca algorytm separacji dźwięku instrumentó muzycznych

    Publikacja

    The aim of this article is to investigate whether separating music tracks at the pre-processing phase and extending feature vector by parameters related to the specific musical instruments that are characteristic for the given musical genre allow for efficient automatic musical genre classification in case of database containing thousands of music excerpts and a dozen of genres. Results of extensive experiments show that the approach...

    Pełny tekst do pobrania w portalu

  • Examining Classifiers Applied to Static Hand Gesture Recognition in Novel Sound Mixing System

    The main objective of the chapter is to present the methodology and results of examining various classifiers (Nearest Neighbor-like algorithm with non-nested generalization (NNge), Naive Bayes, C4.5 (J48), Random Tree, Random Forests, Artificial Neural Networks (Multilayer Perceptron), Support Vector Machine (SVM) used for static gesture recognition. A problem of effective gesture recognition is outlined in the context of the system...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Discovering Rule-Based Learning Systems for the Purpose of Music Analysis

    Publikacja

    Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

    Pełny tekst do pobrania w portalu

  • Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition

    Publikacja

    - JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Rok 2018

    convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...

  • Reception of Terrestrial DAB+ and FM Radio with a Mobile Device: A Subjective Quality Evaluation

    Publikacja

    - Rok 2023

    Nowadays, terrestrial broadcasting enables to receive content anytime and everywhere. People can obtain information both with a portable or desktop receiver, which include pocket-sized devices as well as high-end Hi-Fi equipment, not to mention car audio systems. Numerous manufacturers include FM-compatible chipsets in a variety of user equipment (UE), including mobile phones. However, digital radio signal processing modules, such...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Low-Level Music Feature Vectors Embedded as Watermarks

    In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Stradar - Multimedia Dispatcher and Teleinformation System for the Border Guard

    Security of national borders requires utilization of multimedia surveillance systems automatically gathering, processing and sharing various data. The paper presents such a system developed for the Maritime Division of the Polish Border Guard within the STRADAR project. The system, apart from providing communication means, gathers data, such as map data from AIS, GPS and radar receivers, videos and photos from camera or audio from...

    Pełny tekst do pobrania w portalu

  • Automatic audio-visual threat detection

    Publikacja

    - Rok 2010

    The concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of new kind multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...

  • IFE: NN-aided Instantaneous Pitch Estimation

    Publikacja

    Pitch estimation is still an open issue in contemporary signal processing research. Nowadays, growing momentum of machine learning techniques application in the data-driven society allows for tackling this problem from a new perspective. This work leverages such an opportunity to propose a refined Instantaneous Frequency and power based pitch Estimator method called IFE. It incorporates deep neural network based pitch estimation...

    Pełny tekst do pobrania w portalu

  • INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY

    In recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...

  • Bimodal classification of English allophones employing acoustic speech signal and facial motion capture

    A method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Verification of the Parameterization Methods in the Context of Automatic Recognition of Sounds Related to Danger

    W artykule opisano aplikację, która automatycznie wykrywa zdarzenia dźwiękowe takie jak: rozbita szyba, wystrzał, wybuch i krzyk. Opisany system składa się z bloku parametryzacji i klasyfikatora. W artykule dokonano porównania parametrów dedykowanych dla tego zastosowania oraz standardowych deskryptorów MPEG-7. Porównano też dwa klasyfikatory: Jeden oparty o Percetron (sieci neuronowe) i drugi oparty o Maszynę wektorów wspierających....

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Comparative study on the effectiveness of various types of road traffic intensity detectors

    Publikacja

    - Rok 2019

    Vehicle detection and speed measurements are crucial tasks in traffic monitoring systems. In this work, we focus on several types of electronic sensors, operating on different physical principles in order to compare their effectiveness in real traffic conditions. Commercial solutions are based on road tubes, microwave sensors, LiDARs, and video cameras. Distributed traffic monitoring systems require a high number of monitoring...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Digital Audio Effects Conference

    Konferencje

  • English Language Learning Employing Developments in Multimedia IS

    Publikacja

    In the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Rough Sets Applied to Mood of Music Recognition

    Publikacja

    With the growth of accessible digital music libraries over the past decade, there is a need for research into automated systems for searching, organizing and recommending music. Mood of music is considered as one of the most intuitive criteria for listeners, thus this work is focused on the emotional content of music and its automatic recognition. The research study presented in this work contains an attempt to music emotion recognition...

  • Study Analysis of Transmission Efficiency in DAB+ Broadcasting System

    Publikacja

    - Rok 2018

    DAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Broadcasting analog and digital radio services does vary, concerning devices on both transmitting and receiving side, as well as content processing mechanisms. However, the biggest difference is...

    Pełny tekst do pobrania w portalu

  • Classifying type of vehicles on the basis of data extracted from audio signal characteristics

    The aim of this study is to find and optimize a feature vector for an automatic recognition of the type of vehicles, extracted form an audio signal. First, the influence of weather-based conditions of road surface on spectral characteristic of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.

    In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, audio signal is analyzed indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video , i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Improving automatic surveillance by sound analysis

    Publikacja

    An automatic surveillance system, based on event detection in the video image can be improved by implementing algorithms for audio analysis. Dangerous or illegal actions are often connected with distinctive sound events like screams or sudden bursts of energy. A method for detection and classification of alarming sound events is presented. Detection is based on the observation of sudden changes in sound level in distinctive sub-bands...

  • SYNAT_MUSIC_GENRE_FV_173

    Dane Badawcze

    This is the original dataset containing 51582 music tracks (22 music genres) and 173 element-feature vector [1-6,9]. A collection of more than 50000 music excerpts described with a set of descriptors obtained through the analysis of 30-second mp3 recordings was gathered in a database called SYNAT. The SYNAT database was realized by the Gdansk University...

  • Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions - Headphones vs Loudspeakers: A Case Study

    Publikacja

    This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are coupled with or without visual cues to the listeners. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology...

    Pełny tekst do pobrania w portalu

  • Automatic sound recognition for security purposes

    Publikacja

    - Rok 2008

    In the paper an automatic sound recognition system is presented. It forms a part of a bigger security system developed in order to monitor outdoor places for non-typical audio-visual events. The analyzed audio signal is being recorded from a microphone mounted in an outdoor place thus a non stationary noise of a significant energy is present in it. In the paper an especially designed algorithm for outdoor noise reduction is presented,...

  • Exploiting audio-visual correlation by means of gaze tracking

    This paper presents a novel means for increasing audio-visual correlation analysis reliability. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, the past history and current research in the area of audio-visual perception analysis are shortly reviewed. Then the methodology employing gaze tracking is presented along with the...

    Pełny tekst do pobrania w portalu

  • Testing Watermark Robustness against Application of Audio Restoration Algorithms

    Publikacja

    The purpose of this study was to test to what extent watermarks embedded in distorted audio signals are immune to audio restoration algorithm performing. Several restoration routines such as noise reduction, spectrum expansion, clipping or clicks reduction were applied in the online website system. The online service was extended with some copyright protection mechanisms proposed by the authors. They contain low-level music features...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • QoS/QoE in the Heterogeneous Internet of Things (IoT)

    Publikacja

    - Rok 2017

    Applications provided in the Internet of Things can generally be divided into three categories: audio, video and data. This has given rise to the popular term Triple Play Services. The most important audio applications are VoIP and audio streaming. The most notable video applications are VToIP, IPTV, and video streaming, and the service WWW is the most prominent example of data-type services. This chapter elaborates on the most...

  • Parametric impulsive noise detector for corrupted audio signals based on hidden Markow model

    Publikacja

    - Rok 2008

    The paper addresses the problem of impulsive noise detection for audio signals. A structure of threshold parameter detectors using modelingof signals was introduced. the algorithm of the noise detection, based on discrete-time hidden Markow model (HMM)of whitened audio signal is elaborated

  • Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances

    Publikacja

    Archive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions

    Publikacja

    Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY

    Publikacja

    The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...

  • EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY

    Publikacja

    The problem of video framerate and audio/video synchronization in audio-visual speech recogni-tion is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...

  • SYNAT_PCA_48

    Dane Badawcze

    There is a series of datasets containing feature vectors derived from music tracks. The dataset contains 51582 music tracks (22 music genres) and feature vector after  Principal Component Analysis (PCA) performing, so there are 48-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier...

  • SYNAT_PCA_11

    Dane Badawcze

    The dataset contains 51582 music tracks (22 music genres) and feature vector after  Principal Component Analysis (PCA) performing, so there are 11-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...

  • SYNAT Music Genre Parameters PCA 19

    Dane Badawcze

    The dataset contains feature vector after  Principal Component Analysis (PCA) performing, so there are 11 music genres and 19-element vector derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...

  • Elimination of impulsive disturbances from stereo audio recordings

    Publikacja

    This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. On-line tracking of signal model parameters is performed using the stability-preserving Whittle-Wiggins-Robinson algorithm with exponential data weighting. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Testing A Novel Gesture-Based Mixing Interface

    With a digital audio workstation, in contrast to the traditional mouse-keyboard computer interface, hand gestures can be used to mix audio with eyes closed. Mixing with a visual representation of audio parameters during experiments led to broadening the panorama and a more intensive use of shelving equalizers. Listening tests proved that the use of hand gestures produces mixes that are aesthetically as good as those obtained using...

    Pełny tekst do pobrania w portalu

  • An new method of audio-visual correlation analysis

    Publikacja

    - Rok 2009

    This paper presents a new methodology of conducting the audio-visual correlation analysis employing the gaze tracking system. Interaction between two perceptual modalities, seeing and hearing, their interaction and mutual reinforcement in a complex relationship was a subject of many research studies. Earlier stage of the carried out experiments at the Multimedia Systems Department (MSD) showed that there exists a relationship between...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Objectivization of audio-video correlation assessment experiments

    Publikacja

    - Rok 2010

    The purpose of this paper is to present a new method of conducting an audio-visual correlation analysis employing a head-motion-free gaze tracking system. First, a review of related works in the domain of sound and vision correlation is presented. Then assumptions concerning audio-visual scene creation are shortly described. The objectivization process of carrying out correlation tests employing gaze-tracking system is outlined....

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Examining Acoustic Emission of Engineered Ultrasound Loudspeakers

    Measurement results of the sound emitted from an ultrasound custom-made system with high spatial directivity are presented. The proposed system is using modulated ultrasound waves which demodulate in nonlinear medium resulting in audible sound. The system is aimed at enhancing the users’ personal audio space, therefore the measurements are performed using the Head and Torso Simulator which provides the realistic reproduction of...

  • Digital Audio Broadcasting or Webcasting: A Network Quality Perspective

    In recent years, many alternative technologies of delivering audio content have emerged, with different advantages and disadvantages. In this paper pros and cons of digital audio broadcasting and webcasting transmission techniques in a network quality perspective are described. A case study of user expectations with respect to currently available services is analyzed, and the perceived quality of real digital broadcasted and webcasted...

    Pełny tekst do pobrania w portalu

  • A concept of Signal Equalization Method Based on Music Genre and the Listener's Room Characteristics

    Publikacja

    - Rok 2016

    A research study that investigates the influence of the room acoustics environment on the frequency characteristic of the audio signal playback is presented. First, a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the frequency response of the room, a system for room acoustics compensation based on eight-band equalizer is proposed. The system settings depend on music genre. In...

  • Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology

    This paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, having roots in Human-Computer Interaction borrows the relevance feedback through gaze-tracking and applies it to the new area of interests, which is Quality of Experience. Results of subjective...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Measurements and Simulations of Engineered Ultrasound Loudspeakers

    Simulation and measurement results of the sound emitted from an ultrasound custom-made system with high spatial directivity are presented. The proposed system is using modulated ultrasound waves which demodulate in nonlinear medium resulting in audible sound. The system is aimed at enhancing the users’ personal audio space, therefore the measurements are performed using the Head and Torso Simulator which provides realistic reproduction...

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Intelligent video and audio applications for learning enhancement

    The role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....

    Pełny tekst do pobrania w portalu

  • Intelligent multimedia solutions supporting special education needs.

    The role of computers in school education is briefly discussed. Multimodal interfaces development history is shortly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with facial expression and speech stretching audio interface representing audio modality....

  • Bimodal deep learning model for subjectively enhanced emotion classification in films

    Publikacja

    - INFORMATION SCIENCES - Rok 2024

    This research delves into the concept of color grading in film, focusing on how color influences the emotional response of the audience. The study commenced by recalling state-of-the-art works that process audio-video signals and associated emotions by machine learning. Then, assumptions of subjective tests for refining and validating an emotion model for assigning specific emotional labels to selected film excerpts were presented....

    Pełny tekst do pobrania w serwisie zewnętrznym

  • Wow defect reduction based on interpolation techniques

    Publikacja

    - Rok 2005

    W referacie przedstawiono wyniki badania różnych technik interpolacji wykorzystanych w redukcji kołysania dźwięku. W badaniach użyto: interpolację liniową, dwie techniki interpolacji wielomianowej (Hermite i spline), i technikę sumowania okienkowanych funkcji sink. Jakość rekonstrukcji wykonano wykorzystując sztucznie spreparowany sygnał audio, rekonstruowany wymienionymi metodami interpolacji. Jakość rekonstrukcji oceniono wykorzystując...

  • Akustyczna analiza natężenia ruchu drogowego dla systemów zarządzania ruchem

    Publikacja

    - Rok 2019

    W pracy przybliżono wybrane zagadnienia z dziedziny zarządzania transportem drogowym w Polsce i na świecie. W tym kontekście pzredstawiono potrzeby rynkowe, wymagania jak i możliwości w zakresie pozyskiwania informacji o aktualnym stanie sieci drogowych. Zaproponowano akustyczną metodę nadzorowania ruchu drogowego i jej możliwości w kontekście systemów zarządzania ruchem. Przedstawiono schemat akwizycji sygnału wraz z danymi odniesienia....

  • Detection of impulsive disturbances in archive audio signals

    Publikacja

    In this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors, allow one to noticeably improve detection results compared to the prediction-only based solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...

    Pełny tekst do pobrania w portalu

  • Transmitting Alarm Information in DAB+ Broadcasting System

    Publikacja

    - Rok 2018

    The main goal of digital broadcasting is to deliver high-quality content with the lowest possible bitrate. This paper is focused on transmitting alarm information, such as emergency warning and alerting, in the DAB+ (Digital Audio Broadcasting plus) broadcasting system. These additional services should be available at the lowest possible bitrate, in order to provide a clear and understandable voice message to people. Furthermore, additional...