Search results for: AUDIO SIGNALS
-
On the Consumption of Multimedia Content Using Mobile Devices: a Year to Year User Case Study
PublicationIn the early days, consumption of multimedia content related with audio signals was only possible in a stationary manner. The music player was located at home, with a necessary physical drive. An alternative way for an individual was to attend a live performance at a concert hall or host a private concert at home. To sum up, audio-visual effects were only reserved for a narrow group of recipients. Today, thanks to portable players,...
-
Traffic Noise Analysis Applied to Automatic Vehicle Counting and Classification
PublicationProblems related to determining traffic noise characteristics are discussed in the context of automatic dynamic noise analysis based on noise level measurements and traffic prediction models. The obtained analytical results provide the second goal of the study, namely automatic vehicle counting and classification. Several traffic prediction models are presented and compared to the results of in-situ noise level measurements. Synchronized...
-
Evaluation of sound event detection, classification and localization in the presence of background noise for acoustic surveillance of hazardous situations
PublicationAn evaluation of the sound event detection, classification and localization of hazardous acoustic events in the presence of background noise of different types and changing intensities is presented. The methods for separating foreground events from the acoustic background are introduced. The classifier, based on a Support Vector Machine algorithm, is described. The set of features and samples used for the training of the classifier...
-
Study Analysis of Transmission Efficiency in DAB+ Broadcasting System
PublicationDAB+ is a very innovative and universal multimedia broadcasting system. Thanks to its updated multimedia technologies and metadata options, digital radio keeps pace with changing consumer expectations and the impact of media convergence. Broadcasting analog and digital radio services does vary, concerning devices on both transmitting and receiving side, as well as content processing mechanisms. However, the biggest difference is...
-
Music Information Retrieval – Soft Computing versus Statistics . Wyszukiwanie informacji muzycznej - algorytmy uczące versus metody statystyczne
PublicationMusic Information Retrieval (MIR) is an interdisciplinary research area that covers automated extraction of information from audio signals, music databases and services enabling the indexed information searching. In the early stages the primary focus of MIR was on music information through Query-by-Humming (QBH) applications, i.e. on identifying a piece of music by singing (singing/whistling), while more advanced implementations...
-
Dynamic Bayesian Networks for Symbolic Polyphonic Pitch Modeling
PublicationSymbolic pitch modeling is a way of incorporating knowledge about relations between pitches into the process of an- alyzing musical information or signals. In this paper, we propose a family of probabilistic symbolic polyphonic pitch models, which account for both the “horizontal” and the “vertical” pitch struc- ture. These models are formulated as linear or log-linear interpo- lations of up to fi ve sub-models, each of which is...
-
Detection, classification and localization of acoustic events in the presence of background noise for acoustic surveillance of hazardous situations
PublicationEvaluation of sound event detection, classification and localization of hazardous acoustic events in the presence of background noise of different types and changing intensities is presented. The methods for discerning between the events being in focus and the acoustic background are introduced. The classifier, based on a Support Vector Machine algorithm, is described. The set of features and samples used for the training of the...
-
Automatic audio-visual threat detection
PublicationThe concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of new kind multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...
-
Multimodal Surveillance Based Personal Protection System
PublicationA novel, multimodal approach for automatic detection of abduction of a protected individual, employing dedicated personal protection device and a city monitoring system is proposed and overviewed. The solution is based on combining four modalities (signals coming from: Bluetooth, fixed and PTZ cameras, thermal camera, acoustic sensors). The Bluetooth signal is used continuously to monitor the protected person presence, and in case...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
PublicationThis paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Subjective and Objective Quality Evaluation Study of BPL -PLC Wired Medium
PublicationThis paper presents results of research on the effectiveness of bi-directional voice transmission in a 6 kV mine cable network using BPL-PLC (Broadband over Power Line - Power Line Communication) technology. It concerns both emergency cable state (supply outage with cable shorted at both ends) and loaded with distorted current waveforms. The narrowband (0.5 MHz–15 MHz) and broadband (two different modes, frequency range of 3 MHz–7.5...
-
Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera
PublicationThis paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
PublicationThe aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
PublicationAutomatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and selforganizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...
-
New approach for determining the QoS of MP3-coded voice signals in IP networks
PublicationPresent-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion that the quality of service (QoS) must always be quantifiable no matter what. This paper focuses on one method to determine QoS. It defines an innovative, simple model that can evaluate the QoS of MP3-coded voice data transported...
-
Adaptive system for recognition of sounds indicating threats to security of people and property employing parallel processing of audio data streams
PublicationA system for recognition of threatening acoustic events employing parallel processing on a supercomputing cluster is featured. The methods for detection, parameterization and classication of acoustic events are introduced. The recognition engine is based onthreshold-based detection with adaptive threshold and Support Vector Machine classifcation. Spectral, temporal and mel-frequency descriptors are used as signal features. The...
-
Auditory Brainstem Responses recorded employing Audio ABR device
Open Research DataThe dataset consists of ABR measurements employing click, burst and speech stimuli. Parameters of the particular stimuli were as follows:
-
SYNAT_MUSIC_GENRE_FV_173
Open Research DataThis is the original dataset containing 51582 music tracks (22 music genres) and 173 element-feature vector [1-6,9]. A collection of more than 50000 music excerpts described with a set of descriptors obtained through the analysis of 30-second mp3 recordings was gathered in a database called SYNAT. The SYNAT database was realized by the Gdansk University...
-
Emotions in polish speech recordings
Open Research DataThe data set presents emotions recorded in sound files that are expressions of Polish speech. Statements were made by people aged 21-23, young voices of 5 men. Each person said the following words / nie – no, oddaj - give back, podaj – pass, stop - stop, tak - yes, trzymaj -hold / five times representing a specific emotion - one of three - anger (a),...
-
Video recordings of bees at entrance to hives
Open Research DataVideo recordings of bees at entrance to hives from 2017-04-22, 2017-04-23 and 2018-05-22. All recordings were made using hand-held full HD camera (Samsung Galaxy S3) and encoded using H.264 video codec (Standard Baseline Profile for mov files from 2017, High Profile for mp4 files from 2018) , 30 FPS and bit rate 14478 kb/s (mov files from 2017) or 16869 kb/s...
-
SYNAT Music Genre Parameters PCA 19
Open Research DataThe dataset contains feature vector after Principal Component Analysis (PCA) performing, so there are 11 music genres and 19-element vector derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...
-
SYNAT_PCA_48
Open Research DataThere is a series of datasets containing feature vectors derived from music tracks. The dataset contains 51582 music tracks (22 music genres) and feature vector after Principal Component Analysis (PCA) performing, so there are 48-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier...
-
SYNAT_PCA_11
Open Research DataThe dataset contains 51582 music tracks (22 music genres) and feature vector after Principal Component Analysis (PCA) performing, so there are 11-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...
-
ALOFON corpus
Open Research DataThe ALOFON corpus is one of the multimodal database of word recordings in English, available at http://www.modality-corpus.org/. The ALOFON corpus is oriented towards the recording of the speech equivalence variants. For this purpose, a total of 7 people who are or speak English with native speaker fluency and a variety of Standard Southern British...
-
Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning
PublicationIn this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learningbased audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIWK pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....