Laboratorium Akustyki Fonicznej

An audio-visual corpus for multimodal automatic speech recognition

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

A review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Full text available to download

Automatic music genre classification based on musical instrument track separation / Automatyczna klasyfikacja gatunku muzycznego wykorzystująca algorytm separacji dźwięku instrumentów muzycznych

Publication

A. Rosner
B. Kostek

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2018

The aim of this article is to investigate whether separating music tracks at the pre-processing phase and extending the feature vector by parameters related to the specific musical instruments that are characteristic of the given musical genre allow for efficient automatic musical genre classification in the case of a database containing thousands of music excerpts and a dozen genres. Results of extensive experiments show that the approach...

Full text available to download

Analysis of 2D Feature Spaces for Deep Learning-based Speech Recognition

Publication

G. Korvel
P. Treigys
G. Tamulevicus
J. Bernataviciene
B. Kostek

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2018

convolutional neural network (CNN) which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and Mel-scale cepstrograms, and chromagrams. The choice was made upon the fact that CNN performs well in 2D data-oriented processing contexts. Feature maps were employed in the Lithuanian word recognition task. The spectral analysis led to the highest word...

Problems of Railway Noise—A Case Study

Publication

- International Journal of Occupational Safety and Ergonomics - Year 2011

Under Directive 2002/49/EC relating to the assessment and management of environmental noise, all European countries are obliged to model their environmental noise levels in heavily populated areas. Some countries have their own national method to predict noise, but most have not created one yet. The recommendation for countries that do not have their own model is to use an interim method...

Full text to download in external service

A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces

Publication

G. Tamulevicius
G. Korvel
A. B. Yayak
P. Treigys
J. Bernataviciene
B. Kostek

- Electronics - Year 2020

In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available to download

Classification of Music Genres Based on Music Separation into Harmonic and Drum Components / Klasyfikacja gatunków muzycznych wykorzystująca separację instrumentów muzycznych

Publication

A. Rosner
B. Schuller
B. Kostek

- Archives of Acoustics - Year 2014

This article presents a study on music genre classification based on music separation into harmonic and drum components. For this purpose, audio signal separation is executed to extend the overall vector of parameters by new descriptors extracted from harmonic and/or drum music content. The study is performed using the ISMIS database of music files represented by vectors of parameters containing music features. The Support Vector...

Full text available to download

Bass Enhancement Settings in Portable Devices Based on Music Genre Recognition

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2015

The paper presents a novel approach to the Virtual Bass Synthesis (VBS) applied to mobile devices, called Smart VBS (SVBS). The proposed algorithm uses an intelligent, rule-based setting of bass synthesis parameters adjusted to the particular music genre. Harmonic generation is based on a nonlinear device (NLD) method with the intelligent controlling system adapting to the recognized music genre. To automatically classify music...

Full text available to download

Music Mood Visualization Using Self-Organizing Maps

Publication

- Archives of Acoustics - Year 2015

Due to an increasing amount of music being made available in digital form on the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

Full text available to download

MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES

Publication

M. Piotrowska
G. Korvel
B. Kostek
T. Ciszewski
A. Czyżewski

- International Journal of Applied Mathematics and Computer Science - Year 2019

Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Full text available to download

UPDRS tests for diagnosis of Parkinson's disease employing virtual-touchpad

Publication

- Year 2010

This paper presents a new approach to diagnosing Parkinson's disease. The progression of the disease can be measured by the Unified Parkinson's Disease Rating Scale (UPDRS), which is used to evaluate motor and behavioral symptoms of Parkinson's disease. Hitherto, the advancement of the disease on the UPDRS was evaluated by a specialist through medical observation. The authors suggest a partial automation of...

Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

Publication

D. Korzekwa
R. Barra-Chicote
B. Kostek
T. Drugman
M. Łajszczak

- Year 2019

We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model's key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available to download

Musical Instrument Identification Using Deep Learning Approach

Publication

- SENSORS - Year 2022

The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata...

Full text available to download

Creating Dynamic Maps of Noise Threat Using PL-Grid Infrastructure

Publication

- Archives of Acoustics - Year 2013

The paper presents functionality and operation results of a system for creating dynamic maps of acoustic noise employing the PL-Grid infrastructure extended with a distributed sensor network. The work presented provides a demonstration of the services being prepared within the PLGrid Plus project for measuring, modeling and rendering data related to noise level distribution in city agglomerations. Specific computational environments,...

Full text available to download

Automatic assessment of the motor state of the Parkinson's disease patient --a case study

Publication

B. Kostek
K. Kaszuba-Miotke
P. Żwan
P. Robowski
J. Sławek

- Diagnostic Pathology - Year 2012

This paper presents a novel methodology in which the Unified Parkinson's Disease Rating Scale (UPDRS) data processed with a rule-based decision algorithm is used to predict the state of the Parkinson's Disease patients. The research was carried out to investigate whether the advancement of the Parkinson's Disease can be automatically assessed. For this purpose, past and current UPDRS data from 47 subjects were examined. The results...

Full text available to download

3D Acoustic Field Intensity Probe Design and Measurements

Publication

- Archives of Acoustics - Year 2016

The aim of this paper is two-fold. First, some basic notions on acoustic field intensity and its measurement are briefly recalled. Then, the equipment and the measurement procedure used in the performed sound intensity study are described. The second goal is to present details of the design of the engineered 3D intensity probe, as well as the algorithms developed and applied for that purpose. Results of the intensity...

Full text available to download

Music Information Retrieval in Music Repositories

Publication

B. Kostek

- Year 2013

This chapter reviews the key concepts associated with automated Music Information Retrieval (MIR). First, current research trends and system solutions in terms of music retrieval and music recommendation are discussed. Next, experiments performed on a constructed music database are presented. A proposal for music retrieval and annotation aided by gaze tracking is also discussed.

Full text to download in external service

Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres

Publication

- Electronics - Year 2020

The purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....

Full text available to download

Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement

Publication

G. Korvel
K. Kąkol
O. Kurasova
B. Kostek

- IEEE Access - Year 2020

The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available to download

A new method for measuring the psychoacoustical properties of tinnitus

Publication

B. Kostek
T. Poremski

- Diagnostic Pathology - Year 2013

information, select the tinnitus treatment and quantitatively substantiate its effects, the measurement of the Tinnitus psychoacoustic parameters should be made an inherent part of the Tinnitus therapy. Methods: For this purpose, the multimedia-based sound synthesizer has been proposed for testing tinnitus and the results obtained this way are compared with the outcome of the audiometer-based Wilcoxon test. The method has been verified...

Full text available to download

Measurements and Visualization of Sound Intensity Around the Human Head in Free Field Using Acoustic Vector Sensor

Publication

- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2015

This paper presents measurements and visualization of sound intensity around the human head simulator in a free field. A Cartesian robot, applied for precise positioning of the acoustic vector sensor, was used to measure sound intensity. Measurements were performed in a free field using a head and torso simulator and the setup consisting of four different loudspeaker configurations. The acoustic vector sensor was positioned around...

Full text available to download
