Laboratorium Akustyki Fonicznej

Assessing the attractiveness of human face based on machine learning

Publication

- Year 2023

The attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. This study presents an attempt to assess the attractiveness of a face using machine learning. Attractiveness is determined by three deep models, and the sum of their predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...


Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders

Publication

D. Koszewski
T. Görne
G. Korvel
B. Kostek

- EURASIP Journal on Audio Speech and Music Processing - Year 2023

The purpose of this paper is to present a music mixing system capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work first reviews selected methods for automatic audio mixing. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...


AUTOMATYCZNA KLASYFIKACJA MOWY PATOLOGICZNEJ

Publication

- Year 2023

The application presented in this chapter is used to automatically detect pathological speech based on a database of recordings. First, the assumptions underlying the research are presented, together with the choice of the pathological speech database. The algorithms applied and the speech signal features that make it possible to distinguish undisturbed speech from pathological speech are also presented. The trained neural networks were then...


Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning

Publication

B. Kostek

- Year 2023

In this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learning-based audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIKW pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....


Detecting Lombard Speech Using Deep Learning Approach

Publication

K. Kąkol
G. Korvel
G. Tamulevicius
B. Kostek

- SENSORS - Year 2023

Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, the assumptions of the work performed for Lombard speech detection are outlined. The proposed framework combines convolutional neural networks...


INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH

Publication

G. Korvel
P. Treigys
K. Kąkol
B. Kostek

- International Journal of Applied Mathematics and Computer Science - Year 2023

The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...


Predicting emotion from color present in images and video excerpts by machine learning

Publication

- IEEE Access - Year 2023

This work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...


Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set

Publication

P. Filipowicz
B. Kostek

- Applied Sciences-Basel - Year 2023

This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...


SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM

Publication

- Journal of the Acoustical Society of America - Year 2023

The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...


WYKORZYSTANIE TESTU MUSHRA W BADANIU KORZYŚCI UŻYTKOWANIA PROTEZ SŁUCHOWYCH

Publication

P. Szymański
T. Poremski
B. Kostek

- Year 2023

Assessing the quality of hearing aid fitting in terms of the benefits the prosthesis can bring is a complex issue. The objective parameters of hearing aids that can be determined (e.g., gain or frequency response) do not always have a direct and decisive influence on the patient's subjective assessment of the fitting quality of the hearing prosthesis. Measurements of hearing aid effectiveness can concern many aspects, including compensation...


Publications


Year 2024

Sounding Mechanism of a Flue Organ Pipe—A Multi-Sensor Measurement Approach

Year 2023

Assessing the attractiveness of human face based on machine learning

Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders

AUTOMATYCZNA KLASYFIKACJA MOWY PATOLOGICZNEJ

Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning

Detecting Lombard Speech Using Deep Learning Approach

INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH

Predicting emotion from color present in images and video excerpts by machine learning

Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set

SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM

WYKORZYSTANIE TESTU MUSHRA W BADANIU KORZYŚCI UŻYTKOWANIA PROTEZ SŁUCHOWYCH

Year 2022

Algorithmically improved microwave radar monitors breathing more acurrate than sensorized belt

Analysis-by-synthesis paradigm evolved into a new concept

Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera

Computer-assisted pronunciation training—Speech synthesis is almost all you need

Intelligent Audio Signal Processing − Do We Still Need Annotated Datasets?

Investigating Noise Interference on Speech Towards Applying the Lombard Effect Automatically

Klasyfikacja emocji w muzyce filmowej z wykorzystaniem uczenia głębokiego

Machine learning applied to acoustic-based road traffic monitoring

