prof. dr hab. inż. Bożena Kostek
Employment
- Head of Research Laboratory at Laboratorium Akustyki Fonicznej
- Professor at Laboratorium Akustyki Fonicznej
Publications
Filters
total: 384
Catalog Publications
Year 2024
-
Sounding Mechanism of a Flue Organ Pipe—A Multi-Sensor Measurement Approach
PublicationThis work presents an approach that integrates the results of measuring, analyzing, and modeling air flow phenomena driven by pressurized air in a flue organ pipe. The investigation concerns a Bourdon organ pipe. Measurements are performed in an anechoic chamber using the Cartesian robot equipped with a 3D acoustic vector sensor (AVS) that acquires both acoustic pressure and air particle velocity. Also, a high-speed camera is employed...
Year 2023
-
Applying the Lombard Effect to Speech-in-Noise Communication
PublicationThis study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...
-
Assessing the attractiveness of human face based on machine learning
PublicationThe attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. In this study, an attempt to assess the attractiveness of a face by machine learning is shown. Attractiveness is determined by three deep models whose sum of predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...
-
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
PublicationThe purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...
-
AUTOMATYCZNA KLASYFIKACJA MOWY PATOLOGICZNEJ
PublicationAplikacja przedstawiona w niniejszym rozdziale służy do automatycznego wykrywania mowy patologicznej na podstawie bazy nagrań. W pierwszej kolejności przedstawiono założenia leżące u podstaw przeprowadzonych badan wraz z wyborem bazy mowy patologicznej. Zaprezentowano również zastosowane algorytmy oraz cechy sygnału mowy, które pozwalają odróżnić mowę niezaburzoną od mowy patologicznej. Wytrenowane sieci neuronowe zostały następnie...
-
Combining MUSHRA Test and Fuzzy Logic in the Evaluation of Benefits of Using Hearing Prostheses
PublicationAssessing the effectiveness of hearing aid fittings based on the benefits they provide is crucial but intricate. While objective metrics of hearing aids like gain, frequency response, and distortion are measurable, they do not directly indicate user benefits. Hearing aid performance assessment encompasses various aspects, such as compensating for hearing loss and user satisfaction. The authors suggest enhancing the widely used...
-
Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning
PublicationIn this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learningbased audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIWK pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....
-
Detecting Lombard Speech Using Deep Learning Approach
PublicationRobust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
PublicationThe Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...
-
Predicting emotion from color present in images and video excerpts by machine learning
PublicationThis work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...
-
Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
PublicationThis work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...
-
SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM
PublicationThe main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...
-
WYKORZYSTANIE TESTU MUSHRA W BADANIU KORZYŚCI UŻYTKOWANIA PROTEZ SŁUCHOWYCH
PublicationOcena jakości dopasowania aparatów słuchowych w kontekście korzyści, jakie może przy-nieść proteza jest złożonym zagadnieniem. Obiektywne parametry aparatów, które można wy-znaczyć (np. wzmocnienie czy pasmo przenoszenia) nie zawsze mają bezpośredni i decydujący wpływ w subiektywnej ocenie jakości dopasowania protezy słuchowej przez pacjenta. Pomiary efektywności aparatu słuchowego mogą dotyczyć wielu aspektów, między innymi kompensacji...
Year 2022
-
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
PublicationObjective assessment of speech intelligibility is a complex task that requires taking into account a number of factors such as different perception of each speech sub-bands by the human hearing sense or different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...
-
Algoritmically improved microwave radar monitors breathing more acurrate than sensorized belt
PublicationThis paper describes a novel way to measure, process, analyze, and compare respiratory signals acquired by two types of devices: a wearable sensorized belt and a microwave radar-based sensor. Both devices provide breathing rate readouts. First, the background research is presented. Then, the underlying principles and working parameters of the microwave radar-based sensor, a contactless device for monitoring breathing, are described....
-
Analysis-by-synthesis paradigm evolved into a new concept
PublicationThis work aims at showing how the well-known analysis-by-synthesis paradigm has recently been evolved into a new concept. However, in contrast to the original idea stating that the created sound should not fail to pass the foolproof synthesis test, the recent development is a consequence of the need to create new data. Deep learning models are greedy algorithms requiring a vast amount of data that, in addition, should be correctly...
-
Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera
PublicationThis paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...
-
Computer-assisted pronunciation training—Speech synthesis is almost all you need
PublicationThe research community has long studied computer-assisted pronunciation training (CAPT) methods in non-native speech. Researchers focused on studying various model architectures, such as Bayesian networks and deep learning methods, as well as on the analysis of different representations of the speech signal. Despite significant progress in recent years, existing CAPT methods are not able to detect pronunciation errors with high...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
PublicationThe aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
Intelligent Audio Signal Processing − Do We Still Need Annotated Datasets?
PublicationIn this paper, intelligent audio signal processing examples are shortly described. The focus is, however, on the machine learning approach and datasets needed, especially for deep learning models. Years of intense research produced many important results in this area; however, the goal of fully intelligent signal processing, characterized by its autonomous acting, is not yet achieved. Therefore, a review of state-of-the-art concerning...
seen 5077 times