Laboratorium Akustyki Fonicznej - Administrative Units - Bridge of Knowledge

Search

Laboratorium Akustyki Fonicznej

Filters

total: 37

  • Category
  • Year
  • Options

clear Chosen catalog filters disabled

Catalog Publications

Year 2024
  • Sounding Mechanism of a Flue Organ Pipe—A Multi-Sensor Measurement Approach
    Publication

    - SENSORS - Year 2024

    This work presents an approach that integrates the results of measuring, analyzing, and modeling air flow phenomena driven by pressurized air in a flue organ pipe. The investigation concerns a Bourdon organ pipe. Measurements are performed in an anechoic chamber using the Cartesian robot equipped with a 3D acoustic vector sensor (AVS) that acquires both acoustic pressure and air particle velocity. Also, a high-speed camera is employed...

    Full text available to download

Year 2023
  • Assessing the attractiveness of human face based on machine learning
    Publication

    The attractiveness of the face plays an important role in everyday life, especially in the modern world where social media and the Internet surround us. In this study, an attempt to assess the attractiveness of a face by machine learning is shown. Attractiveness is determined by three deep models whose sum of predictions is the final score. Two annotated datasets available in the literature are employed for training and testing...

    Full text available to download

  • Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
    Publication

    The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

    Full text available to download

  • AUTOMATYCZNA KLASYFIKACJA MOWY PATOLOGICZNEJ
    Publication

    Aplikacja przedstawiona w niniejszym rozdziale służy do automatycznego wykrywania mowy patologicznej na podstawie bazy nagrań. W pierwszej kolejności przedstawiono założenia leżące u podstaw przeprowadzonych badan wraz z wyborem bazy mowy patologicznej. Zaprezentowano również zastosowane algorytmy oraz cechy sygnału mowy, które pozwalają odróżnić mowę niezaburzoną od mowy patologicznej. Wytrenowane sieci neuronowe zostały następnie...

    Full text to download in external service

  • Data, Information, Knowledge, Wisdom Pyramid Concept Revisited in the Context of Deep Learning
    Publication

    - Year 2023

    In this paper, the data, information, knowledge, and wisdom (DIKW) pyramid is revisited in the context of deep learning applied to machine learningbased audio signal processing. A discussion on the DIKW schema is carried out, resulting in a proposal that may supplement the original concept. Parallels between DIWK pertaining to audio processing are presented based on examples of the case studies performed by the author and her collaborators....

    Full text to download in external service

  • Detecting Lombard Speech Using Deep Learning Approach
    Publication
    • K. Kąkol
    • G. Korvel
    • G. Tamulevicius
    • B. Kostek

    - SENSORS - Year 2023

    Robust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...

    Full text available to download

  • INVESTIGATION OF THE LOMBARD EFFECT BASED ON A MACHINE LEARNING APPROACH
    Publication

    The Lombard effect is an involuntary increase in the speaker’s pitch, intensity, and duration in the presence of noise. It makes it possible to communicate in noisy environments more effectively. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the gender of the person on the detection process is examined. First, acoustic parameters...

    Full text available to download

  • Klasyfikacja emocji w muzyce filmowej z wykorzystaniem uczenia głębokiego
    Publication

    - Year 2023

    Praca przedstawia zagadnienia związane z klasyfikacją emocji w muzyce filmowej. W artykule zaproponowano model emocji zawierający dziewięć stanów emocjonalnych, do których przypisany jest kolor zgodnie z teorią koloru w filmie. Kolejne kroki eksperymentu obejmowały wybór muzyki filmowej do testów (baza Epidemic Sound), przygotowanie założeń ankiety oraz modelu emocji wykorzystywanych w testach odsłuchowych, a także konstrukcję...

    Full text to download in external service

  • Predicting emotion from color present in images and video excerpts by machine learning
    Publication

    This work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...

    Full text available to download

  • Rediscovering Automatic Detection of Stuttering and Its Subclasses through Machine Learning—The Impact of Changing Deep Model Architecture and Amount of Data in the Training Set
    Publication

    - Applied Sciences-Basel - Year 2023

    This work deals with automatically detecting stuttering and its subclasses. An effective classification of stuttering along with its subclasses could find wide application in determining the severity of stuttering by speech therapists, preliminary patient diagnosis, and enabling communication with the previously mentioned voice assistants. The first part of this work provides an overview of examples of classical and deep learning...

    Full text available to download

  • SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM

    The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...

    Full text available to download

  • WYKORZYSTANIE TESTU MUSHRA W BADANIU KORZYŚCI UŻYTKOWANIA PROTEZ SŁUCHOWYCH
    Publication

    - Year 2023

    Ocena jakości dopasowania aparatów słuchowych w kontekście korzyści, jakie może przy-nieść proteza jest złożonym zagadnieniem. Obiektywne parametry aparatów, które można wy-znaczyć (np. wzmocnienie czy pasmo przenoszenia) nie zawsze mają bezpośredni i decydujący wpływ w subiektywnej ocenie jakości dopasowania protezy słuchowej przez pacjenta. Pomiary efektywności aparatu słuchowego mogą dotyczyć wielu aspektów, między innymi kompensacji...

    Full text available to download

Year 2022
Year 2021
  • Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
    Publication

    The paper aims to discuss a case study of sensing analytics and technology in acoustics when applied to reverberation conditions. Reverberation is one of the issues that makes speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

    Full text available to download

  • AUTOMATYCZNE GENEROWANIE KOLEJNOŚCI LIST UTWORÓW MUZYCZNYCH
    Publication

    - Year 2021

    W niniejszym rozdziale przedstawiono przygotowanie algorytmu do automa-tycznego układania kolejności utworów muzycznych i zgrywającego je do postaci jednego, długiego miksu. Dzięki algorytmowi dobierane są utwory na podstawie analizy podobieństwa fragmentów końcowych i początkowych utworów. Podo-bieństwo to jest obliczane za pomocą odległości euklidesowej między wektorami parametrów wyznaczonymi przez autoenkoder oraz na podstawie...

    Full text to download in external service

  • Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
    Publication

    - Year 2021

    This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

    Full text available to download

  • Evaluation of aspiration problems in L2 English pronunciation employing machine learning

    The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...

    Full text available to download

  • Evaluation of Six Degrees of Freedom 3D Audio Orchestra Recording and Playback using multi-point Ambisonic interpolation
    Publication
    • T. Ciotucha
    • A. Rumiński
    • T. Żernicki
    • B. Mróz

    - Scopus - Year 2021

    This paper describes a strategy for recording sound and enabling six-degrees-of-freedom playback, making use of multiple simultaneous and synchronized Higher Order Ambisonics (HOA) recordings. Such a strategy enables users to navigate in a simulated 3D space and listen to the six-degrees-of-freedom recordings from different perspectives. For the evaluation of the proposed approach, an Unreal Engine-based navigable 3D audiovisual...

    Full text to download in external service

  • How Machine Learning Contributes to Solve Acoustical Problems
    Publication
    • M. A. Roch
    • P. Gerstoft
    • B. Kostek
    • Z. Michalopoulou

    - Journal of the Acoustical Society of America - Year 2021

    Machine learning is the process of learning functional relationships between measured signals (called percepts in the artificial intelligence literature) and some output of interest. In some cases, we wish to learn very specific relationships from signals such as identifying the language of a speaker (e.g. Zissman, 1996) which has direct applications such as in call center routing or performing a music information retrieval task...

    Full text available to download

  • Introduction to the special issue on machine learning in acoustics
    Publication
    • Z. Michalopoulou
    • P. Gerstoft
    • B. Kostek
    • M. A. Roch

    - Journal of the Acoustical Society of America - Year 2021

    When we started our Call for Papers for a Special Issue on “Machine Learning in Acoustics” in the Journal of the Acoustical Society of America, our ambition was to invite papers in which machine learning was applied to all acoustics areas. They were listed, but not limited to, as follows: • Music and synthesis analysis • Music sentiment analysis • Music perception • Intelligent music recognition • Musical source separation • Singing...

    Full text available to download

  • Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
    Publication

    - Year 2021

    A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare it to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...

    Full text to download in external service

  • Reinforcement Learning Algorithm and FDTD-based Simulation Applied to Schroeder Diffuser Design Optimization
    Publication

    The aim of this paper is to propose a novel approach to the algorithmic design of Schroeder acoustic diffusers employing a deep learning optimization algorithm and a fitness function based on a computer simulation of the propagation of acoustic waves. The deep learning method employed for the research is a deep policy gradient algorithm. It is used as a tool for carrying out a sequential optimization process the goal of which is...

    Full text available to download

  • Skuteczność klasyfikacji gatunków muzycznych za pomocą sieci neuronowej w zależności od typu danych wejściowych
    Publication

    Rozpoznawanie gatunku muzycznego jest jednym z podstawowych elementów inteligentnych systemów tworzenia automatycznych list muzyki. Platformy strumieniowe oferujące taką usługę wymagają rozwiązań, które umożliwią jak najdokładniej określić przynależność utworu do gatunku muzycznego. Zgodnie z aktualnym stanem wiedzy – najskuteczniejszym klasyfikatorem są sztuczne sieci neuronowe (w tym w wersji uczenia głębokiego), dla których...

    Full text to download in external service

Year 2017
  • Sound intensity distribution around organ pipe

    The aim of the paper was to compare acoustic field around the open and stopped organ pipes. The wooden organ pipe was located in the anechoic chamber and activated with a constant air flow, produced by an external air-compressor. Thus, long-term steady state response was possible to obtain. Multichannel acoustic vector sensor was used to measure the sound intensity distribution of radiated acoustic energy. Measurements have been...

    Full text available to download