Laboratorium Akustyki Fonicznej

Publikacje

Rok 2023

Combining MUSHRA Test and Fuzzy Logic in the Evaluation of Benefits of Using Hearing Prostheses
Publikacja
- P. Szymański
- T. Poremski
- B. Kostek
- Electronics - Rok 2023
Assessing the effectiveness of hearing aid fittings based on the benefits they provide is crucial but intricate. While objective metrics of hearing aids like gain, frequency response, and distortion are measurable, they do not directly indicate user benefits. Hearing aid performance assessment encompasses various aspects, such as compensating for hearing loss and user satisfaction. The authors suggest enhancing the widely used...

Pełny tekst do pobrania w portalu
Applying the Lombard Effect to Speech-in-Noise Communication
Publikacja
- G. Korvel
- K. Kąkol
- P. Treigys
- B. Kostek
- Electronics - Rok 2023
This study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...

Pełny tekst do pobrania w portalu

Rok 2022

Musical Instrument Identification Using Deep Learning Approach
Publikacja
- M. Blaszke
- B. Kostek
- SENSORS - Rok 2022
The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata...

Pełny tekst do pobrania w portalu
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
Publikacja
- Applied Sciences-Basel - Rok 2022
The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...

Pełny tekst do pobrania w portalu
A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times
Publikacja
- SENSORS - Rok 2022
Objective assessment of speech intelligibility is a complex task that requires taking into account a number of factors such as different perception of each speech sub-bands by the human hearing sense or different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method used for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way...

Pełny tekst do pobrania w portalu

Rok 2021

Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech
Publikacja
- D. Korzekwa
- J. Lorenzo-trueba
- T. Drugman
- S. Calamaro
- B. Kostek
- Rok 2021
We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. To train this model, phonetically transcribed L2 speech is not required and we only need to mark mispronounced words. The lack of phonetic transcriptions for L2 speech means that the model has to learn only from a weak signal of word-level mispronunciations. Because of that and due to the limited amount of mispronounced...

Pełny tekst do pobrania w portalu
Techniki wielokanałowe wykorzystywane w koncertach i nagraniach muzycznych na odległość
Publikacja
- Rok 2021
W czasie pandemii koronawirusa COVID-19 nowego znaczenia nabrały możliwości transmisji dźwięku z obrazem – zwłaszcza do pracy zdalnej, która w przypadku muzyków jest szczególnym wyzwaniem zarówno w kontekście wspólnych ćwiczeń i prób, jak i koncertów. Wynikła konieczność wieloźródłowego połączenia ujawniła potrzebę uprzestrzennienia dźwięku w celu łatwiejszej lokalizacji źródeł dźwięku. Tworzenie zdalnych nagrań muzycznych stało...

Pełny tekst do pobrania w serwisie zewnętrznym
KLASYFIKACJA EMOCJI W MUZYCE FILMOWEJ Z WYKORZYSTANIEM TESTÓW SUBIEKTYWNYCH
Publikacja
- Rok 2021
Celem referatu było przedstawienie testów odsłuchowych, w których zadaniem osób ankietowanych było przypisanie danego fragmentu muzycznego do odpowiedniej klasy emocji. Kolejne kroki eksperymentu obejmowały wybór muzyki filmowej do testów (baza Epidemic Sound), przygotowanie założeń ankiety oraz modelu emocji wykorzystywanych w testach odsłuchowych, jak również konstrukcj ˛e ankiety. Ankieta została zrealizowana za pomoc ˛a formularzy...

Pełny tekst do pobrania w portalu
Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network
Publikacja
- G. Korvel
- P. Treigys
- B. Kostek
- Journal of the Acoustical Society of America - Rok 2021
The goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...

Pełny tekst do pobrania w portalu
Classifying Emotions in Film Music - A Deep Learning Approach
Publikacja
- Electronics - Rok 2021
The paper presents an application for automatically classifying emotions in film music. A model of emotions is proposed, which is also associated with colors. The model created has nine emotional states, to which colors are assigned according to the color theory in film. Subjective tests are carried out to check the correctness of the assumptions behind the adopted emotion model. For that purpose, a statistical analysis of the...

Pełny tekst do pobrania w portalu

Rok 2020

Ranking Speech Features for Their Usage in Singing Emotion Classification
Publikacja
- S. Zaporowski
- B. Kostek
- Rok 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Pełny tekst do pobrania w portalu
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publikacja
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Rok 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Pełny tekst do pobrania w portalu
Investigating Feature Spaces for Isolated Word Recognition
Publikacja
- P. Treigys
- G. Korvel
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- Rok 2020
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Pełny tekst do pobrania w serwisie zewnętrznym
Improving Objective Speech Quality Indicators in Noise Conditions
Publikacja
- K. Kąkol
- G. Korvel
- B. Kostek
- Rok 2020
This work aims at modifying speech signal samples and test them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Pełny tekst do pobrania w serwisie zewnętrznym
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
Publikacja
- G. Korvel
- K. Kąkol
- O. Kurasova
- B. Kostek
- IEEE Access - Rok 2020
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Pełny tekst do pobrania w portalu
Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres
Publikacja
- Electronics - Rok 2020
The purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....

Pełny tekst do pobrania w portalu
Analyzing the Effectiveness of the Brain–Computer Interface for Task Discerning Based on Machine Learning
Publikacja
- SENSORS - Rok 2020
The aim of the study is to compare electroencephalographic (EEG) signal feature extraction methods in the context of the effectiveness of the classification of brain activities. For classification, electroencephalographic signals were obtained using an EEG device from 17 subjects in three mental states (relaxation, excitation, and solving logical task). Blind source separation employing independent component analysis (ICA) was...

Pełny tekst do pobrania w portalu
Analiza ruchu drogowego z wykorzystaniem analizy akustycznej
Publikacja
- K. Marciniuk
- B. Kostek
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Rok 2020
Tematyka pracy porusza zagadnienia dotyczące pozyskiwania informacji o ruchu drogowym z wykorzystaniem monitoringu akustycznego. Przybliżono podstawowe techniki nadzoru nad ruchem drogowym. Przedstawiono założenia akustycznego detektora ruchu i zbadano jego skuteczność na trzech płaszczyznach działania – zliczania pojazdów, klasyfikacji rodzajowej i klasyfikacji warunków pogodowych panujących na nawierzchni

Pełny tekst do pobrania w serwisie zewnętrznym
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publikacja
- G. Tamulevicius
- G. Korvel
- A. B. Yayak
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Electronics - Rok 2020
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset with the size of more than 10.000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Pełny tekst do pobrania w portalu

Rok 2019

Subjective tests for gathering konwledge for applaying color grading to video clips automatically
Publikacja
- D. Weber
- B. Kostek
- Rok 2019
The analysis of film music concerning caused emotions may allow for a more accurate adaptation of the color of the film in the context of color grading. Therefore, this paper aims to gather knowledge on the correlation between the applied color palette to a video clip, music associated with a particular shot,and emotions evoked. For that purpose, subjective tests are prepared in which several video clips are presented with...

Pełny tekst do pobrania w serwisie zewnętrznym
Subjective tests for gathering knowledge for applying color grading to video clips automatically
Publikacja
- D. Weber
- B. Kostek
- Rok 2019
The analysis of film music concerning caused emotions may allow for a more accurate adaptation of the color of the film in the context of color grading. Therefore, this paper aims to gather knowledge on the correlation between the applied color palette to a video clip, music associated with a particular shot, and emotions evoked. For that purpose, subjective tests are prepared in which several video clips are presented with or...

Pełny tekst do pobrania w portalu
Speech Analytics Based on Machine Learning
Publikacja
- Rok 2019
In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Pełny tekst do pobrania w serwisie zewnętrznym
Sound engineering as our commitment to its creators in Poland
Publikacja
- B. Kostek
- A. Czyżewski
- Archives of Acoustics - Rok 2019
Sound engineering is an interdisciplinary and rapidly expanding domain. It covers many aspects, such as sound perception, studio and sound mastering technology, music information retrieval including content-based search systems and automatic music transcription frameworks, sound synthesis, sound restoration, electroacoustics, and other ones constituting multimedia technology. Moreover, machine learning methods applied to the topics...

Pełny tekst do pobrania w serwisie zewnętrznym
Relationship between album cover design and music genres.
Publikacja
- A. Dorochowicz
- B. Kostek
- Rok 2019
The aim of the study is to find out whether there exists a relationship between typographic, compositional and coloristic elements of the music album cover design and music contained in the album. The research study involves basic statistical analysis of the manually extracted data coming from the worldwide album covers. The samples represent 34 different music genres, coming from nine countries from around the world. There are...
Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification
Publikacja
- Rok 2019
The recordings were made with a fast video camera and with a microphone. Using fast cameras allowed for observation of the micro vibrations of the object structure. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. An idea was to use video to recover sound & vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals...

Pełny tekst do pobrania w serwisie zewnętrznym
Music information retrieval—The impact of technology, crowdsourcing, big data, and the cloud in art.
Publikacja
- B. Kostek
- Journal of the Acoustical Society of America - Rok 2019
The exponential growth of computer processing power, cloud data storage, and crowdsourcing model of gathering data bring new possibilities to music information retrieval (mir) field. Mir is no longer music content retrieval only; the area also comprises the discovery of expressing feelings and emotions contained in music, incorporating other than hearing modalities for helping this issue, users’ profiling, merging music with social...

Pełny tekst do pobrania w portalu
Method for Clustering of Brain Activity Data Derived from EEG Signals
Publikacja
- FUNDAMENTA INFORMATICAE - Rok 2019
A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...

Pełny tekst do pobrania w portalu
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
Publikacja
- M. Piotrowska
- G. Korvel
- B. Kostek
- T. Ciszewski
- A. Czyżewski
- International Journal of Applied Mathematics and Computer Science - Rok 2019
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and selforganizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Pełny tekst do pobrania w portalu
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publikacja
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Rok 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Pełny tekst do pobrania w portalu
Discovering Rule-Based Learning Systems for the Purpose of Music Analysis
Publikacja
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Rok 2019
Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

Pełny tekst do pobrania w portalu
Comparison of the effectiveness of automatic EEG signal class separation algorithms
Publikacja
- JOURNAL OF INTELLIGENT & FUZZY SYSTEMS - Rok 2019
In this paper, an algorithm for automatic brain activity class identification of EEG (electroencephalographic) signals is presented. EEG signals are gathered from seventeen subjects performing one of the three tasks: resting, watching a music video and playing a simple logic game. The methodology applied consists of several steps, namely: signal acquisition, signal processing utilizing z-score normalization, parametrization and...

Pełny tekst do pobrania w portalu
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
Publikacja
- G. Korvel
- O. Kurasova
- B. Kostek
- Archives of Acoustics - Rok 2019
The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...

Pełny tekst do pobrania w portalu
Assessment of the Effectiveness of a Short-term Hearing Aid Use in Patients with Different Degrees of Hearing Loss
Publikacja
- T. Poremski
- P. Szymański
- B. Kostek
- Archives of Acoustics - Rok 2019
The study presents evaluating the effectiveness of the hearing aid fitting process in the short-term use (7 days). The evaluation method consists of a survey based on the APHAB (Abbreviated Profile of Hearing Aid Benefit) questionnaire. Additional criteria such as a degree of hearing loss, number of hours and days of hearing aid use as well as the user’s experience were also taken into consideration. The outcomes of the benefit...

Pełny tekst do pobrania w portalu
ANALIZA PARAMETRÓW SYGNAŁU MOWY W KONTEKŚCIE ICH PRZYDATNOŚCI W AUTOMATYCZNEJ OCENIE JAKOŚCI EKSPRESJI ŚPIEWU
Publikacja
- S. Zaporowski
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Rok 2019
Praca dotyczy podejścia do parametryzacji w przypadku klasyfikacji emocji w śpiewie oraz porównania z klasyfikacją emocji w mowie. Do tego celu wykorzystano bazę mowy i śpiewu nacechowanego emocjonalnie RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), zawierającą nagrania profesjonalnych aktorów prezentujących sześć różnych emocji. Następnie obliczono współczynniki mel-cepstralne (MFCC) oraz wybrane deskryptory...

Pełny tekst do pobrania w portalu
ANALIZA KOLORÓW SCEN FILMOWYCH W KONTEKŚCIE COLOR GRADINGU
Publikacja
- D. Weber
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Rok 2019
W artykule przedstawiono zagadnienia związane z kolorowaniem sceny filmowej. W pracy przedyskutowano główne aspekty obróbki koloru obrazu filmowego oraz omówiono definicje pojęć związanych z kolorowaniem sceny, tj.: color correction oraz color gradingu. Opisano teorie psychologii koloru oraz ich praktyczne wykorzystanie w filmie i odniesiono je do podstawowych gatunków filmowych i modeli emocji. Następnie przedyskutowano założenia...

Pełny tekst do pobrania w portalu
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publikacja
- G. Korvel
- O. Kurasova
- B. Kostek
- Rok 2019
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...

Pełny tekst do pobrania w portalu
A Concept of Automatic Film Color Grading Based on Music Recognition and Evoked Emotions
Publikacja
- D. Weber
- B. Kostek
- Rok 2019
The article presents the aspects of the final selection of the color of shots in film production based on the psychology of color. First of all, the elements of color processing, contrast, saturation or white balance in the film shots were presented and the definition of color grading was given. In the second part of the article the analysis of film music was conducted in the context of stimulating appropriate emotions while watching...

Rok 2018

ZASTOSOWANIE APLIKACJI INTERNETOWEJ W OCENIE JAKOŚCI DOPASOWANIA APARATÓW SŁUCHOWYCH
Publikacja
- P. Szymański
- T. Poremski
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Rok 2018
W pracy opisano zastosowanie aplikacji internetowej do oceny jakości dopasowania aparatów słuchowych. Metoda oceny polega na badaniu ankietowym, uzupełnionym testem rozumienia słów jednosylabowych w polu swobodnym. Opisywana aplikacja internetowa pozwala na przeprowadzenie badania z dowolnego komputera z dostępem do sieci. Dzięki implementacji metody w postaci aplikacji internetowej, można w systematyczny i uporządkowany sposób...

Pełny tekst do pobrania w portalu
WYKORZYSTANIE SIECI NEURONOWYCH DO SYNTEZY MOWY WYRAŻAJĄCEJ EMOCJE
Publikacja
- S. Zaporowski
- B. Kostek
- Rok 2018
W niniejszym artykule przedstawiono analizę rozwiązań do rozpoznawania emocji opartych na mowie i możliwości ich wykorzystania w syntezie mowy z emocjami, wykorzystując do tego celu sieci neuronowe. Przedstawiono aktualne rozwiązania dotyczące rozpoznawania emocji w mowie i metod syntezy mowy za pomocą sieci neuronowych. Obecnie obserwuje się znaczny wzrost zainteresowania i wykorzystania uczenia głębokiego w aplikacjach związanych...
Towards Audio Signal Equalization Based on Spectral Characteristics of a Listening Room and Music Content Reproduced
Publikacja
- P. Hoffmann
- B. Kostek
- Rok 2018
This study presents investigations of the influence of the room acoustics on the frequency characteristic of the audio signal playback. First, the concept of a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on an equalizer designed is proposed. The system settings depend on music genre recognized automatically....

Pełny tekst do pobrania w serwisie zewnętrznym
The influence of time of hearing aid use on auditory perception in various acoustic situations
Publikacja
- P. Szymański
- T. Poremski
- B. Kostek
- Journal of the Acoustical Society of America - Rok 2018
The assessment of sound perception in hearing aids, especially in the context of benefits that a prosthesis can bring, is a complex issue. The objective parameters of the hearing aids can easily be determined. These parameters, however, do not always have a direct and decisive influence on the subjective assessment of quality of the patient’s hearing while using a hearing aid. The paper presents the development of a method for...

Pełny tekst do pobrania w serwisie zewnętrznym
The influence of sound track on the viewer’s emotions and correction of the color in the film
Publikacja
- D. Weber
- B. Kostek
- Rok 2018
The article presents the aspects of the final selection of colors in film production based on the emotions caused by the soundtrack of the film. First, the processing of colors, contrast, saturation and white balance of shots in the film was presented. The definition of color grading is also described, i.e. the color changes in the film's views. In the second part of the article, the soundtracks of the film were analyzed, in particular...
Sound quality metrics applied to road noise evaluation
Publikacja
- K. Marciniuk
- B. Kostek
- Journal of the Acoustical Society of America - Rok 2018
Road noise monitoring systems typically measure sound levels in specific time periods. The more insightful approach suggests to measure also the nature of noise. Sound quality of sounds such as car noise can be objectively evaluated by several parameters. One of them is psychoacoustic annoyance, described by loudness, tone color, and the temporal structure of sound. In this paper the assessment of several sound quality parameters, such...

Pełny tekst do pobrania w serwisie zewnętrznym
POPRAWA OBIEKTYWNYCH WSKAŹNIKÓW JAKOŚCI MOWY W WARUNKACH HAŁASU
Publikacja
- K. Kąkol
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Rok 2018
Celem pracy jest modyfikacja sygnału mowy, aby uzyskać zwiększenie poprawy obiektywnych wskaźników jakości mowy po zmiksowaniu sygnału użytecznego z szumem bądź z sygnałem zakłócającym. Wykonane modyfikacje sygnału bazują na cechach mowy lombardzkiej, a w szczególności na efekcie podniesienia częstotliwości podstawowej F0. Sesja nagraniowa obejmowała zestawy słów i zdań w języku polskim, nagrane w warunkach ciszy, jak również w...

Pełny tekst do pobrania w portalu
Objectivization of phonological evaluation of speech elements by means of audio parametrization
Publikacja
- Rok 2018
This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publikacja
- Rok 2018
The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...

Pełny tekst do pobrania w serwisie zewnętrznym
Listening to Live Music: Life beyond Music Recommendation Systems
Publikacja
- B. Kostek
- Rok 2018
This paper presents first a short review on music recommendation systems based on social collaborative filtering. A dictionary of terms related to music recommendation systems, such as music information retrieval (MIR), Query-by-Example (QBE), Query-by-Category (QBC), music content, music annotating, music tagging, bridging the semantic gap in music domain, etc. is introduced. Bases of music recommender systems are shortly presented,...

Pełny tekst do pobrania w serwisie zewnętrznym
Investigating Feature Spaces for Isolated Word Recognition
Publikacja
- G. Korvel
- G. Tamulevicus
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Rok 2018
Much attention is given by researchers to the speech processing task in automatic speech recognition (ASR) over the past decades. The study addresses the issue related to the investigation of the appropriateness of a two-dimensional representation of speech feature spaces for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and timefrequency signal representation...
INFLUENCE OF DATA NORMALIZATION ON THE EFFECTIVENESS OF NEURAL NETWORKS APPLIED TO CLASSIFICATION OF PAVEMENT CONDITIONS – CASE STUDY
Publikacja
- K. Marciniuk
- B. Kostek
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Rok 2018
In recent years automatic classification employing machine learning seems to be in high demand for tele-informatic-based solutions. An example of such solutions are intelligent transportation systems (ITS), in which various factors are taken into account. The subject of the study presented is the impact of data pre-processing and normalization on the accuracy and training effectiveness of artificial neural networks in the case...
In Memoriam Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering
Publikacja
- A. Czyżewski
- B. Kostek
- Archives of Acoustics - Rok 2018
Biography and scientific achievements of Professors Marianna Sankiewicz-Budzyński and Gustaw K.E. Budzyński - Founders of the Polish Audio Engineering.

Pełny tekst do pobrania w portalu

Wyszukiwarka

Laboratorium Akustyki Fonicznej

Publikacje

Filtry

Kategoria

Rok

Opcje

Katalog Publikacji

Rok 2023

Rok 2022

Rok 2021

Rok 2020

Rok 2019

Rok 2018