Department Of Multimedia Systems

Publications

Year 2022

Robust Object Detection with Multi-input Multi-output Faster R-CNN
Publication
- S. Cygert
- A. Czyżewski
- Year 2022
Recent years have seen impressive progress in visual recognition on many benchmarks; however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, it was recently shown that similarly competitive results can be achieved at a much smaller cost by using a multi-input multi-output (MIMO) architecture. In this work,...

Full text to download in external service
Sensing Direction of Human Motion Using Single-Input-Single-Output (SISO) Channel Model and Neural Networks
Publication
- S. A. Bhat
- M. A. Dar
- P. Szczuko
- D. Alyahya
- F. Mustafa
- IEEE Access - Year 2022
Through-the-walls object detection enables the localization and identification of objects hidden behind walls. While numerous studies have exploited the Channel State Information of Multiple Input Multiple Output (MIMO) WiFi and radar devices in association with artificial intelligence (AI) based algorithms to detect and localize objects behind walls, this study proposes a novel non-invasive through-the-walls human motion direction...

Full text available to download
Systematic Literature Review for Emotion Recognition from EEG Signals
Publication
- P. A. Leszczełowska (formerly: P. Leszczełowska)
- N. Dawidowska
- Year 2022
Researchers have recently become increasingly interested in recognizing emotions from electroencephalogram (EEG) signals and many studies utilizing different approaches have been conducted in this field. For the purposes of this work, we performed a systematic literature review including over 40 articles in order to identify the best set of methods for the emotion recognition problem. Our work collects information about the most...

Full text to download in external service
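Technologia CyberOko do diagnozy, rehabilitacji i komunikowania się z pacjentami niewykazującymi oznak przytomności [CyberOko Technology for Diagnosis, Rehabilitation, and Communication with Patients Showing No Signs of Consciousness]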
Publication
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2022
CyberOko is a solution developed at Gdańsk University of Technology that makes it possible to establish contact and work with people who have profound communication impairments. It intelligently tracks eye movements, enabling rehabilitation and assessment of the patient's state of consciousness even in cases of complete paralysis. The solution also includes EEG wave analysis, objective hearing testing, and examination of signals from arrays of implanted electrodes...

Full text available to download
Vehicle Detection and Speed Estimation Using Millimetre Wave Radar
Publication
- P. Odya
- Year 2022
The dataset titled "Data from 76- to 81-GHz mmWave Sensor located at S7 road" contains data recorded with an IWR1642 mmWave sensor from Texas Instruments. The data come from two sessions lasting 24 h each. The dataset makes it possible to perform analyses related to car traffic intensity on one of the carriageways of the motorway heading to the Gdańsk metropolitan area. Based on the gathered data, it is possible to calculate...

Full text available to download

Year 2021

Acoustic Detector of Road Vehicles Based on Sound Intensity
Publication
- G. Szwoch
- J. Kotus
- SENSORS - Year 2021
A method of detecting and counting road vehicles using an acoustic sensor placed by the road is presented. The sensor measures sound intensity in two directions: parallel and perpendicular to the road. The sound intensity analysis performs acoustic event detection. A normalized position of the sound source is tracked and used to determine if the detected event is related to a moving vehicle and to establish the direction of movement....

Full text available to download
Acoustic Sensing Analytics Applied to Speech in Reverberation Conditions
Publication
- SENSORS - Year 2021
The paper discusses a case study of sensing analytics and technology in acoustics applied to reverberation conditions. Reverberation is one of the issues that make speech in indoor spaces challenging to understand. This problem is particularly critical in large spaces with few absorbing or diffusing surfaces. One of the natural remedies to improve speech intelligibility in such conditions may be achieved through speaking...

Full text available to download
Adaptive Method for Modeling of Temporal Dependencies between Fields of Vision in Multi-Camera Surveillance Systems
Publication
- K. Lisowski
- A. Czyżewski
- Electronics - Year 2021
A method of modeling the time of object transition between given pairs of cameras based on the Gaussian Mixture Model (GMM) is proposed in this article. Temporal dependencies modeling is a part of object re-identification based on the multi-camera experimental framework. The previously utilized Expectation-Maximization (EM) approach, requiring setting the number of mixtures arbitrarily as an input parameter, was extended with the...

Full text available to download
Ambisoniczna mapa wybranych miejsc w Trójmieście z obrazem 360° [Ambisonic Map of Selected Places in the Tricity with 360° Video]
Publication
- C. Pietrzak
- P. Odya
- Year 2021
The aim of the project described in this chapter was to create an ambisonic map of the Tricity in the form of a web application. 360° video material with sound in the form of an ambisonic signal was recorded at selected locations considered characteristic of this agglomeration. The research goal of the project was to compare the available algorithms for mixing ambisonic signals...

Full text to download in external service
An Automated Method for Biometric Handwritten Signature Authentication Employing Neural Networks
Publication
- M. Kurowski
- A. Sroczyński
- G. Bogdanis
- A. Czyżewski
- Electronics - Year 2021
Handwriting biometrics applications in e-Security and e-Health are addressed in the course of the conducted research. An automated graphomotor analysis method for the dynamic electronic representation of the handwritten signature authentication was researched. The developed algorithms are based on dynamic analysis of electronically handwritten signatures employing neural networks. The signatures were acquired with the use of the...

Full text available to download
Analiza zależności muzyczno-graficznej okładek albumów z użyciem algorytmów uczących się [Analysis of the Relationship between Music and Album Cover Graphics Using Learning Algorithms]
Publication
- A. Dorochowicz
- Year 2021
The aim of the dissertation is to analyze the relationship between music and album cover graphics using learning algorithms. The parameters of the studied music genres, the relationships between music genres and personality types, and the features of music album covers and their correlations with music genres are taken into account. The developed methodology is used to examine the possibility of automatic music genre classification...

Full text available to download
Automatyczne generowanie kolejności list utworów muzycznych [Automatic Generation of Track Order in Music Playlists]
Publication
- K. Pietrusińska
- A. Kurowski
- B. Kostek
- Year 2021
This chapter presents the development of an algorithm that automatically orders music tracks and renders them into a single long mix. The algorithm selects tracks based on an analysis of the similarity between the ending and beginning fragments of tracks. This similarity is computed using the Euclidean distance between parameter vectors produced by an autoencoder and on the basis of...

Full text to download in external service
Closer Look at the Uncertainty Estimation in Semantic Segmentation under Distributional Shift
Publication
- Year 2021
While recent computer vision algorithms achieve impressive performance on many benchmarks, they lack robustness: presented with an image from a different distribution (e.g., weather or lighting conditions not considered during training), they may produce an erroneous prediction. Therefore, it is desirable that such a model be able to reliably predict its confidence measure. In this work, uncertainty estimation for the task...

Full text available to download
Concurrent Video Denoising and Deblurring for Dynamic Scenes
Publication
- IEEE Access - Year 2021
Dynamic scene video deblurring is a challenging task due to the spatially variant blur inflicted by independently moving objects and camera shakes. Recent deep learning works bypass the ill-posedness of explicitly deriving the blur kernel by learning pixel-to-pixel mappings, which is commonly enhanced by larger region awareness. This is a difficult yet simplified scenario because noise is neglected when it is omnipresent in a wide...

Full text available to download
CyberEye: New Eye-Tracking Interfaces for Assessment and Modulation of Cognitive Functions beyond the Brain
Publication
- SENSORS - Year 2021
The emergence of innovative neurotechnologies in global brain projects has accelerated research and clinical applications of BCIs beyond sensory and motor functions. Both invasive and noninvasive sensors are developed to interface with cognitive functions engaged in thinking, communication, or remembering. The detection of eye movements by a camera offers a particularly attractive external sensor for computer interfaces to monitor,...

Full text available to download
Designing acoustic scattering elements using machine learning methods
Publication
- A. Kurowski
- Year 2021
In the process of the design and correction of room acoustic properties, it is often necessary to select the appropriate type of acoustic treatment devices and make decisions regarding their size, geometry, and location of the devices inside the room under the treatment process. The goal of this doctoral dissertation is to develop and validate a mathematical model that allows predicting the effects of the application of the scattering...

Full text available to download
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
Publication
- D. Korzekwa
- R. Barra-Chicote
- S. Zaporowski
- G. Beringer
- J. Lorenzo-Trueba
- A. Serafinowicz
- J. Droppo
- T. Drugman
- B. Kostek
- Year 2021
This paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...

Full text available to download
Direct electrical stimulation of the human brain has inverse effects on the theta and gamma neural activities
Publication
- M. Lech
- B. M. Berry
- C. Topcu
- V. Kremen
- P. Nejedly
- B. Lega
- R. E. Gross
- M. R. Sperling
- B. C. Jobst
- S. A. Sheth... and 4 others
- IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING - Year 2021
Objective: Our goal was to analyze the electrophysiological response to direct electrical stimulation (DES) systematically applied at a wide range of parameters and anatomical sites, with particular focus on neural activities associated with memory and cognition. Methods: We used a large set of intracranial EEG (iEEG) recordings with DES from 45 subjects with electrodes...

Full text available to download
Estimation of Average Speed of Road Vehicles by Sound Intensity Analysis
Publication
- J. Kotus
- G. Szwoch
- SENSORS - Year 2021
Constant monitoring of road traffic is an important part of modern smart city systems. The proposed method estimates the average speed of road vehicles in the observation period, using a passive acoustic vector sensor. Speed estimation based on sound intensity analysis is a novel approach to the described problem. Sound intensity in two orthogonal axes is measured with a sensor placed alongside the road. Position of the apparent sound...

Full text available to download
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
Publication
- M. Piotrowska
- A. Czyżewski
- T. Ciszewski
- G. Korvel
- A. Kurowski
- B. Kostek
- Journal of the Acoustical Society of America - Year 2021
The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...

Full text available to download
Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network
Publication
- G. Korvel
- P. Treigys
- B. Kostek
- Journal of the Acoustical Society of America - Year 2021
The goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...

Full text available to download
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
Publication
- V. Marks
- K. Saboo
- Ç. Topçu
- M. Lech
- T. Thayib
- P. Nejedly
- V. Kremen
- G. A. Worrell
- M. T. Kucewicz (formerly: M. Kucewicz)
- NEUROIMAGE - Year 2021
A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...

Full text available to download
Independent dynamics of slow, intermediate, and fast intracranial EEG spectral activities during human memory formation
Publication
- V. S. Marks
- K. V. Saboo
- C. Topcu
- T. P. Thayib
- P. Nejedly
- V. Kremen
- G. A. Worrell
- M. T. Kucewicz
- Year 2021
A wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various low and high frequencies are spatiotemporally coordinated across the human brain during memory processing is inconclusive. They can either be coordinated together across a wide range of the frequency spectrum or induced in specific bands. We used a large dataset of human intracranial electroencephalography...

Full text to download in external service
Leveraging spatio-temporal features for joint deblurring and segmentation of instruments in dental video microscopy
Publication
- E. Katsaros
- A. Jezierska
- D. Węsierski
- Year 2021
In dentistry, microscopes have become indispensable optical devices for high-quality treatment and micro-invasive surgery, especially in the field of endodontics. Recent machine vision advances enable more advanced, real-time applications including but not limited to dental video deblurring and workflow analysis through relevant metadata obtained by instrument motion trajectories. To this end, the proposed work addresses dental...

Full text to download in external service
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
Publication
- D. Korzekwa
- J. Lorenzo-Trueba
- S. Zaporowski
- S. Calamaro
- T. Drugman
- B. Kostek
- Year 2021
A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...

Full text to download in external service
Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones
Publication
- B. Mróz
- M. Kabaciński
- T. Ciotucha
- A. Rumiński
- T. Żernicki
- Year 2021
This paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...

Full text available to download
Robustness in Compressed Neural Networks for Object Detection
Publication
- S. Cygert
- A. Czyżewski
- Year 2021
Model compression techniques make it possible to significantly reduce the computational cost associated with data processing by deep neural networks with only a minor decrease in average accuracy. At the same time, reducing the model size may have a large effect on noisy cases or objects belonging to less frequent classes. It is a crucial problem from the perspective of the models' safety, especially for object detection in the autonomous driving...

Full text available to download
Selective monitoring of noise emitted by vehicles involved in road traffic
Publication
- A. Czyżewski
- T. Śmiałkowski
- Journal of the Acoustical Society of America - Year 2021
An acoustic intensity probe was developed that measures the sound intensity in three orthogonal directions, making it possible to calculate the azimuth and elevation angles describing the sound source position. The acoustic sensor is made in the form of a cube with a side of 10 mm, on the inner surfaces of which the digital MEMS microphones are mounted. The algorithm works in two stages. The first stage is based on the analysis of sound...

Full text available to download
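Skuteczność klasyfikacji gatunków muzycznych za pomocą sieci neuronowej w zależności od typu danych wejściowych [Effectiveness of Music Genre Classification Using a Neural Network Depending on the Type of Input Data]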
Publication
- Year 2021
Music genre recognition is one of the basic elements of intelligent systems for creating automatic music playlists. Streaming platforms offering such a service require solutions that determine as accurately as possible which music genre a track belongs to. According to the current state of knowledge, the most effective classifiers are artificial neural networks (including deep learning variants), for which...

Full text to download in external service
Techniki wielokanałowe wykorzystywane w koncertach i nagraniach muzycznych na odległość [Multichannel Techniques Used in Remote Concerts and Music Recordings]
Publication
- Year 2021
During the COVID-19 coronavirus pandemic, the ability to transmit sound together with video took on new significance, especially for remote work, which for musicians is a particular challenge both in joint practice and rehearsals and in concerts. The resulting need for multi-source connections revealed the need to spatialize sound so that sound sources can be localized more easily. Creating remote music recordings has become...

Full text to download in external service
Towards Cancer Patients Classification Using Liquid Biopsy
Publication
- S. Cygert
- F. Górski
- P. Juszczyk
- S. Lewalski
- K. Pastuszak
- A. Czyżewski
- A. Supernat
- Year 2021
Liquid biopsy is a useful, minimally invasive diagnostic and monitoring tool for cancer. Yet developing accurate methods remains very challenging, given the potentially large number of input features and the usually small dataset sizes. Recently, a novel feature parameterization based on the RNA-sequenced platelet data which uses the biological knowledge from the Kyoto Encyclopedia of Genes and Genomes, combined with a classifier...

Full text to download in external service

Year 2020

1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type
Publication
- Year 2020
The paper proposes a network architecture that may be employed for sensing and recognizing the type of a vehicle on the basis of audio recordings made in the proximity of a road. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain the sounds of passing vehicles are also taken into account and marked as containing silence....
A Study of Cross-Linguistic Speech Emotion Recognition Based on 2D Feature Spaces
Publication
- G. Tamulevicius
- G. Korvel
- A. B. Yayak
- P. Treigys
- J. Bernataviciene
- B. Kostek
- Electronics - Year 2020
In this research, a study of cross-linguistic speech emotion recognition is performed. For this purpose, emotional data of different languages (English, Lithuanian, German, Spanish, Serbian, and Polish) are collected, resulting in a cross-linguistic speech emotion dataset of more than 10,000 emotional utterances. Despite the bi-modal character of the databases gathered, our focus is on the acoustic representation...

Full text available to download
Adaptive traffic optimization using Variable Speed Limits; Adaptacyjna optymalizacja ruchu drogowego przy pomocy zmiennych ograniczeń prędkości
Publication
- P. Gora
- Year 2020
Variable speed limits (VSL) is an intelligent transportation system (ITS) solution for traffic management. The speed limits can be changed dynamically in order to adapt to traffic, weather, or road surface conditions. This paper presents an approach for such an adaptive traffic control where the primary goal is to ensure traffic safety and efficiency of the traffic control system (fast response to dynamically changing traffic,...

Full text to download in external service
Ambisoniczna mapa wybranych miejsc w Trójmieście [Ambisonic Map of Selected Places in the Tricity]
Publication
- C. Pietrzak
- P. Odya
- Year 2020
The project aimed to create an ambisonic map of the Tricity in the form of a web application. 360-degree video material with sound in the form of an ambisonic signal was recorded at Tricity locations considered characteristic of this agglomeration. The research goal of the project was to compare the available algorithms for mixing ambisonic signals by conducting listening tests. The authors conducted...

Full text available to download
Analiza ruchu drogowego z wykorzystaniem analizy akustycznej [Road Traffic Analysis Using Acoustic Analysis]
Publication
- K. Marciniuk
- B. Kostek
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2020
The paper addresses issues related to obtaining information about road traffic by means of acoustic monitoring. Basic road traffic surveillance techniques are outlined. The assumptions behind an acoustic traffic detector are presented, and its effectiveness is examined in three areas of operation: vehicle counting, vehicle type classification, and classification of the weather-related conditions on the road surface.

Full text to download in external service
Analyzing the Effectiveness of the Brain–Computer Interface for Task Discerning Based on Machine Learning
Publication
- SENSORS - Year 2020
The aim of the study is to compare electroencephalographic (EEG) signal feature extraction methods in the context of the effectiveness of the classification of brain activities. For classification, electroencephalographic signals were obtained using an EEG device from 17 subjects in three mental states (relaxation, excitation, and solving a logical task). Blind source separation employing independent component analysis (ICA) was...

Full text available to download
Audio Feature Analysis for Precise Vocalic Segments Classification in English
Publication
- S. Zaporowski
- A. Czyżewski
- Year 2020
An approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...

Full text to download in external service
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
Publication
- J. Rafałko
- A. Czyżewski
- Year 2020
The work presents a method that allows delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for tests, which were described. Test results show that the automatic determination...

Full text available to download
Chór wirtualny [Virtual Choir]
Publication
- M. Mróz
- B. Mróz
- Year 2020
The spring of 2020 was marked by emotions that must be counted among the undesirable ones. Online work became the only possible form of working with an ensemble. The pioneer of the virtual choir idea was the American composer and conductor Eric Whitacre. Eric chose pieces sharing common features for performance by the virtual choir. Another issue addressed is the creation of spatial sound. The technology on which the sound is based...

Full text to download in external service
Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor
Publication
- A. Czyżewski
- Journal of the Acoustical Society of America - Year 2020
The purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...

Full text to download in external service
Comparison of sound of organ pipes in contemporary and historical instruments
Publication
- Year 2020
The aim of this research is to examine the differences in the timbre of organ pipes’ sound between a historical and a contemporary organ instrument. The historical instrument is the Oliwa organ from Gdansk, Poland, and the contemporary one is from Kartuzy, Poland. Recordings are made of single notes played by an open labial pipe that belongs to the Principal rank. The analyses and comparison of several sound features compatible...

Full text to download in external service
Comparison of two methods of sound extraction from guitar string video recordings
Publication
- M. Zaporowska (formerly: M. Stefaniak)
- A. Czyżewski
- Year 2020
A comparison of two sound extraction methods from guitar string video recordings is presented in the paper. A brief overview of high-frame-rate camera technology and its possible applications is included. The method using the image analysis from two such cameras is presented. The cameras are placed at an angle of 90 degrees for recording the image in three planes. The results achieved...
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication
- D. Weber
- S. Zaporowski
- D. Korzekwa
- Year 2020
The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres
Publication
- Electronics - Year 2020
The purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts, and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....

Full text available to download
Evaluating calibration and robustness of pedestrian detectors
Publication
- S. Cygert
- A. Czyżewski
- Year 2020
In this work, the robustness and calibration of modern pedestrian detectors are evaluated. Pedestrian detection is a crucial perception component in autonomous driving and here we study its performance under different image corruptions. Furthermore, we provide analysis of classification calibration of pedestrian detectors and we show a positive effect of using style-transfer augmentation technique. Our analysis is aimed as a step...

Full text to download in external service
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
Publication
- G. Korvel
- K. Kąkol
- O. Kurasova
- B. Kostek
- IEEE Access - Year 2020
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available to download
Improving Objective Speech Quality Indicators in Noise Conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2020
This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Full text to download in external service
Investigating Feature Spaces for Isolated Word Recognition
Publication
- P. Treigys
- G. Korvel
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- Year 2020
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text to download in external service
