Department of Multimedia Systems

Publications

Year 2020

Analiza ruchu drogowego z wykorzystaniem analizy akustycznej
Publication
- K. Marciniuk
- B. Kostek
- Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne - Year 2020
The paper addresses the acquisition of road traffic information using acoustic monitoring. Basic traffic surveillance techniques are outlined. The assumptions behind an acoustic traffic detector are presented, and its effectiveness is examined in three areas of operation: vehicle counting, vehicle type classification, and classification of the weather conditions on the road surface.

Full text to download in external service
Analyzing the Effectiveness of the Brain–Computer Interface for Task Discerning Based on Machine Learning
Publication
- SENSORS - Year 2020
The aim of the study is to compare electroencephalographic (EEG) signal feature extraction methods in the context of the effectiveness of the classification of brain activities. For classification, electroencephalographic signals were obtained using an EEG device from 17 subjects in three mental states (relaxation, excitation, and solving logical task). Blind source separation employing independent component analysis (ICA) was...

Full text available to download
Audio Feature Analysis for Precise Vocalic Segments Classification in English
Publication
- S. Zaporowski
- A. Czyżewski
- Year 2020
An approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...

Full text to download in external service
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
Publication
- J. Rafałko
- A. Czyżewski
- Year 2020
The work presents a method that allows delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for tests, which were described. Test results show that the automatic determination...

Full text available to download
Chór wirtualny
Publication
- M. Mróz
- B. Mróz
- Year 2020
The spring of 2020 was marked by emotions that must be counted among the undesirable ones. Online work became the only possible way of working with an ensemble. The pioneer of the virtual choir idea was the American composer and conductor Eric Whitacre. Whitacre chose works sharing common features for performance by the virtual choir. Another issue addressed is the creation of spatial sound. The technology on which the sound is based...

Full text to download in external service
Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor
Publication
- A. Czyżewski
- Journal of the Acoustical Society of America - Year 2020
The purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...

Full text to download in external service
Comparison of sound of organ pipes in contemporary and historical instruments
Publication
- Year 2020
The aim of this research is to examine the differences in the timbre of organ pipes’ sound between a historical and a contemporary organ instrument. The historical instrument is the Oliwa organ from Gdansk, Poland, and the contemporary one is from Kartuzy, Poland. Recordings are made of single notes played by an open labial pipe that belongs to the Principal rank. The analyses and comparison of several sound features compatible...

Full text to download in external service
Comparison of two methods of sound extraction from guitar string video recordings
Publication
- M. Zaporowska (formerly: M. Stefaniak)
- A. Czyżewski
- Year 2020
A comparison of two sound extraction methods from guitar string video recordings is presented in the paper. A brief overview of high-frame-rate camera technology and its possible applications is included. The method using the image analysis from two such cameras is presented. The cameras are placed at an angle of 90 degrees for recording the image in three planes. The results achieved...
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication
- D. Weber
- S. Zaporowski
- D. Korzekwa
- Year 2020
The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
Employing Subjective Tests and Deep Learning for Discovering the Relationship between Personality Types and Preferred Music Genres
Publication
- Electronics - Year 2020
The purpose of this research is two-fold: (a) to explore the relationship between the listeners’ personality trait, i.e., extraverts and introverts and their preferred music genres, and (b) to predict the personality trait of potential listeners on the basis of a musical excerpt by employing several classification algorithms. We assume that this may help match songs according to the listener’s personality in social music networks....

Full text available to download
Evaluating calibration and robustness of pedestrian detectors
Publication
- S. Cygert
- A. Czyżewski
- Year 2020
In this work robustness and calibration of modern pedestrian detectors are evaluated. Pedestrian detection is a crucial perception component in autonomous driving and here we study its performance under different image corruptions. Furthermore, we provide analysis of classification calibration of pedestrian detectors and we show a positive effect of using style-transfer augmentation technique. Our analysis is aimed as a step...

Full text to download in external service
Evaluation of Lombard Speech Models in the Context of Speech in Noise Enhancement
Publication
- G. Korvel
- K. Kąkol
- O. Kurasova
- B. Kostek
- IEEE Access - Year 2020
The Lombard effect is one of the most well-known effects of noise on speech production. Speech with the Lombard effect is more easily recognizable in noisy environments than normal natural speech. Our previous investigations showed that speech synthesis models might retain Lombard-effect characteristics. In this study, we investigate several speech models, such as harmonic, source-filter, and sinusoidal, applied to Lombard speech...

Full text available to download
Improving Objective Speech Quality Indicators in Noise Conditions
Publication
- K. Kąkol
- G. Korvel
- B. Kostek
- Year 2020
This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...

Full text to download in external service
Investigating Feature Spaces for Isolated Word Recognition
Publication
- P. Treigys
- G. Korvel
- G. Tamulevicius
- J. Bernataviciene
- B. Kostek
- Year 2020
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...

Full text to download in external service
Microscopic traffic simulation models for connected and automated vehicles (CAVs) – state-of-the-art
Publication
- P. Gora
- C. Kartakazas
- A. Drabicki
- F. Islam
- P. Ostaszewski
- Procedia Computer Science - Year 2020
Research on connected and automated vehicles (CAVs) has been gaining substantial momentum in recent years. However, the vast amount of literature sources results in a wide range of applied tools and datasets, assumed methodology to investigate the potential impacts of future CAVs traffic, and, consequently, differences in the obtained findings. This limits the scope of their comparability and applicability and calls for a proper standardization...

Full text available to download
Multifactor consciousness level assessment of participants with acquired brain injuries employing human–computer interfaces
Publication
- Biomedical Engineering Online - Year 2020
Background: A lack of communication with people suffering from acquired brain injuries may lead to drawing erroneous conclusions regarding the diagnosis or therapy of patients. Information technology and neuroscience make it possible to enhance the diagnostic and rehabilitation process of patients with traumatic brain injury or post-hypoxia. In this paper, we present a new method for evaluation possibility of communication and the...

Full text available to download
Multimedia Communications, Services and Security MCSS. 10th International Conference, MCSS 2020, Preface
Publication
- A. Dziech
- W. Mees
- A. Czyżewski
- Year 2020
Multimedia surrounds us everywhere. It is estimated that only a part of the recorded resources are processed and analyzed. These resources offer enormous opportunities to improve the quality of life of citizens. As a result, of the introduction of a new type of algorithms to improve security by maintaining a high level of privacy protection. Among the many articles, there are examples of solutions for improving the operation of...

Full text to download in external service
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available to download
O nadjeżdżającej rewolucji w transporcie
Publication
- P. Gora
- Pismo PG - Year 2020
1.3 million: that is how many people die in road accidents worldwide every year. More than 20 million are injured! 4 billion zlotys: that is almost how much drivers in the 7 largest cities in Poland lose every year because of traffic jams (and these are only the estimated costs of lost time and fuel, not counting, e.g., the negative impact on the environment). Can we do something about it?

Full text available to download
Projekt INZNAK - aktywne znaki drogowe
Publication
- A. Czyżewski
- Magazyn Autostrady - Year 2020
Since 2017, the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, in cooperation with the AGH University of Science and Technology in Kraków and two companies from the Pomeranian Voivodeship (Siled Sp. z o.o. and Microsystems Sp. z o.o.), has been carrying out a research project entitled "INZNAK: intelligent road signs for adaptive vehicle traffic control, communicating via V2X technology". The project is co-financed by NCBR in...

Full text to download in external service
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication
- S. Zaporowski
- B. Kostek
- Year 2020
This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...

Full text available to download
System for monitoring road slippery based on CCTV cameras and convolutional neural networks
Publication
- D. Grabowski
- A. Czyżewski
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2020
The slipperiness of the surface is essential for road safety. The growing number of CCTV cameras opens the possibility of using them to automatically detect the slippery surface and inform road users about it. This paper presents a system of developed intelligent road signs, including a detector based on convolutional neural networks (CNNs) and the transfer learning method employed to the processing of images acquired with video...

Full text available to download
Toward Robust Pedestrian Detection With Data Augmentation
Publication
- S. Cygert
- A. Czyżewski
- IEEE Access - Year 2020
In this article, the problem of creating a safe pedestrian detection model that can operate in the real world is tackled. While recent advances have led to significantly improved detection accuracy on various benchmarks, existing deep learning models are vulnerable to changes in the input image that are invisible to the human eye, which raises concerns about their safety. A popular and simple technique for improving robustness is using data...

Full text available to download
Vehicle Detection with Self-Training for Adaptative Video Processing Embedded Platform
Publication
- S. Cygert
- A. Czyżewski
- Applied Sciences-Basel - Year 2020
Traffic monitoring from closed-circuit television (CCTV) cameras on embedded systems is the subject of the performed experiments. Solving this problem encounters difficulties related to the hardware limitations, and possible camera placement in various positions which affects the system performance. To satisfy the hardware requirements, vehicle detection is performed using a lightweight Convolutional Neural Network (CNN), named...

Full text available to download

Year 2019

A Computationally Efficient Model for Predicting Successful Memory Encoding Using Machine-Learning-based EEG Channel Selection
Publication
- K. Saboo
- Y. Varatharajah
- B. M. Berry
- M. R. Sperling
- R. Gorniak
- K. A. Davis
- B. C. Jobst
- R. E. Gross
- B. C. Lega
- S. A. Sheth... and 4 others
- IEEE ENGINEERING IN MEDICINE AND BIOLOGY MAGAZINE - Year 2019
Computational cost is an important consideration for memory encoding prediction models that use data from dozens of implanted electrodes. We propose a method to reduce computational expense by selecting a subset of all the electrodes to build the prediction model. The electrodes were selected based on their likelihood of measuring brain activity useful for predicting memory encoding better than chance (in terms of AUC). A logistic...

Full text to download in external service
A Concept of Automatic Film Color Grading Based on Music Recognition and Evoked Emotions
Publication
- D. Weber
- B. Kostek
- Year 2019
The article presents the aspects of the final selection of the color of shots in film production based on the psychology of color. First of all, the elements of color processing, contrast, saturation or white balance in the film shots were presented and the definition of color grading was given. In the second part of the article the analysis of film music was conducted in the context of stimulating appropriate emotions while watching...
Akustyczna analiza natężenia ruchu drogowego dla systemów zarządzania ruchem
Publication
- K. Marciniuk
- Year 2019
The paper outlines selected issues in road transport management in Poland and worldwide. In this context, market needs, requirements, and capabilities for obtaining information on the current state of road networks are presented. An acoustic method of road traffic monitoring is proposed, together with its potential in the context of traffic management systems. The signal acquisition scheme is presented along with reference data....
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
Publication
- G. Korvel
- O. Kurasova
- B. Kostek
- Year 2019
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...

Full text available to download
ANALIZA KOLORÓW SCEN FILMOWYCH W KONTEKŚCIE COLOR GRADINGU
Publication
- D. Weber
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2019
The article presents issues related to coloring a film scene. The main aspects of color processing of the film image are discussed, along with definitions of the terms related to scene coloring, i.e., color correction and color grading. Theories of color psychology and their practical use in film are described and related to the basic film genres and models of emotion. Next, the assumptions are discussed...

Full text available to download
ANALIZA PARAMETRÓW SYGNAŁU MOWY W KONTEKŚCIE ICH PRZYDATNOŚCI W AUTOMATYCZNEJ OCENIE JAKOŚCI EKSPRESJI ŚPIEWU
Publication
- S. Zaporowski
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Year 2019
The work concerns an approach to parametrization for the classification of emotions in singing and a comparison with the classification of emotions in speech. For this purpose, the RAVDESS database (Ryerson Audio-Visual Database of Emotional Speech and Song) of emotionally marked speech and singing was used, containing recordings of professional actors presenting six different emotions. Next, mel-frequency cepstral coefficients (MFCC) and selected descriptors were calculated...

Full text available to download
Application of autoencoder to traffic noise analysis
Publication
- Journal of the Acoustical Society of America - Year 2019
The aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...

Full text available to download
Assessment of the Effectiveness of a Short-term Hearing Aid Use in Patients with Different Degrees of Hearing Loss
Publication
- T. Poremski
- P. Szymański
- B. Kostek
- Archives of Acoustics - Year 2019
The study evaluates the effectiveness of the hearing aid fitting process in short-term use (7 days). The evaluation method consists of a survey based on the APHAB (Abbreviated Profile of Hearing Aid Benefit) questionnaire. Additional criteria such as the degree of hearing loss, number of hours and days of hearing aid use as well as the user’s experience were also taken into consideration. The outcomes of the benefit...

Full text available to download
Automatic labeling of traffic sound recordings using autoencoder-derived features
Publication
- Year 2019
An approach to detection of events occurring in road traffic using autoencoders is presented. Extensions of existing algorithms of acoustic road events detection employing Mel Frequency Cepstral Coefficients combined with classifiers based on k nearest neighbors, Support Vector Machines, and random forests are used. In our research, the acoustic signal gathered from the microphone placed near the road is split into frames and converted...
Combining Road Network Data from OpenStreetMap with an Authoritative Database
Publication
- G. Szwoch
- Journal of Transportation Engineering, Part A: Systems - Year 2019
Computer modeling of road networks requires a detailed and up-to-date dataset. This paper proposes a method of combining authoritative databases with the OpenStreetMap (OSM) system. The complete route is established by finding paths in the graph constructed from partial data obtained from OSM. In order to correlate data from both sources, a method of coordinate conversion is proposed. The algorithm queries road data from OSM and provides...

Full text available to download
Comparative study on the effectiveness of various types of road traffic intensity detectors
Publication
- A. Czyżewski
- A. Sroczynski
- T. Smialkowski
- P. Hoffmann
- S. Cygert
- G. Szwoch
- J. Kotus
- D. Weber
- M. Szczodrak
- D. Koszewski... and 2 others
- Year 2019
Vehicle detection and speed measurements are crucial tasks in traffic monitoring systems. In this work, we focus on several types of electronic sensors, operating on different physical principles in order to compare their effectiveness in real traffic conditions. Commercial solutions are based on road tubes, microwave sensors, LiDARs, and video cameras. Distributed traffic monitoring systems require a high number of monitoring...

Full text to download in external service
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
Publication
- G. Korvel
- O. Kurasova
- B. Kostek
- Archives of Acoustics - Year 2019
The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...

Full text available to download
Comparison of the effectiveness of automatic EEG signal class separation algorithms
Publication
- JOURNAL OF INTELLIGENT & FUZZY SYSTEMS - Year 2019
In this paper, an algorithm for automatic brain activity class identification of EEG (electroencephalographic) signals is presented. EEG signals are gathered from seventeen subjects performing one of the three tasks: resting, watching a music video and playing a simple logic game. The methodology applied consists of several steps, namely: signal acquisition, signal processing utilizing z-score normalization, parametrization and...

Full text available to download
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication
- A. Czyżewski
- M. Kawaler
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2019
The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...

Full text available to download
Deep neural networks for human pose estimation from a very low resolution depth image
Publication
- P. Szczuko
- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2019
The work presented in the paper is dedicated to determining and evaluating the most efficient neural network architecture applied as a multiple regression network localizing human body joints in 3D space based on a single low resolution depth image. The main challenge was to deal with a noisy and coarse representation of the human body, as observed by a depth sensor from a large distance, and to achieve high localization precision....

Full text available to download
Development of Intelligent Road Signs with V2X Interface for Adaptive Traffic Controlling
Publication
- A. Czyżewski
- A. Sroczyński
- T. Śmiałkowski
- P. Hoffmann
- Year 2019
The objective of this paper is to present a practical project of intelligent road signs, under which a series of new products for the regulation of traffic is being created. The engineering part of the project, described in this paper, was preceded by a series of experimental studies, the results of which were described in another paper accepted for publication at the MTS-ITS conference 2019, entitled "Comparative study on the effectiveness...

Full text available to download
Diagnosing wind turbine condition employing a neural network to the analysis of vibroacoustic signals
Publication
- A. Czyżewski
- Journal of the Acoustical Society of America - Year 2019
It is important from the economic point of view to detect damage early in the wind turbines before failures occur. For this purpose, a monitoring device was built that analyzes both acoustic signals acquired from the built-in non-contact acoustic intensity probe, as well as from the accelerometers, mounted on the internal devices in the nacelle. The signals collected in this way are used for long-term training of the autoencoder...

Full text available to download
Discovering Rule-Based Learning Systems for the Purpose of Music Analysis
Publication
- G. Korvel
- B. Kostek
- Journal of the Acoustical Society of America - Year 2019
Music analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...

Full text available to download
Estimating Traffic Intensity Employing Passive Acoustic Radar and Enhanced Microwave Doppler Radar Sensor
Publication
- Remote Sensing - Year 2019
Innovative road signs that can autonomously display the speed limit in cases where the traffic situation requires it are under development. The autonomous road sign contains many types of sensors, of which the subject of interest in this article is the Doppler sensor that we have improved and the constructed and calibrated acoustic probe. An algorithm for performing vehicle detection and tracking, as well as vehicle speed measurement,...

Full text available to download
Human Computer Interface for Tracking Eye Movements Improves Assessment and Diagnosis of Patients With Acquired Brain Injuries
Publication
- Frontiers in Neurology - Year 2019
One of the first clinical signs differentiating the minimally conscious state from the vegetative state is the presence of smooth pursuit eye movements occurring in direct response to moving salient stimuli. Glasgow Coma Scale (GCS) is one of the most commonly used diagnostic tools for acute phase assessment of the level of consciousness, together with a neurological examination. These classic measures are limited to qualitative...

Full text available to download
Human verbal memory encoding is hierarchically distributed in a continuous processing stream
Publication
- M. T. Kucewicz
- K. Saboo
- B. M. Berry
- V. Kremen
- L. R. Miller
- F. Khadjevand
- C. S. Inman
- P. A. Wanda
- M. R. Sperling
- R. Gorniak... and 8 others
- eNeuro - Year 2019
Processing of memory is supported by coordinated activity in a network of sensory, association, and motor brain regions. It remains a major challenge to determine where memory is encoded for later retrieval. Here we used direct intracranial brain recordings from epilepsy patients performing free recall tasks to determine the temporal pattern and anatomical distribution of verbal memory encoding across the entire human cortex. High...

Full text available to download
Influence of the Delay in Monitor System on the Motor Coordination of Musicians while Performing
Publication
- Year 2019
This paper provides a description and results of measurements of the maximum acceptable value of delay tolerated by a musician, while playing an instrument, that does not cause de-synchronization and discomfort. First, methodology of measurements comprising audio recording and a fast camera is described. Then, the measurement procedure for acquiring the maximum value of delay conditioning...

Full text to download in external service
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Publication
- D. Korzekwa
- R. Barra-Chicote
- B. Kostek
- T. Drugman
- M. Łajszczak
- Year 2019
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not...

Full text available to download
Labeler-hot Detection of EEG Epileptic Transients
Publication
- Ł. Czekaj
- W. Ziembla
- P. Jezierski
- P. Świniarski
- A. Kołodziejak
- P. Ogniewski
- P. Niedbalski
- A. Jezierska
- D. Węsierski
- Year 2019
Preventing early progression of epilepsy and so the severity of seizures requires effective diagnosis. Epileptic transients indicate the ability to develop seizures but humans overlook such brief events in an electroencephalogram (EEG), which compromises patient treatment. Traditionally, training of the EEG event detection algorithms has relied on ground truth labels, obtained from the consensus...

Full text to download in external service
Localization of sound sources with dual acoustic vector sensor
Publication
- J. Kotus
- G. Szwoch
- Year 2019
The aim of the work is to estimate the position of sound sources. The proposed method uses a setup of two acoustic vector sensors (AVS). The intersection of azimuth rays from each AVS should indicate the position of a source. In practice, the result of position estimation using this method is an area rather than a point. This is a result of inaccuracy of the individual sensors, but more importantly, of the influence of a source...

Full text available to download
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
Publication
- M. Piotrowska
- G. Korvel
- B. Kostek
- T. Ciszewski
- A. Czyżewski
- International Journal of Applied Mathematics and Computer Science - Year 2019
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Full text available to download
