Publications
Filters
total: 894
Catalog Publications
Year 2021
-
AUTOMATYCZNE GENEROWANIE KOLEJNOŚCI LIST UTWORÓW MUZYCZNYCH
PublicationW niniejszym rozdziale przedstawiono przygotowanie algorytmu do automa-tycznego układania kolejności utworów muzycznych i zgrywającego je do postaci jednego, długiego miksu. Dzięki algorytmowi dobierane są utwory na podstawie analizy podobieństwa fragmentów końcowych i początkowych utworów. Podo-bieństwo to jest obliczane za pomocą odległości euklidesowej między wektorami parametrów wyznaczonymi przez autoenkoder oraz na podstawie...
-
Closer Look at the Uncertainty Estimation in Semantic Segmentation under Distributional Shift
PublicationWhile recent computer vision algorithms achieve impressive performance on many benchmarks, they lack robustness - presented with an image from a different distribution, (e.g. weather or lighting conditions not considered during training), they may produce an erroneous prediction. Therefore, it is desired that such a model will be able to reliably predict its confidence measure. In this work, uncertainty estimation for the task...
-
Concurrent Video Denoising and Deblurring for Dynamic Scenes
PublicationDynamic scene video deblurring is a challenging task due to the spatially variant blur inflicted by independently moving objects and camera shakes. Recent deep learning works bypass the ill-posedness of explicitly deriving the blur kernel by learning pixel-to-pixel mappings, which is commonly enhanced by larger region awareness. This is a difficult yet simplified scenario because noise is neglected when it is omnipresent in a wide...
-
CyberEye: New Eye-Tracking Interfaces for Assessment and Modulation of Cognitive Functions beyond the Brain
PublicationThe emergence of innovative neurotechnologies in global brain projects has accelerated research and clinical applications of BCIs beyond sensory and motor functions. Both invasive and noninvasive sensors are developed to interface with cognitive functions engaged in thinking, communication, or remembering. The detection of eye movements by a camera offers a particularly attractive external sensor for computer interfaces to monitor,...
-
Designing acoustic scattering elements using machine learning methods
PublicationIn the process of the design and correction of room acoustic properties, it is often necessary to select the appropriate type of acoustic treatment devices and make decisions regarding their size, geometry, and location of the devices inside the room under the treatment process. The goal of this doctoral dissertation is to develop and validate a mathematical model that allows predicting the effects of the application of the scattering...
-
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention
PublicationThis paper describes two novel complementary techniques that improve the detection of lexical stress errors in non-native (L2) English speech: attention-based feature extraction and data augmentation based on Neural Text-To-Speech (TTS). In a classical approach, audio features are usually extracted from fixed regions of speech such as the syllable nucleus. We propose an attention-based deep learning model that automatically de...
-
Direct electrical stimulation of the human brain has inverse effects on the theta and gamma neural activities
PublicationObjective: Our goal was to analyze the electrophysiological response to direct electrical stimulation (DES) systematically applied at a wide range of parameters and anatomical sites, with particular focus on neural activities associated with memory and cognition. Methods: We used a large set of intracranial EEG (iEEG) recordings with DES from 45 subjects with electrodes...
-
Estimation of Average Speed of Road Vehicles by Sound Intensity Analysis
PublicationConstant monitoring of road traffic is important part of modern smart city systems. The proposed method estimates average speed of road vehicles in the observation period, using a passive acoustic vector sensor. Speed estimation based on sound intensity analysis is a novel approach to the described problem. Sound intensity in two orthogonal axes is measured with a sensor placed alongside the road. Position of the apparent sound...
-
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
PublicationThe approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...
-
Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network
PublicationThe goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined with a convolutional neural network (CNN). In the first experiment, we compare the effectiveness of the similarity matrices applied to discerning acoustic differences between consonant...
-
Independent dynamics of low, intermediate, and high frequency spectral intracranial EEG activities during human memory formation
PublicationA wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various frequency ranges are coordinated across the space of the human cortex and time of memory processing is inconclusive. They can either be coordinated together across the frequency spectrum at the same cortical site and time or induced independently in particular bands. We used a large dataset of human intracranial...
-
Independent dynamics of slow, intermediate, and fast intracranial EEG spectral activities during human memory formation
PublicationA wide spectrum of brain rhythms are engaged throughout the human cortex in cognitive functions. How the rhythms of various low and high frequencies are spatiotemporally coordinated across the human brain during memory processing is inconclusive. They can either be coordinated together across a wide range of the frequency spectrum or induced in specific bands. We used a large dataset of human intracranial electroencephalography...
-
Leveraging spatio-temporal features for joint deblurring and segmentation of instruments in dental video microscopy
PublicationIn dentistry, microscopes have become indispensable optical devices for high-quality treatment and micro-invasive surgery, especially in the field of endodontics. Recent machine vision advances enable more advanced, real-time applications including but not limited to dental video deblurring and workflow analysis through relevant metadata obtained by instrument motion trajectories. To this end, the proposed work addresses dental...
-
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling
PublicationA common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare it to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, b) there is a single correct way for a sentence to be pronounced. These assumptions do not always hold, which can result...
-
Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones
PublicationThis paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...
-
Robustness in Compressed Neural Networks for Object Detection
PublicationModel compression techniques allow to significantly reduce the computational cost associated with data processing by deep neural networks with only a minor decrease in average accuracy. Simultaneously, reducing the model size may have a large effect on noisy cases or objects belonging to less frequent classes. It is a crucial problem from the perspective of the models' safety, especially for object detection in the autonomous driving...
-
Selective monitoring of noise emitted by vehicles involved in road traffic
PublicationAn acoustic intensity probe was developed measures the sound intensity in three orthogonal directions, making possible to calculate the azimuth and elevation angles, describing the sound source position. The acoustic sensor is made in the form of a cube with a side of 10 mm, on the inner surfaces of which the digital MEMS microphones are mounted. The algorithm works in two stages. The first stage is based on the analysis of sound...
-
Skuteczność klasyfikacji gatunków muzycznych za pomocą sieci neuronowej w zależności od typu danych wejściowych
PublicationRozpoznawanie gatunku muzycznego jest jednym z podstawowych elementów inteligentnych systemów tworzenia automatycznych list muzyki. Platformy strumieniowe oferujące taką usługę wymagają rozwiązań, które umożliwią jak najdokładniej określić przynależność utworu do gatunku muzycznego. Zgodnie z aktualnym stanem wiedzy – najskuteczniejszym klasyfikatorem są sztuczne sieci neuronowe (w tym w wersji uczenia głębokiego), dla których...
-
Techniki wielokanałowe wykorzystywane w koncertach i nagraniach muzycznych na odległość
PublicationW czasie pandemii koronawirusa COVID-19 nowego znaczenia nabrały możliwości transmisji dźwięku z obrazem – zwłaszcza do pracy zdalnej, która w przypadku muzyków jest szczególnym wyzwaniem zarówno w kontekście wspólnych ćwiczeń i prób, jak i koncertów. Wynikła konieczność wieloźródłowego połączenia ujawniła potrzebę uprzestrzennienia dźwięku w celu łatwiejszej lokalizacji źródeł dźwięku. Tworzenie zdalnych nagrań muzycznych stało...
-
Towards Cancer Patients Classification Using Liquid Biopsy
PublicationLiquid biopsy is a useful, minimally invasive diagnostic and monitoring tool for cancer disease. Yet, developing accurate methods, given the potentially large number of input features, and usually small datasets size remains very challenging. Recently, a novel feature parameterization based on the RNA-sequenced platelet data which uses the biological knowledge from the Kyoto Encyclopedia of Genes and Genomes, combined with a classifier...
Year 2022
-
Algoritmically improved microwave radar monitors breathing more acurrate than sensorized belt
PublicationThis paper describes a novel way to measure, process, analyze, and compare respiratory signals acquired by two types of devices: a wearable sensorized belt and a microwave radar-based sensor. Both devices provide breathing rate readouts. First, the background research is presented. Then, the underlying principles and working parameters of the microwave radar-based sensor, a contactless device for monitoring breathing, are described....
-
Architecture Design of a Networked Music Performance Platform for a Chamber Choir
PublicationThis paper describes an architecture design process for Networked Music Performance (NMP) platform for medium-sized conducted music ensembles, based on remote rehearsals of Academic Choir of Gdańsk University of Technology. The issues of real-time remote communication, in-person music performance, and NMP are described. Three iterative steps defining and extending the architecture of the NMP platform with additional features to...
-
Blockchain based Secure Data Exchange between Cloud Networks and Smart Hand-held Devices for use in Smart Cities
PublicationIn relation to smart city planning and management, processing huge amounts of generated data and execution of non-lightweight cryptographic algorithms on resource constraint devices at disposal, is the primary focus of researchers today. To enable secure exchange of data between cloud networks and mobile devices, in particular smart hand held devices, this paper presents Blockchain based approach that disperses a public/free key...
-
BP-EVD: Forward Block-Output Propagation for Efficient Video Denoising
PublicationDenoising videos in real-time is critical in many applications, including robotics and medicine, where varying light conditions, miniaturized sensors, and optics can substantially compromise image quality. This work proposes the first video denoising method based on a deep neural network that achieves state-of-the-art performance on dynamic scenes while running in real-time on VGA video resolution with no frame latency. The backbone...
-
Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera
PublicationThis paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...
-
Cognitive neuroscience: Theta network oscillations coordinate development of episodic memory
PublicationOur ability to remember life events matures through childhood and adolescence. A new study has revealed how theta oscillations between two anatomical brain regions supporting memory and executive functions are synchronized and develop across age through functional and structural connectivity.
-
Computer-Aided Detection of Hypertensive Retinopathy Using Depth-Wise Separable CNN
PublicationHypertensive retinopathy (HR) is a retinal disorder, linked to high blood pressure. The incidence of HR-eye illness is directly related to the severity and duration of hypertension. It is critical to identify and analyze HR at an early stage to avoid blindness. There are presently only a few computer-aided systems (CADx) designed to recognize HR. Instead, those systems concentrated on collecting features from many retinopathy-related...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
PublicationThe aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
Creating new voices using normalizing flows
PublicationCreating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Detection of Anomalies in the Operation of a Road Lighting System Based on Data from Smart Electricity Meters
PublicationSmart meters in road lighting systems create new opportunities for automatic diagnostics of undesirable phenomena such as lamp failures, schedule deviations, or energy theft from the power grid. Such a solution fits into the smart cities concept, where an adaptive lighting system creates new challenges with respect to the monitoring function. This article presents research results indicating the practical feasibility of real‐time...
-
Edge-Computing based Secure E-learning Platforms
PublicationImplementation of Information and Communication Technologies (ICT) in E-Learning environments have brought up dramatic changes in the current educational sector. Distance learning, online learning, and networked learning are few examples that promote educational interaction between students, lecturers and learning communities. Although being an efficient form of real learning resource, online electronic resources are subject to...
-
Evaluation of Decision Fusion Methods for Multimodal Biometrics in the Banking Application
PublicationAn evaluation of decision fusion methods based on Dempster-Shafer Theory (DST) and its modifications is presented in the article, studied over real biometric data from the engineered multimodal banking client verification system. First, the approaches for multimodal biometric data fusion for verification are explained. Then the proposed implementation of comparison scores fusion is presented, including details on the application...
-
Examining Impact of Speed Recommendation Algorithm Operating in Autonomous Road Signs on Minimum Distance between Vehicles
PublicationAn approach to a new kind of recommendation system design that suggests safe speed on the road is presented. Real data obtained on roads were used for the simulations. As part of a project related to autonomous road sign development, a number of measurements were carried out on both local roads and expressways. A speed recommendation model was created based on gathered traffic data employing the traffic simulator. Depending on...
-
Hotspot of human verbal memory encoding in the left anterior prefrontal cortex
PublicationBackground: Treating memory and cognitive deficits requires knowledge about anatomical sites and neural activities to be targeted with particular therapies. Emerging technologies for local brain stimulation offer attractive therapeutic options but need to be applied to target specific neural activities, at distinct times, and in specific brain regions that are critical for memory formation. Methods: The areas that are critical...
-
Intracranial electrophysiological recordings from the human brain during memory tasks with pupillometry
PublicationData comprise intracranial EEG (iEEG) brain activity represented by stereo EEG (sEEG) signals, recorded from over 100 electrode channels implanted in any one patient across various brain regions. The iEEG signals were recorded in epilepsy patients (N=10) undergoing invasive monitoring and localization of seizures when they were performing a battery of four memory tasks lasting approx. 1 hour in total. Gaze tracking on the task...
-
Klasyfikacja emocji w muzyce filmowej z wykorzystaniem uczenia głębokiego
PublicationPraca przedstawia zagadnienia związane z klasyfikacją emocji w muzyce filmowej. W artykule zaproponowano model emocji zawierający dziewięć stanów emocjonalnych, do których przypisany jest kolor zgodnie z teorią koloru w filmie. Kolejne kroki eksperymentu obejmowały wybór muzyki filmowej do testów (baza Epidemic Sound), przygotowanie założeń ankiety oraz modelu emocji wykorzystywanych w testach odsłuchowych, a także konstrukcję...
-
Machine learning applied to acoustic-based road traffic monitoring
PublicationThe motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...
-
Machine learning applied to acoustic-based road traffic monitoring
PublicationThe motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...
-
Medical Image Segmentation Using Deep Semantic-based Methods: A Review of Techniques, Applications and Emerging Trends
PublicationSemantic-based segmentation (Semseg) methods play an essential part in medical imaging analysis to improve the diagnostic process. In Semseg technique, every pixel of an image is classified into an instance, where each class is corresponded by an instance. In particular, the semantic segmentation can be used by many medical experts in the domain of radiology, ophthalmologists, dermatologist, and image-guided radiotherapy. The authors...
-
Mining Knowledge of Respiratory Rate Quantification and Abnormal Pattern Prediction
PublicationThe described application of granular computing is motivated because cardiovascular disease (CVD) remains a major killer globally. There is increasing evidence that abnormal respiratory patterns might contribute to the development and progression of CVD. Consequently, a method that would support a physician in respiratory pattern evaluation should be developed. Group decision-making, tri-way reasoning, and rough set–based analysis...
-
Multi-task Video Enhancement for Dental Interventions
PublicationA microcamera firmly attached to a dental handpiece allows dentists to continuously monitor the progress of conservative dental procedures. Video enhancement in video-assisted dental interventions alleviates low-light, noise, blur, and camera handshakes that collectively degrade visual comfort. To this end, we introduce a novel deep network for multi-task video enhancement that enables macro-visualization of dental scenes. In particular,...
-
Musical Instrument Identification Using Deep Learning Approach
PublicationThe work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata...
-
Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions - Headphones vs Loudspeakers: A Case Study
PublicationThis study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are coupled with or without visual cues to the listeners. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology...
-
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
PublicationArtificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...
-
Remote Health Monitoring of Wind Turbines Employing Vibroacoustic Transducers and Autoencoders
PublicationImplementation of remote monitoring technology for real wind turbine structures designed to detect potential sources of failure is described. An innovative multi-axis contactless acoustic sensor measuring acoustic intensity as well as previously known accelerometers were used for this purpose. Signal processing methods were proposed, including feature extraction and data analysis. Two strategies were examined: Mel Frequency Cepstral...
-
Robust and Efficient Machine Learning Algorithms for Visual Recognition
PublicationIn visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...
-
Robust Object Detection with Multi-input Multi-output Faster R-CNN
PublicationRecent years have seen impressive progress in visual recognition on many benchmarks, however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, recently it was shown that similarly competitive results could be achieved with a much smaller cost, by using multi-input multi-output architecture (MIMO). In this work,...
-
Robust Object Detection with Multi-input Multi-output Faster R-CNN
PublicationRecent years have seen impressive progress in visual recognition on many benchmarks, however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, recently it was shown that similarly competitive results could be achieved with a much smaller cost, by using multi-input multi-output architecture (MIMO). In this work,...
-
Sensing Direction of Human Motion Using Single-Input-Single-Output (SISO) Channel Model and Neural Networks
PublicationObject detection Through-the-Walls enables localization and identification of hidden objects behind the walls. While numerous studies have exploited Channel State Information of Multiple Input Multiple Output (MIMO) WiFi and radar devices in association with Artificial Intelligence based algorithms (AI) to detect and localize objects behind walls, this study proposes a novel non-invasive Through-the-Walls human motion direction...
-
Systematic Literature Review for Emotion Recognition from EEG Signals
PublicationResearchers have recently become increasingly interested in recognizing emotions from electroencephalogram (EEG) signals and many studies utilizing different approaches have been conducted in this field. For the purposes of this work, we performed a systematic literature review including over 40 articles in order to identify the best set of methods for the emotion recognition problem. Our work collects information about the most...
-
Systematic Literature Review for Emotion Recognition from EEG Signals
PublicationResearchers have recently become increasingly interested in recognizing emotions from electroencephalogram (EEG) signals and many studies utilizing different approaches have been conducted in this field. For the purposes of this work, we performed a systematic literature review including over 40 articles in order to identify the best set of methods for the emotion recognition problem. Our work collects information about the most...
-
Technologia CyberOko do diagnozy, rehabilitacji i komunikowania się z pacjentami niewykazującymi oznak przytomności
PublicationCyberOko jest rozwiązaniem opracowanym w Politechnice Gdańskiej, które umożliwia nawiązanie kontaktu i pracę z osobami głęboko upośledzonymi komunikacyjnie. W sposób inteligentny śledzi ruch gałek ocznych, dzięki czemu umożliwia rehabilitację i ocenę stanu świadomości pacjenta nawet w stanie całkowitego porażenia. Rozwiązanie obejmuje także analizę fal EEG, obiektywne badanie słuchu i badanie sygnałów z macierzy elektrod wszczepianych...
-
Vehicle Detection and Speed Estimation Using Millimetre Wave Radar
PublicationThe dataset titled Data from 76- to 81-GHz mmWave Sensor located at S7 road contains data recorded employing an IWR1642 mmWave sensor from Texas Instruments. The data comes from two sessions lasting 24h each. The dataset provides the possibility to perform analyses related to car traffic intensity on one of the carriageways of the motorway heading to the Gdańsk metropolitan area. Based on the gathered data, it is possible to calculate...
Year 2023
-
A commonly-accessible toolchain for live streaming music events with higher-order ambisonic audio and 4k 360 vision
PublicationAn immersive live stream is especially interesting in the ongoing development of telepresence tools, especially in the virtual reality (VR) or mixed reality (MR) domain. This paper explores the remote and immersive way of enabling telepresence for the audience to high-fidelity music performance using freely-available and easily-accessible tools. A functional VR live-streaming toolchain, comprising 360 vision and higher-order ambisonic...
-
A survey of automatic speech recognition deep models performance for Polish medical terms
PublicationAmong the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....
-
An automated, low-latency environment for studying the neural basis of behavior in freely moving rats
PublicationBackground Behavior consists of the interaction between an organism and its environment, and is controlled by the brain. Brain activity varies at sub-second time scales, but behavioral measures are usually coarse (often consisting of only binary trial outcomes). Results To overcome this mismatch, we developed the Rat Interactive Foraging Facility (RIFF): a programmable interactive arena for freely moving rats with multiple feeding...
-
Applying the Lombard Effect to Speech-in-Noise Communication
PublicationThis study explored how the Lombard effect, a natural or artificial increase in speech loudness in noisy environments, can improve speech-in-noise communication. This study consisted of several experiments that measured the impact of different types of noise on synthesizing the Lombard effect. The main steps were as follows: first, a dataset of speech samples with and without the Lombard effect was collected in a controlled setting;...
-
Autoencoder application for anomaly detection in power consumption of lighting systems
PublicationDetecting energy consumption anomalies is a popular topic of industrial research, but there is a noticeable lack of research reported in the literature on energy consumption anomalies for road lighting systems. However, there is a need for such research because the lighting system, a key element of the Smart City concept, creates new monitoring opportunities and challenges. This paper examines algorithms based on the deep learning...
-
Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders
PublicationThe purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work recalls first the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...
-
Bimodal Emotion Recognition Based on Vocal and Facial Features
PublicationEmotion recognition is a crucial aspect of human communication, with applications in fields such as psychology, education, and healthcare. Identifying emotions accurately is challenging, as people use a variety of signals to express and perceive emotions. In this study, we address the problem of multimodal emotion recognition using both audio and video signals, to develop a robust and reliable system that can recognize emotions...
-
Comparison of the Ability of Neural Network Model and Humans to Detect a Cloned Voice
PublicationThe vulnerability of the speaker identity verification system to attacks using voice cloning was examined. The research project assumed creating a model for verifying the speaker’s identity based on voice biometrics and then testing its resistance to potential attacks using voice cloning. The Deep Speaker Neural Speaker Embedding System was trained, and the Real-Time Voice Cloning system was employed based on the SV2TTS, Tacotron,...
-
Computer-Aided Diagnosis of COVID-19 from Chest X-ray Images Using Hybrid-Features and Random Forest Classifier
PublicationIn recent years, a lot of attention has been paid to using radiology imaging to automatically find COVID-19. (1) Background: There are now a number of computer-aided diagnostic schemes that help radiologists and doctors perform diagnostic COVID-19 tests quickly, accurately, and consistently. (2) Methods: Using chest X-ray images, this study proposed a cutting-edge scheme for the automatic recognition of COVID-19 and pneumonia....
-
Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech
PublicationIn this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...
-
Detection of Water on Road Surface with Acoustic Vector Sensor
PublicationThis paper presents a new approach to detecting the presence of water on a road surface, employing an acoustic vector sensor. The proposed method is based on sound intensity analysis in the frequency domain. Acoustic events, representing road vehicles, are detected in the sound intensity signals. The direction of the incoming sound is calculated for the individual spectral components of the intensity signal, and the components...
-
Digital Transformation and Its Influence on Sustainable Manufacturing and Business Practices
PublicationThe paper focuses on the relationship between businesses and digital transformation, and how digital transformation has changed manufacturing in several ways. Aspects like Cloud Computing, vertical and horizontal integration, data communication, and the internet have contributed to sustainable manufacturing by decentralizing supply chains. In addition, digital transformation inventions such as predictive analysis and big data analytics...
-
Direct electrical brain stimulation of human memory: lessons learnt and future perspectives
PublicationModulation of cognitive functions supporting human declarative memory is one of the grand challenges of neuroscience, and of vast importance for a variety of neuropsychiatric, neurodegenerative and neurodevelopmental diseases. Despite a recent surge of successful attempts at improving performance in a range of memory tasks, the optimal approaches and parameters for memory enhancement have yet to be determined. On a more fundamental...
-
Distinct hippocampal-prefrontal neural assemblies coordinate memory encoding, maintenance, and recall
PublicationShort-term memory enables incorporation of recent experience into subsequent decision-making. This processing recruits both the prefrontal cortex and hippocampus, where neurons encode task cues, rules, and outcomes. However, precisely which information is carried when, and by which neurons, remains unclear. Using population decoding of activity in rat medial prefrontal cortex (mPFC) and dorsal hippocampal CA1, we confirm that mPFC...
-
Driver’s Condition Detection System Using Multimodal Imaging and Machine Learning Algorithms
PublicationTo this day, driver fatigue remains one of the most significant causes of road accidents. In this paper, a novel way of detecting and monitoring a driver’s physical state has been proposed. The goal of the system was to make use of multimodal imaging from RGB and thermal cameras working simultaneously to monitor the driver’s current condition. A custom dataset was created consisting of thermal and RGB video samples. Acquired data...
-
Energy consumption optimization in wastewater treatment plants: Machine learning for monitoring incineration of sewage sludge
PublicationBiomass management in terms of energy consumption optimization has become a recent challenge for developed countries. Nevertheless, the multiplicity of materials and operating parameters controlling energy consumption in wastewater treatment plants necessitates the need for sophisticated well-organized disciplines in order to minimize energy consumption and dissipation. Sewage sludge (SS) disposal management is the key stage of...
-
Ensembling noisy segmentation masks of blurred sperm images
PublicationBackground: Sperm tail morphology and motility have been demonstrated to be important factors in determining sperm quality for in vitro fertilization. However, many existing computer-aided sperm analysis systems leave the sperm tail out of the analysis, as detecting a few tail pixels is challenging. Moreover, some publicly available datasets for classifying morphological defects contain images limited only to the sperm head. This...
-
Facilitating free travel in the Schengen area—A position paper by the European Association for Biometrics
PublicationDue to migration, terror-threats and the viral pandemic, various EU member states have re-established internal border control or even closed their borders. European Association for Biometrics (EAB), a non-profit organisation, solicited the views of its members on ways which biometric technologies and services may be used to help with re-establishing open borders within the Schengen area while at the same time mitigating any adverse...
-
How Can We Identify Electrophysiological iEEG Activities Associated with Cognitive Functions?
PublicationElectrophysiological activities of the brain are engaged in its various functions and give rise to a wide spectrum of low and high frequency oscillations in the intracranial EEG (iEEG) signals, commonly known as the brain waves. The iEEG spectral activities are distributed across networks of cortical and subcortical areas arranged into hierarchical processing streams. It remains a major challenge to identify these activities in...
-
Intra-subject class-incremental deep learning approach for EEG-based imagined speech recognition
PublicationBrain–computer interfaces (BCIs) aim to decode brain signals and transform them into commands for device operation. The present study aimed to decode the brain activity during imagined speech. The BCI must identify imagined words within a given vocabulary and thus perform the requested action. A possible scenario when using this approach is the gradual addition of new words to the vocabulary using incremental learning methods....
-
Multimedia industrial and medical applications supported by machine learning
PublicationThis article outlines a keynote paper presented at the Intelligent DecisionTechnologies conference providing a part of the KES Multi-theme Conference “Smart Digital Futures” organized in Rome on June 14–16, 2023. It briefly discusses projects related to traffic control using developed intelligent traffic signs and diagnosing the health of wind turbine mechanisms and multimodal biometric authentication for banking branches to provide...
-
Neural Graph Collaborative Filtering: Analysis of Possibilities on Diverse Datasets
PublicationThis paper continues the work by Wang et al. [17]. Its goal is to verify the robustness of the NGCF (Neural Graph Collaborative Filtering) technique by assessing its ability to generalize across different datasets. To achieve this, we first replicated the experiments conducted by Wang et al. [17] to ensure that their replication package is functional. We received sligthly better results for ndcg@20 and somewhat poorer results for...
-
Optimizing Medical Personnel Speech Recognition Models Using Speech Synthesis and Reinforcement Learning
PublicationText-to-Speech synthesis (TTS) can be used to generate training data for building Automatic Speech Recognition models (ASR). Access to medical speech data is because it is sensitive data that is difficult to obtain for privacy reasons; TTS can help expand the data set. Speech can be synthesized by mimicking different accents, dialects, and speaking styles that may occur in a medical language. Reinforcement Learning (RL), in the...
-
Platelet RNA Sequencing Data Through the Lens of Machine Learning
PublicationLiquid biopsies offer minimally invasive diagnosis and monitoring of cancer disease. This biosource is often analyzed using sequencing, which generates highly complex data that can be used using machine learning tools. Nevertheless, validating the clinical applications of such methods is challenging. It requires: (a) using data from many patients; (b) verifying potential bias concerning sample collection; and (c) adding interpretability...
-
Predicting emotion from color present in images and video excerpts by machine learning
PublicationThis work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...
-
Prediction of maximum tensile stress in plain-weave composite laminates with interacting holes via stacked machine learning algorithms: A comparative study
PublicationPlain weave composite is a long-lasting type of fabric composite that is stable enough when being handled. Open-hole composites have been widely used in industry, though they have weak structural performance and complex design processes. An extensive number of material/geometry parameters have been utilized for designing these composites, thereby an efficient computational tool is essential for that purpose. Different Machine Learning...
-
Rating by detection: an artifact detection protocol for rating EEG quality with average event duration
PublicationQuantitative evaluation protocols are critical for the development of algorithms that remove artifacts from real EEG optimally. However, visually inspecting the real EEG to select the top-performing artifact removal pipeline is infeasible while hand-crafted EEG data allow assessing artifact removal configurations only in a simulated environment. This study proposes a novel, principled approach for quantitatively evaluating algorithmically...
-
Reverberation divergence in VR applications
PublicationThe aim of this project was to investigate the correlation between virtual reality (VR) imagery and ambisonic sound. With the increasing popularity of VR applications, understanding how sound is perceived in virtual environments is crucial for enhancing the immersiveness of the experience. By examining the relationship between visual scenes and sound scenes, this research attempts to explore how the interaction between vision and...
-
Road traffic can be predicted by machine learning equally effectively as by complex microscopic model
PublicationSince high-quality real data acquired from selected road sections are not always available, a traffic control solution can use data from software traffic simulators working offline. The results show that in contrast to microscopic traffic simulation, the algorithms employing neural networks can work in real-time, so they can be used, among others, to determine the speed displayed on variable message road signs. This paper describes...
-
Systematic Literature Review on Click Through Rate Prediction
PublicationThe ability to anticipate whether a user will click on an item is one of the most crucial aspects of operating an e-commerce business, and clickthrough rate prediction is an attempt to provide an answer to this question. Beginning with the simplest multilayer perceptrons and progressing to the most sophisticated attention networks, researchers employ a variety of methods to solve this issue. In this paper, we present the findings...
-
Usability study of various biometric techniques in bank branches
PublicationThe purpose of the presented research was to evaluate the performance of the prepared biometric algorithms and obtain information on the opinions and preferences of their users in bank branches. The study aimed to determine users' attitudes towards particular modalities and preferences on how to use biometrics after the bank customers had practical experience with the operation of the prototype solutions. The research results...
-
User Authentication by Eye Movement Features Employing SVM and XGBoost Classifiers
PublicationDevices capable of tracking the user’s gaze have become significantly more affordable over the past few years, thus broadening their application, including in-home and office computers and various customer service equipment. Although such devices have comparatively low operating frequencies and limited resolution, they are sufficient to supplement or replace classic input interfaces, such as the keyboard and mouse. The biometric...
Year 2024
-
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning
PublicationIn this work, we investigate exemplar-free class incremental learning (CIL) with knowledge distillation (KD) as a regularization strategy, aiming to prevent forgetting. KDbased methods are successfully used in CIL, but they often struggle to regularize the model without access to exemplars of the training data from previous tasks. Our analysis reveals that this issue originates from substantial representation shifts in the teacher...
-
Deep learning techniques for biometric security: A systematic review of presentation attack detection systems
PublicationBiometric technology, including finger vein, fingerprint, iris, and face recognition, is widely used to enhance security in various devices. In the past decade, significant progress has been made in improving biometric sys- tems, thanks to advancements in deep convolutional neural networks (DCNN) and computer vision (CV), along with large-scale training datasets. However, these systems have become targets of various attacks, with...
-
Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing
PublicationCirculating tumor cells (CTCs) are tumor cells that separate from the solid tumor and enter the bloodstream, which can cause metastasis. Detection and enumeration of CTCs show promising potential as a predictor for prognosis in cancer patients. Furthermore, single-cells sequencing is a technique that provides genetic information from individual cells and allows to classify them precisely and reliably. Sequencing data typically...
-
Divide and not forget: Ensemble of selectively trained experts in Continual Learning
PublicationClass-incremental learning is becoming more popular as it helps models widen their applicability while not forgetting what they already know. A trend in this area is to use a mixture-of-expert technique, where different models work together to solve the task. However, the experts are usually trained all at once using whole task data, which makes them all prone to forgetting and increasing computational burden. To address this limitation,...
-
High frequency oscillations in human memory and cognition: a neurophysiological substrate of engrams?
PublicationDespite advances in understanding the cellular and molecular processes underlying memory and cognition, and recent successful modulation of cognitive performance in brain disorders, the neurophysiological mechanisms remain underexplored. High frequency oscillations beyond the classic electroencephalogram spectrum have emerged as a potential neural correlate of fundamental cognitive processes. High frequency oscillations are detected...
-
Infographics in Educational Settings: A Literature Review
PublicationInfographics are visual representations of data that utilize various graphic elements, including pie charts, bar graphs, line graphs, and histograms. Educators and designers can maximize the potential of infographics as powerful educational tools by carefully addressing challenges and capitalizing on emerging technologies. However, current education systems showcase the need for development guidelines and the best practices targeted...
-
Looking through the past: better knowledge retention for generative replay in continual learning
PublicationIn this work, we improve the generative replay in a continual learning setting to perform well on challenging scenarios. Because of the growing complexity of continual learning tasks, it is becoming more popular, to apply the generative replay technique in the feature space instead of image space. Nevertheless, such an approach does not come without limitations. In particular, we notice the degradation of the continually trained...
-
Missing Puzzle Pieces in Dementia Research: HCN Channels and Theta Oscillations
PublicationIncreasing evidence indicates a role of hyperpolarization activated cation (HCN) channels in controlling the resting membrane potential, pacemaker activity, memory formation, sleep, and arousal. Their disfunction may be associated with the development of epilepsy and age-related memory decline. Neuronal hyperexcitability involved in epileptogenesis and EEG desynchronization occur in the course of dementia in human Alzheimer’s Disease...
-
Sounding Mechanism of a Flue Organ Pipe—A Multi-Sensor Measurement Approach
PublicationThis work presents an approach that integrates the results of measuring, analyzing, and modeling air flow phenomena driven by pressurized air in a flue organ pipe. The investigation concerns a Bourdon organ pipe. Measurements are performed in an anechoic chamber using the Cartesian robot equipped with a 3D acoustic vector sensor (AVS) that acquires both acoustic pressure and air particle velocity. Also, a high-speed camera is employed...