Publications
Total: 892
Publication Catalog
-
Vocalic Segments Classification Assisted by Mouth Motion Capture
Visual features convey important information for automatic speech recognition (ASR), especially in noisy environments. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose, motion capture markers were placed on speakers' faces to obtain lip-tracking data during speaking. Different parameterization strategies were tested...
-
CNN Architectures for Human Pose Estimation from a Very Low Resolution Depth Image
The paper is dedicated to proposing and evaluating a number of convolutional neural network architectures for calculating a multiple regression on 3D coordinates of human body joints tracked in a single low resolution depth image. The main challenge was to obtain a high precision in case of a noisy and coarse scan of the body, as observed by a depth sensor from a large distance. The regression network was expected to reason about...
-
Investigating Feature Spaces for Isolated Word Recognition
The study addresses the issues related to the appropriateness of a two-dimensional representation of speech signal for speech recognition tasks based on deep learning techniques. The approach combines Convolutional Neural Networks (CNNs) and time-frequency signal representation converted to the investigated feature spaces. In particular, waveforms and fractal dimension features of the signal were chosen for the time domain, and...
-
Improving Objective Speech Quality Indicators in Noise Conditions
This work aims at modifying speech signal samples and testing them with objective speech quality indicators after mixing the original signals with noise or with an interfering signal. Modifications that are applied to the signal are related to the Lombard speech characteristics, i.e., pitch shifting, utterance duration changes, vocal tract scaling, manipulation of formants. A set of words and sentences in Polish, recorded in silence,...
-
Edge-Computing based Secure E-learning Platforms
Implementation of Information and Communication Technologies (ICT) in E-Learning environments has brought about dramatic changes in the current educational sector. Distance learning, online learning, and networked learning are a few examples that promote educational interaction between students, lecturers and learning communities. Although an efficient form of real learning resource, online electronic resources are subject to...
-
Objectivization of phonological evaluation of speech elements by means of audio parametrization
This study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...
-
New applications of sound and vision engineering
Multimedia, Sound & Vision Engineering are relatively new fields within the area of science and technology, but teaching and research in this area have been carried out at Gdansk University of Technology (Gdansk, Poland) for nearly 5 decades. Current projects carried out in the Multimedia Systems Department are within the scope of the paper.
-
Modeling and Designing Acoustical Conditions of the Interior – Case Study
The primary aim of this research study was to model acoustic conditions of the Courtyard of the Gdańsk University of Technology Main Building, and then to design a sound reinforcement system for this interior. First, results of measurements of the parameters of the acoustic field are presented. Then, the comparison between measured and predicted values using the ODEON program is shown. Collected data indicate a long reverberation...
-
Robust Object Detection with Multi-input Multi-output Faster R-CNN
Recent years have seen impressive progress in visual recognition on many benchmarks; however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, recently it was shown that similarly competitive results could be achieved with a much smaller cost, by using multi-input multi-output architecture (MIMO). In this work,...
-
Remote Health Monitoring of Wind Turbines Employing Vibroacoustic Transducers and Autoencoders
Implementation of remote monitoring technology for real wind turbine structures designed to detect potential sources of failure is described. An innovative multi-axis contactless acoustic sensor measuring acoustic intensity as well as previously known accelerometers were used for this purpose. Signal processing methods were proposed, including feature extraction and data analysis. Two strategies were examined: Mel Frequency Cepstral...
-
Towards Cancer Patients Classification Using Liquid Biopsy
Liquid biopsy is a useful, minimally invasive diagnostic and monitoring tool for cancer disease. Yet, developing accurate methods, given the potentially large number of input features and usually small dataset sizes, remains very challenging. Recently, a novel feature parameterization based on the RNA-sequenced platelet data which uses the biological knowledge from the Kyoto Encyclopedia of Genes and Genomes, combined with a classifier...
-
Platelet RNA Sequencing Data Through the Lens of Machine Learning
Liquid biopsies offer minimally invasive diagnosis and monitoring of cancer disease. This biosource is often analyzed using sequencing, which generates highly complex data that can be analyzed using machine learning tools. Nevertheless, validating the clinical applications of such methods is challenging. It requires: (a) using data from many patients; (b) verifying potential bias concerning sample collection; and (c) adding interpretability...
-
Concurrent Video Denoising and Deblurring for Dynamic Scenes
Dynamic scene video deblurring is a challenging task due to the spatially variant blur inflicted by independently moving objects and camera shakes. Recent deep learning works bypass the ill-posedness of explicitly deriving the blur kernel by learning pixel-to-pixel mappings, which is commonly enhanced by larger region awareness. This is a difficult yet simplified scenario because noise is neglected when it is omnipresent in a wide...
-
A Study in Experimental Methods of Human-Computer Communication for Patients After Severe Brain Injuries
Experimental research in the domain of multimedia technology applied to medical practice is discussed, employing a prototype of an integrated multimodal system to assist diagnosis and polysensory stimulation of patients after severe brain injury. The system being developed includes, among others, an eye gaze tracker and EEG monitoring of non-communicating patients after severe brain injuries. The proposed solutions are used for collecting...
-
Analysis of allophones based on audio signal recordings and parameterization
The aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...
-
Performance Evaluation of Selected Parallel Object Detection and Tracking Algorithms on an Embedded GPU Platform
Performance evaluation of selected complex video processing algorithms, implemented on a parallel, embedded GPU platform Tegra X1, is presented. Three algorithms were chosen for evaluation: a GMM-based object detection algorithm, a particle filter tracking algorithm and an optical flow based algorithm devoted to people counting in a crowd flow. The choice of these algorithms was based on their computational complexity and parallel...
-
Predicting emotion from color present in images and video excerpts by machine learning
This work aims at predicting emotion based on the colors present in images and video excerpts using a machine-learning approach. The purpose of this paper is threefold: (a) to develop a machine-learning algorithm that classifies emotions based on the color present in an image, (b) to select the best-performing algorithm from the first phase and apply it to film excerpt emotion analysis based on colors, (c) to design an online survey...
-
User Authentication by Eye Movement Features Employing SVM and XGBoost Classifiers
Devices capable of tracking the user’s gaze have become significantly more affordable over the past few years, thus broadening their application, including in-home and office computers and various customer service equipment. Although such devices have comparatively low operating frequencies and limited resolution, they are sufficient to supplement or replace classic input interfaces, such as the keyboard and mouse. The biometric...
-
Localization of sound sources with dual acoustic vector sensor
The aim of the work is to estimate the position of sound sources. The proposed method uses a setup of two acoustic vector sensors (AVS). The intersection of azimuth rays from each AVS should indicate the position of a source. In practice, the result of position estimation using this method is an area rather than a point. This is a result of inaccuracy of the individual sensors, but more importantly, of the influence of a source...
-
Assessment of the Effectiveness of a Short-term Hearing Aid Use in Patients with Different Degrees of Hearing Loss
The study evaluates the effectiveness of the hearing aid fitting process in short-term use (7 days). The evaluation method consists of a survey based on the APHAB (Abbreviated Profile of Hearing Aid Benefit) questionnaire. Additional criteria such as the degree of hearing loss, number of hours and days of hearing aid use as well as the user’s experience were also taken into consideration. The outcomes of the benefit...
-
Comparison of Lithuanian and Polish Consonant Phonemes Based on Acoustic Analysis – Preliminary Results
The goal of this research is to find a set of acoustic parameters that are related to differences between Polish and Lithuanian language consonants. In order to identify these differences, an acoustic analysis is performed, and the phoneme sounds are described as the vectors of acoustic parameters. Parameters known from the speech domain as well as those from the music information retrieval area are employed. These parameters are...
-
An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics
The speech with the Lombard effect has been extensively studied in the context of speech recognition or speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create a mathematical model that allows for retaining the Lombard effect. These models could be used as a basis of a formant speech synthesizer. The proposed models are based on dividing...
-
Personalized avatar animation for virtual reality
The paper presents a method for creating a personalized animation of an avatar for virtual reality applications such as multiplayer on-line games. Animation is stored in a simplified version, containing only keyframes for important avatar poses. This version defines key movements, i.e. roughly describes the avatar's action. Animation is enriched by the user with new motion phases utilizing fuzzy descriptors. Various degrees of motion...
-
Square Root Raised Cosine Fractionally Delaying Nyquist Filter - Design and Performance Evaluation
In this paper we propose a discrete-time FIR (Finite Impulse Response) filter which is applied as a square root Nyquist filter and fractional delay filter simultaneously. The filter can substitute for a cascade of a square root Nyquist filter and a fractional delay filter in one device/algorithm. The aim is to compensate for transmission delay in a digital communication system. Performance of the filter as a matched filter is...
-
Application of virtual gate for counting people participating in large public events
The concept and practical application of the developed algorithm for people counting in crowded scenes is presented. The aim of the work is to estimate the number of people passing towards entrances of a large sport hall. The details of the implemented Virtual Gate algorithm are presented. The video signal from the camera installed in the building constituted the input for the algorithm. The most challenging problem was the unpredicted...
-
Resolving Conflicts in Object Tracking in Video Stream Employing Key Point Matching
A novel approach to resolving ambiguous situations in object tracking in video streams is presented. The proposed method combines standard tracking technique employing Kalman filters with global feature matching method. Object detection is performed using a background subtraction algorithm, then Kalman filters are used for object tracking. At the same time, SURF key points are detected only in image sections identified as moving...
-
Sacral sound-engineering
Organological and campanological acoustical problems related to applications in sacral objects are characterized on the basis of numerous reviewed publications and engineering reports. The contributions of several research centres, mostly Polish, to solving these problems are evaluated. Some desirable future developments are indicated. Appendices provide examples of documentation on selected investigated objects.
-
Musical Instrument Separation Applied to Music Genre Classification. Separacja instrumentów muzycznych w zastosowaniu do rozpoznawania gatunków muzycznych
This paper outlines first issues related to music genre classification and a short description of algorithms used for musical instrument separation. Also, the paper presents proposed optimization of the feature vectors used for music genre recognition. Then, the ability of decision algorithms to properly recognize music genres is discussed based on two databases. In addition, results are cited for another database with regard to...
-
On the use of instantaneous complex frequency for analysis and modification of simple sounds
The paper presents the possibilities of using instantaneous complex frequency for the analysis and modification of simple sounds. The described algorithm consists of two steps: bifactorization of the signal into a minimum-phase envelope and a positively rotating phasor, followed by estimation and modification of the instantaneous complex frequency of both factors.
-
Surveillance camera tracking of GEO positioned objects
The chapter describes a control system for movable PTZ cameras that tracks a moving object with a known GPS position. The idea of the system and its possible uses are presented. The following are described: the procedure for calibrating the camera's field of view and linking it to location data, and the motion prediction procedure used to compensate for time delays. The implemented modular system is discussed, which comprises: terminals...
-
Complexity analysis of the Pawlak’s flowgraph extension for re-identification in multi-camera surveillance system
The idea of Pawlak’s flowgraph turned out to be a useful and convenient container for knowledge of objects’ behaviour and movements within the area observed with a multi-camera surveillance system. Utilization of the flowgraph for modelling behaviour admittedly requires certain extensions and enhancements, but it allows for combining many rules into one data structure and for obtaining parameters describing how objects tend...
-
Simple gait parameterization and 3D animation for anonymous visual monitoring based on augmented reality
The article presents a method for video anonymization and replacing real human silhouettes with virtual 3D figures rendered on a screen. The video stream is processed to detect and track objects, whereas the anonymization stage animates avatars according to the behavior of detected persons. Location, movement speed, direction, and person height are taken into account during animation and rendering phases. This approach requires...
-
Low-Level Music Feature Vectors Embedded as Watermarks
In this paper a method consisting in embedding low-level music feature vectors as watermarks into a musical signal is proposed. First, a review of some recent watermarking techniques and the main goals of development of digital watermarking research are provided. Then, a short overview of parameterization employed in the area of Music Information Retrieval is given. A methodology of non-blind watermarking applied to music-content...
-
Detection of Anomalies in the Operation of a Road Lighting System Based on Data from Smart Electricity Meters
Smart meters in road lighting systems create new opportunities for automatic diagnostics of undesirable phenomena such as lamp failures, schedule deviations, or energy theft from the power grid. Such a solution fits into the smart cities concept, where an adaptive lighting system creates new challenges with respect to the monitoring function. This article presents research results indicating the practical feasibility of real‐time...
-
Comparison of Methods for Real and Imaginary Motion Classification from EEG Signals
A method for feature extraction and results of classification of EEG signals obtained from performed and imagined motion are presented. A set of 615 features was obtained to serve for the recognition of type and laterality of motion using 8 different classification approaches. A comparison of the accuracy achieved by the classifiers is presented in the paper, and then conclusions and discussion are provided. Among applied algorithms the...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
Evaluation of the separation algorithm performance employing ANNs
The aim of this chapter is to present a methodology for separating musical sounds without a priori information about the sounds contained in the musical mix. The work shows that a properly trained artificial neural network (ANN) is able to automatically and correctly classify the sounds contained in a mixed signal. The classification effectiveness of the ANN is comparable to the subjective assessment of experts.
-
A study on signal processing methods applied to hearing aids
This paper presents a short survey on current technology available in hearing aids with a focus on digital signal processing techniques used. First, factors influencing the hearing aid effectiveness are introduced. Then, examples of the present DSP methods and strategies are provided. Also, a description of current limitations of hearing aids and future trends of development are shown. Finally, the notion of computational auditory...
-
Automatic Rhythm Retrieval from Musical Files
This paper presents a comparison of the effectiveness of two computational intelligence approaches applied to the task of retrieving rhythmic structure from musical files. The method proposed by the authors of this paper generates rhythmic levels first, and then uses these levels to compose rhythmic hypotheses. Three phases: creating periods, creating simplified hypotheses and creating full hypotheses are examined within this study....
-
Multimodal Approach For Polysensory Stimulation And Diagnosis Of Subjects With Severe Communication Disorders
...is evaluated on 9 patients, data analysis methods are described, and experiments of correlating Glasgow Coma Scale with extracted features describing subjects' performance in therapeutic exercises exploiting EEG and eyetracker are presented. Performance metrics are proposed, and k-means clusters are used to define concepts for mental states related to EEG and eyetracking activity. Finally, it is shown that the strongest correlations...
-
Multimedia industrial and medical applications supported by machine learning
This article outlines a keynote paper presented at the Intelligent Decision Technologies conference, forming part of the KES Multi-theme Conference “Smart Digital Futures” organized in Rome on June 14–16, 2023. It briefly discusses projects related to traffic control using developed intelligent traffic signs and diagnosing the health of wind turbine mechanisms and multimodal biometric authentication for banking branches to provide...
-
Road traffic can be predicted by machine learning equally effectively as by complex microscopic model
Since high-quality real data acquired from selected road sections are not always available, a traffic control solution can use data from software traffic simulators working offline. The results show that in contrast to microscopic traffic simulation, the algorithms employing neural networks can work in real-time, so they can be used, among others, to determine the speed displayed on variable message road signs. This paper describes...
-
Diagnosing wind turbine condition employing a neural network to the analysis of vibroacoustic signals
It is important from the economic point of view to detect damage early in the wind turbines before failures occur. For this purpose, a monitoring device was built that analyzes both acoustic signals acquired from the built-in non-contact acoustic intensity probe, as well as from the accelerometers, mounted on the internal devices in the nacelle. The signals collected in this way are used for long-term training of the autoencoder...
-
Wind Turbines Modeling as the Tool for Developing Algorithms of Processing their Video Recordings
In the real world, many factors disturb observation of the examined phenomena and cause various noises and distortions in recorded signals. This very often makes it difficult or even impossible to optimize various signal processing algorithms by finding appropriate parameters. In this paper, we show an application that retrieves wind turbine rotor speed from recorded video. Next, we describe the process of reduction...
-
Efficient Fractional Delay Hilbert Transform Filter in the Farrow Structure
In this paper the design and application of a Fractional Delay Hilbert Transform Filter (FDHTF) to adaptive sub-sample delay estimation between two separated sinusoidal signals is considered. The FDHTF incorporates the functions of Hilbertian and variable fractional delay filtering of the incoming signal simultaneously, in one stage. In the traditional approach each of these operations was performed separately. Obtained value...
-
Towards Cognitive and Perceptive Video Systems
In this chapter we cover research and development issues related to smart cameras. We discuss challenges, new technologies and algorithms, applications and the evaluation of today’s technologies. We will cover problems related to software, hardware, communication, embedded and distributed systems, multi-modal sensors, privacy and security. We also discuss future trends and market expectations from the customer’s point of view.
-
Knowledge representation of motor activity of patients with Parkinson’s disease
An approach to extracting a knowledge representation from the analysis of biomedical signals concerning motor activity of Parkinson's disease patients is proposed in this paper. This is done utilizing accelerometers attached to their body as well as exploiting video image of their hand movements. Experiments are carried out employing artificial neural networks and support vector machine to the recognition of characteristic motor activity...
-
Waveguide model of the hearing aid earmold system
Background: The earmold system of the Behind-The-Ear hearing aid is an acoustic system that modifies the spectrum of the propagated sound waves. Improper selection of the earmold system may result in deterioration of sound quality and speech intelligibility. Computer modeling methods may be useful in the process of hearing aid fitting, allowing a physician to examine various earmold system configurations and choose the optimum one...
-
Development of Domain-Specific Solutions within the Polish Infrastructure for Advanced Scientific Research
The Polish Grid computing infrastructure was established during the PL-Grid project (2009-2012). The main purpose of this project was to provide Polish scientists with a basic IT platform, allowing them to conduct interdisciplinary research on a national scale, and giving them transparent access to international grid resources via international grid infrastructures. Currently, the infrastructure is maintained and extended...
-
Examining Quality of Hand Segmentation Based on Gaussian Mixture Models
Results of examination of various implementations of Gaussian mixture models are presented in the paper. Two of the implementations belonged to Intel’s OpenCV 2.4.3 library and utilized Background Subtractor MOG and Background Subtractor MOG2 classes. The third implementation presented in the paper was created by the authors and extended Background Subtractor MOG2 with the possibility of operating on the scaled version of...
-
Computer-Supported Polysensory Integration Technology for Educationally Handicapped Pupils
In this paper, a multimedia system providing technology for hearing and visual attention stimulation is briefly presented. The system aims to support the development of educationally handicapped pupils. The system has been presented in the context of its configuration, architecture, and therapeutic exercise implementation issues. Results of pupils’ improvements after 8 weeks of training with the system are also provided. Training...
-
A concept of Signal Equalization Method Based on Music Genre and the Listener's Room Characteristics
A research study that investigates the influence of the room acoustics environment on the frequency characteristic of the audio signal playback is presented. First, a novel spectral equalization method of the room acoustic conditions is introduced. On the basis of the frequency response of the room, a system for room acoustics compensation based on an eight-band equalizer is proposed. The system settings depend on music genre. In...
-
Retrospecting Polish Audio Engineering Society Membership on 20th Anniversary of the Polish Section of the Audio Engineering Society
In this article some key events concerning the founding of the Polish Section of the Audio Engineering Society are presented. In addition, the history of the International Symposia on Sound Engineering and Mastering is outlined. Also, papers contained in this issue are briefly reviewed.
-
Facial features extraction for color, frontal images
The problem of facial characteristic features extraction is discussed. Several methods of features extraction for color en-face photographs are discussed. The methods are based mainly on the color features related to the specific regions of the human face. The usefulness of presented methods was tested on a database of 100 en-face photographs.
-
Measurements and visualization of sound field distribution around organ pipe
Measurements and visualization of acoustic field around an organ pipe are presented. Sound intensity technique was applied for this purpose. Measurements were performed in free field. The organ pipe was activated with a constant air flow, produced by an external compressor, aimed at obtaining long-term steady state responses of generated acoustic signal. Sound energy distribution was measured in a defined fixed grid of points...
-
Multimedia polysensory integration training system dedicated to children with educational difficulties
This paper aims at presenting a multimedia system providing polysensory training for pupils with educational difficulties. The particularly interesting aspect of the system lies in the sonic interaction with image projection in which sounds generated lead to stimulation of a particular part of the human brain. The system architecture, video processing methods, therapeutic exercises and guidelines for children’s interaction...
-
Enhancement of computer character animation utilizing fuzzy rules
The chapter presents a new method for processing computer character animations. It uses fuzzy inference based on rules and membership functions obtained by analyzing the results of subjective animation quality tests. During processing, new motion phases are automatically added to the animation, which improves its visual quality and changes the fluidity and stylization of the motion in an intended way. In the paper...
-
An approach to determining tinnitus acoustical characteristic
For many treatment methods, accurate estimation of tinnitus (ringing in the ears) with respect to sound type, level, and bandwidth or frequency is essential. The proposed way of obtaining tinnitus parameters is described in this paper. The method employs sound synthesis, aimed at obtaining sound which is closest to the perceived tinnitus. The proposed method assumes running a designed application on a multimedia PC provided with a special graphical...
-
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
The paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict resolving extensions for the object tracking. In order to allow an effective event...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
The aim of this paper is to present a system for automatically assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on the skull. Data gathered are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Performance Analysis of Developed Multimodal Biometric Identity Verification System
PublikacjaThe bank client identity verification system developed in the course of the IDENT project is presented. Five biometric modalities have been developed and studied: dynamic handwritten signature proofing, voice recognition, face image verification, face contour extraction and hand blood vessels distribution comparison. The experimental data were acquired employing multiple biometric sensors installed...
-
Towards Audio Signal Equalization Based on Spectral Characteristics of a Listening Room and Music Content Reproduced
PublikacjaThis study investigates the influence of room acoustics on the frequency characteristic of audio signal playback. First, the concept of a novel spectral equalization method for the room acoustic conditions is introduced. On the basis of the room spectral response, a system for room acoustics compensation based on a designed equalizer is proposed. The system settings depend on the automatically recognized music genre....
-
Audio Feature Analysis for Precise Vocalic Segments Classification in English
PublikacjaAn approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...
-
Evaluating calibration and robustness of pedestrian detectors
PublikacjaIn this work robustness and calibration of modern pedestrian detectors are evaluated. Pedestrian detection is a crucial perception component in autonomous driving and here we study its performance under different image corruptions. Furthermore, we provide analysis of classification calibration of pedestrian detectors and we show a positive effect of using style-transfer augmentation technique. Our analysis is aimed as a step...
-
Multimedia Communications, Services and Security MCSS. 10th International Conference, MCSS 2020, Preface
PublikacjaMultimedia surrounds us everywhere. It is estimated that only a part of the recorded resources are processed and analyzed. These resources offer enormous opportunities to improve the quality of life of citizens. As a result, of the introduction of a new type of algorithms to improve security by maintaining a high level of privacy protection. Among the many articles, there are examples of solutions for improving the operation of...
-
Rating by detection: an artifact detection protocol for rating EEG quality with average event duration
PublikacjaQuantitative evaluation protocols are critical for the development of algorithms that remove artifacts from real EEG optimally. However, visually inspecting the real EEG to select the top-performing artifact removal pipeline is infeasible while hand-crafted EEG data allow assessing artifact removal configurations only in a simulated environment. This study proposes a novel, principled approach for quantitatively evaluating algorithmically...
-
Marking the Allophones Boundaries Based on the DTW Algorithm
PublikacjaThe paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking allophone boundaries in continuous speech is a difficult issue due to the mutual influence of adjacent phonemes on each other. On the one hand, it is this neighborhood that creates variants of phonemes, that is, allophones, and on the other hand it affects that the border...
-
Eulerian motion magnification applied to structural health monitoring of wind turbines
PublikacjaSeveral types of defects may occur in wind turbines, such as physical damage to blades or gearbox malfunction. A wind farm monitoring and damage prediction system is built to observe abnormal vibrations of elements of a wind turbine: blades, nacelle, and tower. Contactless methods are developed which do not require stopping the turbine. In this work, structural health monitoring of a wind turbine is evaluated using a conversion from the captured...
-
Multimodal system for diagnosis and polysensory stimulation of subjects with communication disorders
PublikacjaAn experimental multimodal system, designed for polysensory diagnosis and stimulation of persons with impaired communication skills or even non-communicative subjects is presented. The user interface includes an eye tracking device and the EEG monitoring of the subject. Furthermore, the system consists of a device for objective hearing testing and an autostereoscopic projection system designed to stimulate subjects through their...
-
Leveraging spatio-temporal features for joint deblurring and segmentation of instruments in dental video microscopy
PublikacjaIn dentistry, microscopes have become indispensable optical devices for high-quality treatment and micro-invasive surgery, especially in the field of endodontics. Recent machine vision advances enable more advanced, real-time applications including but not limited to dental video deblurring and workflow analysis through relevant metadata obtained by instrument motion trajectories. To this end, the proposed work addresses dental...
-
Adaptive Method for Modeling of Temporal Dependencies between Fields of Vision in Multi-Camera Surveillance Systems
PublikacjaA method of modeling the time of object transition between given pairs of cameras based on the Gaussian Mixture Model (GMM) is proposed in this article. Temporal dependencies modeling is a part of object re-identification based on the multi-camera experimental framework. The previously utilized Expectation-Maximization (EM) approach, requiring setting the number of mixtures arbitrarily as an input parameter, was extended with the...
-
Systematic Literature Review on Click Through Rate Prediction
PublikacjaThe ability to anticipate whether a user will click on an item is one of the most crucial aspects of operating an e-commerce business, and clickthrough rate prediction is an attempt to provide an answer to this question. Beginning with the simplest multilayer perceptrons and progressing to the most sophisticated attention networks, researchers employ a variety of methods to solve this issue. In this paper, we present the findings...
-
Autoencoder application for anomaly detection in power consumption of lighting systems
PublikacjaDetecting energy consumption anomalies is a popular topic of industrial research, but there is a noticeable lack of research reported in the literature on energy consumption anomalies for road lighting systems. However, there is a need for such research because the lighting system, a key element of the Smart City concept, creates new monitoring opportunities and challenges. This paper examines algorithms based on the deep learning...
-
Missing Puzzle Pieces in Dementia Research: HCN Channels and Theta Oscillations
PublikacjaIncreasing evidence indicates a role of hyperpolarization activated cation (HCN) channels in controlling the resting membrane potential, pacemaker activity, memory formation, sleep, and arousal. Their dysfunction may be associated with the development of epilepsy and age-related memory decline. Neuronal hyperexcitability involved in epileptogenesis and EEG desynchronization occur in the course of dementia in human Alzheimer’s Disease...
-
Surgical tool tracking by on-line selection of structural correlation filters
PublikacjaIn visual tracking of surgical instruments, correlation filtering finds the best candidate with the maximal correlation peak. However, most trackers only consider capturing target appearance but not target structure. In this paper we propose a surgical instrument tracking approach that integrates prior knowledge related to rotation of both shaft and tool tips. To this end, we employ a rigid parts mixture model of an instrument. The rigidly...
-
Multi-task Video Enhancement for Dental Interventions
PublikacjaA microcamera firmly attached to a dental handpiece allows dentists to continuously monitor the progress of conservative dental procedures. Video enhancement in video-assisted dental interventions alleviates low-light, noise, blur, and camera handshakes that collectively degrade visual comfort. To this end, we introduce a novel deep network for multi-task video enhancement that enables macro-visualization of dental scenes. In particular,...
-
Applications of neural networks and perceptual masking to audio restoration
PublikacjaApplications of learning algorithms in the field of audio recording restoration are discussed. Particular attention is paid to the use of artificial neural networks for removing disturbing impulses. In addition, the use of an intelligent decision algorithm for controlling perceptual masking in order to reduce noise is described.
-
Methodology and technology for the polymodal allophonic speech transcription
PublikacjaA method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...
-
Modified dynamic time warping method applied to handwritten signature authenticity verification
PublikacjaA signature verification system based on static features and time-domain functions of signals obtained using a tablet has been presented in the paper. The signature verification method, based mainly on dynamic time warping coupled with some signature image features, has been described. The FRR measures reflecting the method’s efficiency have been evaluated for verification attempts performed directly after obtaining model signatures...
-
Assessment of hearing in coma patients employing auditory brainstem response, electroencephalography, and eye-gaze-tracking
PublikacjaThe results of the study conducted by Tagliaferri et al. in 12 European countries indicate that the ratio of registered brain injury cases in Europe amounts to 150-300 per 100 000 people, with the European mean value of 235 cases per 100 000 people. The project presented in the paper assumes development of a combined metric of patients’ state remaining in coma by intelligent fusion of GCS (subjective Glasgow Coma Scale or its derivatives)...
-
Comparison of the Ability of Neural Network Model and Humans to Detect a Cloned Voice
PublikacjaThe vulnerability of the speaker identity verification system to attacks using voice cloning was examined. The research project assumed creating a model for verifying the speaker’s identity based on voice biometrics and then testing its resistance to potential attacks using voice cloning. The Deep Speaker Neural Speaker Embedding System was trained, and the Real-Time Voice Cloning system was employed based on the SV2TTS, Tacotron,...
-
Discovering Rule-Based Learning Systems for the Purpose of Music Analysis
PublikacjaMusic analysis and processing aims at understanding information retrieved from music (Music Information Retrieval). For the purpose of music data mining, machine learning (ML) methods or statistical approach are employed. Their primary task is recognition of musical instrument sounds, music genre or emotion contained in music, identification of audio, assessment of audio content, etc. In terms of computational approach, music databases...
-
Application of autoencoder to traffic noise analysis
PublikacjaThe aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...
-
Employing a biofeedback method based on hemispheric synchronization in effective learning
PublikacjaIn this paper an approach to build a brain computer-based hemispheric synchronization system is presented. The concept utilizes the wireless EEG signal registration and acquisition as well as advanced pre-processing methods. The influence of various filtration techniques of EOG artifacts on brain state recognition is examined. The emphasis is put on brain state recognition using band pass filtration for separation of individual...
-
Multimedia interface using head movements tracking
PublikacjaThe presented solution supports innovative ways of manipulating computer multimedia content, such as: static images, videos and music clips and others that can be browsed subsequently. The system requires a standard web camera that captures images of the user face. The core of the system is formed by a head movement analyzing algorithm that finds a user face and tracks head movements in real time. Head movements are tracked with...
-
Performance evaluation of parallel background subtraction on GPU platforms
PublikacjaImplementation of the background subtraction algorithm on parallel GPUs is presented. The algorithm processes video streams and extracts foreground pixels. The work focuses on optimizing parallel algorithm implementation by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. The algorithm is implemented in both OpenCL and CUDA. Various optimizations of...
-
Application of auto calibration and linearization algorithms to improve sound quality of computer devices
PublikacjaAn application of auto calibration and linearization algorithms designed for correcting acoustic characteristics of selected computer devices was presented in the paper. The functionality of the algorithms was presented for two kinds of computer devices: an ultrabook-class computer and a portable All-In-One device. The algorithms were adjusted for the given type of device on the basis of a series of measurements conducted...
-
The Application Of A Noise Mapping Tool Deployed In Grid Infrastructure For Creating Noise Maps Of Urban Areas
PublikacjaThe concept and implementation of the system for creating dynamic noise maps in PL-Grid infrastructure are presented. The methodology of creating dynamic acoustic maps is introduced. The concept of noise mapping, based on noise source and propagation models, was developed and employed in the system. The details of incorporating the system into the PL-Grid infrastructure are presented. The results of simulations performed by...
-
Comparison of the effectiveness of automatic EEG signal class separation algorithms
PublikacjaIn this paper, an algorithm for automatic brain activity class identification of EEG (electroencephalographic) signals is presented. EEG signals are gathered from seventeen subjects performing one of the three tasks: resting, watching a music video and playing a simple logic game. The methodology applied consists of several steps, namely: signal acquisition, signal processing utilizing z-score normalization, parametrization and...
-
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
PublikacjaA method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
PublikacjaThis paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Improving the quality of speech in the conditions of noise and interference
PublikacjaThe aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...
-
Counting and tracking vehicles using acoustic vector sensors
PublikacjaA method is presented for counting vehicles and for determining their movement direction by means of acoustic vector sensor application. The assumptions of the method employing spatial distribution of sound intensity determined with the help of an integrated 3D intensity probe are discussed. The intensity probe developed by the authors was used for the experiments. The mode of operation of the algorithm is presented in conjunction...
-
A Comparison of STI Measured by Direct and Indirect Methods for Interiors Coupled with Sound Reinforcement Systems
PublikacjaThis paper presents a comparison of STI (Speech Transmission Index) coefficient measurement results carried out by direct and indirect methods. First, acoustic parameters important in the context of public address and sound reinforcement systems are recalled. A measurement methodology is presented that employs various test signals to determine impulse responses. The process of evaluating sound system performance, signals enabling...
-
Employing economical methods for pavement defects estimation
PublikacjaIt is common practice that measurements of road surface conditions are made using professional and expensive apparatus. Typically a van or a truck equipped with a set of professional sensors, i.e., laser surface scanners, is used; therefore the measurement update period is often quite long. Two alternative low-cost methods for estimating road pavement defects and failures were proposed and investigated by the authors. The first...
-
Classifying type of vehicles on the basis of data extracted from audio signal characteristics
PublikacjaThe aim of this study is to find and optimize a feature vector, extracted from an audio signal, for automatic recognition of vehicle type. First, the influence of weather-dependent road surface conditions on the spectral characteristics of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...
-
Comparison of selected electroencephalographic signal classification methods
PublikacjaA variety of methods exists for electroencephalographic (EEG) signals classification. In this paper, we briefly review selected methods developed for such a purpose. First, a short description of the EEG signal characteristics is shown. Then, a comparison between the selected EEG signal classification methods, based on the overview of research studies on this topic, is presented. Examples of methods included in the study are: Artificial...
-
Comparing traffic intensity estimates employing passive acoustic radar and microwave Doppler radar sensor
PublikacjaThe purpose of our applied research project is to develop an autonomous road sign with built-in radar devices of our design. In this paper, we show that it is possible to calibrate the acoustic vector sensor so that it can be used to measure traffic volume and count the vehicles involved in the traffic through the analysis of the noise emitted by them. Signals obtained from a Doppler radar are used as a reference source. Although...