Search results for: recordings
-
Multimodal English corpus for automatic speech recognition
Publication: A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Evaluation of Face Detection Algorithms for the Bank Client Identity Verification
Publication: Results of an investigation of the efficiency of face detection algorithms in a visual verification system for bank clients are presented. The video recordings were made under real conditions in three bank operating outlets using a miniature industrial USB camera. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide...
-
Differentiating patients with obstructive sleep apnea from healthy controls based on heart rate-blood pressure coupling quantified by entropy-based indices
Publication: We introduce an entropy-based classification method for pairs of sequences (ECPS) for quantifying mutual dependencies in heart rate and beat-to-beat blood pressure recordings. The purpose of the method is to build a classifier for data in which each item consists of two intertwined data series taken for each subject. The method is based on ordinal patterns and uses entropy-like indices. Machine learning is used to select a subset...
-
Evidence for consolidation of neuronal assemblies after seizures in humans
Publication: The establishment of memories involves reactivation of waking neuronal activity patterns and strengthening of associated neural circuits during slow-wave sleep (SWS), a process known as "cellular consolidation" (Dudai and Morris, 2013). Reactivation of neural activity patterns during waking behaviors that occurs on a timescale of seconds to minutes is thought to constitute memory recall (O'Keefe and Nadel, 1978), whereas consolidation...
-
Automated detection of sleep apnea and hypopnea events based on robust airflow envelope tracking
Publication: The paper presents a new approach to detection of apnea/hypopnea events, in the presence of artifacts and breathing irregularities, from a single-channel airflow record. The proposed algorithm identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope, free of breathing artifacts, improves effectiveness of the diagnostic process and allows...
-
Comparison of sound of organ pipes in contemporary and historical instruments
Publication: The aim of this research is to examine the differences in the timbre of organ pipes’ sound between a historical and a contemporary organ instrument. The historical instrument is the Oliwa organ from Gdansk, Poland, and the contemporary one is from Kartuzy, Poland. Recordings are made of single notes played by an open labial pipe that belongs to the Principal rank. The analyses and comparison of several sound features compatible...
-
A commonly-accessible toolchain for live streaming music events with higher-order ambisonic audio and 4k 360 vision
Publication: An immersive live stream is especially interesting in the ongoing development of telepresence tools, particularly in the virtual reality (VR) or mixed reality (MR) domain. This paper explores a remote and immersive way of enabling telepresence for the audience of a high-fidelity music performance using freely available and easily accessible tools. A functional VR live-streaming toolchain, comprising 360 vision and higher-order ambisonic...
-
An Approach to the Detection of Bank Robbery Acts Employing Thermal Image Analysis
Publication: A novel approach to the detection of selected security-related events in bank monitoring systems is presented. Thermal camera images are used for the detection of people in difficult lighting conditions. Next, the algorithm analyses the movement of objects detected in thermal or standard monitoring cameras using a method evolved from the motion history images algorithm. At the same time, thermal images are analyzed in order to detect...
-
A detector of sleep disorders for using at home
Publication: Obstructive sleep apnea usually requires an all-night examination in a specialized clinic, under the supervision of medical staff. Because of those requirements it is an expensive and not widely used test. Moving the examination procedure to the patient's home, with automatic analysis algorithms involved, will decrease the costs and make it available to a larger group of patients. The developed device allows all-night recordings...
-
Polish motorways 2016 - video data
Research data: Polish motorways 2016 - video data
-
Measuring Pulse Rate with a Webcam
Publication: In this paper a simple method of measuring the pulse rate is presented. The developed algorithm allows for efficient pulse rate registration directly from face images captured with a webcam. The desired signal is obtained by proper channel selection and principal component analysis. To determine the accuracy of the method, an ECG signal is collected together with the video recordings. The effectiveness of the algorithm is considered...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication: The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Texture Features for the Detection of Playback Attacks: Towards a Robust Solution
Publication: This paper describes the new version of a method that is capable of protecting automatic speaker verification (ASV) systems from playback attacks. The presented approach uses computer vision techniques, such as texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. Our goal is to make the algorithm as independent as possible of the contents of the training set; we look for the...
-
Localization of impulsive disturbances in archive audio signals using predictive matched filtering
Publication: The problem of eliminating impulsive disturbances from archive audio signals is considered and a new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...
-
Sound signals generated during lapping of technical ceramics using electroplated tools with diamond grains
Research data: The data contain recordings of sound generated during single-sided lapping with electroplated diamond tools. This relationship was examined with the use of spectral analysis of the sound signal in the frequency domain, with a focus on the Ra parameter of the surface roughness. The estimated sound coefficient increased as the surface roughness...
-
Driving Performance Indicators of Electric Bus Driving Technique: Naturalistic Driving Data Multicriterial Analysis
Publication: The issue of electric energy saving in public transport is becoming a key area of interest. By improving driving techniques and implementing eco-driving, it is possible to save electric energy. Systems that help to decrease energy consumption and to reduce fuel emissions are becoming popular in vehicles powered by diesel engines. However, these methods have not yet gained popularity in electric vehicles. Therefore,...
-
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
Publication: An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from a previously recorded sample read by a phonology expert, and each examined speaker repeated the word, trying to imitate the correct pronunciation. The next step consisted...
-
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
Publication: The paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict-resolving extensions for object tracking. In order to allow an effective event...
-
Polish expressways 2016 - video data
Research data: Polish expressways 2016 - video data
-
Applications for investigating therapy progress of autistic children
Publication: The paper concerns supporting behavioral therapy of autistic children with mobile applications, specifically applied to measuring the child’s progress. A family of five applications is presented, developed as an investigation tool within a project aimed at automating therapy progress monitoring. The applications have already been tested with children with autism spectrum disorder. Here we analyse children’s experience...
-
Closed-loop stimulation of temporal cortex rescues functional networks and improves memory
Publication: Memory failures are frustrating and often the result of ineffective encoding. One approach to improving memory outcomes is through direct modulation of brain activity with electrical stimulation. Previous efforts, however, have reported inconsistent effects when using open-loop stimulation and often target the hippocampus and medial temporal lobes. Here we use a closed-loop system to monitor and decode neural activity from direct...
-
Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers
Publication: The purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
-
Sparse autoregressive modeling
Publication: The paper presents a comparison of popular pitch determination (PD) algorithms for the purpose of eliminating clicks from archive audio signals using sparse autoregressive (SAR) modeling. The SAR signal representation has been widely used in code-excited linear prediction (CELP) systems. The appropriate construction of the SAR model is required to guarantee model stability. For this reason the signal representation...
-
Facial emotion recognition using depth data
Publication: In this paper an original approach is presented for facial expression and emotion recognition based only on the depth channel from a Microsoft Kinect sensor. The emotional user model contains nine emotions, including the neutral one. The proposed recognition algorithm uses local movement detection within the face area in order to recognize the actual facial expression. This approach has been validated on Facial Expressions and Emotions Database...
-
Biometric identity verification
Publication: This chapter discusses methods that are capable of protecting automatic speaker verification (ASV) systems from playback attacks. Additionally, it presents a new approach, which uses computer vision techniques, such as texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. We show that in this case training the system with large amounts of spectrogram patches may be difficult, and...
-
Ripple oscillations in the left temporal neocortex are associated with impaired verbal episodic memory encoding
Publication: BACKGROUND: We sought to determine if ripple oscillations (80-120 Hz), detected in intracranial electroencephalogram (iEEG) recordings of patients with epilepsy, correlate with an enhancement or disruption of verbal episodic memory encoding. METHODS: We defined ripple and spike events in depth iEEG recordings during list learning in 107 patients with focal epilepsy. We used logistic regression models (LRMs) to investigate the...
-
Polish voivodeship roads 2016 - video data
Research data: Polish voivodeship roads 2016 - video data
-
Robot Eye Perspective in Perceiving Facial Expressions in Interaction with Children with Autism
Publication: The paper concerns automatic facial expression analysis applied in a study of natural “in the wild” interaction between children with autism and a social robot. The paper reports a study that analyzed the recordings captured via a camera located in the eye of a robot. Children with autism exhibit a diverse level of deficits, including ones in social interaction and emotional expression. The aim of the study was to explore the possibility...
-
Systematic approach to binary classification of images in video streams using shifting time windows
Publication: In the paper, after pointing out realistic recordings and classifications of their frames, we propose a new shifting time window approach for improving binary classifications. We consider image classification in two steps. In the first one, well-known binary classification algorithms are used for each image separately. In the second step, the results of the previous step are analysed in relatively short sequences of consecutive...
-
Detection of Face Position and Orientation Using Depth Data
Publication: In this paper an original approach is presented for real-time detection of a user's face position and orientation based only on the depth channel from a Microsoft Kinect sensor. It can be used in facial analysis of scenes with poor lighting conditions, where traditional algorithms based on the optical channel may fail. Thus the proposed approach can support, or even replace, algorithms based on the optical channel or based on skeleton...
-
Cross-domain applications of multimodal human-computer interfaces
Publication: Multimodal interfaces developed for education applications and for disabled people are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with mouth gestures, an audio interface for speech stretching for hearing-impaired and stuttering people, and an intelligent pen for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...
-
Is This Distance Teaching Planning That Bad?
Publication: In spring 2020, university courses were moved into the virtual space due to the Covid-19 lockdown. In this paper, we use experience from courses at Gdańsk University of Technology and ETH Zurich to identify core problems in distance teaching planning and to discuss what to do and what not to do in teaching planning after the pandemic. We conclude that we will not return to the state of (teaching) affairs that we had previously....
-
Polish national roads 2016 - video data
Research data: The data include video traffic data recorded with a video camera installed inside a car. The purpose of the research was to gather vehicle traffic recordings in real conditions on Polish national roads.
-
Akustyczna analiza parametrów ruchu drogowego z wykorzystaniem informacji o hałasie oraz uczenia maszynowego [Acoustic analysis of road traffic parameters using noise information and machine learning]
Publication: The aim of the dissertation was to develop an acoustic method for analysing road traffic parameters. The operating principle of acoustic road traffic analysis provides a passive method of monitoring traffic volume. The work presents selected machine learning methods in the context of sound analysis (Machine Hearing). A methodology for classifying road traffic events using machine learning is presented. The work outlines the basic...
-
High frequency oscillations are associated with cognitive processing in human recognition memory
Publication: High frequency oscillations are associated with normal brain function, but also increasingly recognized as potential biomarkers of the epileptogenic brain. Their role in human cognition has been predominantly studied in classical gamma frequencies (30-100 Hz), which reflect neuronal network coordination involved in attention, learning and memory. Invasive brain recordings in animals and humans demonstrate that physiological oscillations...
-
Direct brain stimulation modulates encoding states and memory performance in humans
Publication: People often forget information because they fail to effectively encode it. Here, we test the hypothesis that targeted electrical stimulation can modulate neural encoding states and subsequent memory outcomes. Using recordings from neurosurgical epilepsy patients with intracranially implanted electrodes, we trained multivariate classifiers to discriminate spectral activity during learning that predicted remembering from forgetting,...
-
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
Publication: A concept and practical implementation of a crowd management system that acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in a doorway. Optical flow combined with soft computing methods (a neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...
-
Automated Detection of Sleep Apnea and Hypopnea Events Based on Robust Airflow Envelope Tracking in the Presence of Breathing Artifacts. - [IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS]
Publication: The paper presents a new approach to detection of apnea/hypopnea events, in the presence of artifacts and breathing irregularities, from a single-channel airflow record. The proposed algorithm, based on a robust envelope detector, identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope, free of breathing artifacts, improves effectiveness...
-
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
Publication: The work presents a method for delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for the tests described. Test results show that the automatic determination...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
Publication: This paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech
Publication: In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...
-
Sound quality metrics applied to road noise evaluation
Publication: Road noise monitoring systems typically measure sound levels in specific time periods. A more insightful approach is to also measure the nature of the noise. The sound quality of sounds such as car noise can be objectively evaluated using several parameters. One of them is psychoacoustic annoyance, described by loudness, tone color, and the temporal structure of sound. In this paper the assessment of several sound quality parameters, such...
-
Database of speech and facial expressions recorded with optimized face motion capture settings
Publication: The broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied by facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...
-
Feasibility Study for Food Intake Tasks Recognition Based on Smart Glasses
Publication: In this exploratory study, 13 adult test subjects performed different food intake tasks while wearing a three-axis accelerometer mounted on the temple of a pair of glasses. Two different algorithms for task recognition were applied and compared. Retrospective data processing leads to better task recognition results when the frequency range of 50 Hz to 100 Hz is analysed within the accelerometer signal recordings. A straightforward...
-
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication: The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work first reviews selected methods for automatic audio mixing. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...
-
Detection of Water on Road Surface with Acoustic Vector Sensor
Publication: This paper presents a new approach to detecting the presence of water on a road surface, employing an acoustic vector sensor. The proposed method is based on sound intensity analysis in the frequency domain. Acoustic events, representing road vehicles, are detected in the sound intensity signals. The direction of the incoming sound is calculated for the individual spectral components of the intensity signal, and the components...
-
Emotion Recognition
Research data: The films presented here were recorded using a Phantom Miro high-speed camera. To play a movie, you need special software, which can be downloaded from the website (https://www.phantomhighspeed.com/resourcesandsupport/phantomresources/pccsoftware). The details of the movie are available after starting the movie in the viewer, in the description...
-
Low-Power WSN System for Honey Bee Monitoring
Publication: The paper presents a universal low-power system for biosensory data acquisition for bee monitoring. We describe the architecture of the system and its energy-saving components, and discuss the selection of sensors. The work focuses on energy optimization of wireless communication. A custom protocol was implemented, which is the basis for the presented energy-efficient devices. Data exchange process during...
-
Human subarachnoid space width oscillations in the resting state
Publication: Abnormal cerebrospinal fluid (CSF) pulsatility has been implicated in patients suffering from various diseases, including multiple sclerosis and hypertension. CSF pulsatility results in subarachnoid space (SAS) width changes, which can be measured with near-infrared transillumination backscattering sounding (NIR-T/BSS). The aim of this study was to combine NIR-T/BSS and wavelet analysis methods to characterise the dynamics of the...