total: 375
Search results for: FIELD RECORDINGS
-
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
PublicationAn allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...
-
Sparse autoregressive modeling
PublicationIn the paper the comparison of the popular pitch determination (PD) algorithms for the purpose of elimination of clicks from archive audio signals using sparse autoregressive (SAR) modeling is presented. The SAR signal representation has been widely used in code-excited linear prediction (CELP) systems. The appropriate construction of the SAR model is required to guarantee model stability. For this reason the signal representation...
-
Facial emotion recognition using depth data
PublicationIn this paper an original approach is presented for facial expression and emotion recognition based only on depth channel from Microsoft Kinect sensor. The emotional user model contains nine emotions including the neutral one. The proposed recognition algorithm uses local movements detection within the face area in order to recognize actual facial expression. This approach has been validated on Facial Expressions and Emotions Database...
-
Biometric identity verification
PublicationThis chapter discusses methods which are capable of protecting automatic speaker verification systems (ASV) from playback attacks. Additionally, it presents a new approach, which uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. We show that in this case training the system with large amounts of spectrogram patches may be difficult, and...
-
Closed-loop stimulation of temporal cortex rescues functional networks and improves memory
PublicationMemory failures are frustrating and often the result of ineffective encoding. One approach to improving memory outcomes is through direct modulation of brain activity with electrical stimulation. Previous efforts, however, have reported inconsistent effects when using open-loop stimulation and often target the hippocampus and medial temporal lobes. Here we use a closed-loop system to monitor and decode neural activity from direct...
-
Applications for investigating therapy progress of autistic children
PublicationThe paper regards supporting behavioral therapy of autistic children with mobile applications, specifically applied for measuring the child’s progress. A family of five applications is presented, developed as an investigation tool within a project aimed at automating therapy progress monitoring. The applications have already been tested with children with autism spectrum disorder. Hereby we analyse children’s experience...
-
Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers
PublicationThe purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
-
Polish expressways 2016 - video data
Open Research DataPolish expressways 2016 - video data
-
Ripple oscillations in the left temporal neocortex are associated with impaired verbal episodic memory encoding
PublicationBACKGROUND: We sought to determine if ripple oscillations (80-120 Hz), detected in intracranial electroencephalogram (iEEG) recordings of patients with epilepsy, correlate with an enhancement or disruption of verbal episodic memory encoding. METHODS: We defined ripple and spike events in depth iEEG recordings during list learning in 107 patients with focal epilepsy. We used logistic regression models (LRMs) to investigate the...
-
Polish voivodeship roads 2016 - video data
Open Research DataPolish voivodeship roads 2016 - video data
-
AN ALGORITHM FOR PORTAL HYPERTENSIVE GASTROPATHY RECOGNITION ON THE ENDOSCOPIC RECORDINGS
PublicationSymptoms of portal hypertensive gastropathy (PHG) can be recognized by analysing endoscopic recordings, but manual analysis by a physician may take a long time. This increases the probability of missing some symptoms, and automated methods may be applied to prevent that. In this paper a novel hybrid algorithm for recognition of the early stage of portal hypertensive gastropathy is proposed. First, image preprocessing is described....
-
Detection of Face Position and Orientation Using Depth Data
PublicationIn this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...
-
Cross-domain applications of multimodal human-computer interfaces
PublicationDeveloped multimodal interfaces for education applications and for disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and audio interface for speech stretching for hearing impaired and stuttering people and intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...
-
Systematic approach to binary classification of images in video streams using shifting time windows
PublicationIn the paper, after pointing out of realistic recordings and classifications of their frames, we propose a new shifting time window approach for improving binary classifications. We consider image classification in two steps. In the first one the well-known binary classification algorithms are used for each image separately. In the second step the results of the previous step are analysed in relatively short sequences of consecutive...
-
Robot Eye Perspective in Perceiving Facial Expressions in Interaction with Children with Autism
PublicationThe paper concerns automatic facial expression analysis applied in a study of natural “in the wild” interaction between children with autism and a social robot. The paper reports a study that analyzed the recordings captured via a camera located in the eye of a robot. Children with autism exhibit a diverse level of deficits, including ones in social interaction and emotional expression. The aim of the study was to explore the possibility...
-
Is This Distance Teaching Planning That Bad?
PublicationIn spring 2020, university courses were moved into the virtual space due to the Covid-19 lockdown. In this paper, we use experience from courses at Gdańsk University of Technology and ETH Zurich to identify core problems in distance teaching planning and to discuss what to do and what not to do in teaching planning after the pandemic. We conclude that we will not return to the state of (teaching) affairs that we had previously....
-
Polish national roads 2016 - video data
Open Research DataThe data includes video traffic data registered with a video camera installed inside a car. The purpose of the research was to gather vehicle traffic recordings in real conditions on Polish national roads.
-
High frequency oscillations are associated with cognitive processing in human recognition memory
PublicationHigh frequency oscillations are associated with normal brain function, but also increasingly recognized as potential biomarkers of the epileptogenic brain. Their role in human cognition has been predominantly studied in classical gamma frequencies (30-100 Hz), which reflect neuronal network coordination involved in attention, learning and memory. Invasive brain recordings in animals and humans demonstrate that physiological oscillations...
-
Automated Detection of Sleep Apnea and Hypopnea Events Based on Robust Airflow Envelope Tracking in the Presence of Breathing Artifacts. - [IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS]
PublicationThe paper presents a new approach to detection of apnea/hypopnea events, in the presence of artifacts and breathing irregularities, from a single channel airflow record. The proposed algorithm, based on a robust envelope detector, identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope - free of breathing artifacts - improves effectiveness...
-
Behavior Analysis and Dynamic Crowd Management in Video Surveillance System
PublicationA concept and practical implementation of a crowd management system which acquires input data from a set of monitoring cameras is presented. Two leading threads are considered. The first concerns crowd behavior analysis. The second focuses on detection of hold-ups in the doorway. Optical flow combined with soft computing methods (neural network) is employed to evaluate the type of crowd behavior, and fuzzy logic aids detection...
-
Improving methods for detecting people in video recordings using shifting time-windows
PublicationWe propose a novel method for improving algorithms which detect the presence of people in video sequences. Our focus is on algorithms for applications which require reporting and analyzing all scenes with detected people in long recordings. Therefore one of the target qualities of the classification result is its stability, understood as a low number of invalid scene boundaries. Many existing methods process images in the recording...
-
Automatic Marking of Allophone Boundaries in Isolated English spoken Words
PublicationThe work presents a method that allows delimiting the borders of allophones in isolated English words. The described method is based on the DTW algorithm combining two signals, a reference signal and an analyzed one. As the reference signal, recordings from the MODALITY database were used, from which the words were extracted. This database was also used for tests, which were described. Test results show that the automatic determination...
-
Ranking Speech Features for Their Usage in Singing Emotion Classification
PublicationThis paper aims to retrieve speech descriptors that may be useful for the classification of emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected Low-Level MPEG 7 descriptors were calculated based on the RAVDESS dataset. The database contains recordings of emotional speech and singing of professional actors presenting six different emotions. Employing the algorithm of Feature Selection based...
-
Cross-Lingual Knowledge Distillation via Flow-Based Voice Conversion for Robust Polyglot Text-to-Speech
PublicationIn this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic features and durations...
-
Direct brain stimulation modulates encoding states and memory performance in humans
PublicationPeople often forget information because they fail to effectively encode it. Here, we test the hypothesis that targeted electrical stimulation can modulate neural encoding states and subsequent memory outcomes. Using recordings from neurosurgical epilepsy patients with intracranially implanted electrodes, we trained multivariate classifiers to discriminate spectral activity during learning that predicted remembering from forgetting,...
-
Detection of Water on Road Surface with Acoustic Vector Sensor
PublicationThis paper presents a new approach to detecting the presence of water on a road surface, employing an acoustic vector sensor. The proposed method is based on sound intensity analysis in the frequency domain. Acoustic events, representing road vehicles, are detected in the sound intensity signals. The direction of the incoming sound is calculated for the individual spectral components of the intensity signal, and the components...
-
Feasibility Study for Food Intake Tasks Recognition Based on Smart Glasses
PublicationIn this exploratory study 13 adult test subjects have performed different food intake tasks while wearing a three axis accelerometer mounted at a temple of glasses. Two different algorithms for task recognition have been applied and compared. The retrospective data processing leads to better task recognition results when the frequency range of 50 Hz to 100 Hz is analysed within accelerometer signal recordings. A straightforward...
-
Database of speech and facial expressions recorded with optimized face motion capture settings
PublicationThe broad objective of the present research is the analysis of spoken English employing a multiplicity of modalities. An important stage of this process, discussed in the paper, is creating a database of speech accompanied with facial expressions. Recordings of speakers were made using an advanced system for capturing facial muscle motion. A brief historical outline, current applications, limitations and the ways of capturing face...
-
Sound quality metrics applied to road noise evaluation
PublicationRoad noise monitoring systems typically measure sound levels in specific time periods. A more insightful approach is to measure the nature of the noise as well. The sound quality of sounds such as car noise can be objectively evaluated by several parameters. One of them is psychoacoustic annoyance, described by loudness, tone color, and the temporal structure of sound. In this paper the assessment of several sound quality parameters, such...
-
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
PublicationThe purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...
-
Emotion Recognition
Open Research DataThe films presented here were recorded using a so-called high-speed camera, the Phantom Miro. To play the movie, you need special software, which can be downloaded from the website https://www.phantomhighspeed.com/resourcesandsupport/phantomresources/pccsoftware. The details of the movie are available after starting the movie in the viewer in the description...
-
Emotion Recognition
Open Research DataThe films presented here were recorded using a so-called high-speed camera, the Phantom Miro. To play the movie, you need special software, which can be downloaded from the website https://www.phantomhighspeed.com/resourcesandsupport/phantomresources/pccsoftware. The details of the movie are available after starting the movie in the viewer in the description...
-
Low-Power WSN System for Honey Bee Monitoring
PublicationThe paper presents a universal low-power system for biosensory data acquisition for bee monitoring. We describe the architecture of the system and its energy-saving components, and we discuss the selection of sensors. The work focuses on energy optimization of wireless communication. A custom protocol was implemented, which is the basis for the presented energy-efficient devices. Data exchange process during...
-
Human subarachnoid space width oscillations in the resting state
PublicationAbnormal cerebrospinal fluid (CSF) pulsatility has been implicated in patients suffering from various diseases, including multiple sclerosis and hypertension. CSF pulsatility results in subarachnoid space (SAS) width changes, which can be measured with near-infrared transillumination backscattering sounding (NIR-T/BSS). The aim of this study was to combine NIR-T/BSS and wavelet analysis methods to characterise the dynamics of the...
-
DevEmo—Software Developers’ Facial Expression Dataset
PublicationThe COVID-19 pandemic has increased the relevance of remote activities and digital tools for education, work, and other aspects of daily life. This reality has highlighted the need for emotion recognition technology to better understand the emotions of computer users and provide support in remote environments. Emotion recognition can play a critical role in improving the remote experience and ensuring that individuals are able...
-
Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones
PublicationThis paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...
-
Labeler-hot Detection of EEG Epileptic Transients
PublicationPreventing early progression of epilepsy and so the severity of seizures requires effective diagnosis. Epileptic transients indicate the ability to develop seizures but humans overlook such brief events in an electroencephalogram (EEG), which compromises patient treatment. Traditionally, training of the EEG event detection algorithms has relied on ground truth labels, obtained from the consensus...
-
Doppler blood flow recordings
Open Research DataVital signals registration plays a great role in biomedical engineering and the education process. Well-acquired data allow future engineers to observe certain physical phenomena as well as learn how to correctly process and interpret the data. This data set was designed for students to learn about Doppler phenomena and to demonstrate correctly and incorrectly...
-
Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
PublicationThe aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...
-
Noise profiling for speech enhancement employing machine learning models
PublicationThis paper aims to propose a noise profiling method that can be performed in near real-time based on machine learning (ML). To address challenges related to noise profiling effectively, we start with a critical review of the literature background. Then, we outline the experiment performed consisting of two parts. The first part concerns the noise recognition model built upon several baseline classifiers and noise signal features...
-
Machine learning applied to acoustic-based road traffic monitoring
PublicationThe motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...
-
Estimation of nonstructural stiffness in instrumented steel frames
PublicationLateral stiffness of nonstructural components may significantly influence the initial stiffness of the entire structure and consequently alter its dynamic characteristics. While methods for simulating structural members are well-established, approaches for modeling nonstructural components that also participate in seismic response are notably less developed. In this paper a simplified, physically-intuitive approach for estimating...
-
Machine learning applied to acoustic-based road traffic monitoring
PublicationThe motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. Bases of the operation of the acoustic road traffic...
-
Improving the quality of speech in the conditions of noise and interference
PublicationThe aim of the work is to present a method of intelligent modification of the speech signal with speech features expressed in noise, based on the Lombard effect. The recordings utilized sets of words and sentences as well as disturbing signals, i.e., pink noise and the so-called babble speech. Noise signal, calibrated to various levels at the speaker's ears, was played over two loudspeakers located 2 m away from the speaker. In...
-
Language material for English audiovisual speech recognition system development. Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
PublicationThe bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training database of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
-
Identification of acoustic event of selected noise sources in a long-term environmental monitoring systems
PublicationUndertaking long-term acoustic measurements on sites located near an airport involves the problem of large quantities of recorded data, which very often represent information not related to flight operations. In such areas, usually defined as zones of limited use, other sources of noise often exist, such as roads or railway lines, treated in such a context as acoustic background. Manual verification of such recorded data...
-
Entropic Measures of Complexity of Short-Term Dynamics of Nocturnal Heartbeats in an Aging Population
PublicationTwo entropy-based approaches are investigated to study patterns describing differences in time intervals between consecutive heartbeats. The first method explores matrices arising from networks of transitions constructed following events represented by a time series. The second method considers distributions of ordinal patterns of length three, whereby patterns with repeated values are counted as different patterns. Both methods provide...
-
Detecting coupling directions with transcript mutual information: A comparative study
PublicationCausal relationships are important to understand the dynamics of coupled processes and, moreover, to influence or control the effects by acting on the causes. Among the different approaches to determine cause-effect relationships and, in particular, coupling directions in interacting random or deterministic processes, we focus in this paper on information-theoretic measures. So, we study in the theoretical part the difference between...
-
The shallow sea experiment with usage of linear hydrophone array
PublicationThe purpose of this article is to present a designed and built linear hydrophone array and the results obtained during in situ trials in the Gulf of Gdańsk. The measuring system allowed hydrophones to be localized at selected points and measurements to be performed with the antenna positioned both horizontally and vertically. Recordings made in this way allow accurate 3D imaging of sound intensity/propagation to be created. During research three floating objects...
-
Objectivization of phonological evaluation of speech elements by means of audio parametrization
PublicationThis study addresses two issues related to both machine- and subjective-based speech evaluation by investigating five phonological phenomena related to allophone production. Its aim is to use objective parametrization and phonological classification of the recorded allophones. These allophones were selected as specifically difficult for Polish speakers of English: aspiration, final obstruent devoicing, dark lateral /l/, velar nasal...