Filters
total: 377
filtered: 147
Chosen catalog filters
Search results for: FIELD RECORDINGS
-
Automatic Analysis System of TV Commercial Emission Level
PublicationThe purpose of the study was to determine whether the commercial emission level is higher than the emission level of a regular program and to check if the commercials broadcasters follow the recommended levels of loudness. The paper shortly reviews some chosen methods of volume measurements specified in the ITU and EBU recommendations. Then, it describes a prototype of a system implemented in Embarcadero C++ Builder 2010 which...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
PublicationThe aim of the work is to analyze Lombard speech effect in recordings and then modify the speech signal in order to obtain an increase in the improvement of objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
Automatic Singing Voice Recognition EmployingNeural Networks and Rough Sets
PublicationCelem badań jest automatyczne rozpoznawanie głosów śpiewaczych w kategorii rodzaju i jakości technicznej śpiewu. W artykule opisano stworzoną bazę danych głosów, która zawiera próbki głosu śpiewaków profesjonalnych i amatorskich. W dalszej części opisano parametry zdefiniowane w oparciu o zjawiska biomechaniczne w narządzie głosu podczas śpiewania. W oparciu o stworzone macierze parametrów wytrenowano i porównano automatyczne klasyfikatory...
-
Entropy Measures in the Assessment of Heart Rate Variability in Patients with Cardiodepressive Vasovagal Syncope
PublicationSample entropy (SampEn) was reported to be useful in the assessment of the complexity of heart rate dynamics. Permutation entropy (PermEn) is a new measure based on the concept of order and was previously shown to be accurate for short, non-stationary datasets. The aim of the present study is to assess if SampEn and PermEn obtained from baseline recordings might differentiate patients with various outcomes of the head-up tilt test...
-
Detection of impulsive disturbances in archive audio signals
PublicationIn this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors, allow one to noticeably improve detection results compared to the prediction-only based solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...
-
RENOVATION OF ARCHIVE AUDIO RECORDINGS USING SPARSE AUTOREGRESSIVE MODELING AND BIDIRECTIONAL PROCESSING
PublicationThe paper presents a new approach to elimination of broadband noise and impulsive disturbances from archive audio recordings. The proposed adaptive Kalman-like algorithm, based on a sparse autoregressive model of the audio signal, simultaneously detects noise pulses, interpolates the irrevocably distorted samples and performs signal smoothing. It is shown that bidirectional (forward-backward) processing of the archive signal improves...
-
Selection of Features for Multimodal Vocalic Segments Classification
PublicationEnglish speech recognition experiments are presented employing both: audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction for the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...
-
Online sound restoration system for digital library applications.
PublicationAudio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Jannsen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Detecting Lombard Speech Using Deep Learning Approach
PublicationRobust Lombard speech-in-noise detecting is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
PublicationNew signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Jannsen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
STEADY STATE VISUALLY EVOKED POTENTIALS FOR BRAIN COMPUTER INTERFACE
PublicationAn experiment conducted to validate a possibility of use a single active electrode EEG device for detecting Steady State Visually Evoked Potentials (SSVEP) is shown. A LED stimulator was applied to stimulate patients with two different frequencies - 13 Hz and 17 Hz. First, EEG signals were recorded and pre-processed using MATLAB software. In the next step recordings were analysed and classified employing the WEKA software. As indicated...
-
Face detection algorithms evaluation for the bank client verification
PublicationResults of investigation of face detection algorithms in the video sequences are presented in the paper. The recordings were made with a miniature industrial USB camera in real conditions met in three bank operating rooms. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide as much as possible user interaction...
-
Expert System and Decision Support System for Electrocardiogram Interpretation and Diagnosis: Review, Challenges and Research Directions
PublicationElectrocardiography (ECG) is one of the most widely used recordings in clinical medicine. ECG deals with the recording of electrical activity that is generated by the heart through the surface of the body. The electrical activity generated by the heart is measured using electrodes that are attached to the body surface. The use of ECG in the diagnosis and management of cardiovascular disease (CVD) has been in existence for over...
-
Educational Dataset of Handheld Doppler Blood Flow Recordings
PublicationVital signals registration plays a significant role in biomedical engineering and education process. Well acquired data allow future engineers to observe certain physical phenomena as well learn how to correctly process and interpret the data. This dataset was designed for students to learn about Doppler phenomena and to demonstrate correctly and incorrectly acquired signals as well as the basic methods of signal processing. This...
-
Elimination of impulsive disturbances from archive audio files – comparison of three noise pulse detection schemes
PublicationThe problem of elimination of impulsive disturbances (such as clicks, pops, ticks, crackles, and record scratches) from archive audio recordings is considered and solved using autoregressive modeling. Three classical noise pulse detection schemes are examined and compared: the approach based on open-loop multi-step-ahead signal prediction, the approach based on decision-feedback signal prediction, and the double threshold approach,...
-
Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances
PublicationArchive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...
-
Multimodal English corpus for automatic speech recognition
PublicationA multimodal corpus developed for research of speech recognition based on audio-visual data is presented. Besides usual video and sound excerpts, the prepared database contains also thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus enables to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Evaluation of Face Detection Algorithms for the Bank Client Identity Verification
PublicationResults of investigation of face detection algorithms efficiency in the banking client visual verification system are presented. The video recordings were made in real conditions met in three bank operating outlets employing a miniature industrial USB camera. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide...
-
Differentiating patients with obstructive sleep apnea from healthy controls based on heart rate-blood pressure coupling quantified by entropy-based indices
PublicationWe introduce an entropy-based classification method for pairs of sequences (ECPS) for quantifying mutual dependencies in heart rate and beat-to-beat blood pressure recordings. The purpose of the method is to build a classifier for data in which each item consists of two intertwined data series taken for each subject. The method is based on ordinal patterns and uses entropy-like indices. Machine learning is used to select a subset...
-
Evidence for consolidation of neuronal assemblies after seizures in humans
PublicationThe establishment of memories involves reactivation of waking neuronal activity patterns and strengthening of associated neural circuits during slow-wave sleep (SWS), a process known as "cellular consolidation" (Dudai and Morris, 2013). Reactivation of neural activity patterns during waking behaviors that occurs on a timescale of seconds to minutes is thought to constitute memory recall (O'Keefe and Nadel, 1978), whereas consolidation...
-
Comparison of sound of organ pipes in contemporary and historical instruments
PublicationThe aim of this research is to examine the differences in the timbre of organ pipes’ sound between a historical and a contemporary organ instrument. The historical instrument is the Oliwa organ from Gdansk, Poland, and the contemporary one is from Kartuzy, Poland. Recordings are made of single notes played by an open labial pipe that belongs to the Principal rank. The analyses and comparison of several sound features compatible...
-
Automated detection of sleep apnea and hypopnea events based on robust airflow envelope tracking
PublicationThe paper presents a new approach to detection of apnea/hypopnea events, in the presence of artifacts and breathing irregularities, from a single-channel airflow record. The proposed algorithm identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope—free of breathing artifacts—improves effectiveness of the diagnostic process and allows...
-
A study on of music features derived from audio recordings examples – a quantitative analysis
PublicationThe paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician, whose style evolved over time. Firstly, the origin and the background of the division of music genres were shortly presented. Then, several objective parameters of an audio signal were recalled that...
-
A commonly-accessible toolchain for live streaming music events with higher-order ambisonic audio and 4k 360 vision
PublicationAn immersive live stream is especially interesting in the ongoing development of telepresence tools, especially in the virtual reality (VR) or mixed reality (MR) domain. This paper explores the remote and immersive way of enabling telepresence for the audience to high-fidelity music performance using freely-available and easily-accessible tools. A functional VR live-streaming toolchain, comprising 360 vision and higher-order ambisonic...
-
Constructing a Dataset of Speech Recordingswith Lombard Effect
PublicationThepurpose of therecordings was to create a speech corpus based on the ISLEdataset, extended with video and Lombard speech. Selected from a set of 165sentences, 10, evaluatedas having thehighest possibility to occur in the context ofthe Lombard effect,were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether,15speakers were recorded, and speech parameterswere...
-
An Approach to the Detection of Bank Robbery Acts Employing Thermal Image Analysis
PublicationA novel approach to the detection of selected security-related events in bank monitoring systems is presented. Thermal camera images are used for the detection of people in difficult lighting conditions. Next, the algorithm analyses movement of objects detected in thermal or standard monitoring cameras using a method evolved from the motion history images algorithm. At the same time, thermal images are analyzed in order to detect...
-
A detector of sleep disorders for using at home
PublicationObstructive sleep apnea usually requires all-ni ght examination in a specialized clinic, under the supervision of a medical staff. Because of those requirements it is an expensive and a non-widely utilized test. Moving the examination procedure to patients’ home with automatic analysis algorithms involved will decrease the costs and make it available for larger group of patients. The developed device allows all-night recordings...
-
Measuring Pulse Rate with a Webcam
PublicationIn this paper a simple method of measuring the pulse rate is presented. Elaborated algorithm allows for efficient pulse rate registration directly from face images captured from a webcam. The desired signal is obtained by proper channel selection and principal component analysis. To determine the accuracy of the method an ECG signal is collected together with a video recordings. The effectiveness of the algorithm is considered...
-
Texture Features for the Detection of Playback Attacks: Towards a Robust Solution
PublicationThis paper describes the new version of a method that is capable of protecting automatic speaker verification (ASV) systems from playback attacks. The presented approach uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. Our goal is to make the algorithm independent from the contents of the training set as much as possible; we look for the...
-
Localization of impulsive disturbances in archive audio signals using predictive matched filtering
PublicationThe problem of elimination of impulsive disturbances from archive audio signals is considered and its new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
PublicationThe purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Driving Performance Indicators of Electric Bus Driving Technique: Naturalistic Driving Data Multicriterial Analysis
PublicationThe issue of electric energy saving in public transport is becoming the key area of interest. By improving of driving techniques and the implementation of eco-driving, it is possible to save electric energy. Systems that help to decrease energy consumption and to reduce fuel emissions are becoming popular in vehicles powered by diesel engines. However, these methods have not yet gained popularity in electric vehicles. Therefore,...
-
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
PublicationThe paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict resolving extensions for the object tracking. In order to allow an effective event...
-
Exploring music listening patterns: an online survey
PublicationAn online survey was carried out to explore how respondents listen to music recordings. It was anticipated that the listener’s preferences would be influenced by various factors, such as age, music genre, the contexts in which they listen, and their favored methods of music consumption. Consequently, the data were collected to analyze these relationships. The survey, structured as a web application, encompassed 23 questions,...
-
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
PublicationAn allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...
-
Closed-loop stimulation of temporal cortex rescues functional networks and improves memory
PublicationMemory failures are frustrating and often the result of ineffective encoding. One approach to improving memory outcomes is through direct modulation of brain activity with electrical stimulation. Previous efforts, however, have reported inconsistent effects when using open-loop stimulation and often target the hippocampus and medial temporal lobes. Here we use a closed-loop system to monitor and decode neural activity from direct...
-
Biometric identity verification
PublicationThis chapter discusses methods which are capable of protecting automatic speaker verification systems (ASV) from playback attacks. Additionally, it presents a new approach, which uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. We show that in this case training the system with large amounts of spectrogram patches may be difficult, and...
-
Facial emotion recognition using depth data
PublicationIn this paper an original approach is presented for facial expression and emotion recognition based only on depth channel from Microsoft Kinect sensor. The emotional user model contains nine emotions including the neutral one. The proposed recognition algorithm uses local movements detection within the face area in order to recognize actual facial expression. This approach has been validated on Facial Expressions and Emotions Database...
-
Sparse autoregressive modeling
PublicationIn the paper the comparison of the popular pitch determination (PD) algorithms for thepurpose of elimination of clicks from archive audio signals using sparse autoregressive (SAR)modeling is presented. The SAR signal representation has been widely used in code-excitedlinear prediction (CELP) systems. The appropriate construction of the SAR model is requiredto guarantee model stability. For this reason the signal representation...
-
Comparative Study of Self-Organizing Maps vs. Subjective Evaluation of Quality of Allophone Pronunciation for Nonnative English Speakers
PublicationThe purpose of this study was to apply Self-Organizing Maps to differentiate between the correct and the incorrect allophone pronunciations and to compare the results with subjective evaluation. Recordings of a list of target words, containing selected allophones of English plosive consonants, the velar nasal and the lateral consonant, were made twice. First, the target words were read from the list by 9 non-native speakers and...
-
Applications for investigating therapy progress of autistic children
PublicationThe paper regards supporting behavioral therapy of autistic children with mobile applications, specifically applied for measuring the child’s progress. A family of five applications is presented, that was developed as an investigation tool within the project aimed at automation of therapy progress monitoring. The applications were already tested with children with autism spectrum disorder. Hereby we analyse children’ experience...
-
Ripple oscillations in the left temporal neocortex are associated with impaired verbal episodic memory encoding
PublicationBACKGROUND: We sought to determine if ripple oscillations (80-120 Hz), detected in intracranial electroencephalogram (iEEG) recordings of patients with epilepsy, correlate with an enhancement or disruption of verbal episodic memory encoding. METHODS: We defined ripple and spike events in depth iEEG recordings during list learning in 107 patients with focal epilepsy. We used logistic regression models (LRMs) to investigate the...
-
Systematic approach to binary classification of images in video streams using shifting time windows
Publicationin the paper, after pointing out of realistic recordings and classifications of their frames, we propose a new shifting time window approach for improving binary classifications. We consider image classification in tewo steps. in the first one the well known binary classification algorithms are used for each image separately. In the second step the results of the previous step mare analysed in relatively short sequences of consecutive...
-
Is This Distance Teaching Planning That Bad?
PublicationIn spring 2020, university courses were moved into the virtual space due to the Covid-19 lockdown. In this paper, we use experience from courses at Gdańsk University of Technology and ETH Zurich to identify core problems in distance teaching planning and to discuss what to do and what not to do in teaching planning after the pandemic. We conclude that we will not return to the state of (teaching) affairs that we had previously....
-
Robot Eye Perspective in Perceiving Facial Expressions in Interaction with Children with Autism
PublicationThe paper concerns automatic facial expression analysis applied in a study of natural “in the wild” interaction between children with autism and a social robot. The paper reports a study that analyzed the recordings captured via a camera located in the eye of a robot. Children with autism exhibit a diverse level of deficits, including ones in social interaction and emotional expression. The aim of the study was to explore the possibility...
-
AN ALGORITHM FOR PORTAL HYPERTENSIVE GASTROPATHY RECOGNITION ON THE ENDOSCOPIC RECORDINGS
PublicationSymptoms recognition of portal hypertensive gastropathy (PHG) can be done by analysing endoscopic recordings, but manual analysis done by physician may take a long time. This increases probability of missing some symptoms and automated methods may be applied to prevent that. In this paper a novel hybrid algorithm for recognition of early stage of portal hypertensive gastropathy is proposed. First image preprocessing is described....
-
Detection of Face Position and Orientation Using Depth Data
PublicationIn this paper an original approach is presented for real-time detection of user's face position and orientation based only on depth channel from a Microsoft Kinect sensor which can be used in facial analysis on scenes with poor lighting conditions where traditional algorithms based on optical channel may have failed. Thus the proposed approach can support, or even replace, algorithms based on optical channel or based on skeleton...
-
Cross-domain applications of multimodal human-computer interfaces
PublicationDeveloped multimodal interfaces for education applications and for disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and audio interface for speech stretching for hearing impaired and stuttering people and intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...
-
High frequency oscillations are associated with cognitive processing in human recognition memory
PublicationHigh frequency oscillations are associated with normal brain function, but also increasingly recognized as potential biomarkers of the epileptogenic brain. Their role in human cognition has been predominantly studied in classical gamma frequencies (30-100 Hz), which reflect neuronal network coordination involved in attention, learning and memory. Invasive brain recordings in animals and humans demonstrate that physiological oscillations...
-
Direct brain stimulation modulates encoding states and memory performance in humans
PublicationPeople often forget information because they fail to effectively encode it. Here, we test the hypothesis that targeted electrical stimulation can modulate neural encoding states and subsequent memory outcomes. Using recordings from neurosurgical epilepsy patients with intracranially implanted electrodes, we trained multivariate classifiers to discriminate spectral activity during learning that predicted remembering from forgetting,...