Total: 377
Search results for: FIELD RECORDINGS
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S4
Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S2
Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S5
Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S3
Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 17 - SEQUENCE S6
Research Data: The MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication: Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....
-
Effects of Column Base Flexibility on Seismic Response of Steel Moment-Frame Buildings
Publication: Steel Moment Resisting Frames (SMRFs) are very popular lateral load resisting systems in many seismically active regions. However, their seismic response is strongly dependent on the rotational fixity of column base connections. Despite many studies (both experimental and numerical) in this particular area, available approaches for estimating column base flexibility have been validated only against laboratory test data. In the...
-
APPLICATION OF ENTROPY-BASED METHODS TO DISTINGUISH HEALTHY INDIVIDUALS WITH NORMAL SINUS RHYTHM FROM PATIENTS WITH CONGESTIVE HEART FAILURE
Publication: In this paper, we examined whether entropy-based methods are able to differentiate healthy individuals from patients with congestive heart failure. To this end, we applied two methods: Permutation Entropy and Block Entropy. Long-term ECG recordings (75 000 RR intervals) were analyzed. The results proved that both methods can distinguish those groups, provided that the parameters are appropriately chosen.
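As an illustration of the first method named in this abstract, the following is a minimal sketch of permutation entropy in its standard ordinal-pattern form. It is not the authors' implementation: the function name, the default `order` and `delay` values, and the normalization are assumptions made for the example, and Block Entropy is not covered.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (in bits) of the ordinal patterns found in `series`.

    `order` and `delay` are illustrative defaults, not values from the paper.
    """
    n_patterns = len(series) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("series too short for the chosen order and delay")

    counts = Counter()
    for start in range(n_patterns):
        window = [series[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window;
        # ties are broken by position because sorted() is stable.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1

    entropy = sum(-(c / n_patterns) * math.log2(c / n_patterns)
                  for c in counts.values())
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


if __name__ == "__main__":
    # Short made-up series: 4 ascending and 2 descending neighbour pairs.
    print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2))
```

For RR-interval data, `series` would be the list of successive RR intervals. The result depends on `order` and `delay`, which is consistent with the abstract's remark that the parameters must be chosen appropriately.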
-
Video recordings of bees at entrance to hives
Research Data: Video recordings of bees at entrance to hives from 2017-04-22, 2017-04-23 and 2018-05-22. All recordings were made using a hand-held full HD camera (Samsung Galaxy S3) and encoded using H.264 video codec (Standard Baseline Profile for mov files from 2017, High Profile for mp4 files from 2018), 30 FPS and bit rate 14478 kb/s (mov files from 2017) or 16869 kb/s...
-
ALOFON corpus
Research Data: The ALOFON corpus is one of the multimodal databases of word recordings in English, available at http://www.modality-corpus.org/. The ALOFON corpus is oriented towards the recording of the speech equivalence variants. For this purpose, a total of 7 people who are native speakers of English or speak it with native-speaker fluency and a variety of Standard Southern British...
-
Evaluation Criteria for Affect-Annotated Databases
Publication: In this paper a set of comprehensive evaluation criteria for affect-annotated databases is proposed. These criteria can be used for evaluation of the quality of a database at the stage of its creation as well as for evaluation and comparison of existing databases. The usefulness of these criteria is demonstrated on several databases selected from the affective computing domain. The databases contain different kinds of data: video or still...
-
Rough Set-Based Classification of EEG Signals Related to Real and Imagery Motion
Publication: A rough set-based approach to classification of EEG signals registered while subjects were performing real and imagery motions is presented in the paper. The appropriate subset of EEG channels is selected, the recordings are segmented, and features are extracted, based on time-frequency decomposition of the signal. A rough set classifier is trained in several scenarios, comparing accuracy of classification for real and imagery motion....
-
Online sound restoration system for digital library applications
Publication: Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
FEEDB: A multimodal database of facial expressions and emotions
Publication: In this paper a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using a Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...
-
Automatic Analysis System of TV Commercial Emission Level
Publication: The purpose of the study was to determine whether the commercial emission level is higher than the emission level of a regular program and to check whether commercial broadcasters follow the recommended levels of loudness. The paper briefly reviews selected methods of volume measurement specified in the ITU and EBU recommendations. Then, it describes a prototype of a system implemented in Embarcadero C++ Builder 2010 which...
-
Analysis of Lombard speech using parameterization and the objective quality indicators in noise conditions
Publication: The aim of the work is to analyze the Lombard speech effect in recordings and then modify the speech signal in order to improve objective speech quality indicators after mixing the useful signal with noise or with an interfering signal. The modifications made to the signal are based on the characteristics of the Lombard speech, and in particular on the effect of increasing the fundamental frequency...
-
Automatic Singing Voice Recognition Employing Neural Networks and Rough Sets
Publication: The aim of the research is the automatic recognition of singing voices in terms of voice type and technical quality of singing. The paper describes the database of voices that was created, which contains voice samples of professional and amateur singers. It then describes parameters defined on the basis of biomechanical phenomena occurring in the vocal organ during singing. Based on the resulting parameter matrices, automatic classifiers were trained and compared...
-
Entropy Measures in the Assessment of Heart Rate Variability in Patients with Cardiodepressive Vasovagal Syncope
Publication: Sample entropy (SampEn) was reported to be useful in the assessment of the complexity of heart rate dynamics. Permutation entropy (PermEn) is a new measure based on the concept of order and was previously shown to be accurate for short, non-stationary datasets. The aim of the present study is to assess if SampEn and PermEn obtained from baseline recordings might differentiate patients with various outcomes of the head-up tilt test...
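For readers unfamiliar with SampEn, here is a minimal sketch of the usual textbook definition: the negative logarithm of the ratio between template matches of length m + 1 and of length m. It is not the code used in the study. The function name, the defaults `m=2` and `r=0.2` (tolerance as a fraction of the standard deviation), and the use of `<=` for the match test are assumptions; some implementations use a strict `<`.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy of `series` with template length m.

    The tolerance is r times the population standard deviation of the
    series (an assumed, commonly used convention).
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")

    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    tol = r * std

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        # and exclude self-matches by requiring i < j.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= tol
                       for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in theory; reported as infinity here
    return -math.log(a / b)


if __name__ == "__main__":
    # A perfectly periodic series is fully predictable, so SampEn is zero.
    print(sample_entropy([1, 2] * 10))
```

The double loop is quadratic in the series length, so this form suits short baseline recordings rather than long-term ones.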
-
Detection of impulsive disturbances in archive audio signals
Publication: In this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors allow one to noticeably improve detection results compared to the prediction-only based solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...
-
Selection of Features for Multimodal Vocalic Segments Classification
Publication: English speech recognition experiments are presented employing both audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction on the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...
-
RENOVATION OF ARCHIVE AUDIO RECORDINGS USING SPARSE AUTOREGRESSIVE MODELING AND BIDIRECTIONAL PROCESSING
Publication: The paper presents a new approach to elimination of broadband noise and impulsive disturbances from archive audio recordings. The proposed adaptive Kalman-like algorithm, based on a sparse autoregressive model of the audio signal, simultaneously detects noise pulses, interpolates the irrevocably distorted samples and performs signal smoothing. It is shown that bidirectional (forward-backward) processing of the archive signal improves...
-
Online sound restoration system for digital library applications.
Publication: Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
-
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication: New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...
-
Detecting Lombard Speech Using Deep Learning Approach
Publication: Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the background concerning the Lombard effect. Then, assumptions of the work performed for Lombard speech detection are outlined. The framework proposed combines convolutional neural networks...
-
STEADY STATE VISUALLY EVOKED POTENTIALS FOR BRAIN COMPUTER INTERFACE
Publication: An experiment conducted to validate the possibility of using a single active electrode EEG device for detecting Steady State Visually Evoked Potentials (SSVEP) is shown. An LED stimulator was applied to stimulate patients with two different frequencies - 13 Hz and 17 Hz. First, EEG signals were recorded and pre-processed using MATLAB software. In the next step recordings were analysed and classified employing the WEKA software. As indicated...
-
Face detection algorithms evaluation for the bank client verification
Publication: Results of an investigation of face detection algorithms in video sequences are presented in the paper. The recordings were made with a miniature industrial USB camera in real conditions met in three bank operating rooms. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide as much as possible user interaction...
-
Expert System and Decision Support System for Electrocardiogram Interpretation and Diagnosis: Review, Challenges and Research Directions
Publication: Electrocardiography (ECG) is one of the most widely used recordings in clinical medicine. ECG deals with the recording of electrical activity that is generated by the heart through the surface of the body. The electrical activity generated by the heart is measured using electrodes that are attached to the body surface. The use of ECG in the diagnosis and management of cardiovascular disease (CVD) has been in existence for over...
-
Elimination of impulsive disturbances from archive audio files – comparison of three noise pulse detection schemes
Publication: The problem of elimination of impulsive disturbances (such as clicks, pops, ticks, crackles, and record scratches) from archive audio recordings is considered and solved using autoregressive modeling. Three classical noise pulse detection schemes are examined and compared: the approach based on open-loop multi-step-ahead signal prediction, the approach based on decision-feedback signal prediction, and the double threshold approach,...
-
Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances
Publication: Archive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...
-
Educational Dataset of Handheld Doppler Blood Flow Recordings
Publication: Vital signals registration plays a significant role in biomedical engineering and education process. Well acquired data allow future engineers to observe certain physical phenomena as well as learn how to correctly process and interpret the data. This dataset was designed for students to learn about Doppler phenomena and to demonstrate correctly and incorrectly acquired signals as well as the basic methods of signal processing. This...
-
Multimodal English corpus for automatic speech recognition
Publication: A multimodal corpus developed for research of speech recognition based on audio-visual data is presented. Besides usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus makes it possible to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
-
Differentiating patients with obstructive sleep apnea from healthy controls based on heart rate-blood pressure coupling quantified by entropy-based indices
Publication: We introduce an entropy-based classification method for pairs of sequences (ECPS) for quantifying mutual dependencies in heart rate and beat-to-beat blood pressure recordings. The purpose of the method is to build a classifier for data in which each item consists of two intertwined data series taken for each subject. The method is based on ordinal patterns and uses entropy-like indices. Machine learning is used to select a subset...
-
Evaluation of Face Detection Algorithms for the Bank Client Identity Verification
Publication: Results of an investigation of the efficiency of face detection algorithms in the banking client visual verification system are presented. The video recordings were made in real conditions met in three bank operating outlets employing a miniature industrial USB camera. The aim of the experiments was to check the practical usability of the face detection method in the biometric bank client verification system. The main assumption was to provide...
-
Evidence for consolidation of neuronal assemblies after seizures in humans
Publication: The establishment of memories involves reactivation of waking neuronal activity patterns and strengthening of associated neural circuits during slow-wave sleep (SWS), a process known as "cellular consolidation" (Dudai and Morris, 2013). Reactivation of neural activity patterns during waking behaviors that occurs on a timescale of seconds to minutes is thought to constitute memory recall (O'Keefe and Nadel, 1978), whereas consolidation...
-
A study on of music features derived from audio recordings examples – a quantitative analysis
Publication: The paper presents a comparative study of music features derived from audio recordings, i.e. the same music pieces but representing different music genres, excerpts performed by different musicians, and songs performed by a musician whose style evolved over time. Firstly, the origin and the background of the division of music genres were briefly presented. Then, several objective parameters of an audio signal were recalled that...
-
Comparison of sound of organ pipes in contemporary and historical instruments
Publication: The aim of this research is to examine the differences in the timbre of organ pipes’ sound between a historical and a contemporary organ instrument. The historical instrument is the Oliwa organ from Gdansk, Poland, and the contemporary one is from Kartuzy, Poland. Recordings are made of single notes played by an open labial pipe that belongs to the Principal rank. The analyses and comparison of several sound features compatible...
-
A commonly-accessible toolchain for live streaming music events with higher-order ambisonic audio and 4k 360 vision
Publication: An immersive live stream is of particular interest in the ongoing development of telepresence tools, especially in the virtual reality (VR) or mixed reality (MR) domain. This paper explores the remote and immersive way of enabling telepresence for the audience to high-fidelity music performance using freely-available and easily-accessible tools. A functional VR live-streaming toolchain, comprising 360 vision and higher-order ambisonic...
-
Automated detection of sleep apnea and hypopnea events based on robust airflow envelope tracking
Publication: The paper presents a new approach to detection of apnea/hypopnea events, in the presence of artifacts and breathing irregularities, from a single-channel airflow record. The proposed algorithm identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope, free of breathing artifacts, improves effectiveness of the diagnostic process and allows...
-
Time frequency representation of Doppler blood flow recordings
Research Data: Vital signals registration plays a great role in biomedical engineering and education process. Well acquired data allow future engineers to observe certain physical phenomena as well as learn how to correctly process and interpret the data. This data set was designed for students to learn about Doppler phenomena and to demonstrate correctly and incorrectly...
-
Constructing a Dataset of Speech Recordings with Lombard Effect
Publication: The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were...
-
An Approach to the Detection of Bank Robbery Acts Employing Thermal Image Analysis
Publication: A novel approach to the detection of selected security-related events in bank monitoring systems is presented. Thermal camera images are used for the detection of people in difficult lighting conditions. Next, the algorithm analyses movement of objects detected in thermal or standard monitoring cameras using a method evolved from the motion history images algorithm. At the same time, thermal images are analyzed in order to detect...
-
A detector of sleep disorders for using at home
Publication: Obstructive sleep apnea usually requires all-night examination in a specialized clinic, under the supervision of medical staff. Because of those requirements it is an expensive and not widely used test. Moving the examination procedure to patients’ homes with automatic analysis algorithms involved will decrease the costs and make it available to a larger group of patients. The developed device allows all-night recordings...
-
Polish motorways 2016 - video data
Research Data: Polish motorways 2016 - video data
-
Machine Learning Applied to Aspirated and Non-Aspirated Allophone Classification—An Approach Based on Audio "Fingerprinting"
Publication: The purpose of this study is to involve both Convolutional Neural Networks and a typical learning algorithm in the allophone classification process. A list of words including aspirated and non-aspirated allophones pronounced by native and non-native English speakers is recorded and then edited and analyzed. Allophones extracted from English speakers’ recordings are presented in the form of two-dimensional spectrogram images and...
-
Measuring Pulse Rate with a Webcam
Publication: In this paper a simple method of measuring the pulse rate is presented. The developed algorithm allows for efficient pulse rate registration directly from face images captured from a webcam. The desired signal is obtained by proper channel selection and principal component analysis. To determine the accuracy of the method, an ECG signal is collected together with video recordings. The effectiveness of the algorithm is considered...
-
Localization of impulsive disturbances in archive audio signals using predictive matched filtering
Publication: The problem of elimination of impulsive disturbances from archive audio signals is considered and its new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...
-
Texture Features for the Detection of Playback Attacks: Towards a Robust Solution
Publication: This paper describes the new version of a method that is capable of protecting automatic speaker verification (ASV) systems from playback attacks. The presented approach uses computer vision techniques, such as the texture feature extraction based on Local Ternary Patterns (LTP), to identify spoofed recordings. Our goal is to make the algorithm independent from the contents of the training set as much as possible; we look for the...
-
Sound signals generated during lapping of technical ceramics using electroplated tools with diamond grains
Research Data: The data contain the recordings of sound generated during single-sided lapping with the use of electroplated diamond tools. This relationship was examined with the use of spectral analysis of the sound signal in the frequency domain with a focus on the Ra parameter of the surface roughness. The estimated sound coefficient increased as the surface roughness...
-
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
Publication: The paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict resolving extensions for the object tracking. In order to allow an effective event...
-
Exploring music listening patterns: an online survey
Publication: An online survey was carried out to explore how respondents listen to music recordings. It was anticipated that the listener’s preferences would be influenced by various factors, such as age, music genre, the contexts in which they listen, and their favored methods of music consumption. Consequently, the data were collected to analyze these relationships. The survey, structured as a web application, encompassed 23 questions,...