total: 381
Search results for: FIELD RECORDINGS
-
The effect of groyne field on trapping macroplastic. Preliminary results from laboratory experiments
PublicationMacroplastic, a precursor of microplastic pollution, has become a new scope of research interest. However, the physical processes of macroplastic transport and deposition in rivers are poorly understood, which makes the decisions of where to locate macroplastic trapping infrastructure difficult. In this research, we conducted a series of experiments in a laboratory channel, exploring the impact of groynes and flexible artificial...
-
Seafloor Characterisation Using Underwater Acoustic Devices
PublicationThe problem of seafloor characterisation is important in the context of management as well as investigation and protection of the marine environment. In the first part of the paper, a review of underwater acoustic technology and methodology used in seafloor characterisation is presented. It consists of the techniques based on the use of singlebeam echosounders and seismic sources, along with those developed for the use of sidescan...
-
Combined Single Neuron Unit Activity and Local Field Potential Oscillations in a Human Visual Recognition Memory Task
PublicationGOAL: Activities of neuronal networks range from action potential firing of individual neurons, coordinated oscillations of local neuronal assemblies, and distributed neural populations. Here, we describe recordings using hybrid electrodes, containing both micro- and clinical macroelectrodes, to simultaneously sample both large-scale network oscillations and single neuron spiking activity in the medial temporal lobe structures...
-
Audio Feature Analysis for Precise Vocalic Segments Classification in English
PublicationAn approach to identifying the most meaningful Mel-Frequency Cepstral Coefficients representing selected allophones and vocalic segments for their classification is presented in the paper. For this purpose, experiments were carried out using algorithms such as Principal Component Analysis, Feature Importance, and Recursive Parameter Elimination. The data used were recordings made within the ALOFON corpus containing audio signal...
-
Comparison of ambisonic and object-based spatial sound recording techniques
PublicationThis article presents a comparison of spatial sound recording techniques based on scene-based and object-based audio. The study aimed to make different mixes from a recording which consists of a higher-order ambisonic microphone and spot microphones. For spot microphones simple ambisonics encoding was used, which allows panning the individual channels on an ambisonic sphere as objects. Recordings were combined in various variants...
-
Special techniques and future perspectives: Simultaneous macro- and micro-electrode recordings
PublicationThere are many approaches to studying the inner workings of the brain and its highly interconnected circuits. One can look at the global activity in different brain structures using non-invasive technologies like positron emission tomography (PET) or functional magnetic resonance imaging (fMRI), which measure physiological changes, e.g. in the glucose uptake or blood flow. These can be very effectively used to localize active patches...
-
Evaluation of the Possibility of Identifying a Complex Polygonal Tram Track Layout Using Multiple Satellite Measurements
PublicationWe present the main assumptions about the algorithmization of the analysis of measurement data recorded in mobile satellite measurements. The research team from the Gdańsk University of Technology and the Maritime University in Gdynia, as part of a research project conducted in cooperation with PKP PLK (Polish Railway Infrastructure Manager), developed algorithms supporting the identification and assessment of track axis layout....
-
A survey of automatic speech recognition deep models performance for Polish medical terms
PublicationAmong the numerous applications of speech-to-text technology is the support of documentation created by medical personnel. There are many available speech recognition systems for doctors. Their effectiveness in languages such as Polish should be verified. In connection with our project in this field, we decided to check how well the popular speech recognition systems work, employing models trained for the general Polish language....
-
An automated, low-latency environment for studying the neural basis of behavior in freely moving rats
PublicationBackground Behavior consists of the interaction between an organism and its environment, and is controlled by the brain. Brain activity varies at sub-second time scales, but behavioral measures are usually coarse (often consisting of only binary trial outcomes). Results To overcome this mismatch, we developed the Rat Interactive Foraging Facility (RIFF): a programmable interactive arena for freely moving rats with multiple feeding...
-
Acoustic analysis of road traffic parameters using noise information and machine learning (Akustyczna analiza parametrów ruchu drogowego z wykorzystaniem informacji o hałasie oraz uczenia maszynowego)
PublicationThe aim of the dissertation was to develop an acoustic method for analyzing road traffic parameters. The operating principle of acoustic road traffic analysis provides a passive method of monitoring traffic volume. The work presents selected machine learning methods in the context of sound analysis (machine hearing). A methodology for classifying road traffic events using machine learning is presented. The work also outlines the basic...
-
Analysis of allophones based on audio signal recordings and parameterization
PublicationThe aim of this study is to develop an allophonic description of English plosive consonants based on recordings of 600 specially selected words. Allophonic variations addressed in the study may have two sources: positional and contextual. The former one depends on the syllabic or prosodic position in which a particular phoneme occurs. Contextual allophony is conditioned by the local phonetic environment. Co-articulation overlapping...
-
Video content analysis in the urban area telemonitoring system
PublicationThe task of constant monitoring of video streams from a large number of cameras and reviewing the recordings in order to find a specified event requires a considerable amount of time and effort from the system operators and it is prone to errors. A solution to this problem is an automatic system for constant analysis of camera images being able to raise an alarm if a predefined event is detected. The chapter presents various aspects...
-
Dissecting gamma frequency activities during human memory processing
PublicationGamma frequency activity (30-150 Hz) is induced in cognitive tasks and is thought to reflect underlying neural processes. Gamma frequency activity can be recorded directly from the human brain using intracranial electrodes implanted in patients undergoing treatment for drug-resistant epilepsy. Previous studies have independently explored narrowband oscillations in the local field potential and broadband power increases. It is not...
-
Analysis of soundscape recordings in close proximity to the road in changeable weather conditions
PublicationAcoustic vehicle sensing is the least invasive type of traffic detection. Moreover, acoustic-based vehicle detection technology is insensitive to precipitation and can operate at low light levels. Therefore, this kind of method may be used for automatic detection of vehicle passage events. It can also be employed for measuring vehicle speed and assigning vehicles to a particular category. In this paper the results...
-
On Facial Expressions and Emotions RGB-D Database
PublicationThe goal of this paper is to present the idea of creating a reference database of RGB-D video recordings for recognition of facial expressions and emotions. Two different formats of the recordings, used to create two versions of the database, are described and compared using different criteria. Examples of first applications using the databases are also presented to evaluate their usefulness.
-
Video of LEGO Bricks on Conveyor Belt Dataset Series
PublicationThe dataset series titled Video of LEGO bricks on conveyor belt is composed of 14 datasets containing video recordings of a moving white conveyor belt. The recordings were created using a smartphone camera in Full HD resolution. The dataset allows for the preparation of data for neural network training, and building of a LEGO sorting machine that can help builders to organise their collections.
-
An extension to the FEEDB Multimodal Database of Facial Expressions and Emotions
PublicationFEEDB is a multimodal database that contains recordings of people expressing different emotions, captured by using a Microsoft Kinect sensor. Data were originally provided in the device’s proprietary format (XED), requiring both the Microsoft Kinect Studio application and a Kinect sensor attached to the system to use the files. In this paper, we present an extension of the database. For a selection of recordings, we also provide...
-
Visually validated semi-automatic high-frequency oscillation detection aides the delineation of epileptogenic regions during intra-operative electrocorticography
PublicationOBJECTIVE: To test the utility of a novel semi-automated method for detecting, validating, and quantifying high-frequency oscillations (HFOs): ripples (80-200 Hz) and fast ripples (200-600 Hz) in intra-operative electrocorticography (ECoG) recordings. METHODS: Sixteen adult patients with temporal lobe epilepsy (TLE) had intra-operative ECoG recordings at the time of resection. The computer-annotated ECoG recordings were visually...
-
IMAGE CORRELATION AS A TOOL FOR TRACKING FACIAL CHANGES CAUSED BY EXTERNAL STIMULI
PublicationExpressions of the human face carry a lot of information, which is a valuable resource in the areas of computer vision, remote sensing and affective computing. For years, scientists have been analyzing the movement of the skin and facial muscles in an attempt to create the perfect image-analysis tool for recognizing human emotional states. To create a reliable algorithm, it is necessary to explore and examine...
-
Reactivation of seizure‐related changes to interictal spike shape and synchrony during postseizure sleep in patients
PublicationOBJECTIVE: Local field potentials (LFPs) arise from synchronous activation of millions of neurons, producing seemingly consistent waveform shapes and relative synchrony across electrodes. Interictal spikes (IISs) are LFPs associated with epilepsy that are commonly used to guide surgical resection. Recently, changes in neuronal firing patterns observed in the minutes preceding seizure onset were found to be reactivated during postseizure...
-
Evaluation of Six Degrees of Freedom 3D Audio Orchestra Recording and Playback using multi-point Ambisonic interpolation
PublicationThis paper describes a strategy for recording sound and enabling six-degrees-of-freedom playback, making use of multiple simultaneous and synchronized Higher Order Ambisonics (HOA) recordings. Such a strategy enables users to navigate in a simulated 3D space and listen to the six-degrees-of-freedom recordings from different perspectives. For the evaluation of the proposed approach, an Unreal Engine-based navigable 3D audiovisual...
-
Reduction of parasitic pitch variations in archival musical recordings
PublicationA new method for reducing parasitic pitch variations in archival audio recordings is presented. The method is intended for analyzing movie soundtracks recorded in optical films. It utilizes image processing for calculating and reducing effects of tape shrinkage being one of the main reasons for parasitic pitch variations in audio accompanying moving images. As long as the film tape characteristics are known the new method can be...
-
Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification
PublicationThe recordings were made with a fast video camera and with a microphone. Using fast cameras allowed for observation of the micro vibrations of the object structure. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. An idea was to use video to recover sound & vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals...
-
Resonance problems in UHV transmission lines
PublicationThe paper presents resonance phenomena observed in 400 kV transmission lines in the Polish power system. Two events are analysed in which shunt reactors used for reactive power compensation caused overvoltages and overcurrent protection tripping as a result of resonance. Oscillographic fault recordings from protection devices are compared to time-domain simulation results. The obtained simulation results match fault recordings,...
-
Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering
PublicationThis paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. Online tracking of signal model parameters is performed using the exponentially weighted least squares algorithm. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order...
-
Elimination of impulsive disturbances from stereo audio recordings
PublicationThis paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. On-line tracking of signal model parameters is performed using the stability-preserving Whittle-Wiggins-Robinson algorithm with exponential data weighting. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples...
-
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
PublicationIn this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, the audio signal is analyzed, indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect that are present in the video, i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...
-
Video analytics-based algorithm for monitoring egress from buildings
PublicationA concept and a practical implementation of an algorithm for detecting potentially dangerous situations related to crowding in passages are presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed in online mode is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the...
-
Traffic Noise Analysis Applied to Automatic Vehicle Counting and Classification
PublicationProblems related to determining traffic noise characteristics are discussed in the context of automatic dynamic noise analysis based on noise level measurements and traffic prediction models. The obtained analytical results provide the second goal of the study, namely automatic vehicle counting and classification. Several traffic prediction models are presented and compared to the results of in-situ noise level measurements. Synchronized...
-
The American Sign Language alphabet
Open Research DataThe American Sign Language dataset contains all static letters of the American alphabet, meaning those that do not require movement to perform (the entire alphabet except for the letters 'J' and 'Z', which are dynamic and require hand movement).
-
Comparison of two methods of sound extraction from guitar string video recordings
PublicationA comparison of two sound extraction methods from guitar string video recordings is presented in the paper. A brief overview of high-frame-rate camera technology and its possible applications is included. The method using image analysis from two such cameras is presented. The cameras are placed at an angle of 90 degrees for recording the image in three planes. The results achieved...
-
Visual Lip Contour Detection for the Purpose of Speech Recognition
PublicationA method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from a mouth region. Different Active Appearance Models are employed for finding lips in video frames and for lip shape and texture statistical description. Search initialization procedure is proposed and error measure values are...
-
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
PublicationA concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages are presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed online is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm which uses...
-
Analysis of the influence of external conditions on temperature readings in thermograms and adaptive adjustment of the measured temperature value
PublicationMeasuring human temperature is a crucial step in preventing the spread of diseases such as COVID-19. For the proper operation of an automatic body temperature measurement system throughout the year, it is necessary to consider outdoor conditions. In this paper, the effect of atmospheric factors on facial temperature readings using infrared thermography is investigated. A thorough analysis of the variation of facial temperature...
-
Localization of impulsive disturbances in audio signals using template matching
PublicationIn this paper, a new solution to the problem of elimination of impulsive disturbances from audio signals, based on the matched filtering technique, is proposed. The new approach stems from the observation that a large proportion of noise pulses corrupting audio recordings have highly repetitive shapes that match several typical “patterns”. In many cases a representative set of exemplary pulse waveforms can be extracted from the...
-
An audio-visual corpus for multimodal automatic speech recognition
PublicationA review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings are provided. The new corpus, containing 31 hours of recordings, was created specifically to assist the development of audio-visual speech recognition (AVSR) systems. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...
-
Wind Turbines Modeling as the Tool for Developing Algorithms of Processing their Video Recordings
PublicationIn the real world, many factors disturb observation of the examined phenomena and cause various noises and distortions in recorded signals. This very often makes it difficult or even impossible to optimize various signal processing algorithms by finding appropriate parameters. In this paper, we show an application that retrieves wind turbine rotor speed from recorded video. Next, we describe the process of reduction...
-
Column base fixity in steel moment frames: Observations from instrumented buildings
PublicationThe rotational fixity of column base connections in Steel Moment Resisting Frames (SMRFs) strongly influences their seismic response. However, approaches for estimating base fixity have been validated only against laboratory test data. These approaches are examined based on strong motion recordings from four instrumented SMRF buildings in California to inform best practices for seismic response simulation. These buildings represent...
-
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
PublicationAutomatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...
-
Compensation of Voltage Drops in Trolleybus Supply System Using Battery-Based Buffer Station
PublicationThis paper analyzes the results of a trial operation of a battery-based buffer station supporting a selected section of trolleybus power supply systems in Pilsen, Czech Republic. The buffer station aims to prevent the catenary from excessive voltage drops in a part of the route that is most remote from the traction substation. Compensation of voltage drops is carried out by continuously measuring the catenary voltage and injecting...
-
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
PublicationIn this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...
-
Playback Attack Detection: The Search for the Ultimate Set of Antispoof Features
PublicationAutomatic speaker verification systems are vulnerable to several kinds of spoofing attacks. Some of them can be quite simple – for example, the playback of an eavesdropped recording does not require any specialized equipment nor knowledge, but still may pose a serious threat for a biometric identification module built into an e-banking application. In this paper we follow the recent approach and convert recordings to images, assuming...
-
Vident-real: an intra-oral video dataset for multi-task learning
Open Research DataWe introduce Vident-real, a large dataset of 100 video sequences of intra-oral scenes from real conservative dental treatments performed at the Medical University of Gdańsk, Poland. The dataset can be used for multi-task learning methods including:
-
Analyzing the relationship between sound, color, and emotion based on subjective and machine-learning approaches
PublicationThe aim of the research is to analyze the relationship between sound, color, and emotion. For this purpose, a survey application was prepared, enabling the assignment of a color to a given speaker’s/singer’s voice recordings. Subjective tests were then conducted, enabling the respondents to assign colors to voice/singing samples. In addition, a database of voice/singing recordings of people speaking in a natural way and with expressed...
-
MODALITY corpus - SPEAKER 35 - COMMANDS C1
Open Research DataThe MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S6
Open Research DataThe MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C5
Open Research DataThe MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S4
Open Research DataThe MODALITY corpus is one of the multimodal databases of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...