Search results for: audio-video recordings database

total: 322

Creating a Remote Choir Performance Recording Based on an Ambisonic Approach
Publication
- Applied Sciences-Basel - Year 2022
The aim of this paper is three-fold. First, the basics of binaural and ambisonic techniques are briefly presented. Then, details related to audio-visual recordings of a remote performance of the Academic Choir of the Gdańsk University of Technology are shown. Due to the COVID-19 pandemic, artists had a choice, namely, to stay at home and not perform or stay at home and perform. In fact, staying at home brought in the possibility...

Full text available to download
Monitoring of Caged Bluefin Tuna Reactions to Ship and Offshore Wind Farm Operational Noises
Publication
- V. Puig-Pons
- E. Soliveres
- I. Pérez-Arjona
- V. Espinosa
- P. Poveda-Martínez
- J. Ramis-Soriano
- P. Ordoñez-Cebrián
- M. Moszyński
- F. de la Gándara
- M. Bou-Cabo... and 2 others
- SENSORS - Year 2021
Underwater noise has been identified as a relevant pollution affecting marine ecosystems in different ways. Despite the numerous studies performed over the last few decades regarding the adverse effect of underwater noise on marine life, a lack of knowledge and methodological procedures still exists, and results are often tentative or qualitative. A monitoring methodology for the behavioral response of bluefin tuna (Thunnus thynnus)...

Full text available to download
TRANSPORT POSSIBILITY FOR MPEG-4/AVC- AND MPEG-2-ENCODED VIDEO DATA IN IPTV: A COMPARISON STUDY
Publication
- T. Uhl
- S. Paulsen
- K. Nowicki
- Year 2013
IPTV (Television over IP) is a modern service with a great potential to expand. It uses the IP transport platform, that is already in worldwide operation. At the time of writing, two techniques are used to transport the video and audio data of IPTV: MPEG-2 TS and Native RTP. The two techniques quite definitely have an influence on both quality of service (QoS) and quality of experience (QoE). This paper sets out to demonstrate...
A Review of Emotion Recognition Methods Based on Data Acquired via Smartphone Sensors
Publication
- SENSORS - Year 2020
In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly in the environment of personal computers offering the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the increase in computing and communication capabilities of smartphones,...

Full text available to download
Broadening the scope of measurement and analysis of vibrations of an organ pipe employing intensity probe, simulations, and highspeed camera
Publication
- P. Bordoni
- J. Kotus
- P. Odya
- F. Antonacci
- B. Kostek
- Journal of the Acoustical Society of America - Year 2022
This paper shows an integrated approach to measure, analyze, and model phenomena occurring in an organ pipe driven by pressurized air. The aim of this paper is two-fold, i.e., to measure the pressure signal and the intensity field around the mouth by means of an intensity probe and to visualize and observe the motion of the air jet, which represents the excitation mechanism of the system. This is realized through two techniques,...

Full text to download in external service
Language material for English audiovisual speech recognition system development. Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
Publication
- A. Czyżewski
- B. Kostek
- T. Ciszewski
- D. Majewicz
- Year 2013
The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
Comparison of Classification Methods for EEG Signals of Real and Imaginary Motion
Publication
- Year 2018
The classification of EEG signals provides an important element of brain-computer interface (BCI) applications, underlying an efficient interaction between a human and a computer application. The BCI applications can be especially useful for people with disabilities. Numerous experiments aim at recognition of motion intent of left or right hand being useful for locked-in-state or paralyzed subjects in controlling computer applications....

Full text available to download
Evaluation of aspiration problems in L2 English pronunciation employing machine learning
Publication
- M. Piotrowska
- A. Czyżewski
- T. Ciszewski
- G. Korvel
- A. Kurowski
- B. Kostek
- Journal of the Acoustical Society of America - Year 2021
The approach proposed in this study includes methods specifically dedicated to the detection of allophonic variation in English. This study aims to find an efficient method for automatic evaluation of aspiration in the case of Polish second-language (L2) English speakers’ pronunciation when whole words are analyzed instead of particular allophones extracted from words. Sample words including aspirated and unaspirated allophones...

Full text available to download
Vident-real: an intra-oral video dataset for multi-task learning
Open Research Data
open access
- D. Węsierski
- A. Jezierska
- P. Kopa Ostrowski
- E. Lewandowska
- E. Katsaros
- A. Żółtowska
We introduce Vident-real, a large dataset of 100 video sequences of intra-oral scenes from real conservative dental treatments performed at the Medical University of Gdańsk, Poland. The dataset can be used for multi-task learning methods including:
Objectivization of Audio-Visual Correlation analysis
Publication
- B. Kunka
- B. Kostek
- Archives of Acoustics - Year 2012
Simultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the "image proximity effect" or the "ventriloquism effect" in literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The Authors of this paper propose a methodology based on both subjective...

Full text available to download
Application of autoencoder to traffic noise analysis
Publication
- Journal of the Acoustical Society of America - Year 2019
The aim of an autoencoder neural network is to transform the input data into a lower-dimensional code and then to reconstruct the output from this code representation. Applications of autoencoders to classifying sound events in the road traffic have not been found in the literature. The presented research aims to determine whether such an unsupervised learning method may be used for deploying classification algorithms applied to...

Full text available to download
Fully Automated AI-powered Contactless Cough Detection based on Pixel Value Dynamics Occurring within Facial Regions
Publication
- M. Szankin
- A. Kwaśniewska
- N. Kowalczyk
- J. Rumiński
- R. Nicolas
- D. Gamba
- Year 2021
Increased interest in non-contact evaluation of the health state has led to higher expectations for delivering automated and reliable solutions that can be conveniently used during daily activities. Although some solutions for cough detection exist, they suffer from a series of limitations. Some of them rely on gesture or body pose recognition, which might not be possible in cases of occlusions, closer camera distances or impediments...

Full text to download in external service
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publication
- Electronics - Year 2022
Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...

Full text available to download
IMAGE CORRELATION AS A TOOL FOR TRACKING FACIAL CHANGES CAUSING BY EXTERNAL STIMULI
Publication
- K. Bobkowska
- A. Janowski
- M. Przyborski
- Year 2015
Expressions of the human face bring a lot of information, which are a valuable source in the areas of computer vision, remote sensing and affective computing. For years, by analyzing the movement of the skin and facial muscles scientists are trying to create the perfect tool, based on image analysis, allowing the recognition of emotional states of human beings. To create a reliable algorithm, it is necessary to explore and examine...

Full text to download in external service
Tagged images with LEGO bricks moving on the conveyor belt
Open Research Data
open access
- T. Boiński
- A. Stiegler
- Ł. Kłos
- series: LEGO - partial
The dataset contains tagged images of LEGO bricks used for training a LEGO brick detection network. It consists of frames taken from video recordings of LEGO bricks moving on a conveyor belt. Each image contains from 0 to 7 bricks. In total there are 5459 tagged LEGO bricks in 4340 images, and 3390 images without any bricks. The images...
MACHINE LEARNING–BASED ANALYSIS OF ENGLISH LATERAL ALLOPHONES
Publication
- M. Piotrowska
- G. Korvel
- B. Kostek
- T. Ciszewski
- A. Czyżewski
- International Journal of Applied Mathematics and Computer Science - Year 2019
Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which seven native speakers’...

Full text available to download
The effect of groyne field on trapping macroplastic. Preliminary results from laboratory experiments
Publication
- Ł. Przyborowski
- Z. Cuban
- A. Łoboda
- M. Robakiewicz
- S. Biegowski
- T. Kolerski
- SCIENCE OF THE TOTAL ENVIRONMENT - Year 2024
Macroplastic, a precursor of microplastic pollution, has become a new scope of research interest. However, the physical processes of macroplastic transport and deposition in rivers are poorly understood, which makes the decisions of where to locate macroplastic trapping infrastructure difficult. In this research, we conducted a series of experiments in a laboratory channel, exploring the impact of groynes and flexible artificial...

Full text to download in external service
New Applications of Multimodal Human-Computer Interfaces
Publication
- A. Czyżewski
- Year 2012
Multimodal computer interfaces and examples of their applications to education software and for the disabled people are presented. The proposed interfaces include the interactive electronic whiteboard based on video image analysis, application for controlling computers with gestures and the audio interface for speech stretching for hearing impaired and stuttering people. Application of the eye-gaze tracking system to awareness...
System for automatic singing voice recognition
Publication
- P. Żwan
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2008
The article presents a system for automatic recognition of singing voice quality and type. The database and the implemented parameters are described. The decision algorithm is an artificial neural network. The trained decision system achieves an accuracy of about 90% in both recognition categories. In addition, statistical methods were used to show that the results of the system for automatic assessment of technical quality...
Improving automatic surveillance by sound analysis
Publication
- Year 2010
An automatic surveillance system, based on event detection in the video image can be improved by implementing algorithms for audio analysis. Dangerous or illegal actions are often connected with distinctive sound events like screams or sudden bursts of energy. A method for detection and classification of alarming sound events is presented. Detection is based on the observation of sudden changes in sound level in distinctive sub-bands...
SkinDepth - synthetic 3D skin lesion database
Open Research Data
version 1.0 open access
- A. Jezierska
- M. Woźniak
SkinDepth is the first synthetic 3D skin lesion database. The release of SkinDepth dataset intends to contribute to the development of algorithms for:
ZINTEGROWANY SYSTEM DOMOWEGO MONITORINGU PARAMETRÓW MEDYCZNYCH OSÓB STARSZYCH I CHORYCH (Integrated system for home monitoring of medical parameters of elderly and sick people)
Publication
- Year 2019
The proposed solutions aim to support elderly and sick people so that they can live independently for as long as possible, with a greater sense of security from knowing that they are monitored and will not be left without help in a sudden life-threatening emergency. At the same time, the system does not infringe on their sense of privacy and intimacy, because neither video cameras nor continuous audio listening are used for monitoring. In addition, the collected information...
