Search results for: audio-video recordings database

Total: 326

Multimodal human-computer interfaces based on advanced video and audio analysis
Publication
- Advances in Intelligent Systems and Computing - Year 2014
The development history of multimodal interfaces is reviewed briefly in the introduction. Some applications of multimodal interfaces to education software for disabled people are presented. One of them, the LipMouse, is a novel, vision-based human-computer interface that tracks the user’s lip movements and detects lip gestures. A new approach to diagnosing Parkinson’s disease is also shown. The progression of the disease can be measured employing...

Full text available for download from an external service
Video recordings of static hand gestures for gesture based interaction
Research Data
open access
- T. Kocejko
This data set contains video recordings of selected simple hand gestures related to sign language. The purpose of the data set is to evaluate different computer algorithms designed for hand gesture detection as well as for hand feature and hand pose detection and identification. The data set contains 5 video recordings in mp4 format. Each recording is...
Energy Efficiency Study of Audio-video Content Consumption on Selected Android Mobile Terminals
Publication
- P. Falkowski-Gilski
- M. Pańkowski
- Year 2021
Mobile devices are widely used by billions of users worldwide. Thanks to their main advantage, which is portability, they should be fully operational as long as possible, without the need to recharge or connect them to external power sources. This paper describes a study, carried out on four different mobile devices, with different hardware and software parameters, running the Android operating system. The research campaign involved...

Full text available for download from an external service
Piotr Odya, PhD, Eng.

People

Multimedia Systems Department

Piotr Odya was born in Gdańsk in 1974. In 1999 he graduated with distinction from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, earning a Master of Science in Engineering degree. His thesis addressed the problems of improving sound quality in the broadcast studios of contemporary radio stations. His interests include audio-video editing and multichannel sound systems. As part of his doctoral studies...
Grzegorz Szwoch, DSc, Eng.

People

Multimedia Systems Department

Grzegorz Szwoch was born in 1972 in Gdańsk. From 1991 to 1996 he studied at the Faculty of Electronics of Gdańsk University of Technology. In 1996 he graduated from the Sound Engineering Department (now the Multimedia Systems Department), defending a thesis entitled "Physical modelling of selected musical instruments". In the same year he joined the Department's research team as a participant in the doctoral programme. Since January 2001...
QoS/QoE in the Heterogeneous Internet of Things (IoT)
Publication
- K. Nowicki
- T. Uhl
- Year 2017
Applications provided in the Internet of Things can generally be divided into three categories: audio, video and data. This has given rise to the popular term Triple Play Services. The most important audio applications are VoIP and audio streaming. The most notable video applications are VToIP, IPTV, and video streaming, and the WWW service is the most prominent example of data-type services. This chapter elaborates on the most...
Video of LEGO Bricks on Conveyor Belt Dataset Series
Publication
- T. M. Boiński
- Year 2022
The dataset series titled Video of LEGO bricks on conveyor belt is composed of 14 datasets containing video recordings of a moving white conveyor belt. The recordings were created using a smartphone camera in Full HD resolution. The dataset allows for the preparation of data for neural network training, and building of a LEGO sorting machine that can help builders to organise their collections.

Full text available for download from the portal
Postprodukcja nagrania wideo z dźwiękiem dookólnym (Post-production of a video recording with surround sound)
Publication
- Year 2009
One of the aims of this paper is to present issues related to audio-video correlation. This is presented on the basis of the production of a short film employing surround microphone techniques. First, some related works in the domain of sound and vision correlation are presented. Then assumptions concerning scene creation related to both audio and video are briefly described. Another objective is to discuss results of subjective tests...
Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification
Publication
- Year 2019
The recordings were made with a fast video camera and with a microphone. Using fast cameras allowed for observation of the micro vibrations of the object structure. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. The idea was to use video to recover sound and vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals...

Full text available for download from an external service
Reduction of parasitic pitch variations in archival musical recordings
Publication
- SIGNAL PROCESSING - Year 2010
A new method for reducing parasitic pitch variations in archival audio recordings is presented. The method is intended for analyzing movie soundtracks recorded on optical film. It utilizes image processing to calculate and reduce the effects of tape shrinkage, one of the main causes of parasitic pitch variations in audio accompanying moving images. As long as the film tape characteristics are known, the new method can be...

Full text available for download from the portal
Video analytics-based algorithm for monitoring egress from buildings
Publication
- M. Szczodrak
- A. Czyżewski
- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2016
A concept and a practical implementation of the algorithm for detecting potentially dangerous situations related to crowding in passages are presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. The surveillance video camera signal analysis performed in the online mode is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the...

Full text available for download from the portal
The American Sign Language alphabet
Research Data
open access
- S. Olewniczak
- K. Witczak
- I. Czartowski
- H. Wołek
The American Sign Language dataset contains all static letters of the American alphabet, meaning those that do not require movement to perform (the entire alphabet except for the letters 'J' and 'Z', which are dynamic and require hand movement).
Video Analytics-Based Algorithm for Monitoring Egress from Buildings
Publication
- M. Szczodrak
- A. Czyżewski
- Year 2013
A concept and practical implementation of the algorithm for detecting potentially dangerous situations of crowding in passages are presented. An example of such a situation is a crush, which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed online is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm, which uses...

Full text available for download from an external service
Localization of impulsive disturbances in audio signals using template matching
Publication
- M. Niedźwiecki
- M. Ciołek
- DIGITAL SIGNAL PROCESSING - Year 2015
In this paper, a new solution to the problem of elimination of impulsive disturbances from audio signals, based on the matched filtering technique, is proposed. The new approach stems from the observation that a large proportion of noise pulses corrupting audio recordings have highly repetitive shapes that match several typical “patterns”. In many cases a representative set of exemplary pulse waveforms can be extracted from the...

Full text available for download from the portal
Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing
Publication
- M. Niedźwiecki
- M. Ciołek
- IEEE Transactions on Audio Speech and Language Processing - Year 2013
In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

Full text available for download from the portal
An extension to the FEEDB Multimodal Database of Facial Expressions and Emotions
Publication
- M. Szwoch
- L. Marco-Gimenez
- M. Arevalillo-Herráez
- A. Ayesh
- Year 2015
FEEDB is a multimodal database that contains recordings of people expressing different emotions, captured by using a Microsoft Kinect sensor. Data were originally provided in the device’s proprietary format (XED), requiring both the Microsoft Kinect Studio application and a Kinect sensor attached to the system to use the files. In this paper, we present an extension of the database. For a selection of recordings, we also provide...

Full text available for download from an external service
Building Knowledge for the Purpose of Lip Speech Identification
Publication
- Advances in Intelligent Systems and Computing - Year 2017
Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

Full text available for download from an external service
FEEDB: A multimodal database of facial expressions and emotions
Publication
- M. Szwoch
- Year 2013
In this paper the first version of the multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using a Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...

Full text available for download from an external service
Online sound restoration system for digital library applications
Publication
- Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

Full text available for download from an external service
Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2015
Archive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...

Full text available for download from an external service
1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type
Publication
- Year 2020
A network architecture that may be employed for sensing and recognizing the type of a vehicle on the basis of audio recordings made in the proximity of a road is proposed in the paper. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain sounds of passing vehicles are also taken into account and marked as ones containing silence....
Detection of impulsive disturbances in archive audio signals
Publication
- M. Ciołek
- M. Niedźwiecki
- Year 2017
In this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors allow one to noticeably improve detection results compared to the prediction-only based solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...

Full text available for download from the portal
Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing
Publication
- D. Koszewski
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Year 2020
Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

Full text available for download from the portal
Online sound restoration system for digital library applications.
Publication
- Journal of the Acoustical Society of America - Year 2013
Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...
Intelligent multimedia solutions supporting special education needs.
Publication
- A. Czyżewski
- B. Kostek
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011
The role of computers in school education is briefly discussed. The development history of multimodal interfaces is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....
Exploiting audio-visual correlation by means of gaze tracking
Publication
- B. Kunka
- B. Kostek
- International Journal of Computer Science and Applications - Year 2010
This paper presents a novel means for increasing the reliability of audio-visual correlation analysis. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, the history and current research in the area of audio-visual perception analysis are briefly reviewed. Then the methodology employing gaze tracking is presented along with the...

Full text available for download from the portal
Rozproszone przechowywanie zapasowych kopii danych (Distributed storage of backup data copies)
Publication
- J. Kuchta
- Year 2012
A method is presented for using a distributed processing system to protect an institution against the effects of a hacker attack combined with the destruction of that institution's database. The method consists in interleaving data packets into the audio-video material downloaded by Internet users of Video-on-Demand film services, and storing the data dispersed across hundreds or even thousands of computers.

Full text available for download from an external service
Creating a Reliable Music Discovery and Recommendation System
Publication
- Year 2014
The aim of this paper is to show problems related to creating a reliable music discovery system. The SYNAT database that contains audio files is used for the purpose of experiments. The files are divided into 22 classes corresponding to music genres with different cardinality. Of utmost importance for a reliable music recommendation system are the assignment of audio files to their appropriate genres and optimum parameterization...

Full text available for download from an external service
Further Developments of the Online Sound Restoration System for Digital Library Applications
Publication
- Year 2014
New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

Full text available for download from an external service
Data obtained via parametrization of differently mixed audio signals
Research Data
open access
- J. Stefański
- K. Marciniuk
The dataset consists of audio samples and the results of their parametrization. The extraction of music parameters was performed using MIRToolbox. Information extracted from the samples was used as a database for a master's thesis titled 'The influence of audio signal processing chain in mixing on the emotional state of a music piece'.
Moving object detection and tracking for the purpose of multimodal surveillance system in urban areas
Publication
- A. Czyżewski
- P. Dalka
- Year 2008
A background subtraction method based on a mixture of Gaussians was employed to detect all regions in a video frame denoting moving objects. Kalman filters were used for establishing relations between the regions and real moving objects in a scene and for tracking them continuously. The objects were represented by rectangles. The coupling of objects with the appropriate regions, including many-to-many relations, was studied experimentally...
Polish motorways 2016 - video data
Research Data
open access
- Ł. Jeliński
- W. Kustra
- D. Bytner
- series: Polish roads 2016 - video data
Polish motorways 2016 - video data
Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology
Publication
- Intelligent Decision Technologies-Netherlands - Year 2010
This paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, having roots in Human-Computer Interaction, borrows the relevance feedback through gaze-tracking and applies it to a new area of interest, which is Quality of Experience. Results of subjective...

Full text available for download from an external service
Methodology and technology for the polymodal allophonic speech transcription
Publication
- Journal of the Acoustical Society of America - Year 2016
A method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

Full text available for download from an external service
Methodology and technology for the polymodal allophonic speech transcription
Publication
- Journal of the Acoustical Society of America - Year 2016
A method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

Full text available for download from an external service
Localization of impulsive disturbances in archive audio signals using predictive matched filtering
Publication
- M. Niedźwiecki
- M. Ciołek
- Year 2014
The problem of elimination of impulsive disturbances from archive audio signals is considered and its new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...

Full text available for download from an external service
Polish expressways 2016 - video data
Research Data
open access
- Ł. Jeliński
- W. Kustra
- D. Bytner
- series: Polish roads 2016 - video data
Polish expressways 2016 - video data
Eye Blink Based Detection of Liveness in Biometric Authentication Systems Using Conditional Random Fields
Publication
- M. Szwoch
- P. Pieniążek
- Year 2012
The goal of this paper was to verify whether conditional random fields are suitable and efficient enough for eye blink detection in user authentication systems based on face recognition with a standard web camera. To evaluate this approach, several experiments were carried out using a specially developed test application and video database.
Measurements of QoS/QoE parameters for media streaming in a PMIPv6 testbed with 802.11 b/g/n WLANs
Publication
- Metrology and Measurement Systems - Year 2012
A growing number of mobile devices and the increasing popularity of multimedia services result in a new challenge of providing mobility in access networks. The paper describes experimental research on media (audio and video) streaming in a mobile IEEE 802.11 b/g/n environment realizing network-based mobility. It is an approach to mobility that requires little or no modification of the mobile terminal. Assessment of relevant parameters...

Full text available for download from the portal
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
Publication
- B. Kostek
- M. Piotrowska
- T. Ciszewski
- A. Czyżewski
- Year 2017
An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert, and each examined speaker repeated a particular word trying to imitate the correct pronunciation. The next step consisted...
A new method of audio-visual correlation analysis
Publication
- B. Kunka
- B. Kostek
- Year 2009
This paper presents a new methodology for conducting audio-visual correlation analysis employing a gaze tracking system. The interaction between two perceptual modalities, seeing and hearing, and their mutual reinforcement in a complex relationship have been the subject of many research studies. An earlier stage of the experiments carried out at the Multimedia Systems Department (MSD) showed that there exists a relationship between...

Full text available for download from an external service
Polish voivodeship roads 2016 - video data
Research Data
open access
- Ł. Jeliński
- W. Kustra
- D. Bytner
- series: Polish roads 2016 - video data
Polish voivodeship roads 2016 - video data
Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification
Publication
- A. Rosner
- F. Weninger
- B. Schuller
- M. Michalak
- B. Kostek
- Year 2013
We present a comprehensive evaluation of the influence of 'harmonic' and rhythmic sections contained in an audio file on automatic music genre classification. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...
Sparse autoregressive modeling
Publication
- M. Ciołek
- Year 2012
This paper presents a comparison of popular pitch determination (PD) algorithms for the purpose of eliminating clicks from archive audio signals using sparse autoregressive (SAR) modeling. The SAR signal representation has been widely used in code-excited linear prediction (CELP) systems. The appropriate construction of the SAR model is required to guarantee model stability. For this reason the signal representation...
Audio content analysis in the urban area telemonitoring system
Publication
- Year 2010
The article presents the possibilities of extending urban monitoring with automatic sound analysis. Sound parameterization methods that can be applied in such a system are presented, and technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. This system recognizes dangerous sounds (gunshot, broken glass, scream) among the sounds recorded...

Full text available for download from an external service
Selection of Features for Multimodal Vocalic Segments Classification
Publication
- S. Zaporowski
- A. Czyżewski
- Year 2018
English speech recognition experiments are presented employing both audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction on the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

Full text available for download from an external service
Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publication
- D. Koszewski
- Year 2023
The purpose of this dissertation is to develop a system capable of automatically mixing a song with good quality in any music genre. This work first recalls the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...

Full text available for download from the portal
Visualization of events using various kinds of synchronized data for the Border Guard
Publication
- B. Czaplewski
- S. Kaczmarek
- J. A. Litka
- M. Miszewski
- Zeszyty Naukowe Akademii Marynarki Wojennej - Year 2017
The STRADAR project is dedicated to streaming real-time data in a distributed dispatcher and teleinformation system of the Border Guard. The Events Visualization Post is software designed for simultaneous visualization of data of different types in BG headquarters. The software allows the operator to visualize files, images, SMS, SDS, video, audio, and current or archival data on the naval situation on digital maps. All the visualized...

Full text available for download from the portal
Polish national roads 2016 - video data
Research Data
open access
- Ł. Jeliński
- W. Kustra
- D. Bytner
- series: Polish roads 2016 - video data
The data includes video traffic data recorded with a video camera installed inside a car. The purpose of the research was to gather vehicle traffic recordings in real conditions on Polish national roads.
Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones
Publication
- B. Mróz
- M. Kabaciński
- T. Ciotucha
- A. Rumiński
- T. Żernicki
- Year 2021
This paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...

Full text available for download from an external service
