Search results for: AUDIO-VIDEO RECORDINGS DATABASE - MOST Wiedzy

  • Video recordings of static hand gestures for gesture based interaction

    Research Data
    open access

    This data set contains video recordings of selected simple hand gestures related to sign language. The purpose of the data set is to evaluate different computer algorithms designed for hand gesture detection as well as for hand feature and hand pose detection and identification. The data set contains 5 video recordings in mp4 format. Each recording is...

  • Energy Efficiency Study of Audio-video Content Consumption on Selected Android Mobile Terminals

    Publication

    Mobile devices are widely used by billions of users worldwide. Thanks to their main advantage, which is portability, they should be fully operational as long as possible, without the need to recharge or connect them to external power sources. This paper describes a study, carried out on four different mobile devices, with different hardware and software parameters, running the Android operating system. The research campaign involved...

    Full text available for download from an external service

  • Piotr Odya dr inż.

    Piotr Odya was born in Gdańsk in 1974. In 1999 he graduated with honors from the Faculty of Electronics, Telecommunications and Informatics of Gdańsk University of Technology, obtaining the degree of Master of Science in Engineering. His diploma thesis concerned problems of improving sound quality in the broadcast studios of contemporary radio stations. His interests include video editing and multichannel sound systems. As part of his doctoral studies...

  • Grzegorz Szwoch dr hab. inż.

    Grzegorz Szwoch was born in 1972 in Gdańsk. From 1991 to 1996 he studied at the Faculty of Electronics of Gdańsk University of Technology. In 1996 he graduated from the Sound Engineering Department (now the Department of Multimedia Systems), defending a diploma thesis titled Modelowanie fizyczne wybranych instrumentów muzycznych (Physical modeling of selected musical instruments). In the same year he joined the Department's research team as a doctoral student. Since January 2001...

  • QoS/QoE in the Heterogeneous Internet of Things (IoT)

    Publication

    - Year 2017

    Applications provided in the Internet of Things can generally be divided into three categories: audio, video and data. This has given rise to the popular term Triple Play Services. The most important audio applications are VoIP and audio streaming. The most notable video applications are VToIP, IPTV, and video streaming, and the WWW service is the most prominent example of data-type services. This chapter elaborates on the most...

  • Video of LEGO Bricks on Conveyor Belt Dataset Series

    Publication

    - Year 2022

    The dataset series titled Video of LEGO bricks on conveyor belt is composed of 14 datasets containing video recordings of a moving white conveyor belt. The recordings were created using a smartphone camera in Full HD resolution. The dataset allows for the preparation of data for neural network training, and building of a LEGO sorting machine that can help builders to organise their collections.

    Full text available for download in the portal

  • Postprodukcja nagrania wideo z dźwiękiem dookólnym (Post-production of a video recording with surround sound)

    Publication

    One of the aims of this paper is to present issues related to audio-video correlation. This is presented on the basis of a short film realization employing surround microphone techniques. First, some related works in the domain of sound and vision correlation are presented. Then assumptions concerning scene creation related to both audio and video are briefly described. Another objective is to discuss results of subjective tests...

  • Recovering Sound Produced by Wind Turbine Structures Employing Video Motion Magnification

    The recordings were made with a fast video camera and with a microphone. Using fast cameras allowed for observation of the micro vibrations of the object structure. Motion-magnified video recordings of wind turbines on a wind farm were made for the purpose of building a damage prediction system. An idea was to use video to recover sound & vibrations in order to obtain a contactless diagnostic method for wind turbines. The recovered signals...

    Full text available for download from an external service

  • Reduction of parasitic pitch variations in archival musical recordings

    A new method for reducing parasitic pitch variations in archival audio recordings is presented. The method is intended for analyzing movie soundtracks recorded in optical films. It utilizes image processing for calculating and reducing effects of tape shrinkage being one of the main reasons for parasitic pitch variations in audio accompanying moving images. As long as the film tape characteristics are known the new method can be...

    Full text available for download in the portal

  • Video analytics-based algorithm for monitoring egress from buildings

    A concept and a practical implementation of an algorithm for detecting potentially dangerous situations related to crowding in passages are presented. An example of such a situation is a crush which may be caused by an obstructed pedestrian pathway. The surveillance video camera signal analysis performed in the online mode is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the...

    Full text available for download in the portal

  • Video Analytics-Based Algorithm for Monitoring Egress from Buildings

    Publication

    A concept and practical implementation of an algorithm for detecting potentially dangerous situations of crowding in passages are presented. An example of such a situation is a crush which may be caused by an obstructed pedestrian pathway. Surveillance video camera signal analysis performed online is employed in order to detect hold-ups near bottlenecks like doorways or staircases. The details of the implemented algorithm which uses...

    Full text available for download from an external service

  • Localization of impulsive disturbances in audio signals using template matching

    In this paper, a new solution to the problem of elimination of impulsive disturbances from audio signals, based on the matched filtering technique, is proposed. The new approach stems from the observation that a large proportion of noise pulses corrupting audio recordings have highly repetitive shapes that match several typical “patterns”. In many cases a representative set of exemplary pulse waveforms can be extracted from the...

    Full text available for download in the portal

  • An extension to the FEEDB Multimodal Database of Facial Expressions and Emotions

    Publication
    • M. Szwoch
    • L. Marco-Gimenez
    • M. Arevalillo-Herráez
    • A. Ayesh

    - Year 2015

    FEEDB is a multimodal database that contains recordings of people expressing different emotions, captured by using a Microsoft Kinect sensor. Data were originally provided in the device’s proprietary format (XED), requiring both the Microsoft Kinect Studio application and a Kinect sensor attached to the system to use the files. In this paper, we present an extension of the database. For a selection of recordings, we also provide...

    Full text available for download from an external service

  • Elimination of Impulsive Disturbances From Archive Audio Signals Using Bidirectional Processing

    In this application-oriented paper we consider the problem of elimination of impulsive disturbances, such as clicks, pops and record scratches, from archive audio recordings. The proposed approach is based on bidirectional processing—noise pulses are localized by combining the results of forward-time and backward-time signal analysis. Based on the results of specially designed empirical tests (rather than on the results of theoretical analysis),...

    Full text available for download in the portal

  • Building Knowledge for the Purpose of Lip Speech Identification

    Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

    Full text available for download from an external service

  • FEEDB: A multimodal database of facial expressions and emotions

    Publication

    - Year 2013

    In this paper a first version of a multimodal FEEDB database of facial expressions and emotions is presented. The database contains labeled RGB-D recordings of people expressing a specific set of expressions that have been recorded using a Microsoft Kinect sensor. Such a database can be used for classifier training and testing in face recognition as well as in recognition of facial expressions and human emotions. Also initial experiences...

    Full text available for download from an external service

  • Sparse vector autoregressive modeling of audio signals and its application to the elimination of impulsive disturbances

    Publication

    Archive audio files are often corrupted by impulsive disturbances, such as clicks, pops and record scratches. This paper presents a new method for elimination of impulsive disturbances from stereo audio signals. The proposed approach is based on a sparse vector autoregressive signal model, made up of two components: one taking care of short-term signal correlations, and the other one taking care of long-term correlations. The method...

    Full text available for download from an external service

  • Online sound restoration system for digital library applications

    Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

    Full text available for download from an external service

  • 1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type

    Publication

    A network architecture that may be employed for sensing and recognizing the type of vehicle on the basis of audio recordings made in the proximity of a road is proposed in the paper. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain sounds of passing vehicles are also taken into account and marked as ones containing silence....

  • Detection of impulsive disturbances in archive audio signals

    Publication

    In this paper the problem of detection of impulsive disturbances in archive audio signals is considered. It is shown that semi-causal/noncausal solutions based on joint evaluation of signal prediction errors and leave-one-out signal interpolation errors allow one to noticeably improve detection results compared to prediction-only solutions. The proposed approaches are evaluated on a set of clean audio signals contaminated...

    Full text available for download in the portal

  • Musical Instrument Tagging Using Data Augmentation and Effective Noisy Data Processing

    Developing signal processing methods to extract information automatically has potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile applications (e.g., tuning apps), or pre-processing for an automatic mixing system. However, the last-mentioned application needs a significant amount of research to reliably recognize real musical instruments in recordings....

    Full text available for download in the portal

  • Online sound restoration system for digital library applications.

    Audio signal processing algorithms were introduced to the new online non-commercial service for audio restoration intended to enhance the content of digitized audio repositories. Missing or distorted audio samples are predicted using neural networks and a specific implementation of the Janssen interpolation method based on the autoregressive model (AR) combined with the iterative restoring of missing signal samples. Since the distortion...

  • Intelligent multimedia solutions supporting special education needs.

    The role of computers in school education is briefly discussed. The development history of multimodal interfaces is briefly reviewed. Examples of applications of multimodal interfaces for learners with special educational needs are presented, including an interactive electronic whiteboard based on video image analysis, an application for controlling computers with facial expressions, and a speech-stretching audio interface representing the audio modality....

  • Exploiting audio-visual correlation by means of gaze tracking

    This paper presents a novel means of increasing the reliability of audio-visual correlation analysis. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, past and current research in the area of audio-visual perception analysis is briefly reviewed. Then the methodology employing gaze tracking is presented along with the...

    Full text available for download in the portal

  • Rozproszone przechowywanie zapasowych kopii danych (Distributed storage of backup data copies)

    Publication

    - Year 2012

    A method is shown for using a distributed processing system to protect an institution against the effects of a hacker attack combined with the destruction of that institution's database. The method consists of interleaving data packets into audio-video material downloaded by Internet users of Video-on-Demand film services and storing the data dispersed across hundreds or even thousands of computers.

    Full text available for download from an external service

  • Creating a Reliable Music Discovery and Recommendation System

    The aim of this paper is to show problems related to creating a reliable music discovery system. The SYNAT database that contains audio files is used for the purpose of experiments. The files are divided into 22 classes corresponding to music genres with different cardinality. Of utmost importance for a reliable music recommendation system are the assignment of audio files to their appropriate genres and optimum parameterization...

    Full text available for download from an external service

  • Further Developments of the Online Sound Restoration System for Digital Library Applications

    Publication

    New signal processing algorithms were introduced to the online service for audio restoration available at the web address: www.youarchive.net. Missing or distorted audio samples are estimated using a specific implementation of the Janssen interpolation method. The algorithm is based on the autoregressive model (AR) combined with the iterative complementation of signal samples. Since the interpolation algorithm is computationally...

    Full text available for download from an external service

  • Polish motorways 2016 - video data

    Research Data
    open access - series: Polish roads 2016 - video data

    Polish motorways 2016 - video data

  • Moving object detection and tracking for the purpose of multimodal surveillance system in urban areas

    Publication

    - Year 2008

    A background subtraction method based on a mixture of Gaussians was employed to detect all regions in a video frame denoting moving objects. Kalman filters were used for establishing relations between the regions and real moving objects in a scene and for tracking them continuously. The objects were represented by rectangles. The coupling of objects with the appropriate regions, including many-to-many relations, was studied experimentally...

  • Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology

    This paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, having roots in Human-Computer Interaction, borrows the relevance feedback through gaze-tracking and applies it to a new area of interest, which is Quality of Experience. Results of subjective...

    Full text available for download from an external service

  • Methodology and technology for the polymodal allophonic speech transcription

    A method for automatic audiovisual transcription of speech employing acoustic and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

    Full text available for download from an external service

  • Methodology and technology for the polymodal allophonic speech transcription

    A method for automatic audiovisual transcription of speech employing acoustic, electromagnetic articulography and visual speech representations is developed. It combines audio and visual modalities, which provides a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

    Full text available for download from an external service

  • Polish expressways 2016 - video data

    Research Data
    open access - series: Polish roads 2016 - video data

    Polish expressways 2016 - video data

  • Eye Blink Based Detection of Liveness in Biometric Authentication Systems Using Conditional Random Fields

    Publication

    - Year 2012

    The goal of this paper was to verify whether conditional random fields are suitable and efficient enough for eye blink detection in user authentication systems based on face recognition with a standard web camera. To evaluate this approach, several experiments were carried out using a specially developed test application and video database.

  • Localization of impulsive disturbances in archive audio signals using predictive matched filtering

    Publication

    The problem of elimination of impulsive disturbances from archive audio signals is considered and its new solution, called predictive matched filtering, is proposed. The new approach is based on the observation that a large percentage of noise pulses corrupting archive audio recordings have highly repetitive shapes that match several typical “patterns”, called click templates. To localize noise pulses, click templates can be correlated...

    Full text available for download from an external service

  • Polish voivodeship roads 2016 - video data

    Research Data
    open access - series: Polish roads 2016 - video data

    Polish voivodeship roads 2016 - video data

  • Measurements of QoS/QoE parameters for media streaming in a PMIPv6 testbed with 802.11 b/g/n WLANs

    A growing number of mobile devices and the increasing popularity of multimedia services result in a new challenge of providing mobility in access networks. The paper describes experimental research on media (audio and video) streaming in a mobile IEEE 802.11 b/g/n environment realizing network-based mobility. It is an approach to mobility that requires little or no modification of the mobile terminal. Assessment of relevant parameters...

    Full text available for download in the portal

  • A new method of audio-visual correlation analysis

    Publication

    - Year 2009

    This paper presents a new methodology of conducting the audio-visual correlation analysis employing the gaze tracking system. The interaction between two perceptual modalities, seeing and hearing, and their mutual reinforcement in a complex relationship has been the subject of many research studies. An earlier stage of the experiments carried out at the Multimedia Systems Department (MSD) showed that there exists a relationship between...

    Full text available for download from an external service

  • Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization

    Publication

    - Year 2017

    An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...

  • Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification

    Publication

    - Year 2013

    We present a comprehensive evaluation of the influence of 'harmonic' and rhythmic sections contained in an audio file on automatic music genre classification. The study is performed using the ISMIS database composed of music files, which are represented by vectors of acoustic parameters describing low-level music features. Non-negative Matrix Factorization serves for blind separation of instrument components. Rhythmic components...

  • Sparse autoregressive modeling

    Publication

    - Year 2012

    In the paper the comparison of the popular pitch determination (PD) algorithms for the purpose of elimination of clicks from archive audio signals using sparse autoregressive (SAR) modeling is presented. The SAR signal representation has been widely used in code-excited linear prediction (CELP) systems. The appropriate construction of the SAR model is required to guarantee model stability. For this reason the signal representation...

  • Audio content analysis in the urban area telemonitoring system

    Publication

    The article presents the possibilities of extending urban monitoring with automatic sound analysis. Sound parameterization methods that can be applied in such a system are presented, and the technical aspects of the implementation are discussed. The next part presents the tree-based decision system used in the system. This system recognizes dangerous sounds (gunshot, broken glass, scream) among sounds recorded...

    Full text available for download from an external service

  • Polish national roads 2016 - video data

    Research Data
    open access - series: Polish roads 2016 - video data

    The data includes video traffic data recorded with a video camera installed inside a car. The purpose of the research was to gather vehicle traffic recordings in real conditions on Polish national roads.

  • Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders

    Publication

    - Year 2023

    The purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work first recalls the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...

    Full text available for download in the portal

  • Selection of Features for Multimodal Vocalic Segments Classification

    Publication

    English speech recognition experiments are presented employing both audio signal and Facial Motion Capture (FMC) recordings. The principal aim of the study was to evaluate the influence of feature vector dimension reduction on the accuracy of vocalic segments classification employing neural networks. Several parameter reduction strategies were adopted, namely: Extremely Randomized Trees, Principal Component Analysis and Recursive...

    Full text available for download from an external service

  • Zaawansowane Przetwarzanie Sygnału (Advanced Signal Processing)

    Online Courses
    • A. Szewczyk
    • J. Smulko

    The course presents selected signal processing methods across a very wide range of applications. It illustrates the latest achievements in this field, supported by selected publications. Classes are divided into a lecture (15 h) and a seminar (15 h). Basic concepts of digital signal processing, recommended literature. Spectral analysis: power spectral density, wavelet spectrum, polyspectra and cross power spectral density. Effects...

  • Visualization of events using various kinds of synchronized data for the Border Guard

    The STRADAR project is dedicated to streaming real-time data in a distributed dispatcher and teleinformation system of the Border Guard. The Events Visualization Post is software designed for simultaneous visualization of data of different types in BG headquarters. The software allows the operator to visualize files, images, SMS, SDS, video, audio, and current or archival data on the naval situation on digital maps. All the visualized...

    Full text available for download in the portal

  • Production of six-degrees-of-freedom (6DoF) navigable audio using 30 Ambisonic microphones

    Publication
    • B. Mróz
    • M. Kabaciński
    • T. Ciotucha
    • A. Rumiński
    • T. Żernicki

    - Year 2021

    This paper describes a method for planning, recording, and post-production of six-degrees-of-freedom audio recorded with multiple 3rd order Ambisonic microphone arrays. The description is based on the example of recordings conducted in August 2020 with the Poznan Philharmonic Orchestra using 30 units of Zylia ZM-1S. A convenient way to prepare and organize such a big project is proposed – this involves details of stage planning,...

    Full text available for download from an external service

  • Acquisition and indexing of RGB-D recordings for facial expressions and emotion recognition

    Publication

    This paper describes KinectRecorder, a comprehensive tool for convenient and fast acquisition, indexing and storing of RGB-D video streams from a Microsoft Kinect sensor. The application is especially useful as a supporting tool for creation of fully indexed databases of facial expressions and emotions that can be further used for learning and testing of emotion recognition algorithms for affect-aware applications....

    Full text available for download from an external service

  • Machine learning applied to acoustic-based road traffic monitoring

    The motivation behind this study lies in adapting acoustic noise monitoring systems for road traffic monitoring for driver’s safety. Such a system should recognize a vehicle type and weather-related pavement conditions based on the audio level measurement. The study presents the effectiveness of the selected machine learning algorithms in acoustic-based road traffic monitoring. The basics of the operation of the acoustic road traffic...

    Full text available for download in the portal