Wyniki wyszukiwania dla: audio-visual correlation

Wyniki wyszukiwania dla: audio-visual correlation

wyników na stronę:
osadź ten widok na swojej stronie

Filtry

wszystkich: 1614

wyczyść wszystkie filtry niedostępne

wyświetlamy 1000 najlepszych wyników Pomoc

Objectivization of Audio-Visual Correlation analysis
Publikacja
- B. Kunka
- B. Kostek
- Archives of Acoustics - Rok 2012
Simultaneous perception of audio and visual stimuli often causes the concealment or misrepresentation of information actually contained in these stimuli. Such effects are called the ''image proximity effect'' or the ''ventriloquism effect'' in literature. Until recently, most research carried out to understand their nature was based on subjective assessments. The Authors of this paper propose a methodology based on both subjective...

Pełny tekst do pobrania w portalu
An new method of audio-visual correlation analysis
Publikacja
- B. Kunka
- B. Kostek
- Rok 2009
This paper presents a new methodology of conducting the audio-visual correlation analysis employing the gaze tracking system. Interaction between two perceptual modalities, seeing and hearing, their interaction and mutual reinforcement in a complex relationship was a subject of many research studies. Earlier stage of the carried out experiments at the Multimedia Systems Department (MSD) showed that there exists a relationship between...

Pełny tekst do pobrania w serwisie zewnętrznym
Exploiting audio-visual correlation by means of gaze tracking
Publikacja
- B. Kunka
- B. Kostek
- International Journal of Computer Science and Applications - Rok 2010
This paper presents a novel means for increasing audio-visual correlation analysis reliability. This is done based on gaze tracking technology engineered at the Multimedia Systems Department of the Gdansk University of Technology, Poland. In the paper, the past history and current research in the area of audio-visual perception analysis are shortly reviewed. Then the methodology employing gaze tracking is presented along with the...

Pełny tekst do pobrania w portalu
Gaze-tracking based audio-visual correlation analysis employing quality of experience methodology
Publikacja
- Intelligent Decision Technologies-Netherlands - Rok 2010
This paper investigates a new approach to audio-visual correlation assessment based on the gaze-tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT). The gaze-tracking methodology, having roots in Human-Computer Interaction borrows the relevance feedback through gaze-tracking and applies it to the new area of interests, which is Quality of Experience. Results of subjective...

Pełny tekst do pobrania w serwisie zewnętrznym
Automatic audio-visual threat detection
Publikacja
- J. Kotus
- J. Łopatka
- K. Kopaczewski
- A. Czyżewski
- Rok 2010
The concept, practical realization and application of a system for detection and classification of hazardous situations based on multimodal sound and vision analysis are presented. The device consists of new kind multichannel miniature sound intensity sensors, digital Pan Tilt Zoom and fixed cameras and a bundle of signal processing algorithms. The simultaneous analysis of multimodal signals can significantly improve the accuracy...
Objectivization of audio-video correlation assessment experiments
Publikacja
- B. Kunka
- B. Kostek
- Rok 2010
The purpose of this paper is to present a new method of conducting an audio-visual correlation analysis employing a head-motion-free gaze tracking system. First, a review of related works in the domain of sound and vision correlation is presented. Then assumptions concerning audio-visual scene creation are shortly described. The objectivization process of carrying out correlation tests employing gaze-tracking system is outlined....

Pełny tekst do pobrania w serwisie zewnętrznym
Multimodal Audio-Visual Recognition of Traffic Events
Publikacja
- Rok 2011
Przedstawiono demonstrator systemu wykrywania niebezpiecznych zdarzeń w ruchu drogowym oparty na jednoczesnej analizie danych wizyjnych i akustycznych. System jest częścią systemu automatycznego nadzoru bezpieczeństwa. Wykorzystuje on kamery i mikrofony jako źródła danych. Przedstawiono wykorzystane algorytmy - algorytmy rozpoznawania zdarzeń dźwiękowych oraz analizy obrazu. Zaprezentowano wyniki działania algorytmów na przykładzie...
An audio-visual corpus for multimodal automatic speech recognition
Publikacja
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Rok 2017
review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Pełny tekst do pobrania w portalu
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publikacja
- Rok 2014
The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publikacja
- Rok 2014
The problem of video framerate and audio/video synchronization in audio-visual speech recogni-tion is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
Audio-visual surveillance system for application in bank operating room
Publikacja
- J. Kotus
- K. Łopatka
- A. Czyżewski
- G. Bogdanis
- Communications in Computer and Information Science - Rok 2013
An audio-visual surveillance system able to detect, classify and to localize acoustic events in a bank operating room is presented. Algorithms for detection and classification of abnormal acoustic events, such as screams or gunshots are introduced. Two types of detectors are employed to detect impulsive sounds and vocal activity. A Support Vector Machine (SVM) classifier is used to discern between the different classes of acoustic...
Audio-visual aspect of the Lombard effect and comparison with recordings depicting emotional states.
Publikacja
- Rok 2018
In this paper an analysis of audio-visual recordings of the Lombard effect is shown. First, audio signal is analyzed indicating the presence of this phenomenon in the recorded sessions. The principal aim, however, was to discuss problems related to extracting differences caused by the Lombard effect, present in the video , i.e. visible as tension and work of facial muscles aligned to an increase in the intensity of the articulated...

Pełny tekst do pobrania w serwisie zewnętrznym
Pursuing Listeners’ Perceptual Response in Audio-Visual Interactions - Headphones vs Loudspeakers: A Case Study
Publikacja
- B. Mróz
- B. Kostek
- Archives of Acoustics - Rok 2022
This study investigates listeners’ perceptual responses in audio-visual interactions concerning binaural spatial audio. Audio stimuli are coupled with or without visual cues to the listeners. The subjective test participants are tasked to indicate the direction of the incoming sound while listening to the audio stimulus via loudspeakers or headphones with the head-related transfer function (HRTF) plugin. First, the methodology...

Pełny tekst do pobrania w portalu
Piotr Szczuko dr hab. inż.

Osoby

Katedra Systemów Multimedialnych

Dr hab. inż. Piotr Szczuko w 2002 roku ukończył studia na Wydziale Elektroniki, Telekomunikacji i Informatyki Politechniki Gdańskiej zdobywając tytuł magistra inżyniera. Tematem pracy dyplomowej było badanie zjawisk jednoczesnej percepcji obrazu cyfrowego i dźwięku dookólnego. W roku 2008 obronił rozprawę doktorską zatytułowaną "Zastosowanie reguł rozmytych w komputerowej animacji postaci", za którą otrzymał nagrodę Prezesa Rady...
Michał Lech dr inż.

Osoby

Michał Lech was born in Gdynia in 1983. In 2007 he graduated from the faculty of Electronics, Telecommunications and Informatics of Gdansk University of Technology. In June 2013, he received his Ph.D. degree. The subject of the dissertation was: “A Method and Algorithms for Controlling the Sound Mixing Processes with Hand Gestures Recognized Using Computer Vision”. The main focus of the thesis was the bias of audio perception caused...
Marek Blok dr hab. inż.

Osoby

Marek Blok w 1994 roku ukończył studia na kierunku Telekomunikacja wydziału Elektroniki Politechniki Gdańskiej i uzyskał tytuł mgra inżyniera. Doktorat w zakresie telekomunikacji uzyskał w 2003 roku na Wydziale Elektroniki, Telekomunikacji i Informatyki Politechniki Gdańskiej. W 2017 roku uzyskał stopień naukowy dra habilitowanego w dyscyplinie telekomunikacja. Jego zainteresowania badawcze ukierunkowane są na telekomunikacyjne...
Application of gaze tracking technology to quality of experience domain
Publikacja
- B. Kostek
- B. Kunka
- Rok 2010
A new methodological approach to study subjective assessment results employing gaze tracking technology is shown. Notions of Human-Computer Interaction (HCI) and Quality of Experience (QoE) are shortly introduced in the context of their common application. Then, the gaze tracking system developed at the Multimedia Systems Department (MSD) of Gdansk University of Technology (GUT) is presented. A series of audio-visual subjective...
Piotr Odya dr inż.

Osoby

Katedra Systemów Multimedialnych

Piotr Odya urodził się w Gdańsku w 1974. W 1999 roku ukończył z wyróżnieniem studia na Wydziale Elektroniki, Telekomunikacji i Informatyki Politechniki Gdańskiej zdobywając tytuł magistra inżyniera. Praca dyplomowa dotyczyła problemów poprawy jakości dźwięku w studiach emisyjnych współczesnych rozgłośni radiowych.Jego zainteresowania dotyczą montażu wideofonicznego, systemów dźwięku wielokanałowego. W ramach studiów doktoranckich...
Determining Pronunciation Differences in English Allophones Utilizing Audio Signal Parameterization
Publikacja
- B. Kostek
- M. Piotrowska
- T. Ciszewski
- A. Czyżewski
- Rok 2017
An allophonic description of English plosive consonants, based on audio-visual recordings of 600 specially selected words, was developed. First, several speakers were recorded while reading words from a teleprompter. Then, every word was played back from the previously recorded sample read by a phonology expert and each examined speaker repeated a particular word trying to imitate correct pronunciation. The next step consisted...
Classifying type of vehicles on the basis of data extracted from audio signal characteristics
Publikacja
- Journal of the Acoustical Society of America - Rok 2017
The aim of this study is to find and optimize a feature vector for an automatic recognition of the type of vehicles, extracted form an audio signal. First, the influence of weather-based conditions of road surface on spectral characteristic of the audio signal recorded from a passing vehicle in close proximity to the road is discussed. Next, parameterization of the recorded audio signal is performed. For that purpose, the MIRtoolbox,...

Pełny tekst do pobrania w serwisie zewnętrznym
Automatic audio signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publikacja
- D. Koszewski
- Rok 2023
The purpose of this dissertation is to develop an automatic song mixing system that is capable of automatically mixing a song with good quality in any music genre. This work recalls first the audio signal processing methods used in audio mixing, and it describes selected methods for automatic audio mixing. Then, a novel architecture built based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. Models...

Pełny tekst do pobrania w portalu
Testing A Novel Gesture-Based Mixing Interface
Publikacja
- M. Lech
- B. Kostek
- JOURNAL OF THE AUDIO ENGINEERING SOCIETY - Rok 2013
With a digital audio workstation, in contrast to the traditional mouse-keyboard computer interface, hand gestures can be used to mix audio with eyes closed. Mixing with a visual representation of audio parameters during experiments led to broadening the panorama and a more intensive use of shelving equalizers. Listening tests proved that the use of hand gestures produces mixes that are aesthetically as good as those obtained using...

Pełny tekst do pobrania w portalu
Quality Analysis of Audio-Video Transmission in an OFDM-Based Communication System
Publikacja
- M. Zamłyńska
- G. Debita
- P. Falkowski-Gilski
- Rok 2022
Application of a reliable audio-video communication system, brings many advantages. With the spoken word we can exchange ideas, provide descriptive information, as well as aid to another person. With the availability of visual information one can monitor the surrounding, working environment, etc. As the amount of available bandwidth continues to shrink, researchers focus on novel types of transmission. Currently, orthogonal frequency...

Pełny tekst do pobrania w serwisie zewnętrznym
Surgical tool tracking by on-line selection of structural correlation filters
Publikacja
- A. Jezierska
- D. Węsierski
- Rok 2017
In visual tracking of surgical instruments, correlation filtering finds the best candidate with maximal correlation peak. However, most trackers only consider capturing target appearance but not target structure. In this paper we propose surgical instrument tracking approach that integrates prior knowledge related to rotation of both shaft and tool tips. To this end, we employ rigid parts mixtures model of an instrument. The rigidly...

Pełny tekst do pobrania w serwisie zewnętrznym
Józef Kotus dr hab. inż.

Osoby

Katedra Systemów Multimedialnych
Methodology and technology for the polymodal allophonic speech transcription
Publikacja
- Journal of the Acoustical Society of America - Rok 2016
A method for automatic audiovisual transcription of speech employing: acoustic and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

Pełny tekst do pobrania w serwisie zewnętrznym
Methodology and technology for the polymodal allophonic speech transcription
Publikacja
- Journal of the Acoustical Society of America - Rok 2016
A method for automatic audiovisual transcription of speech employing: acoustic, electromagnetical articulography and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

Pełny tekst do pobrania w serwisie zewnętrznym
Automatic sound recognition for security purposes
Publikacja
- P. Żwan
- Rok 2008
In the paper an automatic sound recognition system is presented. It forms a part of a bigger security system developed in order to monitor outdoor places for non-typical audio-visual events. The analyzed audio signal is being recorded from a microphone mounted in an outdoor place thus a non stationary noise of a significant energy is present in it. In the paper an especially designed algorithm for outdoor noise reduction is presented,...
Postprodukcja nagrania wideo z dzwiekiem dookolnym
Publikacja
- Rok 2009
One of the aims of this paper is to present issues related to audio-video correlation. This is presented on the basis of a short film realization employing surround microphone techniques. First, some related works in the domain of sound and vision correlation are presented. Then assumptions concerning scene creation related to both audio and video are shortly described. Another objective is to discuss results of subjective tests...
Energy Efficiency Study of Audio-video Content Consumption on Selected Android Mobile Terminals
Publikacja
- P. Falkowski-Gilski
- M. Pańkowski
- Rok 2021
Mobile devices are widely used by billions of users worldwide. Thanks to their main advantage, which is portability, they should be fully operational as long as possible, without the need to recharge or connect them to external power sources. This paper describes a study, carried out on four different mobile devices, with different hardware and software parameters, running the Android operating system. The research campaign involved...

Pełny tekst do pobrania w serwisie zewnętrznym
Quality Evaluation of Novel DTD Algorithm Based on Audio Watermarking
Publikacja
- A. Ciarkowski
- A. Czyżewski
- Rok 2011
Echo cancellers typically employ a doubletalk detection (DTD) algorithm in order to keep the adaptive filter from diverging in the presence of near-end speech signal or other disruptive sounds in the microphone signal. A novel doubletalk detection algorithm based on techniques similar to those used for audio signal watermarking was introduced by the authors. The application of the described DTD algorithm within acoustic echo cancellation...

Pełny tekst do pobrania w serwisie zewnętrznym
Building Knowledge for the Purpose of Lip Speech Identification
Publikacja
- Advances in Intelligent Systems and Computing - Rok 2017
Consecutive stages of building knowledge for automatic lip speech identification are shown in this study. The main objective is to prepare audio-visual material for phonetic analysis and transcription. First, approximately 260 sentences of natural English were prepared taking into account the frequencies of occurrence of all English phonemes. Five native speakers from different countries read the selected sentences in front of...

Pełny tekst do pobrania w serwisie zewnętrznym
Analiza stanu nawierzchni i klas pojazdów na podstawie parametrów ekstrahowanych z sygnału fonicznego
Publikacja
- K. Marciniuk
- B. Kostek
- Zeszyty Naukowe Wydziału Elektrotechniki i Automatyki Politechniki Gdańskiej - Rok 2016
Celem badań jest poszukiwanie parametrów wektora cech ekstrahowanego z sygnału fonicznego w kontekście automatycznego rozpoznawania stanu nawierzchni jezdni oraz typu pojazdów. W pierwszej kolejności przedstawiono wpływ warunków pogodowych na charakterystykę widmową sygnału fonicznego rejestrowanego przy przejeżdżających pojazdach. Następnie, dokonano parametryzacji sygnału fonicznego oraz przeprowadzano analizę korelacyjną w celu...

Pełny tekst do pobrania w portalu
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publikacja
- Rok 2016
Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Pełny tekst do pobrania w serwisie zewnętrznym
Multiple Cues-Based Robust Visual Object Tracking Method
Publikacja
- B. Khan
- A. Jalil
- A. Ali
- K. Alkhaledi
- K. Mehmood
- K. M. Cheema
- M. Murad
- H. Tariq
- A. M. El-Sherbeeny
- Electronics - Rok 2022
Visual object tracking is still considered a challenging task in computer vision research society. The object of interest undergoes significant appearance changes because of illumination variation, deformation, motion blur, background clutter, and occlusion. Kernelized correlation filter- (KCF) based tracking schemes have shown good performance in recent years. The accuracy and robustness of these trackers can be further enhanced...

Pełny tekst do pobrania w portalu
English Language Learning Employing Developments in Multimedia IS
Publikacja
- Rok 2024
In the realm of the development of information systems related to education, integrating multimedia technologies offers novel ways to enhance foreign language learning. This study investigates audio-video processing methods that leverage real-time speech rate adjustment and dynamic captioning to support English language acquisition. Through a mixed-methods analysis involving participants from a language school, we explore the impact...

Pełny tekst do pobrania w serwisie zewnętrznym
A comparative study of English viseme recognition methods and algorithm
Publikacja
- D. Jachimski
- A. Czyżewski
- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2018
An elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...

Pełny tekst do pobrania w portalu
A comparative study of English viseme recognition methods and algorithms
Publikacja
- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2018
An elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...

Pełny tekst do pobrania w portalu
Visual Attention Distribution Based Assessment of User's Skill in Electronic Medical Record Navigation
Publikacja
- T. Kocejko
- J. Wtorek
- K. Goforth
- K. Moidu
- Journal of Medical Imaging and Health Informatics - Rok 2015
Currently, the most precise way of reflecting the skills level is an expert’s subjective assessment. In this paper we investigate the possibility of the use of eye tracking data for scalar quantitative and objective assessment of medical staff competency in EMR system navigation. According to the experiment conducted by Yarbus the observation process of particular features is associated with thinking. Moreover, eye tracking is...

Pełny tekst do pobrania w serwisie zewnętrznym
Multimodal human-computer interfaces based on advanced video and audio analysis
Publikacja
- Rok 2013
Multimodal interfaces development history is reviewed briefly in the introduction. Examples of applications of multimodal interfaces to education software and for the disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and the audio interface for speech stretching for hearing impaired and stuttering people. The Smart...

Pełny tekst do pobrania w serwisie zewnętrznym
Gesture-controlled Sound Mixing System With a Sonified Interface
Publikacja
- M. Lech
- B. Kostek
- Rok 2013
In this paper the Authors present a novel approach to sound mixing. It is materialized in a system that enables to mix sound with hand gestures recognized in a video stream. The system has been developed in such a way that mixing operations can be performed both with or without visual support. To check the hypothesis that the mixing process needs only an auditory display, the influence of audio information visualization on sound...

Pełny tekst do pobrania w serwisie zewnętrznym
Cross-domain applications of multimodal human-computer interfaces
Publikacja
- A. Czyżewski
- Rok 2015
Developed multimodal interfaces for education applications and for disabled people are presented, including interactive electronic whiteboard based on video image analysis, application for controlling computers with mouth gestures and audio interface for speech stretching for hearing impaired and stuttering people and intelligent pen allowing for diagnosing and ameliorating developmental dyslexia. The eye-gaze tracking system named...
Automatic music signal mixing system based on one-dimensional Wave-U-Net autoencoders
Publikacja
- D. Koszewski
- T. Görne
- G. Korvel
- B. Kostek
- EURASIP Journal on Audio Speech and Music Processing - Rok 2023
The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods for automatic audio mixing first. Then, a novel deep model based on one-dimensional Wave-U-Net autoencoders is proposed for automatic music mixing. The model is trained on a custom-prepared database. Mixes created using the...

Pełny tekst do pobrania w portalu
Examining Government-Citizen Interactions on Twitter using Visual and Sentiment Analysis
Publikacja
- R. Hubert
- E. Estevez
- A. Maguitman
- T. Janowski
- Rok 2018
The goal of this paper is to propose a methodology comprising a range of visualization techniques to analyze the interactions between government and citizens on the issues of public concern taking place on Twitter, mainly through the official government or ministry accounts. The methodology addresses: 1) the level of government activity in different countries and sectors; 2) the topics that are addressed through such activities;...

Pełny tekst do pobrania w portalu
Multimodal English corpus for automatic speech recognition
Publikacja
- Rok 2013
A multimodal corpus developed for research of speech recognition based on audio-visual data is presented. Besides usual video and sound excerpts, the prepared database contains also thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus enables to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
On the Consumption of Multimedia Content Using Mobile Devices: a Year to Year User Case Study
Publikacja
- P. Falkowski-Gilski
- Archives of Acoustics - Rok 2020
In the early days, consumption of multimedia content related with audio signals was only possible in a stationary manner. The music player was located at home, with a necessary physical drive. An alternative way for an individual was to attend a live performance at a concert hall or host a private concert at home. To sum up, audio-visual effects were only reserved for a narrow group of recipients. Today, thanks to portable players,...

Pełny tekst do pobrania w portalu
MODALITY corpus - SPEAKER 17 - SEQUENCE S4
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S2
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S5
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S3
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...

Wyszukiwarka

Filtry

Katalog

Wyniki wyszukiwania dla: audio-visual correlation

Piotr Szczuko dr hab. inż.

Michał Lech dr inż.

Marek Blok dr hab. inż.

Piotr Odya dr inż.

Józef Kotus dr hab. inż.