Wyniki wyszukiwania dla: VISUAL RECOGNITION

Wyniki wyszukiwania dla: VISUAL RECOGNITION

wyników na stronę:
osadź ten widok na swojej stronie

Filtry

wszystkich: 227

wyczyść wszystkie filtry niedostępne

Multimodal Audio-Visual Recognition of Traffic Events
Publikacja
- Rok 2011
Przedstawiono demonstrator systemu wykrywania niebezpiecznych zdarzeń w ruchu drogowym oparty na jednoczesnej analizie danych wizyjnych i akustycznych. System jest częścią systemu automatycznego nadzoru bezpieczeństwa. Wykorzystuje on kamery i mikrofony jako źródła danych. Przedstawiono wykorzystane algorytmy - algorytmy rozpoznawania zdarzeń dźwiękowych oraz analizy obrazu. Zaprezentowano wyniki działania algorytmów na przykładzie...
Robust and Efficient Machine Learning Algorithms for Visual Recognition
Publikacja
- S. Cygert
- Rok 2022
In visual recognition, the task is to identify and localize all objects of interest in the input image. With the ubiquitous presence of visual data in modern days, the role of object recognition algorithms is becoming more significant than ever and ranges from autonomous driving to computer-aided diagnosis in medicine. Current models for visual recognition are dominated by models based on Convolutional Neural Networks (CNNs), which...

Pełny tekst do pobrania w portalu
Combined Single Neuron Unit Activity and Local Field Potential Oscillations in a Human Visual Recognition Memory Task
Publikacja
- M. T. Kucewicz
- B. M. Berry
- M. R. Bower
- J. Cymbalnik
- V. Svehlik
- S. M. Stead
- G. A. Worrell
- IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING - Rok 2016
GOAL: Activities of neuronal networks range from action potential firing of individual neurons, coordinated oscillations of local neuronal assemblies, and distributed neural populations. Here, we describe recordings using hybrid electrodes, containing both micro- and clinical macroelectrodes, to simultaneously sample both large-scale network oscillations and single neuron spiking activity in the medial temporal lobe structures...

Pełny tekst do pobrania w serwisie zewnętrznym
Vowel recognition based on acoustic and visual features
Publikacja
- Archives of Acoustics - Rok 2006
W artykule zaprezentowano metodę, która może ułatwić naukę mowy dla osób z wadami słuchu. Opracowany system rozpoznawania samogłosek wykorzystuje łączną analizę parametrów akustycznych i wizualnych sygnału mowy. Parametry akustyczne bazują na współczynnikach mel-cepstralnych. Do wyznaczenia parametrów wizualnych z kształtu i ruchu ust zastosowano Active Shape Models. Jako klasyfikator użyto sztuczną sieć neuronową. Działanie systemu...

Pełny tekst do pobrania w portalu
Visual Lip Contour Detection for the Purpose of Speech Recognition
Publikacja
- Rok 2014
A method for visual detection of lip contours in frontal recordings of speakers is described and evaluated. The purpose of the method is to facilitate speech recognition with visual features extracted from a mouth region. Different Active Appearance Models are employed for finding lips in video frames and for lip shape and texture statistical description. Search initialization procedure is proposed and error measure values are...
An audio-visual corpus for multimodal automatic speech recognition
Publikacja
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Rok 2017
review of available audio-visual speech corpora and a description of a new multimodal corpus of English speech recordings is provided. The new corpus containing 31 hours of recordings was created specifically to assist audio-visual speech recognition systems (AVSR) development. The database related to the corpus includes high-resolution, high-framerate stereoscopic video streams from RGB cameras, depth imaging stream utilizing Time-of-Flight...

Pełny tekst do pobrania w portalu
Comparison of Acoustic and Visual Voice Activity Detection for Noisy Speech Recognition
Publikacja
- Rok 2016
The problem of accurate differentiating between the speaker utterance and the noise parts in a speech signal is considered. The influence of utilizing a voice activity detection in speech signals on the accuracy of the automatic speech recognition (ASR) system is presented. The examined methods of voice activity detection are based on acoustic and visual modalities. The problem of detecting the voice activity in clean and noisy...
Human-Computer Interface Based on Visual Lip Movement and Gesture Recognition
Publikacja
- P. Dalka
- A. Czyżewski
- International Journal of Computing Science and Mathematics - Rok 2010
The multimodal human-computer interface (HCI) called LipMouse is presented, allowing a user to work on a computer using movements and gestures made with his/her mouth only. Algorithms for lip movement tracking and lip gesture recognition are presented in details. User face images are captured with a standard webcam. Face detection is based on a cascade of boosted classifiers using Haar-like features. A mouth region is located in...

Pełny tekst do pobrania w serwisie zewnętrznym
Combining visual and acoustic modalities to ease speech recognition by hearing impaired people
Publikacja
- B. Kostek
- P. Dalka
- Rok 2005
Artykuł prezentuje system, którego celem działania jest ułatwienie procesu treningu poprawnej wymowy dla osób z poważnymi wadami słuchu. W analizie mowy wykorzystane zostały parametry akutyczne i wizualne. Do wyznaczenia parametrów wizualnych na podstawie kształtu i ruchu ust zostały wykorzystane modele Active Shape Models. Parametry akustyczne bazują na współczynnikach melcepstralnych. Do klasyfikacji wypowiadanych głosek została...
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publikacja
- Rok 2014
The problem of video framerate and audio/video synchronization in audio-visual speech recognition is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
EXAMINING INFLUENCE OF VIDEO FRAMERATE AND AUDIO/VIDEO SYNCHRONIZATION ON AUDIO-VISUAL SPEECH RECOGNITION ACCURACY
Publikacja
- Rok 2014
The problem of video framerate and audio/video synchronization in audio-visual speech recogni-tion is considered. The visual features are added to the acoustic parameters in order to improve the accuracy of speech recognition in noisy conditions. The Mel-Frequency Cepstral Coefficients are used on the acoustic side whereas Active Appearance Model features are extracted from the image. The feature fusion approach is employed. The...
Automatic sound recognition for security purposes
Publikacja
- P. Żwan
- Rok 2008
In the paper an automatic sound recognition system is presented. It forms a part of a bigger security system developed in order to monitor outdoor places for non-typical audio-visual events. The analyzed audio signal is being recorded from a microphone mounted in an outdoor place thus a non stationary noise of a significant energy is present in it. In the paper an especially designed algorithm for outdoor noise reduction is presented,...
A comparative study of English viseme recognition methods and algorithms
Publikacja
- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2018
An elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector construction...

Pełny tekst do pobrania w portalu
A comparative study of English viseme recognition methods and algorithm
Publikacja
- D. Jachimski
- A. Czyżewski
- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2018
An elementary visual unit – the viseme is concerned in the paper in the context of preparing the feature vector as a main visual input component of Audio-Visual Speech Recognition systems. The aim of the presented research is a review of various approaches to the problem, the implementation of algorithms proposed in the literature and a comparative research on their effectiveness. In the course of the study an optimal feature vector...

Pełny tekst do pobrania w portalu
Multimodal English corpus for automatic speech recognition
Publikacja
- Rok 2013
A multimodal corpus developed for research of speech recognition based on audio-visual data is presented. Besides usual video and sound excerpts, the prepared database contains also thermovision images and depth maps. All streams were recorded simultaneously, therefore the corpus enables to examine the importance of the information provided by different modalities. Based on the recordings, it is also possible to develop a speech...
Vocalic Segments Classification Assisted by Mouth Motion Capture
Publikacja
- Rok 2018
Visual features convey important information for automatic speech recognition (ASR), especially in noisy environment. The purpose of this study is to evaluate to what extent visual data (i.e. lip reading) can enhance recognition accuracy in the multi-modal approach. For that purpose motion capture markers were placed on speakers' faces to obtain lips tracking data during speaking. Different parameterizations strategies were tested...

Pełny tekst do pobrania w serwisie zewnętrznym
Visual Features for Endoscopic Bleeding Detection
Publikacja
- A. Brzeski
- Current Journal of Applied Science and Technology (British Journal of Applied Science & Technology) - Rok 2014
Aims: To define a set of high-level visual features of endoscopic bleeding and evaluate their capabilities for potential use in automatic bleeding detection. Study Design: Experimental study. Place and Duration of Study: Department of Computer Architecture, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, between March 2014 and May 2014. Methodology: The features have...

Pełny tekst do pobrania w portalu
Language material for English audiovisual speech recognition system developmen . Materiał językowy do wykorzystania w systemie audiowizualnego rozpoznawania mowy angielskiej
Publikacja
- A. Czyżewski
- B. Kostek
- T. Ciszewski
- D. Majewicz
- Rok 2013
The bi-modal speech recognition system requires a 2-sample language input for training and for testing algorithms which precisely depicts natural English speech. For the purposes of the audio-visual recordings, a training data base of 264 sentences (1730 words without repetitions; 5685 sounds) has been created. The language sample reflects vowel and consonant frequencies in natural speech. The recording material reflects both the...
From Linear Classifier to Convolutional Neural Network for Hand Pose Recognition
Publikacja
- P. Rościszewski
- Computer Science - Rok 2017
Recently gathered image datasets and the new capabilities of high-performance computing systems have allowed developing new artificial neural network models and training algorithms. Using the new machine learning models, computer vision tasks can be accomplished based on the raw values of image pixels instead of specific features. The principle of operation of deep neural networks resembles more and more what we believe to be happening...

Pełny tekst do pobrania w portalu
Recognition of Emotions in Speech Using Convolutional Neural Networks on Different Datasets
Publikacja
- Electronics - Rok 2022
Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were applied to extract emotions based on spectrograms and mel-spectrograms. This study uses spectrograms and mel-spectrograms to investigate which feature extraction method better represents emotions and how big the differences in efficiency are in this context. The conducted studies demonstrated that mel-spectrograms are a better-suited...

Pełny tekst do pobrania w portalu
Material for Automatic Phonetic Transcription of Speech Recorded in Various Conditions
Publikacja
- Rok 2016
Automatic speech recognition (ASR) is under constant development, especially in cases when speech is casually produced or it is acquired in various environment conditions, or in the presence of background noise. Phonetic transcription is an important step in the process of full speech recognition and is discussed in the presented work as the main focus in this process. ASR is widely implemented in mobile devices technology, but...

Pełny tekst do pobrania w serwisie zewnętrznym
Automatic Watercraft Recognition and Identification on Water Areas Covered by Video Monitoring as Extension for Sea and River Traffic Supervision Systems
Publikacja
- N. Wawrzyniak
- A. Stateczny
- Polish Maritime Research - Rok 2018
The article presents the watercraft recognition and identification system as an extension for the presently used visual water area monitoring systems, such as VTS (Vessel Traffic Service) or RIS (River Information Service). The watercraft identification systems (AIS - Automatic Identification Systems) which are presently used in both sea and inland navigation require purchase and installation of relatively expensive transceivers...

Pełny tekst do pobrania w serwisie zewnętrznym
Remote Estimation of Video-Based Vital Signs in Emotion Invocation Studies
Publikacja
- Rok 2018
Abstract— The goal of this study is to examine the influence of various imitated and video invoked emotions on the vital signs (respiratory and pulse rates). We also perform an analysis of the possibility to extract signals from sequences acquired with cost-effective cameras. The preliminary results show that the respiratory rate allows for better separation of some emotions than the pulse rate, yet this relation highly depends...

Pełny tekst do pobrania w portalu
Methodology and technology for the polymodal allophonic speech transcription
Publikacja
- Journal of the Acoustical Society of America - Rok 2016
A method for automatic audiovisual transcription of speech employing: acoustic and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e. the changes in the articulatory setting of speech organs for...

Pełny tekst do pobrania w serwisie zewnętrznym
Methodology and technology for the polymodal allophonic speech transcription
Publikacja
- Journal of the Acoustical Society of America - Rok 2016
A method for automatic audiovisual transcription of speech employing: acoustic, electromagnetical articulography and visual speech representations is developed. It adopts a combining of audio and visual modalities, which provide a synergy effect in terms of speech recognition accuracy. To establish a robust solution, basic research concerning the relation between the allophonic variation of speech, i.e., the changes in the articulatory...

Pełny tekst do pobrania w serwisie zewnętrznym
The Innovative Faculty for Innovative Technologies
Publikacja
- Rok 2013
A leaflet describing Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology. Multimedia Systems Department described laboratories and prototypes of: Auditory-visual attention stimulator, Automatic video event detection, Object re-identification application for multi-camera surveillance systems, Object Tracking and Automatic Master-Slave PTZ Camera Positioning System, Passive Acoustic Radar,...

Pełny tekst do pobrania w serwisie zewnętrznym
Robust Object Detection with Multi-input Multi-output Faster R-CNN
Publikacja
- S. Cygert
- A. Czyżewski
- Rok 2022
Recent years have seen impressive progress in visual recognition on many benchmarks, however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, recently it was shown that similarly competitive results could be achieved with a much smaller cost, by using multi-input multi-output architecture (MIMO). In this work,...

Pełny tekst do pobrania w portalu
Robust Object Detection with Multi-input Multi-output Faster R-CNN
Publikacja
- S. Cygert
- A. Czyżewski
- Rok 2022
Recent years have seen impressive progress in visual recognition on many benchmarks, however, generalization to the out-of-distribution setting remains a significant challenge. A state-of-the-art method for robust visual recognition is model ensembling. However, recently it was shown that similarly competitive results could be achieved with a much smaller cost, by using multi-input multi-output architecture (MIMO). In this work,...

Pełny tekst do pobrania w serwisie zewnętrznym
Using Eye-tracking to get information on the skills acquisition by the radiology residents
Publikacja
- T. Kocejko
- A. Poliński
- A. Bujnowski
- M. Kaczmarek
- J. Rumiński
- A. Z. Nowakowski
- J. Wtorek
- T. Gorycki
- Rok 2019
This paper describes the possibility of monitoring the progress of knowledge and skills acquisition by the students of radiology. It is achieved by an analysis of a visual attention distribution patterns during image-based tasks solving. The concept is to use the eye-tracking data to recognize the way how the radiographic images are read by recognized experts, radiography residents involved in the training program, and untrained...

Pełny tekst do pobrania w serwisie zewnętrznym
Video Semantic Analysis Framework based on Run-time Production Rules - Towards Cognitive Vision
Publikacja
- E. Szczerbicki
- C. Toro
- C. Sanin
- JOURNAL OF UNIVERSAL COMPUTER SCIENCE - Rok 2015
This paper proposes a service-oriented architecture for video analysis which separates object detection from event recognition. Our aim is to introduce new tools to be considered in the pathway towards Cognitive Vision as a support for classical Computer Vision techniques that have been broadly used by the scientific community. In the article, we particularly focus in solving some of the reported scalability issues found in current...

Pełny tekst do pobrania w portalu
Detection, classification and localization of acoustic events in the presence of background noise for acoustic surveillance of hazardous situations
Publikacja
- MULTIMEDIA TOOLS AND APPLICATIONS - Rok 2016
Evaluation of sound event detection, classification and localization of hazardous acoustic events in the presence of background noise of different types and changing intensities is presented. The methods for discerning between the events being in focus and the acoustic background are introduced. The classifier, based on a Support Vector Machine algorithm, is described. The set of features and samples used for the training of the...

Pełny tekst do pobrania w portalu
Bimodal classification of English allophones employing acoustic speech signal and facial motion capture
Publikacja
- Journal of the Acoustical Society of America - Rok 2018
A method for automatic transcription of English speech into International Phonetic Alphabet (IPA) system is developed and studied. The principal objective of the study is to evaluate to what extent the visual data related to lip reading can enhance recognition accuracy of the transcription of English consonantal and vocalic allophones. To this end, motion capture markers were placed on the faces of seven speakers to obtain lip...

Pełny tekst do pobrania w serwisie zewnętrznym
The data exchange between smart glasses and healthcare information systems using the HL7 FHIR standard
Publikacja
- J. Rumiński
- A. Bujnowski
- T. Kocejko
- A. Andrushevich
- M. Biallas
- R. Kistler
- Rok 2016
In this study we evaluated system architecture for the use of smart glasses as a viewer of information, as a source of medical data (vital sign measurements: temperature, pulse rate, and respiration rate), and as a filter of healthcare information. All activities were based on patient/device identification procedures using graphical markers or features based on visual appearance. The architecture and particular use cases were implemented...

Pełny tekst do pobrania w serwisie zewnętrznym
From Knowledge based Vision Systems to Cognitive Vision Systems: A Review
Publikacja
- T. Souza
- C. De
- C. Sanin
- E. Szczerbicki
- Rok 2018
Computer vision research and applications have their origins in 1960s. Limitations in computational resources inherent of that time, among other reasons, caused research to move away from artificial intelligence and generic recognition goals to accomplish simple tasks for constrained scenarios. In the past decades, the development in machine learning techniques has contributed to noteworthy progress in vision systems. However,...

Pełny tekst do pobrania w portalu
Video content analysis in the urban area telemonitoring system
Publikacja
- Rok 2010
The task of constant monitoring of video streams from a large number of cameras and reviewing the recordings in order to find a specified event requires a considerable amount of time and effort from the system operators and it is prone to errors. A solution to this problem is an automatic system for constant analysis of camera images being able to raise an alarm if a predefined event is detected. The chapter presents various aspects...

Pełny tekst do pobrania w serwisie zewnętrznym
Karolina Zielińska-Dąbkowska dr inż. arch.

Osoby

Katedra Architektury Miejskiej i Przestrzeni Nadwodnych

Karolina M. Zielinska-Dabkowska (dr inż. arch., Dipl.-Ing. Arch.[FH]) jest adiunktem na Wydziale Architektury Politechniki Gdańskiej. W roku 2002 ukończyła studia na Wydziale Architektury i Urbanistyki Politechniki Gdańskiej a w 2004 inżynierii architektonicznej na HAWK Hochschule für angewandte Wissenschaft und Kunst Hildesheim w Niemczech. Po studiach pracowała dla kilku firm o światowej renomie w Berlinie, Londynie, Nowym Jorku...
Network oscillations modulate interictal epileptiform spike rate during human memory
Publikacja
- J. Matsumoto
- M. Stead
- M. T. Kucewicz
- A. Matsumoto
- P. Peters
- B. Brinkmann
- J. C. Danstrom
- S. Goerss
- W. Marsh
- F. Meyer
- G. Worrell
- Brain: A Journal of Neurology - Rok 2013
Eleven patients being evaluated with intracranial electroencephalography for medically resistant temporal lobe epilepsy participated in a visual recognition memory task. Interictal epileptiform spikes were manually marked and their rate of occurrence compared between baseline and three 2 s periods spanning a 6 s viewing period. During successful, but not unsuccessful, encoding of the images there was a significant reduction in...

Pełny tekst do pobrania w serwisie zewnętrznym
Visual Detection of People Movement Rules Violation in Crowded Indoor Scenes
Publikacja
- P. Dalka
- P. Bratoszewski
- Rok 2013
The paper presents a camera-independent framework for detecting violations of two typical people movement rules that are in force in many public transit terminals: moving in the wrong direction or across designated lanes. Low-level image processing is based on object detection with Gaussian Mixture Models and employs Kalman filters with conflict resolving extensions for the object tracking. In order to allow an effective event...

Pełny tekst do pobrania w serwisie zewnętrznym
“Shadow” vs. “Phase 3D” method within endoscopic examinations of marine engines
Publikacja
- Z. Korczewski
- J. Rudnicki
- Combustion Engines - Rok 2013
A visual investigation of surfaces creating internal, working spaces of marine combustion engines by means of specialized view-finders so called endoscopes is at present almost a basic method of technical diag-nostics. The surface structure of constructional material is visible during investigations like through the magnifying glass (usually with a precisely determined magnification), which makes possible a detection, recognition...

Pełny tekst do pobrania w portalu
Convolutional Neural Networks for C. Elegans Muscle Age Classification Using Only Self-Learned Features
Publikacja
- Journal of Telecommunications and Information Technology - Rok 2022
Nematodes Caenorhabditis elegans (C. elegans) have been used as model organisms in a wide variety of biological studies, especially those intended to obtain a better understanding of aging and age-associated diseases. This paper focuses on automating the analysis of C. elegans imagery to classify the muscle age of nematodes based on the known and well established IICBU dataset. Unlike many modern classification methods, the proposed...

Pełny tekst do pobrania w portalu
High frequency oscillations are associated with cognitive processing in human recognition memory
Publikacja
- M. T. Kucewicz
- J. Cymbalnik
- J. Matsumoto
- B. H. Brinkmann
- M. R. Bower
- V. Vasoli
- V. Sulc
- F. Meyer
- W. Marsh
- S. M. Stead
- G. A. Worrell
- Brain: A Journal of Neurology - Rok 2014
High frequency oscillations are associated with normal brain function, but also increasingly recognized as potential biomarkers of the epileptogenic brain. Their role in human cognition has been predominantly studied in classical gamma frequencies (30-100 Hz), which reflect neuronal network coordination involved in attention, learning and memory. Invasive brain recordings in animals and humans demonstrate that physiological oscillations...

Pełny tekst do pobrania w portalu
MODALITY corpus - SPEAKER 17 - SEQUENCE S1
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S4
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S2
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S5
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S3
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 17 - SEQUENCE S6
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 35 - COMMANDS C1
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 21 - SEQUENCE S6
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
MODALITY corpus - SPEAKER 21 - COMMANDS C5
Dane Badawcze
- seria: MODALITY corpus
The MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...

Wyszukiwarka

Filtry

Katalog

Wyniki wyszukiwania dla: VISUAL RECOGNITION

Karolina Zielińska-Dąbkowska dr inż. arch.