Filters
total: 447
filtered: 174
Search results for: speech recognition, allophone, phonology, foreign language, audio features
-
MODALITY corpus - SPEAKER 10 - COMMANDS C3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - SEQUENCE S6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 35 - COMMANDS C6
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 32 - COMMANDS C5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 35 - COMMANDS C5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - COMMANDS C4
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 27 - SEQUENCE S3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 27 - COMMANDS C3
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - SEQUENCE S5
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 27 - SEQUENCE S2
Open Research DataThe MODALITY corpus is one of the multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
The American Sign Language alphabet
Open Research DataThe American Sign Language dataset contains all static letters of the American alphabet, meaning those that do not require movement to perform (the entire alphabet except for the letters 'J' and 'Z', which are dynamic and require hand movement).
-
Video recordings of static hand gestures for gesture based interaction
Open Research DataThis data set contains video recording of selected simple hand gestures related to sign language. The purpose of the data set is to evaluate different computer algorithms design for hand gesture detection as well as for hand features and hand pose detection and identification. The data set contains 5 video recordings in mp4 format. Each recording is...
-
Emotions in polish speech recordings
Open Research DataThe data set presents emotions recorded in sound files that are expressions of Polish speech. Statements were made by people aged 21-23, young voices of 5 men. Each person said the following words / nie – no, oddaj - give back, podaj – pass, stop - stop, tak - yes, trzymaj -hold / five times representing a specific emotion - one of three - anger (a),...
-
ALOFON corpus
Open Research DataThe ALOFON corpus is one of the multimodal database of word recordings in English, available at http://www.modality-corpus.org/. The ALOFON corpus is oriented towards the recording of the speech equivalence variants. For this purpose, a total of 7 people who are or speak English with native speaker fluency and a variety of Standard Southern British...
-
Elgold partial: Automotive blogs
Open Research DataThe dataset contains 34 English texts scrapped from automotive blogs. In each text, the named entities are marked. Each name entity is linked to the corresponding Wikipedia if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and...
-
Elgold partial: Movie reviews
Open Research DataThe dataset contains 37 English texts with movie reviews. In each text, the named entities are marked. Each name entity is linked to the corresponding Wikipedia if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: Job offers
Open Research DataThe dataset contains 34 English texts scrapped from the web portals offering job offers. In each text, the named entities are marked. Each name entity is linked to the corresponding Wikipedia if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity...
-
Elgold partial: Scientific papers' abstracts
Open Research DataThe dataset contains 87 Scientific papers' abstracts in English randomly chosen from the folowing scientific disciplines: Biomedicine, Life Sciences, Mathematics, Medicine, Science, Humanities, Social Science.
-
Elgold partial: Amazon product reviews
Open Research DataThe dataset contains 34 Amazon product reviews in English. In each text, the named entities are marked. Each name entity is linked to the corresponding Wikipedia if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: History blogs
Open Research DataThe dataset contains 13 texts from English history blogs. In each text, the named entities are marked. Each name entity is linked to the corresponding Wikipedia if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
SYNAT_MUSIC_GENRE_FV_173
Open Research DataThis is the original dataset containing 51582 music tracks (22 music genres) and 173 element-feature vector [1-6,9]. A collection of more than 50000 music excerpts described with a set of descriptors obtained through the analysis of 30-second mp3 recordings was gathered in a database called SYNAT. The SYNAT database was realized by the Gdansk University...
-
SYNAT Music Genre Parameters PCA 19
Open Research DataThe dataset contains feature vector after Principal Component Analysis (PCA) performing, so there are 11 music genres and 19-element vector derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...
-
SYNAT_PCA_48
Open Research DataThere is a series of datasets containing feature vectors derived from music tracks. The dataset contains 51582 music tracks (22 music genres) and feature vector after Principal Component Analysis (PCA) performing, so there are 48-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier...
-
SYNAT_PCA_11
Open Research DataThe dataset contains 51582 music tracks (22 music genres) and feature vector after Principal Component Analysis (PCA) performing, so there are 11-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...