Filters
all: 450
selected: 192
Search results for: TEXT REPRESENTATION DOCUMENTS CATEGORIZATION INFORMATION RETRIEVAL
-
TF-IDF weighted bag-of-words preprocessed text documents from Simple English Wikipedia
Research Data: The SimpleWiki2K-scores dataset contains TF-IDF weighted bag-of-words preprocessed text documents (raw strings are not available) [feature matrix] and their multi-label assignments [label matrix]. Label scores for each document are also provided for the enhanced multi-label KNN [1] and LEML [2] classifiers. The aim of the dataset is to establish a benchmark...
-
Internal legal acts of technical and medical universities in Poland regulating classes conducted in-person during the Covid-19 pandemic
Research Data: A database of legal acts and other internal documents of medical and technical universities in Poland regulating the organization of in-person or hybrid classes during the COVID-19 pandemic, from the summer semester 2019/2020 to the winter semester 2020/2021. Documents were encoded in two separate coding systems using the MAXQDA program for qualitative...
-
A collection of directed graphs for the minimum cycle mean weight computation
Research Data: This dataset contains definitions of the 16 directed graphs with weighted edges that were described in the following paper: Paweł Pilarczyk, A space-efficient algorithm for computing the minimum cycle mean in a directed graph, Journal of Mathematics and Computer Science, 20 (2020), no. 4, 349--355, DOI: 10.22436/jmcs.020.04.08, URL: http://dx.doi.org/10.22436/jmcs.020.04.08 These...
-
Elgold partial: News
Research Data: The dataset contains 37 English texts scraped from news websites. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking...
-
Elgold partial: Automotive blogs
Research Data: The dataset contains 34 English texts scraped from automotive blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and...
-
Elgold partial: Movie reviews
Research Data: The dataset contains 37 English texts with movie reviews. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: Job offers
Research Data: The dataset contains 34 English texts scraped from web portals with job offers. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity...
-
Elgold partial: Scientific papers' abstracts
Research Data: The dataset contains 87 abstracts of scientific papers in English, randomly chosen from the following scientific disciplines: Biomedicine, Life Sciences, Mathematics, Medicine, Science, Humanities, Social Science.
-
Elgold partial: Amazon product reviews
Research Data: The dataset contains 34 Amazon product reviews in English. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
Elgold partial: History blogs
Research Data: The dataset contains 13 texts from English history blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
-
ArchBGal32cB 441Glu mutein gene analysis dataset
Research Data
-
X-ray Photoelectron Spectroscopy studies of laser-induced titania nanotubes
Research Data: This dataset contains the results of high-resolution XPS studies obtained during the formation of hollow nanopillar arrays through the laser-induced transformation of titania nanotubes.
-
X-ray Photoelectron Spectroscopy studies of citric acid adsorption on aluminium alloy 5754 in alkaline media
Research Data: This dataset contains the results of high-resolution XPS studies obtained during evaluation of the high corrosion inhibition efficiency of citric acid towards aluminium alloy 5754 in bicarbonate buffer at pH=11. The samples were exposed to the electrolytic environment for 100 min. Each sample was exposed at a different citric acid concentration, ranging from 0...
-
SYNAT_MUSIC_GENRE_FV_173
Research Data: This is the original dataset containing 51582 music tracks (22 music genres), each described by a 173-element feature vector [1-6,9]. A collection of more than 50000 music excerpts described with a set of descriptors obtained through the analysis of 30-second mp3 recordings was gathered in a database called SYNAT. The SYNAT database was realized by the Gdansk University...
-
X-ray Photoelectron Spectroscopy studies of various carboxylic acids adsorption on aluminium alloys in alkaline media
Research Data: This dataset contains the results of high-resolution XPS studies obtained during evaluation of the high corrosion inhibition efficiency of various carboxylic acids towards aluminium alloy 5754 in bicarbonate buffer at pH=11.
-
SYNAT Music Genre Parameters PCA 19
Research Data: The dataset contains feature vectors obtained after performing Principal Component Analysis (PCA), so there are 11 music genres and 19-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...
-
SYNAT_PCA_48
Research Data: There is a series of datasets containing feature vectors derived from music tracks. This dataset contains 51582 music tracks (22 music genres) and feature vectors obtained after performing Principal Component Analysis (PCA), so there are 48-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier...
-
SYNAT_PCA_11
Research Data: The dataset contains 51582 music tracks (22 music genres) and feature vectors obtained after performing Principal Component Analysis (PCA), so there are 11-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...
-
SkinDepth - synthetic 3D skin lesion database
Research Data: SkinDepth is the first synthetic 3D skin lesion database. The release of the SkinDepth dataset is intended to contribute to the development of algorithms for:
-
Conley-Morse graphs for a two-dimensional discrete neuron model (low resolution)
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper “Topological-numerical analysis of a two-dimensional discrete neuron model” by Paweł Pilarczyk, Justyna Signerska-Rynkowska and Grzegorz Graff. A preprint of this paper is available at https://doi.org/10.48550/arXiv.2209.03443.
-
Conley-Morse graphs for a two-dimensional discrete neuron model (limited range)
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper “Topological-numerical analysis of a two-dimensional discrete neuron model” by Paweł Pilarczyk, Justyna Signerska-Rynkowska and Grzegorz Graff. A preprint of this paper is available at https://doi.org/10.48550/arXiv.2209.03443.
-
Conley-Morse graphs for a two-dimensional discrete neuron model (full range)
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper “Topological-numerical analysis of a two-dimensional discrete neuron model” by Paweł Pilarczyk, Justyna Signerska-Rynkowska and Grzegorz Graff. A preprint of this paper is available at https://doi.org/10.48550/arXiv.2209.03443.
-
Clinical situations text database for Polish language
Research Data: The dataset contains a database of anonymized texts in Polish for the purposes of building a medical speech corpus, for clinical situations in the following areas: medical interview, interview and description of the result of an oncological examination, description of a radiological examination, description of a pathomorphological examination, description...
-
Conley-Morse graphs for a non-linear Leslie population model with 2 varying parameters
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "A database schema for the analysis of global dynamics of multiparameter systems" by Z. Arai, W. Kalies, H. Kokubu, K. Mischaikow, H. Oka, P. Pilarczyk, published in SIAM Journal on Applied Dynamical Systems (SIADS),...
-
Conley-Morse graphs for a non-linear Leslie population model with 3 varying parameters
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "A database schema for the analysis of global dynamics of multiparameter systems" by Z. Arai, W. Kalies, H. Kokubu, K. Mischaikow, H. Oka, P. Pilarczyk, published in SIAM Journal on Applied Dynamical Systems (SIADS),...
-
Conley-Morse graphs for a two-patch vaccination model
Research Data: This dataset contains selected results of rigorous numerical computations described in Section 5 of the paper "Rich bifurcation structure in a two-patch vaccination model" by D.H. Knipl, P. Pilarczyk, G. Röst, published in SIAM Journal on Applied Dynamical Systems (SIADS), Vol. 14, No. 2 (2015), pp. 980–1017, doi: 10.1137/140993934.
-
Simulations of wave propagation and attenuation in fields of colliding ice floes
Research Data: This dataset contains results of numerical simulations of sea ice-wave interactions, corresponding to laboratory experiments conducted at the Large Ice Model Basin (LIMB) at the Hamburg Ship Model Basin (HSVA) as part of the LS-WICE ("Loads on Structure and Waves in Ice"; https://zenodo.org/record/1067170#.XrLt_dhpxhE) project. The simulations were conducted...
-
Conley-Morse graphs for a population model with harvesting. Case He-S1: Equal harvesting of juveniles and adults, survival rates of juveniles and adults add up to 1
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
Conley-Morse graphs for a population model with harvesting. Case Hj-Se: Harvesting juveniles only, equal survival rates of juveniles and adults
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
Conley-Morse graphs for a population model with harvesting. Case He-Se: Equal harvesting and equal survival rates of juveniles and adults
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
Conley-Morse graphs for a population model with harvesting. Case Hj-S1: Harvesting juveniles only, survival rates of juveniles and adults add up to 1
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
Conley-Morse graphs for a population model with harvesting. Case Ha-S1: Harvesting adults only, survival rates of juveniles and adults add up to 1
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
Conley-Morse graphs for a population model with harvesting. Case Ha-Se: Harvesting adults only, equal survival rates of juveniles and adults
Research Data: This dataset contains selected results of rigorous numerical computations conducted in the framework of the research described in the paper "Global dynamics in a stage-structured discrete population model with harvesting" by E. Liz and P. Pilarczyk: Journal of Theoretical Biology, Vol. 297 (2012), pp. 148–165, doi: 10.1016/j.jtbi.2011.12.012.
-
MODALITY corpus - SPEAKER 35 - COMMANDS C1
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S6
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C5
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S4
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 10 - SEQUENCE S1
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S2
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 39 - COMMANDS C1
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S3
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C3
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S2
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 33 - SEQUENCE S1
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C2
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - COMMANDS C3
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S4
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - SEQUENCE S6
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 21 - SEQUENCE S5
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...
-
MODALITY corpus - SPEAKER 01 - COMMANDS C4
Research Data: The MODALITY corpus is a multimodal database of word recordings in English. It consists of over 30 hours of multimodal recordings. The database contains high-resolution, high-framerate stereoscopic video streams and audio signals obtained from a microphone array and a laptop microphone. The corpus can be employed to develop an AVSR system,...