Julian Szymański - Research data

dr hab. inż. Julian Szymański

Employment

Deputy Director, Industrial Doctoral School at Industrial Doctoral School
Associate professor at Department of Computer Architecture


series: Elgold - partial, count: 8


Elgold partial: News
Open Research Data
version 1.1
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 37 English texts scraped from news websites. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking...
Elgold intermediate: annotated raw
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains a subset of texts from "Elgold intermediate: raw texts" with named entities marked and linked to corresponding Wikipedia articles. The texts were annotated by 31 participants during a 1.5-hour session.
Elgold partial: History blogs
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 13 texts from English history blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
Elgold partial: Scientific papers' abstracts
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 87 scientific papers' abstracts in English, randomly chosen from the following scientific disciplines: Biomedicine, Life Sciences, Mathematics, Medicine, Science, Humanities, Social Science.
Elgold partial: Amazon product reviews
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 34 Amazon product reviews in English. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
Elgold partial: Automotive blogs
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 34 English texts scraped from automotive blogs. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and...
Elgold partial: Movie reviews
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 37 English texts with movie reviews. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
Elgold partial: Job offers
Open Research Data
version 1.0
- S. Olewniczak
- J. Szymański
- series: Elgold - partial
The dataset contains 34 English texts scraped from web portals with job offers. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article if possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity...

series: Elgold intermediate, count: 3


Elgold intermediate: verified by the authors
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold intermediate
The dataset contains the texts from "Elgold intermediate: verified by verification team", additionally verified by the dataset authors, but before the final validation step with the Elgold toolset.
Elgold intermediate: verified by verification team
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold intermediate
The dataset contains the texts from "Elgold intermediate: annotated raw", additionally verified by the five-person verification team. Nearly 25% of the mentions were corrected in some aspect.
Elgold intermediate: raw texts
Open Research Data
- S. Olewniczak
- J. Szymański
- series: Elgold intermediate
The dataset contains raw texts scraped from various internet sources, which were used for creating the Elgold dataset.

series: Bees, count: 4


Tagged images with bees
Open Research Data
- T. Boiński
- J. Szymański
- series: Bees
Images taken from a beehive with tagged bees. The images are prepared for training a YOLOv5 deep neural network (supplied with the data).
Tagged images with bees 3
Open Research Data
- T. Boiński
- J. Szymański
- B. Rychcik
- J. Rudnik
- R. Nowicki
- series: Bees
Images taken from a beehive with tagged bees. The images are random frames from videos recorded in May 2017 and 2018. All images are taken from a full HD video stream.
Tagged images with bees 2
Open Research Data
- T. Boiński
- J. Szymański
- A. Krauzewicz
- Ł. Łepek
- series: Bees
Images taken from a beehive with tagged bees.
Video recordings of bees at entrance to hives
Open Research Data
- T. Boiński
- J. Szymański
- series: Bees
Video recordings of bees at the entrance to hives from 2017-04-22, 2017-04-23 and 2018-05-22. All recordings were made using a hand-held full HD camera (Samsung Galaxy S3) and encoded using the H.264 video codec (Standard Baseline Profile for mov files from 2017, High Profile for mp4 files from 2018), 30 FPS and bit rate 14478 kb/s (mov files from 2017) or 16869 kb/s...

Elgold: gold standard, multi-genre dataset for named entity recognition and linking
Open Research Data
version 1.1
- S. Olewniczak
- J. Szymański
The dataset contains 276 multi-genre texts with marked named entities, which are linked to corresponding Wikipedia articles if available. Each entity was manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.
Respiratory Rhythm Phases Classification Dataset
Open Research Data
- J. Szymański
The dataset includes recordings of various breath patterns captured using accelerometer and tensometer sensors. The data is labeled into four classes corresponding to different respiratory phases: inhalation, retention, exhalation, and post-breath apnea.
WikiPrefs: human preferences dataset built from text edits
Open Research Data
- J. Majkutewicz
- J. Szymański
The WikiPrefs dataset is a human preferences dataset for the alignment of Large Language Models. It was built using the EditPrefs method from historical edits of Wikipedia featured articles.
TF-IDF weighted bag-of-words preprocessed text documents from Simple English Wikipedia
Open Research Data
The SimpleWiki2K-scores dataset contains TF-IDF weighted bag-of-words preprocessed text documents (raw strings are not available) [feature matrix] and their multi-label assignments [label-matrix]. Label scores for each document are also provided for an enhanced multi-label KNN [1] and LEML [2] classifiers. The aim of the dataset is to establish a benchmark...
Automatically created and partially verified Wikipedia - WordNet mappings
Open Research Data
- T. Boiński
- J. Szymański
Mapping between Wikipedia articles and WordNet synsets. The mappings were obtained automatically using 4 data processing algorithms. The automatically generated mappings were then verified by a group of volunteers using a crowdsourcing approach through so-called Games with a Purpose. The...
