
mgr inż. Szymon Olewniczak
Employment
- Deputy Head of Department at the Department of Computer Architecture
- Assistant at the Department of Computer Architecture
Research fields
-
Remus: Polish-Kashubian parallel translation corpus
Open Research Data: The dataset contains 10,825 sentences from the Kashubian book "Life and Adventures of Remus" (Żëcé i przigòdë Remùsa) with parallel Polish translations. Aleksander Majkowski's book is considered the most important book in Kashubian literature, making it a valuable source of high-quality translation data.
-
Elgold intermediate: verified by the authors
Open Research Data: The dataset contains the texts from "Elgold intermediate: verified by verification team", additionally verified by the dataset authors but before the final validation step with the elgold toolset.
-
Single Bit Errors in Ethernet II frames
Open Research Data: Check our final report for a detailed summary of how the data was gathered and processed (the "Methods" section of the report.pdf file). The report mentions 7 different datasets. Below you can find specific information on how to navigate the folders and construct those datasets from multiple files.
-
Elgold: gold standard, multi-genre dataset for named entity recognition and linking
Open Research Data: The dataset contains 276 multi-genre texts with marked named entities, which are linked to corresponding Wikipedia articles if available. Each entity was manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.