Search results for: WIKIPEDIA - MOST Wiedzy




  • Automatically created and partially verified Wikipedia - WordNet mappings

    Research Data

    Mapping between Wikipedia articles and WordNet synsets. The mappings were obtained automatically using four data-processing algorithms. The automatically generated mappings were then verified by a group of volunteers through crowdsourcing, using so-called Games with a Purpose. The...

  • TF-IDF weighted bag-of-words preprocessed text documents from Simple English Wikipedia

    Research Data

    The SimpleWiki2K-scores dataset contains TF-IDF weighted bag-of-words preprocessed text documents (raw strings are not available) [feature matrix] and their multi-label assignments [label-matrix]. Label scores for each document are also provided for the enhanced multi-label KNN [1] and LEML [2] classifiers. The aim of the dataset is to establish a benchmark...

  • Elgold: gold standard, multi-genre dataset for named entity recognition and linking

    Research Data

    The dataset contains 276 multi-genre texts with marked named entities, which are linked to corresponding Wikipedia articles if available. Each entity was manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking algorithms.

  • Elgold partial: News

    Research Data

    The dataset contains 37 English texts scraped from news websites. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article where possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity recognition and linking...

  • Elgold partial: Job offers

    Research Data

    The dataset contains 34 English texts scraped from web portals that list job offers. In each text, the named entities are marked. Each named entity is linked to the corresponding Wikipedia article where possible. All entities were manually verified by at least three people, which makes the dataset a high-quality gold standard for the evaluation of named entity...