Evaluation of Path Based Methods for Conceptual Representation of the Text - Publication - Bridge of Knowledge

Abstract

Typical text clustering methods use the bag-of-words (BoW) representation to describe the content of documents. However, this representation is known to have several limitations. Employing Wikipedia as a lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based measures for calculating document relatedness in such a conceptual space and compare them with the widely used Path Length approach. We evaluate them using the OPTICS clustering algorithm for categorization of keyword-based search results. The results show that our method outperforms the Path Length approach.
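For orientation, the general idea behind a path-length style relatedness score over a category hierarchy can be sketched as below. This is only a minimal illustration under stated assumptions: the toy hierarchy, the function names, and the 1 / (1 + length) mapping are hypothetical and are not the three measures proposed in the paper, whose definitions the abstract does not give.

```python
from collections import deque

# Hypothetical toy hierarchy: each pair is (parent category, child category).
# A real setup would use the Wikipedia category system instead.
TOY_EDGES = [
    ("Science", "Physics"),
    ("Science", "Biology"),
    ("Physics", "Optics"),
    ("Physics", "Mechanics"),
    ("Biology", "Genetics"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (parent, child) pairs."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def path_length(graph, source, target):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def relatedness(graph, a, b):
    """Assumed mapping of path length to a score in (0, 1]; 0.0 if unreachable."""
    length = path_length(graph, a, b)
    return 0.0 if length is None else 1.0 / (1.0 + length)


GRAPH = build_graph(TOY_EDGES)
```

In this sketch, sibling categories such as "Optics" and "Mechanics" are two edges apart, so they score higher than categories that connect only through the root. Document-level relatedness would additionally require mapping each document to a set of categories and aggregating pairwise scores, which is left out here.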

Citations

  • CrossRef: 1

  • Web of Science: 0

  • Scopus: 1

Full text

Full text is not available in the portal.

Details

Category: Conference activity
Type: publication in a peer-reviewed collective work (including conference proceedings)
Title of issue: Foundations of Intelligent Systems, pages 435-444
Language: English
Publication year: 2014
Bibliographic description: Kucharczyk Ł., Szymański J.: Evaluation of Path Based Methods for Conceptual Representation of the Text. In: Foundations of Intelligent Systems, ed. Andreasen, Troels and Christiansen, Henning and Cubero, Juan-Carlos and Raś, Zbigniew. Springer International Publishing, 2014, pp. 435-444
DOI: 10.1007/978-3-319-08326-1_44
Verified by: Gdańsk University of Technology
