Abstract
This paper describes our algorithm for retrieving textual information from Wikipedia. Experiments show that the algorithm improves typical measures of retrieval quality. The improvement is achieved with a two-phase approach. In the first phase, the algorithm extends the set of content indexed by the specified keywords and thus increases Recall. In the second phase, the search results are refined through interaction with the user, who is presented with so-called Conceptual Directions, which increases Precision. A preliminary evaluation on multi-sense test phrases indicates that the algorithm is able to increase Precision within the result set without loss of Recall. We also describe an additional method for extending the result set, based on creating cluster prototypes and finding the most similar content in the text repository that has not yet been retrieved. In our demo implementation, a web portal, clustering is used to present the search results organized in thematic groups instead of as a ranked list.
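The result-set extension mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes bag-of-words term-frequency vectors, a cluster prototype computed as the mean vector of the cluster's documents, and cosine similarity as the similarity measure. The names `vectorize`, `prototype`, `extend_results` and the toy repository are hypothetical and serve only as an example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed text representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prototype(vectors):
    """Cluster prototype, assumed here to be the mean of the member vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts term by term
    return Counter({t: c / len(vectors) for t, c in total.items()})


def extend_results(clusters, repository, retrieved_ids, top_k=1):
    """For each cluster, pick the top_k most similar documents not yet retrieved."""
    additions = {}
    for name, doc_ids in clusters.items():
        proto = prototype([vectorize(repository[d]) for d in doc_ids])
        candidates = [
            (cosine(proto, vectorize(text)), doc_id)
            for doc_id, text in repository.items()
            if doc_id not in retrieved_ids
        ]
        candidates.sort(reverse=True)
        additions[name] = [d for score, d in candidates[:top_k] if score > 0]
    return additions


# Toy repository for the multi-sense query "jaguar" (illustrative data only).
repository = {
    "jaguar_cat": "jaguar is a large cat living in the americas",
    "panthera": "panthera is a genus of large cat species",
    "jaguar_car": "jaguar is a british car maker",
    "land_rover": "land rover is a british car brand",
}
retrieved = {"jaguar_cat", "jaguar_car"}
clusters = {"animal": ["jaguar_cat"], "automotive": ["jaguar_car"]}

extra = extend_results(clusters, repository, retrieved)
print(extra)
```

In this toy setting each thematic group gains the unretrieved document closest to its prototype, which mirrors how extending the result set can raise Recall while the grouping keeps the different senses apart.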
Citations
- CrossRef: 0
- Web of Science: 0
- Scopus: 0
Details
- Category: Conference activity
- Type: conference proceedings indexed in Web of Science
- Title of issue: Distributed Computing and Internet Technology, pages 391-402
- Language: English
- Publication year: 2015
- Bibliographic description: Szymański J.: Information Retrieval in Wikipedia with Conceptual Directions, In: Distributed Computing and Internet Technology, 2015, Volume 8956 of the series Lecture Notes in Computer Science, pp. 391-402.
- DOI: 10.1007/978-3-319-14977-6_42
- Verified by: Gdańsk University of Technology