Search results for: INFORMATION RETRIEVAL
-
Comparative Analysis of Text Representation Methods Using Classification
Publication: In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article, an evaluation of approaches to text representation for machine learning tasks, indicates that text representation is fundamental to achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...
-
Towards Increasing Density of Relations in Category Graphs
Publication: In this chapter, we propose methods for identifying new associations between Wikipedia categories. The first method is based on a Bag-of-Words (BOW) representation of Wikipedia articles: the similarity of articles belonging to different categories is used to compute the similarity between those categories. The second method is based on the average scores given to categories while categorizing documents by our dedicated score-based...
-
Dynamic Semantic Visual Information Management
Publication: Dominant Internet search engines rely on keywords and are therefore poorly suited to exploring new domains of knowledge, where the user does not know the specific vocabulary. When browsing through articles in a large encyclopedia, each presenting a small fragment of knowledge, it is hard to map the whole domain and to see the relevant concepts and their relations. In Wikipedia, for example, some highly relevant articles are not linked with each other....
-
Hanna Gaweł
People: Hanna is a PhD student at the Doctoral School in Social Sciences, in the discipline of Social Communication and Media Sciences, at the Jagiellonian University, and is employed as an Assistant at the Institute of Information Studies of the same university. She received her Master's degree from the Jagiellonian University, where she studied Information Management at the Faculty of Management and Social Sciences. Her Bachelor's degree was...