Search results for: WIKIPEDIA - Bridge of Knowledge

Relation-based Wikipedia Search System for Factoid Questions Answering

Publication

- International Journal of Innovative Research in Computer and Communication Engineering - Year 2014

In this paper we propose an alternative keyword search mechanism for Wikipedia, designed as a prototype solution for factoid question answering. The method considers relations between articles to find the best-matching article. Unlike the standard Wikipedia search engine and the Google engine, which search article content independently and require the entire query to be satisfied by a single article, the proposed...

Full text available to download

Self-Organizing Map representation for clustering Wikipedia search results

Publication

J. Szymański

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011

The article presents an approach to the automated organization of textual data. The experiments were performed on a selected subset of Wikipedia. A term-based Vector Space Model representation was used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To ensure the efficiency of the data processing, we performed linear dimensionality reduction of the raw data using Principal...

Path-based methods on categorical structures for conceptual representation of wikipedia articles

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is usage of Wikipedia as the lexical knowledge base – an approach that has already shown promising results in many research studies....

Full text available to download

Przegląd badań na temat Wikipedii oraz z wykorzystaniem Wikipedii jako instrument badawczego [A review of research on Wikipedia and of research using Wikipedia as a research instrument]

Publication

B. Atroszko
J. Atroszko

- Year 2020

In research conducted in Poland to date, Wikipedia has been both a subject of study and a research instrument. Research on Wikipedia and on the social effects of its use has been carried out by scholars in the humanities, social sciences, economics, and law. For many researchers (especially in computer science), Wikipedia has been a helpful instrument for conducting a variety of analyses and scientific inquiries. This article...

Full text to download in external service

Comparative Analysis of Text Representation Methods Using Classification

Publication

J. Szymański

- CYBERNETICS AND SYSTEMS - Year 2014

In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article—evaluation of approaches to text representation for machine learning tasks—indicates that the text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...

Full text to download in external service

Wydobywanie wiedzy z Wikipedii [Extracting knowledge from Wikipedia]

Publication

J. Kuchta

- Year 2022

Wikipedia is a vast source of encyclopedic knowledge gathered by people and intended for people. In information systems, the counterpart of such a knowledge source is an ontology. This chapter shows how Wikipedia is transformed into an ontology and how to extract concepts, their properties, and the relations between them from it.

Text classifiers for automatic articles categorization

Publication

- Year 2012

The article concerns the problem of automatic classification of textual content. We present selected methods for generating document representations and evaluate them in classification tasks. The experiments were performed on Wikipedia articles that are automatically classified into the categories created by Wikipedia editors.

Wordventure - cooperative wordnet editor. Architecture for lexical semantic aquisition

Publication

J. Szymański

- Year 2009

This article presents architecture for acquiring lexical semantics in a collaborative approach paradigm. The system enables functionality for editing semantic networks in a wikipedia-like style. The core of the system is a user-friendly interface based on interactive graph navigation. It has been used for semantic network presentation, and brings simultaneously modification functionality.

WordVenture - COOPERATIVE WordNet EDITOR Architecture for Lexical Semantic Acquisition

Publication

J. Szymański

- Year 2017

This article presents architecture for acquiring lexical semantics in a collaborative approach paradigm. The system enables functionality for editing semantic networks in a wikipedia-like style. The core of the system is a user-friendly interface based on interactive graph navigation. It has been used for semantic network presentation, and brings simultaneously modification functionality.

Full text to download in external service

Self Organizing Maps for Visualization of Categories

Publication

J. Szymański
W. Duch

- Year 2012

Visualization of Wikipedia categories using Self Organizing Maps shows an overview of categories and their relations, helping to narrow down search domains. By selecting particular neurons, this approach enables retrieval of conceptually similar categories. Evaluation of neural activations indicates that they form coherent patterns that may be useful for building user interfaces for navigation over category structures.

Towards Increasing Density of Relations in Category Graphs

Publication

- Year 2014

In this chapter we propose methods for identifying new associations between Wikipedia categories. The first method is based on the Bag-of-Words (BOW) representation of Wikipedia articles. The similarity of articles belonging to different categories is used to calculate the similarity of the categories themselves. The second method is based on average scores given to categories while categorizing documents by our dedicated score-based...

Full text to download in external service
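
The first method described above can be illustrated with a minimal sketch. The abstract does not specify how article similarities are aggregated, so this sketch assumes mean pairwise cosine similarity between whitespace-tokenized Bag-of-Words vectors; the function names and the toy articles are hypothetical and not taken from the chapter.

```python
import math
from collections import Counter
from itertools import product


def bow(text):
    """Bag-of-Words vector: lowercase token -> occurrence count."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def category_similarity(articles_a, articles_b):
    """Assumed aggregation: mean cosine similarity over all article pairs."""
    pairs = list(product(articles_a, articles_b))
    if not pairs:
        return 0.0
    return sum(cosine(bow(x), bow(y)) for x, y in pairs) / len(pairs)


# Toy categories, invented for illustration only.
physics = ["quantum particle energy", "particle energy field"]
optics = ["light particle wave energy"]
music = ["guitar chord melody", "melody rhythm chord"]

print(category_similarity(physics, optics))  # shared vocabulary, so above zero
print(category_similarity(physics, music))   # no shared tokens, so zero
```

A category pair whose score exceeds some chosen threshold could then be proposed as a new association; the threshold itself would have to be tuned on real data.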

Metody ekstrakcji ustrukturalizowanej treści z Wikipedii [Methods for extracting structured content from Wikipedia]

Publication

J. Kuchta

- Year 2022

Wikipedia has long been of interest to researchers. One area of interest is acquiring knowledge from Wikipedia content, which requires parsing the text of its articles. This chapter presents a comparative analysis of the available options for parsing Wikipedia content and points out the problems that parser authors must face. This helps explain why extracting knowledge from Wikipedia is difficult.

Management of Textual Data at Conceptual Level

Publication

J. Szymański

- Year 2011

The article presents an approach to the management of a large repository of documents at the conceptual level. We describe our approach to representing Wikipedia articles using their categories. The representation has been used to construct groups of similar articles. The proposed approach has been implemented in a prototype system that organizes the articles returned as search results for a given query. Constructed clusters allow to...

Review on Wikification methods

Publication

J. Szymański
M. Naruszewicz

- AI COMMUNICATIONS - Year 2019

The paper reviews methods for automatic annotation of texts with Wikipedia entries. The process, called Wikification, aims at building references between concepts identified in the text and Wikipedia articles. Wikification finds many applications, especially in text representation, where it enables one to capture the semantic similarity of the documents. Also, it can be considered as automatic tagging of the text. We describe typical...

Full text to download in external service
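
As a rough illustration of what Wikification produces, the sketch below links phrases in a text to article identifiers by greedy longest-match lookup in a title dictionary. This is a deliberately naive baseline chosen for illustration, not one of the methods surveyed in the paper; the `wikify` function, the `titles` dictionary, and the `max_len` parameter are all hypothetical, and real systems also have to disambiguate between candidate articles.

```python
def wikify(text, titles, max_len=3):
    """Greedy longest-match linking of token n-grams to article identifiers.

    titles maps a lowercase surface phrase to an article identifier.
    Returns a list of (phrase, article) pairs in order of appearance.
    """
    tokens = text.split()
    links = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower().strip(".,;:")
            if phrase in titles:
                links.append((phrase, titles[phrase]))
                i += n
                break
        else:
            i += 1  # no phrase starting at this token matched
    return links


# Toy title dictionary, invented for illustration only.
titles = {
    "support vector machine": "Support_vector_machine",
    "wikipedia": "Wikipedia",
}
print(wikify("A support vector machine trained on Wikipedia.", titles))
```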

Evaluation of Path Based Methods for Conceptual Representation of the Text

Publication

- Year 2014

Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...

Full text to download in external service

Towards Facts Extraction From Texts in Polish Language

Publication

- International Journal of Innovative Research in Computer and Communication Engineering - Year 2014

The Polish language differs from English in many ways. It has more complicated conjugation and declension, which makes automatic fact extraction from texts difficult. In this paper we present the basic differences between the two languages. The paper presents an algorithm for extracting facts from Polish Wikipedia articles. The algorithm is based on 7 proposed fact schemes that are searched for in the analyzed text....

Full text available to download

Dynamic Semantic Visual Information Management

Publication

J. Szymański
W. Duch

- Year 2010

Dominant Internet search engines use keywords and are therefore not suited for exploring new domains of knowledge, where the user does not know the specific vocabulary. When browsing through articles in a large encyclopedia, each presenting a small fragment of knowledge, it is hard to map the whole domain and to see relevant concepts and their relations. In Wikipedia, for example, some highly relevant articles are not linked with each other....

Full text to download in external service

Selecting Features with SVM

Publication

- Year 2013

A common problem with feature selection is to establish the minimum number of features that should be retained so that important information is not lost. We describe a method for choosing this number that makes use of Support Vector Machines. The method is based on controlling the angle by which the decision hyperplane is tilted due to feature selection. Experiments were performed on three text datasets generated from a Wikipedia dump. Amount...

Full text to download in external service
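
The idea of a tilt angle can be made concrete with a small sketch. The abstract is truncated and does not give the exact procedure, so the following assumes a linear SVM with weight vector w and models feature removal by zeroing the dropped components without retraining; in that case the angle between the original and reduced normal vectors satisfies cos θ = ‖w_kept‖ / ‖w‖. The function names and the weights are hypothetical.

```python
import math


def tilt_angle(weights, kept):
    """Angle in degrees between w and w with all non-kept components zeroed."""
    full = math.sqrt(sum(w * w for w in weights))
    part = math.sqrt(sum(weights[i] ** 2 for i in kept))
    if full == 0.0:
        return 0.0
    # Clamp to 1.0 to guard against floating-point rounding above 1.
    return math.degrees(math.acos(min(1.0, part / full)))


def min_features(weights, max_angle_deg):
    """Smallest number of largest-|w| features whose tilt stays within the limit."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    for k in range(1, len(weights) + 1):
        if tilt_angle(weights, order[:k]) <= max_angle_deg:
            return k
    return len(weights)


# Toy weight vector, invented for illustration only.
w = [3.0, 4.0, 0.1]
print(tilt_angle(w, [1]))    # keeping only the largest weight tilts the plane a lot
print(min_features(w, 5.0))  # number of features needed to stay within 5 degrees
```

Ranking features by absolute weight is only one possible selection criterion; any other ranking could be plugged in while keeping the same angle check.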

DBpedia As a Formal Knowledge Base – An Evaluation

Publication

- WSEAS Transactions on Information Science and Applications - Year 2015

DBpedia is widely used by researchers as a means of accessing Wikipedia in a standardized way. In this paper it is characterized from the point of view of a question answering system. A simple implementation of such a system is also presented. The paper also characterizes alternatives to DBpedia in the form of the OpenCyc and YAGO knowledge bases. A comparison between DBpedia and those knowledge bases is presented.

Full text available to download

Improving css-KNN Classification Performance by Shifts in Training Data

Publication

- Year 2015

This paper presents a new approach to improving the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects that would be optimized during off-line learning. In this paper we propose...
