Search results for: text representation · document categorization wikipedia · word2vec · paragraph vector · self-organizing maps - MOST Wiedzy

  • Text Categorization Improvement via User Interaction

    Publication

    - Year 2018

    In this paper, we propose an approach to improving text categorization through interaction with the user. The quality of categorization is defined in terms of the distribution of objects related to the classes and projected onto self-organizing maps. For the experiments, we use articles and categories from a subset of Simple Wikipedia. We test three different approaches to text representation. As a baseline we use...

    Full text available for download from an external service

  • Self–Organizing Map representation for clustering Wikipedia search results

    Publication

    - Year 2011

    The article presents an approach to the automated organization of textual data. The experiments were performed on a selected subset of Wikipedia. A term-based Vector Space Model representation was used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To keep data processing efficient, we performed linear dimensionality reduction of the raw data using Principal...

    Full text available for download from an external service

  • Evaluation of Path Based Methods for Conceptual Representation of the Text

    Publication

    Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as a lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...

    Full text available for download from an external service

  • Comparative Analysis of Text Representation Methods Using Classification

    Publication

    In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article, an evaluation of approaches to text representation for machine learning tasks, indicates that text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...

    Full text available for download from an external service

  • Path-based methods on categorical structures for conceptual representation of wikipedia articles

    Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of documents. This method has been used successfully in many applications, but it is known to have several limitations. One way of improving text representation is to use Wikipedia as a lexical knowledge base, an approach that has already shown promising results in many research studies....

    Full text available for download in the portal

  • Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network

    To process textual data effectively, many approaches have been proposed for creating text representations. Transforming a text into a numerical form that computers can process is crucial for downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...

    Full text available for download in the portal

  • An Analysis of Neural Word Representations for Wikipedia Articles Classification

    Publication

    - CYBERNETICS AND SYSTEMS - Year 2019

    One of the currently popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...

    Full text available for download from an external service

  • Self Organizing Maps for Visualization of Categories

    Publication

    - Year 2012

    Visualization of Wikipedia categories using Self Organizing Maps shows an overview of categories and their relations, helping to narrow down search domains. By selecting particular neurons, this approach enables retrieval of conceptually similar categories. Evaluation of neural activations indicates that they form coherent patterns that may be useful for building user interfaces for navigation over category structures.

  • Music Mood Visualization Using Self-Organizing Maps

    Due to the increasing amount of music being made available in digital form on the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of the mood of songs based on Self-Organizing Maps. Parameters describing the mood of music are proposed, calculated, and then analyzed using correlation with mood dimensions based on Multidimensional Scaling. A map is created in which...

    Full text available for download in the portal