Search results for: BAG OF WORDS


Total: 9


TF-IDF weighted bag-of-words preprocessed text documents from Simple English Wikipedia
Research Data
open access
The SimpleWiki2K-scores dataset contains TF-IDF weighted bag-of-words preprocessed text documents (raw strings are not available) [feature matrix] and their multi-label assignments [label-matrix]. Label scores for each document are also provided for an enhanced multi-label KNN [1] and LEML [2] classifiers. The aim of the dataset is to establish a benchmark...
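To illustrate what a TF-IDF weighted bag-of-words feature matrix looks like, here is a minimal sketch. The dataset description does not state the exact weighting variant, so the formula below (relative term frequency times log(N / document frequency)), the whitespace tokenizer, the function name `tfidf_bow`, and the toy documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_bow(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF weight of vocab[j] in docs[i].

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        matrix.append([counts[term] / total * idf[term] for term in vocab])
    return vocab, matrix


# Hypothetical toy corpus, not taken from the dataset.
docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, matrix = tfidf_bow(docs)
```

Each row of `matrix` corresponds to one document and each column to one vocabulary term; a term that occurs in every document (here "the") receives a weight of zero under this variant.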
Text Categorization Improvement via User Interaction
Publication
- J. Atroszko
- J. Szymański
- D. Gil
- H. Mora
- Year 2018
In this paper, we propose an approach to improvement of text categorization using interaction with the user. The quality of categorization has been defined in terms of a distribution of objects related to the classes and projected on the self-organizing maps. For the experiments, we use the articles and categories from the subset of Simple Wikipedia. We test three different approaches for text representation. As a baseline we use...

Full text available for download from an external service
Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network
Publication
- A. Wawrzyński
- J. Szymański
- Applied Sciences-Basel - Year 2021
To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...

Full text available for download in the portal
Towards Increasing Density of Relations in Category Graphs
Publication
- Year 2014
In this chapter we propose methods for identifying new associations between Wikipedia categories. The first method is based on the Bag-of-Words (BOW) representation of Wikipedia articles. The similarity of articles belonging to different categories makes it possible to calculate the similarity between those categories. The second method is based on average scores given to categories while categorizing documents by our dedicated score-based...

Full text available for download from an external service
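The first method in the entry above derives category similarity from the similarity of the articles the categories contain. The abstract does not say how article similarities are aggregated, so the following is only a sketch under assumptions: raw term counts as the BoW vectors, cosine similarity between articles, and the mean over all cross-category article pairs. The function names and the toy categories are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector as a term -> count mapping (whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def category_similarity(articles_a, articles_b):
    """Mean cosine similarity over all article pairs drawn from the two categories.

    Averaging is an assumed aggregation, not necessarily the authors' choice.
    """
    pairs = [(bow(x), bow(y)) for x in articles_a for y in articles_b]
    if not pairs:
        return 0.0
    return sum(cosine(x, y) for x, y in pairs) / len(pairs)


# Hypothetical toy categories, each a list of article texts.
physics = ["energy mass force", "force motion energy"]
mechanics = ["motion force velocity", "mass velocity energy"]
cooking = ["recipe oven bread", "bread flour recipe"]

related = category_similarity(physics, mechanics)
unrelated = category_similarity(physics, cooking)
```

In this toy setting, categories whose articles share vocabulary score higher than categories with no terms in common, which is the signal a method of this kind could use to propose a new association between categories.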
Evaluation of Path Based Methods for Conceptual Representation of the Text
Publication
- Ł. Kucharczyk
- J. Szymański
- Year 2014
Typical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...

Full text available for download from an external service
An Analysis of Neural Word Representations for Wikipedia Articles Classification
Publication
- J. Szymański
- N. Kawalec
- CYBERNETICS AND SYSTEMS - Year 2019
One of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...

Full text available for download from an external service
Path-based methods on categorical structures for conceptual representation of wikipedia articles
Publication
- Ł. Kucharczyk
- J. Szymański
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017
Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is usage of Wikipedia as the lexical knowledge base – an approach that has already shown promising results in many research studies....

Full text available for download in the portal
An automated learning model for twitter sentiment analysis using Ranger AdaBelief optimizer based Bidirectional Long Short Term Memory
Publication
- S. Natarajan
- S. Kurian
- P. Bidare Divakarachari
- P. Falkowski-Gilski
- EXPERT SYSTEMS - Year 2024
Sentiment analysis is an automated approach used to analyse textual data in order to describe public opinion. Sentiment analysis plays a major role in the day-to-day life of individuals. However, precise interpretation of text remains a major concern in classifying sentiment. So, this research introduced Bidirectional Long Short Term Memory with Ranger AdaBelief Optimizer (Bi-LSTM RAO)...

Full text available for download from an external service
Experiments on Preserving Pieces of Information in a Given Order in Holographic Reduced Representations and the Continuous Geometric Algebra Model
Publication
- A. Patyk-Łońska
- Informatica - Year 2011
Geometric Analogues of Holographic Reduced Representations (GAc, which is the continuous version of the previously developed discrete GA model) employ role-filler binding based on geometric products. Atomic objects are real-valued vectors in n-dimensional Euclidean space and complex statements belong to a hierarchy of multivectors. The property of GAc and HRR studied here is the ability to store pieces of information in a given...

Full text available for download in the portal
