Towards Effective Processing of Large Text Collections

Julian Szymański; Henryk Krawczyk

doi:10.1109/intech.2012.6457784

Abstract

In this article we describe an approach to the parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and of the k-means algorithm. The test datasets were prepared as graphs created from Wikipedia articles connected by links. When creating the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of the datasets. We describe the method used to evaluate clustering quality. Finally, we discuss the results achieved, point out some improvements, and outline perspectives for future development.
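As a rough illustration of the two operations named in the abstract, the sketch below computes a cosine similarity matrix row by row on a worker pool and then runs a plain k-means on toy vectors. It is not the authors' implementation: the function names, the cosine measure, the thread pool, the choice of the first k points as initial centroids, and the toy data are all assumptions made for this example. In CPython a thread pool will not speed up this CPU-bound work; it only shows that the rows are independent and could be handed to separate processes or nodes.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_row(args):
    """Compute one row of the similarity matrix (hypothetical helper)."""
    i, vectors = args
    return [cosine(vectors[i], v) for v in vectors]


def similarity_matrix(vectors, workers=4):
    """Each row depends only on the input vectors, so rows can be computed independently."""
    tasks = ((i, vectors) for i in range(len(vectors)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(similarity_row, tasks))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; initial centroids are simply the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy document vectors (made up for illustration): two groups of two.
vectors = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
S = similarity_matrix(vectors)
labels, centroids = kmeans(vectors, k=2)
```

With real data the vectors would be term or link features of the articles, and the initial centroids would normally be chosen at random or by a seeding scheme rather than taken from the head of the list.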

Citations

CrossRef: 0
Web of Science: 0
Scopus: 0

Authors (2)

Julian Szymański, dr hab. inż.

Henryk Krawczyk, prof. dr hab. inż.

Details

Category: Conference activity
Type: conference proceedings indexed in Web of Science
Published in: 2nd International Conference on Innovative Computing Technology (INTECH), pages 293-298
Language: English
Publication year: 2012
Bibliographic description: Szymański J., Krawczyk H.: Towards Effective Processing of Large Text Collections, in: 2nd International Conference on Innovative Computing Technology (INTECH), 2012.
DOI: 10.1109/intech.2012.6457784
Verified by: Gdańsk University of Technology (Politechnika Gdańska)

Publications you may be interested in

Evaluation of Path Based Methods for Conceptual Representation of the Text

2014

Development and Research of the Text Messages Semantic Clustering Methodology

N. Rizun,
P. Kapłański,
Y. Taranenko

2016

Spectral Clustering Wikipedia Keyword-Based search Results

2017

Parallel Computations of Text Similarities for Categorization Task

J. Szymański

2013
