Search results for: DOCUMENTS CLUSTERING

External Validation Measures for Nested Clustering of Text Documents

Publication

- Year 2011

Abstract. This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used to validate partitioning clusterings, such as the Rand statistic, the Hubert Γ statistic, or the F-measure, are not applicable to nested clusterings. In addition to earlier work in which the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
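For orientation, one of the flat-partition indices named in the abstract, the Rand statistic, can be sketched as below. This is a minimal illustration of the standard pairwise definition, not the nested measure the article proposes; the function name `rand_index` and the toy labels are assumptions made for the example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two flat partitions agree.

    A pair counts as an agreement when both partitions put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("at least two items are required")
    agreements = 0
    for i, j in pairs:
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_true == same_in_pred:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical toy data: four documents, two reference classes.
reference = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same grouping, different cluster names
crossed = [0, 1, 0, 1]      # splits every reference class

score_same = rand_index(reference, relabelled)
score_crossed = rand_index(reference, crossed)
```

The index assumes each document carries exactly one cluster label per partition, which is the flat setting; a nested clustering assigns a document to a chain of clusters, so this pairwise test does not carry over directly.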

Development and Research of the Text Messages Semantic Clustering Methodology

Publication

N. Rizun
P. Kapłański
Y. Taranenko

- Year 2016

A methodology for semantic cluster analysis of a collection of customers' text opinions is developed. The authors' version of the mathematical models for formalizing and implementing the semantic clustering of short text messages is proposed, based on the Latent Semantic Analysis method of extracting knowledge from the collection of customers' text opinions. An algorithm for semantic clustering of the text opinions is developed,...

Full text available for download in the portal

Information Retrieval with the Use of Music Clustering by Directions Algorithm

Publication

A. Kaczmarek

- Year 2013

This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query-by-humming systems in formulating queries. This kind of system makes it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of interactive query expansion method. On the basis of the query, the algorithm provides suggestions...

Full text available for download from an external service

Retrieval with Semantic Sieve

Publication

- Year 2013

The article presents an algorithm we call Semantic Sieve, applied to refining search results in a repository of text documents. The algorithm calculates so-called conceptual directions that enable interaction with the user and allow the set of results to be narrowed to the most relevant ones. We present the system in which the algorithm has been implemented. The system also offers in the presentation layer clustering of the results into...

Full text available for download from an external service

Web search results clusterization with background knowledge

Publication

J. Szymański

- Year 2009

Clusterization of web pages is an attractive way of presenting web resources. Arranging pages into groups of similar topics simplifies and shortens the search process. This paper concerns the problem of clustering web pages and presents our approach to this issue. Our solution is focused on finding similarities between documents delivered by different web search engines. This process was accomplished by applying the WordNet dictionary.

Evaluation of Path Based Methods for Conceptual Representation of the Text

Publication

- Year 2014

Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has been shown to improve the text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...

Full text available for download from an external service
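As a rough illustration of the bag of words representation mentioned in the abstract above, the sketch below builds word-count vectors and compares them with cosine similarity. It is an assumption-laden toy, not the paper's method: the function names, the whitespace tokenization, and the sample sentences are all made up for the example. The abstract does not say which limitations it has in mind; the one shown here (synonyms share no tokens) is a commonly cited one.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two sparse word-count vectors."""
    shared = bag_a.keys() & bag_b.keys()
    dot = sum(bag_a[token] * bag_b[token] for token in shared)
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical documents: the first two are about the same thing
# but use different words, so their bags do not overlap at all.
doc_car = bag_of_words("the car is fast")
doc_auto = bag_of_words("an automobile drives quickly")
doc_car_again = bag_of_words("the car is fast")

sim_synonyms = cosine_similarity(doc_car, doc_auto)
sim_identical = cosine_similarity(doc_car, doc_car_again)
```

Because the two related sentences have no token in common, their similarity comes out as zero, which is the kind of gap that a concept-level representation built on an external knowledge base is meant to close.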

Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection

Publication

P. Hrkút
M. Ďuračík
M. Mikušová
M. Callejas-cuervo
J. Żukowska

- Year 2019

The problem of plagiarism is becoming increasingly significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarism in textual documents, but the situation is more complicated in the field of source code plagiarism, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...

Path-based methods on categorical structures for conceptual representation of wikipedia articles

Publication

- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS - Year 2017

Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is the use of Wikipedia as the lexical knowledge base – an approach that has already shown promising results in many research studies....

Full text available for download in the portal

Semantic Analysis and Text Summarization in Socio-Technical Systems

Publication

N. Rizun

- Year 2018

In this chapter the authors present the results of developing a methodology for increasing the reliability of the functioning of the Socio-Technical System. Existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches to Latent Semantic Relations analysis in the summarization projection...

Full text available for download from an external service

Social learning in cluster initiatives

Publication

M. Rozkwitalska
A. Lis

- Competitiveness Review - Year 2022

Purpose – The purpose of the paper is to portray social learning in cluster initiatives (CIs), namely: 1) to explore, with the lens of the communities of practice (CoPs) theory, in what ways social learning occurs in CIs; 2) to discover how various CoPs emerge and evolve in CIs to facilitate a collective journey in their learning process. Subsequently, the authors address the research questions: In what ways does social learning...

Full text available for download in the portal

Social learning and knowledge flows in cluster initiatives, In: Sanz S.C., Blanco F.P., Urzelai B. (Eds). Human and Relational Resources (pp. 44-45). the 4th International Conference on Clusters and Industrial Districts CLUSTERING, University of Valencia, Spain, May 23–24 (ISBN: 978-84-09-11926-4).

Publication

M. Rozkwitalska
A. Lis

- Year 2019

Purpose – The purpose of the paper is to explore how learning manifests and knowledge flows in cluster initiatives (CIs) due to interactions undertaken by their members. The paper addresses the research question of how social learning occurs and knowledge flows in CIs. Design/methodology/approach – The qualitative study of four cluster initiatives helped to identify various symptoms of social learning and knowledge flows in...

Information Retrieval in Wikipedia with Conceptual Directions

Publication

J. Szymański

- Year 2015

The paper describes our algorithm for retrieving textual information from Wikipedia. The experiments show that the algorithm improves typical evaluation measures of retrieval quality. The improvement in the retrieval results was achieved with a two-phase approach. In the first phase, the algorithm extends the set of content that has been indexed by the specified keywords and thus increases the Recall value. Then, using the...

Full text available for download from an external service

Interactive Information Retrieval Algorithm for Wikipedia Articles

Publication

J. Szymański

- Year 2012

The article presents an algorithm for retrieving textual information from a document collection. The algorithm employs a category system that organizes the repository and uses interaction with the user to improve search precision. The algorithm was implemented for the Simple English Wikipedia, and the first evaluation results indicate that the proposed method can help to retrieve information from large document repositories.
