Search results for: DOCUMENTS CLUSTERING - MOST Wiedzy


  • External Validation Measures for Nested Clustering of Text Documents

    Publication

    Abstract. This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for validating partitioning clusterings, such as the Rand statistic, the Hubert Γ statistic, or the F-measure, are not applicable to nested clustering. In addition to the work in which the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...

  • Development and Research of the Text Messages Semantic Clustering Methodology

    Publication

    - Year 2016

    A methodology for semantic clustering analysis of a collection of customers’ text opinions is developed. The author's version of the mathematical models for formalizing and practically implementing the semantic clustering of short text messages is proposed, based on the Latent Semantic Analysis method of extracting knowledge from the collection of customers’ text opinions. An algorithm for semantic clustering of the text opinions is developed,...

    Full text available for download in the portal

  • Information Retrieval with the Use of Music Clustering by Directions Algorithm

    Publication

    - Year 2013

    This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query by humming systems in formulating queries. Such systems make it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of interactive query expansion method. On the basis of the query, the algorithm provides suggestions...

    Full text available for download from an external service

  • Retrieval with Semantic Sieve

    Publication

    The article presents an algorithm, which we call Semantic Sieve, applied to refining search results in a repository of text documents. The algorithm calculates so-called conceptual directions that enable interaction with the user and allow the set of results to be narrowed to the most relevant ones. We present the system in which the algorithm has been implemented. In its presentation layer, the system also offers clustering of the results into...

    Full text available for download from an external service

  • Web search results clusterization with background knowledge

    Publication

    - Year 2009

    Clusterization of web pages is an attractive way of presenting web resources. Arranging pages into groups of similar topics simplifies and shortens the search process. This paper concerns the problem of clustering web pages and presents our approach to this issue. Our solution focuses on finding similarities between documents delivered by different web search engines. This process was accomplished by applying the WordNet dictionary.

  • Evaluation of Path Based Methods for Conceptual Representation of the Text

    Publication

    Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has been shown to improve the text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...

    Full text available for download from an external service

  • Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection

    Publication
    • P. Hrkút
    • M. Ďuračík
    • M. Mikušová
    • M. Callejas-Cuervo
    • J. Żukowska

    - Year 2019

    The problem of plagiarism is becoming increasingly significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarism in textual documents, but the situation is more complicated in the field of source code plagiarism, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...

  • Path-based methods on categorical structures for conceptual representation of Wikipedia articles

    Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is the use of Wikipedia as a lexical knowledge base, an approach that has already shown promising results in many research studies....

    Full text available for download from an external service

  • Semantic Analysis and Text Summarization in Socio-Technical Systems

    Publication

    - Year 2018

    In this chapter the authors present the results of developing a methodology for increasing the reliability of the functioning of the Socio-Technical System. The existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches to Latent Semantic Relations analysis in of the summarization projection...

    Full text available for download from an external service

  • Social learning in cluster initiatives

    Publication

    - Competitiveness Review - Year 2022

    Purpose – The purpose of the paper is to portray social learning in cluster initiatives (CIs), namely: 1) to explore, through the lens of the communities of practice (CoPs) theory, in what ways social learning occurs in CIs; 2) to discover how various CoPs emerge and evolve in CIs to facilitate a collective journey in their learning process. Subsequently, the authors address the research questions: In what ways does social learning...

    Full text available for download in the portal

  • Social learning and knowledge flows in cluster initiatives, In: Sanz S.C., Blanco F.P., Urzelai B. (Eds). Human and Relational Resources (pp. 44-45). the 4th International Conference on Clusters and Industrial Districts CLUSTERING, University of Valencia, Spain, May 23–24 (ISBN: 978-84-09-11926-4).

    Publication

    - Year 2019

    Purpose – The purpose of the paper is to explore how learning manifests and knowledge flows in cluster initiatives (CIs) due to interactions undertaken by their members. The paper addresses the research question of how social learning occurs and knowledge flows in CIs. Design/methodology/approach – The qualitative study of four cluster initiatives helped to identify various symptoms of social learning and knowledge flows in...

  • Interactive Information Retrieval Algorithm for Wikipedia Articles

    Publication

    - Year 2012

    The article presents an algorithm for retrieving textual information from a document collection. The algorithm employs a category system that organizes the repository and uses interaction with the user to improve search precision. The algorithm was implemented for Simple English Wikipedia, and the first evaluation results indicate that the proposed method can help to retrieve information from large document repositories.

  • Information Retrieval in Wikipedia with Conceptual Directions

    Publication

    - Year 2015

    The paper describes our algorithm for retrieving textual information from Wikipedia. The experiments show that the algorithm improves typical evaluation measures of retrieval quality. The improvement in the retrieval results was achieved with a two-phase approach. In the first phase, the algorithm extends the set of content that has been indexed by the specified keywords and thus increases the Recall value. Then, using the...

    Full text available for download from an external service