Search results for: TEXT CLUSTERING - Bridge of Knowledge


total: 19
filtered: 14


  • Development and Research of the Text Messages Semantic Clustering Methodology

    Publication

    - Year 2016

    A methodology for semantic clustering analysis of a collection of customers' text opinions is developed. The author's version of the mathematical models for formalizing and practically implementing the semantic clustering of short text messages is proposed, based on Latent Semantic Analysis as the method of extracting knowledge from the collection of customers' text opinions. An algorithm for semantic clustering of the text-opinions is developed,...

    Full text available to download

  • External Validation Measures for Nested Clustering of Text Documents

    Publication

    Abstract. This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for validating partitional clusterings, such as the Rand statistic, the Hubert Γ statistic, or the F-measure, are not applicable to nested clusterings. In addition to the work where the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
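
    As background for the abstract above, the following is a minimal sketch of the Rand statistic for flat partitions only. It is not the paper's method: the function name `rand_index` and the sample labels are hypothetical. The sketch assumes every item carries exactly one label per partition, an assumption that a nested clustering does not satisfy.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two flat partitions agree.

    A pair counts as an agreement when both partitions put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have equal length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agreements = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true == same_pred:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical example: a reference partition and a candidate clustering.
reference = ["a", "a", "b", "b"]
candidate = [0, 0, 1, 2]
score = rand_index(reference, candidate)
```

    Because the function compares one label per item, a document that belongs to a cluster and also to its parent cluster cannot be represented without further assumptions.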

  • Evaluation of Path Based Methods for Conceptual Representation of the Text

    Publication

    Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...

    Full text to download in external service
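
    The bag of words representation mentioned in the abstract above can be sketched as follows. This is an illustration only, not code from the paper: the helper names and sample sentences are hypothetical, and tokenization is simplified to lowercase whitespace splitting. The example shows one commonly cited BoW limitation, assumed here for illustration: texts that use synonyms share no tokens and so get zero similarity.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a text to token counts (simplified: lowercase, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: the first two are related in meaning but share no words.
doc_car = bag_of_words("the car is fast")
doc_automobile = bag_of_words("an automobile moves quickly")
doc_car_again = bag_of_words("the fast car")

synonym_similarity = cosine_similarity(doc_car, doc_automobile)
overlap_similarity = cosine_similarity(doc_car, doc_car_again)
```

    Here `synonym_similarity` is zero even though the two sentences describe the same thing, which is the kind of gap a concept-based representation is meant to close.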

  • Information Retrieval with the Use of Music Clustering by Directions Algorithm

    Publication

    - Year 2013

    This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query by humming systems in formulating queries. This kind of system makes it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of interactive query expansion method. On the basis of the query, the algorithm provides suggestions...

    Full text to download in external service

  • Interactive Information Search in Text Data Collections

    Publication

    This article presents a new idea for retrieval in text repositories and describes the general infrastructure of a system created to implement and test it. The implemented system differs from today’s standard search engines by introducing a process of interactive search with users and data clustering. We present the basic algorithms behind our system and the measures we used to evaluate the results. The achieved results...

    Full text to download in external service

  • Spectral Clustering Wikipedia Keyword-Based search Results

    The paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of a spectral clustering algorithm used for grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that is based on a combination of words and links and is used for categorization of search results in this...

    Full text available to download

  • Semantic Analysis and Text Summarization in Socio-Technical Systems

    Publication

    - Year 2018

    In this chapter the authors present the results of developing a methodology for increasing the reliability of the functioning of the Socio-Technical System. Existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches to Latent Semantic Relations analysis in of the summarization projection...

    Full text to download in external service

  • Retrieval with Semantic Sieve

    Publication

    The article presents an algorithm, which we call Semantic Sieve, for refining search results in a repository of text documents. The algorithm calculates so-called conceptual directions, which enable interaction with the user and allow the set of results to be narrowed to the most relevant ones. We present the system in which the algorithm has been implemented. In the presentation layer, the system also offers clustering of the results into...

    Full text to download in external service

  • Interdisciplinarity in Smart Sustainable City education: exploring educational offerings and competencies worldwide

    Publication

    More and more higher education institutions are offering specialized study programs for current and future managers of Smart Sustainable Cities (SSCs). In the process, they try to reconcile the interdisciplinary nature of such studies, covering at least the technical and social aspects of SSC management, with their own traditionally discipline-based organization. However, there is little guidance on how such interdisciplinarity...

    Full text available to download

  • Path-based methods on categorical structures for conceptual representation of wikipedia articles

    Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is to use Wikipedia as the lexical knowledge base – an approach that has already shown promising results in many research studies....

    Full text available to download

  • Towards Effective Processing of Large Text Collections

    Publication

    In the article we describe the approach to parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles related with links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...

  • Information Retrieval in Wikipedia with Conceptual Directions

    Publication

    - Year 2015

    The paper describes our algorithm for retrieving textual information from Wikipedia. The experiments show that the algorithm improves typical evaluation measures of retrieval quality. The improvement in the retrieval results was achieved with a two-phase approach. In the first phase, the algorithm extends the set of content that has been indexed by the specified keywords and thus increases the Recall value. Then, using the...

    Full text to download in external service
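
    The effect of the first phase described in the abstract above on Recall can be illustrated with the standard definitions of precision and recall. This is a sketch under stated assumptions, not the paper's algorithm: the document identifiers, the set of relevant documents, and the expansion step are all hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth and result sets.
relevant = {"d1", "d2", "d3", "d4"}
keyword_hits = {"d1", "d2", "d9"}            # documents matched by the keywords
expanded_hits = keyword_hits | {"d3", "d8"}  # assumed result of extending the set

before = precision_recall(keyword_hits, relevant)
after = precision_recall(expanded_hits, relevant)
```

    In this made-up case recall rises after expansion while precision drops, which is why an expansion step is usually paired with a later step that narrows the results.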

  • Parallel Computations of Text Similarities for Categorization Task

    Publication

    - Year 2013

    In this chapter we describe an approach to the parallel implementation of similarity computations in high-dimensional spaces. The similarity computations have been used for textual data categorization. We created the test datasets from Wikipedia articles, which together with their hyper references formed a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....
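
    The combination of k-means with Euclidean distance or the cosine measure, as named in the abstract above, can be sketched sequentially as follows. This is a minimal illustration and not the chapter's parallel implementation: the function names, the deterministic initialization from the first k points, and the sample points are assumptions. Using mean centroids with a cosine-based distance is itself a simplification.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def k_means(points, k, distance=euclidean, max_iter=100):
    """Sequential k-means; returns (cluster index per point, centroids)."""
    # Assumption: deterministic initialization from the first k points.
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 0.1), (5.0, 5.2), (0.9, 0.0), (5.1, 4.9)]
labels_euclid, centers_euclid = k_means(points, 2, distance=euclidean)
labels_cosine, _ = k_means(points, 2, distance=cosine_distance)
```

    The assignment step computes point-to-centroid distances independently for each point, which is the part that lends itself to parallel execution.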

  • The image of the City on social media: A comparative study using “Big Data” and “Small Data” methods in the Tri-City Region in Poland

    Publication

    “The Image of the City” by Kevin Lynch is a landmark planning theory of lasting influence; its scientific rigor and relevance in the digital age were in dispute. The rise of social media and other digital technologies offers new opportunities to study the perception of urban environments. Questions remain as to whether social media analytics can provide a reliable measure of perceived city images? If yes, what implication does...

    Full text available to download