All: 20
Search results for: TEXT CLUSTERING
-
Development and Research of the Text Messages Semantic Clustering Methodology
Publication: A methodology for semantic cluster analysis of a collection of customers' text opinions is developed. The author's version of the mathematical models for formalizing and implementing the semantic clustering of short text messages is proposed, based on Latent Semantic Analysis as the method of extracting knowledge from the collection of customers' text opinions. An algorithm for semantic clustering of the text opinions is developed,...
-
External Validation Measures for Nested Clustering of Text Documents
Publication: This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for validating partitional clusterings, such as the Rand statistic, Hubert's Γ statistic, or the F-measure, are not applicable to nested clusterings. In addition to the earlier work in which the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
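As a point of reference for the abstract above, here is a minimal sketch of the standard Rand index for two flat partitions. It is not taken from the article; the function name `rand_index` and the list-of-labels input are assumptions made for illustration. Because each item carries exactly one label, the sketch also makes visible why such an index presupposes flat, non-nested clusterings.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two flat partitions agree.

    Hypothetical helper for illustration: each item has exactly one label
    per partition, so nested (hierarchical) clusterings cannot be expressed.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as agreement when both partitions put the two items
        # together, or both keep them apart.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)
```

Two partitions that differ only in how the clusters are named score 1.0, since the index compares pair membership rather than label values.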
-
Evaluation of Path Based Methods for Conceptual Representation of the Text
Publication: Typical text clustering methods use the bag-of-words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as a lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...
-
Information Retrieval with the Use of Music Clustering by Directions Algorithm
Publication: This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query-by-humming systems in formulating queries. Such systems make it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of interactive query expansion method. On the basis of a query, the algorithm provides suggestions...
-
Interactive Information Search in Text Data Collections
Publication: This article presents a new idea for retrieval in text repositories and describes the general infrastructure of a system created to implement and test that idea. The implemented system differs from today's standard search engines by introducing a process of interactive search with users and data clustering. We present the basic algorithms behind our system and the measures we used to evaluate the results. The achieved results...
-
Spectral Clustering Wikipedia Keyword-Based search Results
Publication: The paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of the spectral clustering algorithm to grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that is based on a combination of words and links and is used for categorization of search results in this...
-
Semantic Analysis and Text Summarization in Socio-Technical Systems
Publication: In this chapter the authors present the results of developing a methodology for increasing the reliability of the functioning of a Socio-Technical System. Existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches to Latent Semantic Relations analysis in of the summarization projection...
-
Retrieval with Semantic Sieve
Publication: The article presents an algorithm we call Semantic Sieve, applied to refining search results in a repository of text documents. The algorithm calculates so-called conceptual directions that enable interaction with the user and allow the set of results to be narrowed to the most relevant ones. We present the system in which the algorithm has been implemented. In its presentation layer, the system also offers clustering of the results into...
-
Interdisciplinarity in Smart Sustainable City education: exploring educational offerings and competencies worldwide
Publication: More and more higher education institutions are offering specialized study programs for current and future managers of Smart Sustainable Cities (SSCs). In the process, they try to reconcile the interdisciplinary nature of such studies, covering at least the technical and social aspects of SSC management, with their own traditionally discipline-based organization. However, there is little guidance on how such interdisciplinarity...
-
Towards Effective Processing of Large Text Collections
Publication: In the article we describe an approach to the parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and the k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles related by links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...
-
Path-based methods on categorical structures for conceptual representation of wikipedia articles
Publication: Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is to use Wikipedia as a lexical knowledge base, an approach that has already shown promising results in many research studies....
-
Enhancing Word Embeddings for Improved Semantic Alignment
Publication: This study introduces a method for improving word vectors that addresses the limitations of traditional approaches such as Word2Vec or GloVe by introducing richer semantic properties into the embeddings. Our approach leverages supervised learning methods, with shifts in vectors in the representation space enhancing the quality of word embeddings. This ensures better alignment with semantic reference resources, such as WordNet....
-
Information Retrieval in Wikipedia with Conceptual Directions
Publication: The paper describes our algorithm for retrieving textual information from Wikipedia. The experiments show that the algorithm improves typical evaluation measures of retrieval quality. The improvement of the retrieval results was achieved with a two-phase approach. In the first phase, the algorithm extends the set of content that has been indexed by the specified keywords and thus increases the Recall value. Then, using the...
-
Parallel Computations of Text Similarities for Categorization Task
Publication: In this chapter we describe an approach to the parallel implementation of similarity computations in high-dimensional spaces. The similarity computations have been used for textual data categorization. We created the test datasets from Wikipedia articles which, together with their hyperlinks, formed a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data with the k-means algorithm....
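The two measures named in the abstract above can be sketched as follows. This is a plain, sequential illustration and not the authors' parallel implementation; the function names and the convention of returning 0.0 for a zero vector are assumptions made here.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Assumption for this sketch: a zero vector has similarity 0.0
    with everything, which avoids a division by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For term-count vectors, the cosine measure ignores document length (a document and its doubled copy have similarity close to 1), whereas Euclidean distance does not, which is one reason the choice of measure can change the clusters k-means finds.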
-
The image of the City on social media: A comparative study using “Big Data” and “Small Data” methods in the Tri-City Region in Poland
Publication: “The Image of the City” by Kevin Lynch is a landmark planning theory of lasting influence; its scientific rigor and relevance in the digital age were in dispute. The rise of social media and other digital technologies offers new opportunities to study the perception of urban environments. Questions remain as to whether social media analytics can provide a reliable measure of perceived city images. If yes, what implication does...
-
Are you a Strategic Thinker? Summer 21/22 - New
Online Course: The course explores strategic management choices along with the role of innovation in creating sustainable competitive advantage. It introduces frameworks and tools of strategic management (e.g. how to analyze organizations in their industry context and how to design and execute a coherent strategy). Concepts such as value creation, product diversification, clustering and open innovation will be explored to understand how entrepreneurs...
-
Are you a Strategic Thinker? Summer 22/23
Online Course: The course explores strategic management choices along with the role of innovation in creating sustainable competitive advantage. It introduces frameworks and tools of strategic management (e.g. how to analyze organizations in their industry context and how to design and execute a coherent strategy). Concepts such as value creation, product diversification, clustering and open innovation will be explored to understand how entrepreneurs...
-
Are you a Strategic Thinker? WINTER 23/24
Online Course: The course explores strategic management choices along with the role of innovation in creating sustainable competitive advantage. It introduces frameworks and tools of strategic management (e.g. how to analyze organizations in their industry context and how to design and execute a coherent strategy). Concepts such as value creation, product diversification, clustering and open innovation will be explored to understand how entrepreneurs...
-
Are you a Strategic Thinker? WINTER 24
Online Course: The course explores strategic management choices along with the role of innovation in creating sustainable competitive advantage. It introduces frameworks and tools of strategic management (e.g. how to analyze organizations in their industry context and how to design and execute a coherent strategy). Concepts such as value creation, product diversification, clustering and open innovation will be explored to understand how entrepreneurs...
-
Are you a Strategic Thinker? SUMMER 23/24
Online Course: The course explores strategic management choices along with the role of innovation in creating sustainable competitive advantage. It introduces frameworks and tools of strategic management (e.g. how to analyze organizations in their industry context and how to design and execute a coherent strategy). Concepts such as value creation, product diversification, clustering and open innovation will be explored to understand how entrepreneurs...