total: 3675
Search results for: K-MEANS CLUSTERING
-
K-means clustering for SAT-AIS data analysis
Publication -
Breast Cancer Heterogeneity Investigation: Multiple k-Means Clustering Approach
Publication -
Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection
Publication: The problem of plagiarism is becoming increasingly significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarism in textual documents, but the situation is more complicated in the field of source code plagiarism, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...
-
0-step K-means for clustering Wikipedia search results
Publication: This article describes an improvement to the K-means algorithm and its application in the form of a system that clusters search results retrieved from Wikipedia. The proposed algorithm eliminates the disadvantages of K-means and allows one to create a cluster hierarchy. The main contributions of this paper include the following: (1) The concept of an improved K-means algorithm and its application for hierarchical clustering....
-
Kernel-Based Fuzzy C-Means Clustering Algorithm for RBF Network Initialization
Publication -
Designing RBFNs Structure Using Similarity-Based and Kernel-Based Fuzzy C-Means Clustering Algorithms
Publication -
Publication: The chapter analyses the K-Means algorithm in its parallel setting. We provide a detailed description of the algorithm as well as the way we parallelize the computations. We identified the complexity of the particular steps of the algorithm, which allows us to build the algorithm model in the MERPSYS system. The simulations with MERPSYS have been performed for different sizes of the data as well as for different numbers of processors used for the computations. The results obtained using the model have been compared to the results obtained from a real computational environment.
-
Categorization of Cloud Workload Types with Clustering
Publication: The paper presents a new classification schema of IaaS cloud workload types, based on their functional characteristics. We show the results of an experiment in automatic categorization performed with different benchmarks that represent particular workload types. Monitoring of resource utilization allowed us to construct workload models that can be processed with machine learning algorithms. The direct connection between the functional...
-
A Clustering-Based Methodology for Selection of Fault Tolerance Techniques
Publication: Development of dependable applications requires selection of appropriate fault tolerance techniques that balance efficiency in fault handling and resulting consequences, such as increased development cost or performance degradation. This paper describes an advisory system that recommends fault tolerance techniques considering specified development and runtime application attributes. In the selection process, we use the K-means...
-
Image Segmentation of MRI image for Brain Tumor Detection
Publication: This research work presents a new technique for brain tumor detection that combines the Watershed algorithm with Fuzzy K-means and Fuzzy C-means (KIFCM) clustering. The proposed MATLAB-based simulation model is used to improve the computational simplicity, noise sensitivities, and accuracy rate of segmentation, detection and extraction from MR...
-
Towards Effective Processing of Large Text Collections
Publication: In the article we describe an approach to the parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and the k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles connected by links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...
-
The Use of Cluster Analysis to Evaluate the Impact of COVID-19 Pandemic on Daily Water Demand Patterns
Publication: Proper determination of unitary water demand and diurnal distribution of water consumption (water consumption histogram) provides the basis for designing, dimensioning, and all analyses of water supply networks. It is important in the case of mathematical modelling of flows in the water supply network, particularly during the determination of nodal water demands in the context of Extended Period Simulation (EPS). Considering the...
-
Parallel Computations of Text Similarities for Categorization Task
Publication: In this chapter we describe an approach to the parallel implementation of similarity computations in high dimensional spaces. The similarity computations have been used for textual data categorization. We create test datasets from Wikipedia articles which, together with their hyperlinks, form a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....
-
Comparison of selected electroencephalographic signal classification methods
Publication: A variety of methods exist for electroencephalographic (EEG) signal classification. In this paper, we briefly review selected methods developed for this purpose. First, a short description of the EEG signal characteristics is given. Then, a comparison between the selected EEG signal classification methods, based on an overview of research studies on this topic, is presented. Examples of methods included in the study are: Artificial...
-
The Application of Cluster Analysis in the Assessment of the Weldability of Unalloyed Steels
Publication: Non-alloy steels constitute a large group of steels characterised by diversified chemical composition, structural morphology and a wide range of mechanical properties (determining weldability). The paper presents results of multidimensional analyses (based on cluster analysis) of 110 selected unalloyed steel grades. Properties adopted as diagnostic features included the chemical composition, mechanical properties (yield point)...
-
Searching for Solvents with an Increased Carbon Dioxide Solubility Using Multivariate Statistics
Publication: Ionic liquids (ILs) are used in various fields of chemistry. One of them is CO2 capture, a process that is quite well described. The solubility of CO2 in ILs can be used as a model to investigate gas absorption processes. The aim is to find the relationships between the solubility of CO2 and other variables: physicochemical properties and parameters related to greenness. In this study, 12 variables are used to describe a dataset...
-
Multimodal system for diagnosis and polysensory stimulation of subjects with communication disorders
Publication: An experimental multimodal system, designed for polysensory diagnosis and stimulation of persons with impaired communication skills or even non-communicative subjects is presented. The user interface includes an eye tracking device and the EEG monitoring of the subject. Furthermore, the system consists of a device for objective hearing testing and an autostereoscopic projection system designed to stimulate subjects through their...
-
Path-based methods on categorical structures for conceptual representation of wikipedia articles
Publication: Machine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is to use Wikipedia as the lexical knowledge base – an approach that has already shown promising results in many research studies....
-
Development of cluster analysis methodology for identification of model rainfall hyetographs and its application at an urban precipitation field scale
Publication: Despite growing access to precipitation time series records at a high temporal scale, in hydrology, and particularly urban hydrology, engineers still design and model drainage systems using scenarios of rainfall temporal distributions predefined by means of model hyetographs. This creates the need for the availability of credible statistical methods for the development and verification of already locally applied model hyetographs....
-
Development and Research of the Text Messages Semantic Clustering Methodology
Publication: A methodology for semantic clustering analysis of a collection of customers' text opinions is developed. The author's version of the mathematical models for formalizing and implementing the semantic clustering of short text messages is proposed, based on the Latent Semantic Analysis knowledge extraction method applied to the collection of customers' text opinions. An algorithm for semantic clustering of the text opinions is developed,...
-
External Validation Measures for Nested Clustering of Text Documents
Publication: This article handles the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like the Rand statistic, the Hubert Γ statistic or the F-measure, are not applicable in nested clustering cases. In addition to the work where the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
-
Categorization of Wikipedia articles with spectral clustering
Publication: The article reports the application of clustering algorithms for creating hierarchical groups within Wikipedia articles. We evaluate three spectral clustering algorithms based on datasets constructed with the use of Wikipedia categories. The selected algorithm has been implemented in a system that categorizes Wikipedia search results on the fly.
-
Fuzzy Divisive Hierarchical Clustering of Solvents According to Their Experimentally and Theoretically Predicted Descriptors
Publication: The present study describes a simple procedure to separate into patterns of similarity a large group of solvents, 259 in total, represented by 15 specific descriptors (experimentally found and theoretically predicted physicochemical parameters). Solvent data is usually characterized by its high variability, different molecular symmetry, and spatial orientation. Methods of chemometrics can be used to extract and explore accurately...
-
Design of Cost-Efficient Optical Fronthaul for 5G/6G Networks: An Optimization Perspective
Publication: Currently, 5G and the forthcoming 6G mobile communication systems are the most promising cellular generations expected to beat the growing hunger for bandwidth and enable the fully connected world presented by the Internet of Everything (IoE). The cloud radio access network (CRAN) has been proposed as a promising architecture for meeting the needs and goals of 5G/6G (5G and beyond) networks. Nevertheless, the provisioning of cost-efficient...
-
Information Retrieval with the Use of Music Clustering by Directions Algorithm
Publication: This paper introduces the Music Clustering by Directions (MCBD) algorithm. The algorithm is designed to support users of query by humming systems in formulating queries. Systems of this kind make it possible to retrieve songs and tunes on the basis of a melody recorded by the user. The Music Clustering by Directions algorithm is a kind of interactive query expansion method. On the basis of a query, the algorithm provides suggestions...
-
Spectral Clustering Wikipedia Keyword-Based search Results
Publication: The paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of a spectral clustering algorithm used for grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that is based on a combination of words and links and used for categorization of search results in this...
-
Wyszukiwanie informacji z wykorzystaniem algorytmu Ontology Clustering by Directions
Publication: The article describes the Ontology Clustering by Directions algorithm. The algorithm is intended to support users in formulating ontology queries. Ontology queries are used to retrieve information expressed by means of ontologies described, for example, in the OWL language. The article presents the kinds of languages used to formulate ontology queries. In particular, it describes languages that are meant to be user-friendly....
-
Optimized Deep Learning Model for Flood Detection Using Satellite Images
Publication: The increasing amount of rain produces a number of issues in Kerala, particularly in urban regions where the drainage system is frequently unable to handle a significant amount of water in such a short duration. Meanwhile, standard flood detection results are inaccurate for complex phenomena and cannot handle enormous quantities of data. In order to overcome those drawbacks and enhance the outcomes of conventional flood detection...
-
Evaluation of Machine Learning Methods for the Experimental Classification and Clustering of Higher Education Institutions
Publication: Higher education institutions have a big impact on the future of skills supplied on the labour market. This means that, depending on the changes in the labour market, higher education institutions change fields of study or add new ones to meet labour market demand. The significant changes in the labour market caused by digital transformation have resulted in new jobs and new skills. Because of the necessity of computer...
-
Interactive Query Expansion with the Use of Clustering by Directions Algorithm
Publication: This paper concerns the Clustering by Directions algorithm. The algorithm introduces a novel approach to interactive query expansion. It is designed to support users of search engines in forming web search queries. When a user executes a query, the algorithm shows potential directions in which the search can be continued. This paper describes the algorithm and presents an enhancement which reduces the computational complexity of...
-
The adaptive spatio-temporal clustering method in classifying direct labor costs for the manufacturing industry
Publication: Employee productivity is critical to the profitability of not only the manufacturing industry. By capturing employee locations using recent advanced tracking devices, one can analyze and evaluate the time spent during a workday of each individual. However, over time, the quantity of the collected data becomes a burden, and decreases the capabilities of efficient classification of direct labor costs. However, the results obtained...
-
Clustering Context Items into User Trust Levels
Publication: An innovative trust-based security model for Internet systems is proposed. The TCoRBAC model operates on user profiles built on the history of user interaction with the system in conjunction with multi-dimensional context information. A method is proposed for transforming the high number of possible context value variants into several user trust levels. The transformation implements the Hierarchical Agglomerative Clustering strategy....
-
Self-Organizing Map representation for clustering Wikipedia search results
Publication: The article presents an approach to automated organization of textual data. The experiments have been performed on a selected subset of Wikipedia. The Vector Space Model representation based on terms has been used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To ensure the efficiency of the data processing, we performed linear dimensionality reduction of raw data using Principal...
-
Self–Organizing Map representation for clustering Wikipedia search results
Publication: The article presents an approach to automated organization of textual data. The experiments have been performed on a selected subset of Wikipedia. The Vector Space Model representation based on terms has been used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To ensure the efficiency of the data processing, we performed linear dimensionality reduction of raw data using Principal...
-
Ontology clustering by directions algorithm to expand ontology queries
Publication: This paper concerns formulating ontology queries. It describes existing languages in which ontologies can be queried. It focuses on languages which are intended to be easily understood by users who are willing to retrieve information from ontologies. Such a language can be, for example, a type of controlled natural language (CNL). In this paper a novel algorithm called Ontology Clustering by Directions is presented. The algorithm...
-
Weighted Clustering for Bees Detection on Video Images
Publication: This work describes a bee detection system to monitor bee colony conditions. The detection process on video images has been divided into 3 stages: determining the regions of interest (ROI) for a given frame, scanning the frame in ROI areas using the DNN-CNN classifier in order to obtain a confidence of bee occurrence in each window in any position and any scale, and forming one detection window from a cloud of windows provided by...
-
Method for Clustering of Brain Activity Data Derived from EEG Signals
Publication: A method for assessing separability of EEG signals associated with three classes of brain activity is proposed. The EEG signals are acquired from 23 subjects, gathered from a headset consisting of 14 electrodes. Data are processed by applying Discrete Wavelet Transform (DWT) for the signal analysis and an autoencoder neural network for the brain activity separation. Processing involves 74 wavelets from 3 DWT families: Coiflets,...
-
Automatic Clustering of EEG-Based Data Associated with Brain Activity
Publication: The aim of this paper is to present a system for automatically assigning electroencephalographic (EEG) signals to appropriate classes associated with brain activity. The EEG signals are acquired from a headset consisting of 14 electrodes placed on the skull. The gathered data are first processed by the Independent Component Analysis algorithm to obtain estimates of signals generated by primary sources reflecting the activity of the brain....
-
Identification, Assessment and Automated Classification of Requirements Engineering Techniques
Publication: Selection of suitable techniques to be used in requirements engineering or business analysis activities is not easy, especially considering the large number of new proposals that emerged in recent years. This paper provides a summary of techniques recommended by major sources recognized by the industry. A universal attribute structure for the description of techniques is proposed and used to describe 33 techniques most frequently...
-
Interfejs do algorytmu Clustering by Directions ułatwiający formułowanie zapytań w wyszukiwarkach internetowych
Publication: The chapter concerns the creation of queries in web search engines. It describes ways of supporting search engine users in formulating queries. It also describes the operating principle of the Clustering by Directions algorithm developed by the author. The algorithm is intended to show users potential directions in which they can continue searching. The directions are represented by words that the user can add...
-
Fuzzy soft modeling of environmental data. A study of the impact of a Phosphatic Fertilizer Plant on the adjacent environment in Gdańsk
Publication: Similarity analysis covers not only the application of fuzzy logic but also many other mathematical approaches. Many algorithms have been developed whose aim is to extract hard clusters from a given data set. Probably the most commonly used are the so-called c-means algorithms. Hard c-means is used for crisp classification, in which an object is assigned...
-
Controlling computer by lip gestures employing neural network
Publication: Results of experiments regarding lip gesture recognition with an artificial neural network are discussed. The neural network module forms the core element of a multimodal human-computer interface called LipMouse. This solution allows a user to work on a computer using lip movements and gestures. The user's face is detected in a video stream from a standard web camera using a cascade of boosted classifiers working with Haar-like features....
-
Human-Computer Interface Based on Visual Lip Movement and Gesture Recognition
Publication: The multimodal human-computer interface (HCI) called LipMouse is presented, allowing a user to work on a computer using movements and gestures made with his/her mouth only. Algorithms for lip movement tracking and lip gesture recognition are presented in detail. User face images are captured with a standard webcam. Face detection is based on a cascade of boosted classifiers using Haar-like features. A mouth region is located in...
-
Qualitative and Quantitative Analysis of Selected Tonic Waters by Potentiometric Taste Sensor With All-Solid-State Electrodes
Publication: A taste sensor with five all-solid-state electrodes (ASSE) III (third version) was used for qualitative and quantitative analysis of selected tonic waters (J.Gasco, Kinley, Jurajski, Jurajski with citrus flavor, Carrefour, Schweppes Indian Tonic, and Schweppes Bitter Lemon). The results obtained by this taste sensor, analyzed with principal component analysis and agglomerative hierarchical clustering methods, show that this sensor can...
-
Lower bound on the distance k-domination number of a tree
Publication: The article presents a lower bound on the k-domination number of trees and characterizes all extremal graphs.
-
Real estate investment trusts in Turkey: Structure, analysis, and strategy
Publication: Purpose: The aim of this study is to identify the problems noted in the REIT sector in Turkey, to offer a solution to them, and to classify the sector on the basis of the REITs' financial data. Methodology: The financial data set of the REITs was first standardized using the median instead of the mean. Then, scoring was performed according to defined coefficients....
-
Emilia Miszewska dr inż.
People: Emilia Miszewska was born in 1986 in Gdańsk. She graduated from Primary School No. 17 in Gdańsk with sports classes specializing in swimming and Janusz Kusociński Sports Secondary School No. 11 in Gdańsk. In 2005, she started uniform master's studies at the Faculty of Civil and Environmental Engineering, which she completed in 2011, defending her diploma thesis entitled "Analysis and development of fire protection guidelines and...
-
CHARAKTERYSTYKA WYMIANY CIEPŁA I OPORÓW PRZEPŁYWU OMYWANIA MIKROSTRUGAMI ROZWINIĘTEJ POWIERZCHNI
Publication: The article presents experimental studies of a compact heat exchanger in which heat transfer is intensified by microjets. It presents the design and tests of a prototype microjet heat exchanger. The modular construction of the exchanger allows the geometric dimensions and the material of the heat exchanger partition to be changed. The Wilson plot method was used to determine...
-
Evaluation of Path Based Methods for Conceptual Representation of the Text
Publication: Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has been shown to improve text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...
-
Factors that strengthen and weaken the identity of the cluster structures
Publication: The main aim of this paper is to apply the concept of "identity" to issues related to the "clustering process", and particularly to cooperation in clusters and cluster initiatives. The authors identify the factors that have the greatest influence on the formation and maintenance of identity in these cooperation networks.