total: 31
Search results for: similarity measure
-
Simulation of parallel similarity measure computations for large data sets
Publication: The paper presents our approach to implementation of similarity measure for big data analysis in a parallel environment. We describe the algorithm for parallelisation of the computations. We provide results from a real MPI application for computations of similarity measures as well as results achieved with our simulation software. The simulation environment allows us to model parallel systems of various sizes with various components...
-
Feature Reduction Using Similarity Measure in Object Detector Learning with Haar-like Features
Publication: This paper presents two methods of training complexity reduction by additional selection of features to check in object detector training task by AdaBoost training algorithm. In the first method, the features with weak performance at first weak classifier building process are reduced based on a list of features sorted by minimum weighted error. In the second method the feature similarity measures are used to throw away those features...
-
Similarity Measures for Face Images: An Experimental Study
Publication: This work describes experiments aimed at finding a straightforward but effective way of comparing face images. We discuss properties of the basic concepts, such as the Euclidean, cosine and correlation metrics, test the simplest version of elastic templates, and compare these solutions with distances based on texture descriptors (Local Ternary Patterns). The influence of selected image processing methods (e.g. bilateral filtering)...
-
Benchmarking Performance of a Hybrid Intel Xeon/Xeon Phi System for Parallel Computation of Similarity Measures Between Large Vectors
Publication: The paper deals with parallelization of computing similarity measures between large vectors. Such computations are important components within many applications and consequently are of high importance. Rather than focusing on optimization of the algorithm itself, assuming specific measures, the paper assumes a general scheme for finding similarity measures for all pairs of vectors and investigates optimizations for scalability...
-
Music Recommendation Based on Multidimensional Description and Similarity Measures. Rekomendacja muzyki na podstawie wielowymiarowego wektora cech i miar podobieństwa
Publication: This study aims to create an algorithm for assessing the degree to which songs belong to genres defined a priori. Such an algorithm is not aimed at providing unambiguous classification-labelling of songs, but at producing a multidimensional description encompassing all of the defined genres. The algorithm utilized data derived from the most relevant examples belonging to a particular genre of music. For this condition to be met,...
-
Accelerating Video Frames Classification With Metric Based Scene Segmentation
Publication: This paper addresses the problem of the efficient classification of images in a video stream in cases where all of the video has to be labeled. Realizing the similarity of consecutive frames, we introduce a set of simple metrics to measure that similarity. To use these observations for decreasing the number of necessary classifications, we propose a scene segmentation algorithm. Performed experiments have evaluated the acquired...
-
Comparison of Selected Neural Network Models Used for Automatic Liver Tumor Segmentation
Publication: Automatic and accurate segmentation of liver tumors is crucial for the diagnosis and treatment of hepatocellular carcinoma or metastases. However, the task remains challenging due to imprecise boundaries and significant variations in the shape, size, and location of tumors. The present study focuses on tumor segmentation as a more critical aspect from a medical perspective, compared to liver parenchyma segmentation, which is the...
-
Decoding soundscape stimuli and their impact on ASMR studies
Publication: This paper focuses on extracting and understanding the acoustical features embedded in the soundscape used in ASMR (Autonomous Sensory Meridian Response) studies. To this aim, a dataset of the most common sound effects employed in ASMR studies is gathered, containing whispering stimuli but also sound effects such as tapping and scratching. Further, a comparative analytical survey is performed based on various acoustical features...
-
Development and Research of the Text Messages Semantic Clustering Methodology
Publication: The methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...
-
Comprehensive Comparison of a Few Variants of Cluster Analysis as Data Mining Tool in Supporting Environmental Management
Publication: A few variants of hierarchical cluster analysis (CA) as tool of assessment of multidimensional similarity in environmental dataset are compared. The dataset consisted of analytical results of determination of metals (Na, K, Ca, Sc, Fe, Co, Zn, As, Br, Rb, Mo, Sb, Cs, Ba, La, Ce, Sm, Hf and Th) in ambient air dried and kept alive, by the means of hydroponics, moss baskets collected in 12 locations on the area of Tricity (Poland)....
-
Creating new voices using normalizing flows
Publication: Creating realistic and natural-sounding synthetic speech remains a big challenge for voice identities unseen during training. As there is growing interest in synthesizing voices of new speakers, here we investigate the ability of normalizing flows in text-to-speech (TTS) and voice conversion (VC) modes to extrapolate from speakers observed during training to create unseen speaker identities. Firstly, we create an approach for TTS...
-
Matching Exception Class Hierarchies between .NET, Java Environments
Publication: The paper presents a methodology of exception classification and matching exception messages between .NET and Java environments. The methodology operates on existing exception class hierarchies and proposes two complementing approaches: automated and manual matching. The automated matching uses the similarity measure to find associations between exception messages from the two sets of classes for the considered programming languages....
-
Software Tools to Measure the Duplication of Information
Publication: Data stored in average computer system usually is not unique, portions of stored data are duplicated. When duplicated data are stored in separate files containing source code of computer program of student homework, a possibility of cheating should be seriously considered. This paper presents software tools built in order to detect re-use of pieces of code in supplied text files. Three aspects of information matching are considered:...
-
In-depth characterization of icosahedral ordering in liquid copper
Publication: The presence of icosahedral ordering in liquid copper at temperatures close to the melting point is now well-established both experimentally and through computer simulation. However, a more elaborate analysis of local icosahedral and icosahedron-like structures, together with a system for classifying such structures based on some measure of "icosahedrity", has so far been conspicuously absent in the literature. Similarly, the dynamics...
-
APPLICATION OF CHEMOMETRIC ANALYSIS TO THE STUDY OF SNOW AT THE SUDETY MOUNTAINS, POLAND
Publication: Snow samples were collected during winter 2011/2012 in three posts in the Western Sudety Mountains (Poland) in 3 consecutive phases of snow cover development, i.e. stabilisation (Feb 1st), growth (Mar 15th) and its ablation (Mar 27th). To maintain a fixed number of samples, each snow profile has been divided into six layers, but hydrochemical indications were made for each 10 cm section of core. The complete data set was subjected...
-
A Parallel Corpus-Based Approach to the Crime Event Extraction for Low-Resource Languages
Publication: These days, a lot of crime-related events take place all over the world. Most of them are reported in news portals and social media. Crime-related event extraction from the published texts can allow monitoring, analysis, and comparison of police or criminal activities in different countries or regions. Existing approaches to event extraction mainly suggest processing texts in English, French, Chinese, and some other resource-rich...
-
System for automatic singing voice recognition
Publication: The paper presents a system for automatic recognition of singing voice quality and type. The database and the implemented parameters are described. The decision algorithm is an artificial neural network. The trained decision system achieves an accuracy of about 90% in both recognition categories. Additionally, statistical methods were used to show that the results of the automatic system for assessing technical quality...
-
Molecular level interpretation of excess infrared spectroscopy
Publication: Infrared (IR) spectroscopy is an invaluable tool in studying intermolecular interactions in solvent mixtures. The deviation of the IR spectrum of a mixture from the spectra of its pure components is a sensitive measure of the non-ideality of solutions and the modulation of intermolecular interactions introduced by mutual influence of the components. Excess IR spectroscopy, based on the established notion of excess thermodynamic...
-
Numerical Issues and Approximated Models for the Diagnosis of Transmission Pipelines
Publication: The chapter concerns numerical issues encountered when the pipeline flow process is modeled as a discrete-time state-space model. In particular, issues related to computational complexity and computability are discussed, i.e., simulation feasibility which is connected to the notions of singularity and stability of the model. These properties are critical if a diagnostic system is based on a discrete mathematical model of the flow...
-
The orthogonalization of objects simplified with the Simplify Building tool representing groups of buildings in Kartuzy district - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
The orthogonalization of objects simplified using the Sester’s method representing groups of buildings in Kartuzy district - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
The orthogonisation of objects simplified using the Chrobak’s method representing groups of buildings in Gdańsk district - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
The orthogonalization of simplified objects representing groups of buildings in Gdańsk district using the Simplify Building tool - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
The orthogonisation of objects simplified using the Sester’s method representing groups of buildings in Gdańsk district - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
The orthogonalization of objects simplified using the Chrobak’s method representing groups of buildings in Kartuzy district - scale 1:10000
Open Research Data: The process of automatic generalization is one of the elements of spatial data preparation for the purpose of creating digital cartographic studies. The presented data include a part of the process of generalization of building groups obtained from the national geodesy and cartography resource from BDOT10k (10k topographic database) [1].
-
SYNAT_MUSIC_GENRE_FV_173
Open Research Data: This is the original dataset containing 51582 music tracks (22 music genres) and 173 element-feature vector [1-6,9]. A collection of more than 50000 music excerpts described with a set of descriptors obtained through the analysis of 30-second mp3 recordings was gathered in a database called SYNAT. The SYNAT database was realized by the Gdansk University...
-
SYNAT Music Genre Parameters PCA 19
Open Research Data: The dataset contains feature vectors obtained after performing Principal Component Analysis (PCA), so there are 11 music genres and a 19-element vector derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of 52532 music excerpts described...
-
SYNAT_PCA_48
Open Research Data: There is a series of datasets containing feature vectors derived from music tracks. The dataset contains 51582 music tracks (22 music genres) and feature vectors obtained after performing Principal Component Analysis (PCA), so there are 48-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier...
-
SYNAT_PCA_11
Open Research Data: The dataset contains 51582 music tracks (22 music genres) and feature vectors obtained after performing Principal Component Analysis (PCA), so there are 11-element vectors derived from music excerpts. Originally, a feature vector containing 173 elements was conceived in earlier research studies carried out by the team of authors [1-6]. A collection of more than...
-
Vident-real: an intra-oral video dataset for multi-task learning
Open Research Data: We introduce Vident-real, a large dataset of 100 video sequences of intra-oral scenes from real conservative dental treatments performed at the Medical University of Gdańsk, Poland. The dataset can be used for multi-task learning methods including:
-
Things You Might Not Know about the k-Nearest Neighbors Algorithm
Publication: Recommender Systems aim at suggesting potentially interesting items to a user. The most common kind of Recommender Systems is Collaborative Filtering which follows an intuition that users who liked the same things in the past, are more likely to be interested in the same things in the future. One of Collaborative Filtering methods is the k Nearest Neighbors algorithm which finds k users who are the most similar to an active user...
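Illustration (not taken from the listed publication): the neighbour-finding step described in the last abstract can be sketched as below. The choice of cosine similarity, the function names `cosine_similarity` and `k_nearest_users`, and the toy `ratings` table are all assumptions made for this example; the paper itself may use a different similarity measure or data layout.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # An all-zero vector has no direction; treat it as dissimilar to everything.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest_users(ratings, active_user, k):
    """Return the k users most similar to active_user, most similar first."""
    target = ratings[active_user]
    scored = [
        (cosine_similarity(target, vector), user)
        for user, vector in ratings.items()
        if user != active_user
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in scored[:k]]


# Hypothetical user-item ratings (0 means "not rated").
ratings = {
    "alice": [5, 4, 0, 1],
    "bob": [4, 5, 0, 1],
    "carol": [0, 1, 5, 4],
    "dave": [3, 5, 1, 0],
}

neighbours = k_nearest_users(ratings, "alice", k=2)
print(neighbours)
```

In a recommender, the items liked by the returned neighbours would then be candidates for suggestion to the active user; that aggregation step is left out of this sketch.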