Publications
total: 347
Catalog Publications
-
Path-based methods on categorical structures for conceptual representation of wikipedia articles
PublicationMachine learning algorithms applied to text categorization mostly employ the Bag of Words (BoW) representation to describe the content of the documents. This method has been successfully used in many applications, but it is known to have several limitations. One way of improving text representation is the use of Wikipedia as a lexical knowledge base, an approach that has already shown promising results in many research studies....
-
Evaluation of Path Based Methods for Conceptual Representation of the Text
PublicationTypical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...
-
Semantic Memory for Avatars in Cyberspace
PublicationAvatars that show intelligent behavior should have access to general knowledge about the world, knowledge that humans store in their semantic memories. The simplest knowledge representation for semantic memory is based on the Concept Description Vectors (CDVs) that store, for each concept, information on whether a given property can be applied to this concept or not. Unfortunately, large-scale semantic memories are not available....
-
Collaborative Data Acquisition and Learning Support
PublicationWith the constant development of neural networks, traditional algorithms relying on data structures lose their significance as more and more solutions are using AI rather than traditional algorithms. This in turn requires a lot of correctly annotated and informative data samples. In this paper, we propose a crowdsourcing based approach for data acquisition and tagging with support for Active Learning where the system acts as an...
-
Active Learning Based on Crowdsourced Data
PublicationThe paper proposes a crowdsourcing-based approach for annotated data acquisition and means to support Active Learning training approach. In the proposed solution, aimed at data engineers, the knowledge of the crowd serves as an oracle that is able to judge whether the given sample is informative or not. The proposed solution reduces the amount of work needed to annotate large sets of data. Furthermore, it allows a perpetual increase...
-
Detection of anomalies in bee colony using transitioning state and contrastive autoencoders
PublicationHoneybees play a vital role in environmental sustainability and the overall agricultural economy. Assisting bee colonies in their proper functioning attracts the attention of researchers around the world. Electronic systems and machine learning algorithms are being developed for classifying specific undesirable bee behaviors in order to alert about upcoming substantial losses. However, classifiers could be impaired when used...
-
Bringing Common Sense to WordNet with a Word Game
PublicationWe present a tool for common sense knowledge acquisition in the form of a twenty questions game. The described approach uses the WordNet dictionary, whose rich taxonomy helps maintain cognitive economy and accelerate knowledge propagation, although sometimes inferences made on hierarchical relations result in noise. We extend the dictionary with common sense assertions acquired during the games played with humans. The facts added to the...
-
0-step K-means for clustering Wikipedia search results
PublicationThis article describes an improvement for the K-means algorithm and its application in the form of a system that clusters search results retrieved from Wikipedia. The proposed algorithm eliminates K-means disadvantages and allows one to create a cluster hierarchy. The main contributions of this paper include the following: (1) The concept of an improved K-means algorithm and its application for hierarchical clustering....
-
Recycling of raw materials, silicon wafers and complete solar cells from photovoltaic modules
PublicationPhotovoltaic modules (PVs) are an attractive way of generating electricity in reliable and maintenance-free systems with the use of solar energy. The average lifetime of photovoltaic modules is 25 to 30 years. To offset the negative impact of photovoltaic modules on the environment, it is necessary to introduce a long-term strategy that includes a complete lifecycle of all system components from the production phase through installation...
-
Network-assisted processing of advanced IoT applications: challenges and proof-of-concept application
PublicationRecent advances in the area of the Internet of Things show that devices are usually resource-constrained. To enable advanced applications on these devices, it is necessary to enhance their performance by leveraging external computing resources available in the network. This work presents a study of computational platforms to increase the performance of these devices based on the Mobile Cloud Computing (MCC) paradigm. The main...
-
DBpedia and YAGO Based System for Answering Questions in Natural Language
PublicationIn this paper we propose a method for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for TREC conference) based on DBpedia and YAGO. Our method is based on generating dependency trees for the query. In the dependency tree we look for paths leading from the root to the named entity of interest. These paths (referenced further as fibers) are candidates for representation of actual user intention. The...
-
Weighted Clustering for Bees Detection on Video Images
PublicationThis work describes a bee detection system to monitor bee colony conditions. The detection process on video images has been divided into 3 stages: determining the regions of interest (ROI) for a given frame, scanning the frame in ROI areas using the DNN-CNN classifier, in order to obtain a confidence of bee occurrence in each window in any position and any scale, and form one detection window from a cloud of windows provided by...
-
How to Sort Them? A Network for LEGO Bricks Classification
PublicationLEGO bricks are highly popular due to the ability to build almost any type of creation. This is possible thanks to the availability of multiple shapes and colors of the bricks. For a smooth build process the bricks need to be properly sorted and arranged. In our work we aim at creating an automated LEGO brick sorter. With over 3700 different LEGO parts, brick classification has to be done with deep neural networks. The question arises...
-
Selecting Features with SVM
PublicationA common problem with feature selection is to establish the minimum number of features that should be retained so that important information is not lost. We describe a method for choosing this number that makes use of Support Vector Machines. The method is based on controlling the angle by which the decision hyperplane is tilted due to feature selection. Experiments were performed on three text datasets generated from a Wikipedia dump. Amount...
-
Spectral Clustering Wikipedia Keyword-Based Search Results
PublicationThe paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of a spectral clustering algorithm used for grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that is based on a combination of words and links and used for categorization of search results in this...
-
Gaseous products from scrap tires pyrolysis
PublicationIn the European Union 75% of used tires should be recycled. The most common method of used tire disposal is burning in cement kilns, which does not solve the problem. The pyrolysis process can be an alternative way of utilizing tires. The aim of the research was to check the influence of pyrolysis products (gas and oil fractions) on the environment. Samples from the pyrolysis process, like light oil fractions or pyrolysis gases, were analyzed...
-
Selection of Relevant Features for Text Classification with K-NN
PublicationIn this paper, we describe five feature selection techniques used for text classification. Information gain, the independent significance feature test, the chi-squared test, the odds ratio test, and frequency filtering have been compared on text benchmarks based on Wikipedia. For each method we present the results of classification quality obtained on the test datasets using a K-NN based approach. A main advantage of evaluated...
-
Representation of hypertext documents based on terms, links and text compressibility
PublicationMethods for representing text documents based on words, mutual links, and compression methods are described. They were evaluated using an SVM classifier.
-
Analysis of Denoising Autoencoder Properties Through Misspelling Correction Task
PublicationThe paper analyzes some properties of denoising autoencoders using the problem of misspellings correction as an exemplary task. We evaluate the capacity of the network in its classical feed-forward form. We also propose a modification to the output layer of the net, which we called multi-softmax. Experiments show that the model trained with this output layer outperforms traditional network both in learning time and accuracy. We...
-
Privacy-Preserving, Scalable Blockchain-Based Solution for Monitoring Industrial Infrastructure in the Near Real-Time
PublicationThis paper proposes an improved monitoring and measuring system dedicated to industrial infrastructure. Our model achieves security of data by incorporating cryptographical methods and near real-time access by the use of virtual tree structure over records. The currently available blockchain networks are not very well adapted to tasks related to the continuous monitoring of the parameters of industrial installations. In the database...
-
The Issue of Shading Photovoltaic Installation Caused by Dust Accumulation on the Glass Surface
PublicationThe issue of accumulation of dust and other pollutants on the surface of photovoltaic modules was thoroughly analysed over the years. One of the first surveys in this field of knowledge linked pollutant accumulation on the module surface with transmittance loss of its glass covering, which leads to lessened amount of solar radiation reaching solar cells. First stage of this accumulation process is linear transparency loss, and second...
-
Improvement of Imperfect String Matching Based on Asymmetric n-Grams
PublicationTypical approaches to string comparison treat strings as either different or identical without taking into account the possibility of misspelling of the word. In this article we present an approach we used for improvement of imperfect string matching that allows one to reconstruct potential string distortions. The proposed method increases the quality of imperfect string matching, allowing the lookup of misspelled words without significant...
-
Interactive Information Search in Text Data Collections
PublicationThis article presents a new idea for retrieval in text repositories and describes the general infrastructure of a system created to implement and test this idea. The implemented system differs from today’s standard search engines by introducing a process of interactive search with users and data clustering. We present the basic algorithms behind our system and the measures we used for results evaluation. The achieved results...
-
The Possibility of Phase Change Materials (PCM) Usage to Increase Efficiency of the Photovoltaic Modules
PublicationSolar energy is widely available, free and inexhaustible. Furthermore, this source of energy is the most environmentally friendly. For direct conversion of solar energy into useful forms such as electricity and thermal energy, photovoltaic cells and solar collectors, respectively, are used. Forecasts indicate that the former solution will soon play a significant part in meeting the global energy demand. Therefore it is...
-
Synthesis of reduced graphene oxide nanosheets using nanofibers from methane and biogas thermal decomposition with various catalysts
PublicationReduced graphene oxide and graphene oxide (rGO, GO) were synthesised from carbon nanofibers, which were formed in catalytic thermal decomposition of methane (CDM) and biogas with different catalysts used in the process. The aim of the work was valorization of CDM carbon nanofiber products. The samples were characterized using Raman spectra, a scanning electron microscope and a transmission electron microscope. As a result, we observe...
-
Improving css-KNN Classification Performance by Shifts in Training Data
PublicationThis paper presents a new approach to improve the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects, that would be optimized during off-line learning. In this paper we propose...
-
Advances in Architectures, Big Data, and Machine Learning Techniques for Complex Internet of Things Systems
PublicationThe field of Big Data is rapidly developing with a lot of ongoing research, which will likely continue to expand in the future. A crucial part of this is Knowledge Discovery from Data (KDD), also known as the Knowledge Discovery Process (KDP). This process is a very complex procedure, and for that reason it is essential to divide it into several steps (Figure 1). Some authors use five steps to describe this procedure, whereas others...
-
Improving Effectiveness of SVM Classifier for Large Scale Data
PublicationThe paper presents our approach to SVM implementation in a parallel environment. We describe how the classification learning and prediction phases were parallelised. We also propose a method for limiting the number of necessary computations during classifier construction. Our method, named one-vs-near, is an extension of the typical one-vs-all approach that is used for binary classifiers to work with multiclass problems. We perform experiments...
-
Annotating Words Using WordNet Semantic Glosses
PublicationAn approach to word sense disambiguation (WSD) relying on the WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with the state-of-the-art WSD methods indicates that the use of WordNet relations...
-
Categorization of Cloud Workload Types with Clustering
PublicationThe paper presents a new classification schema of IaaS cloud workloads types, based on the functional characteristics. We show the results of an experiment of automatic categorization performed with different benchmarks that represent particular workload types. Monitoring of resource utilization allowed us to construct workload models that can be processed with machine learning algorithms. The direct connection between the functional...
-
Concept description vectors and the 20 question game
PublicationKnowledge of properties that are applicable to a given object is a necessary prerequisite to formulate intelligent questions. Concept description vectors provide the simplest representation of this knowledge, storing for each object information about the values of its properties. Experiments with automatic creation of concept description vectors from various sources, including ontologies, dictionaries, encyclopedias and unstructured...
-
Crowdsourcing-Based Evaluation of Automatic References Between WordNet and Wikipedia
PublicationThe paper presents an approach to build references (also called mappings) between WordNet and Wikipedia. We propose four algorithms used for automatic construction of the references. Then, based on an aggregation algorithm, we produce an initial set of mappings that has been evaluated in a cooperative way. For that purpose, we implement a system for the distribution of evaluation tasks, that have been solved by the user community....
-
Energy Yield Generated by a Small Building Integrated Photovoltaic Installation
PublicationIn recent years the photovoltaic (PV) industry has experienced major growth, caused by the ever-present annual decrease in module production prices and the expanding awareness of the general public in terms of renewable energy. There are numerous ways to implement PV modules as an additional energy source for a building, be it mounted on the rooftop, or building integrated (BIPV). An analysis of BIPV consisting of 8 modules with...
-
Soiling Effect Mitigation Obtained by Applying Transparent Thin-Films on Solar Panels: Comparison of Different Types of Coatings
PublicationDust accumulation on the front cover of solar panels is closely linked to location and orientation of photovoltaic (PV) installation. Its build-up depends on the module tilt angle, frequency of precipitation, humidity, wind strength and velocity, as well as grain size. Additionally, soil composition is determined by solar farm surroundings such as local factories, agricultural crops, and traffic. Over time, molecules of atmospheric...
-
Highly Oriented Zirconium Nitride and Oxynitride Coatings Deposited via High‐Power Impulse Magnetron Sputtering: Crystal‐Facet‐Driven Corrosion Behavior in Domestic Wastewater
PublicationHerein, highly crystalline ZrxNy and ZrxNyOz coatings are achieved by the deposition via high‐power impulse magnetron sputtering. Various N2 and N2/O2 gas mixtures with argon are investigated. The chemical composition and, as a result, mechanical properties of the deposited layer can be tailored along with morphological and crystallographic structural changes. The corrosion resistance behavior is studied by potentiodynamic measurements...
-
Categorization of Wikipedia articles with spectral clustering
PublicationThe article reports the application of clustering algorithms for creating hierarchical groups within Wikipedia articles. We evaluate three spectral clustering algorithms based on datasets constructed with the use of Wikipedia categories. The selected algorithm has been implemented in a system that categorizes Wikipedia search results on the fly.
-
An Analysis of Neural Word Representations for Wikipedia Articles Classification
PublicationOne of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...
-
Experimental investigations of natural convection from circular plates at variable inclination
PublicationThe paper presents the results of experimental investigations and a dimensionless correlation of criterial numbers for convective heat transfer from an upward-facing isothermal circular plate at various inclination angles (from horizontal to vertical) over a wide range of Rayleigh numbers. The experiments were conducted in water for a circular plate 0.07 m in diameter. The aim of the work was to determine the effect of the plate inclination angle on the value of the Nusselt number.
-
Two Stage SVM and kNN Text Documents Classifier
PublicationThe paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...
-
Utilization of rapeseed pellet from fatty acid methyl esters production as an energy source
PublicationRapeseed pellet – crushed seed residue from oil extraction is a by-product of fatty acid methyl esters production process. As other types of biomass, it can either be burned directly in furnaces or processed to increase its energetic value. Biomass is renewable, abundant and has domestic usage; the sources of biomass can help the world reduce its dependence on petroleum products, fossil coal and natural gas. Energetically effective...
-
Identification of category associations using a multilabel classifier
PublicationDescription of the data using categories allows one to describe it on a higher abstraction level. In this way, we can operate on aggregated groups of the information, allowing one to see relationships that do not appear explicit when we analyze the individual objects separately. In this paper we present automatic identification of the associations between categories used for organization of the textual data. As experimental data...
-
Towards bees detection on images: study of different color models for neural networks
PublicationThis paper presents an approach to bee detection in video streams using a neural network classifier. We describe the motivation for our research and the methodology of data acquisition. The main contribution of this work is a comparison of different color models used as an input format for a feedforward convolutional architecture applied to bee detection. The detection process is based on a neural...
-
Bees Detection on Images: Study of Different Color Models for Neural Networks
PublicationThis paper presents an approach to bee detection in video streams using a neural network classifier. We describe the motivation for our research and the methodology of data acquisition. The main contribution of this work is a comparison of different color models used as an input format for a feedforward convolutional architecture applied to bee detection. The detection process is based on a neural binary classifier that classifies...
-
Framework for Integration Decentralized and Untrusted Multi-vendor IoMT Environments
PublicationLack of standardization is highly visible while we use historical data sets or compare our model with others that use IoMT devices from different vendors. The problem also concerns the trust in highly decentralized and anonymous environments where sensitive data are transferred through the Internet and then are analyzed by third-party companies. In our research we propose a standard that has been implemented in the form of framework...
-
Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives
PublicationLong-term Web archives comprise Web documents gathered over longer time periods and can easily reach hundreds of terabytes in size. Semantic annotations such as named entities can facilitate intelligent access to the Web archive data. However, the annotation of the entire archive content on this scale is often infeasible. The most efficient way to access the documents within Web archives is provided through their URLs, which are...
-
Capacitance Enhancement by Incorporation of Functionalised Carbon Nanotubes into Poly(3,4-Ethylenedioxythiophene)/Graphene Oxide Composites
PublicationThis paper reports on the role of oxidised carbon nanotubes (oxMWCNTs) present in poly-3,4-ethylenedioxythiophene (PEDOT)/graphene oxide (GOx) composite. The final ternary composites (pEDOT/GOx/oxMWCNTs) are synthesised by an electrodeposition process from the suspension-containing monomer, oxidised carbon nanotubes and graphene oxide. Dissociated functional groups on the surface of graphene oxide play a role of counter-ions for the...
-
Decrease in Photovoltaic Module Efficiency Due to Deposition of Pollutants
PublicationThe deposition of pollutants on the surface of photovoltaic (PV) modules reduces the efficiency that can be achieved in given climatic conditions. This results in the loss of energy yield obtained from the solar installation. A number of factors determine the scale of this problem. The first of these is the amount of impurities deposited, the associated amount of precipitation, and the speed and direction of the wind. A second aspect...
-
The Analysis of Working Parameters Decrease in Photovoltaic Modules as a Result of Dust Deposition
PublicationThe aspect of dust accumulation on the surface of photovoltaic (PV) modules should be thoroughly understood in order to minimize possible obstacles affecting energy generation. Several elements affect the amount of pollutant gathered on the surface of a solar device, mainly its localization, which is irreversibly linked to factors such as annual rainfall, occasional snow coverage, or, in a dry climate, increased blow of dust during...
-
Free convective heat transfer structures as a function of the width of isothermal horizontal rectangular plates
PublicationThe results of experimental investigations of convective heat transfer from upward-facing isothermal rectangular plates are presented. Based on the visualizations performed, two models of flow structures are proposed. Analytical solutions for these models are presented in the form of relations between the Nusselt and Rayleigh numbers. It was found that concentric flow occurs for small values of Ra, especially in the case of plates...
-
Review on Wikification methods
PublicationThe paper reviews methods for automatic annotation of texts with Wikipedia entries. The process, called Wikification, aims at building references between concepts identified in the text and Wikipedia articles. Wikification finds many applications, especially in text representation, where it enables one to capture the semantic similarity of the documents. Also, it can be considered as automatic tagging of the text. We describe typical...