dr hab. inż. Julian Szymański
Employment
- Deputy Director of the Implementation Doctoral School (Szkoła Doktorska Wdrożeniowa)
- University Professor at the Department of Computer Systems Architecture (Katedra Architektury Systemów Komputerowych)
Publications
Total: 132
Publication Catalog
Year 2011
- 0-step K-means for clustering Wikipedia search results
  This article describes an improvement for K-means algorithm and its application in the form of a system that clusters search results retrieved from Wikipedia. The proposed algorithm eliminates K-means disadvantages and allows one to create a cluster hierarchy. The main contributions of this paper include the following: (1) The concept of an improved K-means algorithm and its application for hierarchical clustering....
- Categorization of Wikipedia articles with spectral clustering
  The article reports the application of clustering algorithms for creating hierarchical groups within Wikipedia articles. We evaluate three spectral clustering algorithms based on datasets constructed using Wikipedia categories. The selected algorithm has been implemented in a system that categorizes Wikipedia search results on the fly.
- Cooperative Word Net Editor for Lexical Semantic Acquisition
  The article describes an approach for building the Word Net semantic dictionary in a collaborative paradigm. The presented system enables functionality for gathering lexical data in a Wikipedia-like style. The core of the system is a user-friendly interface based on a component for interactive graph navigation. The component has been used for Word Net semantic network presentation on a web page, and it brings functionalities...
- External Validation Measures for Nested Clustering of Text Documents
  This article handles the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like Rand statistics, Hubert Γ statistic or F-measure, are not applicable in nested clustering cases. In addition to the work in which the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
- Gra słowna do pozyskiwania wiedzy językowej (A word game for acquiring linguistic knowledge)
  The article describes an implementation of a question-based word game that serves as a model of a contextual search engine and as a tool for acquiring knowledge about natural-language concepts. The notion of contextual search is defined and an algorithm that finds objects based on their features is described. The adopted knowledge representation and the learning method are characterized in the context of other known projects addressing the problem of acquisition...
Year 2023
- A Formal Approach to Model the Expansion of Natural Events: The Case of Infectious Diseases
  A formal approach to modeling the expansion of natural events is presented in this paper. Since the mathematical, statistical or computational methods used are not relevant for development, a modular framework is developed that leads from the external observation down to the innermost level of the variables that have to appear in the future mathematical-computational formalization. As an example we analyze the expansion of Covid-19....
- Application of a stochastic compartmental model to approach the spread of environmental events with climatic bias
  Wildfires have significant impacts on both environment and economy, so understanding their behaviour is crucial for the planning and allocation of firefighting resources. Since forest fire management is of great concern, there has been an increasing demand for computationally efficient and accurate prediction models. In order to address this challenge, this work proposes applying a parameterised stochastic model to study the propagation...
- From Scores to Predictions in Multi-Label Classification: Neural Thresholding Strategies
  In this paper, we propose a novel approach for obtaining predictions from per-class scores to improve the accuracy of multi-label classification systems. In a multi-label classification task, the expected output is a set of predicted labels for each test sample. Typically, these predictions are calculated by implicit or explicit thresholding of per-class real-valued scores: classes with scores exceeding a given threshold value...
Year 2022
- Active Learning Based on Crowdsourced Data
  The paper proposes a crowdsourcing-based approach for annotated data acquisition and a means to support the Active Learning training approach. In the proposed solution, aimed at data engineers, the knowledge of the crowd serves as an oracle that is able to judge whether the given sample is informative or not. The proposed solution reduces the amount of work needed to annotate large sets of data. Furthermore, it allows a perpetual increase...
- Detection of anomalies in bee colony using transitioning state and contrastive autoencoders
  Honeybees play a vital role in environmental sustainability and the overall agricultural economy. Assisting bee colonies in their proper functioning attracts the attention of researchers around the world. Electronic systems and machine learning algorithms are being developed for classifying specific undesirable bee behaviors in order to alert about upcoming substantial losses. However, classifiers could be impaired when used...
Year 2012
- Adaptive Algorithm for Interactive Question-based Search
  Popular web search engines tend to improve the relevance of their result pages, but the search is still keyword-oriented and far from "understanding" the queries' meaning. In the article we propose an interactive question-based search algorithm that might prove helpful for identifying users' intents. We describe the algorithm implemented in the form of a question game. The stress is put mainly on the most critical aspect of this...
- Annotating Words Using WordNet Semantic Glosses
  An approach to word sense disambiguation (WSD) relying on the WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with the state-of-the-art WSD methods indicates that the use of WordNet relations...
- Collaborative approach to WordNet and Wikipedia integration
  In this article we present a collaborative approach to creating mappings between WordNet and Wikipedia. Wikipedia articles have first been matched with WordNet synsets in an automatic way. Then such associations have been evaluated and complemented in a collaborative way using a web application. We describe algorithms used for creating automatic mappings as well as a system for their collaborative development. The outcome enables further...
- Context Search Algorithm for Lexical Knowledge Acquisition
  A Context Search algorithm used for lexical knowledge acquisition is presented. Knowledge representation based on psycholinguistic theories of cognitive processes allows for implementation of a computational model of semantic memory in the form of a semantic network. Knowledge acquisition using supervised dialog templates has been performed in a word game designed to guess the concept a human user is thinking about. The game,...
Year 2019
- Advances in Architectures, Big Data, and Machine Learning Techniques for Complex Internet of Things Systems
  The field of Big Data is rapidly developing with a lot of ongoing research, which will likely continue to expand in the future. A crucial part of this is Knowledge Discovery from Data (KDD), also known as the Knowledge Discovery Process (KDP). This process is a very complex procedure, and for that reason it is essential to divide it into several steps (Figure 1). Some authors use five steps to describe this procedure, whereas others...
- An Analysis of Neural Word Representations for Wikipedia Articles Classification
  One of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...
- Bees Detection on Images: Study of Different Color Models for Neural Networks
  This paper presents an approach to bee detection in video streams using a neural network classifier. We describe the motivation for our research and the methodology of data acquisition. The main contribution of this work is a comparison of different color models used as an input format for a feedforward convolutional architecture applied to bee detection. The detection process is based on a neural binary classifier that classifies...
- Crowdsourcing-Based Evaluation of Automatic References Between WordNet and Wikipedia
  The paper presents an approach to building references (also called mappings) between WordNet and Wikipedia. We propose four algorithms used for automatic construction of the references. Then, based on an aggregation algorithm, we produce an initial set of mappings that has been evaluated in a cooperative way. For that purpose, we implement a system for the distribution of evaluation tasks that have been solved by the user community....
- Deep learning in the fog
  In the era of a ubiquitous Internet of Things and fast artificial intelligence advance, especially thanks to deep learning networks and hardware acceleration, we face rapid growth of highly decentralized and intelligent solutions that offer functionality of data processing closer to the end user. The Internet of Things usually produces a huge amount of data that, to be effectively analyzed, especially with neural networks, demands high...
- Distributed Architectures for Intensive Urban Computing: A Case Study on Smart Lighting for Sustainable Cities
  New information and communication technologies have contributed to the development of the smart city concept. On a physical level, this paradigm is characterised by deploying a substantial number of different devices that can sense their surroundings and generate a large amount of data. The most typical case is image and video acquisition sensors. Recently, these types of sensors are found in abundance in urban spaces and are responsible...
- Exact-match Based Wikipedia-WordNet Integration
  The ability to link WordNet synsets and Wikipedia articles allows computers to use those resources during natural language processing. A lot of work has been done in this field; however, most of the approaches focus on similarity between Wikipedia articles and WordNet synsets rather than on the creation of perfect matches. In this paper we propose a set of methods for automatic perfect matching generation. The proposed methods were...
Year 2024
- An intelligent cellular automaton scheme for modelling forest fires
  Forest fires have devastating consequences for the environment, the economy and human lives. Understanding their dynamics is therefore crucial for planning the resources allocated to combat them effectively. In a world where the incidence of such phenomena is increasing every year, the demand for efficient and accurate computational models is growing. In this study, we perform a revision of an initial proposal...
Year 2017
- An IoT-Based Computational Framework for Healthcare Monitoring in Mobile Environments
  The new Internet of Things paradigm allows for small devices with sensing, processing and communication capabilities to be designed, which enable the development of sensors, embedded devices and other ‘things’ ready to understand the environment. In this paper, a distributed framework based on the Internet of Things paradigm is proposed for monitoring human biomedical signals in activities involving physical exertion. The main...
- Analysis of Denoising Autoencoder Properties Through Misspelling Correction Task
  The paper analyzes some properties of denoising autoencoders using the problem of misspelling correction as an exemplary task. We evaluate the capacity of the network in its classical feed-forward form. We also propose a modification to the output layer of the net, which we call multi-softmax. Experiments show that the model trained with this output layer outperforms the traditional network both in learning time and accuracy. We...
- Categorization of Cloud Workload Types with Clustering
  The paper presents a new classification schema of IaaS cloud workload types, based on their functional characteristics. We show the results of an experiment of automatic categorization performed with different benchmarks that represent particular workload types. Monitoring of resource utilization allowed us to construct workload models that can be processed with machine learning algorithms. The direct connection between the functional...
Year 2014
- Automatic Classification of Polish Sign Language Words
  In the article we present an approach to automatic recognition of hand gestures using the eGlove device. We present the research results of a system for detection and classification of static and dynamic words of the Polish language. The results indicate that using the eGlove makes it possible to achieve good recognition quality, which can be further improved using additional data sources such as RGB cameras.
- Big Data Paradigm Developed in Volunteer Grid System with Genetic Programming Scheduler
  Artificial intelligence techniques are capable of handling a large amount of information collected over the web. In this paper, the big data paradigm has been studied in a volunteer grid system called Comcute that is optimized by a genetic programming scheduler. This scheduler can optimize load balancing and resource cost. A genetic programming optimizer has been applied for finding the Pareto solutions. Finally, some results from numerical...
- Comparative Analysis of Text Representation Methods Using Classification
  In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article, an evaluation of approaches to text representation for machine learning tasks, indicates that text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...
- Evaluation of Path Based Methods for Conceptual Representation of the Text
  Typical text clustering methods use the bag of words (BoW) representation to describe the content of documents. However, this method is known to have several limitations. Employing Wikipedia as a lexical knowledge base has been shown to improve the text representation for data-mining purposes. Promising extensions of that trend employ the hierarchical organization of the Wikipedia category system. In this paper we propose three path-based...
- How Specific Can We Be with k-NN Classifier?
  This paper discusses the possibility of designing a two-stage classifier for the large-scale hierarchical and multilabel text classification task that will be a compromise between two common approaches to this task. The first is called big-bang, where there is only one classifier that aims to do the whole job at once. The top-down approach is the second popular option, in which at each node of the category hierarchy, there is a flat classifier...
Year 2016
- Automatic Discovery of IaaS Cloud Workload Types
  The paper presents an approach to automatic discovery of workload types. We characterize the functional properties of the workloads executed in our cloud environment, which have been used to create a model of the computations. To categorize resource utilization we used the K-means algorithm, which allowed us to automatically select six types of computations. We analyze the discovered types against typical computational benchmarks,...
- Depth Images Filtering In Distributed Streaming
  In this paper, we propose a distributed system for processing point clouds and transferring them via a computer network with regard to effectiveness-related requirements. We discuss the comparison of point cloud filters focusing on their usage for streaming optimization. For the filtering step of the stream pipeline processing we evaluate four filters: Voxel Grid, Radial Outlier Remover, Statistical Outlier Removal and Pass Through....
- DEPTH IMAGES FILTERING IN DISTRIBUTED STREAMING
  In this paper we discuss the comparison of point cloud filters focusing on their applicability for streaming optimization. For the filtering stage within stream pipeline processing we evaluate three filters: Voxel Grid, Pass Through and Statistical Outlier Removal. For the filters we perform a series of tests aimed at evaluating changes in point cloud size and transmission frequency (various fps rates). We propose a distributed...
Year 2010
- Automatyczna klasyfikacja artykułów Wikipedii (Automatic classification of Wikipedia articles)
  Wikipedia, an online encyclopedia, uses a category system to organize its articles. At present, assigning an article to the appropriate thematic categories is done manually by its editors. This task is time-consuming and requires knowledge of Wikipedia's structure. Manual categorization is also prone to errors stemming from the fact that the assignment of an article to a category is based on an arbitrary...
- Dynamic Semantic Visual Information Management
  Dominant Internet search engines use keywords and therefore are not suited for exploration of new domains of knowledge, when the user does not know the specific vocabulary. When browsing through articles in a large encyclopedia, each presenting a small fragment of knowledge, it is hard to map the whole domain and to see relevant concepts and their relations. In Wikipedia, for example, some highly relevant articles are not linked with each other....
Year 2020
- Bidirectional Fragment to Fragment Links in Wikipedia
  The paper presents a WikiLinks system that extends the Wikipedia linkage model with bidirectional links between fragments of the articles and overlapping links’ anchors. The proposed model adopts some ideas from the research conducted in the field of nonlinear, computer-aided writing, often called hypertext. WikiLinks may be considered a web augmentation tool, but it presents a new approach to the problem that addresses the specific...
- Buzz-based recognition of the honeybee colony circadian rhythm
  Honeybees are among the most highly valued pollinators. Their work as individuals is appreciated for crop pollination and honey production. It is believed that the work of an entire bee colony is intense and almost continuous. The goal of the work presented in this paper is the identification of the bees' circadian rhythm with the use of sound-based analysis. In our research, as a source of information on the bee colony we use their buzz that has been...
- Collaborative Data Acquisition and Learning Support
  With the constant development of neural networks, traditional algorithms relying on data structures lose their significance as more and more solutions are using AI rather than traditional algorithms. This in turn requires a lot of correctly annotated and informative data samples. In this paper, we propose a crowdsourcing-based approach for data acquisition and tagging with support for Active Learning where the system acts as an...
- Framework for Integration Decentralized and Untrusted Multi-vendor IoMT Environments
  Lack of standardization is highly visible when we use historical data sets or compare our model with others that use IoMT devices from different vendors. The problem also concerns trust in highly decentralized and anonymous environments where sensitive data are transferred through the Internet and then analyzed by third-party companies. In our research we propose a standard that has been implemented in the form of a framework...
Year 2021
- Blockchain technologies to address smart city and society challenges
  New Information and Communications Technologies (ICT) are changing the way in which the world works. These technologies provide new tools to face the issues of contemporary society (poverty, migrations, sustainable development challenges, governance, etc.). Among them, blockchains emerge as a disruptive technology able to do things in a completely different and innovative way. They can provide solutions where before there were...
- Buzz-based honeybee colony fingerprint
  Non-intrusive remote monitoring has applications in a variety of areas. In industrial surveillance, devices are capable of detecting anomalies that may threaten machine operation. Similarly, agricultural monitoring devices are used to supervise livestock or provide higher yields. Modern IoT devices are often coupled with Machine Learning models, which provide valuable insights into device operation. However, the data...
- Embedded Representations of Wikipedia Categories
  In this paper, we present an approach to building neural representations of the Wikipedia category graph. We test four different methods and examine the neural embeddings in terms of preservation of graph edges, neighborhood coverage in representation space, and their influence on the results of a task predicting the parent of two categories. The main contribution of this paper is the application of neural representations for improving the...
- Fast Approximate String Search for Wikification
  The paper presents a novel method for fast approximate string search based on neural distance metrics embeddings. Our research is focused primarily on applying the proposed method for entity retrieval in the Wikification process, which is similar to edit distance-based similarity search on a typical dictionary. The proposed method has been compared with the symmetric delete spelling correction algorithm and proven to be more efficient...
- Generowanie tekstu z użyciem sieci typu Transformer (Text generation using Transformer-type networks)
  The operation of selected machine learning models applied in natural language processing is described, in particular those used for text generation. The BERT model and its various versions are also presented, along with the practical use of Transformer-type models. Their operation is demonstrated in an application that changes the mood of a text sequentially.
Year 2013
- Bringing Common Sense to WordNet with a Word Game
  We present a tool for common sense knowledge acquisition in the form of a twenty questions game. The described approach uses the WordNet dictionary, whose rich taxonomy makes it possible to maintain cognitive economy and accelerate knowledge propagation, although sometimes inferences made on hierarchical relations result in noise. We extend the dictionary with common sense assertions acquired during the games played with humans. The facts added to the...
Year 2005
- Concept description vectors and the 20 question game
  Knowledge of properties that are applicable to a given object is a necessary prerequisite to formulating intelligent questions. Concept description vectors provide the simplest representation of this knowledge, storing for each object information about the values of its properties. Experiments with automatic creation of concept description vectors from various sources, including ontologies, dictionaries, encyclopedias and unstructured...
Year 2007
- Cooperative editing approach for building Wordnet database
  The article presents an approach to cooperative work on the Wordnet system database. The system architecture and the visualization of the network of conceptual links using the TouchGraph component are described.
Year 2018
- DBpedia and YAGO Based System for Answering Questions in Natural Language
  In this paper we propose a method for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for the TREC conference) based on DBpedia and YAGO. Our method is based on generating dependency trees for the query. In the dependency tree we look for paths leading from the root to the named entity of interest. These paths (referred to below as fibers) are candidates for representation of actual user intention. The...
- Detection of the Bee Queen Presence Using Sound Analysis
  This work describes the system and methods of data analysis we use for beehive monitoring. We present an overview of the hardware infrastructures used in hive monitoring systems and we describe algorithms used for analysis of this kind of data. Based on the acquired signals we construct an application that is capable of detecting the absence of a honey bee queen. We describe our method of signal analysis and present results that allow us...
Year 2015
- DBpedia As a Formal Knowledge Base – An Evaluation
  DBpedia is widely used by researchers as a means of accessing Wikipedia in a standardized way. In this paper it is characterized from the point of view of a question answering system. A simple implementation of such a system is also presented. The paper also characterizes alternatives to DBpedia in the form of the OpenCyc and YAGO knowledge bases. A comparison between DBpedia and those knowledge bases is presented.