dr hab. inż. Julian Szymański
Employment
- Deputy Director, Industrial Doctoral School at Industrial Doctoral School
- Associate professor at Department of Computer Architecture
Publications
total: 132
Catalog Publications
Year 2013
-
Thresholding Strategies for Large Scale Multi-Label Text Classifier
Publication: This article presents an overview of thresholding methods for labeling objects given a list of candidate class scores. These methods are essential to multi-label classification tasks, especially when there are many classes organized in a hierarchy. The presented techniques are evaluated using a state-of-the-art dedicated classifier on medium-scale text corpora extracted from Wikipedia. Obtained results show that the...
-
Wikipedia Articles Representation with Matrix'u
Publication: In the article we evaluate different text representation methods used for the task of Wikipedia article categorization. We present the Matrix’u application used for creating computational datasets of Wikipedia articles. The representations have been evaluated with SVM classifiers used to reconstruct human-made categories.
Year 2012
-
Adaptive Algorithm for Interactive Question-based Search
Publication: Popular web search engines tend to improve the relevance of their result pages, but the search is still keyword-oriented and far from "understanding" the queries' meaning. In the article we propose an interactive question-based search algorithm that may prove helpful for identifying users' intents. We describe the algorithm implemented in the form of a question game. The stress is put mainly on the most critical aspect of this...
-
Annotating Words Using WordNet Semantic Glosses
Publication: An approach to word sense disambiguation (WSD) relying on the WordNet synsets is proposed. The method uses semantically tagged glosses to perform a process similar to spreading activation in a semantic network, creating a ranking of the most probable meanings for word annotation. Preliminary evaluation shows quite promising results. Comparison with the state-of-the-art WSD methods indicates that the use of WordNet relations...
-
Collaborative approach to WordNet and Wikipedia integration
Publication: In this article we present a collaborative approach to creating mappings between WordNet and Wikipedia. Wikipedia articles have been first matched with WordNet synsets in an automatic way. Then such associations have been evaluated and complemented in a collaborative way using a web application. We describe algorithms used for creating automatic mappings as well as a system for their collaborative development. The outcome enables further...
-
Context Search Algorithm for Lexical Knowledge Acquisition
Publication: A Context Search algorithm used for lexical knowledge acquisition is presented. Knowledge representation based on psycholinguistic theories of cognitive processes allows for implementation of a computational model of semantic memory in the form of a semantic network. Knowledge acquisition using supervised dialog templates has been performed in a word game designed to guess the concept a human user is thinking about. The game,...
-
Interactive Information Retrieval Algorithm for Wikipedia Articles
Publication: The article presents an algorithm for retrieving textual information in document collections. The algorithm employs a category system that organizes the repository and uses interaction with the user to improve search precision. The algorithm was implemented for Simple English Wikipedia, and the first evaluation results indicate that the proposed method can help to retrieve information from large document repositories.
-
Matching Exception Class Hierarchies between .NET, Java Environments
Publication: The paper presents a methodology of exception classification and matching exception messages between .NET and Java environments. The methodology operates on existing exception class hierarchies and proposes two complementary approaches: automated and manual matching. The automated matching uses a similarity measure to find associations between exception messages from the two sets of classes for the considered programming languages....
-
Rozpraszanie obliczeń za pomocą serwerów dystrybucyjnych [Distributing Computations Using Distribution Servers]
Publication: The principles of operation of distribution servers in a grid-class computing system working in volunteer computing mode are discussed. Methods of increasing the performance of this system layer by managing the stream of data packages are described. The Map-Reduce concept in the implementation of parallel processing is also addressed.
-
Self Organizing Maps for Visualization of Categories
Publication: Visualization of Wikipedia categories using Self Organizing Maps shows an overview of categories and their relations, helping to narrow down search domains. By selecting particular neurons, this approach enables retrieval of conceptually similar categories. Evaluation of neural activations indicates that they form coherent patterns that may be useful for building user interfaces for navigation over category structures.
-
Text classifiers for automatic articles categorization
Publication: The article concerns the problem of automatic classification of textual content. We present selected methods for generating document representations and we evaluate them in classification tasks. The experiments have been performed on Wikipedia articles classified automatically into the categories created by Wikipedia editors.
-
Towards Effective Processing of Large Text Collections
Publication: In the article we describe an approach to parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and the k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles connected by links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...
-
Words context analysis for improvement of information retrieval
Publication: In the article we present an approach to improving information retrieval from large text collections using word context vectors. The vectors have been created by analyzing English Wikipedia with the Hyperspace Analogue to Language model of word similarity. For test phrases we evaluate retrieval with direct user queries as well as retrieval with context vectors of these queries. The results indicate that the proposed method can not...
-
Zastosowanie systemu Comcute do łamania algorytmu DES [Application of the Comcute System to Breaking the DES Algorithm]
Publication: The application of the Comcute system to breaking the DES cipher is presented. The basic architecture used for distributing computations is described, and the scalability of the solution is reported as a function of the number of computational units used.
Year 2011
-
0-step K-means for clustering Wikipedia search results
Publication: This article describes an improvement to the K-means algorithm and its application in the form of a system that clusters search results retrieved from Wikipedia. The proposed algorithm eliminates K-means disadvantages and allows one to create a cluster hierarchy. The main contributions of this paper include the following: (1) The concept of an improved K-means algorithm and its application for hierarchical clustering....
-
Categorization of Wikipedia articles with spectral clustering
Publication: The article reports the application of clustering algorithms for creating hierarchical groups within Wikipedia articles. We evaluate three spectral clustering algorithms on datasets constructed using Wikipedia categories. The selected algorithm has been implemented in a system that categorizes Wikipedia search results on the fly.
-
Cooperative WordNet Editor for Lexical Semantic Acquisition
Publication: The article describes an approach to building the WordNet semantic dictionary in a collaborative paradigm. The presented system provides functionality for gathering lexical data in a Wikipedia-like style. The core of the system is a user-friendly interface based on a component for interactive graph navigation. The component has been used for presenting the WordNet semantic network on a web page, and it brings functionalities...
-
External Validation Measures for Nested Clustering of Text Documents
Publication: This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like the Rand statistic, Hubert's Γ statistic, or the F-measure, are not applicable in nested clustering cases. In addition to the work in which the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
-
Gra słowna do pozyskiwania wiedzy językowej [A Word Game for Acquiring Linguistic Knowledge]
Publication: The article describes the implementation of a question-based word game that serves as a model of a context search engine and as a tool for acquiring knowledge about natural language concepts. The term "context search" is defined, and an algorithm that finds objects based on their features is described. The adopted knowledge representation and the learning method are characterized in the context of other known projects addressing the problem of acquisition...
-
Induction of the common-sense hierarchies in lexical data
Publication: Unsupervised organization of a set of lexical concepts that captures common-sense knowledge, inducing a meaningful partitioning of data, is described. Projection of data on principal components allows for identification of clusters with wide margins, and the procedure is recursively repeated within each cluster. Application of this idea to a simple dataset describing animals created hierarchical partitioning with each cluster related...