Search results for: text representation documents categorization information retrieval

total: 444

Anna Baj-Rogowska dr

People

Department of Informatics in Management

Anna Baj-Rogowska is employed as an assistant professor at the Department of Informatics in Management at the Faculty of Management and Economics, Gdańsk University of Technology. Her higher education is connected with the University of Gdańsk, where she completed a master's degree in business informatics and doctoral studies, and then obtained a PhD degree in economics in management science (Department of Business Informatics...
Improving css-KNN Classification Performance by Shifts in Training Data
Publication
- K. Draszawka
- J. Szymański
- F. Guerra
- Year 2015
This paper presents a new approach to improve the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects, that would be optimized during off-line learning. In this paper we propose...
An Analysis of Neural Word Representations for Wikipedia Articles Classification
Publication
- J. Szymański
- N. Kawalec
- CYBERNETICS AND SYSTEMS - Year 2019
One of the current popular methods of generating word representations is an approach based on the analysis of large document collections with neural networks. It creates so-called word-embeddings that attempt to learn relationships between words and encode this information in the form of a low-dimensional vector. The goal of this paper is to examine the differences between the most popular embedding models and the typical bag-of-words...

Full text to download in external service
Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations
Publication
- N. Rizun
- W. Waloszek
- Y. Taranenko
- Information - Year 2018
The research presents the methodology of improving the accuracy in sentiment classification in the light of modelling the latent semantic relations (LSR). The objective of this methodology is to find ways of eliminating the limitations of the discriminant and probabilistic methods for LSR revealing and customizing the sentiment classification process (SCP) to the more accurate recognition of text tonality. This objective was achieved...

Full text available to download
Concept description vectors and the 20 question game
Publication
- J. Szymański
- T. Sarnatowicz
- W. Duch
- Year 2005
Knowledge of properties that are applicable to a given object is a necessary prerequisite to formulate intelligent questions. Concept description vectors provide the simplest representation of this knowledge, storing for each object information about the values of its properties. Experiments with automatic creation of concept description vectors from various sources, including ontologies, dictionaries, encyclopedias and unstructured...

Full text to download in external service
Fusion-based Representation Learning Model for Multimode User-generated Social Network Content
Publication
- A. M. Soomar
- ACM Journal of Data and Information Quality - Year 2023
As mobile networks and APPs are developed, user-generated content (UGC), which includes multi-source heterogeneous data like user reviews, tags, scores, images, and videos, has become an essential basis for improving the quality of personalized services. Due to the multi-source heterogeneous nature of the data, big data fusion offers both promise and drawbacks. With the rise of mobile networks and applications, UGC, which includes...

Full text to download in external service
Development and Research of the Text Messages Semantic Clustering Methodology
Publication
- N. Rizun
- P. Kapłański
- Y. Taranenko
- Year 2016
The methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...

Full text available to download
Just look at to open it up: A biometric verification facility for password autofill to protect electronic documents
Publication
- M. Smiatacz
- B. Wiszniewski
- MULTIMEDIA TOOLS AND APPLICATIONS - Year 2021
Electronic documents constitute specific units of information, and protecting them against unauthorized access is a challenging task. This is because a password protected document may be stolen from its host computer or intercepted while on transfer and exposed to unlimited offline attacks. The key issue is, therefore, making document passwords hard to crack. We propose to augment a common text password authentication interface...

Full text available to download
Agile Commerce in the light of Text Mining
Publication
- A. Baj-Rogowska
- Przedsiębiorczość i Zarządzanie - Year 2017
The survey conducted for this study reveals that more than 84% of respondents have never encountered the term “agile commerce” and do not understand its meaning. At the same time, they are active participants of this strategy. Using digital channels as customers more often than ever before, they have already been included in the agile philosophy. Based on the above, the purpose of the study is to analyse major text sets containing...

Full text available to download
Ontologies vs. Rules — Comparison of Methods of Knowledge Representation Based on the Example of IT Services Management
Publication
- A. Czarnecki
- T. Sitek
- Year 2013
This text provides a brief overview of selected structures aimed at knowledge representation in the form of ontologies based on description logic and aims at comparing them with their counterparts based on the rule-based approach. Due to the limitations on the length of the article, only elements associated with the representation of concepts could be shown, without including roles. The formalisms of the OWL language were used...

Full text to download in external service
Semantic Analysis and Text Summarization in Socio-Technical Systems
Publication
- N. Rizun
- Year 2018
In this chapter the authors present the results of the development of the methodology for increasing the reliability of the functioning of the Socio-Technical System. The existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches of Latent Semantic Relations analysis in of the summarization projection...

Full text to download in external service
Context Search Algorithm for Lexical Knowledge Acquisition
Publication
- J. Szymański
- W. Duch
- CONTROL AND CYBERNETICS - Year 2012
A Context Search algorithm used for lexical knowledge acquisition is presented. Knowledge representation based on psycholinguistic theories of cognitive processes allows for implementation of a computational model of semantic memory in the form of a semantic network. Knowledge acquisition using supervised dialog templates has been performed in a word game designed to guess the concept a human user is thinking about. The game,...
Methodology of Selecting the Hadoop Ecosystem Configuration in Order to Improve the Performance of a Plagiarism Detection System
Publication
- A. Sobecki
- M. Kępa
- Year 2018
The plagiarism detection problem involves finding patterns in unstructured text documents. Similarity of documents in this approach means that the documents contain some identical phrases with defined minimal length. The typical methods used to find similar documents in digital libraries are not suitable for this task (plagiarism detection) because found documents may contain similar content and we have not any warranty that...

Full text to download in external service
SEMANTIC ANALYSIS ALGORITHMS FOR KNOWLEDGE WORKERS SUPPORT
Publication
- N. Rizun
- M. Rizun
- J. Taranenko
- Year 2017
The paper examines various aspects of text analysis application for knowledge worker’s activity realization. Conclusions are drawn about the relevance and importance of processing the non-structured textual information in order to increase knowledge worker’s efficiency, as well as their awareness in different branches of science. The paper considers the existing algorithms of texts semantic analysis as the sphere of documents topical...

Full text available to download
Information Retrieval Facility Conference

Conferences
Asia Information Retrieval Symposium

Conferences
European Conference on Information Retrieval

Conferences
SIGIR workshop: Stylistic Analysis of Text For Information Access

Conferences
Music Mood Visualization Using Self-Organizing Maps
Publication
- M. Piotrowska
- B. Kostek
- Archives of Acoustics - Year 2015
Due to an increasing amount of music being made available in digital form on the Internet, an automatic organization of music is sought. The paper presents an approach to graphical representation of mood of songs based on Self-Organizing Maps. Parameters describing mood of music are proposed and calculated and then analyzed employing correlation with mood dimensions based on the Multidimensional Scaling. A map is created in which...

Full text available to download
DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING
Publication
- N. Rizun
- J. Taranenko
- Rocznik Naukowy Wydzialu Zarzadzania w Ciechanowie - Year 2017
The algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming...

Full text available to download
Ontologie vs. reguły — porównanie metod reprezentacji wiedzy na przykładzie dziedziny zarządzania usługami informatycznymi
Publication
- A. Czarnecki
- T. Sitek
- Ekonomiczne Problemy Usług - Year 2013
The text is a brief overview of selected constructs for representing knowledge in the form of ontologies based on description logic, and a comparison of them with their counterparts based on rule notation. Because of the limited number of pages, only elements related to the representation of concepts are shown, without considering roles. The formalisms of the OWL language were used to express the ontologies, while the rules were expressed in Prolog. To better illustrate these...

Full text available to download
Krystyna Dziubich mgr inż.

People

Department of Computer Architecture

Krystyna Dziubich obtained an Eng. degree in computer science, granted by a council at the Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, in 1996. From 1996 to 2005 she worked in industry as a computer scientist and specialist analyst in the Department of Management Systems Development. She has been employed at the ETI Faculty as a lecturer since 2005. She conducts lectures for full-time, extramural...
Internal legal acts of technical and medical universities in Poland regulating classes conducted in-person during the Covid-19 pandemic
Open Research Data
open access
- K. Górak-Sosnowska
- L. Tomaszewska
A database of legal acts and other internal documents of medical and technical universities in Poland regulating the way of organizing in-person or hybrid classes during the COVID-19 pandemic from the summer semester 2019/2020 to the winter semester 2020/2021. Documents were encoded in two separate coding systems using the MAXQDA program for qualitative...
Speech Analytics Based on Machine Learning
Publication
- Year 2019
In this chapter, the process of speech data preparation for machine learning is discussed in detail. Examples of speech analytics methods applied to phonemes and allophones are shown. Further, an approach to automatic phoneme recognition involving optimized parametrization and a classifier belonging to machine learning algorithms is discussed. Feature vectors are built on the basis of descriptors coming from the music information...

Full text to download in external service
Towards Increasing Density of Relations in Category Graphs
Publication
- Year 2014
In the chapter we propose methods for identifying new associations between Wikipedia categories. The first method is based on Bag-of-Words (BOW) representation of Wikipedia articles. Using the similarity of articles belonging to different categories makes it possible to calculate the similarity between categories. The second method is based on average scores given to categories while categorizing documents by our dedicated score-based...

Full text to download in external service
Retrieval of Heterogeneous Services in C2NIWA Repository
Publication
- J. Szymański
- TASK Quarterly - Year 2015
The paper reviews the methods used for retrieval of information and services. The selected approaches presented in the review inspired us to build retrieval mechanisms in a system for searching the resources stored in the C2NIWA repository. We describe the architecture of the system, its functions and the surrounding subsystems to which it is related. For retrieval of C2NIWA services we propose three approaches based on: keyword...

Full text available to download
Marek Czachor prof. dr hab.

People

Instytut Fizyki i Informatyki Stosowanej
Context-Aware Indexing and Retrieval for Cognitive Systems Using SOEKS and DDNA
Publication
- C. De Silva Oliveira
- C. Sanin
- E. Szczerbicki
- Advances in Intelligent Systems and Computing - Year 2019
Visual content searching, browsing and retrieval tools have been a focus area of interest as they are required by systems from many different domains. Context-based, Content-Based, and Semantic-based are different approaches utilized for indexing/retrieving, but have their drawbacks when applied to systems that aim to mimic the human capabilities. Such systems, also known as Cognitive Systems, are still limited in terms of processing...

Full text available to download
International Conference on the Theory of Information Retrieval (The 3rd ACM International Conference on the Theory of Information Retrieval)

Conferences
CAD. Integrated Architectural Design, MSc Arch (2022/2023)
e-Learning Courses
- D. Cyparski
The programme will provide students with a solid grounding in BIM (Building Information Modelling) using Autodesk's Revit Architecture. Students will review the advanced features of Revit for Architecture, a tool to support BIM (Building Information Modelling) and delivery of 3D digital models and related documentation. The lesson plans will specifically introduce students to common workflows and problem-solving skills while creating...
CAD. Integrated Architectural Design, BSc Arch (2023-24)
e-Learning Courses
- D. Cyparski
The programme will provide students with a solid grounding in BIM (Building Information Modelling) using Autodesk's Revit Architecture. Students will review the advanced features of Revit for Architecture, a tool to support BIM (Building Information Modelling) and delivery of 3D digital models and related documentation. The lesson plans will specifically introduce students to common workflows and problem-solving skills while creating...
DBpedia and YAGO Based System for Answering Questions in Natural Language
Publication
- Year 2018
In this paper we propose a method for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for TREC conference) based on DBpedia and YAGO. Our method is based on generating dependency trees for the query. In the dependency tree we look for paths leading from the root to the named entity of interest. These paths (referenced further as fibers) are candidates for representation of actual user intention. The...

Full text available to download
Contextual ontology for tonality assessment
Publication
- W. Waloszek
- N. Rizun
- Procedia Computer Science - Year 2020
classification tasks. The discussion focuses on two important research hypotheses: (1) whether it is possible to construct such an ontology from a corpus of textual document, and (2) whether it is possible and beneficial to use inferencing from this ontology to support the process of sentiment classification. To support the first hypothesis we present a method of extraction of hierarchy of contexts from a set of textual documents...

Full text available to download
Semantic Memory for Avatars in Cyberspace
Publication
- J. Szymański
- T. Sarnatowicz
- W. Duch
- Year 2005
Avatars that show intelligent behavior should have access to general knowledge about the world, knowledge that humans store in their semantic memories. The simplest knowledge representation for semantic memory is based on the Concept Description Vectors (CDVs) that store, for each concept, information on whether a given property can be applied to this concept or not. Unfortunately, large-scale semantic memories are not available....
Next Generation Digital
Publication
- B. Wiszniewski
- Pan European Networks: Science & Technology - Year 2013
The paper outlines the major objectives of the MENAID research project, aimed at novel architectures of digital documents. Such documents will enable reduction of information overflow and strain, a major threat to the growth of a digital society. They will be forward compatible, technology neutral and lightweight, allowing workers of network organizations to use personal devices of any type.

Full text to download in external service
Modeling the Customer’s Contextual Expectations Based on Latent Semantic Analysis Algorithms
Publication
- Year 2017
Nowadays, in the age of the Internet, access to open data reveals huge possibilities for information retrieval. More and more often we hear about the concept of open data, which means unrestricted access together with reuse and analysis by external institutions, organizations and people. Such information can be freely processed, combined with other data (a so-called remix) and then published. More and more data are available in text...

Full text available to download
Towards Healthcare Cloud Computing
Publication
- Year 2016
In this paper we present the construction of a software platform for supporting medical research teams, in the area of impedance cardiography, called IPMed. Using the platform, research tasks will be performed by the teams through computer-supported cooperative work. The platform enables secure medical data storing, access to the data for research group members, cooperative analysis of medical data and provides analysis supporting tools...

Full text to download in external service
Machine Learning and Text Analysis in an Artificial Intelligent System for the Training of Air Traffic Controllers
Publication
- T. Shmelova
- Y. Sikirda
- N. Rizun
- V. Lazorenko
- V. Kharchenko
- Year 2020
This chapter presents the application of new information technology in education for the training of air traffic controllers (ATCs). Machine learning, multi-criteria decision analysis, and text analysis as the methods of artificial intelligence for ATCs training have been described. The authors have made an analysis of the International Civil Aviation Organization documents for modern principles of ATCs education. The prototype...

Full text available to download
Workflow patterns applicable to virtual knowledge-based organizations
Publication
- M. Godlewska
- Zeszyty Naukowe Wydziału ETI Politechniki Gdańskiej. Technologie Informacyjne - Year 2010
Workflow is a term specifying how to automate a business process, in whole or in part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules. Workflow is therefore directly applicable in virtual knowledge-based organizations, where information is exchanged via electronic documents. The literature presents a complete list of workflow control-flow...
System of specific grants for local government units in Poland
Publication
- A. Sekuła
- Year 2009
The article analyses the system of specific grants in local governments in Poland. First, main revenue sources of local self-governments are presented. Their presentation is based upon the consideration of one of the basic principles in democratic states today, i.e. decentralization. The text then describes specific grants in more detail with respect to the European Charter of Local Self-Government. Subsequently, the...
Gaining knowledge through experience: developing decisional DNA applications in robotics
Publication
- H. Zhang
- C. Sanin
- E. Szczerbicki
- CYBERNETICS AND SYSTEMS - Year 2010
A novel approach to applying experience-based knowledge and the construction of Decisional DNA in robotics-related areas is discussed. In this article, we explore an approach that integrates Decisional DNA, a domain-independent, flexible, and standard knowledge representation structure, with robots in order to test the usability and suitability of this novel knowledge representation structure. Core issues in using this Decisional...

Full text to download in external service
Facial data registration facility for biometric protection of electronic documents
Publication
- Year 2014
In the modern world, information is crucial, and its leakage may lead to serious losses. Documents, as the main medium of information, must therefore be highly protected. Nowadays, the most common way of protecting data is using passwords; however, it is inconvenient to type complex passwords when they are needed many times a day. For that reason, significant research has been conducted on biometric authentication...
ACM SIGIR Workshop on XML and Information Retrieval

Conferences
International Symposium on String Processing and Information Retrieval

Conferences
Magdalena Szuflita-Żurawska

People

Gdańsk University of Technology, Scientific and Technological Information Section, Main Library

Head of the Scientific and Technical Information Services at the Gdansk University of Technology Library and the Leader of the Open Science Competence Center. She is also a Plenipotentiary of the Rector of the Gdańsk University of Technology for open science. She is a PhD Candidate. Her main areas of research and interests include research productivity, motivation, management of HEs, Open Access, Open Research Data, information...
Manufacturing Data Analysis in Internet of Things/Internet of Data (IoT/IoD) Scenario
Publication
- E. Szczerbicki
- S. I. Shafiq
- C. Sanin
- CYBERNETICS AND SYSTEMS - Year 2018
Computer integrated manufacturing (CIM) has enormous benefits as it increases the rate of production, reduces errors and production waste, and streamlines manufacturing sub-systems. However, there are some new challenges related to CIM operating in the Internet of Things/Internet of Data (IoT/IoD) scenarios associated with Industry 4.0 and cyber-physical systems. The main challenge is to deal with the massive volume of data flowing...

Full text available to download
Self-Organizing Map representation for clustering Wikipedia search results
Publication
- J. Szymański
- LECTURE NOTES IN COMPUTER SCIENCE - Year 2011
The article presents an approach to automated organization of textual data. The experiments have been performed on a selected subset of Wikipedia. The Vector Space Model representation based on terms has been used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To ensure efficiency of the data processing, we performed linear dimensionality reduction of raw data using Principal...
Self–Organizing Map representation for clustering Wikipedia search results
Publication
- J. Szymański
- Year 2011
The article presents an approach to automated organization of textual data. The experiments have been performed on a selected subset of Wikipedia. The Vector Space Model representation based on terms has been used to build groups of similar articles extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To ensure efficiency of the data processing, we performed linear dimensionality reduction of raw data using Principal...

Full text to download in external service
ACM International Conference on Research and Development in Information Retrieval

Conferences
Semantic URL Analytics to Support Efficient Annotation of Large Scale Web Archives
Publication
- T. Souza
- E. Demidova
- T. Risse
- H. Holzmann
- G. Gossen
- J. Szymański
- Year 2015
Long-term Web archives comprise Web documents gathered over longer time periods and can easily reach hundreds of terabytes in size. Semantic annotations such as named entities can facilitate intelligent access to the Web archive data. However, the annotation of the entire archive content on this scale is often infeasible. The most efficient way to access the documents within Web archives is provided through their URLs, which are...

Full text to download in external service
