Search results for: TEXT INDEXING

Context-Aware Indexing and Retrieval for Cognitive Systems Using SOEKS and DDNA

Publication

C. De Silva Oliveira
C. Sanin
E. Szczerbicki

- Advances in Intelligent Systems and Computing - Year 2019

Visual content searching, browsing and retrieval tools have been a focus area of interest as they are required by systems from many different domains. Context-based, Content-Based, and Semantic-based are different approaches utilized for indexing/retrieving, but have their drawbacks when applied to systems that aim to mimic the human capabilities. Such systems, also known as Cognitive Systems, are still limited in terms of processing...

Full text available to download

Acquisition and indexing of RGB-D recordings for facial expressions and emotion recognition

Publication

M. Szwoch

- Studia Informatica Pomerania - Year 2015

This paper describes KinectRecorder, a comprehensive tool that provides convenient and fast acquisition, indexing and storing of RGB-D video streams from the Microsoft Kinect sensor. The application is especially useful as a supporting tool for creating fully indexed databases of facial expressions and emotions that can be further used for training and testing emotion recognition algorithms for affect-aware applications....

Full text to download in external service

A comparison of indexing methods to evaluate quality of soils: the role of soil microbiological properties

Publication

R. Romaniuk
L. Giuffré
A. Costantini
N. Bartoloni
P. Nannipieri
R. S. Romaniuk

- Soil Research - Year 2011

Full text to download in external service

The partial-order tree: a new structure for indexing on complex attributes in object-oriented databases

Publication

K. Goczyla

- Year 1997

Full text to download in external service

Inducing a map on homology from a correspondence

Publication

S. Harker
H. Kokubu
K. Mischaikow
P. Pilarczyk

- PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY - Year 2015

Full text to download in external service

A comparison of indexing methods to evaluate quality of horticultural soils. Part II. sensitivity of soil microbiological indicators

Publication

R. Romaniuk
L. Giuffré
A. Costantini
N. Bartoloni
P. Nannipieri
R. S. Romaniuk

- Soil Research - Year 2014

Full text to download in external service

Text classifiers for automatic articles categorization

Publication

- Year 2012

The article concerns the problem of automatic classification of textual content. We present selected methods for generating document representations and evaluate them in classification tasks. The experiments were performed on Wikipedia articles classified automatically into the categories created by Wikipedia editors.

Agile Commerce in the light of Text Mining

Publication

A. Baj-Rogowska

- Przedsiębiorczość i Zarządzanie - Year 2017

The survey conducted for this study reveals that more than 84% of respondents have never encountered the term “agile commerce” and do not understand its meaning. At the same time, they are active participants of this strategy. Using digital channels as customers more often than ever before, they have already been included in the agile philosophy. Based on the above, the purpose of the study is to analyse major text sets containing...

Full text available to download

A Proliferation-Inducing Ligand and B-Cell Activating Factor Are Upregulated in Patients with Essential Thrombocythemia

Publication

L. Bolkun
M. Tynecka
T. Wasiluk
J. Piszcz
A. Starosz
K. Grubczak
M. Moniuszko
A. Eljaszewicz

- Journal of Clinical Medicine - Year 2022

Full text to download in external service

Prioritising national healthcare service issues from free text feedback – A computational text analysis & predictive modelling approach

Publication

A. Ojo
N. Rizun
G. Walsh
M. I. Mashinchi
M. Venosa
M. N. Rao

- DECISION SUPPORT SYSTEMS - Year 2024

Patient experience surveys have become a key source of evidence for supporting decision-making and continuous quality improvement within healthcare services. To harness free-text feedback collected as part of these surveys for additional insights, text analytics methods are increasingly employed when the data collected is not amenable to traditional qualitative analysis due to volume. However, while text analytics techniques offer...

Full text available to download

Redox reactions of the FAD-containing apoptosis-inducing factor (AIF) with quinoidal xenobiotics: A mechanistic study

Publication

L. Misevičienė
Ž. Anusevičius
J. Šarlauskas
I. Sevrioukova
N. Čėnas

- Archives of Biochemistry and Biophysics - Year 2011

Full text to download in external service

UGT1A1 gene polymorphism as a potential factor inducing iron overload in the pathogenesis of type 1 hereditary hemochromatosis

Publication

T. Romanowski
K. Sikorska
K. Bielawski
K. P. Bielawski

- HEPATOLOGY RESEARCH - Year 2009

Full text to download in external service

Text Documents Classification with Support Vector Machines

Publication

P. Majewski

- Year 2008

Towards Effective Processing of Large Text Collections

Publication

- Year 2012

In the article we describe the approach to parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computations of similarity matrices and k-means algorithm. The test datasets have been prepared as graphs created from Wikipedia articles related with links. When we create the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualizations of...

Parallel Computations of Text Similarities for Categorization Task

Publication

J. Szymański

- Year 2013

In this chapter we describe an approach to the parallel implementation of similarity computations in high-dimensional spaces. The similarity computations have been used for textual data categorization. We create the test datasets from Wikipedia articles which, together with their hyper references, form a graph used in our experiments. Similarities based on Euclidean distance and the cosine measure have been used to process the data using the k-means algorithm....

Interactive Information Search in Text Data Collections

Publication

- Year 2013

This article presents a new idea for retrieval in text repositories and describes the general infrastructure of a system created to implement and test those ideas. The implemented system differs from today’s standard search engines by introducing a process of interactive search with users and data clustering. We present the basic algorithms behind our system and the measures we used for evaluating results. The achieved results...

Full text to download in external service

Text Categorization Improvement via User Interaction

Publication

J. Atroszko
J. Szymański
D. Gil
H. Mora

- Year 2018

In this paper, we propose an approach to improvement of text categorization using interaction with the user. The quality of categorization has been defined in terms of a distribution of objects related to the classes and projected on the self-organizing maps. For the experiments, we use the articles and categories from the subset of Simple Wikipedia. We test three different approaches for text representation. As a baseline we use...

Full text to download in external service

Evaluation and Irony in Text in the Light of Speech Act Theory

Publication

K. Kukowicz-Zarska

- Forum Filologiczne Ateneum - Year 2020

Full text to download in external service

Evaluation of Path Based Methods for Conceptual Representation of the Text

Publication

- Year 2014

Typical text clustering methods use the bag of words (BoW) representation to describe content of documents. However, this method is known to have several limitations. Employing Wikipedia as the lexical knowledge base has shown an improvement of the text representation for data-mining purposes. Promising extensions of that trend employ hierarchical organization of Wikipedia category system. In this paper we propose three path-based...

Full text to download in external service

Text categorization with semantic commonsense knowledge: First results

Publication

P. Majewski
J. Szymański

- Year 2008

Text processing typically uses BOW representations. However, this approach does not give good results when similar documents do not share words. The article presents an approach to constructing a kernel function for SVM classifiers based on an external knowledge base of linguistic concepts.

External Validation Measures for Nested Clustering of Text Documents

Publication

- Year 2011

Abstract. This article handles the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like Rand statistics, Hubert Γ statistic or F-measure are not applicable in nested clustering cases. Additionally to the work, where F-measure was adopted to hierarchical classification as hF-measure, here some methods to...

Time-domain prosodic modifications for text-to-speech synthesizer

Publication

- Year 2010

An application of prosodic speech processing algorithms to Text-To-Speech synthesis is presented. Prosodic modifications that improve the naturalness of the synthesized signal are discussed. The applied method is based on the TD-PSOLA algorithm. The developed Text-To-Speech Synthesizer is used in applications employing multimodal computer interfaces.

Semantic Analysis and Text Summarization in Socio-Technical Systems

Publication

N. Rizun

- Year 2018

In this chapter the authors present the results of developing a methodology for increasing the reliability of the functioning of the Socio-Technical System. The existing methods and algorithms for processing unstructured (textual) information were studied. Taking into account the above-noted strengths and weaknesses of the Discriminant and Probabilistic approaches of Latent Semantic Relations analysis in of the summarization projection...

Full text to download in external service

Comparative Analysis of Text Representation Methods Using Classification

Publication

J. Szymański

- CYBERNETICS AND SYSTEMS - Year 2014

In our work, we review and empirically evaluate five different raw methods of text representation that allow automatic processing of Wikipedia articles. The main contribution of the article—evaluation of approaches to text representation for machine learning tasks—indicates that the text representation is fundamental for achieving good categorization results. The analysis of the representation methods creates a baseline that cannot...

Full text to download in external service

Two Stage SVM and kNN Text Documents Classifier

Publication

- Year 2015

The paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...

Selection of Relevant Features for Text Classification with K-NN

Publication

- Year 2013

In this paper, we describe five feature selection techniques used for text classification. Information gain, the independent significance feature test, the chi-squared test, the odds ratio test, and frequency filtering have been compared on text benchmarks based on Wikipedia. For each method we present the classification quality obtained on the test datasets using a K-NN based approach. A main advantage of evaluated...

Full text to download in external service

Development and Research of the Text Messages Semantic Clustering Methodology

Publication

N. Rizun
P. Kapłański
Y. Taranenko

- Year 2016

The methodology of semantic clustering analysis of customer’s text-opinions collection is developed. The author's version of the mathematical models of formalization and practical realization of short textual messages semantic clustering procedure is proposed, based on the customer’s text-opinions collection Latent Semantic Analysis knowledge extracting method. An algorithm for semantic clustering of the text-opinions is developed,...

Full text available to download

Towards facts extraction from text in Polish language

Publication

- Year 2017

Natural Language Processing (NLP) finds many uses in different fields of endeavor. Many tools exist for analysing the English language. For the Polish language the situation is different, as the language itself is more complicated. In this paper we show differences between NLP of Polish and English. Existing solutions are presented and the TEAMS software for facts extraction is described. The paper also shows an evaluation of...

Full text available to download

Generating actionable evidence from free-text feedback to improve maternity and acute hospital experiences: A computational text analytics & predictive modelling approach

Publication

A. Ojo
N. Rizun
M. Isazad Mashinchi
G. Walsh
J. Gruda
M. N. Narayana
M. Venosa
C. Foley
D. Rohde
R. Flynn

- EUROPEAN JOURNAL OF PUBLIC HEALTH - Year 2023

Background Patient experience surveys are a key source of evidence for supporting decision-making and quality improvement in healthcare services. These surveys contain two main types of questions: closed and open-ended, asking about patients’ care experiences. Apart from the knowledge obtained from analysing closed-ended questions, invaluable insights can be gleaned from free-text data. Advanced analytics techniques are increasingly...

Full text to download in external service

Test PDF

Publication

Ł. Pilorz

- Year 2018

Test PDF

Thresholding Strategies for Large Scale Multi-Label Text Classifier

Publication

- Year 2013

This article presents an overview of thresholding methods for labeling objects given a list of candidate classes’ scores. These methods are essential to multi-label classification tasks, especially when there are a lot of classes which are organized in a hierarchy. Presented techniques are evaluated using the state-of-the-art dedicated classifier on medium scale text corpora extracted from Wikipedia. Obtained results show that the...

Full text to download in external service

Wieloznaczność w języku i tekście [Ambiguity in language and text]

Publication

K. Wojan

- PROGRESS. JOURNAL OF YOUNG RESEARCHERS - Year 2017

Full text to download in external service

Representation of hypertext documents based on terms, Links and text compressibility

Publication

J. Szymański
W. Duch

- LECTURE NOTES IN COMPUTER SCIENCE - Year 2010

Methods of representing text documents based on words, mutual links and compression methods are described. They were evaluated using an SVM classifier.

Automatic prosodic modification in a Text-To-Speech synthesizer of Polish language

Publication

K. Łopatka
P. Suchomski
A. Czyżewski

- Elektronika : konstrukcje, technologie, zastosowania - Year 2011

A Polish speech synthesis system with automatic modification of utterance prosody is presented. Methods for automatically determining the stress and intonation of an utterance are described. The application of speech signal processing algorithms in shaping prosody is presented. The influence of the applied modifications on the naturalness of the synthesized signal is discussed. The applied method is based on the TD-PSOLA algorithm. The developed...

The use of CAN in automation test bench to test the engine cooling system

Publication

Z. Kneba
M. Śmieja

- Journal of KONES - Year 2010

The publication describes the principles of recording measurement data on an engine test bench used for testing automotive engine cooling systems. The test stand was designed around the CAN data transmission standard. A method for high-density recording of the transmitted data was developed.

Full text available to download

Next Generation Digital

Publication

B. Wiszniewski

- Pan European Networks: Science & Technology - Year 2013

The paper outlines the major objectives of the MENAID research project, aimed at novel architectures of digital documents. Such documents will enable reduction of information overflow and strain, a major threat to the growth of a digital society. They will be forward compatible, technology neutral and lightweight, allowing workers of network organizations to use personal devices of any type.

Full text to download in external service

Application of dynamic time warping and cepstrograms to text-dependent speaker verification

Publication

A. Kaczmarek
M. Staworko

- Year 2009

This work provides a description of an automatic speaker verification (ASV) system. In particular, it documents the evolution of all individual stages of the proposed ASV system design from the phase of preprocessing to an operational decision making system. The aim of this research was to achieve the system of the best safety and ease of use in view of users. The objective estimation of this target has been accomplished by assessing...

SYNTHESIZING MEDICAL TERMS – QUALITY AND NATURALNESS OF THE DEEP TEXT-TO-SPEECH ALGORITHM

Publication

- Journal of the Acoustical Society of America - Year 2023

The main purpose of this study is to develop a deep text-to-speech (TTS) algorithm designated for an embedded system device. First, a critical literature review of state-of-the-art speech synthesis deep models is provided. The algorithm implementation covers both hardware and algorithmic solutions. The algorithm is designed for use with the Raspberry Pi 4 board. 80 synthesized sentences were prepared based on medical and everyday...

Full text available to download

Text-mining Similarity Approximation Operators for Opinion Mining in BI tools

Publication

N. Rizun
P. Kapłański
Y. Taranenko
S. Alessandro

- Year 2016

The concept of the Text-mining Similarity Approximation Operators for Opinion Mining as extensions to Natural Language Interface Database is defined. The new operators: “keywords of” dimension; subsetting operator “about C is q”; aggregation operator “by similar C” are proposed. These operators are based on the Latent Semantic Analysis and Social Network Analysis

Full text available to download

Text Mining Algorithms for Extracting Brand Knowledge; The fashion Industry Case

Publication

- Year 2018

Brand knowledge is determined by customer knowledge. The opportunity to develop brands based on customer knowledge management has never been greater. Social media as a set of leading communication platforms enable peer to peer interplays between customers and brands. A large stream of such interactions is a great source of information which, when thoroughly analyzed, can become a source of innovation and lead to competitive advantage....

Full text available to download

The Method of a Two-Level Text-Meaning Similarity Approximation of the Customers’ Opinions

Publication

N. Rizun
P. Kapłański
Y. Taranenko

- Studia Ekonomiczne. Zeszyty Naukowe Uniwersytetu Ekonomicznego w Katowicach - Year 2016

The method of two-level text-meaning similarity approximation, consisting in the implementation of the classification of the stages of text opinions of customers and identifying their rank quality level was developed. Proposed and proved the significance of major hypotheses, put as the basis of the developed methodology, notably about the significance of suggestions about the existence of analogies between mathematical bases of...

Full text available to download

Application of colour image segmentation for localization and extraction text from images

Publication

- Year 2005

Textual information plays a great role in the world around us. Street names, names of shops and institutions, and descriptions of objects, e.g. book titles, packaging, etc., are given in textual form. At the same time, contemporary computer programs for text recognition (OCR) "cannot cope" with the analysis of images obtained using cameras. Image segmentation followed by contextual analysis of segment parameters can provide...

(Di-tert-butylmethylphosphane)(η2-di-tert-butylphosphanylphosphinidene)(triphenylphosphane)platinum(0)

Publication

A. Konitz
H. Krautscheid
J. Pikies

- ACTA CRYSTALLOGRAPHICA SECTION C-CRYSTAL STRUCTURE COMMUNICATIONS - Year 2009

The crystal structure of the title compound, [(Ph3P)(tBu2PMe)Pt(η2-tBu2PP)], contains four molecules in the asymmetric unit, differing slightly in conformation. The P-P distances in the tBu2PP ligand are similar for all four molecules [2.0661(13)-2.0678(13) Å]. These distances indicate the multiple-bond character of the P-P bond in the tBu2PP ligand. The platinum atom in the complex shows square-planar coordination. The presented...

Full text to download in external service

Synthesis and structure of Dicyclohexylammonium Tri-tert-pentoxysilanethiolate and 5-aminopentylammonium Tri-tert-pentoxysilanethiolate

Publication

- ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE - Year 2006

Tri-tert-pentoxysilanethiol reacts with dicyclohexylamine and 1,5-diaminopentane to give the corresponding ammonium salts. The salts were characterized by elemental analysis, IR and NMR spectra, and X-ray structural analysis. These are the first derivatives of tri-tert-pentoxysilanethiol for which the crystal structure has been determined.

Study of Statistical Text Representation Methods for Performance Improvement of a Hierarchical Attention Network

Publication

- Applied Sciences-Basel - Year 2021

To effectively process textual data, many approaches have been proposed to create text representations. The transformation of a text into a form of numbers that can be computed using computers is crucial for further applications in downstream tasks such as document classification, document summarization, and so forth. In our work, we study the quality of text representations using statistical methods and compare them to approaches...

Full text available to download

Plug-in to Eclipse environment for VHDL source code editor with advanced formatting of text

Publication

B. Niton
K. Pozniak
R. Romaniuk
R. S. Romaniuk

- Year 2011

Full text to download in external service

Ontology-based text convolution neural network (TextCNN) for prediction of construction accidents

Publication

S. Donghui
L. Zhigang
J. Zurada
A. Manikas
J. Guan
P. Weichbroth

- KNOWLEDGE AND INFORMATION SYSTEMS - Year 2024

The construction industry suffers from workplace accidents, including injuries and fatalities, which represent a significant economic and social burden for employers, workers, and society as a whole. The existing research on construction accidents heavily relies on expert evaluations, which often suffer from issues such as low efficiency, insufficient intelligence, and subjectivity. However, expert opinions provided in construction...

Full text to download in external service

Methodology for Text Classification using Manually Created Corpora-based Sentiment Dictionary

Publication

- Year 2018

This paper presents the methodology of Textual Content Classification, which is based on a combination of algorithms: preliminary formation of a contextual framework for the texts in particular problem area; manual creation of the Hierarchical Sentiment Dictionary (HSD) on the basis of a topically-oriented Corpus; tonality texts recognition via using HSD for analysing the documents as a collection of topically completed fragments...

Full text available to download

The smoothness test for a density function

Publication

- NONLINEAR ANALYSIS-THEORY METHODS & APPLICATIONS - Year 2014

The problem of testing hypothesis that a density function has no more than μ derivatives versus it has more than μ derivatives is considered. For a solution, the L2 norms of wavelet orthogonal projections on some orthogonal ‘‘differences’’ of spaces from a multiresolution analysis is used. For the construction of the smoothness test an asymptotic distribution of a smoothness estimator is used. To analyze that asymptotic distribution,...

Full text available to download

Collective Uncertainty Entanglement Test

Publication

Ł. Rudnicki
P. Horodecki
K. Życzkowski

- PHYSICAL REVIEW LETTERS - Year 2011

For a given pure state of a composite quantum system we analyze the product of its projections onto a set of locally orthogonal separable pure states. We derive a bound for this product analogous to the entropic uncertainty relations. For bipartite systems the bound is saturated for maximally entangled states and it allows us to construct a family of entanglement measures, we shall call collectibility. As these quantities are experimentally...

Full text to download in external service
