Search results for: DOCUMENTS CLASSIFICATION - MOST Wiedzy

  • Text Documents Classification with Support Vector Machines

    Publication
    • P. Majewski

    - 2008

  • Two Stage SVM and kNN Text Documents Classifier

    Publication

    The paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...

  • Improving css-KNN Classification Performance by Shifts in Training Data

    Publication

    - 2015

    This paper presents a new approach to improve the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects, that would be optimized during off-line learning. In this paper we propose...
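
    As a rough illustration of what a threshold-based k-NN text classifier can look like, the sketch below scores each class by the cosine similarity of the k nearest training documents and returns every class whose normalised score reaches a threshold. This is a generic variant written for illustration only; it is not the css-k-NN algorithm from the paper, whose details are not given in the abstract. The names `cosine`, `threshold_knn`, `k` and `threshold`, and the toy data, are all assumptions.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def threshold_knn(query, training, k=3, threshold=0.5):
    """Return every label whose share of neighbour similarity is >= threshold.

    training is a list of (vector, set_of_labels) pairs (hypothetical format).
    No parameters are learned off-line: all work happens at query time.
    """
    # Keep the k training documents most similar to the query.
    neighbours = sorted(
        training, key=lambda ex: cosine(query, ex[0]), reverse=True
    )[:k]

    scores = defaultdict(float)
    total = 0.0
    for vector, labels in neighbours:
        sim = cosine(query, vector)
        total += sim
        for label in labels:
            scores[label] += sim

    if total == 0.0:
        return set()  # no neighbour shares any term with the query
    return {label for label, s in scores.items() if s / total >= threshold}


# Toy multi-label training set (made up for this example).
training = [
    ({"svm": 1.0, "kernel": 1.0}, {"ml"}),
    ({"svm": 1.0, "text": 1.0}, {"ml", "nlp"}),
    ({"football": 1.0}, {"sport"}),
]
query = {"svm": 1.0, "text": 1.0}
loose = threshold_knn(query, training, k=2, threshold=0.5)
strict = threshold_knn(query, training, k=2, threshold=0.7)
```

    Raising the threshold makes the classifier assign fewer categories per document; in this toy case the stricter setting keeps only the label shared by both neighbours.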

  • Contextual ontology for tonality assessment

    classification tasks. The discussion focuses on two important research hypotheses: (1) whether it is possible to construct such an ontology from a corpus of textual documents, and (2) whether it is possible and beneficial to use inferencing from this ontology to support the process of sentiment classification. To support the first hypothesis we present a method of extracting a hierarchy of contexts from a set of textual documents...

    Full text available

  • Text classifiers for automatic articles categorization

    Publication

    The article concerns the problem of automatic classification of textual content. We present selected methods for generating document representations and evaluate them in classification tasks. The experiments were performed on Wikipedia articles, which were classified automatically into the categories created by Wikipedia editors.

  • Methodology for Text Classification using Manually Created Corpora-based Sentiment Dictionary

    Publication

    This paper presents a methodology of Textual Content Classification based on a combination of algorithms: preliminary formation of a contextual framework for the texts in a particular problem area; manual creation of the Hierarchical Sentiment Dictionary (HSD) on the basis of a topically-oriented Corpus; recognition of text tonality by using the HSD to analyse the documents as a collection of topically completed fragments...

    Full text available

  • Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations

    Publication

    - Information - 2018

    The research presents a methodology for improving the accuracy of sentiment classification in the light of modelling the latent semantic relations (LSR). The objective of this methodology is to find ways of eliminating the limitations of the discriminant and probabilistic methods for revealing LSR and of customizing the sentiment classification process (SCP) for more accurate recognition of text tonality. This objective was achieved...

    Full text available

  • Searching for innovation knowledge: insight into KIBS companies

    Publication

    - Knowledge Management Research & Practice - 2017

    The paper analyses the activity of research for “innovation knowledge”—here defined as knowledge that can lead to the introduction of service innovations—by Knowledge-Intensive Business Services (KIBS) companies. It proposes a classification of the possible search approaches adopted by those companies based on two dimensions: the pro-activity of search efforts and the source primarily used. Such classification is then discussed...

    Full text in external service

  • Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection

    Publication
    • P. Hrkút
    • M. Ďuračík
    • M. Mikušová
    • M. Callejas-Cuervo
    • J. Żukowska

    - 2019

    The problem of plagiarism is becoming increasingly significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarism in textual documents, but the situation is more complicated in the field of source code plagiarism, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...

  • External Validation Measures for Nested Clustering of Text Documents

    Publication

    This article addresses the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for validating partitioning clusterings, such as the Rand statistic, the Hubert Γ statistic or the F-measure, are not applicable in nested clustering cases. In addition to the work where the F-measure was adapted to hierarchical classification as the hF-measure, here some methods to...
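
    For reference, the flat Rand statistic named in this abstract can be sketched as follows. It counts the fraction of item pairs on which a reference partition and a predicted partition agree (both place the pair together, or both place it apart). This is the standard index for flat partitions only, not the nested-clustering measures the article proposes; the function name `rand_index` and the sample labels are assumptions made for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs treated the same way by both flat partitions."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")

    agreements = 0
    for i, j in pairs:
        together_true = labels_true[i] == labels_true[j]
        together_pred = labels_pred[i] == labels_pred[j]
        # A pair agrees when both partitions join it or both split it.
        if together_true == together_pred:
            agreements += 1
    return agreements / len(pairs)


# Cluster ids are arbitrary: renaming the clusters does not change the score.
same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_partition = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

    Because each item carries exactly one cluster id, the index has no way to express that an item belongs to a cluster and to its enclosing parent cluster at once, which is consistent with the abstract's point about nested clusterings.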

  • Functional safety and reliability analysis methodology for hazardous industrial plants

    Publication

    - 2013

    This monograph is devoted to current problems and methods of functional safety and reliability analysis of programmable control and protection systems for hazardous industrial plants. The results of these analyses are useful in the process of safety management over the life cycle, for effectively reducing the relevant risks at the design stage, and then controlling these risks during the operation of a given installation. The methodology...

  • Wikipedia Articles Representation with Matrix'u

    Publication

    - 2013

    In the article we evaluate different text representation methods used for the task of Wikipedia article categorization. We present the Matrix’u application, used for creating computational datasets of Wikipedia articles. The representations have been evaluated with SVM classifiers used to reconstruct human-made categories.

    Full text in external service