Search results for: DOCUMENTS CLASSIFICATION - MOST Wiedzy

  • Text Documents Classification with Support Vector Machines

    Publication
    • P. Majewski

    - Year 2008

  • Two Stage SVM and kNN Text Documents Classifier

    Publication

    - Year 2015

    The paper presents an approach to the large scale text documents classification problem in parallel environments. A two stage classifier is proposed, based on a combination of k-nearest neighbors and support vector machines classification methods. The details of the classifier and the parallelisation of classification, learning and prediction phases are described. The classifier makes use of our method named one-vs-near. It is...

  • Improving css-KNN Classification Performance by Shifts in Training Data

    Publication

    - Year 2015

    This paper presents a new approach to improve the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects, that would be optimized during off-line learning. In this paper we propose...
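The abstract above describes css-k-NN only as a threshold-based variation of k-NN, so the following is a generic sketch of a threshold-based multi-label k-NN and not the authors' algorithm. The scoring rule (similarity-weighted votes normalised to sum to 1), the per-label `thresholds` dictionary, and the `default` cut-off are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def threshold_knn(query, train, k, thresholds, default=0.5):
    """Assign every label whose neighbour-vote score reaches its threshold.

    train is a list of (vector, set_of_labels) pairs. Vectors are assumed
    non-negative (e.g. term weights), so similarities are non-negative too.
    """
    # Lazy learning: nothing is fitted in advance, neighbours are found per query.
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    neighbours = ranked[:k]
    sims = [cosine(query, vec) for vec, _ in neighbours]
    total = sum(sims)
    if total == 0:
        return set()
    scores = {}
    for sim, (_, labels) in zip(sims, neighbours):
        for label in labels:
            scores[label] = scores.get(label, 0.0) + sim / total
    # Hypothetical per-label thresholds; labels without one use `default`.
    return {lab for lab, s in scores.items() if s >= thresholds.get(lab, default)}


train = [([1.0, 0.0], {"a"}), ([0.9, 0.1], {"a", "b"}), ([0.0, 1.0], {"b"})]
strict = threshold_knn([1.0, 0.0], train, k=2, thresholds={})
relaxed = threshold_knn([1.0, 0.0], train, k=2, thresholds={"b": 0.4})
```

With the default threshold only label `a` is assigned, because `b` is supported by just one of the two neighbours; lowering the threshold for `b` admits it as well.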

  • Contextual ontology for tonality assessment

    ...classification tasks. The discussion focuses on two important research hypotheses: (1) whether it is possible to construct such an ontology from a corpus of textual documents, and (2) whether it is possible and beneficial to use inferencing from this ontology to support the process of sentiment classification. To support the first hypothesis we present a method of extraction of hierarchy of contexts from a set of textual documents...

    Full text available for download on the portal

  • Text classifiers for automatic articles categorization

    Publication

    The article concerns the problem of automatic classification of textual content. We present selected methods for generation of documents representation and we evaluate them in classification tasks. The experiments have been performed on Wikipedia articles classified automatically to their categories made by Wikipedia editors.

  • Methodology for Text Classification using Manually Created Corpora-based Sentiment Dictionary

    Publication

    - Year 2018

    This paper presents the methodology of Textual Content Classification, which is based on a combination of algorithms: preliminary formation of a contextual framework for the texts in particular problem area; manual creation of the Hierarchical Sentiment Dictionary (HSD) on the basis of a topically-oriented Corpus; tonality texts recognition via using HSD for analysing the documents as a collection of topically completed fragments...

    Full text available for download on the portal

  • Improving the Accuracy in Sentiment Classification in the Light of Modelling the Latent Semantic Relations

    Publication

    - Information - Year 2018

    The research presents the methodology of improving the accuracy in sentiment classification in the light of modelling the latent semantic relations (LSR). The objective of this methodology is to find ways of eliminating the limitations of the discriminant and probabilistic methods for LSR revealing and customizing the sentiment classification process (SCP) to the more accurate recognition of text tonality. This objective was achieved...

    Full text available for download on the portal

  • TF-IDF weighted bag-of-words preprocessed text documents from Simple English Wikipedia

    Research Data

    The SimpleWiki2K-scores dataset contains TF-IDF weighted bag-of-words preprocessed text documents (raw strings are not available) [feature matrix] and their multi-label assignments [label-matrix]. Label scores for each document are also provided for an enhanced multi-label KNN [1] and LEML [2] classifiers. The aim of the dataset is to establish a benchmark...
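As a rough illustration of what a TF-IDF weighted bag-of-words feature matrix looks like, here is a minimal sketch in plain Python. The weighting variant (relative term frequency times the natural log of N/df), the whitespace tokenisation, and the function name `tfidf_matrix` are assumptions; the preprocessing actually used for the dataset is not specified in this listing.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    # Assumed tokenisation: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        length = len(toks) or 1  # avoid division by zero for empty documents
        rows.append([(counts[t] / length) * math.log(n_docs / df[t]) for t in vocab])
    return vocab, rows


vocab, rows = tfidf_matrix(["cat sat", "cat ran"])
```

Terms that occur in every document (here `cat`) get weight zero, while terms specific to one document keep a positive weight.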

  • Searching for innovation knowledge: insight into KIBS companies

    Publication

    - Knowledge Management Research & Practice - Year 2017

    The paper analyses the activity of research for “innovation knowledge”—here defined as knowledge that can lead to the introduction of service innovations—by Knowledge-Intensive Business Services (KIBS) companies. It proposes a classification of the possible search approaches adopted by those companies based on two dimensions: the pro-activity of search efforts and the source primarily used. Such classification is then discussed...

    Full text available for download from an external service

  • Increasing K-Means Clustering Algorithm Effectivity for Using in Source Code Plagiarism Detection

    Publication
    • P. Hrkút
    • M. Ďuračík
    • M. Mikušová
    • M. Callejas-Cuervo
    • J. Żukowska

    - Year 2019

    The problem of plagiarism is becoming increasingly more significant with the growth of Internet technologies and the availability of information resources. Many tools have been successfully developed to detect plagiarisms in textual documents, but the situation is more complicated in the field of plagiarism of source codes, where the problem is equally serious. At present, there are no complex tools available to detect plagiarism...

  • Preliminary safety assessment of Polish interchanges

    Publication

    - Archives of Transport - Year 2021

    Interchanges are a key and the most complex element of a road infrastructure. The safety and functionality of interchanges determine the traffic conditions and safety of the entire road network. This applies particularly to motorways and express-ways, for which they are the only way to access and exchange traffic. A big problem in Poland is the lack of comprehensive tools for designers at individual stages of the design process....

    Full text available for download on the portal

  • External Validation Measures for Nested Clustering of Text Documents

    Publication

    Abstract. This article handles the problem of validating the results of nested (as opposed to "flat") clusterings. It shows that standard external validation indices used for partitioning clustering validation, like Rand statistics, Hubert Γ statistic or F-measure are not applicable in nested clustering cases. Additionally to the work, where F-measure was adopted to hierarchical classification as hF-measure, here some methods to...
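For context, the standard external indices named in the abstract above can be sketched for the flat case as follows. This is the usual pair-counting formulation of the Rand index and a pairwise F-measure, written as an illustration only; it is not the nested-clustering measure the article proposes, and the function names are hypothetical.

```python
from itertools import combinations


def pair_counts(truth, pred):
    """Count item pairs by whether they share a cluster in truth and in pred."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(truth, pred):
    """Fraction of pairs on which the two flat partitions agree."""
    tp, fp, fn, tn = pair_counts(truth, pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def pairwise_f_measure(truth, pred):
    """Harmonic mean of pairwise precision and recall."""
    tp, fp, fn, _ = pair_counts(truth, pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


truth = [0, 0, 1, 1]
pred = [0, 0, 1, 2]
ri = rand_index(truth, pred)
f1 = pairwise_f_measure(truth, pred)
```

Both functions assume each item carries exactly one cluster label per partition, which is the flat setting; a nested clustering places an item in several clusters at once, so these pair counts are no longer well defined.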

  • Functional safety and reliability analysis methodology for hazardous industrial plants

    Publication

    - Year 2013

    This monograph is devoted to current problems and methods of the functional safety and reliability analyses of the programmable control and protection systems for industrial hazardous plants. The results of these analyses are useful in the process of safety management in life cycle, for effective reducing relevant risks at the design stage, and then controlling these risks during the operation of given installation. The methodology...

  • Wikipedia Articles Representation with Matrix'u

    Publication

    - Year 2013

    In the article we evaluate different text representation methods used for the task of Wikipedia articles categorization. We present the Matrix’u application used for creating computational datasets of Wikipedia articles. The representations have been evaluated with SVM classifiers used for reconstruction of human-made categories.

    Full text available for download from an external service