Selection of Relevant Features for Text Classification with K-NN

Jerzy Balicki; Henryk Krawczyk; Łukasz Rymko; Julian Szymański

doi:10.1007/978-3-642-38610-7_44

Abstract

In this paper, we describe five feature selection techniques used for text classification. Information gain, the independent significance feature test, the chi-squared test, the odds ratio test, and frequency filtering are compared on text benchmarks based on Wikipedia. For each method, we present the classification quality obtained on the test datasets using a K-NN-based approach. The main advantage of the evaluated approach is that it reduces the dimensionality of the vector space, which improves the effectiveness of the classification task. The information gain method, which obtained the best results, was then used to evaluate the scalability of feature selection and classification. We also provide results indicating that feature selection is useful for obtaining commonsense features that describe natural-made categories.
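The pipeline summarized in the abstract (rank terms by information gain, keep the top-ranked ones, classify with K-NN in the reduced space) can be illustrated with a minimal sketch. This record does not include the paper's formulas or settings, so everything below is an assumption made for illustration: documents are modeled as sets of terms, information gain is computed for the binary feature "term occurs in the document", neighbors are compared with Jaccard similarity, and the toy corpus and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


def knn_predict(train_docs, train_labels, query, features, k=3):
    """Majority vote among the k most similar training documents.

    Similarity is Jaccard overlap restricted to the selected features
    (an assumption; the paper's distance measure is not given here).
    """
    keep = set(features)
    q = query & keep

    def similarity(doc):
        d = doc & keep
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    nearest = sorted(range(len(train_docs)),
                     key=lambda i: similarity(train_docs[i]),
                     reverse=True)[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"goal", "match", "team", "the"},
    {"team", "player", "score", "the"},
    {"match", "player", "goal", "the"},
    {"election", "vote", "party", "the"},
    {"party", "minister", "vote", "the"},
    {"election", "minister", "law", "the"},
]
labels = ["sport"] * 3 + ["politics"] * 3

features = select_features(docs, labels, k=8)
prediction = knn_predict(docs, labels, {"goal", "team", "the", "referee"}, features)
print(features, prediction)
```

In this toy corpus the term "the" occurs in every document, so it carries no information about the class and is dropped by the selection step. That is the dimensionality reduction the abstract refers to, on a very small scale.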

Citations

CrossRef: 2
Web of Science: 0
Scopus: 3

Details

Category: Conference activity
Type: Conference proceedings indexed in Web of Science
Published in: Artificial Intelligence and Soft Computing, Part 2, pages 477-488
Language: English
Publication year: 2013
Bibliographic description: Balicki J., Krawczyk H., Rymko Ł., Szymański J.: Selection of Relevant Features for Text Classification with K-NN. In: Artificial Intelligence and Soft Computing, Part 2, 2013, Springer.
DOI: 10.1007/978-3-642-38610-7_44
Verified by: Politechnika Gdańska
