DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING

Nina Rizun; Jurij Taranenko

Abstract

An algorithm and supporting software were developed for preprocessing film reviews written in Polish. The algorithm consists of the following steps: text adaptation; tokenization; transformation of words into byte format; part-of-speech tagging; stemming / lemmatization; presentation of documents in vector form (Vector Space Model); and formation of the document models database. The algorithm was tested experimentally on a test sample of reviews, and the main conclusion was formulated.
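The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: all function names are hypothetical, "text adaptation" is assumed to mean lowercasing and whitespace cleanup, "byte format" is assumed to mean UTF-8 encoding, and the vector space model is reduced to raw term counts. Part-of-speech tagging and stemming / lemmatization require Polish language resources, so they are represented only by an identity hook (`normalize`).

```python
import re
from collections import Counter

def adapt(text):
    # Text adaptation (assumption: lowercase and collapse whitespace).
    return re.sub(r"\s+", " ", text.lower()).strip()

def tokenize(text):
    # Runs of Unicode letters only, so Polish diacritics are preserved
    # while digits and punctuation are dropped.
    return re.findall(r"[^\W\d_]+", text)

def to_bytes(tokens):
    # Byte format (assumption: UTF-8 encoding of each token).
    return [t.encode("utf-8") for t in tokens]

def vectorize(docs_terms):
    # Vector space model reduced to raw term counts over a shared vocabulary.
    vocab = sorted({t for terms in docs_terms for t in terms})
    vectors = []
    for terms in docs_terms:
        counts = Counter(terms)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors

def preprocess(reviews, normalize=lambda token: token):
    # `normalize` is a placeholder hook for POS tagging and
    # stemming / lemmatization; by default it leaves tokens unchanged.
    docs_terms = []
    for review in reviews:
        tokens = tokenize(adapt(review))
        docs_terms.append([normalize(t) for t in tokens])
    vocab, vectors = vectorize(docs_terms)
    # The "document models database" is a plain dict in this sketch.
    return {"vocab": vocab, "vectors": vectors}

if __name__ == "__main__":
    db = preprocess(["Świetny film, świetny!", "Nudny film."])
    print(db["vocab"])
    print(db["vectors"])
```

A real pipeline would replace the `normalize` hook with a Polish lemmatizer and would typically weight the term counts (for example with TF-IDF) before storing the document models.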

Authors (2)

Nina Rizun dr
Jurij Taranenko
- Alfred Nobel University, Dnipro Department of Applied Linguistics and Methods of Teaching Foreign Languages

Full text

Publication version: Accepted or Published Version
License: Copyright (Wydział Zarządzania w Ciechanowie (WSM w Warszawie))

Details

Category: Articles
Type: articles in peer-reviewed journals and other serial publications
Published in: Rocznik Naukowy Wydziału Zarządzania w Ciechanowie, pages 167-188
ISSN: 1897-4716
Language: English
Publication year: 2017
Bibliographic description: Rizun N., Taranenko J.: DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING // Rocznik Naukowy Wydziału Zarządzania w Ciechanowie. -., no. 1-4 (XI) (2017), pp. 167-188
Verified by: Gdańsk University of Technology