Abstract
An algorithm and accompanying software were developed for preprocessing film reviews written in Polish. The algorithm consists of the following steps: text adaptation; tokenization; transformation of words into byte format; part-of-speech tagging; stemming / lemmatization; representation of documents in vector form (Vector Space Model); and formation of the document models database. The algorithm was run experimentally on a test sample of reviews, and the main conclusion was formulated.
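The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: every function name, the toy stop-word list, and the suffix-stripping "stemmer" are hypothetical placeholders, the byte-format step is assumed here to mean UTF-8 encoding, and part-of-speech tagging is omitted because it would need an external Polish tagger.

```python
import re
from collections import Counter

# Hypothetical toy stop-word list; not taken from the paper.
STOP_WORDS = {"i", "w", "na", "to", "jest", "się", "z"}


def adapt_text(text):
    """Text adaptation: lowercase, replace non-letter runs with one space."""
    return re.sub(r"[\W\d_]+", " ", text.lower()).strip()


def tokenize(text):
    """Tokenization: split on whitespace and drop stop words."""
    return [t for t in text.split() if t not in STOP_WORDS]


def to_bytes(tokens):
    """Byte format: assumed here to be plain UTF-8 encoding of each token."""
    return [t.encode("utf-8") for t in tokens]


def naive_stem(token):
    """Placeholder stemmer: strips a few endings. Illustrative only,
    not a real Polish stemmer or lemmatizer."""
    for suffix in ("ego", "ych", "ami", "y", "a", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_database(reviews):
    """Vector Space Model: term-frequency vectors over a shared vocabulary."""
    docs = [[naive_stem(t) for t in tokenize(adapt_text(r))] for r in reviews]
    vocabulary = sorted({t for doc in docs for t in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


reviews = ["Świetny film!", "Film nudny, bardzo nudny."]
vocabulary, vectors = build_database(reviews)
print(vocabulary)
print(vectors)
```

The returned pair of vocabulary and vectors stands in for the "document models database"; a real implementation would likely persist it and use a weighting scheme chosen by the authors, which the abstract does not specify.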
Authors (2)
Full text
- Publication version: Accepted or Published Version
- License: Copyright (Wydział Zarządzania w Ciechanowie (WSM w Warszawie))
Details
- Category: Articles
- Type: articles in peer-reviewed journals and other serial publications
- Published in: Rocznik Naukowy Wydzialu Zarzadzania w Ciechanowie, pages 167-188, ISSN: 1897-4716
- Language: English
- Publication year: 2017
- Bibliographic description: Rizun N., Taranenko J.: DEVELOPMENT OF THE ALGORITHM OF POLISH LANGUAGE FILM REVIEWS PREPROCESSING// Rocznik Naukowy Wydzialu Zarzadzania w Ciechanowie. -., nr. 1-4 (XI) (2017), s.167-188
- Verified by: Gdańsk University of Technology