The algorithm of building the hierarchical contextual framework of textual corpora
This paper presents an approach to modeling latent semantic relations. The approach combines the advantages of two computational methods: Latent Semantic Analysis and Latent Dirichlet Allocation. It raises the scientific question of whether the influence of these methods' limitations on the quality of latent semantic relations analysis can be reduced. A case study was performed in which a two-level hierarchical contextual framework of textual corpora was built. The main scientific contributions of this research are: (1) treating paragraphs as topically complete textual messages can guarantee that each message is centered on a single topic; (2) collecting the topics of a corpus by identifying them in each document separately is an instrument for preventing the model size from growing; (3) film reviews, as a specific type of textual document, have an approximately similar writing style only within corpora of the same semantic tonality.
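The two-level structure described above (documents at the first level, paragraph topics at the second) can be illustrated with a minimal sketch. It is not the authors' implementation: the topic of a paragraph is approximated here by its most frequent non-stop-word terms, a simple stand-in for Latent Semantic Analysis or Latent Dirichlet Allocation, and the function names, the stop-word list, and the blank-line paragraph delimiter are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "was", "but", "it", "in", "to"}


def split_paragraphs(document: str) -> list[str]:
    """Split a document into paragraphs, assuming blank lines delimit them."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def paragraph_topic(paragraph: str, n_terms: int = 1) -> tuple[str, ...]:
    """Stand-in topic: the n most frequent non-stop-word terms.

    This frequency heuristic replaces LSA/LDA so the sketch stays dependency-free.
    """
    words = [w for w in re.findall(r"[a-z]+", paragraph.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    # Sort by descending count, then alphabetically, so ties resolve deterministically.
    top = sorted(counts, key=lambda w: (-counts[w], w))[:n_terms]
    return tuple(sorted(top))


def build_framework(corpus: dict[str, str]) -> dict[str, list[tuple[str, ...]]]:
    """Level 1: documents. Level 2: one topic per paragraph of each document."""
    return {
        name: [paragraph_topic(p) for p in split_paragraphs(text)]
        for name, text in corpus.items()
    }


def corpus_topics(framework: dict[str, list[tuple[str, ...]]]) -> set[tuple[str, ...]]:
    """Collect corpus topics as the union of topics found in each document separately."""
    topics: set[tuple[str, ...]] = set()
    for doc_topics in framework.values():
        topics.update(doc_topics)
    return topics


if __name__ == "__main__":
    demo = {
        "review_1": "The plot was slow. The plot dragged and the plot twisted.\n\n"
                    "The acting was great. Acting carried the film, acting above all.",
        "review_2": "Acting, acting, acting: superb.",
    }
    framework = build_framework(demo)
    print(framework)
    print(corpus_topics(framework))
```

Because topics are identified per document and then merged as a set, a topic that recurs across documents is stored only once, which is one plausible reading of how per-document identification keeps the model from growing.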