Information and communication and chemical technologies

No. 4 (21) - 2023 / 2023-12-22

ANALYSIS OF SCIENTIFIC TEXTS BASED ON LANGUAGE MODELS BY DISTRIBUTED PROCESSING ALGORITHMS

Authors

Esil University
L.N. Gumilyov Eurasian National University

Keywords

Scala programming language, scientific text, big data, unstructured data, data processing, Apache Spark, distributed computing, mathematical apparatus

Link to DOI:

https://doi.org/10.58805/kazutb.v.4.21-220

How to cite

Shuitenov G., Turusbekova U., and Muratbekov M. “ANALYSIS OF SCIENTIFIC TEXTS BASED ON LANGUAGE MODELS BY DISTRIBUTED PROCESSING ALGORITHMS”. Vestnik KazUTB, vol. 4, no. 21, Dec. 2023, doi:10.58805/kazutb.v.4.21-220.

Abstract

The paper examines the problems of processing large volumes of text data, such as scientific articles, and discusses how distributed data processing systems can make the analysis more efficient. In particular, the authors study the use of language models, such as n-grams and recurrent neural networks, to extract meaning from scientific texts and to classify them. The paper presents algorithmic approaches and methods based on distributed processing and describes the possibilities of using language models together with distributed processing algorithms. Overall, it proposes a new approach to the analysis of scientific texts based on language models and distributed data processing frameworks.
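
As an illustration of the n-gram idea mentioned in the abstract, the following is a minimal sketch of counting n-grams in a map-and-merge fashion, the pattern that distributed frameworks apply across cluster nodes. It is not the authors' implementation: the keywords point to Scala and Apache Spark, whereas this sketch uses only the Python standard library and simulates partitions with plain lists. The function names, the whitespace tokenization, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter
from functools import reduce


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_partition(docs, n):
    """Map step: count n-grams within one partition of documents."""
    counts = Counter()
    for doc in docs:
        # Assumption: lowercase whitespace tokenization is good enough here.
        counts.update(ngrams(doc.lower().split(), n))
    return counts


def merge_counts(a, b):
    """Reduce step: combine the counts from two partitions."""
    return a + b


# Hypothetical toy corpus, already split into two partitions.
partitions = [
    ["big data processing", "data processing with spark"],
    ["distributed data processing"],
]

# Each partition is counted independently, then the partial results are merged.
partial = [count_partition(p, 2) for p in partitions]
bigram_counts = reduce(merge_counts, partial, Counter())
```

Because each partition is counted independently and the merge is associative, the per-partition step could in principle run on separate workers, which is the property a distributed framework relies on.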