<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article>
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="en"><front><journal-meta><journal-id journal-id-type="publisher-id">Informacionnye Tehnologii</journal-id><journal-title-group><journal-title xml:lang="en">Informacionnye Tehnologii</journal-title><trans-title-group xml:lang="ru"><trans-title>Информационные технологии</trans-title></trans-title-group></journal-title-group><issn publication-format="print">1684-6400</issn><publisher><publisher-name xml:lang="en">New Technologies Publishing House</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">702266</article-id><article-id pub-id-type="doi">10.17587/it.31.283-290</article-id><article-categories><subj-group subj-group-type="toc-heading" xml:lang="en"><subject>Modeling and optimization</subject></subj-group><subj-group subj-group-type="toc-heading" xml:lang="ru"><subject>Моделирование и оптимизация</subject></subj-group><subj-group subj-group-type="article-type"><subject>Research Article</subject></subj-group></article-categories><title-group><article-title xml:lang="en">Algorithms of automatic construction of hierarchical model of scientific field based on clustering of semantic graphs of scientific terminology</article-title><trans-title-group xml:lang="ru"><trans-title>Алгоритмы автоматического построения иерархической модели научной области на основе кластеризации семантических графов научной терминологии</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="en"><surname>Kryuchkova</surname><given-names>E. N.</given-names></name><name xml:lang="ru"><surname>Крючкова</surname><given-names>Е. 
Н.</given-names></name></name-alternatives><address><country country="RU">Russian Federation</country></address><bio xml:lang="en"><p>Ph.D., professor</p></bio><bio xml:lang="ru"><p>канд. физ.-мат. наук, проф.</p></bio><email>kruchkova_elena@mail.ru</email><xref ref-type="aff" rid="aff1"/></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="en"><surname>Vopilova</surname><given-names>E. V.</given-names></name><name xml:lang="ru"><surname>Вопилова</surname><given-names>Е. В.</given-names></name></name-alternatives><address><country country="RU">Russian Federation</country></address><bio xml:lang="en"><p>postgraduate student</p></bio><bio xml:lang="ru"><p>аспирант</p></bio><email>vopilova.elena@gmail.com</email><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff-alternatives id="aff1"><aff><institution xml:lang="en">Polzunov Altai State Technical University</institution></aff><aff><institution xml:lang="ru">Алтайский государственный технический университет им. И. И. 
Ползунова</institution></aff></aff-alternatives><pub-date date-type="pub" iso-8601-date="2025-06-15" publication-format="electronic"><day>15</day><month>06</month><year>2025</year></pub-date><volume>31</volume><issue>6</issue><issue-title xml:lang="en"/><issue-title xml:lang="ru"/><fpage>283</fpage><lpage>290</lpage><history><date date-type="received" iso-8601-date="2026-02-06"><day>06</day><month>02</month><year>2026</year></date></history><permissions><copyright-statement xml:lang="en">Copyright © 2025, Informacionnye Tehnologii</copyright-statement><copyright-statement xml:lang="ru">Copyright © 2025, Информационные технологии</copyright-statement><copyright-year>2025</copyright-year><copyright-holder xml:lang="en">Informacionnye Tehnologii</copyright-holder><copyright-holder xml:lang="ru">Информационные технологии</copyright-holder></permissions><self-uri xlink:href="https://journals.eco-vector.com/1684-6400/article/view/702266">https://journals.eco-vector.com/1684-6400/article/view/702266</self-uri><abstract xml:lang="en"><p>The article proposes a model of a scientific thesaurus represented as a hierarchical domain graph that links the topics of a subject area to its scientific terminology. The training data for building the model are partially structured scientific texts, including subject-specific scientific encyclopedias. We propose algorithms for calculating the significance of semantic relations between scientific terms. The semantic model of an individual scientific publication combines the knowledge stored in the thesaurus with statistics on term usage in the publication text. 
The experiments on analyzing scientific publications presented in this paper were conducted using the domain semantic graph "Mathematics", built by automatically processing the text of the five-volume Mathematical Encyclopedia.</p></abstract><trans-abstract xml:lang="ru"><p>Предложена модель научного тезауруса в виде доменного иерархического графа, связывающего тематику предметной области с научной терминологией. Источником обучающих данных являются частично размеченные научные тексты, в том числе предметные научные энциклопедии. Предложены алгоритмы вычисления значимости семантических отношений между научными терминами. Семантическая модель отдельной научной публикации строится как объединение знаний тезауруса и статистики использования терминов в тексте. Представленные в работе эксперименты анализа научных публикаций проведены с использованием доменного семантического графа "Математика".</p></trans-abstract><kwd-group xml:lang="en"><kwd>aspect-oriented analysis</kwd><kwd>scientific vocabulary</kwd><kwd>semantic graph</kwd><kwd>classification of scientific text</kwd><kwd>automatic processing of unstructured texts</kwd></kwd-group><kwd-group xml:lang="ru"><kwd>аспектно-ориентированный анализ</kwd><kwd>научный лексикон</kwd><kwd>семантический граф</kwd><kwd>классификация научных текстов</kwd><kwd>автоматическая обработка неструктурированных текстов</kwd></kwd-group><funding-group/></article-meta></front><body></body><back><ref-list><ref id="B1"><label>1.</label><citation-alternatives><mixed-citation xml:lang="en">Bruches E. P., Batura T. V. Method for Automatic Term Extraction from Scientific Articles Based on Weak Supervision, Vestnik NGU, Seriya: Informacionnye tekhnologii, 2021, vol. 19, no. 2, pp. 5-16 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Бручес Е. П., Батура Т. В. Метод автоматического извлечения терминов из научных статей на основе слабо контролируемого обучения // Вестник НГУ, Серия: Информационные технологии. 2021. 
Т. 19, № 2. С. 5-16.</mixed-citation></citation-alternatives></ref><ref id="B2"><label>2.</label><citation-alternatives><mixed-citation xml:lang="en">Morozov D. A., Glazkova A. V., Tyutyulnikov M. A., Iomdin B. L. Keyphrase Generation for Abstracts of the Russian-Language Scientific Articles, Vestnik NGU, Seriya: Lingvistika i mezhkul'turnaya kommunikaciya, 2023, vol. 21, no. 1, pp. 54-66 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Морозов Д. А., Глазкова А. В., Тютюльников М. А., Иомдин Б. Л. Генерация ключевых слов для аннотаций русскоязычных научных статей // Вестник НГУ, Серия: Лингвистика и межкультурная коммуникация. 2023. Т. 21, № 1. С. 54-66.</mixed-citation></citation-alternatives></ref><ref id="B3"><label>3.</label><citation-alternatives><mixed-citation xml:lang="en">Altmami N., Menai M. Automatic Summarization of Scientific Articles: A Survey, Journal of King Saud University - Computer and Information Sciences, 2020, vol. 34, pp. 1011-1028.</mixed-citation><mixed-citation xml:lang="ru">Altmami N., Menai M. Automatic Summarization of Scientific Articles: A Survey // Journal of King Saud University - Computer and Information Sciences. 2020. Vol. 34. P. 1011-1028.</mixed-citation></citation-alternatives></ref><ref id="B4"><label>4.</label><citation-alternatives><mixed-citation xml:lang="en">Benites F. Information Retrieval and Knowledge Extraction for Academic Writing, Digital Writing Technologies in Higher Education, 2023, pp. 303-315.</mixed-citation><mixed-citation xml:lang="ru">Benites F. Information Retrieval and Knowledge Extraction for Academic Writing // Digital Writing Technologies in Higher Education. 2023. P. 303-315.</mixed-citation></citation-alternatives></ref><ref id="B5"><label>5.</label><citation-alternatives><mixed-citation xml:lang="en">Ushakov S. N., Savelyev A. O. A comparative review of tasks, approaches and tools for automated knowledge extraction from scientific publication texts, Informatsionnye Tekhnologii, 2024, vol. 
30, no. 6, pp. 291-299 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Ушаков С. Н., Савельев А. О. Сравнительный обзор задач, подходов и инструментов автоматизированного извлечения знаний из текстов научных публикаций // Информационные технологии. 2024. Т. 30, № 6. С. 291-299.</mixed-citation></citation-alternatives></ref><ref id="B6"><label>6.</label><citation-alternatives><mixed-citation xml:lang="en">Borovikova O. I., Kononenko I. S., Sidorova E. A. An approach to information extraction from clinical trials protocols on the basis of medical ontology, Sistemnaya informatika, 2017, no. 9, pp. 93-110 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Боровикова О. И., Кононенко И. С., Сидорова Е. А. Подход к извлечению информации из протоколов клинических испытаний на основе медицинской онтологии // Системная информатика. 2017. № 9. C. 93-110.</mixed-citation></citation-alternatives></ref><ref id="B7"><label>7.</label><citation-alternatives><mixed-citation xml:lang="en">Beliga S., Mestrovic A., Martincic-Ipsic S. An Overview of Graph-Based Keyword Extraction Methods and Approaches, Journal of Information and Organizational Sciences, 2015, vol. 39, pp. 1-20.</mixed-citation><mixed-citation xml:lang="ru">Beliga S., Mestrovic A., Martincic-Ipsic S. An Overview of Graph-Based Keyword Extraction Methods and Approaches // Journal of Information and Organizational Sciences. 2015. Vol. 39. P. 1-20.</mixed-citation></citation-alternatives></ref><ref id="B8"><label>8.</label><citation-alternatives><mixed-citation xml:lang="en">Lunev K. V. Graph Methods for Computing Semantic Similarity of a Pair of Keywords and Their Application to the Problem of Keywords Clustering, Programmnaya Inzheneriya, 2018, vol. 9, no. 6, pp. 262-271 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Лунев К. В. Графовые методы определения семантической близости пары ключевых слов и их применения к задаче кластеризации ключевых слов // Программная инженерия. 2018. Т. 
9, № 6. С. 262-271.</mixed-citation></citation-alternatives></ref><ref id="B9"><label>9.</label><citation-alternatives><mixed-citation xml:lang="en">Dubinina E. Y. Automatic extraction of key lexical units of the scientific texts at the process of summarization, Nauchnaya sessiya GUAP, 2018, vol. 3, pp. 115-118 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Дубинина Е. Ю. Автоматическое выделение ключевых лексических единиц научного текста в процессе реферирования // Научная сессия ГУАП. 2018. Т. 3. С. 115-118.</mixed-citation></citation-alternatives></ref><ref id="B10"><label>10.</label><citation-alternatives><mixed-citation xml:lang="en">Hossari M., Dev S., Kelleher J. D. TEST: A Terminology Extraction System for Technology Related Terms, Proc. The 2019 11th International Conference on Computer and Automation Engineering, 2019, pp. 78-81.</mixed-citation><mixed-citation xml:lang="ru">Hossari M., Dev S., Kelleher J. D. TEST: A Terminology Extraction System for Technology Related Terms // Proc. The 2019 11th International Conference on Computer and Automation Engineering. 2019. P. 78-81.</mixed-citation></citation-alternatives></ref><ref id="B11"><label>11.</label><citation-alternatives><mixed-citation xml:lang="en">Danilov G., Ishankulov T., Kotik K., Orlov Yu., Shifrin M., Potapov A. The Classification of Short Scientific Texts Using Pretrained BERT Model, Public Health and Informatics, 2021, vol. 281, pp. 83-87.</mixed-citation><mixed-citation xml:lang="ru">Danilov G., Ishankulov T., Kotik K., Orlov Yu., Shifrin M., Potapov A. The Classification of Short Scientific Texts Using Pretrained BERT Model // Public Health and Informatics. 2021. Vol. 281. P. 83-87.</mixed-citation></citation-alternatives></ref><ref id="B12"><label>12.</label><citation-alternatives><mixed-citation xml:lang="en">Dunn A., Dagdelen J., Walker N., Lee S., Rosen A., Ceder G., Persson K., Jain A. 
Structured information extraction from complex scientific text with fine-tuned large language models, available at: https://doi.org/10.48550/arXiv.2212.05238 (accessed: 15.07.24).</mixed-citation><mixed-citation xml:lang="ru">Dunn A., Dagdelen J., Walker N., Lee S., Rosen A., Ceder G., Persson K., Jain A. Structured information extraction from complex scientific text with fine-tuned large language models. URL: https://doi.org/10.48550/arXiv.2212.05238 (дата обращения: 15.07.24).</mixed-citation></citation-alternatives></ref><ref id="B13"><label>13.</label><citation-alternatives><mixed-citation xml:lang="en">Lukashevich N. V., Dobrov B. V. Designing linguistic ontologies for information systems in broad subject areas, Ontologiya proektirovaniya, 2015, vol. 5, no. 1 (15), pp. 47-69 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Лукашевич Н. В., Добров Б. В. Проектирование лингвистических онтологий для информационных систем в широких предметных областях // Онтология проектирования. 2015. Т. 5, № 1(15). С. 47-69.</mixed-citation></citation-alternatives></ref><ref id="B14"><label>14.</label><citation-alternatives><mixed-citation xml:lang="en">Belwal R., Rai S., Gupta A. A new graph-based extractive text summarization using keywords or topic modeling, Journal of Ambient Intelligence and Humanized Computing, 2021, vol. 12, pp. 8975-8990.</mixed-citation><mixed-citation xml:lang="ru">Belwal R., Rai S., Gupta A. A new graph-based extractive text summarization using keywords or topic modeling // Journal of Ambient Intelligence and Humanized Computing. 2021. Vol. 12. P. 8975-8990.</mixed-citation></citation-alternatives></ref><ref id="B15"><label>15.</label><citation-alternatives><mixed-citation xml:lang="en">Yerimbetova A. S., Sagnayeva S. K., Murzin F. A., Tussupov J. A. Creation of tools and algorithms for assessing the relevance of documents, Proceedings of the 3rd Russian-Pacific Conference on Computer Technology and Applications, 2018, pp. 
1-4.</mixed-citation><mixed-citation xml:lang="ru">Yerimbetova A. S., Sagnayeva S. K., Murzin F. A., Tussupov J. A. Creation of tools and algorithms for assessing the relevance of documents // Proceedings of the 3rd Russian-Pacific Conference on Computer Technology and Applications, 2018. P. 1-4.</mixed-citation></citation-alternatives></ref><ref id="B16"><label>16.</label><citation-alternatives><mixed-citation xml:lang="en">Vinogradov I. M. Ed. Mathematical encyclopedia in 5 volumes, Moscow, Sovetskaya enciklopediya, 1977 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Математическая энциклопедия в 5 томах / Под ред. И. М. Виноградова. М.: Советская энциклопедия, 1977.</mixed-citation></citation-alternatives></ref><ref id="B17"><label>17.</label><citation-alternatives><mixed-citation xml:lang="en">Bachishe O. I., Kryuchkova E. N., Shushakov D. S. Problems of automatic processing of scientific texts based on extraction of information from encyclopedias of relevant domain areas, Programmnaya Inzheneriya, 2023, vol. 14, no. 1, pp. 42-50 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Бачище О. И., Крючкова Е. Н., Шушаков Д. С. Проблемы автоматической обработки научных текстов на основе извлечения информации из энциклопедий соответствующих доменных областей // Программная инженерия. 2023. Т. 14, № 1. С. 42-50.</mixed-citation></citation-alternatives></ref><ref id="B18"><label>18.</label><citation-alternatives><mixed-citation xml:lang="en">Vopilova E. V. Characteristic functions for calculating the significance of terms in a semantic model of scientific knowledge representation, Materialy IX Mezhdunarodnoj konferencii "Znaniya - Ontologii - Teorii" (ZONT-2023), 2023, p. 49 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Вопилова Е. В. Характеристические функции вычисления значимости терминов в семантической модели представления научных знаний // Материалы IX Международной конференции "Знания - Онтологии - Теории" (ЗОНТ-2023). 2023. 
С. 49.</mixed-citation></citation-alternatives></ref><ref id="B19"><label>19.</label><citation-alternatives><mixed-citation xml:lang="en">Kazakov M. G., Kryuchkova E. N. Classification of complex images based on semantic graph, Prikladnaya informatika, 2014, no. 6 (54), pp. 79-89 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Казаков М. Г., Крючкова Е. Н. Классификация сложных изображений на основе семантического графа понятий // Прикладная информатика. 2014. № 6 (54). С. 79-89.</mixed-citation></citation-alternatives></ref><ref id="B20"><label>20.</label><citation-alternatives><mixed-citation xml:lang="en">Korney A., Kryuchkova E., Savchenko V. Information Retrieval Approach Using Semiotic Models Based on Multilayered Semantic Graphs, High-Performance Computing Systems and Technologies in Scientific Research, 2020, vol. 1304, pp. 162-177.</mixed-citation><mixed-citation xml:lang="ru">Korney A., Kryuchkova E., Savchenko V. Information Retrieval Approach Using Semiotic Models Based on Multilayered Semantic Graphs // High-Performance Computing Systems and Technologies in Scientific Research. 2020. Vol. 1304. P. 162-177.</mixed-citation></citation-alternatives></ref><ref id="B21"><label>21.</label><citation-alternatives><mixed-citation xml:lang="en">Korney A. O., Kryuchkova E. N. Text categorization based on a condensed graph, Informatsionnye Tekhnologii, 2021, vol. 27, no. 3, pp. 138-146 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Корней А. О., Крючкова Е. Н. Категоризация текстов на основе сконденсированного графа // Информационные технологии. 2021. Т. 27, № 3. С. 138-146.</mixed-citation></citation-alternatives></ref><ref id="B22"><label>22.</label><citation-alternatives><mixed-citation xml:lang="en">Vopilova E. V., Kryuchkova E. N. Automatic analysis methods of dynamics of information presentation in texts based on adaptable dictionaries of scientific terms, Programmnaya Inzheneriya, 2024, vol. 15, no. 4, pp. 
206-215 (in Russian).</mixed-citation><mixed-citation xml:lang="ru">Вопилова Е. В., Крючкова Е. Н. Методы автоматического анализа динамики изложения информации в текстах на основе адаптируемых словарей научных терминов // Программная инженерия. 2024. Т. 15, № 4. С. 206-215.</mixed-citation></citation-alternatives></ref><ref id="B23"><label>23.</label><citation-alternatives><mixed-citation xml:lang="en">Natasha. Tools for Russian NLP: segmentation, embeddings, morphology, lemmatization, syntax, NER, fact extraction, available at: https://github.com/natasha (accessed: 01.06.24).</mixed-citation><mixed-citation xml:lang="ru">Natasha. Tools for Russian NLP: segmentation, embeddings, morphology, lemmatization, syntax, NER, fact extraction. URL: https://github.com/natasha (дата обращения: 01.06.24).</mixed-citation></citation-alternatives></ref></ref-list></back></article>
