Validation of social agent training: synthesis of reinforcement learning and evolutionary optimization methods

Abstract

The article proposes a time series forecasting model designed for unstable and partially observable environments. Unlike traditional approaches, the proposed FELAR architecture combines local agent learning, in which rewards are adjusted according to collective characteristics (trust, reputation, influence), with global evolutionary adaptation of strategies. The model operates in a distributed multi-agent environment, which enables both local adaptive behavior and global strategy evolution. Experiments on publicly available time series datasets (urban traffic, transformer temperatures, electricity consumption) confirm the model's high forecasting accuracy and robustness to concept drift. The article details the agent architecture, the algorithmic loop, the experimental setup, and the computational efficiency of the approach, and highlights its key advantages: robustness to concept drift, real-time adaptability, and low computational overhead.
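
The combination described in the abstract, local learning with a socially adjusted reward plus periodic evolutionary adaptation, can be illustrated with a minimal sketch. This is not the FELAR implementation: the abstract does not give the reward formula, the way trust, reputation, and influence are computed, or the evolutionary operators. Everything below is therefore an assumption made for illustration, including the names `Agent`, `shaped_reward`, `local_step`, and `evolve`, the exponential-smoothing forecaster, the additive bonus weight `lam`, and the Gaussian mutation with `sigma`. The social characteristics are held constant here.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    alpha: float             # strategy parameter: exponential-smoothing factor (assumed)
    level: float = 0.0       # current one-step-ahead forecast
    trust: float = 0.5       # collective characteristics, kept in [0, 1] (assumed scale)
    reputation: float = 0.5
    influence: float = 0.5
    fitness: float = 0.0     # running sum of shaped rewards


def shaped_reward(abs_error, agent, lam=0.1):
    """Local reward (negative absolute error) plus a bonus from the social score.

    The equal-weight average and the additive form are illustrative assumptions.
    """
    social = (agent.trust + agent.reputation + agent.influence) / 3.0
    return -abs_error + lam * social


def local_step(agent, observation):
    """Score the agent's forecast against the new observation, then update the forecast."""
    abs_error = abs(observation - agent.level)
    agent.fitness += shaped_reward(abs_error, agent)
    agent.level += agent.alpha * (observation - agent.level)
    return abs_error


def evolve(population, rng, sigma=0.05):
    """Give the least-fit agent a mutated copy of the fittest agent's strategy."""
    ranked = sorted(population, key=lambda a: a.fitness, reverse=True)
    best, worst = ranked[0], ranked[-1]
    worst.alpha = min(1.0, max(0.0, best.alpha + rng.gauss(0.0, sigma)))
    for agent in population:
        agent.fitness = 0.0  # start a new evaluation window


# Usage: five agents forecast a short toy series; evolution runs every three steps.
rng = random.Random(0)
population = [Agent(alpha=rng.random()) for _ in range(5)]
series = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5]  # toy values, not data from the paper
for t, y in enumerate(series, start=1):
    for agent in population:
        local_step(agent, y)
    if t % 3 == 0:
        evolve(population, rng)
```

Resetting fitness after each evolutionary step is one simple way to let recently mutated strategies compete on recent data only, which is a common device when the data distribution drifts; whether the paper does this is not stated in the abstract.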

About the authors

A. V. Chernikov

Moscow State University of Technology "STANKIN"

Author for correspondence.
Email: aleksandrchernikov98@gmail.com

Postgraduate Student, Senior Lecturer

Russian Federation, Moscow

Supplementary files

Fig. 1. Agent block diagram
Fig. 2. General block diagram of the model
Fig. 3. Simulated map of Beijing
Fig. 4. Visualization of node coordinates using pyplot
Fig. 5. Beijing traffic. Forecast and actual values (fragment)
Fig. 6. Transformer temperature. Forecast and actual values (fragment)
Fig. 7. Energy consumption in the department of Hauts-de-Seine. Forecast and actual values (fragment)

Copyright (c) 2026 Informacionnye Tehnologii



The mass media outlet is registered with the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: series PI No. 77 - 15565, dated June 2, 2003.