The Possibilities of Using Big Data Technologies in Solving Problems of Processing Data on Atmospheric Air Pollution

Abstract

The main objective of the article is to substantiate the feasibility of applying Big Data technologies to atmospheric air monitoring. A diagram presents a model, built on the PySpark library, for processing big data received from meteorological gas-analysis measuring stations; the model is intended for further experimental studies. The factors that accompany the use of Big Data in atmospheric air monitoring are identified, and the performance of the Pandas and PySpark libraries is compared. The results make it possible to rely on these factors and to choose the most suitable data processing technologies when building predictive machine learning models for analyzing the level of atmospheric air pollution. Consistent use of big data and machine learning methods can support cleaner, healthier air for future generations through more effective predictive analytics. The article is intended for students and specialists in information technology, in particular in data processing and machine learning.
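
The Pandas and PySpark comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the COUNT and SUM requests named in the figure captions are plain column aggregates and that the correlation matrix is a standard Pearson matrix over pollutant columns. The synthetic data, the column names (`station_id`, `pm25`, `no2`) and every function name below are hypothetical and do not reproduce the authors' setup. The PySpark variant requires `pyspark` and a Java runtime, so it is defined here but not called.

```python
import time

import numpy as np
import pandas as pd


def make_readings(n_rows: int, seed: int = 0) -> pd.DataFrame:
    """Generate synthetic station readings (hypothetical columns, not real data)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "station_id": rng.integers(0, 10, size=n_rows),
        "pm25": rng.uniform(0.0, 150.0, size=n_rows),
        "no2": rng.uniform(0.0, 200.0, size=n_rows),
    })


def timed(fn):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def pandas_count_and_sum(df: pd.DataFrame) -> dict:
    """Time a COUNT and a SUM over the pm25 column with pandas."""
    count, t_count = timed(lambda: int(df["pm25"].count()))
    total, t_sum = timed(lambda: float(df["pm25"].sum()))
    return {"count": count, "sum": total, "t_count": t_count, "t_sum": t_sum}


def spark_count_and_sum(df: pd.DataFrame) -> dict:
    """Same two aggregates with PySpark (needs pyspark and a Java runtime)."""
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder.master("local[*]")
             .appName("air-quality-sketch").getOrCreate())
    try:
        sdf = spark.createDataFrame(df)
        count, t_count = timed(lambda: sdf.agg(F.count("pm25")).first()[0])
        total, t_sum = timed(lambda: sdf.agg(F.sum("pm25")).first()[0])
    finally:
        spark.stop()
    return {"count": count, "sum": total, "t_count": t_count, "t_sum": t_sum}


def pearson_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation matrix of the pollutant columns."""
    return df[["pm25", "no2"]].corr(method="pearson")


if __name__ == "__main__":
    readings = make_readings(1_000_000)
    print(pandas_count_and_sum(readings))
    print(pearson_matrix(readings))
```

On a data set that fits in memory, the pandas path is likely to be faster because Spark adds session and scheduling overhead; the Spark path becomes relevant once the data no longer fits in RAM, which is the situation the caption of Fig. 3 describes.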

About the authors

Dmitry Bogomolov

MIREA – Russian Technological University

Author for correspondence.
Email: bogomolov.d.n@edu.mirea.ru

Graduate student, Department of Instrumental and Application Software

Russian Federation, Moscow

Sergey Plotnikov

MIREA – Russian Technological University

Email: plotnikovsb@mail.ru

Cand. Sci. (Eng.), Associate Professor, Department of Instrumental and Application Software

Russian Federation, Moscow

Supplementary files

Fig. 1. The execution time of the COUNT request
Fig. 2. The execution time of the SUM request
Fig. 3. Screenshot. RAM overflow when processing a data set larger than 30 GB
Fig. 4. Model of processing big data received from measuring stations
Fig. 5. Pearson correlation matrix
