Comparative Analysis of HDFS and Apache Ozone Data Storage Systems

Kirill O. Ievlev; Иевлев Кирилл Олегович; Mikhail G. Gorodnichev; Городничев Михаил Геннадьевич

doi:10.33693/2313-223X-2025-12-1-26-33

Authors: Ievlev K.O.¹, Gorodnichev M.G.¹
Affiliations:
1. Moscow Technical University of Communications and Informatics
Issue: Vol. 12, No. 1 (2025)
Pages: 26-33
Section: INFORMATION TECHNOLOGY AND TELECOMMUNICATION
URL: https://journals.eco-vector.com/2313-223X/article/view/679126
DOI: https://doi.org/10.33693/2313-223X-2025-12-1-26-33
EDN: https://elibrary.ru/LNPTVP
ID: 679126

Abstract

Over the last few decades, both the volume of digital data worldwide and the variety of ways to use it have increased dramatically. For a long time, the Hadoop ecosystem, which is still widely used, has been synonymous with big data storage and processing platforms. Over the past 20 years, however, Hadoop has been found to have a number of serious flaws, including the “small files problem” and uneven utilization of cluster resources. Commercial and research organizations therefore face the task of upgrading their data stack to improve resource utilization and increase data processing efficiency. This study examines the benefits and drawbacks of the next-generation data storage system Apache Ozone and assesses whether the technology is ready to completely replace the Hadoop Distributed File System (HDFS).

Keywords

big data storage, distributed file systems, object storage, S3, Apache Hadoop, Apache Ozone

About the authors

Kirill Ievlev

Moscow Technical University of Communications and Informatics

Author for correspondence.
Email: ievlev.k.o@yandex.ru
ORCID iD: 0009-0003-2723-3154
SPIN-code: 1380-5720
Researcher ID: IAN-1730-2023

Postgraduate Student, Assistant of the Department of Mathematical Cybernetics and Information Technologies

Russian Federation, Moscow

Mikhail Gorodnichev

Moscow Technical University of Communications and Informatics

Email: m.g.gorodnichev@mtuci.ru
ORCID iD: 0000-0003-1739-9831
SPIN-code: 4576-9642
Scopus Author ID: 55836031600
Researcher ID: D-3256-2019

Cand. Sci. (Eng.), Associate Professor, Head of the Department of Mathematical Cybernetics and Information Technologies, Dean of the Faculty of Information Technologies

Russian Federation, Moscow
