Software for Spectral Data Processing by Chemometrics and Machine Learning Methods

Abstract

The article describes a software package that implements the basic chemometric and machine learning methods used for spectral data processing. The package can be used either as part of the software for analytical spectral instruments or on its own. It includes both common methods (linear and quadratic discriminant analysis, principal component regression, and partial least squares) and lesser-known methods that have proven effective for processing spectra, including random forest and extreme gradient boosting. Test results for the program are provided, including an example of using the package to classify black carbon particles according to their combustion source.
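
As an illustration of the kind of workflow the abstract refers to, the following is a minimal sketch of classifying spectra by projecting them onto principal components and then applying two-class linear discriminant analysis. It is not the package's own code: the synthetic spectra, the function names (`make_spectra`, `pca_fit`, `pca_transform`, `lda_fit`, `lda_predict`), and the NumPy-only implementation are all assumptions made for this example.

```python
import numpy as np


def make_spectra(n_per_class=30, n_points=200, seed=0):
    """Hypothetical data: each class has one Gaussian band at a different position."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    spectra, labels = [], []
    for label, center in enumerate((0.40, 0.60)):
        band = np.exp(-((x - center) ** 2) / (2 * 0.05 ** 2))
        noise = rng.normal(scale=0.05, size=(n_per_class, n_points))
        spectra.append(band + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def pca_fit(X, n_components=2):
    """Return the training mean and the leading principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project mean-centered spectra onto the principal axes."""
    return (X - mean) @ components.T


def lda_fit(scores, y):
    """Two-class Fisher discriminant: w = Sw^-1 (m1 - m0)."""
    m0 = scores[y == 0].mean(axis=0)
    m1 = scores[y == 1].mean(axis=0)
    d0 = scores[y == 0] - m0
    d1 = scores[y == 1] - m1
    sw = d0.T @ d0 + d1.T @ d1          # pooled within-class scatter
    w = np.linalg.solve(sw, m1 - m0)
    threshold = 0.5 * (m0 + m1) @ w     # midpoint of the projected class means
    return w, threshold


def lda_predict(scores, w, threshold):
    """Assign class 1 when the projection exceeds the threshold, else class 0."""
    return (scores @ w > threshold).astype(int)


X, y = make_spectra()
# Simple deterministic split: even rows for training, odd rows for testing.
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

mean, components = pca_fit(X_train, n_components=2)
train_scores = pca_transform(X_train, mean, components)
test_scores = pca_transform(X_test, mean, components)

w, threshold = lda_fit(train_scores, y_train)
accuracy = float(np.mean(lda_predict(test_scores, w, threshold) == y_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Reducing the spectra to a few principal component scores before the discriminant step keeps the within-class scatter matrix small and invertible even when there are far more spectral points than samples, which is the usual situation with spectral data.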

About the authors

Aram V. Sahakyan

Moscow Institute of Physics and Technology

Author for correspondence.
Email: saakian.av@phystech.edu
ORCID iD: 0000-0002-4012-4935

postgraduate student

Russian Federation, Dolgoprudny

Alexander D. Levin

All-Russian Research Institute of Optical and Physical Measurements (FGBU VNIIOFI)

Email: levin-ad@vniiofi.ru
ORCID iD: 0000-0001-9087-952X

Ph.D., Leading Researcher

Russian Federation, Moscow


Supplementary files

Fig. 1. Spectrum without filtering
Fig. 2. Spectrum with filtering
Fig. 3. Spectra of samples in the range of 300–3,000 cm⁻¹
Fig. 4. Projections of Raman spectra onto the plane of the principal components

Copyright (c) 2024 Sahakyan A.V., Levin A.D.
