Omics data analysis using deep learning-based framework in differential diagnosis of ovarian cancer

Abstract

Relevance: Malignant epithelial ovarian tumors follow a highly aggressive course. Because of the limitations of current diagnostic methods, tumors are often detected late, at stages III–IV, which accounts for the high mortality.

Objective: To compare the effectiveness of machine learning (ML) methods for minimally invasive diagnosis of early-stage ovarian cancer (OC) using scalable, objective lipid biomarker profile data.

Materials and methods: A single-center, observational, retrospective cohort clinical study included 239 women: patients with early-stage high-grade ovarian cancer (HGOC, n=10); patients with other tumor or proliferative processes (n=203, including 30 cystadenomas, 59 endometrioid cysts, 21 teratomas, 28 borderline tumors, 16 low-grade ovarian cancers (LSOC), and stage III–IV HGOC); and control group women (n=26). Lipid extraction, analysis by high-performance liquid chromatography coupled with electrospray ionization mass spectrometry, and data preprocessing were performed. The SHAP method was used to interpret the predictions of the complex models. For multi-class classification, 7 ML methods were tested, including Naive Bayes classification, PLS discriminant analysis, Random Forest, External Gradient Boosting classification, Multilayer Perceptron, and Convolutional Network. For binary classification, a support vector machine and extreme gradient boosting (XGBoost) classification were additionally tested.

Results: In stage I–II HGOC, compared with the control group, there was a decrease in PC O-18:1/18:0, PE P-18:0/18:2, LPC O-16:0, PC 18:0_18:2, OxTG 16:0_18:1_16:1(CHO), OxPC 18:2_16:1(COOH), and OxPC 20:4_14:0(COOH), and an increase in PC 16:0_18:0, PC P-18:1/20:4, PC 18:1_18:2, PC 16:0_18:0, and PC 18:2_18:2. Compared with patients with OC, there was a decrease in Cer-NS d18:1/22:0, PC P-16:0/18:1, PC P-18:1/20:4, PC P-18:0/18:1, and carboxy- and carbohydroxy-derivatized oxidized lipids, and an increase in PC P-18:0/18:2 and PC P-20:0/20:4. The best discrimination between the control group and the OC group was achieved by OPLS models, random forest, and a support vector machine with a radial kernel (90%).

Conclusion: The use of advanced ML methods strengthens the diagnostic potential of omics data and can be applied in gynecological oncology.
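
The SHAP step named in Materials and methods rests on Shapley values, which split a model's prediction among its input features. The following is a minimal sketch of that idea, not the authors' pipeline: it computes exact Shapley values by enumerating feature coalitions for a hypothetical linear toy model, then ranks features by mean absolute Shapley value. The feature names (lipid_A, lipid_B, lipid_C), the weights, the samples, and the baseline are all invented for illustration; real analyses typically use an approximation library rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(sample)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [sample[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = sample[i]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


# Hypothetical toy model: a linear score over three made-up lipid intensities.
FEATURES = ["lipid_A", "lipid_B", "lipid_C"]
WEIGHTS = [2.0, -1.0, 0.5]


def toy_model(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))


samples = [[1.0, 2.0, 0.0], [3.0, 0.0, 4.0]]  # invented data
baseline = [1.0, 1.0, 1.0]                    # invented reference profile

per_sample = [shapley_values(toy_model, s, baseline) for s in samples]

# Global importance: mean absolute Shapley value per feature, largest first.
mean_abs = [sum(abs(row[k]) for row in per_sample) / len(per_sample)
            for k in range(len(FEATURES))]
ranking = sorted(range(len(FEATURES)), key=lambda k: mean_abs[k], reverse=True)
top_features = [FEATURES[k] for k in ranking]
print(top_features)
```

For each sample, the Shapley values sum to the difference between the model's prediction for that sample and its prediction for the baseline, which is what makes them usable as per-feature explanations.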

About the authors

Mariia Iurova

Academician V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Ministry of Health of Russia

Corresponding author.
Email: hi5melisa@gmail.com
ORCID iD: 0000-0002-0179-7635

PhD, obstetrician-gynecologist, oncologist, Senior Researcher at the Scientific Polyclinic Department

Russia, Moscow

Alisa Tokareva

Academician V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Ministry of Health of Russia

Email: alisa.tokareva@phystech.edu
ORCID iD: 0000-0001-5918-9045

PhD (Physico-Mathematical Sciences), Specialist at the Laboratory of Clinical Proteomics

Russia, Moscow

Vitaliy Chagovets

Academician V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Ministry of Health of Russia

Email: vvchagovets@gmail.com

PhD (Physico-Mathematical Sciences), Head of the Laboratory of Metabolomics and Bioinformatics

Russia, Moscow

Natalia Starodubtseva

Academician V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Ministry of Health of Russia

Email: n_starodubtseva@oparina4.ru
ORCID iD: 0000-0001-6650-5915

PhD (Bio), Head of the Laboratory of Clinical Proteomics

Russia, Moscow

Vladimir Frankevich

Academician V.I. Kulakov National Medical Research Center for Obstetrics, Gynecology and Perinatology, Ministry of Health of Russia

Email: v_vfrankevich@oparina4.ru

Dr. Sci. (Physico-Mathematical Sciences), Deputy Director of the Institute of Translational Medicine

Russia, Moscow

Supplementary files

Fig. 1. a) Lipids most significantly (top 15 by mean Shapley value) associated with tumor lesions other than OC (relative to the control group); b) Lipids most significantly (top 15 by mean Shapley value) associated with early-stage OC; c) Lipids most significantly (top 15 by mean Shapley value) associated with differential diagnosis of early-stage OC from other tumor lesions. The color of the dots changes from purple to yellow as the lipid level goes from low to high.

Fig. 2. a) Accuracy of models built using binary classification methods; b) Completeness of models built using binary classification methods; c) Quality of models built using multi-class classification methods.
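
The metrics named in Fig. 2 can be illustrated with a short sketch. It assumes that "completeness" corresponds to recall (sensitivity), which the source does not state explicitly. The labels and predictions below are invented and do not reproduce any result from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of truly positive samples that the model labeled positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, one option for multi-class data."""
    classes = sorted(set(y_true))
    return sum(recall(y_true, y_pred, c) for c in classes) / len(classes)


# Invented labels for illustration only.
y_true = ["OC", "OC", "OC", "control", "control", "control", "control", "OC"]
y_pred = ["OC", "OC", "control", "control", "control", "OC", "control", "OC"]

print("accuracy:", accuracy(y_true, y_pred))
print("recall (OC):", recall(y_true, y_pred, "OC"))
print("macro recall:", macro_recall(y_true, y_pred))
```

With imbalanced groups such as those in this cohort, accuracy alone can look favorable while recall for the smallest class stays low, so reporting both is a reasonable design choice.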

© Bionika Media, 2025