<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article>
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="en"><front><journal-meta><journal-id journal-id-type="publisher-id">Current Bioinformatics</journal-id><journal-title-group><journal-title xml:lang="en">Current Bioinformatics</journal-title><trans-title-group xml:lang="ru"><trans-title>Current Bioinformatics</trans-title></trans-title-group></journal-title-group><issn publication-format="print">1574-8936</issn><issn publication-format="electronic">2212-392X</issn><publisher><publisher-name xml:lang="en">Bentham Science</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">643763</article-id><article-id pub-id-type="doi">10.2174/0115748936285540240116065719</article-id><article-categories><subj-group subj-group-type="toc-heading"><subject>Life Sciences</subject></subj-group><subj-group subj-group-type="article-type"><subject>Research Article</subject></subj-group></article-categories><title-group><article-title xml:lang="en">Sia-m7G: Predicting m7G Sites through the Siamese Neural Network with an Attention Mechanism</article-title></title-group><contrib-group><contrib contrib-type="author"><name><surname>Zheng</surname><given-names>Jia</given-names></name><email>info@benthamscience.net</email><xref ref-type="aff" rid="aff1"/></contrib><contrib contrib-type="author"><name><surname>Zhou</surname><given-names>Yetong</given-names></name><email>info@benthamscience.net</email><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff id="aff1"><institution>School of Science, Dalian Maritime University</institution></aff><pub-date date-type="pub" iso-8601-date="2024-10-01" publication-format="electronic"><day>01</day><month>10</month><year>2024</year></pub-date><volume>19</volume><issue>10</issue><issue-title 
xml:lang="ru"/><fpage>953</fpage><lpage>962</lpage><history><date date-type="received" iso-8601-date="2025-01-07"><day>07</day><month>01</month><year>2025</year></date></history><permissions><copyright-statement xml:lang="en">Copyright © 2024, Bentham Science Publishers</copyright-statement><copyright-year>2024</copyright-year><copyright-holder xml:lang="en">Bentham Science Publishers</copyright-holder><ali:free_to_read xmlns:ali="http://www.niso.org/schemas/ali/1.0/"/></permissions><self-uri xlink:href="https://journals.eco-vector.com/1574-8936/article/view/643763">https://journals.eco-vector.com/1574-8936/article/view/643763</self-uri><abstract xml:lang="en"><p id="idm46041443827040">Background: The chemical modification of RNA plays a crucial role in many biological processes. N7-methylguanosine (m7G), one of the most important epigenetic modifications, plays an important role in gene expression, processing metabolism, and protein synthesis. Detecting the exact location of m7G sites in the transcriptome is key to understanding their mechanisms in gene expression. On the basis of experimentally validated data, several machine learning and deep learning tools have been designed to identify internal m7G sites and have shown advantages over traditional experimental methods in terms of speed, cost-effectiveness and robustness.</p><p id="idm46041443831040">Aims: In this study, we aim to develop a computational model to help predict the exact location of m7G sites in humans.</p><p id="idm46041443837200">Objective: Simple and advanced encoding methods and deep learning networks are designed to predict m7G sites accurately and efficiently.</p><p id="idm46041443841808">Methods: Three feature extraction methods and six classification algorithms were tested to identify m7G sites. Our final model, named Sia-m7G, adopts one-hot encoding and a carefully designed Siamese neural network with an attention mechanism. 
In addition, multiple 10-fold cross-validation tests were conducted to evaluate our predictor.</p><p id="idm46041443850544">Results: In 10-fold cross-validation tests, Sia-m7G achieved higher sensitivity, specificity and accuracy than six other m7G predictors. Nucleotide preference and model visualization analyses were conducted to strengthen the interpretability of Sia-m7G and provide a further understanding of m7G site fragments in genomic sequences.</p><p id="idm46041443858672">Conclusion: Sia-m7G has significant advantages over other classifiers and predictors, which demonstrates the superiority of the Siamese neural network algorithm in identifying m7G sites.</p></abstract><kwd-group xml:lang="en"><kwd>N7 methylguanosine</kwd><kwd>predictor</kwd><kwd>deep learning</kwd><kwd>siamese neural network</kwd><kwd>protein synthesis</kwd><kwd>genomic sequences</kwd></kwd-group></article-meta></front><body></body><back><ref-list><ref id="B1"><label>1.</label><mixed-citation>Frye M, Harada BT, Behm M, He C. RNA modifications modulate gene expression during development. Science 2018; 361(6409): 1346-9. doi: 10.1126/science.aau1646 PMID: 30262497</mixed-citation></ref><ref id="B2"><label>2.</label><mixed-citation>Komal S, Zhang LR, Han SN. Potential regulatory role of epigenetic RNA methylation in cardiovascular diseases. Biomed Pharmacother 2021; 137: 111376. doi: 10.1016/j.biopha.2021.111376 PMID: 33588266</mixed-citation></ref><ref id="B3"><label>3.</label><mixed-citation>Furuichi Y. Discovery of m(7)G-cap in eukaryotic mRNAs. Proc Jpn Acad, Ser B, Phys Biol Sci 2015; 91(8): 394-409. doi: 10.2183/pjab.91.394 PMID: 26460318</mixed-citation></ref><ref id="B4"><label>4.</label><mixed-citation>Tomikawa C. 7-Methylguanosine modifications in Transfer RNA (tRNA). Int J Mol Sci 2018; 19(12): 4080. doi: 10.3390/ijms19124080 PMID: 30562954</mixed-citation></ref><ref id="B5"><label>5.</label><mixed-citation>Lin S, Liu Q, Lelyveld VS, Choe J, Szostak JW, Gregory RI. 
Mettl1/Wdr4-Mediated m7G tRNA methylome is required for normal mRNA translation and embryonic stem cell self-renewal and differentiation. Mol Cell 2018; 71(2): 244-255.e5. doi: 10.1016/j.molcel.2018.06.001 PMID: 29983320</mixed-citation></ref><ref id="B6"><label>6.</label><mixed-citation>Marchand V, Ayadi L, Ernst FGM, et al. AlkAniline‐Seq: Profiling of m7G and m3C RNA modifications at single nucleotide resolution. Angew Chem Int Ed 2018; 57(51): 16785-90. doi: 10.1002/anie.201810946 PMID: 30370969</mixed-citation></ref><ref id="B7"><label>7.</label><mixed-citation>Zhang LS, Liu C, Ma H, et al. Transcriptome-wide mapping of internal N7-methylguanosine methylome in mammalian mRNA. Mol Cell 2019; 74(6): 1304-1316.e8. doi: 10.1016/j.molcel.2019.03.036 PMID: 31031084</mixed-citation></ref><ref id="B8"><label>8.</label><mixed-citation>Malbec L, Zhang T, Chen YS, et al. Dynamic methylome of internal mRNA N7-methylguanosine and its regulatory role in translation. Cell Res 2019; 29(11): 927-41. doi: 10.1038/s41422-019-0230-z PMID: 31520064</mixed-citation></ref><ref id="B9"><label>9.</label><mixed-citation>Luo X, Chi W, Deng M. Deepprune: Learning efficient and interpretable convolutional networks through weight pruning for predicting DNA-protein binding. Front Genet 2019; 10: 1145. doi: 10.3389/fgene.2019.01145 PMID: 31824562</mixed-citation></ref><ref id="B10"><label>10.</label><mixed-citation>Zhang Y, Qiao S, Ji S, Li Y. DeepSite: Bidirectional LSTM and CNN models for predicting DNA-protein binding. Int J Mach Learn Cybern 2020; 11(4): 841-51. doi: 10.1007/s13042-019-00990-x</mixed-citation></ref><ref id="B11"><label>11.</label><mixed-citation>Chen W, Feng P, Song X, Lv H, Lin H. iRNA-m7G: Identifying N7-methylguanosine sites by fusing multiple features. Mol Ther Nucleic Acids 2019; 18: 269-74. doi: 10.1016/j.omtn.2019.08.022 PMID: 31581051</mixed-citation></ref><ref id="B12"><label>12.</label><mixed-citation>Yang YH, Ma C, Wang JS, et al. 
Prediction of N7-methylguanosine sites in human RNA based on optimal sequence features. Genomics 2020; 112(6): 4342-7. doi: 10.1016/j.ygeno.2020.07.035 PMID: 32721444</mixed-citation></ref><ref id="B13"><label>13.</label><mixed-citation>Song B, Tang Y, Chen K, et al. m7GHub: Deciphering the location, regulation and pathogenesis of internal mRNA N7-methylguanosine (m7G) sites in human. Bioinformatics 2020; 36(11): 3528-36. doi: 10.1093/bioinformatics/btaa178 PMID: 32163126</mixed-citation></ref><ref id="B14"><label>14.</label><mixed-citation>Zou H, Yin Z. m7G-DPP: Identifying N7-methylguanosine sites based on dinucleotide physicochemical properties of RNA. Biophys Chem 2021; 279: 106697. doi: 10.1016/j.bpc.2021.106697 PMID: 34628276</mixed-citation></ref><ref id="B15"><label>15.</label><mixed-citation>Liu X, Liu Z, Mao X, Li Q. m7GPredictor: An improved machine learning-based model for predicting internal m7G modifications using sequence properties. Anal Biochem 2020; 609: 113905. doi: 10.1016/j.ab.2020.113905 PMID: 32805275</mixed-citation></ref><ref id="B16"><label>16.</label><mixed-citation>Dai C, Feng P, Cui L, Su R, Chen W, Wei L. Iterative feature representation algorithm to improve the predictive performance of N7-methylguanosine sites. Brief Bioinform 2021; 22(4): bbaa278. doi: 10.1093/bib/bbaa278</mixed-citation></ref><ref id="B17"><label>17.</label><mixed-citation>Bi Y, Xiang D, Ge Z, Li F, Jia C, Song J. An interpretable prediction model for identifying N7-methylguanosine sites based on XGBoost and SHAP. Mol Ther Nucleic Acids 2020; 22: 362-72. doi: 10.1016/j.omtn.2020.08.022 PMID: 33230441</mixed-citation></ref><ref id="B18"><label>18.</label><mixed-citation>Zhang L, Qin X, Liu M, Liu G, Ren Y. BERT-m7G: A transformer architecture based on BERT and stacking ensemble to identify RNA N7-methylguanosine sites from sequence information. 
Comput Math Methods Med 2021; 2021: 7764764.</mixed-citation></ref><ref id="B19"><label>19.</label><mixed-citation>Shoombuatong W, Basith S, Pitti T, Lee G, Manavalan B. THRONE: A new approach for accurate prediction of human RNA N7-methylguanosine sites. J Mol Biol 2022; 434(11): 167549. doi: 10.1016/j.jmb.2022.167549 PMID: 35662472</mixed-citation></ref><ref id="B20"><label>20.</label><mixed-citation>Zhang Y, Yu L, Jing R, Han B, Luo J. Fast and efficient design of deep neural networks for predicting N7-methylguanosine sites using autobioseqpy. ACS Omega 2023; 8(22): 19728-40. doi: 10.1021/acsomega.3c01371 PMID: 37305295</mixed-citation></ref><ref id="B21"><label>21.</label><mixed-citation>Ning Q, Sheng M. m7G-DLSTM: Intergrating directional Double-LSTM and fully connected network for RNA N7-methlguanosine sites prediction in human. Chemom Intell Lab Syst 2021; 217: 104398. doi: 10.1016/j.chemolab.2021.104398</mixed-citation></ref><ref id="B22"><label>22.</label><mixed-citation>Tahir M, Hayat M, Khan R, Chong KT. An effective deep learning-based architecture for prediction of N7-methylguanosine sites in health systems. Electronics 2022; 11(12): 1917. doi: 10.3390/electronics11121917</mixed-citation></ref><ref id="B23"><label>23.</label><mixed-citation>Chen Z, Zhao P, Li F, et al. iLearn: An integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data. Brief Bioinform 2020; 21(3): 1047-57. doi: 10.1093/bib/bbz041 PMID: 31067315</mixed-citation></ref><ref id="B24"><label>24.</label><mixed-citation>Chen W, Tang H, Ye J, Lin H, Chou K-C. iRNA-PseU: Identifying RNA pseudouridine sites. Mol Ther Nucleic Acids 2016; 5(7): e332. PMID: 28427142</mixed-citation></ref><ref id="B25"><label>25.</label><mixed-citation>Wu H, Pan X, Yang Y, Shen HB. Recognizing binding sites of poorly characterized RNA-binding proteins on circular RNAs using attention Siamese network. 
Brief Bioinform 2021; 22(6): bbab279. doi: 10.1093/bib/bbab279 PMID: 34297803</mixed-citation></ref><ref id="B26"><label>26.</label><mixed-citation>Vacic V, Iakoucheva LM, Radivojac P. Two Sample Logo: A graphical representation of the differences between two sets of sequence alignments. Bioinformatics 2006; 22(12): 1536-7. doi: 10.1093/bioinformatics/btl151 PMID: 16632492</mixed-citation></ref><ref id="B27"><label>27.</label><mixed-citation>Luo X, Tu X, Ding Y, Gao G, Deng M. Expectation pooling: An effective and interpretable pooling method for predicting DNA-protein binding. Bioinformatics 2020; 36(5): 1405-12. doi: 10.1093/bioinformatics/btz768 PMID: 31598637</mixed-citation></ref><ref id="B28"><label>28.</label><mixed-citation>Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W. Eds. LightGBM: A highly efficient gradient boosting decision tree. 31st Annual Conference on Neural Information Processing Systems (NIPS). 04-09 Dec; Long Beach, CA, USA. 2017.</mixed-citation></ref><ref id="B29"><label>29.</label><mixed-citation>Tang Z, Li Z, Hou T, et al. SiGra: Single-cell spatial elucidation through an image-augmented graph transformer. Nat Commun 2023; 14(1): 5618. doi: 10.1038/s41467-023-41437-w PMID: 37699885</mixed-citation></ref><ref id="B30"><label>30.</label><mixed-citation>Tang Z, Liu X, Li Z, et al. SpaRx: Elucidate single-cell spatial heterogeneity of drug responses for personalized treatment. Brief Bioinform 2023; 24(6): bbad338. doi: 10.1093/bib/bbad338 PMID: 37798249</mixed-citation></ref><ref id="B31"><label>31.</label><mixed-citation>Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN. Eds. Attention is all you need. 31st Annual Conference on Neural Information Processing Systems (NIPS). 04-09 Dec; Long Beach, CA, USA. 2017.</mixed-citation></ref><ref id="B32"><label>32.</label><mixed-citation>van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res 2008; 9: 2579-605.</mixed-citation></ref></ref-list></back></article>
