Detection of Depression Among Social Network Users Using Machine Learning Methods
- Authors: Zotkina A.A., Martyshkin A.I.
- Affiliations: Penza State Technological University
- Issue: Vol 10, No 4 (2023)
- Pages: 16-22
- Section: Artificial Intelligence and Machine Learning
- URL: https://journals.eco-vector.com/2313-223X/article/view/626619
- DOI: https://doi.org/10.33693/2313-223X-2023-10-4-16-22
- ID: 626619
Abstract
Statistical data provided by the FSBI “NMITSPN named after V.P. Serbsky” of the Ministry of Health of Russia indicate that depression, as a psychoemotional state, is the main cause of concern around the world and that, if not detected, it in most cases leads to suicide and poses a threat to others. Studies show that depression tends to affect writing style and language use. The main purpose of this study is to analyze user messages on the VKontakte social network and identify attributes that may indicate depressive symptoms. The article applies natural language processing methods (stop-word removal, character deletion, tokenization, lemmatization) to prepare the data and machine learning approaches (logistic regression, random forest, support vector machine, XGBoost) to classify it, and evaluates their effectiveness. The work demonstrates that depressed users can be identified with an accuracy of 77% using the XGBoost classifier. This method is combined with other linguistic features (N-gram + TF-IDF) and LDA to achieve higher accuracy. The main conclusions of the study are formulated at the end.
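The preparation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the stop-word list, the function names, the English sample posts, and the particular TF-IDF formula are assumptions made for illustration. Lemmatization is omitted because it needs a morphological analyzer, and classifier training and LDA are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the study's list is not given.
STOP_WORDS = {"i", "a", "the", "and", "to", "of", "is", "am"}


def preprocess(text):
    """Lowercase, delete non-letter characters, tokenize, drop stop words."""
    text = text.lower()
    # Character deletion: punctuation, digits and underscores become spaces.
    text = re.sub(r"[^\w\s]|\d|_", " ", text)
    tokens = text.split()  # simple whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n_max=2):
    """Return all word n-grams for n = 1..n_max, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=2):
    """Weight each n-gram as tf * (ln(N / df) + 1); one common variant."""
    grams = [Counter(ngrams(preprocess(d), n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(term for g in grams for term in g)
    n_docs = len(docs)
    weights = []
    for g in grams:
        total = sum(g.values()) or 1  # avoid division by zero for empty posts
        weights.append({
            term: (count / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in g.items()
        })
    return weights


# Made-up sample posts, not data from the study.
posts = [
    "I am so tired, nothing matters!!!",
    "What a great day :)",
    "Tired of everything... 24/7",
]
weights = tfidf(posts)
```

Each dictionary in `weights` maps an n-gram to its weight for one post. In a full pipeline these weights would be arranged into a feature matrix and passed to a classifier such as those listed in the abstract.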
About the authors
Alena A. Zotkina
Penza State Technological University
Author for correspondence.
Email: alena.zotkina.97@mail.ru
ORCID iD: 0000-0002-2497-6433
Fourth-year postgraduate student, Department “Programming”
Russian Federation, Penza
Alexey I. Martyshkin
Penza State Technological University
Email: mai@penzgtu.ru
ORCID iD: 0000-0002-3358-4394
Cand. Sci. (Eng.), Associate Professor; Head of the Department “Programming”
Russian Federation, Penza