Evolution of the capabilities of large language models in the legal field: Meta-analysis of four experimental studies

Roman V. Dushkin; Душкин Роман Викторович; Vladimir N. Podoprigora; Подопригора Владимир Николаевич; Alexey A. Kuzmin; Кузьмин Алексей Алексеевич; Kirill R. Dushkin; Душкин Кирилл Романович

doi:10.33693/2313-223X-2025-12-3-209-220

Authors: Dushkin R.V.¹, Podoprigora V.N.², Kuzmin A.A.³, Dushkin K.R.⁴
Affiliations:
1. National Research Nuclear University “MEPhI”
2. Plekhanov Russian University of Economics
3. Ecosystem Digital Solutions LLC
4. A-Ya expert LLC
Issue: Vol. 12, No. 3 (2025)
Pages: 209-220
Section: LARGE LANGUAGE MODELS IN LEGAL PRACTICE
URL: https://journals.eco-vector.com/2313-223X/article/view/695768
DOI: https://doi.org/10.33693/2313-223X-2025-12-3-209-220
EDN: https://elibrary.ru/CBJQVM
ID: 695768

Abstract

This paper presents a meta-analysis of four experimental studies from the Norm! project, which systematically examines the effectiveness of large language models (LLMs) in the legal field. The studies cover a comparative analysis of junior and senior models, the optimization of system prompts, and the testing of multi-agent architectures on tasks in Russian family and civil law. The key finding is a nonlinear relationship between architectural complexity and the quality of results: moving from simple to complex systems yields only a modest gain in quality (15–40%) at the price of a steep increase in resource costs (by a factor of 10–15). The flagship models GPT-4.1 and Gemini 2.5 Pro deliver the highest quality (9.04 and 8.52 points, respectively), but junior LLMs, with efficiency coefficients of up to 130.3, remain cost-effective. Tasks that require an integrative analysis of multiple legal norms are a problem area common to all architectures. The results support evidence-based recommendations for different implementation scenarios, from mass consulting services to specialized legal applications, and outline the prospects for hybrid architectures in legal practice.
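The trade-off stated in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes that efficiency can be approximated as quality divided by resource cost, which is not necessarily how the paper defines its efficiency coefficient, and the function name `relative_efficiency` is hypothetical. It uses only the ranges quoted in the abstract (a quality gain of 15–40% and a cost factor of 10–15).

```python
def relative_efficiency(quality_gain: float, cost_factor: float) -> float:
    """Quality per unit of resources for a complex system, relative to a simple baseline.

    Assumption (not from the paper): efficiency = quality / resource cost.
    quality_gain is a fraction (0.15 means +15%); cost_factor is a multiplier.
    """
    if cost_factor <= 0:
        raise ValueError("cost_factor must be positive")
    return (1.0 + quality_gain) / cost_factor


# Ranges quoted in the abstract: +15..40% quality, 10..15x resource costs.
gains = (0.15, 0.40)
cost_factors = (10.0, 15.0)

# Evaluate every combination of the range endpoints.
ratios = [relative_efficiency(g, c) for g in gains for c in cost_factors]
worst, best = min(ratios), max(ratios)
print(f"relative efficiency: {worst:.3f} to {best:.3f}")
# prints "relative efficiency: 0.077 to 0.140"
```

Under this assumed definition, a complex architecture would deliver roughly 8–14% of the baseline's quality per unit of resources, which is one way to read the abstract's point that simpler configurations stay cost-effective.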

Keywords

large language models, legal artificial intelligence, meta-analysis, multi-agent systems, system prompts, cost-effectiveness, legal consulting, RAG systems, family law, artificial intelligence system architecture

About the authors

Roman Dushkin

National Research Nuclear University “MEPhI”

Corresponding author.
Email: drv@aia.expert
ORCID iD: 0000-0003-4789-0736
SPIN code: 1371-0337

Senior lecturer, Department 22 “Cybernetics”

Russia, Moscow

Vladimir Podoprigora

Plekhanov Russian University of Economics

Email: Podoprigora.VN@rea.ru
ORCID iD: 0000-0001-6485-8135
SPIN code: 9587-1028

Cand. Sci. (Econ.), head of the laboratory

Russia, Moscow

Alexey Kuzmin

Ecosystem Digital Solutions LLC

Email: a.kuzmin@edisai.tech

general director

Russia, Moscow

Kirill Dushkin

A-Ya expert LLC

Email: dkr@aia.expert

analyst

Russia, Moscow

Supplementary files

1. JATS XML

2. Fig. 1. Stages of conducting the meta-analysis of LLMs in the legal domain

3. Fig. 2. Comparative analysis of the quality and economic efficiency of junior LLMs: a – quality of responses of junior LLMs in the legal domain; b – economic efficiency of junior LLMs

4. Fig. 3. Comparative analysis of the effectiveness of various system prompts for GPT-4o mini: a – quality of responses for various system prompts; b – economic efficiency of prompts; c – token consumption by various agents. Agent: 1 – universal; 2 – specialized; 3 – modified; 4 – overtrained

5. Fig. 4. Comparative analysis of senior LLM performance by overall indicators and complexity levels: a – performance of senior LLMs in the legal domain; b – performance of the top-3 LLMs by task complexity level. Level: 1 – simple; 2 – secondary; 3 – combination; 4 – collisions; 5 – problematic

6. Fig. 5. Comparative analysis of MAS architectures by quality, resource consumption and efficiency: a – quality of responses of different MAS architectures; b – token consumption by MAS architectures; c – economic efficiency of MAS architectures. Variant: 1 – simple; 2 – with dispatcher; 3 – modified; 4 – ensemble; 5 – with jury
