Classificação de Documentos Jurídicos Utilizando IA Generativa: Uma Abordagem com RAG e Gemini

Ruan Dias Santana; Marcelo Lisboa Rocha

doi:10.20873/uft.2675-3588.2026.v7n3.p19-24

Legal Document Classification Using Generative AI

A Retrieval-Augmented Generation (RAG) and Gemini Approach

Authors

Ruan Dias Santana Universidade Federal do Tocantins https://orcid.org/0009-0000-4647-9094
Marcelo Lisboa Rocha Universidade Federal do Tocantins https://orcid.org/0000-0002-4034-0021

DOI:

https://doi.org/10.20873/uft.2675-3588.2026.v7n3.p19-24

Keywords:

Legal AI, RAG, LLMs, Text Classification, Gemini, Data Augmentation

Abstract

The Brazilian Judiciary faces a massive volume of digital lawsuits, which makes manual screening costly and error-prone. This work investigates the use of Generative Artificial Intelligence, specifically Large Language Models (LLMs), to automate the classification of legal petitions. The research evolves in three stages: (i) an initial few-shot learning approach, which established a 56% accuracy baseline; (ii) a refinement through prompt engineering with N-grams and Data Augmentation to address class imbalance, which achieved 85% accuracy; and (iii) a Retrieval-Augmented Generation (RAG) architecture that connects Google's Gemini 2.5 model to a vector knowledge base. The experiments used real datasets from the Court of Justice of Tocantins (TJTO), covering themes from the Superior Courts (STF/STJ). The RAG approach achieved 84% accuracy in a complex scenario of 11 thematic classes, mitigating the hallucinations and semantic ambiguities observed in the previous stages.
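The retrieval step of a RAG classifier such as the one summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the knowledge base entries, the theme labels, the bag-of-words `embed` function (a stand-in for a real embedding model), and the `llm` callable (a stand-in for a call to a model such as Gemini) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: (theme label, short description).
# These entries are illustrative and do not come from the TJTO dataset.
KNOWLEDGE_BASE = [
    ("Theme A", "bank contract abusive interest rates loan revision"),
    ("Theme B", "public servant salary adjustment remuneration"),
    ("Theme C", "health plan coverage denial treatment medication"),
]


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(petition, k=2):
    """Return the k knowledge-base entries most similar to the petition."""
    query = embed(petition)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda entry: cosine(query, embed(entry[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(petition, k=2):
    """Assemble a prompt that grounds the model in the retrieved themes."""
    context = "\n".join(f"- {label}: {desc}" for label, desc in retrieve(petition, k))
    return (
        "Classify the petition into exactly one of the candidate themes.\n"
        f"Candidate themes:\n{context}\n"
        f"Petition:\n{petition}\n"
        "Answer with the theme label only."
    )


def classify(petition, llm, k=2):
    """`llm` is any callable mapping a prompt string to a response string."""
    return llm(build_prompt(petition, k)).strip()
```

Restricting the prompt to the retrieved candidates is one plausible way to narrow the label space before the model answers. In practice `llm` would wrap a request to the chosen model and `embed` would call a dense embedding service backed by a vector store.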


Published

2026-05-02

How to Cite


Dias Santana, R. and Rocha, M.L. 2026. Legal Document Classification Using Generative AI: A Retrieval-Augmented Generation (RAG) and Gemini Approach. Academic Journal on Computing, Engineering and Applied Mathematics. 7, 3 (May 2026), 19–24. DOI:https://doi.org/10.20873/uft.2675-3588.2026.v7n3.p19-24.


Issue

Vol. 7 No. 3 (2026)

Section

Research Papers

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Authors who publish in this journal agree to the following terms:

Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-NonCommercial License (CC BY-NC 4.0), which allows the work to be shared with acknowledgment of its authorship and initial publication in this journal;
Authors are authorized to enter into separate, additional contracts for the non-exclusive distribution of the version of the work published in this journal (e.g., publishing in an institutional repository or as a book chapter), with acknowledgment of authorship and initial publication in this journal;
Authors are allowed and encouraged to post and distribute their work online (e.g., in institutional repositories or on their personal page) at any point after the editorial process;
In addition, the AUTHOR is informed and agrees that the paper may be incorporated by the AJCEAM into existing or future scientific information systems and databases (indexers and databases), under the conditions defined by the latter at all times, which will involve at least the possibility that the holders of these databases may perform the following actions on the paper:

Reproduce, transmit and distribute the paper in whole or in part in any form or means of existing or future electronic transmission, including electronic transmission for research, viewing and printing purposes;
Reproduce and distribute all or part of the article in print;
Translate certain parts of the paper;
Extract figures, tables, illustrations, and other graphic objects and capture metadata, captions, and related article for research, visualization, and printing purposes;
Allow transmission, distribution, and reproduction by agents authorized by the database owners or distributors;
Prepare bibliographic citations, summaries, indexes, and related references from selected parts of the paper;
Scan and/or store electronic article images and text.
