- Researchers propose a novel three-stage framework, Retrieve–Revise–Refine, specifically designed to address the intricate challenge of legal article set retrieval, which focuses on retrieving a concise (i.e., precise and compact) set of entailing legal articles.
- They evaluate the framework on two datasets and observe improvements in the macro F2 score of 3.17% and 4.24%, respectively, over the previous state-of-the-art methods.
- Their ablation studies and subsequent analysis provide insights into the function of each stage within the framework.
Artificial Intelligence (AI) continues to redefine the boundaries of legal technology, offering promise in automating advanced tasks such as legal question answering and consultation. In the domain of statute law, a principal challenge is retrieving the concise set of legal articles that entail a query, a task essential to enhancing these advanced applications. This task is referred to as entailing legal article set retrieval or, more briefly, legal article set retrieval.
The task of retrieving entailing legal article sets differs markedly from traditional information retrieval (IR) in two main aspects. Firstly, unlike traditional IR, which returns a ranked list of articles, the legal article set retrieval task seeks a concise set of articles. This level of specificity extends to the nature of the legal queries and legal articles themselves: they are inherently complex and steeped in specialized legal language, demanding a retrieval system with deeper legal reasoning and linking capacity. Secondly, while traditional IR efforts primarily involve ranking candidates by relevance, this task requires that the retrieved articles not just relate to but jointly entail the contents of a query or its negation. These characteristics set this task apart from the broader goals and methods of traditional IR tasks.
Previous research in legal article set retrieval has predominantly employed two approaches. The first approach combines classical IR models with fine-tuned language models (LMs), and then ensembles the retrieval results to consolidate the final retrieved sets. Meanwhile, the second approach uses classical IR models exclusively for preliminary candidate filtering, which prepares inputs for further LM fine-tuning; the final results are often ensembled from various fine-tuned LMs.
To address the task of legal article set retrieval, a team of researchers from the Japan Advanced Institute of Science and Technology (JAIST), led by Professor Le-Minh Nguyen and including doctoral student Chau Nguyen, proposed a framework called Retrieve–Revise–Refine. The framework is designed to pinpoint the concise set of legal articles that entail either a query or its negation, advancing the current understanding of this task. Furthermore, their approach combines small LMs and large LMs to improve the accuracy of the retrieved articles (i.e., precision) while limiting the loss in coverage (i.e., recall). The framework consists of three stages:
1. Retrieve: An ensemble of multiple small LMs, fine-tuned with various tailored strategies, retrieves as many of the entailing articles as possible.
2. Revise: Large LMs assess the validity of the query with respect to each combination of articles from the top retrieval results, aiming to derive a more compact subset of entailing legal articles.
3. Refine: The outputs of the second stage are distilled further, using insights derived from the small LMs’ predictions to refine the predictions of the large LMs.
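The three stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: the function names, the score averaging used as the ensemble, the `judge` callable standing in for a large LM, the `max_size` limit on combinations, and the fallback rule in `refine` are all hypothetical choices made for the example.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical stand-in for a large LM: returns True when the given
# combination of articles entails the query or its negation.
Judge = Callable[[str, Tuple[str, ...]], bool]


def retrieve(scores_per_model: List[Dict[str, float]], top_k: int) -> List[str]:
    """Stage 1 (assumed ensemble): average small-LM scores, keep the top_k articles."""
    articles = set().union(*scores_per_model)
    avg = {
        a: sum(m.get(a, 0.0) for m in scores_per_model) / len(scores_per_model)
        for a in articles
    }
    # Sort by descending score; break ties by article id for determinism.
    return sorted(avg, key=lambda a: (-avg[a], a))[:top_k]


def revise(query: str, candidates: List[str], judge: Judge,
           max_size: int = 2) -> List[FrozenSet[str]]:
    """Stage 2: ask the judge about each combination of top candidates."""
    accepted = []
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            if judge(query, combo):
                accepted.append(frozenset(combo))
    return accepted


def refine(accepted: List[FrozenSet[str]], candidates: List[str]) -> Set[str]:
    """Stage 3 (hypothetical rule): keep the first accepted combination, which is
    the smallest and highest ranked; if the judge accepted nothing, fall back to
    the small LMs' top-ranked article."""
    if not accepted:
        return set(candidates[:1])
    return set(accepted[0])


# Toy usage with made-up scores and a made-up judge.
scores = [{"A1": 0.9, "A2": 0.4, "A3": 0.1}, {"A1": 0.7, "A2": 0.6, "A3": 0.2}]
top = retrieve(scores, top_k=2)
toy_judge = lambda q, combo: set(combo) == {"A1", "A2"}
result = refine(revise("toy query", top, toy_judge), top)
```

In this toy run the judge accepts only the pair of articles, so the final set contains both; with a judge that rejects every combination, the sketch falls back to the single top-ranked article from the first stage.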
As shown in the empirical results, their proposed framework achieved state-of-the-art results for the task across two datasets, showing improvements of 3.17% and 4.24%, respectively. Their study was published online in Information Processing & Management.
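The improvements are reported in macro F2, a score that weights recall more heavily than precision. The sketch below shows one common way to compute it, assuming that "macro" means averaging the per-query F2 values; the function names and the toy article identifiers are illustrative and do not come from the paper.

```python
from typing import List, Set


def f_beta(retrieved: Set[str], relevant: Set[str], beta: float = 2.0) -> float:
    """F-beta for one query: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    true_positives = len(retrieved & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_f2(predictions: List[Set[str]], golds: List[Set[str]]) -> float:
    """Assumed macro average: mean of the per-query F2 scores."""
    return sum(f_beta(p, g) for p, g in zip(predictions, golds)) / len(golds)


# One extra article (precision 0.5, recall 1.0) costs less than
# one missing article (precision 1.0, recall 0.5).
extra = f_beta({"A1", "A2"}, {"A1"})
missing = f_beta({"A1"}, {"A1", "A2"})
average = macro_f2([{"A1", "A2"}, {"A1"}], [{"A1"}, {"A1", "A2"}])
```

Because recall counts more than precision in this score, a framework that trims the retrieved set gains only if it removes non-entailing articles without dropping entailing ones.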
###
Reference
Title of original paper: Retrieve–Revise–Refine: A novel framework for retrieval of concise entailing legal article set
Authors: Chau Nguyen*, Phuong Nguyen, Le-Minh Nguyen
Journal: Information Processing & Management
DOI: 10.1016/j.ipm.2024.103949
About Japan Advanced Institute of Science and Technology, Japan
Founded in 1990 in Ishikawa prefecture, the Japan Advanced Institute of Science and Technology (JAIST) was the first independent national graduate school in Japan. Now, after 30 years of steady progress, JAIST has become one of Japan’s top-ranking universities. JAIST strives to foster capable leaders with a state-of-the-art education system where diversity is key; about 40% of its alumni are international students. The university has a unique style of graduate education based on a carefully designed coursework-oriented curriculum to ensure that its students have a solid foundation on which to carry out cutting-edge research. JAIST also works closely with both local and overseas communities by promoting industry–academia collaborative research.
Funding information
This work is supported in part by AOARD grant FA23862214039.