Leveraging Large Language Models for Analyzing Judicial Disparities in China

October 8, 2024

In November 2013, the Supreme People's Court of China (SPC) mandated that all court levels upload their decisions to a centralized, SPC-managed platform starting in January 2014. By 2023, this platform held over 50 million publicly accessible decisions, significantly aiding academic research. Nonetheless, the archive's incompleteness and opaque upload criteria introduce potential biases as some judgments remain unuploaded.

The effectiveness of different types of defense lawyers—private attorneys, public defenders, and appointed lawyers—continues to be debated. Empirical studies from the United States show mixed results about their effectiveness in criminal cases, often limited by the small number of cases analyzed. This project aims to overcome these limitations by analyzing data from the SPC platform to provide a comprehensive view of disparities within China's legal system, potentially informing policy changes for fairer legal processes.

Multi-stage preprocessing pipeline

The primary data source for this study is the archived version of the China Judgment Online database, maintained by the Supreme People's Court of China, covering cases from 2010 to 2021, with the majority post-2014. This database aims to enhance transparency and public trust by making legal decisions accessible.

The methodology begins with the meticulous selection of judicial cases suitable for computational analysis. Selection criteria were based on the availability and quantifiable nature of cases within a dataset of approximately 50 million rows. The focus was narrowed to drug-related crimes due to their detailed quantification in judicial records, including clear metrics like incident descriptions and legal outcomes. The study specifically examines 290,000 drug offense cases, chosen for their comprehensive documentation of drug types and quantities.

The next phase used large language models (LLMs) to extract data from unstructured judgment texts. Initially, GPT-3.5 was used, followed by Claude Haiku because of its better processing speed and cost-effectiveness. These models were tasked with identifying and recording specifics such as charges, sentences, legal representation (none, private attorney, or public defender), and detailed drug information. The final step is cleaning and preparing the extracted data so that it is accurate and reliable enough for subsequent analysis. This stage is critical for the integrity of statistical testing and interpretation.
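A minimal sketch of what such an extraction step could look like is given here. The field names, the wording of the prompt, and the `call_llm` wrapper are all hypothetical: the post names the kinds of facts that were extracted but not the schema or the prompt that was used. The sketch deliberately leaves the model call abstract and concentrates on building the request and rejecting unusable replies.

```python
import json

# Hypothetical field names; the post lists the kinds of facts extracted,
# not the schema that was actually used.
FIELDS = ("charge", "sentence_months", "representation", "drug_type", "drug_grams")
REPRESENTATION_VALUES = {"none", "private_attorney", "public_defender"}


def build_prompt(judgment_text):
    """Ask the model for a single JSON object with the fields above."""
    return (
        "Read the criminal judgment below and reply with one JSON object "
        "with the keys " + ", ".join(FIELDS) + ". "
        "Use null for anything the text does not state.\n\n" + judgment_text
    )


def parse_reply(reply):
    """Return a cleaned record, or None if the reply is unusable."""
    try:
        record = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    if record.get("representation") not in REPRESENTATION_VALUES:
        return None
    # Keep only the expected keys; missing keys become None.
    cleaned = {key: record.get(key) for key in FIELDS}
    for key in ("sentence_months", "drug_grams"):
        value = cleaned[key]
        # Numeric fields must be non-negative numbers (or null).
        if value is not None and (
            isinstance(value, bool)
            or not isinstance(value, (int, float))
            or value < 0
        ):
            return None
    return cleaned


def extract(judgment_text, call_llm):
    """call_llm is any function str -> str wrapping the chosen model (assumed)."""
    return parse_reply(call_llm(build_prompt(judgment_text)))
```

Records for which `extract` returns `None` would be the natural candidates for a retry or for manual review during the cleaning stage.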

Isolating the Impact of Lawyer Representation

To assess the effect of lawyer representation on sentencing outcomes, I controlled for three sources of variation. First, I clustered the data by province to account for regional variations in legal policies that might affect sentencing, so that any differences observed are more likely linked to legal representation. Second, I organized cases into groups based on the amount of drugs involved, from as little as 0-10 grams to over 100 grams. This categorization allows a focused comparison between similar cases and separates the influence of legal representation from the severity of the drug offense. Third, I narrowed my dataset to cases involving the sale of methamphetamine, which keeps the offense type constant across the comparison.
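The grouping step can be sketched as follows. This is an illustration under assumptions: only the lowest bin (0-10 grams) and the highest (over 100 grams) are stated in the post, so the intermediate bin edges are invented for the example, and "clustered by province" is read here as simply stratifying cases by province. The dictionary keys `province` and `drug_grams` are hypothetical names.

```python
from collections import defaultdict

# Assumed upper bin edges in grams; only the extremes (0-10 g and over
# 100 g) come from the post, the middle edge is illustrative.
BIN_EDGES = (10, 50, 100)


def drug_bin(grams):
    """Label the drug-amount bin a case falls into."""
    lower = 0
    for upper in BIN_EDGES:
        if grams <= upper:
            return f"{lower}-{upper} g"
        lower = upper
    return f"over {BIN_EDGES[-1]} g"


def group_cases(cases):
    """Group case dicts by (province, drug bin) so comparisons stay like-for-like."""
    groups = defaultdict(list)
    for case in cases:
        groups[(case["province"], drug_bin(case["drug_grams"]))].append(case)
    return dict(groups)
```

Comparing sentence lengths by type of representation inside each `(province, bin)` group, rather than across the whole dataset, is what keeps regional policy and offense severity roughly fixed.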

Examining 290,000 Cases of Drug-Related Crimes

Histogram of Imprisonment Lengths by Legal Representation (Percentage of Total Cases)

To investigate how legal representation influences the length of imprisonment, I categorized the defendants in the dataset by whether they had no lawyer, a private attorney, or a public defender. The histogram shows the distribution of imprisonment lengths for each of these three groups. The x-axis indicates the length of imprisonment in months, while the y-axis shows the percentage of total cases.
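For readers who want to reproduce this kind of chart, the bar heights can be computed roughly as below. The 30-month bin width and the key names `representation` and `sentence_months` are assumptions made for the sketch; the important detail is that each bar is a share of all cases, not of the cases within its own representation group.

```python
from collections import Counter


def histogram_percentages(cases, bin_width=30):
    """Return {(representation, bin_start_months): percent of ALL cases}."""
    total = len(cases)
    counts = Counter(
        (
            case["representation"],
            int(case["sentence_months"] // bin_width) * bin_width,
        )
        for case in cases
    )
    # Dividing by the overall total means all bars together sum to 100.
    return {key: 100.0 * n / total for key, n in counts.items()}
```

Because the denominator is the full dataset, a group that is small overall will show low bars everywhere, which matters when comparing the shapes of the three distributions.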

Most cases, regardless of the type of legal representation, involve imprisonment of up to 60 months (5 years). Shorter sentences are therefore common across all three types of representation.

In terms of legal representation, cases where the defendant had no lawyer (represented by red bars) are noticeably concentrated at the shortest sentence lengths (0-30 months). This clustering indicates that less severe offenses are often handled without legal representation. In contrast, cases handled by public defenders (green bars) are distributed across a wider range of sentence lengths, including longer terms. This distribution suggests that public defenders handle a broad spectrum of cases, including those that involve more severe crimes which lead to longer sentences.

Long sentences (over 150 months) are especially prominent among cases defended by public defenders. This pattern is consistent with courts appointing a public defender in more complex and severe cases, which are likely to result in longer sentences, in accordance with the criminal procedure requirement.

Regression analysis

To examine the correlation between the total imprisonment length and the presence of legal representation (private attorney or public defender), I conducted a regression analysis. The regression model aimed to assess how different factors might influence sentencing outcomes, particularly the type of legal representation and the quantity of drugs involved in each case.

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i \]

The dependent variable Y_i measures the total imprisonment length for the i-th case in months, reflecting the severity of the sentence.

The independent variables are defined as follows:

X₁: A continuous variable for the amount of drug involved in grams, used to assess its impact on the sentencing outcome.
X₂: A binary variable representing the presence of a private attorney (1 if present, 0 otherwise), intended to evaluate the influence of private legal representation on sentence length.
X₃: A binary variable indicating the presence of a public defender (1 if present, 0 otherwise), to explore how public defense affects sentencing compared to cases with no lawyer or a private attorney.
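To make the specification concrete, here is a small ordinary least squares sketch written with the standard library only. It is not the code used in the study: the key names are hypothetical, the three representation categories are assumed to be mutually exclusive, and "no lawyer" is treated as the baseline because both dummies are zero for it. A statistics package would normally be used instead, since it also reports standard errors.

```python
def design_row(case):
    """[1, grams, private dummy, public dummy]; 'no lawyer' is the baseline."""
    rep = case["representation"]
    return [
        1.0,
        float(case["drug_grams"]),
        1.0 if rep == "private_attorney" else 0.0,
        1.0 if rep == "public_defender" else 0.0,
    ]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(cases):
    """Return [b0, b1, b2, b3] minimising squared error in sentence months."""
    rows = [design_row(c) for c in cases]
    y = [float(c["sentence_months"]) for c in cases]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)
```

With this coding, the coefficient on each dummy is the estimated difference in months relative to an unrepresented defendant with the same drug amount.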

Hypotheses and Expectations

The regression model tests several hypotheses:

Drug Amount Influence: I expect X₁ to positively correlate with Y_i, hypothesizing that cases involving larger amounts of drugs will generally result in longer sentences.
Impact of Private Attorney: For X₂, the hypothesis is that the presence of a private attorney might be associated with shorter imprisonment lengths compared to those without one, possibly due to better defense strategies or more resources in legal representation.
Role of Public Defender: With X₃, I anticipate that the presence of a public defender might also affect the sentencing outcomes, though the direction of this impact could vary. Public defenders might mitigate the sentence due to effective defense, or cases with public defenders could correlate with longer sentences if they tend to be assigned to severe or legally complex cases in which defendants could not afford private attorneys.
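Stated in terms of the coefficients of the regression equation, these three expectations could be written as follows. This is only a restatement of the hypotheses listed here, assuming that cases with no lawyer form the baseline group for both binary variables.

```latex
\[
\begin{aligned}
H_1 &: \beta_1 > 0 && \text{(larger drug amounts, longer sentences)} \\
H_2 &: \beta_2 < 0 && \text{(private attorney, shorter sentences)} \\
H_3 &: \beta_3 \neq 0 && \text{(public defender, sign not predicted in advance)}
\end{aligned}
\]
```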

Output and Analysis

1. Spatial Variation

2. Temporal Analysis

3. Moving to cases with higher drug amount involved

Explain the Observations

The observations from the study reveal significant insights into how legal representation influences judicial outcomes across various regions and economic scenarios.

The first observation notes provincial variability in the impact of private attorneys on sentencing outcomes. This mixed result likely stems from differences in regulations governing access to lawyers and public defenders across provinces, as well as the uneven distribution of legal resources. Some regions may have more robust legal aid systems, significantly influencing case outcomes, while others might not, reflecting a disparity in legal resource allocation at the provincial level.

I have also observed that the impact of public defenders on case outcomes varies significantly across different provinces. In some regions, defendants represented by public defenders tend to receive shorter sentences compared to those with private attorneys, while in other provinces the opposite is true, with longer sentences being more common. This variation could be attributed to the differing policies regarding the assignment of public defenders across provinces, which have changed frequently over the years (Zhang, Lu, & Hu, 2024).

The second observation highlights consistency within cities over time in how legal representation relates to sentencing. This suggests that local judicial practices exert a stronger influence on outcomes than provincial policies. Judges in the same city likely share similar interpretations of the law and adhere to established local norms, leading to more predictable and consistent outcomes. This local consistency suggests a dominant role of city-level legal culture in shaping judicial decisions.

The third observation discusses the economic impact on legal access, particularly in less affluent provinces. Here, the presence of a lawyer often correlates with harsher sentences, likely because legal representation is limited and typically reserved for severe or notable cases. This scenario indicates that in poorer areas, economic constraints restrict access to legal resources, making legal aid available primarily in more serious cases which naturally may result in harsher penalties. This highlights the economic disparities in legal representation and suggests that wealthier provinces might provide broader access to legal services, potentially leading to more equitable outcomes.

Lastly, I've observed that my regression analysis encounters difficulties when handling cases involving higher amounts of drugs. This is evidenced by higher RMSEs and lower R² values in such cases. These indicators suggest that the model struggles to effectively explain the variability in sentencing outcomes for more severe cases. This difficulty in isolating the impact of lawyer representation in cases with larger quantities of drugs highlights the complexities involved in analyzing more serious offenses. The increased severity and potential for more significant legal repercussions may introduce additional variables that my current model does not adequately account for.
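For reference, the two fit measures mentioned here can be computed as in the sketch below, which uses their standard definitions: RMSE is the square root of the mean squared residual, and R² is one minus the ratio of residual to total sum of squares. The function names are illustrative and not taken from the study's code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the outcome (months)."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Share of the variance in the outcome explained by the predictions."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Computing both measures separately for each drug-amount group is what makes a weaker fit among higher-quantity cases visible; a single pooled figure would hide it.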

Limitations

A Correlation Analysis

This analysis is fundamentally correlational, not causal, due to significant challenges in drawing conclusive causal relationships from the provided data. Notable gaps exist because certain decisions are deliberately omitted from the database, often for reasons such as privacy concerns or national security. These omissions, not disclosed by the courts (Liebman et al., 2020), affect the analysis's integrity and completeness, complicating the establishment of a causal link.

Isolating the impact of legal representation on sentencing outcomes presents challenges. Ethical concerns prevent conducting a randomized controlled trial that could compromise defendants' rights. Additionally, the absence of broad policy reforms limits the use of methods like Difference in Differences to assess causal effects over time.

Another issue is how multiple charges are handled in the Chinese legal system, where concurrent sentences may receive cumulative sentencing discounts, making the quantity of drugs involved less relevant to sentencing outcomes beyond a certain threshold.

Given these complexities, integrating qualitative data might be necessary to more thoroughly understand the reasons behind data discrepancies and explore the underlying dynamics.

Balancing the Cost and Accuracy of Large Language Models

The study faces limitations regarding the accuracy and cost-efficiency of using Large Language Models (LLMs) for data extraction. Extraction accuracy with LLMs varies, typically from 75% to 99%, with critical data like drug type and amount often at the lower end due to their reliance on deeper logical inference. To manage costs effectively, text data was trimmed and condensed before processing, slightly compromising accuracy due to reduced contextual information.
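One common way to arrive at per-field accuracy figures of this kind is to compare the model's output against a hand-labelled sample of judgments. The sketch below assumes that setup; the post does not describe how accuracy was actually measured, and the field names are hypothetical.

```python
def field_accuracy(extracted, labelled, fields):
    """Share of records where the extracted value equals the hand label, per field."""
    if not labelled or len(extracted) != len(labelled):
        raise ValueError("need two non-empty lists of equal length")
    return {
        field: sum(e.get(field) == l.get(field) for e, l in zip(extracted, labelled))
        / len(labelled)
        for field in fields
    }
```

Reporting accuracy field by field, rather than as a single number, is what shows that fields such as drug type and amount are harder to extract than others.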

Additionally, self-validation was omitted to avoid tripling expenses but could be reconsidered in future projects where budget flexibility allows for enhanced accuracy. This approach would improve the reliability of data extraction.

As technology advances, the challenges associated with extraction accuracy and cost-efficiency are expected to diminish. Improvements in computing power and LLM technology are likely to enhance their ability to process and analyze data accurately, reducing the costs associated with high-volume data processing and extensive manual validation.

References

Zhang, Lening, Hong Lu, and Ming Hu. "An Empirical Study of Publicly Appointed and Privately Retained Defense Lawyers in Plea Bargaining: The Chinese Experience." Criminal Law Forum. April 23, 2024. https://doi.org/10.1007/s10609-024-09482-2.
Liebman, Benjamin L., Margaret E. Roberts, Rachel E. Stern, and Alice Z. Wang. "Mass digitization of Chinese court decisions: How to use text as data in the field of Chinese law." Journal of Law and Courts 8, no. 2 (2020): 177-201.

Acknowledgment

I would like to acknowledge Jiarui Song, co-author of the manuscript on which this blog post is based, for his invaluable contributions to the study.
