Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities

Zejiang Shen, Kyle Lo, Lauren Yu, Nathan Dahlberg, Margo Schlanger, Doug Downey

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

With the advent of large language models, methods for abstractive summarization have made great strides, creating potential for use in applications to aid knowledge workers processing unwieldy document collections. One such setting is the Civil Rights Litigation Clearinghouse (CRLC), which posts information about large-scale civil rights lawsuits, serving lawyers, scholars, and the general public. Today, summarization in the CRLC requires extensive training of lawyers and law students who spend hours per case understanding multiple relevant documents in order to produce high-quality summaries of key events and outcomes. Motivated by this ongoing real-world summarization effort, we introduce Multi-LexSum, a collection of 9,280 expert-authored summaries drawn from ongoing CRLC writing. Multi-LexSum presents a challenging multi-document summarization task given the length of the source documents, often exceeding two hundred pages per case. Furthermore, Multi-LexSum is distinct from other datasets in its multiple target summaries, each at a different granularity (ranging from one-sentence “extreme” summaries to multi-paragraph narrations of over five hundred words). We present extensive analysis demonstrating that despite the high-quality summaries in the training data (adhering to strict content and style guidelines), state-of-the-art summarization models perform poorly on this task. We release Multi-LexSum for further summarization research and to facilitate the development of applications to assist in the CRLC's mission.

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural Information Processing Systems Foundation
ISBN (Electronic): 9781713871088
State: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: Nov 28 2022 to Dec 9 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 11/28/22 to 12/9/22

Funding

We thank the reviewers for their very helpful suggestions and feedback! We thank the following institutions and entities that generously provided support for the curation of the underlying Civil Rights Litigation Clearinghouse data over its 15-year history: University of Michigan Law School; Washington University in St. Louis School of Law; Center for Empirical Research in Law; Arnold Ventures, “Improving Criminal Justice Reformers’ Use of Litigation Information, Documents, and Insights” (2021-2023); Vital Projects Fund, “Revamping the Civil Rights Litigation Clearinghouse” (2021); Proteus Fund, “Revamping the Civil Rights Litigation Clearinghouse” (2021); National Science Foundation SES-0718831, “The Litigation Process in Government-Initiated Employment Discrimination Suits” (2007). The construction of the Multi-LexSum dataset was also funded in part by NSF Convergence Accelerator Award ITE-2132318.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
