Statutory Interpretation Data Set

This repository contains the data set created for the following research papers:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

Jaromir Savelka, Huihui Xu, and Kevin D. Ashley. 2019. Improving Sentence Retrieval from Case Law for Statutory Interpretation. In Seventeenth International Conference on Artificial Intelligence and Law (ICAIL ’19), June 17–21, 2019, Montreal, QC, Canada, Floris Bex (Ed.). ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3322640.3326736

Task

Given a statutory provision, a user's interest in the meaning of a phrase from the provision, and a list of sentences, we would like to rank more highly the sentences that elaborate upon the meaning of the statutory phrase of interest, such as:

definitional sentences (e.g., a sentence that provides a test for when the phrase applies)
sentences that state explicitly in a different way what the statutory phrase means or state what it does not mean
sentences that provide an example, instance, or counterexample of the phrase
sentences that show how a court determines whether something is such an example, instance, or counterexample.
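
The shape of the task (a provision, a phrase of interest, and a list of sentences in; a ranked list out) can be pictured with a small sketch. The `Query` class, the `CUES` tuple, and the cue-counting scorer are hypothetical placeholders chosen for illustration; they are not the method used in the papers, and the sentences in the usage note are made-up toy text.

```python
from dataclasses import dataclass


@dataclass
class Query:
    provision: str  # full text of the statutory provision
    phrase: str     # phrase of interest taken from the provision


# Hypothetical cue phrases that may hint at a definition or an example.
# This list is an assumption for illustration, not part of the data set.
CUES = ("means", "defined as", "refers to", "includes", "such as")


def cue_score(query: Query, sentence: str) -> int:
    """Naive score: 0 without the phrase, else 1 plus the number of cues found."""
    text = sentence.lower()
    if query.phrase.lower() not in text:
        return 0
    return 1 + sum(cue in text for cue in CUES)


def rank(query: Query, sentences: list[str]) -> list[str]:
    """Return the sentences ordered from highest to lowest score.

    sorted() is stable, so tied sentences keep their input order.
    """
    return sorted(sentences, key=lambda s: cue_score(query, s), reverse=True)
```

For example, with `Query(provision="(provision text)", phrase="switchblade knife")`, a toy sentence such as "A switchblade knife means a knife whose blade opens automatically." would be placed ahead of one that merely mentions the phrase, and both ahead of a sentence that does not mention it at all.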

Corpus Overview

For this corpus we selected forty-two terms from different provisions of the United States Code.

For each term, we collected a set of sentences by extracting every sentence that mentions the term from the court decisions retrieved from the Caselaw Access Project data.

In total the corpus consists of 26,959 sentences.

The sentences are classified into four categories according to their usefulness for interpreting the term:

high value - sentence intended to define or elaborate on the meaning of the term
certain value - sentence that provides grounds to elaborate on the term's meaning
potential value - sentence that provides additional information beyond what is known from the provision the term comes from
no value - no additional information over what is known from the provision
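
Because the four categories are ordered, a ranking can be scored against them with a graded relevance metric. The sketch below uses NDCG, a standard information retrieval metric chosen here for illustration; this README does not say which metric the papers use. The numeric gains in `GAIN` and the exact label strings are assumptions and may differ from what is stored in the data files.

```python
import math

# Assumed mapping from category name to a numeric gain (hypothetical values).
GAIN = {
    "no value": 0,
    "potential value": 1,
    "certain value": 2,
    "high value": 3,
}


def dcg(labels: list[str], k: int) -> float:
    """Discounted cumulative gain over the top-k labels of a ranking."""
    return sum(GAIN[label] / math.log2(i + 2) for i, label in enumerate(labels[:k]))


def ndcg(labels: list[str], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = sorted(labels, key=GAIN.get, reverse=True)
    best = dcg(ideal, k)
    return dcg(labels, k) / best if best > 0 else 0.0
```

Here `labels` is the list of gold labels in the order a system ranked the sentences. A ranking that already places the most valuable sentences first scores 1.0, and a list with no valuable sentences scores 0.0 by convention.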

See Annotation guidelines for additional details.

Data Structure

Each zip file contains data related to one of the forty-two queries. There are four files in total, containing the texts at different levels of granularity. These make it possible to replicate the experiments reported in the papers cited above.

case
- original_id - case id from Caselaw access project
- name
- short_name
- date
- official_date
- official_citation
- alternate_citations
- court
- short_court - court abbreviation
- jurisdiction
- short_jurisdiction - jurisdiction abbreviation
- attorneys
- parties
- judges
- text
opinion
- case_id - pointer to the case the opinion belongs to
- author
- type - e.g., concurrence, dissent
- position - position of the opinion within the case
- text
paragraph
- case_id - pointer to the case the paragraph belongs to
- opinion_id - pointer to the opinion the paragraph belongs to
- position - position of the paragraph within the opinion
- text
sentence
- case_id - pointer to the case the sentence belongs to
- opinion_id - pointer to the opinion the sentence belongs to
- paragraph_id - pointer to the paragraph the sentence belongs to
- position - position of the sentence within the paragraph
- text
- label - human-created gold label of the sentence value
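
The pointer fields above let records of different granularity be joined. The sketch below shows one way to attach case metadata to each sentence through `case_id`. It rests on several assumptions that this README does not confirm: that the files are CSV with a header row, that every record carries an `id` column that the pointer fields refer to, and that the column names match the field names listed above. The in-memory rows are made-up stand-ins, not real data.

```python
import csv
import io

# Tiny in-memory stand-ins for the case and sentence files (toy rows only).
# The CSV layout and the `id` column are assumptions for illustration.
CASE_CSV = """id,name,court
1,Example v. Example,Example Court
"""

SENTENCE_CSV = """id,case_id,opinion_id,paragraph_id,position,text,label
10,1,5,7,0,First toy sentence.,no value
11,1,5,7,1,Second toy sentence.,high value
"""


def read_rows(text: str) -> list[dict]:
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_cases(sentences: list[dict], cases: list[dict]) -> list[dict]:
    """Add the name of the owning case to every sentence record."""
    by_id = {case["id"]: case for case in cases}
    return [
        {**sentence, "case_name": by_id[sentence["case_id"]]["name"]}
        for sentence in sentences
    ]


cases = read_rows(CASE_CSV)
sentences = read_rows(SENTENCE_CSV)
joined = attach_cases(sentences, cases)
```

The same lookup pattern would apply to `opinion_id` and `paragraph_id`. With real data, `read_rows` would be replaced by reading the corresponding file from the unpacked zip archive.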

Terms of Use

For use of the data, we kindly ask you to provide the following two attributions:

Savelka, Jaromir, and Kevin D. Ashley. "Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models." Findings of the Association for Computational Linguistics: EMNLP 2021. 2021.

The President and Fellows of Harvard University, Caselaw access project, Caselaw access project, 2018.
