Generates, filters, parses, and cleans data regarding the financial disclosures of judges in the American Judicial System

Overview

This repository contains code that retrieves financial disclosure data from the CourtListener API.

  • main.py: contains the driver code that interacts with all the other files. It is the only file that should be run; when run, it grabs all the data and writes it to output.csv
  • auth_token.py: Reads API authentication token.
  • AUTH_TOKEN.txt: Contains API authentication token. Obtain yours from here and paste it into this file
  • fields.py: contains the code that grabs all the fields from every disclosure
  • lookups.py: contains some extra lookup tables (aside from the ones embedded in fields.py) for the values returned from the API
  • utils.py: contains some utility functions
  • requirements.txt: contains the list of dependencies used. Install them by running pip install -r requirements.txt
  • README.txt: readme in txt format
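
As a rough sketch of how AUTH_TOKEN.txt might feed into a request: the snippet below reads the token file and builds an "Authorization: Token ..." header. The function names are hypothetical and are not taken from auth_token.py or main.py, and the header scheme is an assumption about how the API expects the token to be sent.

```python
from pathlib import Path


def read_auth_token(path="AUTH_TOKEN.txt"):
    """Return the token stored in the file, without surrounding whitespace."""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"{path} is empty; paste your API token into it")
    return token


def build_auth_headers(token):
    """Build request headers for token authentication (assumed scheme)."""
    return {"Authorization": f"Token {token}"}
```

The resulting dictionary could then be passed as the headers argument of whatever HTTP client the project uses.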

Overview

Every year, judges file a financial disclosure form as mandated by law. CourtListener parses these forms, which are PDFs, into its database. Here is an example of one of the underlying forms, which helps explain what each row in our data represents: https://storage.courtlistener.com/us/federal/judicial/financial-disclosures/9529/patricia-a-sullivan-disclosure.2019.pdf

Disclosures are separated into categories, such as positions or investments. Each individual listing under a given type of disclosure is one row in our data. In that PDF, "Member and Officer at Board of Directors of Roger Williams University School of Law" would be the basis for one row. Further down, under investments, "MFS Investment Management (Educational Funds) (H)" would be the basis for another row.

For every row, the Common Fields and Person Fields are filled out. Person fields are unique to the judge, and common fields are unique to the report. The disclosure fields depend on the type of the row: for the positions row, the fields listed below under Disclosure Fields -> Positions are filled out and the rest of the disclosure fields are empty; for the investments row, the fields under Disclosure Fields -> Investments are filled out (unless they are not present in the CourtListener database). So for the two example rows, the common fields and person fields are identical (the judge and the report are the same), but the disclosure fields differ.
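
The row layout described above can be sketched as follows. This is a minimal illustration, not the code in fields.py: the function name disclosure_to_rows and the shape of its inputs are assumptions, and the sample values are abbreviated from the example PDF.

```python
def disclosure_to_rows(person_fields, common_fields, listings_by_type):
    """Return one flat row (a dict) per individual listing in a report.

    listings_by_type maps a disclosure type (e.g. "positions") to the
    list of listings of that type found in a single report.
    """
    rows = []
    for disclosure_type, listings in listings_by_type.items():
        for listing in listings:
            row = {}
            row.update(person_fields)   # same for every row of this judge
            row.update(common_fields)   # same for every row of this report
            row["Disclosure Type"] = disclosure_type
            row.update(listing)         # only this type's fields get values
            rows.append(row)
    return rows


# Illustrative input; how the PDF text splits into fields is a guess.
example_rows = disclosure_to_rows(
    {"name_last": "Sullivan"},
    {"is_amended": False},
    {
        "positions": [{"non judiciary position": "Member and Officer"}],
        "investments": [
            {"description": "MFS Investment Management (Educational Funds) (H)"}
        ],
    },
)
```

Here example_rows holds two rows that share the person and common fields and differ only in their disclosure fields.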

=============
Common Fields
=============



sha1: SHA1 hash of the generated PDF
is_amended: Is disclosure amended?
Disclosure PDF: PDF of the original filed disclosure
Year Disclosed: Year that the disclosure corresponds to.
report_type: Financial Disclosure report type
addendum_redacted: Is the addendum partially or completely redacted?
Disclosure Type: Type of the disclosure (investments, debts, etc.)

=============
Disclosure Fields
=============


Note: Depending on the Disclosure Type field above, the corresponding fields will be filled in for the row
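
One way to work with this layout is to select the rows of a single disclosure type and drop the columns that are empty for that type. The sketch below assumes output.csv has a header row containing a "Disclosure Type" column and that unused fields are stored as empty strings; the function name rows_of_type is hypothetical.

```python
import csv


def rows_of_type(csv_path, disclosure_type):
    """Return the rows whose 'Disclosure Type' matches, keeping only
    the columns that actually hold a value in each row."""
    selected = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            if row.get("Disclosure Type") == disclosure_type:
                selected.append(
                    {key: value for key, value in row.items()
                     if value not in ("", None)}
                )
    return selected
```

For example, rows_of_type("output.csv", "investments") would return only the investment rows, each reduced to its person, common, and investment fields.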


agreements:
        date_raw: Date of judicial agreement.
        parties_and_terms: Parties and terms of agreement (ex. Board Member NY Ballet)
        redacted: Does the agreement row contain redaction(s)?
        financial_disclosure: The financial disclosure associated with this agreement.
        id: ID of the record.
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown
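
The year-1750 convention for date_modified appears in every disclosure type, so it is worth handling in one place when analysing the data. A minimal sketch, assuming the timestamps are ISO 8601 strings that datetime.fromisoformat accepts (a trailing "Z" needs Python 3.11 or later); the function name is hypothetical.

```python
from datetime import datetime


def parse_date_modified(value):
    """Parse a timestamp string, returning None when it is blank or
    falls in the year 1750, which marks an unknown value."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value)
    return None if parsed.year == 1750 else parsed
```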

debts:
        creditor_name: Liability/Debt creditor
        description: Description of the debt
        value_code: Form code for the value of the judicial debt, substituted with the numerical values of the range.
        value_code_max: The maximum value of the value_code.
        redacted: Does the debt row contain redaction(s)?
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown

gifts:
        source: Source of the judicial gift. (ex. Alta Ski Area).
        description: Description of the gift (ex. Season Pass).
        value: Value of the judicial gift, (ex. $1,199.00)
        redacted: Does the gift row contain redaction(s)?
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown

investments:
        page_number: The page number the investment is listed on.  This is used to generate links directly to the PDF page.
        description: Name of investment (ex. APPL common stock).
        redacted: Does the investment row contain redaction(s)?
        income_during_reporting_period_code: Increase in investment value - as a form code. Substituted with the numerical values of the range.
        income_during_reporting_period_code_max: Maximum value of income_during_reporting_period_code.
        income_during_reporting_period_type: Type of investment (ex. Rent, Dividend). Typically standardized but not universally.
        gross_value_code: Investment total value code at end of reporting period as code (ex. J (1-15,000)). Substituted with the numerical values of the range.
        gross_value_code_max: Maximum value of the gross_value_code.
        gross_value_method: Investment valuation method code (ex. Q = Appraisal)
        transaction_during_reporting_period: Transaction of investment during reporting period (ex. Buy, Sold)
        transaction_date_raw: Date of the transaction, if any (D2)
        transaction_date: Date of the transaction, if any (D2)
        transaction_value_code: Transaction value amount, as form code (ex. J (1-15,000)). Substituted with the numerical values of the range.
        transaction_value_code_max: Maximum value of transaction_value_code.
        transaction_gain_code: Gain from investment transaction if any (ex. A (1-1000)). Substituted with the numerical values of the range.
        transaction_gain_code_max: Maximum value of transaction_gain_code.
        transaction_partner: Identity of the transaction partner
        has_inferred_values: If the investment name was inferred during extraction. This is common because transactions usually list the first purchase of a stock and leave the name value blank for subsequent purchases or sales.
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown
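
The phrase "substituted with the numerical values of the range" suggests a lookup from form code to a numeric range. The sketch below uses only the two codes quoted as examples above (J for 1 to 15,000 and A for 1 to 1,000). The table and function names are hypothetical; the real tables live in fields.py and lookups.py and cover more codes.

```python
# Only the codes quoted as examples in this README; the real tables are larger.
GROSS_VALUE_RANGES = {"J": (1, 15_000)}
GAIN_RANGES = {"A": (1, 1_000)}


def expand_code(code, ranges):
    """Return (low, high) for a form code, or (None, None) when the
    code is blank or not present in the given table."""
    return ranges.get(code, (None, None))


low, high = expand_code("J", GROSS_VALUE_RANGES)
```

Under this sketch, high would populate gross_value_code_max; whether the gross_value_code column itself holds the lower bound is an assumption.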

non_investment_incomes:
        date_raw: Date of non-investment income (ex. 2011).
        source_type: Source and type of non-investment income for the judge (ex. Teaching a class at U. Miami).
        income_amount: Amount earned by judge, often a number, but sometimes with explanatory text (e.g. 'Income at firm: $xyz').
        redacted: Does the non-investment income row contain redaction(s)?
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown

positions:
        non judiciary position: Position title (ex. Trustee).
        organization_name: Name of organization or entity (ex. Trust #1).
        redacted: Does the position row contain redaction(s)?
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown

reimbursements:
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown
        source: Source of the reimbursement (ex. FSU Law School).
        date_raw: Dates as a text string for the date of reimbursements. This is often conference dates (ex. June 2-6, 2011). 
        location: Location of the reimbursement (ex. Harvard Law School, Cambridge, MA).
        purpose: Purpose of the reimbursement (ex. Baseball announcer).
        items_paid_or_provided: Items reimbursed (ex. Room, Airfare).
        redacted: Does the reimbursement contain redaction(s)?

spouse_incomes:
        id: ID of the record
        date_created: The moment when the item was created.
        date_modified: The last moment when the item was modified. A value in year 1750 indicates the value is unknown
        source_type: Source and type of income of judicial spouse (ex. Salary from Bank job).
        redacted: Does the spousal-income row contain redaction(s)?
        date_raw: Date of spousal income (ex. 2011).


=============
Person Fields
=============


fjc_id: The ID of a judge as assigned by the Federal Judicial Center.
Date of Birth: The date of birth for the person
name_last: The last name of this person
political_affiliations: Political affiliations for the judge. Variable length so combined by a comma
Death Country: The country where the person died.
Birth City: The city where the person was born.
name_suffix: Any suffixes that this person's name may have
aba_ratings: American Bar Association Ratings. Variable length so combined by a comma
name_first: The first name of this person.
Death State: The state where the person died.
sources: Sources about the person. Variable length so combined with a newline
Birth Country: The country where the person was born.
cl_id: A unique identifier for judge, also indicating source of data.
gender: The person's gender
name_middle: The middle name or names of this person
ftm_eid: The ID of a judge as assigned by the Follow the Money database.
Death City: The city where the person died.
positions: Positions of person. Variable length so combined with a newline
ftm_total_received: The amount of money received by this person and logged by Follow the Money.
Date of Death: The date of death for the person
religion: The religion of a person
educations: Educations of the person. Variable length so combined by a comma
bachelor school: Name of the school from which they got their Bachelor's degree, and/or Bachelor's of Law degree. Variable length so combined by a comma
juris doctor school: Name of the school from which they got their Juris Doctor degree. Variable length so combined by a comma
race: Race of the person. Variable length so combined by a comma
Birth State: The state where the person was born.
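
Several person fields above hold multiple values joined into one string. When reading the data back, they can be split into lists again. This is a sketch under the assumption that the separators are a plain comma and a plain newline, as the descriptions suggest; the function name is hypothetical, and any value that itself contains a comma would be split incorrectly.

```python
COMMA_JOINED = (
    "political_affiliations", "aba_ratings", "educations",
    "bachelor school", "juris doctor school", "race",
)
NEWLINE_JOINED = ("sources", "positions")


def split_person_fields(row):
    """Return a copy of row with the combined person fields split into lists."""
    result = dict(row)
    for names, separator in ((COMMA_JOINED, ","), (NEWLINE_JOINED, "\n")):
        for name in names:
            if name in result:
                parts = (result[name] or "").split(separator)
                # Strip padding and drop empty pieces left by blank fields.
                result[name] = [part.strip() for part in parts if part.strip()]
    return result
```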


Owner
Ali Rastegar