Uses MIT/MEDSL, New York Times, and US Census data sources to analyze per-county COVID-19 deaths.

Last update: Dec 22, 2021

Covid County

Executive summary

Setup

Install Miniconda, then run the following on the command line:

conda create -n covid-county
conda activate covid-county
conda install pandas ipython matplotlib tabulate

(Let me know if you want pure-Python no-Conda instructions via venv.)

2020 US presidential election

I've already downloaded countypres_2000-2020.csv from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ and committed it to this repo, but you can download it again to verify that I haven't committed bad data.

The 2020 data appears to be missing counts for the District of Columbia (FIPS 11001), so its party split is taken from the 2016 election instead.
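
The District of Columbia fallback could look roughly like the sketch below. This is an illustration only, not necessarily what main.py does: the column names (year, county_fips, party, candidatevotes) are assumed to match the MEDSL CSV, the helper party_votes is hypothetical, and the vote counts are made up.

```python
import pandas as pd

# Hypothetical miniature stand-in for countypres_2000-2020.csv.
# Column names are assumed; vote counts are made up for illustration.
votes = pd.DataFrame(
    {
        "year": [2016, 2016, 2020, 2020],
        "county_fips": [11001, 11001, 1001, 1001],
        "party": ["DEMOCRAT", "REPUBLICAN", "DEMOCRAT", "REPUBLICAN"],
        "candidatevotes": [90, 10, 30, 70],
    }
)


def party_votes(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Return one row per county with summed votes per party for one year."""
    subset = df[df["year"] == year]
    return subset.pivot_table(
        index="county_fips",
        columns="party",
        values="candidatevotes",
        aggfunc="sum",
        fill_value=0,
    )


by_county = party_votes(votes, 2020)

# Fall back to the 2016 split when DC has no 2020 rows.
DC_FIPS = 11001
if DC_FIPS not in by_county.index:
    fallback = party_votes(votes, 2016).loc[[DC_FIPS]]
    by_county = pd.concat([by_county, fallback]).sort_index()

# Democratic share of the two-party vote, per county.
dem_share = by_county["DEMOCRAT"] / (by_county["DEMOCRAT"] + by_county["REPUBLICAN"])
```

Only the party split is borrowed from 2016 here; the sketch does not try to estimate DC's 2020 vote totals.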

Census

From https://www.census.gov/programs-surveys/popest/technical-documentation/research/evaluation-estimates/2020-evaluation-estimates/2010s-counties-total.html I downloaded co-est2020.csv from the "Annual Resident Population Estimates for States and Counties: April 1, 2010 to July 1, 2019; April 1, 2020; and July 1, 2020 (CO-EST2020)" link. It's committed in this repo but you can download it yourself too.
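
To join the census table with the other two sources, each county needs a five-digit FIPS key. A minimal sketch of one way to build it, assuming the file has SUMLEV, STATE, COUNTY, and POPESTIMATE2020 columns and that SUMLEV 50 marks county-level rows (both are assumptions about the file layout; the rows below are made up):

```python
import pandas as pd

# Hypothetical rows mimicking co-est2020.csv; column names are assumed
# and the population figures are made up.
census = pd.DataFrame(
    {
        "SUMLEV": [40, 50, 50],  # assumed: 40 = state row, 50 = county row
        "STATE": [1, 1, 11],
        "COUNTY": [0, 1, 1],
        "CTYNAME": ["Alabama", "Autauga County", "District of Columbia"],
        "POPESTIMATE2020": [5000, 50, 700],
    }
)

# Keep county-level rows only, then build the FIPS code as SSCCC
# (two-digit state code followed by three-digit county code).
counties = census[census["SUMLEV"] == 50].copy()
counties["fips"] = counties["STATE"] * 1000 + counties["COUNTY"]
population = counties.set_index("fips")["POPESTIMATE2020"]
```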

Covid

Install Git, then run this in this directory (it might take a while): git clone --depth 1 https://github.com/nytimes/covid-19-data.git
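
Once the clone finishes, per-county death totals can be read from the county-level CSV. The sketch below assumes the us-counties.csv layout (date, county, state, fips, cases, deaths) with cumulative counts, and uses a made-up in-memory excerpt instead of the real file; it is not necessarily how main.py loads the data.

```python
import io

import pandas as pd

# Hypothetical excerpt in the assumed layout of us-counties.csv;
# the numbers are made up.
raw = io.StringIO(
    "date,county,state,fips,cases,deaths\n"
    "2021-12-20,Autauga,Alabama,01001,100,5\n"
    "2021-12-21,Autauga,Alabama,01001,110,6\n"
    "2021-12-21,New York City,New York,,900,40\n"
)

# Read fips as text so leading zeros survive and blanks become missing values.
covid = pd.read_csv(raw, parse_dates=["date"], dtype={"fips": "string"})

# Counts are assumed cumulative, so the most recent row per county
# carries the running total.
latest = covid.sort_values("date").groupby(["state", "county"]).tail(1)
deaths = latest.set_index("county")["deaths"]
```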

Note that the five boroughs of NYC are combined into a single "county" in the Covid data. To account for this, the 2020 presidential votes from all five boroughs are merged into a single county as well; since we can't split the Covid deaths into individual boroughs, this is the best we can do. The fix follows the recommendation in upstream issue 105.
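
The borough merge can be sketched as below. The five FIPS codes are the standard ones for the boroughs; everything else (the dem/rep column names, the placeholder key NYC_KEY, and the vote counts) is hypothetical and only meant to show the idea, not the actual code in main.py.

```python
import pandas as pd

# FIPS codes of the five boroughs: Bronx, Kings (Brooklyn), New York
# (Manhattan), Queens, Richmond (Staten Island).
NYC_BOROUGH_FIPS = [36005, 36047, 36061, 36081, 36085]
NYC_KEY = -1  # hypothetical placeholder key for the merged "county"

# Hypothetical per-county vote totals; the numbers are made up.
votes = pd.DataFrame(
    {
        "county_fips": [36005, 36047, 36061, 36081, 36085, 36001],
        "dem": [10, 20, 30, 40, 5, 7],
        "rep": [1, 2, 3, 4, 6, 8],
    }
)

# Relabel every borough with the shared key, then sum votes per key.
merged_key = votes["county_fips"].where(
    ~votes["county_fips"].isin(NYC_BOROUGH_FIPS), NYC_KEY
)
merged = votes.drop(columns="county_fips").groupby(merged_key).sum()
```

After this step the merged row can be matched to the single "New York City" entry in the Covid data, while all other counties keep their own FIPS codes.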

Run

python main.py

(Takes ~45 seconds on my 2015-vintage laptop.)

More results

party bin	total Covid-19 deaths
Rep 80+%	38284
Rep 60–79%	211416
Rep 50–59%	123587
Dem 50–59%	196084
Dem 60–79%	210070
Dem 80+%	18331
unknown	5243

Summed simply by party (the three bins on each side, excluding "unknown"):

Dem: 424485
Rep: 373287
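
For reference, a minimal sketch of how counties might be assigned to the party bins above and their deaths summed. The binning rule is an assumption (Democratic share of the two-party vote, with an exact 50% tie counted as Dem), the function party_bin is hypothetical, and the data is made up; main.py may define the bins differently.

```python
import math

import pandas as pd


def party_bin(dem_share: float) -> str:
    """Map a Democratic two-party vote share (0..1) to a party-bin label."""
    if dem_share is None or math.isnan(dem_share):
        return "unknown"
    # Assumption: ties at exactly 50% are counted as Dem.
    if dem_share >= 0.5:
        party, share = "Dem", dem_share
    else:
        party, share = "Rep", 1 - dem_share
    if share >= 0.8:
        return f"{party} 80+%"
    if share >= 0.6:
        return f"{party} 60–79%"
    return f"{party} 50–59%"


# Hypothetical counties; shares and death counts are made up.
counties = pd.DataFrame(
    {
        "dem_share": [0.85, 0.55, 0.45, 0.15, float("nan")],
        "deaths": [10, 20, 30, 40, 5],
    }
)
counties["party bin"] = counties["dem_share"].map(party_bin)
deaths_by_bin = counties.groupby("party bin")["deaths"].sum()
```

Summing the three Dem rows and the three Rep rows of such a table gives the per-party totals, which is how the two figures above relate to the binned table.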

Owner

Ahmed Fasih
