Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format

Last update: Dec 01, 2021

opendata

Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format.

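import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

# async_load is awaitable, so asyncio.run drives it to completion.
# It returns two values; the first is the trips dataframe.
trips_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=1000))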

len(trips_df.index)
# 8731

trips_df.columns
# Index(['started_at', 'ended_at', 'start_station_id', 'end_station_id',
#        'start_station_name', 'end_station_name', 'rideable_type', 'ride_id',
#        'start_lat', 'start_lng', 'end_lat', 'end_lng', 'gender', 'user_type',
#        'bike_id', 'birth_year'],
#       dtype='object')

An example analysis can be found here: https://observablehq.com/@brady/bikeshare
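
For a quick local look at the standardized columns shown above, a minimal sketch along the following lines may help. It does not call opendata at all: `trips_df` here is a tiny hand-made stand-in that reuses a few of the documented column names, the station names and timestamps are invented, and the dtype of the timestamp columns in the real dataframe is an assumption (hence the explicit `pd.to_datetime` conversion).

```python
import pandas as pd

# Hypothetical stand-in for the dataframe returned by async_load.
# Only the column names come from the README; the values are made up.
trips_df = pd.DataFrame(
    {
        "started_at": ["2021-06-01 08:00:00", "2021-06-01 09:15:00", "2021-06-02 17:30:00"],
        "ended_at": ["2021-06-01 08:12:00", "2021-06-01 09:45:00", "2021-06-02 17:38:00"],
        "start_station_name": ["Market St", "Market St", "Ferry Building"],
        "user_type": ["member", "casual", "member"],
    }
)

# Assumption: timestamps may arrive as strings, so parse them explicitly.
started = pd.to_datetime(trips_df["started_at"])
ended = pd.to_datetime(trips_df["ended_at"])

# Trip duration in minutes, derived from the two timestamp columns.
trips_df["duration_min"] = (ended - started).dt.total_seconds() / 60

# Mean trip duration per user type, and trip counts per start station.
mean_duration_by_user_type = trips_df.groupby("user_type")["duration_min"].mean()
trips_per_start_station = trips_df["start_station_name"].value_counts()

print(mean_duration_by_user_type)
print(trips_per_start_station)
```

With a real `trips_df`, the same two aggregations should apply unchanged as long as the column names match the list above.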

Supports sampling and local file caching to improve performance.
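
How opendata implements sampling and caching is not shown in this README. The sketch below only illustrates the general idea with the standard library: a download is written to a local cache directory the first time and read back from disk afterwards, and a sample keeps every Nth row. The helper names `cached_fetch` and `sample_every_nth`, the cache layout, and the reading of a sample rate as "keep one row in N" are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Iterator, List


def cached_fetch(url: str, fetch: Callable[[str], bytes], cache_dir: Path) -> bytes:
    """Return the bytes for url, calling fetch only when no cached copy exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any URL maps to a safe, fixed-length file name.
    cache_path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if cache_path.exists():
        return cache_path.read_bytes()
    data = fetch(url)
    cache_path.write_bytes(data)
    return data


def sample_every_nth(rows: Iterable[str], sample_rate: int) -> Iterator[str]:
    """Yield every sample_rate-th row, starting with the first one."""
    for index, row in enumerate(rows):
        if index % sample_rate == 0:
            yield row


# Demo with a fake downloader so the example needs no network access.
calls: List[str] = []


def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return "\n".join(f"trip-{i}" for i in range(10)).encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    url = "https://example.com/trips.csv"  # placeholder URL, not a real data source
    first = cached_fetch(url, fake_fetch, Path(tmp))
    second = cached_fetch(url, fake_fetch, Path(tmp))  # served from the cache file

sampled = list(sample_every_nth(first.decode("utf-8").splitlines(), sample_rate=5))
print(len(calls), sampled)
```

The fake downloader is called only once even though the data is requested twice, which is the performance benefit caching is meant to provide; sampling then cuts the number of rows that need to be parsed.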

Markets supported

import opendata.sources.bikeshare.bay_wheels
import opendata.sources.bikeshare.bixi
import opendata.sources.bikeshare.divvy
import opendata.sources.bikeshare.capital_bikeshare
import opendata.sources.bikeshare.citi_bike
import opendata.sources.bikeshare.cogo
import opendata.sources.bikeshare.niceride
import opendata.sources.bikeshare.bluebikes
import opendata.sources.bikeshare.metro_bike_share
import opendata.sources.bikeshare.indego
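
If you want to pick a market by name at run time, something like the following sketch could work. It assumes that every market package listed above exposes a `trips` object with the same `async_load(trip_sample_rate=...)` call shown for `bay_wheels`; the README only confirms that for Bay Wheels, so treat the rest as unverified. `MARKETS`, `market_module_path`, and `load_trips` are hypothetical helper names, not part of opendata.

```python
import asyncio
import importlib

# Market package names copied from the import list above.
MARKETS = [
    "bay_wheels", "bixi", "divvy", "capital_bikeshare", "citi_bike",
    "cogo", "niceride", "bluebikes", "metro_bike_share", "indego",
]


def market_module_path(market: str) -> str:
    """Return the dotted import path for a supported market name."""
    if market not in MARKETS:
        raise ValueError(f"unsupported market: {market!r}")
    return f"opendata.sources.bikeshare.{market}"


def load_trips(market: str, trip_sample_rate: int = 1000):
    """Load the trips dataframe for one market (requires opendata to be installed).

    Assumption: each market offers trips.async_load like bay_wheels does.
    """
    path = market_module_path(market)
    package = importlib.import_module(path)
    # `trips` may be an attribute of the package or a submodule; try both.
    trips = getattr(package, "trips", None)
    if trips is None:
        trips = importlib.import_module(f"{path}.trips")
    trips_df, _ = asyncio.run(trips.async_load(trip_sample_rate=trip_sample_rate))
    return trips_df


print(market_module_path("divvy"))
```

Calling `load_trips("bay_wheels")` should then be equivalent to the Bay Wheels example at the top of this README, provided the assumption above holds.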

Bootstrap

Set up your environment

brew install chromedriver
brew install python3
python3 -m pip install pre-commit

pre-commit install --install-hooks
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Entering virtualenv

python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Usage

Try the test export to CSV:

python3 test.py

Updating pip requirements

pip-compile

Pre-commit setup

pre-commit install --install-hooks

Owner

Brady Law
