opendata

Finds, downloads, parses, and standardizes public bikeshare data into a common pandas DataFrame format.

import asyncio
from opendata.sources.bikeshare.bay_wheels import trips as bay_wheels

trips_df, _ = asyncio.run(bay_wheels.async_load(trip_sample_rate=1000))

len(trips_df.index)
# 8731

trips_df.columns
# Index(['started_at', 'ended_at', 'start_station_id', 'end_station_id',
#        'start_station_name', 'end_station_name', 'rideable_type', 'ride_id',
#        'start_lat', 'start_lng', 'end_lat', 'end_lng', 'gender', 'user_type',
#        'bike_id', 'birth_year'],
#       dtype='object')
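
Because every market is normalized to the same column names, follow-up analysis can be written once against those columns. The sketch below derives trip durations from `started_at` and `ended_at`. It assumes both columns hold values pandas can parse as timestamps (the README does not state their dtypes), it uses a small toy frame in place of the real `trips_df`, and `add_duration_minutes` is a hypothetical helper, not part of opendata.

```python
import pandas as pd

# Toy stand-in for trips_df; real data would come from async_load().
# The user_type values are illustrative, not taken from the real feed.
trips_df = pd.DataFrame(
    {
        "started_at": ["2021-06-01 08:00:00", "2021-06-01 09:15:00"],
        "ended_at": ["2021-06-01 08:12:30", "2021-06-01 09:45:00"],
        "user_type": ["member", "casual"],
    }
)


def add_duration_minutes(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a duration_minutes column (hypothetical helper)."""
    out = df.copy()
    started = pd.to_datetime(out["started_at"])
    ended = pd.to_datetime(out["ended_at"])
    # Timedelta -> seconds -> minutes
    out["duration_minutes"] = (ended - started).dt.total_seconds() / 60
    return out


with_duration = add_duration_minutes(trips_df)
print(with_duration[["user_type", "duration_minutes"]])
```

The helper copies the frame rather than mutating it, so the loaded data can be reused for other analyses.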

An example analysis can be found here: https://observablehq.com/@brady/bikeshare

Supports sampling and local file caching to improve performance.
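
The README does not describe how sampling or caching is implemented. Purely to illustrate the two ideas, the sketch below keeps one row in every `sample_rate` rows and reuses a file on disk instead of fetching it again. `cached_fetch`, `sample_rows`, and `fake_fetch` are hypothetical names and are not part of opendata.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Iterator


def cached_fetch(name: str, fetch: Callable[[], bytes], cache_dir: Path) -> bytes:
    """Return the cached bytes for name, calling fetch() only on a cache miss."""
    path = cache_dir / name
    if path.exists():
        return path.read_bytes()
    data = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data


def sample_rows(rows: Iterable[str], sample_rate: int) -> Iterator[str]:
    """Yield every sample_rate-th row, starting with the first."""
    for i, row in enumerate(rows):
        if i % sample_rate == 0:
            yield row


calls = []


def fake_fetch() -> bytes:
    # Stand-in for a slow download; records how many times it runs.
    calls.append(1)
    return "\n".join(f"ride-{i}" for i in range(10)).encode()


cache_dir = Path(tempfile.mkdtemp())
first = cached_fetch("trips.csv", fake_fetch, cache_dir)
second = cached_fetch("trips.csv", fake_fetch, cache_dir)  # served from disk
sampled = list(sample_rows(first.decode().splitlines(), sample_rate=4))
print(len(calls), sampled)
```

Sampling by position keeps the result deterministic across runs; a random sample would be an equally valid design if reproducibility is not needed.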

Markets supported

import opendata.sources.bikeshare.bay_wheels
import opendata.sources.bikeshare.bixi
import opendata.sources.bikeshare.divvy
import opendata.sources.bikeshare.capital_bikeshare
import opendata.sources.bikeshare.citi_bike
import opendata.sources.bikeshare.cogo
import opendata.sources.bikeshare.niceride
import opendata.sources.bikeshare.bluebikes
import opendata.sources.bikeshare.metro_bike_share
import opendata.sources.bikeshare.indego
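
Since the markets are standardized to one column layout, frames from several markets can be stacked for cross-city comparisons. This is a sketch with toy frames. The `market` label column is added here for illustration only, and real frames would come from each market's loader (the README only shows `async_load` for `bay_wheels`, so the same entry point for other markets is an assumption).

```python
import pandas as pd

# Toy frames standing in for per-market results with the shared columns.
frames = {
    "bay_wheels": pd.DataFrame(
        {"ride_id": ["a1", "a2"], "rideable_type": ["electric_bike", "classic_bike"]}
    ),
    "divvy": pd.DataFrame({"ride_id": ["b1"], "rideable_type": ["classic_bike"]}),
}

# Tag each frame with its market name, then stack them into one frame.
combined = pd.concat(
    [df.assign(market=name) for name, df in frames.items()],
    ignore_index=True,
)

trips_per_market = combined["market"].value_counts().to_dict()
print(trips_per_market)
```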

Bootstrap

Set up your environment

brew install chromedriver
brew install python3
python3 -m pip install pre-commit
pre-commit install --install-hooks
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Entering virtualenv

python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt

Usage

Try the test export to CSV:

python3 test.py

Updating pip requirements

pip-compile

Pre-commit setup

pre-commit install --install-hooks

Bikeshare markets to add

USA

  • 119k/yr Pittsburgh (Google Drive links)
  • 180k/yr Austin (date and time fields separate)
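
For a feed like Austin's, where the date and the time of a trip arrive in separate fields, one possible way to build a single `started_at` timestamp is sketched here. The column names `checkout_date` and `checkout_time` and the date format are hypothetical and not taken from the Austin data.

```python
import pandas as pd

# Hypothetical raw layout with the date and the time in separate columns.
raw = pd.DataFrame(
    {
        "checkout_date": ["2021-03-01", "2021-03-02"],
        "checkout_time": ["07:45:00", "18:05:30"],
    }
)

# Join the two strings and parse them with an explicit format.
raw["started_at"] = pd.to_datetime(
    raw["checkout_date"] + " " + raw["checkout_time"],
    format="%Y-%m-%d %H:%M:%S",
)
print(raw["started_at"])
```

An explicit `format` makes parsing fail loudly if the feed changes its layout, instead of silently guessing.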

World

  • 3868k/yr Ecobici (need station CSV)
  • 2900k/yr Toronto (needs more investigation)
  • 650k/yr Vancouver (Google Drive links)
Owner
Brady Law