A small timeseries transformation API built on Flask and Pandas

Overview

#Mcflyin

###A timeseries transformation API built on Pandas and Flask

This is a small demo of an API to do timeseries transformations built on Flask and Pandas.

Concept

The idea is that you can make a POST request to the API with a simple list/array of timestamps, from any language, and get back some interesting transformations of that data.

Why?

Partly to show how straightforward it is to build such a thing. Python is great because it has very powerful, intuitive, quick-to-learn tools for both building web applications and doing data analysis/statistics.

That puts Python in kind of a unique position: powerful web tools, powerful scientific/numerical/statistical data tools. This API is a very simple example of how you can take advantage of both. Go read the source code- it's short and easy to grok. Bug fixes and pull requests welcome.

Getting Started

First we need to find some data. We're going to use some data that Wes McKinney provided in a recent blog post, with some statistics on Python posts on Stack Overflow. This is something of a contrived example: I'm manipulating the data in Python, sending to a Python backend, and then getting a response to manipulate in Python. Just know that all you need is an array of timestamp strings, no matter your language.

import pandas as pd

data = pd.read_csv('AllPandas.csv')
data = data['CreationDate'].tolist()

A simple array of timestamps:

>>>data[:10]
['2011-04-01 14:50:44',
 '2012-01-18 19:41:27',
 '2012-01-23 03:21:00',
 '2012-01-24 17:59:53',
 '2012-03-04 16:58:45',
 '2012-03-09 22:36:52',
 '2012-03-10 15:35:26',
 '2012-03-18 12:53:06',
 '2012-03-30 13:58:29',
 '2012-04-04 23:17:23']

With the McFlyin application running on localhost, lets make a request to resample the data on an daily basis, to get the number of posts per day:

import requests
import json

freq = {'D': 'Daily'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data)}
r = requests.post('http://127.0.0.1:5000/resample', data=sends)
response = r.json

The response is simple JSON:

{'Monthly': {'data': [1.0, 2.0, 1.0, 1.0,...
             'time': ['2011-03-31T00:00:00', '2011-04-30T00:00:00', '2011-05-31T00:00:00', '2011-06-30T00:00:00', '2011-07-31T00:00:00',...

Here's the distribution of daily questions on Stack Overflow for Pandas (monthly probably would have been a little more informative):

Daily

Let's call Mcflyin for a rolling sum on a seven-day window. It will resample to the given freq, then apply the window to the result:

freq = {'D': 'Weekly Rolling'}
sends = {'freq': json.dumps(freq), 'data': json.dumps(data), 'window': 7}
r = requests.post('http://127.0.0.1:5000/rolling_sum', data=sends)
response = r.json

Rolling

Let's look at the total questions asked by day:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

dailysum

and daily means:

sends = {'data': json.dumps(data), 'how': json.dumps('mean')}
r = requests.post('http://127.0.0.1:5000/daily', data=sends)
response = r.json

dailymean

The same for hourly:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/hourly', data=sends)
response = r.json

dailymean

Finally, we can look at hourly by day-of-week:

sends = {'data': json.dumps(data), 'how': json.dumps('sum')}
r = requests.post('http://127.0.0.1:5000/daily_hours', data=sends)
response = r.json

hourdow

Live demo here

Dependencies

Pandas, Numpy, Requests, Flask

How did you make those colorful graphs?

Vincent and Bearcart

Status

Lots of stuff that could be better- error handling on the requests, probably better handling of weird timestamps, etc. This is just a small demo of how powerful Python can be for building a statistics backend with relatively few lines of code.

If I want to write a front-end in a different language, can I put it in the examples folder?

Yes! PR's welcome.

Owner
Rob Story
Rob Story
A python wrapper for creating and viewing effects for Matt Parker's christmas tree.

Christmas Tree Visualizer A python wrapper for creating and viewing effects for Matt Parker's christmas tree. Displays py or csv effect files and allo

4 Nov 22, 2022
Visualize data of Vietnam's regions with interactive maps.

Plotting Vietnam Development Map This is my personal project that I use plotly to analyse and visualize data of Vietnam's regions with interactive map

1 Jun 26, 2022
The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

Dan Foreman-Mackey 1.3k Jan 04, 2023
A custom qq-plot for two sample data comparision

QQ-Plot 2 Sample Just a gist to include the custom code to draw a qq-plot in python when dealing with a "two sample problem". This means when u try to

1 Dec 20, 2021
Extract data from ThousandEyes REST API and visualize it on your customized Grafana Dashboard.

ThousandEyes Grafana Dashboard Extract data from the ThousandEyes REST API and visualize it on your customized Grafana Dashboard. Deploy Grafana, Infl

Flo Pachinger 16 Nov 26, 2022
An interactive UMAP visualization of the MNIST data set.

Code for an interactive UMAP visualization of the MNIST data set. Demo at https://grantcuster.github.io/umap-explorer/. You can read more about the de

grant 70 Dec 27, 2022
Make scripted visualizations in blender

Scripted visualizations in blender The goal of this project is to script 3D scientific visualizations using blender. To achieve this, we aim to bring

Praneeth Namburi 10 Jun 01, 2022
Gesture controlled media player

Media Player Gesture Control Gesture controller for media player with MediaPipe, VLC and OpenCV. Contents About Setup About A tool for using gestures

Atharva Joshi 2 Dec 22, 2021
HM02: Visualizing Interesting Datasets

HM02: Visualizing Interesting Datasets This is a homework assignment for CSCI 40 class at Claremont McKenna College. Go to the project page to learn m

Qiaoling Chen 11 Oct 26, 2021
A simple python tool for explore your object detection dataset

A simple tool for explore your object detection dataset. The goal of this library is to provide simple and intuitive visualizations from your dataset and automatically find the best parameters for ge

GRADIANT - Centro Tecnolóxico de Telecomunicacións de Galicia 142 Dec 25, 2022
Python & Julia port of codes in excellent R books

X4DS This repo is a collection of Python & Julia port of codes in the following excellent R books: An Introduction to Statistical Learning (ISLR) Stat

Gitony 5 Jun 21, 2022
Extract and visualize information from Gurobi log files

GRBlogtools Extract information from Gurobi log files and generate pandas DataFrames or Excel worksheets for further processing. Also includes a wrapp

Gurobi Optimization 56 Nov 17, 2022
Example Code Notebooks for Data Visualization in Python

This repository contains sample code scripts for creating awesome data visualizations from scratch using different python libraries (such as matplotli

Javed Ali 27 Jan 04, 2023
Generate SVG (dark/light) images visualizing (private/public) GitHub repo statistics for profile/website.

Generate daily updated visualizations of GitHub user and repository statistics from the GitHub API using GitHub Actions for any combination of private and public repositories, whether owned or contri

Adam Ross 2 Dec 16, 2022
Pretty Confusion Matrix

Pretty Confusion Matrix Why pretty confusion matrix? We can make confusion matrix by using matplotlib. However it is not so pretty. I want to make con

Junseo Ko 5 Nov 22, 2022
Standardized plots and visualizations in Python

Standardized plots and visualizations in Python pltviz is a Python package for standardized visualization. Routine and novel plotting approaches are f

Andrew Tavis McAllister 0 Jul 09, 2022
Schema validation just got Pythonic

Schema validation just got Pythonic schema is a library for validating Python data structures, such as those obtained from config-files, forms, extern

Vladimir Keleshev 2.7k Jan 06, 2023
Cartopy - a cartographic python library with matplotlib support

Cartopy is a Python package designed to make drawing maps for data analysis and visualisation easy. Table of contents Overview Get in touch License an

1.2k Jan 01, 2023
NorthPitch is a python soccer plotting library that sits on top of Matplotlib

NorthPitch is a python soccer plotting library that sits on top of Matplotlib.

Devin Pleuler 30 Feb 22, 2022
Plotting library for IPython/Jupyter notebooks

bqplot 2-D plotting library for Project Jupyter Introduction bqplot is a 2-D visualization system for Jupyter, based on the constructs of the Grammar

3.4k Dec 29, 2022