Interactive plotting for Pandas using Vega-Lite

Overview

pdvega: Vega-Lite plotting for Pandas Dataframes

build status Binder

pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas dataframes, using an API that is nearly identical to Pandas' built-in visualization tools, and designed for easy use within the Jupyter notebook.

Pandas currently has some basic plotting capabilities based on matplotlib. So, for example, you can create a scatter plot this way:

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.random.randn(100), 'y': np.random.randn(100)})
df.plot.scatter(x='x', y='y')

matplotlib scatter output

The goal of pdvega is that any time you use dataframe.plot, you'll be able to replace it with dataframe.vgplot and instead get a similar (but prettier and more interactive) visualization output in Vega-Lite that you can easily export to share or customize:

import pdvega  # import adds vgplot attribute to pandas

df.vgplot.scatter(x='x', y='y')

vega-lite scatter output

The above image is a static screenshot of the interactive output; please see the Documentation for a full set of live usage examples.

Installation

You can get started with pdvega using pip:

$ pip install jupyter pdvega
$ jupyter nbextension install --sys-prefix --py vega3

The first line installs pdvega and its dependencies; the second installs the Jupyter extensions that allows plots to be displayed in the Jupyter notebook. For more information on installation and dependencies, see the Installation docs.

Why Vega-Lite?

When working with data, one of the biggest challenges is ensuring reproducibility of results. When you create a figure and export it to PNG or PDF, the data become baked-in to the rendering in a way that is difficult or impossible for others to extract. Vega and Vega-Lite change this: instead of packaging a figure by encoding its pixel values, they package a figure by describing, in a declarative manner, the relationship between data values and visual encodings through a JSON specification.

This means that the Vega-Lite figures produced by pdvega are portable: you can send someone the resulting JSON specification and they can choose whether to render it interactively online, convert it to a PNG or EPS for static publication, or even enhance and extend the figure to learn more about the data.

pdvega is a step in bringing this vision of figure portability and reproducibility to the Python world.

Relationship to Altair

Altair is a project that seeks to design an intuitive declarative API for generating Vega-Lite and Vega visualizations, using Pandas dataframes as data sources.

By contrast, pdvega seeks not to design new visualization APIs, but to use the existing DataFrame.plot visualization api and output visualizations with Vega/Vega-Lite rather than with matplotlib.

In this respect, pdvega is quite similar in spirit to the now-defunct mpld3 project, though the scope is smaller and (hopefully) much more manageable.

Owner
Altair
Declarative visualization in Python
Altair
A script written in Python that generate output custom color (HEX or RGB input to x1b hexadecimal)

ColorShell ─ 1.5 Planned for v2: setup.sh for setup alias This script converts HEX and RGB code to x1b x1b is code for colorize outputs, works on ou

Riley 4 Oct 31, 2021
Statistics and Visualization of acceptance rate, main keyword of CVPR 2021 accepted papers for the main Computer Vision conference (CVPR)

Statistics and Visualization of acceptance rate, main keyword of CVPR 2021 accepted papers for the main Computer Vision conference (CVPR)

Hoseong Lee 78 Aug 23, 2022
2021 grafana arbitrary file read

2021_grafana_arbitrary_file_read base on pocsuite3 try 40 default plugins of grafana alertlist annolist barchart cloudwatch dashlist elasticsearch gra

ATpiu 5 Nov 09, 2022
PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disributions on polytopes

PolytopeSampler PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disrib

9 Sep 26, 2022
The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".

The HIST framework for stock trend forecasting The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining C

Wentao Xu 111 Jan 03, 2023
Generate "Jupiter" plots for circular genomes

jupiter Generate "Jupiter" plots for circular genomes Description Python scripts to generate plots from ViennaRNA output. Written in "pidgin" python w

Robert Edgar 2 Nov 29, 2021
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data

FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick

Synthesized 69 Dec 15, 2022
Interactive plotting for Pandas using Vega-Lite

pdvega: Vega-Lite plotting for Pandas Dataframes pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas datafra

Altair 342 Oct 26, 2022
🐍PyNode Next allows you to easily create beautiful graph visualisations and animations

PyNode Next A complete rewrite of PyNode for the modern era. Up to five times faster than the original PyNode. PyNode Next allows you to easily create

ehne 3 Feb 12, 2022
Visualization Library

CamViz Overview // Installation // Demos // License Overview CamViz is a visualization library developed by the TRI-ML team with the goal of providing

Toyota Research Institute - Machine Learning 67 Nov 24, 2022
basemap - Plot on map projections (with coastlines and political boundaries) using matplotlib.

Basemap Plot on map projections (with coastlines and political boundaries) using matplotlib. ⚠️ Warning: this package is being deprecated in favour of

Matplotlib Developers 706 Dec 28, 2022
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

Alec Thomas 1.8k Dec 31, 2022
Painlessly create beautiful matplotlib plots.

Announcement Thank you to everyone who has used prettyplotlib and made it what it is today! Unfortunately, I no longer have the bandwidth to maintain

Olga Botvinnik 1.6k Jan 06, 2023
Pydrawer: The Python package for visualizing curves and linear transformations in a super simple way

pydrawer 📐 The Python package for visualizing curves and linear transformations in a super simple way. ✏️ Installation Install pydrawer package with

Dylan Tintenfich 56 Dec 30, 2022
Plot-configurations for scientific publications, purely based on matplotlib

TUEplots Plot-configurations for scientific publications, purely based on matplotlib. Usage Please have a look at the examples in the example/ directo

Nicholas Krämer 487 Jan 08, 2023
daily report of @arkinvest ETF activity + data collection

ark_invest daily weekday report of @arkinvest ETF activity + data collection This script was created to: Extract and save daily csv's from ARKInvest's

T D 27 Jan 02, 2023
Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python

Petrel Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python. NOTE: The base Storm package provides storm.py, which

AirSage 247 Dec 18, 2021
A toolkit to generate MR sequence diagrams

mrsd: a toolkit to generate MR sequence diagrams mrsd is a Python toolkit to generate MR sequence diagrams, as shown below for the basic FLASH sequenc

Julien Lamy 3 Dec 25, 2021
Learning Convolutional Neural Networks with Interactive Visualization.

CNN Explainer An interactive visualization system designed to help non-experts learn about Convolutional Neural Networks (CNNs) For more information,

Polo Club of Data Science 6.3k Jan 01, 2023
Type-safe YAML parser and validator.

StrictYAML StrictYAML is a type-safe YAML parser that parses and validates a restricted subset of the YAML specification. Priorities: Beautiful API Re

Colm O'Connor 1.2k Jan 04, 2023