Make sankey, alluvial and sankey bump plots in ggplot

Overview

ggsankey

The goal of ggsankey is to make beautiful sankey, alluvial and sankey bump plots in ggplot2

Installation

You can install the development version of ggsankey from github with:

# install.packages("devtools")
devtools::install_github("davidsjoberg/ggsankey")

How does it work

Google defines a sankey as:

A sankey diagram is a visualization used to depict a flow from one set of values to another. The things being connected are called nodes and the connections are called links. Sankeys are best used when you want to show a many-to-many mapping between two domains or multiple paths through a set of stages.

To plot a sankey diagram with ggsankey each observation has a stage (called a discrete x-value in ggplot) and be part of a node. Furthermore, each observation needs to have instructions of which node it will belong to in the next stage. See the image below for some clarification.

Hence, to use geom_sankey the aestethics x, next_x, node and next_node are required. The last stage should point to NA. The aestethics fill and color will affect both nodes and flows.

To controll geometries (not changed by data) like fill, color, size, alpha etc for nodes and flows you can either choose to set a global value that affect both, or you can specify which one you want to alter. For example node.color = 'black' will only draw a black line around the nodes, but not the flows (links).

Example

geom_sankey

A basic sankey plot that shows how dimensions are linked.

library(ggsankey)
library(dplyr)
library(ggplot2)

df <- mtcars %>%
  make_long(cyl, vs, am, gear, carb)

ggplot(df, aes(x = x, 
               next_x = next_x, 
               node = node, 
               next_node = next_node,
               fill = factor(node))) +
  geom_sankey()

And by adding a little pimp.

  • Labels with geom_sankey_label which places labels in the center of nodes if given the same aestethics.

  • ggsankey also comes with custom minimalistic themes that can be used. Here I use theme_sankey.

ggplot(df, aes(x = x, next_x = next_x, node = node, next_node = next_node, fill = factor(node), label = node)) +
  geom_sankey(flow.alpha = .6,
              node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5)) +
  ggtitle("Car features")

geom_alluvial

Alluvial plots are very similiar to sankey plots but have no spaces between nodes and start at y = 0 instead being centered around the x-axis.

ggplot(df, aes(x = x, next_x = next_x, node = node, next_node = next_node, fill = factor(node), label = node)) +
  geom_alluvial(flow.alpha = .6) +
  geom_alluvial_text(size = 3, color = "white") +
  scale_fill_viridis_d() +
  theme_alluvial(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5)) +
  ggtitle("Car features")

geom_sankey_bump

Sankey bump plots is mix between bump plots and sankey and mostly useful for time series. When a group becomes larger than another it bumps above it.

# install.packages("gapminder")
library(gapminder)

df <- gapminder %>%
  group_by(continent, year) %>%
  summarise(gdp = (sum_(pop * gdpPercap)/1e9) %>% round(0), .groups = "keep") %>%
  ungroup()

ggplot(df, aes(x = year,
               node = continent,
               fill = continent,
               value = gdp)) +
  geom_sankey_bump(space = 0, type = "alluvial", color = "transparent", smooth = 6) +
  scale_fill_viridis_d(option = "A", alpha = .8) +
  theme_sankey_bump(base_size = 16) +
  labs(x = NULL,
       y = "GDP ($ bn)",
       fill = NULL,
       color = NULL) +
  theme(legend.position = "bottom") +
  labs(title = "GDP development per continent")

Owner
David Sjoberg
Happy R user. Twitter: @davsjob
David Sjoberg
Jupyter notebook and datasets from the pandas Q&A video series

Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note

Kevin Markham 2k Jan 05, 2023
A small tool to test and visualize protein embeddings and amino acid proportions.

polyprotein_stats A small tool to test and visualize protein embeddings and amino acid proportions. Currently deployed on streamlit.io. Given a set of

2 Jan 07, 2023
A simple project on Data Visualization for CSCI-40 course.

Simple-Data-Visualization A simple project on Data Visualization for CSCI-40 course - the instructions can be found here SAT results in New York in 20

Hugo Matousek 8 Oct 27, 2021
HW 02 for CS40 - matplotlib practice

HW 02 for CS40 - matplotlib practice project instructions https://github.com/mikeizbicki/cmc-csci040/tree/2021fall/hw_02 Drake Lyric Analysis Bar Char

13 Oct 27, 2021
哔咔漫画window客户端,界面使用PySide2,已实现分类、搜索、收藏夹、下载、在线观看、waifu2x等功能。

picacomic-windows 哔咔漫画window客户端,界面使用PySide2,已实现分类、搜索、收藏夹、下载、在线观看等功能。 功能介绍 登陆分流,还原安卓端的三个分流入口 分类,搜索,排行,收藏夹使用同一的逻辑,滚轮下滑自动加载下一页,双击打开 漫画详情,章节列表和评论列表 下载功能,目

1.8k Dec 31, 2022
Make your BSC transaction simple.

bsc_trade_history Make your BSC transaction simple. 中文ReadMe Background: inspired by debank ,Practice my hands on this small project Blog:Crypto-BscTr

foolisheddy 7 Jul 06, 2022
WebApp served by OAK PoE device to visualize various streams, metadata and AI results

DepthAI PoE WebApp | Bootstrap 4 & Vue.js SPA Dashboard Based on dashmin (https:

Luxonis 6 Apr 09, 2022
I'm doing Genuary, an aritifiacilly generated month to build code that make beautiful things

Genuary 2022 I'm doing Genuary, an aritifiacilly generated month to build code that make beautiful things. Every day there is a new prompt for making

Joaquín Feltes 1 Jan 10, 2022
在原神中使用围栏绘图

yuanshen_draw 在原神中使用围栏绘图 文件说明 toLines.py 将一张图片转换为对应的线条集合,视频可以按帧转换。 draw.py 在原神家园里绘制一张线条图。 draw_video.py 在原神家园里绘制视频(自动按帧摆放,截图(win)并回收) cat_to_video.py

14 Oct 08, 2022
Numerical methods for ordinary differential equations: Euler, Improved Euler, Runge-Kutta.

Numerical methods Numerical methods for ordinary differential equations are methods used to find numerical approximations to the solutions of ordinary

Aleksey Korshuk 5 Apr 29, 2022
A Python wrapper of Neighbor Retrieval Visualizer (NeRV)

PyNeRV A Python wrapper of the dimensionality reduction algorithm Neighbor Retrieval Visualizer (NeRV) Compile Set up the paths in Makefile then make.

2 Aug 29, 2021
Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Tomás Capretto 93 Dec 28, 2022
Alternative layout visualizer for ZSA Moonlander keyboard

General info This is a keyboard layout visualizer for ZSA Moonlander keyboard (because I didn't find their Oryx or their training tool particularly us

10 Jul 19, 2022
mysql relation charts

sqlcharts 自动生成数据库关联关系图 复制settings.py.example 重命名为settings.py 将数据库配置信息填入settings.DATABASE,目前支持mysql和postgresql 执行 python build.py -b,-b是读取数据库表结构,如果只更新匹

6 Aug 22, 2022
Flow-based visual scripting for Python

A simple visual node editor for Python Ryven combines flow-based visual scripting with Python. It gives you absolute freedom for your nodes and a simp

Leon Thomm 3.1k Jan 06, 2023
Colormaps for astronomers

cmastro: colormaps for astronomers 🔭 This package contains custom colormaps that have been used in various astronomical applications, similar to cmoc

Adrian Price-Whelan 12 Oct 11, 2022
This is my favourite function - the Rastrigin function.

This is my favourite function - the Rastrigin function. What sparked my curiosity and interest in the function was its complexity in terms of many local optimum points, which makes it particularly in

1 Dec 27, 2021
Curvipy - The Python package for visualizing curves and linear transformations in a super simple way

Curvipy - The Python package for visualizing curves and linear transformations in a super simple way

Dylan Tintenfich 55 Dec 28, 2022
Uniform Manifold Approximation and Projection

UMAP Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, bu

Leland McInnes 6k Jan 08, 2023
Python package for the analysis and visualisation of finite-difference fields.

discretisedfield Marijan Beg1,2, Martin Lang2, Samuel Holt3, Ryan A. Pepper4, Hans Fangohr2,5,6 1 Department of Earth Science and Engineering, Imperia

ubermag 12 Dec 14, 2022