plotly scatterplots which show molecule images on hover!

Overview

molplotly

Plotly scatterplots which show molecule images on hovering over the datapoints!

Beautiful :)

Required packages:

➡️ See example.ipynb for an example :)

📜 Usage

import pandas as pd
import plotly.express as px

import molplotly

# load a DataFrame with smiles
df_esol = pd.read_csv('esol.csv')
df_esol['y_pred'] = df_esol['ESOL predicted log solubility in mols per litre']
df_esol['y_true'] = df_esol['measured log solubility in mols per litre']

# generate a scatter plot
fig = px.scatter(df_esol, x="y_true", y="y_pred")

# add molecules to the plotly graph - returns a Dash app
app = molplotly.add_molecules(fig=fig, 
                            df=df_esol, 
                            smiles_col='smiles', 
                            title_col='Compound ID', 
                            )

# run Dash app inline in notebook (or in an external server)
app.run_server(mode='inline', port=8011, height=1000)

Input parameters

  • fig : plotly.graph_objects.Figure object
    a plotly figure object containing datapoints plotted from df
  • df : pandas.DataFrame object
    a pandas dataframe that contains the data plotted in fig
  • smiles_col : str, optional
    name of the column in df containing the smiles plotted in fig (default 'SMILES')
  • show_img : bool, optional
    whether or not to generate the molecule image in the dash app (default True)
  • title_col : str, optional
    name of the column in df to be used as the title entry in the hover box (default None)
  • show_coords : bool, optional
    whether or not to show the coordinates of the data point in the hover box (default True)
  • caption_cols : list, optional
    list of column names in df to be included in the hover box (default None)
  • condition_col : str, optional
    name of the column in df that is used to color the datapoints in df - necessary when there is discrete conditional coloring (default None)
  • wrap : bool, optional
    whether or not to wrap the title text to multiple lines if the length of the text is too long (default True)
  • wraplen : int, optional
    the threshold length of the title text before wrapping begins - adjust when changing the width of the hover box (default 20)
  • width : int, optional
    the width in pixels of the hover box (default 150)
  • fontfamily : str, optional
    the font family used in the hover box (default 'Arial')
  • fontsize : int, optional
    the font size used in the hover box - the font of the title line is fontsize+2 (default 12)

Output parameters

by default a JupyterDash app is returned which can be run inline in a jupyter notebook or deployed on a server via app.run_server()

Acknowledgements

Features to-add:

  1. Individual styles for each caption (fonts, colors etc)
  2. Some way to save the plot
  3. Highlight points by clicking on them
  4. SVG image generation
Comments
  • Erro with dash 2.3.0

    Erro with dash 2.3.0

    Hello,

    First of all thanks for this great package.

    Today, I got the following error while importing molplotly whereas I previously had no issue: ImportError: cannot import name 'Input' from 'dash'

    As my version of dash was a bit old 1.6 I believe, I upgraded it to version 2.3.0 via pip. The import error disappeared but I had another error when trying to run the server "app.run_server(mode='inline', port=8003, height=800)": AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

    I have managed to get around by downgrading dash to version 2.0.0 as recommended here https://stackoverflow.com/questions/70908709/jupyterdash-app-run-server-error-using-jupyter-notebook, but there may be something to look into...

    Thanks again

    opened by remseven 4
  • Pip install error

    Pip install error

    Great idea this package! I however ran into an issue during installation: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1822: character maps to <undefined> This is certainly due to unusal characters in the readme, since I could install the package locally by editing the readme file. Cheers!

    opened by ArnaudGaudry 4
  • Dependency versions for pip install

    Dependency versions for pip install

    Is there some reason for the very specific version requirements for the dependencies? For the pip install I would loosen this up unless there are any particular known issues.

    opened by kjelljorner 3
  • Setup testing + CI

    Setup testing + CI

    Best not to wait too long to start testing. Here's a basic scaffold to get started.

    Probably shouldn't be merged as is but y'all can push more commits here to get this in shape.

    opened by janosh 3
  • Plotting in a running Dash app

    Plotting in a running Dash app

    Hi! I have a small dash app that I use to explore the molecules present in different samples. It is possible to select the samples of interest and then display a plotly scatter plot generated using the structures. I tried to add the molplotly layer on the scatter plot but no molecules are displayed. Any experience on that? Thanks!

    opened by ArnaudGaudry 2
  • Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Hi there, thanks for providing a great and easy to use tool!

    This issue is reproducible with the first example in the documentation:

    df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]
    fig_scatter = px.scatter(df_esol,
                             x="y_true",
                             y="y_pred",
                             color='delY',
                             marker='Minimum Degree', # <- addition
                             title='ESOL Regression (default plotly)',
                             labels={'y_pred': 'Predicted Solubility',
                                     'y_true': 'Measured Solubility',
                                     'delY': 'ΔY'},
                             width=1200,
                             height=800)
    
    # This adds a dashed line for what a perfect model _should_ predict
    y = df_esol["y_true"].values
    fig_scatter.add_shape(
        type="line", line=dict(dash='dash'),
        x0=y.min(), y0=y.min(),
        x1=y.max(), y1=y.max()
    )
    
    fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')
    
    app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                          df=df_esol,
                                          smiles_col='smiles',
                                          title_col='Compound ID',
                                          color_col='delY' # <- addition
                                          )
    
    # change the arguments here to run the dash app on an external server and/or change the size of the app!
    app_scatter.run_server(mode='inline', port=8001, height=1000)
    
    

    This returns

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
        hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
    )
        111             df_curve = df[df[color_col] ==
        112                           curve_dict[curve_num]].reset_index(drop=True)
    --> 113             df_row = df_curve.iloc[num]
            df_row = undefined
            df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
            num = 960
        114         else:
        115             df_row = df.iloc[num]
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960
    )
        929 
        930             maybe_callable = com.apply_if_callable(key, self.obj)
    --> 931             return self._getitem_axis(maybe_callable, axis=axis)
            self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            maybe_callable = 960
            axis = 0
        932 
        933     def _is_scalar_access(self, key: tuple):
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1564 
       1565             # validate the location
    -> 1566             self._validate_integer(key, axis)
            self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            key = 960
            axis = 0
       1567 
       1568             return self.obj._ixs(key, axis=axis)
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1498         len_axis = len(self.obj._get_axis(axis))
       1499         if key >= len_axis or key < -len_axis:
    -> 1500             raise IndexError("single positional indexer is out-of-bounds")
            global IndexError = undefined
       1501 
       1502     # -------------------------------------------------------------------
    
    IndexError: single positional indexer is out-of-bounds
    

    Using either only marker or color alone causes no issues with the hoverbox. Also, using Minimum Degree as color_col for add_molecules when both color and symbol are defined gives no issues.

    opened by chertianser 2
  • Removed support for python 3.7

    Removed support for python 3.7

    Thanks for the great library, really useful!

    You pinned the pandas version to ~=1.4.1 which effectively cuts of users that use python<3.7. See the pandas release notes: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html

    Is this intentional? Which novel features from pandas >1.4.0 are strictly necessary to keep the package running? I'll create a PR with a relaxed pandas requirements that works fine for me in a python3.7 env.

    opened by jannisborn 1
  • Input params as table

    Input params as table

    I find list of parameters easier to parse in a table than as list. Here's what that would look like. If you think it's an improvement, could consider joining type and default columns to save horizontal space.

    After

    Screen Shot 2022-03-02 at 09 37 16

    Before

    Screen Shot 2022-03-02 at 09 39 13

    opened by janosh 1
  • Rokas slider

    Rokas slider

    Implemented support for specifying multiple smiles in the smiles_col argument for molplotly.add_molecules:

    • When a single str argument is passed to smiles_col, the function behaves as before.
    • When a list is passed, a slider is created under the plot, which allows the user to decide which column to use to render molecules.

    Also changed the ports in the example.ipynb to start from 8700 and go up. On my system 8000 and 8001 were reserverd already.

    opened by RokasEl 1
  • Error when using with Plotly Subplots

    Error when using with Plotly Subplots

    When trying to use molplotly to generate hover structures with a series of scatterplots generated using make_subplots (generated using different columns of a dataframe for the same RDKit molecule row), molplotly.add_molecules returns

    ValueError: More than one plotly curve in figure - color_col and/or marker_col needs to be specified.

    As these plots are generated using different columns, rather than faceting data in a single column based on values in another, there is no common color or marker column. Is there a way to generate molecular structures for these subplots?

    opened by matthewtoholland 3
  • Matrix distance to scatter plot

    Matrix distance to scatter plot

    Hello everyone,

    My name is Judith and for my PhD studies, I would like to use your beautiful scripts. I get a distance matrix by rmsd between each pose but I don't see how to pass it to a scatter plot of 2 clusters, I tried with pandas but I'm really blocked, I can't select the lines and the columns to generate the scatter plot

    Best Regards, Judith

    opened by JudKil 7
  • Saving interactive plots

    Saving interactive plots

    Thanks for the great package!

    It would be fantastic if the interactive plots could be exported/saved. I understand that this is non-trvial in plotly, but other libraries like mpl3d also allow to export as interactive HTML or SVG. See here for an exemplary plot. Also TMAP and Faerun support this natively. I think it will be a heavily sought-after feature for real usability of this package.

    Possible solutions:

    • separate integration building on top of mpl3d (seems overkill, might be the last resort)
    • Building upon this gist to export the Dash as HTML: https://gist.github.com/exzhawk/33e5dcfc8859e3b6ff4e5269b1ba0ba4
    • Faerun-style solution, see here: https://github.com/reymond-group/faerun-python
    opened by jannisborn 2
Releases(v1.1.5)
  • v1.1.5(Nov 24, 2022)

    Added ability to plot 3D coordinates from RDKit Mol objects as highlighted in issue #20, as well as making facet plots as raised in issue #21.

    Source code(tar.gz)
    Source code(zip)
  • v1.1.4(Jun 14, 2022)

    Loosened package dependencies to address issue #18. Minimum version requirements for dash, jupyter-dash, and werkzeug are specified but everything else e.g. pandas is loosened.

    Source code(tar.gz)
    Source code(zip)
  • v1.1.3(Jun 1, 2022)

  • v1.1.2(Apr 6, 2022)

  • v1.1.1(Mar 1, 2022)

  • v1.1.0(Mar 1, 2022)

    Added features, formatting, and bug fixes :)

    • Simultaneous plotting of multiple smiles columns (pull request #1) can now be done by passing in a list of smiles columns into smiles_col (see examples/multiple_smiles_columns.ipynb for a tutorial).
    • Adjusting of hover box transparency (issue #3) can now be controlled with alpha and mol_alpha arguments (see entry in examples/simple_usage_and_formatting.ipynb for example usage).
    • Usage examples split into multiple notebooks and organised in examples folder.
    • Fixed bug (issue #6) resulting from specifying both color_col and marker_col.
    Source code(tar.gz)
    Source code(zip)
  • v1.0.1(Feb 11, 2022)

    It seems that Dash got updated to 2.1.0 without me realising and of course that broke jupyter-dash 😅 this is a hotfix to the requirements specifying that dash 2.0.0 is required.

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Feb 11, 2022)

Set of matplotlib operations that are not trivial

Matplotlib Snippets This repository contains a set of matplotlib operations that are not trivial. Histograms Histogram with bins adapted to log scale

Raphael Meudec 1 Nov 15, 2021
Minimal Ethereum fee data viewer for the terminal, contained in a single python script.

Minimal Ethereum fee data viewer for the terminal, contained in a single python script. Connects to your node and displays some metrics in real-time.

48 Dec 05, 2022
Import, visualize, and analyze SpiderFoot OSINT data in Neo4j, a graph database

SpiderFoot Neo4j Tools Import, visualize, and analyze SpiderFoot OSINT data in Neo4j, a graph database Step 1: Installation NOTE: This installs the sf

Black Lantern Security 42 Dec 26, 2022
Create a table with row explanations, column headers, using matplotlib

Create a table with row explanations, column headers, using matplotlib. Intended usage was a small table containing a custom heatmap.

4 Aug 14, 2022
Data visualization electromagnetic spectrum

Datenvisualisierung-Elektromagnetischen-Spektrum Anhand des Moduls matplotlib sollen die Daten des elektromagnetischen Spektrums dargestellt werden. D

Pulsar 1 Sep 01, 2022
patchwork for matplotlib

patchworklib patchwork for matplotlib test code Preparation of example plots import seaborn as sns import numpy as np import pandas as pd #Bri

Mori Hideto 185 Jan 06, 2023
Python package for the analysis and visualisation of finite-difference fields.

discretisedfield Marijan Beg1,2, Martin Lang2, Samuel Holt3, Ryan A. Pepper4, Hans Fangohr2,5,6 1 Department of Earth Science and Engineering, Imperia

ubermag 12 Dec 14, 2022
Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook environment as an extension.

Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook environment as an extension.

Visual Python 564 Jan 03, 2023
Create SVG drawings from vector geodata files (SHP, geojson, etc).

SVGIS Create SVG drawings from vector geodata files (SHP, geojson, etc). SVGIS is great for: creating small multiples, combining lots of datasets in a

Neil Freeman 78 Dec 09, 2022
Generate "Jupiter" plots for circular genomes

jupiter Generate "Jupiter" plots for circular genomes Description Python scripts to generate plots from ViennaRNA output. Written in "pidgin" python w

Robert Edgar 2 Nov 29, 2021
This is a learning tool and exploration app made using the Dash interactive Python framework developed by Plotly

Support Vector Machine (SVM) Explorer This app has been moved here. This repo is likely outdated and will not be updated. This is a learning tool and

Plotly 150 Nov 03, 2022
SummVis is an interactive visualization tool for text summarization.

SummVis is an interactive visualization tool for analyzing abstractive summarization model outputs and datasets.

Robustness Gym 246 Dec 08, 2022
Make scripted visualizations in blender

Scripted visualizations in blender The goal of this project is to script 3D scientific visualizations using blender. To achieve this, we aim to bring

Praneeth Namburi 10 Jun 01, 2022
Learn Data Science with focus on adding value with the most efficient tech stack.

DataScienceWithPython Get started with Data Science with Python An engaging journey to become a Data Scientist with Python TL;DR Download all Jupyter

Learn Python with Rune 110 Dec 22, 2022
Official Matplotlib cheat sheets

Official Matplotlib cheat sheets

Matplotlib Developers 6.7k Jan 09, 2023
DrawBot lets you draw images taken from the internet on Skribbl.io, Gartic Phone and Paint

DrawBot You don't speak french? No worries, english translation is over here. C'est quoi ? DrawBot est un logiciel codé par V2F qui va prendre possess

V2F 205 Jan 01, 2023
Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from time series data.

ts2vg: Time series to visibility graphs The Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from t

Carlos Bergillos 26 Dec 17, 2022
This component provides a wrapper to display SHAP plots in Streamlit.

streamlit-shap This component provides a wrapper to display SHAP plots in Streamlit.

Snehan Kekre 30 Dec 10, 2022
Because trello only have payed options to generate a RunUp chart, this solves that!

Trello Runup Chart Generator The basic concept of the project is that Corello is pay-to-use and want to use Trello To-Do/Doing/Done automation with gi

Rômulo Schiavon 1 Dec 21, 2021
649 Pokémon palettes as CSVs, with a Python lib to turn names/IDs into palettes, or MatPlotLib compatible ListedColormaps.

PokePalette 649 Pokémon, broken down into CSVs of their RGB colour palettes. Complete with a Python library to convert names or Pokédex IDs into eithe

11 Dec 05, 2022