plotly scatterplots which show molecule images on hover!

Overview

molplotly

Plotly scatterplots which show molecule images on hovering over the datapoints!

Beautiful :)

Required packages:

➡️ See example.ipynb for an example :)

📜 Usage

import pandas as pd
import plotly.express as px

import molplotly

# load a DataFrame with smiles
df_esol = pd.read_csv('esol.csv')
df_esol['y_pred'] = df_esol['ESOL predicted log solubility in mols per litre']
df_esol['y_true'] = df_esol['measured log solubility in mols per litre']

# generate a scatter plot
fig = px.scatter(df_esol, x="y_true", y="y_pred")

# add molecules to the plotly graph - returns a Dash app
app = molplotly.add_molecules(fig=fig, 
                            df=df_esol, 
                            smiles_col='smiles', 
                            title_col='Compound ID', 
                            )

# run Dash app inline in notebook (or in an external server)
app.run_server(mode='inline', port=8011, height=1000)

Input parameters

  • fig : plotly.graph_objects.Figure object
    a plotly figure object containing datapoints plotted from df
  • df : pandas.DataFrame object
    a pandas dataframe that contains the data plotted in fig
  • smiles_col : str, optional
    name of the column in df containing the smiles plotted in fig (default 'SMILES')
  • show_img : bool, optional
    whether or not to generate the molecule image in the dash app (default True)
  • title_col : str, optional
    name of the column in df to be used as the title entry in the hover box (default None)
  • show_coords : bool, optional
    whether or not to show the coordinates of the data point in the hover box (default True)
  • caption_cols : list, optional
    list of column names in df to be included in the hover box (default None)
  • condition_col : str, optional
    name of the column in df that is used to color the datapoints in df - necessary when there is discrete conditional coloring (default None)
  • wrap : bool, optional
    whether or not to wrap the title text to multiple lines if the length of the text is too long (default True)
  • wraplen : int, optional
    the threshold length of the title text before wrapping begins - adjust when changing the width of the hover box (default 20)
  • width : int, optional
    the width in pixels of the hover box (default 150)
  • fontfamily : str, optional
    the font family used in the hover box (default 'Arial')
  • fontsize : int, optional
    the font size used in the hover box - the font of the title line is fontsize+2 (default 12)

Output parameters

by default a JupyterDash app is returned which can be run inline in a jupyter notebook or deployed on a server via app.run_server()

Acknowledgements

Features to-add:

  1. Individual styles for each caption (fonts, colors etc)
  2. Some way to save the plot
  3. Highlight points by clicking on them
  4. SVG image generation
Comments
  • Erro with dash 2.3.0

    Erro with dash 2.3.0

    Hello,

    First of all thanks for this great package.

    Today, I got the following error while importing molplotly whereas I previously had no issue: ImportError: cannot import name 'Input' from 'dash'

    As my version of dash was a bit old 1.6 I believe, I upgraded it to version 2.3.0 via pip. The import error disappeared but I had another error when trying to run the server "app.run_server(mode='inline', port=8003, height=800)": AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

    I have managed to get around by downgrading dash to version 2.0.0 as recommended here https://stackoverflow.com/questions/70908709/jupyterdash-app-run-server-error-using-jupyter-notebook, but there may be something to look into...

    Thanks again

    opened by remseven 4
  • Pip install error

    Pip install error

    Great idea this package! I however ran into an issue during installation: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1822: character maps to <undefined> This is certainly due to unusal characters in the readme, since I could install the package locally by editing the readme file. Cheers!

    opened by ArnaudGaudry 4
  • Dependency versions for pip install

    Dependency versions for pip install

    Is there some reason for the very specific version requirements for the dependencies? For the pip install I would loosen this up unless there are any particular known issues.

    opened by kjelljorner 3
  • Setup testing + CI

    Setup testing + CI

    Best not to wait too long to start testing. Here's a basic scaffold to get started.

    Probably shouldn't be merged as is but y'all can push more commits here to get this in shape.

    opened by janosh 3
  • Plotting in a running Dash app

    Plotting in a running Dash app

    Hi! I have a small dash app that I use to explore the molecules present in different samples. It is possible to select the samples of interest and then display a plotly scatter plot generated using the structures. I tried to add the molplotly layer on the scatter plot but no molecules are displayed. Any experience on that? Thanks!

    opened by ArnaudGaudry 2
  • Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Hi there, thanks for providing a great and easy to use tool!

    This issue is reproducible with the first example in the documentation:

    df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]
    fig_scatter = px.scatter(df_esol,
                             x="y_true",
                             y="y_pred",
                             color='delY',
                             marker='Minimum Degree', # <- addition
                             title='ESOL Regression (default plotly)',
                             labels={'y_pred': 'Predicted Solubility',
                                     'y_true': 'Measured Solubility',
                                     'delY': 'ΔY'},
                             width=1200,
                             height=800)
    
    # This adds a dashed line for what a perfect model _should_ predict
    y = df_esol["y_true"].values
    fig_scatter.add_shape(
        type="line", line=dict(dash='dash'),
        x0=y.min(), y0=y.min(),
        x1=y.max(), y1=y.max()
    )
    
    fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')
    
    app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                          df=df_esol,
                                          smiles_col='smiles',
                                          title_col='Compound ID',
                                          color_col='delY' # <- addition
                                          )
    
    # change the arguments here to run the dash app on an external server and/or change the size of the app!
    app_scatter.run_server(mode='inline', port=8001, height=1000)
    
    

    This returns

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
        hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
    )
        111             df_curve = df[df[color_col] ==
        112                           curve_dict[curve_num]].reset_index(drop=True)
    --> 113             df_row = df_curve.iloc[num]
            df_row = undefined
            df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
            num = 960
        114         else:
        115             df_row = df.iloc[num]
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960
    )
        929 
        930             maybe_callable = com.apply_if_callable(key, self.obj)
    --> 931             return self._getitem_axis(maybe_callable, axis=axis)
            self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            maybe_callable = 960
            axis = 0
        932 
        933     def _is_scalar_access(self, key: tuple):
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1564 
       1565             # validate the location
    -> 1566             self._validate_integer(key, axis)
            self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            key = 960
            axis = 0
       1567 
       1568             return self.obj._ixs(key, axis=axis)
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1498         len_axis = len(self.obj._get_axis(axis))
       1499         if key >= len_axis or key < -len_axis:
    -> 1500             raise IndexError("single positional indexer is out-of-bounds")
            global IndexError = undefined
       1501 
       1502     # -------------------------------------------------------------------
    
    IndexError: single positional indexer is out-of-bounds
    

    Using either only marker or color alone causes no issues with the hoverbox. Also, using Minimum Degree as color_col for add_molecules when both color and symbol are defined gives no issues.

    opened by chertianser 2
  • Removed support for python 3.7

    Removed support for python 3.7

    Thanks for the great library, really useful!

    You pinned the pandas version to ~=1.4.1 which effectively cuts of users that use python<3.7. See the pandas release notes: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html

    Is this intentional? Which novel features from pandas >1.4.0 are strictly necessary to keep the package running? I'll create a PR with a relaxed pandas requirements that works fine for me in a python3.7 env.

    opened by jannisborn 1
  • Input params as table

    Input params as table

    I find list of parameters easier to parse in a table than as list. Here's what that would look like. If you think it's an improvement, could consider joining type and default columns to save horizontal space.

    After

    Screen Shot 2022-03-02 at 09 37 16

    Before

    Screen Shot 2022-03-02 at 09 39 13

    opened by janosh 1
  • Rokas slider

    Rokas slider

    Implemented support for specifying multiple smiles in the smiles_col argument for molplotly.add_molecules:

    • When a single str argument is passed to smiles_col, the function behaves as before.
    • When a list is passed, a slider is created under the plot, which allows the user to decide which column to use to render molecules.

    Also changed the ports in the example.ipynb to start from 8700 and go up. On my system 8000 and 8001 were reserverd already.

    opened by RokasEl 1
  • Error when using with Plotly Subplots

    Error when using with Plotly Subplots

    When trying to use molplotly to generate hover structures with a series of scatterplots generated using make_subplots (generated using different columns of a dataframe for the same RDKit molecule row), molplotly.add_molecules returns

    ValueError: More than one plotly curve in figure - color_col and/or marker_col needs to be specified.

    As these plots are generated using different columns, rather than faceting data in a single column based on values in another, there is no common color or marker column. Is there a way to generate molecular structures for these subplots?

    opened by matthewtoholland 3
  • Matrix distance to scatter plot

    Matrix distance to scatter plot

    Hello everyone,

    My name is Judith and for my PhD studies, I would like to use your beautiful scripts. I get a distance matrix by rmsd between each pose but I don't see how to pass it to a scatter plot of 2 clusters, I tried with pandas but I'm really blocked, I can't select the lines and the columns to generate the scatter plot

    Best Regards, Judith

    opened by JudKil 7
  • Saving interactive plots

    Saving interactive plots

    Thanks for the great package!

    It would be fantastic if the interactive plots could be exported/saved. I understand that this is non-trvial in plotly, but other libraries like mpl3d also allow to export as interactive HTML or SVG. See here for an exemplary plot. Also TMAP and Faerun support this natively. I think it will be a heavily sought-after feature for real usability of this package.

    Possible solutions:

    • separate integration building on top of mpl3d (seems overkill, might be the last resort)
    • Building upon this gist to export the Dash as HTML: https://gist.github.com/exzhawk/33e5dcfc8859e3b6ff4e5269b1ba0ba4
    • Faerun-style solution, see here: https://github.com/reymond-group/faerun-python
    opened by jannisborn 2
Releases(v1.1.5)
  • v1.1.5(Nov 24, 2022)

    Added ability to plot 3D coordinates from RDKit Mol objects as highlighted in issue #20, as well as making facet plots as raised in issue #21.

    Source code(tar.gz)
    Source code(zip)
  • v1.1.4(Jun 14, 2022)

    Loosened package dependencies to address issue #18. Minimum version requirements for dash, jupyter-dash, and werkzeug are specified but everything else e.g. pandas is loosened.

    Source code(tar.gz)
    Source code(zip)
  • v1.1.3(Jun 1, 2022)

  • v1.1.2(Apr 6, 2022)

  • v1.1.1(Mar 1, 2022)

  • v1.1.0(Mar 1, 2022)

    Added features, formatting, and bug fixes :)

    • Simultaneous plotting of multiple smiles columns (pull request #1) can now be done by passing in a list of smiles columns into smiles_col (see examples/multiple_smiles_columns.ipynb for a tutorial).
    • Adjusting of hover box transparency (issue #3) can now be controlled with alpha and mol_alpha arguments (see entry in examples/simple_usage_and_formatting.ipynb for example usage).
    • Usage examples split into multiple notebooks and organised in examples folder.
    • Fixed bug (issue #6) resulting from specifying both color_col and marker_col.
    Source code(tar.gz)
    Source code(zip)
  • v1.0.1(Feb 11, 2022)

    It seems that Dash got updated to 2.1.0 without me realising and of course that broke jupyter-dash 😅 this is a hotfix to the requirements specifying that dash 2.0.0 is required.

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Feb 11, 2022)

NW 2022 Hackathon Project by Angelique Clara Hanzel, Aryan Sonik, Damien Fung, Ramit Brata Biswas

Spiral-Data-Visualizer NW 2022 Hackathon Project by Angelique Clara Hanzell, Aryan Sonik, Damien Fung, Ramit Brata Biswas Description This project vis

Damien Fung 2 Jan 16, 2022
BrowZen correlates your emotional states with the web sites you visit to give you actionable insights about how you spend your time browsing the web.

BrowZen BrowZen correlates your emotional states with the web sites you visit to give you actionable insights about how you spend your time browsing t

Nick Bild 36 Sep 28, 2022
A Python toolbox for gaining geometric insights into high-dimensional data

"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview

Contextual Dynamics Laboratory 1.8k Dec 29, 2022
Streamlit-template - A streamlit app template based on streamlit-option-menu

streamlit-template A streamlit app template for geospatial applications based on

Qiusheng Wu 41 Dec 10, 2022
The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".

The HIST framework for stock trend forecasting The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining C

Wentao Xu 111 Jan 03, 2023
Open-source demos hosted on Dash Gallery

Dash Sample Apps This repository hosts the code for over 100 open-source Dash apps written in Python or R. They can serve as a starting point for your

Plotly 2.7k Jan 07, 2023
Getting started with Python, Dash and Plot.ly for the Data Dashboards team

data_dashboards Getting started with Python, Dash and Plot.ly for the Data Dashboards team Getting started MacOS users: # Install the pyenv version ma

Department for Levelling Up, Housing and Communities 1 Nov 08, 2021
A minimalistic wrapper around PyOpenGL to save development time

glpy glpy is pyOpenGl wrapper which lets you work with pyOpenGl easily.It is not meant to be a replacement for pyOpenGl but runs on top of pyOpenGl to

Abhinav 9 Apr 02, 2022
Chem: collection of mostly python code for molecular visualization, QM/MM, FEP, etc

chem: collection of mostly python code for molecular visualization, QM/MM, FEP,

5 Sep 02, 2022
simple tool to paint axis x and y

simple tool to paint axis x and y

G705 1 Oct 21, 2021
Visualization ideas for data science

Nuance I use Nuance to curate varied visualization thoughts during my data scientist career. It is not yet a package but a list of small ideas. Welcom

Li Jiangchun 16 Nov 03, 2022
An XLSX spreadsheet renderer for Django REST Framework.

drf-renderer-xlsx provides an XLSX renderer for Django REST Framework. It uses OpenPyXL to create the spreadsheet and returns the data.

The Wharton School 166 Dec 01, 2022
A concise grammar of interactive graphics, built on Vega.

Vega-Lite Vega-Lite provides a higher-level grammar for visual analysis that generates complete Vega specifications. You can find more details, docume

Vega 4k Jan 08, 2023
Sky attention heatmap of submissions to astrometry.net

astroheat Installation Requires Python 3.6+, Tested with Python 3.9.5 Install library dependencies pip install -r requirements.txt The program require

4 Jun 20, 2022
China and India Population and GDP Visualization

China and India Population and GDP Visualization Historical Population Comparison between India and China This graph shows the population data of Indi

Nicolas De Mello 10 Oct 27, 2021
Kglab - an abstraction layer in Python for building knowledge graphs

Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, p

derwen.ai 466 Jan 09, 2023
A gui application to visualize various sorting algorithms using pure python.

Sorting Algorithm Visualizer A gui application to visualize various sorting algorithms using pure python. Language : Python 3 Libraries required Tkint

Rajarshi Banerjee 19 Nov 30, 2022
A set of useful perceptually uniform colormaps for plotting scientific data

Colorcet: Collection of perceptually uniform colormaps Build Status Coverage Latest dev release Latest release Docs What is it? Colorcet is a collecti

HoloViz 590 Dec 31, 2022
Create a visualization for Trump's Tweeted Words Using Python

Data Trump's Tweeted Words This plot illustrates twitter word occurences. We already did the coding I needed for this plot, so I was very inspired to

7 Mar 27, 2022
FURY - A software library for scientific visualization in Python

Free Unified Rendering in Python A software library for scientific visualization in Python. General Information • Key Features • Installation • How to

169 Dec 21, 2022