DeepStruc is a Conditional Variational Autoencoder that predicts the structure of a mono-metallic nanoparticle (MMNP) from a Pair Distribution Function (PDF).

Overview

ChemRxiv | [Paper] XXX

DeepStruc

Welcome to DeepStruc, a Deep Generative Model (DGM) that learns the relation between PDF and atomic structure and thereby solves a structure from a PDF!

  1. DeepStruc
  2. Getting started (with Colab)
  3. Getting started (own computer)
    1. Install requirements
    2. Simulate data
    3. Train model
    4. Predict
  4. Authors
  5. Cite
  6. Acknowledgments
  7. License

Here we apply DeepStruc to the structural analysis of a model system of mono-metallic nanoparticles (MMNPs) with seven different structure types and demonstrate the method on both simulated and experimental PDFs. DeepStruc can reconstruct simulated data with an average mean absolute error (MAE) of the atom xyz-coordinates of 0.093 ± 0.058 Å after fitting a contraction/extraction factor, an atomic displacement parameter (ADP) and a scale parameter. We demonstrate the generative capability of DeepStruc on a dataset of face-centered cubic (fcc), hexagonal close-packed (hcp) and stacking-faulted structures, where DeepStruc recognizes the stacking-faulted structures as an interpolation between fcc and hcp and constructs new structural models based on a PDF. The MAE in this example is 0.030 ± 0.019 Å.
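To make the MAE figures above concrete, the following is a minimal sketch of a mean absolute error over atom xyz-coordinates. It assumes the predicted and reference atoms are already matched in order, and it leaves out the fitting of the contraction/extraction factor, ADP and scale parameter; the function name `coordinate_mae` and the toy coordinates are hypothetical and not taken from the DeepStruc code.

```python
from statistics import mean


def coordinate_mae(predicted, reference):
    """Mean absolute error over all x, y, z components of matched atoms."""
    if len(predicted) != len(reference):
        raise ValueError("structures must contain the same number of atoms")
    errors = [
        abs(p - r)
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    ]
    return mean(errors)


# Toy two-atom structures in Å (made-up numbers, not DeepStruc output).
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
predicted = [(0.1, 0.0, 0.0), (2.0, 0.2, 0.0)]

mae = coordinate_mae(predicted, reference)
print(f"MAE = {mae:.3f} Å")  # prints "MAE = 0.050 Å"
```

The averaging used for the reported numbers may differ in detail, for example averaging per structure before averaging over the dataset.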

The MMNPs are provided as a graph-based input to the encoder of DeepStruc. We compare DeepStruc with a similar DGM without the graph-based encoder. DeepStruc is able to reconstruct the structures using a lower-dimensional latent space, and thus has better generative capability. We also compare DeepStruc with a brute-force modelling approach and a tree-based classification algorithm. The ML models are significantly faster than the brute-force approach, and DeepStruc can furthermore create a latent space from which synthetic structures can be sampled, which the tree-based method cannot! The baseline models can be found in other repositories: brute-force, MetalFinder and CVAE.
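As an illustration of what sampling from a latent space means, here is a generic variational-autoencoder sketch that draws latent vectors around a mean. It is not the DeepStruc implementation: the function `sample_latent`, its `scale` argument and the example values are assumptions, and in DeepStruc a decoder conditioned on the PDF would still have to turn each latent vector into a structure.

```python
import random


def sample_latent(mu, sigma, num_samples, scale=1.0, seed=None):
    """Draw latent vectors z = mu + scale * sigma * eps, with eps ~ N(0, 1)."""
    rng = random.Random(seed)
    return [
        [m + scale * s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(num_samples)
    ]


# Hypothetical 3-dimensional latent mean and standard deviation.
mu = [0.5, -1.0, 2.0]
sigma = [0.1, 0.1, 0.1]

samples = sample_latent(mu, sigma, num_samples=10, seed=0)
print(len(samples), len(samples[0]))  # prints "10 3"
```

A lower-dimensional latent space means fewer coordinates per sample, which makes the space easier to cover when generating new structures.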

Getting started (with Colab)

Using DeepStruc on your own PDFs is straightforward and does not require anything installed or downloaded to your computer. Follow the instructions in our Colab notebook and try to play around.

Getting started (own computer)

Follow these steps if you want to train DeepStruc and predict with it locally on your own computer.

Install requirements

See the install folder.

Simulate data

See the data folder.

Train model

To train your own DeepStruc model simply run:

python train.py

The possible arguments are listed below; run the script with the '--help' argument for additional information.
If you are interested in changing the architecture of the model, go to train.py and change the model_arch dictionary.

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data_dir | Directory containing graph training, validation and test data. | str | -d ./data/graphs |
| -s or --save_dir | Directory where models will be saved. This is also used for loading a learner. | str | -s bst_model |
| -r or --resume_model | If 'True', the save_dir model is loaded and training is continued. | bool | -r True |
| -e or --epochs | Maximum number of epochs. | int | -e 100 |
| -b or --batch_size | Number of graphs in each batch. | int | -b 20 |
| -l or --learning_rate | Learning rate. | float | -l 1e-4 |
| -B or --beta | Initial beta value for scaling the KLD. | float | -B 0.1 |
| -i or --beta_increase | Increment of beta when the threshold is met. | float | -i 0.1 |
| -x or --beta_max | Highest value beta can increase to. | float | -x 5 |
| -t or --reconstruction_th | Reconstruction threshold required before beta is increased. | float | -t 0.001 |
| -n or --num_files | Total number of files loaded. Files will be split 60/20/20. If 'None', all files are loaded. | int | -n 500 |
| -c or --compute | Train model on CPU or GPU. Choices: 'cpu', 'gpu16', 'gpu32' and 'gpu64'. | str | -c gpu32 |
| -L or --latent_dim | Number of latent space dimensions. | int | -L 3 |
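The four beta-related arguments describe a schedule for the weight on the KLD term. The sketch below shows one way such a schedule could work, using the example values from the table. It rests on assumptions: the function names are hypothetical, and whether train.py compares with `<` or `<=`, checks once per epoch, or uses the training or validation loss is not stated here.

```python
def update_beta(beta, reconstruction_loss, threshold, increase, beta_max):
    """Raise beta by `increase` once the reconstruction loss is below
    `threshold`, never exceeding `beta_max` (assumed behaviour)."""
    if reconstruction_loss < threshold:
        beta = min(beta + increase, beta_max)
    return beta


def total_loss(reconstruction_loss, kld, beta):
    """Reconstruction loss plus the beta-weighted KLD term."""
    return reconstruction_loss + beta * kld


beta = 0.1  # --beta
history = []
# Made-up reconstruction losses for four consecutive checks.
for rec_loss in [0.01, 0.002, 0.0009, 0.0008]:
    beta = update_beta(
        beta,
        rec_loss,
        threshold=0.001,  # --reconstruction_th
        increase=0.1,     # --beta_increase
        beta_max=5.0,     # --beta_max
    )
    history.append(round(beta, 2))

print(history)  # prints "[0.1, 0.1, 0.2, 0.3]"
```

Starting with a small beta lets the model first learn to reconstruct structures; raising it afterwards puts progressively more weight on regularising the latent space.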

Predict

To predict an MMNP from a PDF using DeepStruc or your own model, run:

python predict.py

The possible arguments are listed below; run the script with the '--help' argument for additional information.

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data | Path to data or data directory. If pointing to a data directory, all datasets must have the same format. | str | -d data/experimental_PDFs/JQ_S1.gr |
| -m or --model | Path to model. If 'None', a GUI will open. | str | -m ./models/DeepStruc |
| -n or --num_samples | Number of samples/structures generated for each unique PDF. | int | -n 10 |
| -s or --sigma | Sample up to '-s' sigma in the normal distribution. | float | -s 7 |
| -p or --plot_sampling | Plots sampled structures on top of DeepStruc training data. Model must be DeepStruc. | bool | -p True |
| -g or --save_path | Path to directory where predictions will be saved. | str | -g ./best_preds |
| -i or --index_plot | Highlights a specific reconstruction in the latent space. --data must be a specific file, not a directory, and '--plot True' must be set. | int | -i 4 |
| -P or --plot_data | If True, the first loaded PDF is plotted and shown after normalization. | bool | -P True |
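As a usage sketch, the snippet below assembles a predict.py call from the flags and example values in the table. The helper `build_predict_command` is hypothetical and only builds and prints the command; it assumes you would run the result from the repository root with `python` on your path.

```python
import shlex


def build_predict_command(data, model, num_samples, sigma, save_path):
    """Assemble the argv list for predict.py from the documented flags."""
    return [
        "python", "predict.py",
        "--data", data,
        "--model", model,
        "--num_samples", str(num_samples),
        "--sigma", str(sigma),
        "--save_path", save_path,
    ]


command = build_predict_command(
    data="data/experimental_PDFs/JQ_S1.gr",
    model="./models/DeepStruc",
    num_samples=10,
    sigma=7,
    save_path="./best_preds",
)
print(shlex.join(command))

# To actually run the prediction from the repository root:
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems if a path contains spaces.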

Authors

Andy S. Anker (1)
Emil T. S. Kjær (1)
Marcus N. Weng (1)
Simon J. L. Billinge (2, 3)
Raghavendra Selvan (4, 5)
Kirsten M. Ø. Jensen (1)

(1) Department of Chemistry and Nano-Science Center, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
(2) Department of Applied Physics and Applied Mathematics Science, Columbia University, New York, NY 10027, USA.
(3) Condensed Matter Physics and Materials Science Department, Brookhaven National Laboratory, Upton, NY 11973, USA.
(4) Department of Computer Science, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
(5) Department of Neuroscience, University of Copenhagen, 2200, Copenhagen N.

If you have any questions, suggestions for improvement or bug reports, please contact us on GitHub or by email: [email protected] or [email protected].

Cite

If you use our code or our results, please consider citing our papers. Thanks in advance!

@article{kjær2022DeepStruc,
title={DeepStruc: Towards structure solution from pair distribution function data using deep generative models},
author={Emil T. S. Kjær and Andy S. Anker and Marcus N. Weng and Simon J. L. Billinge and Raghavendra Selvan and Kirsten M. Ø. Jensen},
year={2022}}
@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

Acknowledgments

Our code is developed based on the following publication:

@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

License

This project is licensed under the Apache License Version 2.0, January 2004 - see the LICENSE file for details.
