PyTorch implementation of XRD spectral identification from the COD database

Last update: Jan 07, 2023

Overview

XRDidentifier

PyTorch implementation of XRD spectral identification from the COD database.
Details will be explained in a paper to be submitted to the NeurIPS 2021 workshop Machine Learning and the Physical Sciences (https://ml4physicalsciences.github.io/2021/).

Features

expert model

1D-CNN (1D-RegNet) + hierarchical deep metric learning (AdaCos + Angular Penalty Softmax Loss)
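As a rough illustration of the metric-learning idea, the sketch below computes cosine logits between an embedding and per-class weight vectors, then applies the fixed scale s = sqrt(2) * log(C - 1) proposed in the AdaCos paper before a softmax cross-entropy. It is a plain-Python sketch and not the code used in this repository: the function names are hypothetical, and the hierarchical part and the angular margin are left out.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_logits(embedding, class_weights):
    """Cosine similarity between one embedding and each class weight vector."""
    e = l2_normalize(embedding)
    return [sum(a * b for a, b in zip(e, l2_normalize(w))) for w in class_weights]


def adacos_fixed_scale(num_classes):
    """Fixed AdaCos scale; only meaningful for more than two classes."""
    return math.sqrt(2.0) * math.log(num_classes - 1)


def scaled_cosine_cross_entropy(embedding, class_weights, target):
    """Cross-entropy over scaled cosine logits for a single sample."""
    scale = adacos_fixed_scale(len(class_weights))
    logits = [scale * c for c in cosine_logits(embedding, class_weights)]
    largest = max(logits)  # subtract the max for numerical stability
    log_sum = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum - logits[target]
```

The loss is smallest when the embedding points in the same direction as the weight vector of its true class, which is what pulls spectra of the same material together on the unit hypersphere.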

mixture of experts

73 expert models tailored to general chemical elements, combined with a sparsely-gated layer
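The sparsely-gated idea can be sketched as follows: only the k experts with the largest gate logits receive non-zero weight, and those weights come from a softmax restricted to the selected experts. This is a simplified plain-Python sketch under stated assumptions: it omits the noise term of the original noisy top-k gating, and the function names are hypothetical rather than taken from train_moe.py.

```python
import math


def top_k_gate(gate_logits, k):
    """Return one weight per expert; all but the top-k experts get 0.0."""
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    selected = order[:k]
    largest = max(gate_logits[i] for i in selected)
    # Softmax restricted to the selected experts.
    exps = {i: math.exp(gate_logits[i] - largest) for i in selected}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(gate_logits))]


def mix_experts(gate_weights, expert_outputs):
    """Weighted sum of expert output vectors, skipping experts with zero weight."""
    mixed = [0.0] * len(expert_outputs[0])
    for weight, output in zip(gate_weights, expert_outputs):
        if weight == 0.0:
            continue  # a real implementation would not even evaluate this expert
        for j, value in enumerate(output):
            mixed[j] += weight * value
    return mixed
```

Because most gate weights are exactly zero, the cost of a forward pass grows with k rather than with the total number of experts.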

data augmentation

Physics-informed data augmentation
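The physics-informed augmentation referenced in the Citation section perturbs a pattern in ways that mimic experimental effects: a common shift of all peak positions, random scaling of peak intensities, and random elimination of peaks. A minimal sketch on a list of (2-theta, intensity) peaks is shown here; the parameter names and default ranges are assumptions chosen for illustration, not values taken from this repository.

```python
import random


def augment_peaks(peaks, rng, max_shift=0.5, scale_range=(0.5, 1.0), drop_prob=0.1):
    """Return a randomly perturbed copy of a list of (two_theta, intensity) peaks."""
    # One shift for the whole pattern.
    shift = rng.uniform(-max_shift, max_shift)
    augmented = []
    for two_theta, intensity in peaks:
        # Peak elimination: drop a peak with probability drop_prob.
        if rng.random() < drop_prob:
            continue
        # Peak scaling: each surviving peak gets its own intensity factor.
        factor = rng.uniform(scale_range[0], scale_range[1])
        augmented.append((two_theta + shift, intensity * factor))
    return augmented


# Usage: five augmented variants of one hypothetical pattern.
rng = random.Random(0)
pattern = [(21.3, 100.0), (35.8, 42.0), (43.1, 17.5)]
variants = [augment_peaks(pattern, rng) for _ in range(5)]
```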

Requirements

Python 3.6
PyTorch 1.4
pymatgen
scikit-learn

Dataset Construction

In the paper I used the ICSD dataset, but its license forbids redistributing the CIFs. I therefore describe how to construct the CIF dataset from COD instead.

1. Download CIF files from COD

Go to the COD search page, run a search, and download the resulting list of CIF URLs.
http://www.crystallography.net/cod/search.html

Then pass the URL list to the download script:

python3 download_cif_from_cod.py --input ./COD-selection.txt --output ./cif
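For readers who want to see what such a download step involves, here is a minimal sketch. It assumes the URL list contains one CIF URL per line, which may not match the actual format of COD-selection.txt, and the function names are hypothetical and not taken from download_cif_from_cod.py.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def plan_downloads(url_lines, out_dir):
    """Map each non-empty URL line to a destination path inside out_dir."""
    plan = []
    for line in url_lines:
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # skip blank lines and comment lines
        file_name = Path(urlparse(url).path).name
        plan.append((url, Path(out_dir) / file_name))
    return plan


def download_all(plan):
    """Fetch every planned file, skipping files that already exist."""
    for url, destination in plan:
        destination.parent.mkdir(parents=True, exist_ok=True)
        if not destination.exists():
            urlretrieve(url, str(destination))
```

Separating the planning from the network access makes it easy to check the file names before starting a long download.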

2. Convert CIF files into XRD spectra

First, check the CIF files (some files are broken or physically meaningless):

python3 read_cif.py --input ./cif --output ./lithium_datasets.pkl

This creates lithium_datasets.pkl.

Second, convert the checked results into an XRD spectra database:

python3 convertXRDspectra.py --input ./lithium_datasets.pkl --batch 8 --n_aug 5

This creates XRD_epoch5.pkl.
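A 1D-CNN needs every spectrum on the same fixed grid, so a list of calculated peaks has to be turned into a fixed-length vector at some point. The sketch below shows one common way to do this: each peak is broadened with a Gaussian on a uniform 2-theta grid and the result is normalized to a maximum of 1. The grid range, number of points, and peak width are assumptions for illustration only; convertXRDspectra.py may use different settings or a different peak shape.

```python
import math


def peaks_to_spectrum(peaks, two_theta_min=0.0, two_theta_max=120.0,
                      n_points=1200, fwhm=0.2):
    """Render (two_theta, intensity) peaks as a normalized fixed-length spectrum."""
    # Convert full width at half maximum to a Gaussian standard deviation.
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    step = (two_theta_max - two_theta_min) / (n_points - 1)
    spectrum = [0.0] * n_points
    for center, intensity in peaks:
        for i in range(n_points):
            x = two_theta_min + i * step
            spectrum[i] += intensity * math.exp(-0.5 * ((x - center) / sigma) ** 2)
    highest = max(spectrum)
    if highest > 0.0:
        spectrum = [value / highest for value in spectrum]
    return spectrum
```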

Train expert models

python3 train_expert.py --input ./XRD_epoch5.pkl --output learning_curve.csv --batch 16 --n_epoch 100

Output data

Trained model -> regnet1d_adacos_epoch100.pt
Learning curve -> learning_curve.csv
Correspondence between integer labels and crystal names -> material_labels.csv

Train Mixture-of-Experts model

You need both pre-trained expert models and pickled single-XRD-spectrum files.
Store the pre-trained expert models in the './pretrained' folder and the pickled files in the './pickles' folder.
The number of experts is adjusted automatically to match the number of pre-trained expert models.

python3 train_moe.py --data_path ./pickles --save_model moe.pt --batch 64 --epoch 100
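The automatic expert count described above could be derived simply by listing the checkpoint files in the pretrained folder. The following sketch assumes the checkpoints use a .pt suffix, as the expert training output does; the helper name and the file names in the demo are hypothetical, and train_moe.py may discover its experts differently.

```python
import tempfile
from pathlib import Path


def list_expert_checkpoints(pretrained_dir, suffix=".pt"):
    """Return the checkpoint files in pretrained_dir, sorted by name."""
    return sorted(
        path for path in Path(pretrained_dir).iterdir()
        if path.is_file() and path.suffix == suffix
    )


# Demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("expert_Li.pt", "expert_O.pt", "notes.txt"):
        (Path(tmp) / name).touch()
    checkpoints = list_expert_checkpoints(tmp)
    n_experts = len(checkpoints)  # one expert per checkpoint file
    names = [path.name for path in checkpoints]
```

Sorting the paths keeps the expert order stable between runs, which matters if gate outputs are indexed by expert position.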

Output data

Trained model -> moe.pt
Learning curve -> moe.csv

Citation

Papers

AdaCos: https://arxiv.org/abs/1905.00292
1D-RegNet: https://arxiv.org/abs/2008.04063
Physics-informed data augmentation: https://arxiv.org/abs/1811.08425v2
Sparsely-gated layer: https://arxiv.org/abs/1701.06538

Implementation

AdaCos: https://github.com/4uiiurz1/pytorch-adacos/blob/master/metrics.py
1D-RegNet: https://github.com/hsd1503/resnet1d
Physics-informed data augmentation: https://github.com/PV-Lab/autoXRD
Top k accuracy: https://gist.github.com/weiaicunzai/2a5ae6eac6712c70bde0630f3e76b77b
Angular Penalty Softmax Loss: https://github.com/cvqluu/Angular-Penalty-Softmax-Losses-Pytorch
Sparsely-gated layer: https://github.com/davidmrau/mixture-of-experts

Owner

Masaki Adachi
