Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

Last update: Dec 23, 2022

Overview

naqs-for-quantum-chemistry

This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio quantum chemistry.


(a) Architecture of a neural autoregressive quantum state (NAQS)	(b) Energy surface of N₂

TL;DR

Certain parts of the notebooks relating to generating molecular data are currently not working due to updates to the underlying OpenFermion and Psi4 packages (I'll fix it!) - however the experimental results of NAQS can still be reproduced as we also provide pre-generated data in this repository.

If you don't care for now, and you just want to see it running, here are two links to notebooks that will set-up and run on Colab. Just note that Colab will not have enough memory to run experiments on the largest molecules we considered.

run_naqs.ipynb : Run individual experiments or batches of experiments, including those to recreate published results.
generate_molecular_data_and_baselines.ipynb :
1. Create the [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files required (these are provided for molecules used in the paper in the molecules directory.)
2. Solve these molecules using various canconical QC methods using Psi4.

Overview

Quantum chemistry with neural networks

A grand challenge of ab-inito quantum chemistry (QC) is to solve the many-body Schrodinger equation describing interaction of heavy nuclei and orbiting electrons. Unfortunatley, this is an extremely (read, NP) hard problem, and so a significant amout of research effort has, and continues, to be directed towards numerical methods in QC. Typically, these methods work by optimising the wavefunction in a basis set of "Slater determinants". (In practice these are anti-symetterised tensor products of single-electron orbitals, but for our purposes let's not worry about the details.) Typically, the number of Slater determinants - and so the complexity of optimisation - grows exponentially with the system size, but recently machine learning (ML) has emerged as a possible tool with which to tackle this seemingly intractable scaling issue.

Translation/disclaimer: we can use ML and it has displayed some promising properties, but right now the SOTA results still belong to the established numerical methods (e.g. coupled-cluster) in practical settings.

Project summary

We follow the approach proposed by Choo et al. to map the exponentially complex system of interacting fermions to an equivilent (and still exponentially large) system of interacting qubits (see their or our paper for details). The advantage being that we can then apply neural network quantum states (NNQS) originally developed for condensed matter physics (CMP) (with distinguishable interacting particles) to the electron structure calculations (with indistinguishable electrons and fermionic anti-symettries).

This project proposes that simply applying techniques from CMP to QC will inevitably fail to take advantage of our significant a priori knowledge of molecular systems. Moreover, the stochastic optimisation of NNQS relies on repeatedly sampling the wavefunction, which can be prohibitively expensive. This project is a sandbox for trialling different NNQS, in particular an ansatz based on autoregressive neural networks that we present in the paper. The major benefits of our approach are that it:

allows for highly efficient sampling, especially of the highly asymmetric wavefunction typical found in QC,
allows for physical priors - such as conservation of electron number, overall spin and possible symettries - to be embedded into the network without sacrificing expressibility.

Getting started

In this repo

notebooks

run_naqs.ipynb : Run individual experiments or batches of experiments, including those to recreate published results.
generate_molecular_data_and_baselines.ipynb :
1. Create the [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files required (these are provided for molecules used in the paper in the molecules directory.)
2. Solve these molecules using various canconical QC methods using Psi4.

experiments

Experimental scripts, including those to reproduced published results, for NAQS and Psi4.

molecules

The molecular data required to reproduce published results.

src / src_cpp

Python and cython source code for the main codebase and fast calculations, respectively.

Running experiments

Further details are provided in the run_naqs.ipynb notebook, however the published experiments can be run using the provided batch scripts.

>>> experiments/bash/naqs/batch_train.sh 0 LiH

Here, 0 is the GPU number to use (if one is available, otherwise the CPU will be used by default) and LiH can be replaced by any folder in the molecules directory. Similarly, the experimental ablations can be run using the corresponding bash scripts.

>>> experiments/bash/naqs/batch_train_no_amp_sym.sh 0 LiH
>>> experiments/bash/naqs/batch_train_no_mask.sh 0 LiH
>>> experiments/bash/naqs/batch_train_full_mask.sh 0 LiH

Requirements

The underlying neural networks require PyTorch. The molecular systems are typically handled by OpenFermion with the backend calculations and baselines requiring and Psi4. Note that this code expects OpenFermion 0.11.0 and will need refactoring to work with newer versions. Otherwise, all other required packages - numpy, matplotlib, seaborn if you want pretty plots etc - are standard. However, to be concrete, the linked Colab notebooks will provide an environment in which the code can be run.

Reference

If you find this project or the associated paper useful, it can be cited as below.

@article{barrett2021autoregressive,
  title={Autoregressive neural-network wavefunctions for ab initio quantum chemistry},
  author={Barrett, Thomas D and Malyshev, Aleksei and Lvovsky, AI},
  journal={arXiv preprint arXiv:2109.12606},
  year={2021}
}

Comments

pip installation

Great code. It runs very smoothly and clearly outperforms the results in Choo et al. Would you consider re-engineering the code slightly to allow for a pipy installation?

opened by kastoryano 0

Releases(v1.0.0)

v1.0.0(Feb 4, 2022)

Publication code for arXiv:2109.12606.
Source code(tar.gz)
Source code(zip)

TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 6, 2022

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

NeuralTextures This is repository with inference code for paper "StylePeople: A Generative Model of Fullbody Human Avatars" (CVPR21). This code is for

Visual Understanding Lab @ Samsung AI Center Moscow

18 Oct 6, 2022

A code generator from ONNX to PyTorch code

onnx-pytorch Generating pytorch code from ONNX. Currently support onnx==1.9.0 and torch==1.8.1. Installation From PyPI pip install onnx-pytorch From

94 Jan 6, 2023

This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

KGI (Knowledge Graph Induction) for slot filling This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code fo

72 Jan 6, 2023

Convert Python 3 code to CUDA code.

Py2CUDA Convert python code to CUDA. Usage To convert a python file say named py_file.py to CUDA, run python generate_cuda.py --file py_file.py --arch

3 Jul 14, 2021

Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Transformers for variable misuse, function naming and code completion tasks The official PyTorch implementation of: Empirical Study of Transformers fo

56 Nov 15, 2022

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

This repository is a toolkit to do machine learning for programming languages. It implements tokenization, dataset preprocessing, model training and m

408 Jan 1, 2023

Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

1 Oct 26, 2021

Low-code/No-code approach for deep learning inference on devices

EzEdgeAI A concept project that uses a low-code/no-code approach to implement deep learning inference on devices. It provides a componentized framewor

7 Apr 5, 2022

Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

Related tags

Overview

naqs-for-quantum-chemistry

TL;DR

Overview

Quantum chemistry with neural networks

Project summary

Getting started

In this repo

notebooks

experiments

molecules

src / src_cpp

Running experiments

Requirements

Reference

You might also like...

TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

A code generator from ONNX to PyTorch code

This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

Convert Python 3 code to CUDA code.

Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Low-code/No-code approach for deep learning inference on devices

Comments

pip installation

Releases(v1.0.0)

v1.0.0(Feb 4, 2022)

Owner

Tom Barrett

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Semi-SDP Semi-supervised parser for semantic dependency parsing.

Disturbing Target Values for Neural Network regularization: attacking the loss layer to prevent overfitting

Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer.

Receptive Field Block Net for Accurate and Fast Object Detection, ECCV 2018

Exploring Machine Learning Models for detecting anomalous behavior in credit-card transactions. It's crucial that credit-card companies are able to recognize fraudulent activity so that customers are not charged for items they didn't purchase.

A library that can print Python objects in human readable format

[ECCV 2020] Gradient-Induced Co-Saliency Detection

Face Recognition & AI Based Smart Attendance Monitoring System.

ChainerRL is a deep reinforcement learning library built on top of Chainer.

Pixray is an image generation system

PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Implementation of Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning

Code for technical report "An Improved Baseline for Sentence-level Relation Extraction".

Video Corpus Moment Retrieval with Contrastive Learning (SIGIR 2021)

JAX + dataclasses

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

Example scripts for the detection of lanes using the ultra fast lane detection model in Tensorflow Lite.