Use a pretrained GROVER model to extract atomic fingerprints from molecules. The fingerprints can then be used for downstream tasks.
GROVER (Graph Representation frOm self-superVised mEssage passing tRansformer) is a Transformer-based, self-supervised message-passing neural network introduced by Rong and colleagues in the paper "Self-Supervised Graph Transformer on Large-Scale Molecular Data".
- Create and activate a conda environment:
conda create --name grover python=3.6.8
conda activate grover
- Install the requirements from the requirements.txt file. Additionally, install torchinfo:
conda install -c conda-forge -c pytorch -c acellera -c RMG --file=requirements.txt
pip install torchinfo
The original authors provide two pretrained models. Download and extract a model, and save its .pt file in models_pretrained/.
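Before running the extraction, it can help to confirm that the checkpoint landed in the right place. The following is a minimal sketch, not part of the original code: the helper name find_checkpoints is hypothetical, and it only assumes that checkpoints are files ending in .pt directly inside models_pretrained/.

```python
from pathlib import Path


def find_checkpoints(model_dir="models_pretrained"):
    """Return the .pt files found directly inside model_dir, sorted by name.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    directory = Path(model_dir)
    if not directory.is_dir():
        # A missing directory simply means no checkpoint has been saved yet.
        return []
    return sorted(directory.glob("*.pt"))


if __name__ == "__main__":
    checkpoints = find_checkpoints()
    if checkpoints:
        for path in checkpoints:
            print(f"found checkpoint: {path}")
    else:
        print("no .pt file found in models_pretrained/")
```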
Run the main.py file:
python main.py
Details about the arguments can be found in the setup_parser() function in main.py, or by running:
python main.py -h
If no arguments are specified, the default values are used.
By default, the outputs are saved in extracted_fingerprint. The outputs include 3 files:
- atom_fp.npy: contains the atomic fingerprints.
- distance.npy: contains the pair-wise shortest relative distance matrices between nodes of the molecular graphs.
- smiles.txt: contains the SMILES strings of the molecules.
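To make the contents of distance.npy more concrete, here is a small sketch of a pair-wise shortest-path matrix for one molecular graph. It assumes that "distance" means the number of bonds on the shortest path between two atoms; the repository may compute or normalise it differently. The function name shortest_path_matrix and the bond-list input are hypothetical and chosen only for illustration.

```python
from collections import deque

import numpy as np


def shortest_path_matrix(num_atoms, bonds):
    """Return a (num_atoms, num_atoms) matrix of shortest path lengths in bonds.

    bonds is a list of (i, j) atom-index pairs. Unreachable pairs are set to -1.
    Illustrative sketch only; not the repository's implementation.
    """
    adjacency = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    dist = np.full((num_atoms, num_atoms), -1, dtype=int)
    for source in range(num_atoms):
        # Breadth-first search from each atom gives shortest paths
        # because every bond counts as one step.
        dist[source, source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[source, v] == -1:
                    dist[source, v] = dist[source, u] + 1
                    queue.append(v)
    return dist


# Ethanol, SMILES "CCO": atoms 0 (C), 1 (C), 2 (O); bonds 0-1 and 1-2.
print(shortest_path_matrix(3, [(0, 1), (1, 2)]))
```

For ethanol the two end atoms are two bonds apart, so the corner entries of the matrix are 2 and the diagonal is 0.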
To read the .npy files, please refer to the numpy.save documentation.
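As a starting point, the three output files could be loaded along these lines. This is a sketch under stated assumptions: the helper name load_outputs is hypothetical, smiles.txt is assumed to hold one SMILES string per line, and allow_pickle=True is passed in case the arrays were saved as object arrays (for example, one entry per molecule with varying numbers of atoms). If the files are plain numeric arrays, the flag is unnecessary.

```python
from pathlib import Path

import numpy as np


def load_outputs(out_dir="extracted_fingerprint"):
    """Load atom_fp.npy, distance.npy and smiles.txt from out_dir.

    Hypothetical helper for illustration; not part of the repository.
    """
    out_dir = Path(out_dir)
    # Assumption: the arrays may be object arrays, which need allow_pickle=True.
    # Only enable pickling for files you generated yourself.
    atom_fp = np.load(out_dir / "atom_fp.npy", allow_pickle=True)
    distance = np.load(out_dir / "distance.npy", allow_pickle=True)
    # Assumption: one SMILES string per line; blank lines are skipped.
    text = (out_dir / "smiles.txt").read_text()
    smiles = [line.strip() for line in text.splitlines() if line.strip()]
    return atom_fp, distance, smiles


if __name__ == "__main__":
    if Path("extracted_fingerprint").is_dir():
        atom_fp, distance, smiles = load_outputs()
        print(f"{len(smiles)} molecules loaded")
        print("atom_fp:", atom_fp.shape, atom_fp.dtype)
        print("distance:", distance.shape, distance.dtype)
```

Printing the shape and dtype first is a quick way to see whether the arrays are regular numeric arrays or per-molecule object arrays before using them further.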