Decoding the Protein-ligand Interactions Using Parallel Graph Neural Networks

Overview

Decoding the Protein-ligand Interactions Using Parallel Graph Neural Networks

Requirements

  • python 0.10+
  • rdkit 2020.03.3.0
  • biopython 1.78
  • openbabel 2.4.1
  • numpy 1.19.2
  • scipy 1.5.2
  • torchvision 0.7.0

Conda enviroment is highly recommended for this implementation

Data Preparation for classification models

Data preperation requires the ligand and protein to be in a mol format readable by rdkit .mol, .mol2, and .pdb are readily handled by rdkit .sdf is easily handled with openbabel conversion, made convenient with the pybel wrapper

Both files can then be fed into extractM2.py where the cropping window can be adjusted on line 29 The extract method will operates best if the initial protein file is in pdbqt format. For easy model integration it is best to store the m2 protein window produced by the extract script along with the original protein ex: pickle.dump((m1,m2), file)

Once cropped complexes are stored, their numpy featurization files can be created. Files for the different models are labeled in the Data_Prep directory

The scripts are designed to use keys that reference the cropped and stored pairs from the previous step. Users will need to alter scripts to include their desired directories, as well as key traversal. Once these changes have been made, the scripts can be called with

python -W ignore gnn[f/p]_data_prep.py

Data Preparation for regression models

The data needs to be in mol format as similar to classification models. We have provided some sample mol files representing protein and ligand. Here the protein is cropped at 8Å window using the extract script as mentioned previously.

The cropped protein-ligand can be used to create features in numpy format. Sample training and test keys along with the corresponding pIC50 and experimental-binding-affinity (EBA) labels are provided in keys folder. All the files are saved in pickle format with train and test keys as list and the label files as disctionary with key corresponding to the train/test key and value corresponding to the label. The prepare_eba_data.py and prepapre_pic50_data.py uses the cropped protein-ligand mol files to create the correspnding features for the model and save them in compressed numpy file format in the corresponding numpy directory.

These scripts can be called as:

python repare_pic50_data.py <path to pkl-mol directory> <path to save numpy features>
python repare_eba_data.py <path to pkl-mol directory> <path to save numpy features>

Training

Below is an example of the training command. Additional options can be added to the argument parser here (learning rate, layer amount and dimension, etc). Defaults are in place for undeclared parameters including a save directory.

Classfication models

python -W ignore -u train.py --dropout_rate=0.3 --epoch=500 --ngpu=1 --batch_size=32 --num_workers=0  --train_keys=<your_training_keys.pkl>  --test_keys=<your_test_keys.pkl>

Regression models

python -W ignore -u train.py --dropout_rate=0.3 --epoch=500 --ngpu=1 --batch_size=1 --num_workers=0 --data_dir=<path to feature-numpy folder> --train_keys=<your_training_keys.pkl>  --test_keys=<your_test_keys.pkl>

The save directory stores each epoch as a .pt allowing the best model inatance to be loaded later on Training and test metrics such as loss and ROC are stored in the same directory for each GPU used. Ex 3 GPUS: log-rank1.csv, log-rank2.csv, and log-rank3.csv

Owner
Neeraj Kumar
Computational Biology/Chemistry and Bioinformatics.
Neeraj Kumar
Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Space-Time Correspondence as a Contrastive Random Walk This is the repository for Space-Time Correspondence as a Contrastive Random Walk, published at

A. Jabri 239 Dec 27, 2022
This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

Code Repository for the Paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (To appear in: Proceedings of NeurIPS20

1 Oct 03, 2022
A Light in the Dark: Deep Learning Practices for Industrial Computer Vision

A Light in the Dark: Deep Learning Practices for Industrial Computer Vision This is the repository for our Paper/Contribution to the WI2022 in Nürnber

Maximilian Harl 6 Jan 17, 2022
Code release to accompany paper "Geometry-Aware Gradient Algorithms for Neural Architecture Search."

Geometry-Aware Gradient Algorithms for Neural Architecture Search This repository contains the code required to run the experiments for the DARTS sear

18 May 27, 2022
Spatial Contrastive Learning for Few-Shot Classification (SCL)

This repo contains the official implementation of Spatial Contrastive Learning for Few-Shot Classification (SCL), which presents of a novel contrastive learning method applied to few-shot image class

Yassine 34 Dec 25, 2022
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

TargetCLIP- official pytorch implementation of the paper Image-Based CLIP-Guided Essence Transfer This repository finds a global direction in StyleGAN

Hila Chefer 221 Dec 13, 2022
Multi-Modal Machine Learning toolkit based on PyTorch.

简体中文 | English TorchMM 简介 多模态学习工具包 TorchMM 旨在于提供模态联合学习和跨模态学习算法模型库,为处理图片文本等多模态数据提供高效的解决方案,助力多模态学习应用落地。 近期更新 2022.1.5 发布 TorchMM 初始版本 v1.0 特性 丰富的任务场景:工具

njustkmg 1 Jan 05, 2022
TCPNet - Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition

Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition This is an implementation of TCPNet. Introduction For video recognition task, a g

Zilin Gao 21 Dec 08, 2022
Pytorch implementation of Zero-DCE++

Zero-DCE++ You can find more details here: https://li-chongyi.github.io/Proj_Zero-DCE++.html. You can find the details of our CVPR version: https://li

Chongyi Li 157 Dec 23, 2022
Score refinement for confidence-based 3D multi-object tracking

Score refinement for confidence-based 3D multi-object tracking Our video gives a brief explanation of our Method. This is the official code for the pa

Cognitive Systems Research Group 47 Dec 26, 2022
A model which classifies reviews as positive or negative.

SentiMent Analysis In this project I built a model to classify movie reviews fromn the IMDB dataset of 50K reviews. WordtoVec : Neural networks only w

Rishabh Bali 2 Feb 09, 2022
A Blender python script for getting asset browser custom preview images for objects and collections.

asset_snapshot A Blender python script for getting asset browser custom preview images for objects and collections. Installation: Click the code butto

Johnny Matthews 44 Nov 29, 2022
A certifiable defense against adversarial examples by training neural networks to be provably robust

DiffAI v3 DiffAI is a system for training neural networks to be provably robust and for proving that they are robust. The system was developed for the

SRI Lab, ETH Zurich 202 Dec 13, 2022
Deploy a ML inference service on a budget in less than 10 lines of code.

BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.

1.3k Dec 25, 2022
A boosting-based Multiple Instance Learning (MIL) package that includes MIL-Boost and MCIL-Boost

A boosting-based Multiple Instance Learning (MIL) package that includes MIL-Boost and MCIL-Boost

Jun-Yan Zhu 27 Aug 08, 2022
Checking fibonacci - Generating the Fibonacci sequence is a classic recursive problem

Fibonaaci Series Generating the Fibonacci sequence is a classic recursive proble

Moureen Caroline O 1 Feb 15, 2022
Realistic lighting in ursina!

Ursina Lighting Realistic lighting in ursina! If you want to have realistic lighting in ursina, import the UrsinaLighting.py in your project and use t

17 Jul 07, 2022
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

28 Dec 02, 2022
AITom is an open-source platform for AI driven cellular electron cryo-tomography analysis.

AITom Introduction AITom is an open-source platform for AI driven cellular electron cryo-tomography analysis. AITom is originated from the tomominer l

93 Jan 02, 2023
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

Vision Research Lab @ UCSB 26 Nov 29, 2022