RGN2-Replica (WIP)

To eventually become an unofficial working PyTorch implementation of RGN2, a state-of-the-art model for MSA-less protein folding, intended for cases where no evolutionary homologs are available (e.g. protein design).

Install

$ pip install rgn2-replica

To load sample dataset

from datasets import load_from_disk
ds = load_from_disk("data/ur90_small")
print(ds['train'][0])
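
The relative path above only resolves when Python is started from the repository root. A minimal sketch of a guard that fails early with a clearer message follows; the helper name `resolve_dataset_dir` and its `repo_root` parameter are hypothetical and not part of rgn2-replica or of the `datasets` library.

```python
from pathlib import Path


def resolve_dataset_dir(repo_root, rel_path="data/ur90_small"):
    """Return the absolute dataset directory, or raise if it is missing.

    Hypothetical helper for illustration; not part of rgn2-replica.
    """
    dataset_dir = (Path(repo_root) / rel_path).resolve()
    if not dataset_dir.is_dir():
        raise FileNotFoundError(
            f"Sample dataset not found at {dataset_dir}; "
            "run from the repository root or pass its path explicitly."
        )
    return dataset_dir
```

With this in place, the load call could be written as `ds = load_from_disk(str(resolve_dataset_dir(".")))`.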

To convert to pandas for exploration

df = ds['train'].to_pandas()
df.sample(5)
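
For a first look at the data, sequence lengths and residue frequencies are usually the quantities of interest. The sketch below uses only the standard library and works on any iterable of sequence strings; `summarize_sequences` is a hypothetical helper, and the column name `"sequence"` in the usage note is an assumption about the sample dataset's schema.

```python
from collections import Counter
from statistics import mean, median


def summarize_sequences(sequences):
    """Summarize lengths and residue counts for an iterable of strings."""
    # Skip empty strings and None entries.
    seqs = [s for s in sequences if s]
    if not seqs:
        raise ValueError("no non-empty sequences to summarize")
    lengths = [len(s) for s in seqs]
    residues = Counter()
    for s in seqs:
        residues.update(s)  # counts every character (residue) occurrence
    return {
        "count": len(seqs),
        "min_len": min(lengths),
        "max_len": max(lengths),
        "mean_len": mean(lengths),
        "median_len": median(lengths),
        "residue_counts": residues,
    }
```

With the DataFrame above, this might be called as `summarize_sequences(df["sequence"].tolist())`, substituting whichever column actually holds the sequences (check `df.columns` first).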

To train ProteinLM

Run the following command with default parameters

python -m scripts.lmtrainer

This starts a training run on CPU, using the sample dataset in the repository directory.

TO-DO LIST: ordered by priority

Contribute:

Hey there! New ideas are welcome: open/close issues, fork the repo and share your code with a Pull Request.

The main discussion about model development currently takes place in this Discord server, in the /self-supervised-learning channel.

Clone this project to your computer:

git clone https://github.com/EricAlcaide/pysimplechain

Please follow this guideline on open source contribution.

Citations:

@article {Chowdhury2021.08.02.454840,
    author = {Chowdhury, Ratul and Bouatta, Nazim and Biswas, Surojit and Rochereau, Charlotte and Church, George M. and Sorger, Peter K. and AlQuraishi, Mohammed},
    title = {Single-sequence protein structure prediction using language models from deep learning},
    elocation-id = {2021.08.02.454840},
    year = {2021},
    doi = {10.1101/2021.08.02.454840},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2021/08/04/2021.08.02.454840},
    eprint = {https://www.biorxiv.org/content/early/2021/08/04/2021.08.02.454840.full.pdf},
    journal = {bioRxiv}
}

@article{alquraishi_2019,
	author={AlQuraishi, Mohammed},
	title={End-to-End Differentiable Learning of Protein Structure},
	volume={8},
	DOI={10.1016/j.cels.2019.03.006},
	URL={https://www.cell.com/cell-systems/fulltext/S2405-4712(19)30076-6},
	number={4},
	journal={Cell Systems},
	year={2019},
	pages={292-301.e3}
}

Owner

Eric Alcaide
