RGN2-Replica (WIP)

To eventually become an unofficial working PyTorch implementation of RGN2, a state-of-the-art model for MSA-less protein folding, intended for use when no evolutionary homologs are available (i.e. for protein design).

Install

$ pip install rgn2-replica

To load sample dataset

from datasets import load_from_disk
ds = load_from_disk("data/ur90_small")
print(ds['train'][0])

To convert to pandas for exploration

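# Reuses the `ds` object loaded from disk in the previous snippet
df = ds['train'].to_pandas()
print(df.sample(5))  # five random rows; print() so it also works outside a notebook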
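
For a quick look without pandas, basic summary statistics can also be computed in plain Python. The sketch below is illustrative only: the `records` list and its `sequence` key are hypothetical stand-ins for rows of the sample dataset, whose actual column names are not documented here, and `summarize` is a helper made up for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows; the real dataset's column names may differ.
records = [
    {"sequence": "MKTAYIAKQR"},
    {"sequence": "GSHMLE"},
    {"sequence": "MKKLLPTAAAGLLLLAAQPAMA"},
]


def summarize(rows, key="sequence"):
    """Return length statistics and residue counts for a list of row dicts."""
    seqs = [row[key] for row in rows]
    lengths = [len(s) for s in seqs]
    # Count every residue letter across all sequences
    composition = Counter("".join(seqs))
    return {
        "n": len(seqs),
        "min_len": min(lengths),
        "max_len": max(lengths),
        "mean_len": mean(lengths),
        "composition": composition,
    }


stats = summarize(records)
print(stats["n"], stats["min_len"], stats["max_len"])
print(stats["composition"].most_common(3))
```

If the real rows come from `ds['train']`, passing that split and the appropriate column name as `key` should work the same way, assuming each row behaves like a dict.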

To train ProteinLM

Run the following command with default parameters

python -m scripts.lmtrainer

This starts a training run on CPU, using the sample dataset in the repository directory.
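
The internals of `scripts.lmtrainer` are not described here. As a generic illustration of what preparing input for a protein language model can look like, the sketch below shows character-level tokenization followed by BERT-style random masking. Everything in it is an assumption made for illustration (the vocabulary, the `<pad>`/`<mask>` tokens, the 15% mask rate, and the `encode`/`mask_tokens` helpers); it is not the repository's actual pipeline.

```python
import random

# Assumed vocabulary: two special tokens plus the 20 standard amino acids.
SPECIAL = ["<pad>", "<mask>"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {tok: i for i, tok in enumerate(SPECIAL + list(AMINO_ACIDS))}
MASK_ID = VOCAB["<mask>"]
IGNORE = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def encode(seq):
    """Map each residue letter to its integer id (KeyError on unknown letters)."""
    return [VOCAB[aa] for aa in seq]


def mask_tokens(ids, mask_prob=0.15, rng=None):
    """Replace a random subset of tokens with <mask>.

    Returns (inputs, labels): labels hold the original id at masked
    positions and IGNORE everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in ids:
        if rng.random() < mask_prob:
            inputs.append(MASK_ID)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(IGNORE)
    return inputs, labels


ids = encode("MKTAYIAKQR")
inputs, labels = mask_tokens(ids, rng=random.Random(0))
print(inputs)
print(labels)
```

Passing a seeded `random.Random` keeps the masking reproducible, which helps when debugging a training loop; a real trainer would more likely mask whole batches of tensors at once.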

TO-DO LIST: ordered by priority

Contribute:

Hey there! New ideas are welcome: open/close issues, fork the repo and share your code with a Pull Request.

Currently, the main discussion about model development is happening in this Discord server, under the /self-supervised-learning channel.

Clone this project to your computer:

git clone https://github.com/EricAlcaide/pysimplechain

Please follow this guideline on open source contribution.

Citations:

@article {Chowdhury2021.08.02.454840,
    author = {Chowdhury, Ratul and Bouatta, Nazim and Biswas, Surojit and Rochereau, Charlotte and Church, George M. and Sorger, Peter K. and AlQuraishi, Mohammed},
    title = {Single-sequence protein structure prediction using language models from deep learning},
    elocation-id = {2021.08.02.454840},
    year = {2021},
    doi = {10.1101/2021.08.02.454840},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2021/08/04/2021.08.02.454840},
    eprint = {https://www.biorxiv.org/content/early/2021/08/04/2021.08.02.454840.full.pdf},
    journal = {bioRxiv}
}

@article{alquraishi_2019,
	author={AlQuraishi, Mohammed},
	title={End-to-End Differentiable Learning of Protein Structure},
	volume={8},
	DOI={10.1016/j.cels.2019.03.006},
	URL={https://www.cell.com/cell-systems/fulltext/S2405-4712(19)30076-6},
	number={4},
	journal={Cell Systems},
	year={2019},
	pages={292-301.e3}
}

Replication attempt for the Protein Folding Model

Owner

Eric Alcaide
