Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

Last update: Dec 01, 2022

You might also like...

Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

GVP Transformer (wip) Implementation of the GVP-Transformer, which was used in the paper Learning inverse folding from millions of predicted structure

19 May 6, 2022

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 8, 2022

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo

77.2k Jan 2, 2023

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

1 Nov 29, 2022

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

GNN_PPI Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction". Lear

2 Dec 14, 2022

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

RITA: a Study on Scaling Up Generative Protein Sequence Models RITA is a family of autoregressive protein models, developed by a collaboration of Ligh

69 Dec 22, 2022

Generative Models for Graph-Based Protein Design

Graph-Based Protein Design This repo contains code for Generative Models for Graph-Based Protein Design by John Ingraham, Vikas Garg, Regina Barzilay

159 Dec 15, 2022

7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

kaggle-hpa-2021-7th-place-solution Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle. A description of the met

8 Jul 9, 2021

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Using a predicted aligned error matrix corresponding to an AlphaFold2 model , returns a series of lists of residue indices, where each list corresponds to a set of residues clustering together into a pseudo-rigid domain.

24 Nov 23, 2022

Comments

protein bert uniref90 dataset
(discussed in discord)

after running the first step (create_uniref_db) of https://github.com/nadavbra/protein_bert I got a 24GB file "uniref_proteins_and_annotations.db" . It seems it could be useful for generate sequences for this project, sharing the links there

https://gitlab.com/rom1504/uniref data

colab to get the db and do a few queries https://colab.research.google.com/drive/1BGYEBDmD0yToLNou2T-t-QbJV5wCtIBz#scrollTo=21U3PpCp-pxr There are 135301051 records in the db, in a table looking like:

CREATE TABLE "protein_annotations" ( "index" INTEGER, "tax_id" REAL, "uniprot_name" TEXT, "go_annotations" TEXT, "flat_go_annotations" TEXT, "n_go_annotations" INTEGER, "complete_go_annotation_indices" TEXT, "n_complete_go_annotations" INTEGER );

Sample look like this:

| | index | tax_id | uniprot_name | go_annotations | flat_go_annotations | n_go_annotations | complete_go_annotation_indices | n_complete_go_annotations | |---:|--------:|-----------------:|:-----------------|:----------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------|-------------------:|:---------------------------------|----------------------------:| | 0 | 0 | 1.57204e+06 | A0A5A9P0L4_9TELE | {"GO Molecular Function": ["GO:0003755", "GO:0005524", "GO:0004672", "GO:0005509"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0003755", "GO:0004672", "GO:0005509", "GO:0005524"] | 4 | [2761, 3561, 4193, 4205] | 4 | | 1 | 1 | 648755 | UPI0016133188 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 | | 2 | 2 | 1.93059e+06 | A0A410P257_9BACT | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 | | 3 | 3 | 519421 | UPI0019403D63 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 | | 4 | 4 | 72004 | A0A6B0RPA5_9CETA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0004672", "GO:0005524"] | 2 | [3561, 4205] | 2 | | 5 | 5 | 375764 | A0A672ZWI7_9TELE | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 | | 6 | 6 | 1.41558e+06 | A0A6P7YNV3_9AMPH | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 | | 7 | 7 | 240159 | A0A4U5TZD8_COLLU | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0016021", "GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886", "GO:0016021"] | 4 | [3561, 4205, 4526, 10019] | 4 | | 8 | 8 | 146911 | UPI00074FFD9C | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 | | 9 | 9 | 260995 | A0A6P8RG40_GEOSA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 |
opened by rom1504 4

Releases(0.0.36)

0.0.36(Aug 16, 2021)

Source code(tar.gz)
Source code(zip)
0.0.35(Aug 9, 2021)

Source code(tar.gz)
Source code(zip)
0.0.34(Jul 7, 2021)

Source code(tar.gz)
Source code(zip)
0.0.33(Jul 6, 2021)

Source code(tar.gz)
Source code(zip)
0.0.32(Jul 6, 2021)

Source code(tar.gz)
Source code(zip)
0.0.29(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.28a(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.27(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.26(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.25(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.24(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.23(Jul 4, 2021)

Source code(tar.gz)
Source code(zip)
0.0.21(Jul 3, 2021)

Source code(tar.gz)
Source code(zip)
0.0.20(Jul 3, 2021)

Source code(tar.gz)
Source code(zip)
0.0.19(Jul 3, 2021)

Source code(tar.gz)
Source code(zip)
0.0.18(Jul 2, 2021)

Source code(tar.gz)
Source code(zip)
0.0.17(Jul 2, 2021)

Source code(tar.gz)
Source code(zip)
0.0.16(Jul 2, 2021)

Source code(tar.gz)
Source code(zip)
0.0.14(Jul 2, 2021)

Source code(tar.gz)
Source code(zip)
0.0.12(Jul 2, 2021)

Source code(tar.gz)
Source code(zip)
0.0.11(Jul 1, 2021)

Source code(tar.gz)
Source code(zip)
0.0.10(Jul 1, 2021)

Source code(tar.gz)
Source code(zip)
0.0.9a(Jun 30, 2021)

Source code(tar.gz)
Source code(zip)
0.0.8(Jun 30, 2021)

Source code(tar.gz)
Source code(zip)
0.0.7(Jun 29, 2021)

Source code(tar.gz)
Source code(zip)
0.0.6(Jun 29, 2021)

Source code(tar.gz)
Source code(zip)
0.0.5a(Jun 28, 2021)

Source code(tar.gz)
Source code(zip)
0.0.5(Jun 25, 2021)

Source code(tar.gz)
Source code(zip)
0.0.3a(Jun 25, 2021)

Source code(tar.gz)
Source code(zip)
0.0.2a(Jun 25, 2021)

Source code(tar.gz)
Source code(zip)

Owner

Phil Wang

Working with Attention

GitHub Repository

Code for ViTAS_Vision Transformer Architecture Search

Vision Transformer Architecture Search This repository open source the code for ViTAS: Vision Transformer Architecture Search. ViTAS aims to search fo

46 Dec 17, 2022

Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation.

Distant Supervision for Scene Graph Generation Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation. Introduction The pape

23 Dec 31, 2022

PyTorch code accompanying our paper on Maximum Entropy Generators for Energy-Based Models

Maximum Entropy Generators for Energy-Based Models All experiments have tensorboard visualizations for samples / density / train curves etc. To run th

135 Oct 27, 2022

Adaptation through prediction: multisensory active inference torque control

Adaptation through prediction: multisensory active inference torque control Submitted to IEEE Transactions on Cognitive and Developmental Systems Abst

1 Nov 07, 2022

Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust.

Subspace Adversarial Training Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust. However,

15 Sep 02, 2022

Rewrite ultralytics/yolov5 v6.0 opencv inference code based on numpy, no need to rely on pytorch

Rewrite ultralytics/yolov5 v6.0 opencv inference code based on numpy, no need to rely on pytorch; pre-processing and post-processing using numpy instead of pytroch.

21 Dec 12, 2022

Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

DALLE2 Video (wip) ** only to be built after DALLE2 image is done and replicated, and the importance of the prior network is validated ** Direct appli

105 May 15, 2022

Protect against subdomain takeover

domain-protect scans Amazon Route53 across an AWS Organization for domain records vulnerable to takeover deploy to security audit account scan your en

0 Nov 17, 2022

[ICLR 2021] Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments.

[ICLR 2021] RAPID: A Simple Approach for Exploration in Reinforcement Learning This is the Tensorflow implementation of ICLR 2021 paper Rank the Episo

48 Nov 21, 2022

A simple version for graphfpn

GraphFPN: Graph Feature Pyramid Network for Object Detection Download graph-FPN-main.zip For training , run: python train.py For test with Graph_fpn

67 Dec 25, 2022

Old Photo Restoration (Official PyTorch Implementation)

Bringing Old Photo Back to Life (CVPR 2020 oral)

11.3k Dec 30, 2022

PyTorch implementation of "Continual Learning with Deep Generative Replay", NIPS 2017

pytorch-deep-generative-replay PyTorch implementation of Continual Learning with Deep Generative Replay, NIPS 2017 Results Continual Learning on Permu

127 Dec 14, 2022

Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

SETR - Pytorch Since the original paper (Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.) has no official

112 Dec 16, 2022

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021))

PTvsBT On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) Citation Please cite a

10 Nov 25, 2022

Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

Related tags

Overview

ProGen - (wip)

Install

Usage

Training from Uniref

Todo

Citations

You might also like...

Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

Generative Models for Graph-Based Protein Design

7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Comments

protein bert uniref90 dataset

Releases(0.0.36)

0.0.36(Aug 16, 2021)

0.0.35(Aug 9, 2021)

0.0.34(Jul 7, 2021)

0.0.33(Jul 6, 2021)

0.0.32(Jul 6, 2021)

0.0.29(Jul 4, 2021)

0.0.28a(Jul 4, 2021)

0.0.27(Jul 4, 2021)

0.0.26(Jul 4, 2021)

0.0.25(Jul 4, 2021)

0.0.24(Jul 4, 2021)

0.0.23(Jul 4, 2021)

0.0.21(Jul 3, 2021)

0.0.20(Jul 3, 2021)

0.0.19(Jul 3, 2021)

0.0.18(Jul 2, 2021)

0.0.17(Jul 2, 2021)

0.0.16(Jul 2, 2021)

0.0.14(Jul 2, 2021)

0.0.12(Jul 2, 2021)

0.0.11(Jul 1, 2021)

0.0.10(Jul 1, 2021)

0.0.9a(Jun 30, 2021)

0.0.8(Jun 30, 2021)

0.0.7(Jun 29, 2021)

0.0.6(Jun 29, 2021)

0.0.5a(Jun 28, 2021)

0.0.5(Jun 25, 2021)

0.0.3a(Jun 25, 2021)

0.0.2a(Jun 25, 2021)