RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

Overview

RITA: a Study on Scaling Up Generative Protein Sequence Models

GitHub license Twitter

RITA is a family of autoregressive protein models, developed by a collaboration of Lighton, the OATML group at Oxford, and the Debbie Marks Lab at Harvard.

Model #Params d_model layers lm loss uniref-100
Small 85M 768 12 2.31
Medium 300M 1024 24 2.01
Large 680M 1536 24 1.82
XLarge 1.2B 2048 24 1.70

Results

For full results see our preprint: https://arxiv.org/abs/2205.05789

Usage

Instantiate a model like so:

from transformers import AutoModel, AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("lightonai/RITA_s, trust_remote_code=True")
tokenizer = AutoTokenizer.from_pretrained("lightonai/RITA_s")

for generation we support pipelines:

from transformers import pipeline
rita_gen = pipeline('text-generation', model=model, tokenizer=tokenizer)
sequences = rita_gen("MAB", max_length=20, do_sample=True, top_k=950, repetition_penalty=1.2, 
                     num_return_sequences=2, eos_token_id=2)
for seq in sequences:
    print(f"seq: {seq['generated_text'].replace(' ', '')}")

Or see example.py

How to cite

@article{hesslow2022rita,
  title={RITA: a Study on Scaling Up Generative Protein Sequence Models},
  author={Hesslow, Daniel and Zanichelli, Niccol{\'o} and Notin, Pascal and Poli, Iacopo and Marks, Debora},
  journal={arXiv preprint arXiv:2205.05789},
  year={2022}
}
Owner
LightOn
At LightOn, we unlock Extreme-Scale Machine Intelligence. Most repos are focused on the use of photonic hardware. LightOnMuse connects to foundation models
LightOn
Code for paper "Learning to Reweight Examples for Robust Deep Learning"

learning-to-reweight-examples Code for paper Learning to Reweight Examples for Robust Deep Learning. [arxiv] Environment We tested the code on tensorf

Uber Research 261 Jan 01, 2023
Cancer metastasis detection with neural conditional random field (NCRF)

NCRF Prerequisites Data Whole slide images Annotations Patch images Model Training Testing Tissue mask Probability map Tumor localization FROC evaluat

Baidu Research 731 Jan 01, 2023
Unofficial PyTorch implementation of MobileViT.

MobileViT Overview This is a PyTorch implementation of MobileViT specified in "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Tr

Chin-Hsuan Wu 348 Dec 23, 2022
A basic implementation of Layer-wise Relevance Propagation (LRP) in PyTorch.

Layer-wise Relevance Propagation (LRP) in PyTorch Basic unsupervised implementation of Layer-wise Relevance Propagation (Bach et al., Montavon et al.)

Kai Fabi 28 Dec 26, 2022
A code implementation of AC-GC: Activation Compression with Guaranteed Convergence, in NeurIPS 2021.

Code For AC-GC: Lossy Activation Compression with Guaranteed Convergence This code is intended to be used as a supplemental material for submission to

Dave Evans 2 Nov 01, 2022
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper) (Accepted for oral presentation at ACM

Minha Kim 1 Nov 12, 2021
Action Segmentation Evaluation

Reference Action Segmentation Evaluation Code This repository contains the reference code for action segmentation evaluation. If you have a bug-fix/im

5 May 22, 2022
Picasso: A CUDA-based Library for Deep Learning over 3D Meshes

The Picasso Library is intended for complex real-world applications with large-scale surfaces, while it also performs impressively on the small-scale applications over synthetic shape manifolds. We h

97 Dec 01, 2022
Answer a series of contextually-dependent questions like they may occur in natural human-to-human conversations.

SCAI-QReCC-21 [leaderboards] [registration] [forum] [contact] [SCAI] Answer a series of contextually-dependent questions like they may occur in natura

19 Sep 28, 2022
natural image generation using ConvNets

The Eyescream Project Generating Natural Images using Neural Networks. For our research summary on this work, please read the Arxiv paper: http://arxi

Meta Archive 601 Nov 23, 2022
Machine Learning toolbox for Humans

Reproducible Experiment Platform (REP) REP is ipython-based environment for conducting data-driven research in a consistent and reproducible way. Main

Yandex 662 Nov 20, 2022
PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

SimTrans-Weak-Shot-Classification This repository contains the official PyTorch implementation of the following paper: Weak-shot Fine-grained Classifi

BCMI 60 Dec 02, 2022
A Python wrapper for Google Tesseract

Python Tesseract Python-tesseract is an optical character recognition (OCR) tool for python. That is, it will recognize and "read" the text embedded i

Matthias A Lee 4.6k Jan 05, 2023
My coursework for Machine Learning (2021 Spring) at National Taiwan University (NTU)

Machine Learning 2021 Machine Learning (NTU EE 5184, Spring 2021) Instructor: Hung-yi Lee Course Website : (https://speech.ee.ntu.edu.tw/~hylee/ml/202

100 Dec 26, 2022
TransGAN: Two Transformers Can Make One Strong GAN

[Preprint] "TransGAN: Two Transformers Can Make One Strong GAN", Yifan Jiang, Shiyu Chang, Zhangyang Wang

VITA 1.5k Jan 07, 2023
An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.

PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,

CV Lab @ Yonsei University 87 Dec 30, 2022
Code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”

GATER This repository contains the code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”. Our implementation is

Jiacheng Ye 12 Nov 24, 2022
This repository is for Competition for ML_data class

This repository is for Competition for ML_data class. Based on mmsegmentatoin,mainly using swin transformer to completed the competition.

jianlong 2 Oct 23, 2022
Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks]

Neural Architecture Search for Spiking Neural Networks Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks] (https

Intelligent Computing Lab at Yale University 28 Nov 18, 2022
Improving Compound Activity Classification via Deep Transfer and Representation Learning

Improving Compound Activity Classification via Deep Transfer and Representation Learning This repository is the official implementation of Improving C

NingLab 2 Nov 24, 2021