Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"

Related tags

Deep Learningt-few
Overview

T-Few

This repository contains the official code for the paper: "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning".

This method outperforms in-context learning with GPT-3 and achieves state-of-the-art on "RAFT".

Setup

First, create a virtual environment for the project and install all the requirments. (We use conda to manage environments. Be sure to install and initialize conda first.)

  1. Create a virtual environment with python 3.7 conda create -n tfew python==3.7, then activate the environment conda activate tfew.
  2. Install other dependencies. pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
  3. If you plan to run SAID, then install dependencies with python src/intrinsic_said_setup.py develop. Otherwise, skip this step.

The steps above only needs to be done once. In addition, every time you start a new session, you will need to run . bin/start.sh

Run your first experiment

Once you finished setting up the environment, you can try running CUDA_VISIBLE_DEVICES=3 python -m src.pl_train -c t0.json+rte.json -k save_model=False exp_name=first_exp The outputs of this run will be saved to ${OUTPUT_PATH}/first_exp/, which is usually /t-few/exp_out/first_exp/. Here, first_exp is the experiment name, you can run more experiments with different expeirment names. The code will automatically skip finished experiments. (However, if you wish to rerun a finished experiment under the same experiment name, you will need to manually remove the corresponding files in the output directory.)

There are two ways to control an experiment.

  1. You can specify config files with -c. Multiple config files can be combined with +. (When there are conflits, config terms from the config file on the right will have greater power.) This will be convinient when you have multiple terms that forms a fixed group.
  2. You can override values with -k. This will be convinient when you need to change a small number of terms.

It is recommended to use GPUs with 40GB to train T0(3B) and 80GB to train T0

Run an array of experiments

In this project, we often need to run a large number of experiments. Here is an example bash script bin/few-shot-pretrained-3b-100k.sh to fine-tune 3B pre-trained (IA)3 on all datasets.

This should take a few hours. After that, you can use scripts/get_results_table.py to generate a csv summary.

Citation

If you find this repo helpful, welcome to cite our work:

@article{liu2020tfew,
  title={Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning},
  author={Liu, Haokun and Tam, Derek and Muqeeth, Mohammed and Mohta, Jay and Huang, Tenghao and Bansal, Mohit and Raffel, Colin},
  journal={arXiv preprint arXiv:2205.05638},
  year={2022}
}

We use the following code in our works:

@article{mahabadi2021compacter,
  title={Compacter: Efficient low-rank hypercomplex adapter layers},
  author={Mahabadi, Rabeeh Karimi and Henderson, James and Ruder, Sebastian},
  journal={arXiv preprint arXiv:2106.04647},
  year={2021}
}

@article{sung2021training,
  title={Training Neural Networks with Fixed Sparse Masks},
  author={Sung, Yi-Lin and Nair, Varun and Raffel, Colin},
  journal={arXiv preprint arXiv:2111.09839},
  year={2021}
}

@article{aghajanyan2020intrinsic,
  title={Intrinsic dimensionality explains the effectiveness of language model fine-tuning},
  author={Aghajanyan, Armen and Zettlemoyer, Luke and Gupta, Sonal},
  journal={arXiv preprint arXiv:2012.13255},
  year={2020}
}
SAAVN - Sound Adversarial Audio-Visual Navigation,ICLR2022 (In PyTorch)

SAAVN SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,IC

YinfengYu 10 Aug 30, 2022
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery (ICCV 2021)

Change is Everywhere Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery by Zhuo Zheng, Ailong Ma, Liangpei Zhang and Yanfei

Zhuo Zheng 125 Dec 13, 2022
Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we

Mathis Petrovich 248 Dec 23, 2022
Face-Recognition-based-Attendance-System - An implementation of Attendance System in python.

Face-Recognition-based-Attendance-System A real time implementation of Attendance System in python. Pre-requisites To understand the implentation of F

Muhammad Zain Ul Haque 1 Dec 31, 2021
Denoising images with Fourier Ring Correlation loss

Denoising images with Fourier Ring Correlation loss The python code accompanies the working manuscript Image quality measurements and denoising using

2 Mar 12, 2022
This is the repository for Learning to Generate Piano Music With Sustain Pedals

SusPedal-Gen This is the official repository of Learning to Generate Piano Music With Sustain Pedals Demo Page Dataset The dataset used in this projec

Joann Ching 12 Sep 02, 2022
Many Class Activation Map methods implemented in Pytorch for CNNs and Vision Transformers. Including Grad-CAM, Grad-CAM++, Score-CAM, Ablation-CAM and XGrad-CAM

Class Activation Map methods implemented in Pytorch pip install grad-cam ⭐ Tested on many Common CNN Networks and Vision Transformers. ⭐ Includes smoo

Jacob Gildenblat 6.6k Jan 06, 2023
This repo is developed for Strong Baseline For Vehicle Re-Identification in Track 2 Ai-City-2021 Challenges

A STRONG BASELINE FOR VEHICLE RE-IDENTIFICATION This paper is accepted to the IEEE Conference on Computer Vision and Pattern Recognition Workshop(CVPR

Cybercore Co. Ltd 78 Dec 29, 2022
A voice recognition assistant similar to amazon alexa, siri and google assistant.

kenyan-Siri Build an Artificial Assistant Full tutorial (video) To watch the tutorial, click on the image below Installation For windows users (run th

Alison Parker 3 Aug 19, 2022
Advbox is a toolbox to generate adversarial examples that fool neural networks in PaddlePaddle、PyTorch、Caffe2、MxNet、Keras、TensorFlow and Advbox can benchmark the robustness of machine learning models.

Advbox is a toolbox to generate adversarial examples that fool neural networks in PaddlePaddle、PyTorch、Caffe2、MxNet、Keras、TensorFlow and Advbox can benchmark the robustness of machine learning models

AdvBox 1.3k Dec 25, 2022
An Implicit Function Theorem (IFT) optimizer for bi-level optimizations

iftopt An Implicit Function Theorem (IFT) optimizer for bi-level optimizations. Requirements Python 3.7+ PyTorch 1.x Installation $ pip install git+ht

The Money Shredder Lab 2 Dec 02, 2021
Implementation of Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

advantage-weighted-regression Implementation of Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning, by Peng et al. (

Omar D. Domingues 1 Dec 02, 2021
A reimplementation of DCGAN in PyTorch

DCGAN in PyTorch A reimplementation of DCGAN in PyTorch. Although there is an abundant source of code and examples found online (as well as an officia

Diego Porres 6 Jan 08, 2022
CodeContests is a competitive programming dataset for machine-learning

CodeContests CodeContests is a competitive programming dataset for machine-learning. This dataset was used when training AlphaCode. It consists of pro

DeepMind 1.6k Jan 08, 2023
PyGCL: A PyTorch Library for Graph Contrastive Learning

PyGCL is a PyTorch-based open-source Graph Contrastive Learning (GCL) library, which features modularized GCL components from published papers, standa

PyGCL 588 Dec 31, 2022
[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

Junyong Lee 151 Dec 30, 2022
Tensorflow implementation and notebooks for Implicit Maximum Likelihood Estimation

tf-imle Tensorflow 2 and PyTorch implementation and Jupyter notebooks for Implicit Maximum Likelihood Estimation (I-MLE) proposed in the NeurIPS 2021

NEC Laboratories Europe 69 Dec 13, 2022
Neural-fractal - Create Fractals Using Complex-Valued Neural Networks!

Neural Fractal Create Fractals Using Complex-Valued Neural Networks! Home Page Features Define Dynamical Systems Using Complex-Valued Neural Networks

Amirabbas Asadi 10 Dec 17, 2022
Official repository of the paper "A Variational Approximation for Analyzing the Dynamics of Panel Data". Mixed Effect Neural ODE. UAI 2021.

Official repository of the paper (UAI 2021) "A Variational Approximation for Analyzing the Dynamics of Panel Data", Mixed Effect Neural ODE. Panel dat

Jurijs Nazarovs 7 Nov 26, 2022