MIST

Training MIST

TRAIN_FILE=/your/path/to/train.json
VALID_FILE=/your/path/to/valid.json
OUTPUT_DIR=/your/path/to/save_checkpoints
CACHE_DIR=/your/path/to/transformer_package_cache

MODEL_PATH=bert-base-uncased  # or models/unilm1.2-base-uncased

# squadqg 30005 steps
# squadqg 50005 steps
# xsum 600005 steps
STEPS=30005

# --wandb enables logging with wandb
python -m torch.distributed.launch --nproc_per_node=4 train.py \
  --train_file $TRAIN_FILE \
  --valid_file $VALID_FILE \
  --output_dir $OUTPUT_DIR \
  --model_type nat --model_name_or_path $MODEL_PATH \
  --do_lower_case --max_source_seq_length 464 --max_target_seq_length 48 \
  --per_gpu_train_batch_size 16 --gradient_accumulation_steps 1 \
  --learning_rate 3e-5 --num_warmup_steps 500 --num_training_steps $STEPS \
  --cache_dir $CACHE_DIR \
  --log_dir ${OUTPUT_DIR}/log \
  --keep_prob 0.0 \
  --random_prob 0.0 \
  --use_glat \
  --tqdm_miniters 100 \
  --cotrain_put_target_in_source \
  --cotrain_put_target_in_source_same_bert \
  --wandb \
  --fp16 \
  --fp16_opt_level O2
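
For reference, the effective batch size implied by the command above can be worked out with shell arithmetic. This is a minimal sketch that assumes --per_gpu_train_batch_size is the batch size of each launched process, as its name suggests; the variable names are illustrative and are not read by train.py.

```shell
# Illustrative variables mirroring the flags in the training command.
NPROC_PER_NODE=4   # --nproc_per_node
PER_GPU_BATCH=16   # --per_gpu_train_batch_size
GRAD_ACCUM=1       # --gradient_accumulation_steps

# Sequences consumed per optimizer step across all processes
# (assumption: one process per GPU, batches summed across processes).
EFFECTIVE_BATCH=$((NPROC_PER_NODE * PER_GPU_BATCH * GRAD_ACCUM))
echo "effective batch size: $EFFECTIVE_BATCH"   # prints: effective batch size: 64
```

When training on fewer GPUs, raising --gradient_accumulation_steps by the same factor keeps this product unchanged.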

Remove the --cotrain_put_target_in_source and --cotrain_put_target_in_source_same_bert flags to reproduce the results without MIST.
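
One way to switch between the two settings without editing the command by hand is to collect the two flags in a variable. The sketch below is only an illustration: USE_MIST and MIST_FLAGS are hypothetical names introduced here, not options of train.py.

```shell
# USE_MIST is a hypothetical switch: 1 keeps the MIST flags, anything else drops them.
USE_MIST=${USE_MIST:-1}

if [ "$USE_MIST" = "1" ]; then
  MIST_FLAGS="--cotrain_put_target_in_source --cotrain_put_target_in_source_same_bert"
else
  MIST_FLAGS=""
fi

# In the training command, replace the two flags with an unquoted $MIST_FLAGS
# so the shell splits it into separate arguments (or into nothing when empty).
echo "MIST flags: $MIST_FLAGS"
```

Running the script with USE_MIST=0 in the environment would then give the baseline run without MIST.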

Download UniLM

mkdir -p models/unilm1.2-base-uncased
cd models/unilm1.2-base-uncased
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased.bin -O pytorch_model.bin
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt -O vocab.txt
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-config.json -O config.json
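
After the downloads finish, a quick check that all three files are present and non-empty can catch an interrupted transfer before training starts. This is a minimal sketch meant to be run from the repository root; check_model_dir is a hypothetical helper written for this example.

```shell
# check_model_dir DIR
# Hypothetical helper: prints the number of expected files that are missing
# or empty in DIR, and names each one on stderr.
check_model_dir() {
  dir=$1
  missing=0
  for f in pytorch_model.bin vocab.txt config.json; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=$((missing + 1))
    fi
  done
  echo "$missing"
}

check_model_dir models/unilm1.2-base-uncased
```

A printed count of 0 means the directory can be passed as MODEL_PATH.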

Download datasets

JSON dataset links: squadqg, xsum, and quora

Training NAT MASS

To reproduce the NAT MASS results, see ./MASS-NAT/mass-nat.sh.

Improving Non-autoregressive Generation with Mixup Training
