MIST

Training MIST

TRAIN_FILE=/your/path/to/train.json
VALID_FILE=/your/path/to/valid.json
OUTPUT_DIR=/your/path/to/save_checkpoints
CACHE_DIR=/your/path/to/transformer_package_cache

MODEL_PATH=bert-base-uncased or models/unilm1.2-base-uncased

# squadqg 30005 steps
# squadqg 50005 steps
# xsum 600005 steps
STEPS=30005

python -m torch.distributed.launch --nproc_per_node=4 train.py\
  --train_file $TRAIN_FILE\
  --valid_file $VALID_FILE\
  --output_dir $OUTPUT_PATH\
  --model_type nat --model_name_or_path $MODEL_PATH\
  --do_lower_case --max_source_seq_length 464 --max_target_seq_length 48\
  --per_gpu_train_batch_size 16 --gradient_accumulation_steps 1\
  --learning_rate 3e-5 --num_warmup_steps 500 --num_training_steps $STEPS\
  --cache_dir $CACHE_DIR\
  --log_dir ${OUTPUT_PATH}/log\
  --keep_prob 0.0\
  --random_prob 0.0\
  --use_glat\
  --tqdm_miniters 100\
  --cotrain_put_target_in_source\ 
  --cotrain_put_target_in_source_same_bert\ 
  --wandb\ # logging with wandb
  --fp16\
  --fp16_opt_level O2

Removing the cotrain_put_target_in_source and cotrain_put_target_in_source_same_bert flags to reproduce the results without MIST.

Download Unilm

mkdir -p models/unilm1.2-base-uncased
cd models/unilm1.2-base-uncased
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased.bin -O pytorch_model.bin
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt -O vocab.txt
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-config.json -O config.json

Download datasets

Json dataset links: squadqg, xsum and quora

Training NAT MASS

To reproduce the results of NAT MASS, please refer to the ./MASS-NAT/mass-nat.sh

Improving Non-autoregressive Generation with Mixup Training

Related tags

Overview

MIST

Training MIST

Download Unilm

Download datasets

Training NAT MASS

Owner

Speech Recognition using DeepSpeech2.

Official PyTorch Implementation of Rank & Sort Loss [ICCV2021]

A full-fledged version of Pix2Seq

Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Keras Image Embeddings using Contrastive Loss

Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems

Entity-Based Knowledge Conflicts in Question Answering.

DANet for Tabular data classification/ regression.

Official source code of Fast Point Transformer, CVPR 2022

AdelaiDepth is an open source toolbox for monocular depth prediction.

Deep Anomaly Detection with Outlier Exposure (ICLR 2019)

[ICME 2021 Oral] CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

Fantasy Points Prediction and Dream Team Formation

Julia package for contraction of tensor networks, based on the sweep line algorithm outlined in the paper General tensor network decoding of 2D Pauli codes

A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation.

This is the official implementation code repository of Underwater Light Field Retention : Neural Rendering for Underwater Imaging (Accepted by CVPR Workshop2022 NTIRE)

An experiment to bait a generalized frontrunning MEV bot

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案

中文语音识别系列，读者可以借助它快速训练属于自己的中文语音识别模型，或直接使用预训练模型测试效果。