DOP-Tuning(Domain-Oriented Prefix-tuning model)

Overview

DOP-Tuning

DOP-Tuning(Domain-Oriented Prefix-tuning model)代码基于Prefix-Tuning改进.

Files

├── seq2seq                       # Code for encoder-decoder architecture
│   ├── train_bart.py             # high-level scripts to train.
│   ├── prefixTuning.py           # code that implements prefix-tuning.
│   ├── finetune.py               # training code (contains data loading, model loading, and calls trainer)   
│   ├── lightning_base.py         # helper code
│   ├── utils.py                  # helper code
│   └── callbacks.py              # helper code
└── ...

To run the code for encoder-decoder architecture like BART, the code is in seq2seq. This corresponds to the summarization experiments in the paper.

Setup:

cd transformer; pip install -e .

Train via DOP-tuning

cd seq2seq; 

python train_bart.py --mode xsum --preseqlen 200 --do_train yes --fp16 yes --bsz 16  --epoch 30  --gradient_accumulation_step 3 --learning_rate 0.00005  --mid_dim 800 --use_lowdata_token 'yes' --lowdata_token 'summarize'

其中use_lowdata_token表示是否采用domain word初始化的方式;lowdata_token表示传入的domain word.

Decode

cd seq2seq; 

python train_bart.py --mode xsum --do_train no --prefix_model_path {checkpoint_path} --preseqlen {same as training} --mid_dim {same as training} --use_lowdata_token 'yes' --lowdata_token 'summarize'
Owner
Andrew Zeng
Andrew Zeng
Andrew Zeng
A free website that keeps the people informed about housing and evictions.

Eviction Tracker Currently helping verify detainer warrant data for middle Tennessee - via Middle TN DSA - Red Door Collective Features Presents data

Red Door Collective 7 Dec 14, 2022
Validate UC alumni identifier numbers with Python 3.

UC number validator Validate UC alumni identifier numbers with Python 3. Getting started Install the library with: pip install -U ucnumber Usage from

Open Source eUC 1 Jul 07, 2021
All kinds of programs are accepted here, raise a genuine PR, and claim a PR, Make 4 successful PR's and get the Stickers and T-Shirt from hacktoberfest 2021

this repository is excluded from hacktoberfest Hacktoberfest-2021 This repository aims to help code beginners with their first successful pull request

34 Sep 11, 2022
A python script that fetches the grades of a student from a WAEC result in pdf format.

About waec-result-analyzer A python script that fetches the grades of a student from a WAEC result in pdf format. Built for federal government college

Oshodi Kolapo 2 Dec 04, 2021
Script to check if your Bistromatic handle everything as it should.

Bistromatic Checker Script to check if your Bistromatic handle everything as it should. The bistromatic is the project marking the end of the CPool at

Mathias 1 Dec 27, 2021
Hook and simulate global keyboard events on Windows and Linux.

keyboard Take full control of your keyboard with this small Python library. Hook global events, register hotkeys, simulate key presses and much more.

BoppreH 3.2k Jan 01, 2023
Binjago - Set of tools aiding in analysis of stripped Golang binaries with Binary Ninja

Binjago 🥷 Set of tools aiding in analysis of stripped Golang binaries with Bina

W3ndige 2 Jul 23, 2022
SysCFG R/W Utility written in Swift

MagicCFG SysCFG R/W Utility written in Swift MagicCFG is one of our first, successful applications that we launched last year. The app makes it possib

Jan Fabel 82 Aug 08, 2022
PDX Code Guild Full Stack Python Bootcamp starting 2022/02/28

Class Liger Rough Timeline Weeks 1, 2, 3, 4: Python Weeks 5, 6, 7, 8: HTML/CSS/Flask Weeks 9, 10, 11: Javascript Weeks 12, 13, 14, 15: Django Weeks 16

PDX Code Guild 5 Jul 05, 2022
Fix Eitaa Messenger's Font Problem on Linux

Fix Eitaa Messenger's Font Problem on Linux

6 Oct 15, 2022
A basic DIY-project made using Python and MySQL

Banking-Using-Python-MySQL This is a basic DIY-project made using Python and MySQL. Pre-Requisite needed:-- MySQL command Line:- creating a database

ABHISHEK 0 Jul 03, 2022
LTGen provides classic algorithms used in Language Theory.

LTGen LTGen stands for Language Theory GENerator and provides tools to implement language theory. Command Line LTGen is a collection of tools to imple

Hugues Cassé 1 Jan 07, 2022
Rock-paper-scissors basic game in terminal with Python

piedra-papel-tijera Juego básico de piedra, papel o tijera en terminal con Python. El juego incluye: Nombre de jugador Número de veces a jugar Resulta

Isaías Flores 1 Dec 14, 2021
solsim is the Solana complex systems simulator. It simulates behavior of dynamical systems—DeFi protocols, DAO governance, cryptocurrencies, and more—built on the Solana blockchain

solsim is the Solana complex systems simulator. It simulates behavior of dynamical systems—DeFi protocols, DAO governance, cryptocurrencies, and more—built on the Solana blockchain

William Wolf 12 Jul 13, 2022
An unofficial opensource Pokemon cursor theme for Windows and Linux.

pokemon-cursor An unofficial opensource Pokemon cursor theme for Windows and Linux. Cursor Sizes 22 24 28 32 40 48 56 64 72 80 88 96 Colors Quick inst

Kaiz Khatri 72 Dec 26, 2022
Tindicators is a Python library to calculate the values of various technical indicators

Tindicators is a Python library to calculate the values of various technical indicators

omar 3 Mar 03, 2022
Automated moth pictures for biodiversity research

Automated moth pictures for biodiversity research

Ludwig Kürzinger 1 Dec 16, 2021
Sardana integration into the Jupyter ecosystem.

sardana-jupyter Sardana integration into the Jupyter ecosystem.

Marc Espín 1 Dec 23, 2021
A Bot Which Can generate Random Account Based On Your Hits.

AccountGenBot This Bot Can Generate Account With Hits You Save (Randomly) Keyfeatures Join To Use Support Limit Account Generation Using Sql Customiza

DevsExpo 30 Oct 21, 2022