Which Apple Keeps Which Doctor Away? Colorful Word Representations with Visual Oracles

Overview

AppleLM

Which Apple Keeps Which Doctor Away? Colorful Word Representations with Visual Oracles (TASLP 2022)

Setup

This implementation is based on Transformers.

Preparation

  1. Download GLUE datasets

    The datasets can be downloaded automatically. Please refer to https://github.com/nyu-mll/GLUE-baselines

    git clone https://github.com/nyu-mll/GLUE-baselines.git
    python download_glue_data.py --data_dir glue_data --tasks all
    

    It is recommended to put the folder glue_data to data/. The architecture looks like:

    AppleLM
    └───data
    │   └───glue_data
    │       │   CoLA/
    │       │   MRPC/
    │       │   ...
    
  2. Visual Features

    Pre-extracted visual features can be downloaded from Google Drive borrowed from the repo Multi30K.

    The features are used in image embedding layer for indexing. Extract train-resnet50-avgpool.npy and put it in the data/ folder.

Training & Evaluate

export GLUE_DIR=data/glue_data/
export CUDA_VISIBLE_DEVICES="0"
export TASK_NAME=CoLA
python ./examples/run_glue_visual-tfidf_att.py \
    --model_type bert \
    --model_name_or_path bert-large-uncased-whole-word-masking \
    --task_name $TASK_NAME \
    --do_eval \
    --do_lower_case \
    --data_dir $GLUE_DIR/$TASK_NAME \
    --max_seq_length 128 \
    --per_gpu_eval_batch_size=32   \
    --per_gpu_train_batch_size=16   \
    --learning_rate 1e-5 \
    --eval_all_checkpoints \
    --save_steps 500 \
    --max_steps 5336 \
    --warmup_steps 320 \
    --image_dir data/train.lc.norm.tok.en \
    --image_embedding_file data/train-resnet50-avgpool.npy \
    --num_img 3 \
    --tfidf 5 \
    --image_merge att-gate \
    --stopwords_dir data/stopwords-en.txt \
    --output_dir experiments/CoLA_bert_wwm

Reference

Please kindly cite this paper in your publications if it helps your research:

@ARTICLE{zhang2022which,
  author={Zhang, Zhuosheng and Yu, Haojie and Zhao, Hai and Utiyama, Masao},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing}, 
  title={Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles}, 
  year={2022},
  volume={30},
  number={},
  pages={49-59},
  doi={10.1109/TASLP.2021.3130972}
}
Owner
Zhuosheng Zhang
Ph.D. student @ Shanghai Jiao Tong University. NLP/AI/ML.
Zhuosheng Zhang
An open collection of annotated voices in Japanese language

声庭 (Koniwa): オープンな日本語音声とアノテーションのコレクション Koniwa (声庭): An open collection of annotated voices in Japanese language 概要 Koniwa(声庭)は利用・修正・再配布が自由でオープンな音声とアノテ

Koniwa project 32 Dec 14, 2022
An end to end ASR Transformer model training repo

END TO END ASR TRANSFORMER 本项目基于transformer 6*encoder+6*decoder的基本结构构造的端到端的语音识别系统 Model Instructions 1.数据准备: 自行下载数据,遵循文件结构如下: ├── data │ ├── train │

旷视天元 MegEngine 10 Jul 19, 2022
profile tools for pytorch nn models

nnprof Introduction nnprof is a profile tool for pytorch neural networks. Features multi profile mode: nnprof support 4 profile mode: Layer level, Ope

Feng Wang 42 Jul 09, 2022
Fast, DB Backed pretrained word embeddings for natural language processing.

Embeddings Embeddings is a python package that provides pretrained word embeddings for natural language processing and machine learning. Instead of lo

Victor Zhong 212 Nov 21, 2022
TextFlint is a multilingual robustness evaluation platform for natural language processing tasks,

TextFlint is a multilingual robustness evaluation platform for natural language processing tasks, which unifies general text transformation, task-specific transformation, adversarial attack, sub-popu

TextFlint 587 Dec 20, 2022
Pytorch implementation of Tacotron

Tacotron-pytorch A pytorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model. Requirements Install python 3 Install pytorc

soobin seo 203 Dec 02, 2022
A method for cleaning and classifying text using transformers.

NLP Translation and Classification The repository contains a method for classifying and cleaning text using NLP transformers. Overview The input data

Ray Chamidullin 0 Nov 15, 2022
Voilà turns Jupyter notebooks into standalone web applications

Rendering of live Jupyter notebooks with interactive widgets. Introduction Voilà turns Jupyter notebooks into standalone web applications. Unlike the

Voilà Dashboards 4.5k Jan 03, 2023
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python) 日本語は以下に続きます (Japanese follows) English: This book is written in Japanese and primaril

Ryuichi Yamamoto 189 Dec 29, 2022
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Pretrained Language Model This repository provides the latest pretrained language models and its related optimization techniques developed by Huawei N

HUAWEI Noah's Ark Lab 2.6k Jan 08, 2023
DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.

DeepAmandine This is an artificial intelligence based on GPT-3 that you can chat with, it is very nice and makes a lot of jokes. We wish you a good ex

BuyWithCrypto 3 Apr 19, 2022
Official PyTorch Implementation of paper "NeLF: Neural Light-transport Field for Single Portrait View Synthesis and Relighting", EGSR 2021.

NeLF: Neural Light-transport Field for Single Portrait View Synthesis and Relighting Official PyTorch Implementation of paper "NeLF: Neural Light-tran

Ken Lin 38 Dec 26, 2022
:hot_pepper: R²SQL: "Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing." (AAAI 2021)

R²SQL The PyTorch implementation of paper Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing. (AAAI 2021) Requirement

huybery 60 Dec 31, 2022
✔👉A Centralized WebApp to Ensure Road Safety by checking on with the activities of the driver and activating label generator using NLP.

AI-For-Road-Safety Challenge hosted by Omdena Hyderabad Chapter Original Repo Link : https://github.com/OmdenaAI/omdena-india-roadsafety Final Present

Prathima Kadari 7 Nov 29, 2022
It analyze the sentiment of the user, whether it is postive or negative.

Sentiment-Analyzer-Tool It analyze the sentiment of the user, whether it is postive or negative. It uses streamlit library for creating this sentiment

Paras Patidar 18 Dec 17, 2022
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

🤗 Contributing to OpenSpeech 🤗 OpenSpeech provides reference implementations of various ASR modeling papers and three languages recipe to perform ta

Openspeech TEAM 513 Jan 03, 2023
Club chatbot

Chatbot Club chatbot Instructions to get the Chatterbot working Step 1. First make sure you are using a version of Python 3 or newer. To check your ve

5 Mar 07, 2022
TaCL: Improve BERT Pre-training with Token-aware Contrastive Learning

TaCL: Improve BERT Pre-training with Token-aware Contrastive Learning

Yixuan Su 26 Oct 17, 2022
:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want

deepset 6.4k Jan 09, 2023