PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)

Related tags

Deep LearningD-VQA
Overview

D-VQA

We provide the PyTorch implementation for Debiased Visual Question Answering from Feature and Sample Perspectives (NeurIPS 2021).

D-VQA

Dependencies

  • Python 3.6
  • PyTorch 1.1.0
  • dependencies in requirements.txt
  • We train and evaluate all of the models based on one TITAN Xp GPU

Getting Started

Installation

  1. Clone this repository:

     git clone https://github.com/Zhiquan-Wen/D-VQA.git
     cd D-VQA
    
  2. Install PyTorch and other dependencies:

     pip install -r requirements.txt
    

Download and preprocess the data

cd data 
bash download.sh
python preprocess_features.py --input_tsv_folder xxx.tsv --output_h5 xxx.h5
python feature_preprocess.py --input_h5 xxx.h5 --output_path trainval 
python create_dictionary.py --dataroot vqacp2/
python preprocess_text.py --dataroot vqacp2/ --version v2
cd ..

Training

  • Train our model
CUDA_VISIBLE_DEVICES=0 python main.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --output saved_models_cp2/ --self_loss_weight 3 --self_loss_q 0.7
  • Train the model with 80% of the original training set
CUDA_VISIBLE_DEVICES=0 python main.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --output saved_models_cp2/ --self_loss_weight 3 --self_loss_q 0.7 --ratio 0.8 

Evaluation

  • A json file of results from the test set can be produced with:
CUDA_VISIBLE_DEVICES=0 python test.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --checkpoint_path saved_models_cp2/best_model.pth --output saved_models_cp2/result/
  • Compute detailed accuracy for each answer type:
python comput_score.py --input saved_models_cp2/result/XX.json --dataroot data/vqacp2/

Pretrained model

A well-trained model can be found here. The test results file produced by it can be found here and its performance is as follows:

Overall score: 61.91
Yes/No: 88.93 Num: 52.32 other: 50.39

Reference

If you found this code is useful, please cite the following paper:

@inproceedings{D-VQA,
  title     = {Debiased Visual Question Answering from Feature and Sample Perspectives},
  author    = {Zhiquan Wen, 
               Guanghui Xu, 
               Mingkui Tan, 
               Qingyao Wu, 
               Qi Wu},
  booktitle = {NeurIPS},
  year = {2021}
}

Acknowledgements

This repository contains code modified from SSL-VQA, thank you very much!

Besides, we thank Yaofo Chen for providing MIO library to accelerate the data loading.

Comments
  • Questions about the code

    Questions about the code

    Thank you very much for providing the code, but I still have two questions that I did not understand well.

    1. A module, BDM, is used to capture negative bias, but this module only includes a multi-layer perceptron. Then how to ensure the features captured by this multi-layer perceptron are negative bias?
    2. On the left of Figure 2 of the paper, there are no backward gradient of the question-to-answer and the vision-to-answer branches. Where did it reflect in the code?
    opened by darwann 4
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • LXMERT numbers

    LXMERT numbers

    Hi, I wish to reproduce the LXMERT(LXMERT without D-VQA) numbers reported in the paper. It would be helpful if you could provide me with a way to do this using your code. I tried using the original LXMERT code, but I am not able to get the numbers reported in your paper on the VQA-CP2 dataset.

    opened by Vaidehi99 0
  • Download trainval_36.zip error

    Download trainval_36.zip error

    Hi, thank you for your work on this.

    I keep getting a download error when downloading the trainval_36.zip file. Is there another link I can use to download this?

    Thanks in advance!

    opened by chojw 0
  • 关于box和image的对齐问题

    关于box和image的对齐问题

    您好,我将box的注释解开后,重新生成特征,然后将其绘制出来,但是明显感觉有偏差,不知道您是否可以提供一份绘图的代码。 image 下面是我的代码 def plot_rect(image, boxes): img = Image.fromarray(np.uint8(image)) draw = ImageDraw.Draw(img) for k in range(2): box = boxes[k,:] print(box) drawrect(draw, box, outline='green', width=3) img = np.asarray(img) return img def drawrect(drawcontext, xy, outline=None, width=0): x1, y1, x2, y2 = xy points = (x1, y1), (x2, y1), (x2, y2), (x1, y2), (x1, y1) drawcontext.line(points, fill=outline, width=width)

    opened by LemonQC 0
Owner
Zhiquan Wen
Zhiquan Wen
Official code repository for Continual Learning In Environments With Polynomial Mixing Times

Official code for Continual Learning In Environments With Polynomial Mixing Times Continual Learning in Environments with Polynomial Mixing Times This

Sharath Raparthy 1 Dec 19, 2021
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
It is a system used to detect bone fractures. using techniques deep learning and image processing

MohammedHussiengadalla-Intelligent-Classification-System-for-Bone-Fractures It is a system used to detect bone fractures. using techniques deep learni

Mohammed Hussien 7 Nov 11, 2022
Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

aft-pytorch Unofficial PyTorch implementation of Attention Free Transformer's layers by Zhai, et al. [abs, pdf] from Apple Inc. Installation You can i

Rishabh Anand 184 Dec 12, 2022
Contains code for Deep Kernelized Dense Geometric Matching

DKM - Deep Kernelized Dense Geometric Matching Contains code for Deep Kernelized Dense Geometric Matching We provide pretrained models and code for ev

Johan Edstedt 83 Dec 23, 2022
The implementation of "Bootstrapping Semantic Segmentation with Regional Contrast".

ReCo - Regional Contrast This repository contains the source code of ReCo and baselines from the paper, Bootstrapping Semantic Segmentation with Regio

Shikun Liu 128 Dec 30, 2022
A Python Package for Portfolio Optimization using the Critical Line Algorithm

PyCLA A Python Package for Portfolio Optimization using the Critical Line Algorithm Getting started To use PyCLA, clone the repo and install the requi

19 Oct 11, 2022
Self-supervised learning on Graph Representation Learning (node-level task)

graph_SSL Self-supervised learning on Graph Representation Learning (node-level task) How to run the code To run GRACE, sh run_GRACE.sh To run GCA, sh

Namkyeong Lee 3 Dec 31, 2021
ML powered analytics engine for outlier detection and root cause analysis.

Website • Docs • Blog • LinkedIn • Community Slack ML powered analytics engine for outlier detection and root cause analysis ✨ What is Chaos Genius? C

Chaos Genius 523 Jan 04, 2023
Unsupervised Video Interpolation using Cycle Consistency

Unsupervised Video Interpolation using Cycle Consistency Project | Paper | YouTube Unsupervised Video Interpolation using Cycle Consistency Fitsum A.

NVIDIA Corporation 100 Nov 30, 2022
This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation

This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation (Guillaume Couairon, Holger

Meta Research 31 Oct 17, 2022
Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Transfer-Learning-in-Reinforcement-Learning Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations Final Report Tra

Trung Hieu Tran 4 Oct 17, 2022
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Introduction Code and data for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning". We cons

Pan Lu 81 Dec 27, 2022
Pytorch implementation of Learning with Opponent-Learning Awareness

Pytorch implementation of Learning with Opponent-Learning Awareness using DiCE

Alexis David Jacq 82 Sep 15, 2022
AMTML-KD: Adaptive Multi-teacher Multi-level Knowledge Distillation

AMTML-KD: Adaptive Multi-teacher Multi-level Knowledge Distillation

Frank Liu 26 Oct 13, 2022
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!

Guillermo Cámbara 26 Dec 13, 2022
CLIP + VQGAN / PixelDraw

clipit Yet Another VQGAN-CLIP Codebase This started as a fork of @nerdyrodent's VQGAN-CLIP code which was based on the notebooks of @RiversWithWings a

dribnet 276 Dec 12, 2022
Neural Message Passing for Computer Vision

Neural Message Passing for Quantum Chemistry Implementation of different models of Neural Networks on graphs as explained in the article proposed by G

Pau Riba 310 Nov 07, 2022
A project to make Amazon Echo respond to sign language using your webcam

Making Alexa respond to Sign Language using Tensorflow.js Try the live demo Read the Blog Post on Tensorflow's Blog Coming Soon Watch the video This p

Abhishek Singh 444 Jan 03, 2023
PyTorch Lightning implementation of Automatic Speech Recognition

lasr Lightening Automatic Speech Recognition An MIT License ASR research library, built on PyTorch-Lightning, for developing end-to-end ASR models. In

Soohwan Kim 40 Sep 19, 2022