PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)

Last update: Dec 22, 2022

Related tags

Overview

D-VQA

We provide the PyTorch implementation for Debiased Visual Question Answering from Feature and Sample Perspectives (NeurIPS 2021).

Dependencies

Python 3.6
PyTorch 1.1.0
dependencies in requirements.txt
We train and evaluate all of the models based on one TITAN Xp GPU

Getting Started

Installation

Clone this repository:

 git clone https://github.com/Zhiquan-Wen/D-VQA.git
 cd D-VQA

Install PyTorch and other dependencies:
```
 pip install -r requirements.txt
```

Download and preprocess the data

cd data 
bash download.sh
python preprocess_features.py --input_tsv_folder xxx.tsv --output_h5 xxx.h5
python feature_preprocess.py --input_h5 xxx.h5 --output_path trainval 
python create_dictionary.py --dataroot vqacp2/
python preprocess_text.py --dataroot vqacp2/ --version v2
cd ..

Training

Train our model

CUDA_VISIBLE_DEVICES=0 python main.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --output saved_models_cp2/ --self_loss_weight 3 --self_loss_q 0.7

Train the model with 80% of the original training set

CUDA_VISIBLE_DEVICES=0 python main.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --output saved_models_cp2/ --self_loss_weight 3 --self_loss_q 0.7 --ratio 0.8

Evaluation

A json file of results from the test set can be produced with:

CUDA_VISIBLE_DEVICES=0 python test.py --dataroot data/vqacp2/ --img_root data/coco/trainval_features --checkpoint_path saved_models_cp2/best_model.pth --output saved_models_cp2/result/

Compute detailed accuracy for each answer type:

python comput_score.py --input saved_models_cp2/result/XX.json --dataroot data/vqacp2/

Pretrained model

A well-trained model can be found here. The test results file produced by it can be found here and its performance is as follows:

Overall score: 61.91
Yes/No: 88.93 Num: 52.32 other: 50.39

Reference

If you found this code is useful, please cite the following paper:

@inproceedings{D-VQA,
  title     = {Debiased Visual Question Answering from Feature and Sample Perspectives},
  author    = {Zhiquan Wen, 
               Guanghui Xu, 
               Mingkui Tan, 
               Qingyao Wu, 
               Qi Wu},
  booktitle = {NeurIPS},
  year = {2021}
}

Acknowledgements

This repository contains code modified from SSL-VQA, thank you very much!

Besides, we thank Yaofo Chen for providing MIO library to accelerate the data loading.

Comments

Questions about the code
Thank you very much for providing the code, but I still have two questions that I did not understand well.

A module, BDM, is used to capture negative bias, but this module only includes a multi-layer perceptron. Then how to ensure the features captured by this multi-layer perceptron are negative bias?

On the left of Figure 2 of the paper, there are no backward gradient of the question-to-answer and the vision-to-answer branches. Where did it reflect in the code?
opened by darwann 4
CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

opened by TrellixVulnTeam 0
LXMERT numbers

Hi, I wish to reproduce the LXMERT(LXMERT without D-VQA) numbers reported in the paper. It would be helpful if you could provide me with a way to do this using your code. I tried using the original LXMERT code, but I am not able to get the numbers reported in your paper on the VQA-CP2 dataset.

opened by Vaidehi99 0
Download trainval_36.zip error

Hi, thank you for your work on this.

I keep getting a download error when downloading the trainval_36.zip file. Is there another link I can use to download this?

Thanks in advance!

opened by chojw 0
关于box和image的对齐问题

您好，我将box的注释解开后，重新生成特征，然后将其绘制出来，但是明显感觉有偏差，不知道您是否可以提供一份绘图的代码。下面是我的代码 def plot_rect(image, boxes): img = Image.fromarray(np.uint8(image)) draw = ImageDraw.Draw(img) for k in range(2): box = boxes[k,:] print(box) drawrect(draw, box, outline='green', width=3) img = np.asarray(img) return img def drawrect(drawcontext, xy, outline=None, width=0): x1, y1, x2, y2 = xy points = (x1, y1), (x2, y1), (x2, y2), (x1, y2), (x1, y1) drawcontext.line(points, fill=outline, width=width)

opened by LemonQC 0

Releases(Results_LXMERT)

Results_LXMERT(Oct 26, 2021)

Source code(tar.gz)
Source code(zip)
test_best_epoch0.json(9.66 MB)
Models_LXMERT(Oct 26, 2021)

Source code(tar.gz)
Source code(zip)
BEST.pth(842.49 MB)
LXMERT_pretrained_model(Oct 25, 2021)

Source code(tar.gz)
Source code(zip)
Epoch20_LXRT.pth(870.06 MB)
Results(Oct 10, 2021)

Source code(tar.gz)
Source code(zip)
61.91_results.json(9.65 MB)
Models(Oct 10, 2021)

Source code(tar.gz)
Source code(zip)
61.91.pth(620.44 MB)

Owner

Zhiquan Wen

GitHub Repository

OpenMMLab Detection Toolbox and Benchmark

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

22.5k Jan 05, 2023

API for RL algorithm design & testing of BCA (Building Control Agent) HVAC on EnergyPlus building energy simulator by wrapping their EMS Python API

RL - EmsPy (work In Progress...) The EmsPy Python package was made to facilitate Reinforcement Learning (RL) algorithm research for developing and tes

20 Jan 05, 2023

Official repository for the NeurIPS 2021 paper Get Fooled for the Right Reason: Improving Adversarial Robustness through a Teacher-guided curriculum Learning Approach

Get Fooled for the Right Reason Official repository for the NeurIPS 2021 paper Get Fooled for the Right Reason: Improving Adversarial Robustness throu

1 Apr 25, 2022

Genetic feature selection module for scikit-learn

sklearn-genetic Genetic feature selection module for scikit-learn Genetic algorithms mimic the process of natural selection to search for optimal valu

260 Dec 14, 2022

Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.

Updates (2020/06/21) Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training. Pyr

1.3k Jan 04, 2023

source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

71 Nov 15, 2022

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or

71 Jan 07, 2023

CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

CLIP-GEN [简体中文][English] 本项目在萤火二号集群上用 PyTorch 实现了论文《CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP》。 CLIP-GEN 是一个 Language-F

75 Dec 29, 2022

Source code for the Paper: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints}

CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints Installation Run pipenv install (at your own risk with --skip-lo

65 Dec 27, 2022

Binary Stochastic Neurons in PyTorch

Binary Stochastic Neurons in PyTorch http://r2rt.com/binary-stochastic-neurons-in-tensorflow.html https://github.com/pytorch/examples/tree/master/mnis

54 Nov 21, 2022

Fully Convlutional Neural Networks for state-of-the-art time series classification

Deep Learning for Time Series Classification As the simplest type of time series data, univariate time series provides a reasonably good starting poin

572 Dec 23, 2022

Low-code/No-code approach for deep learning inference on devices

EzEdgeAI A concept project that uses a low-code/no-code approach to implement deep learning inference on devices. It provides a componentized framewor

7 Apr 05, 2022

Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation Created by Zeyu HU Introduction This work is based on our paper VMNet: Voxel-Mes

82 Dec 27, 2022

PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo

6 Dec 23, 2022

PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)

Related tags

Overview

D-VQA

Dependencies

Getting Started

Installation

Download and preprocess the data

Training

Evaluation

Pretrained model

Reference

Acknowledgements

Comments

Questions about the code

CVE-2007-4559 Patch

Patching CVE-2007-4559

LXMERT numbers

Download trainval_36.zip error

关于box和image的对齐问题

Releases(Results_LXMERT)

Results_LXMERT(Oct 26, 2021)

Models_LXMERT(Oct 26, 2021)

LXMERT_pretrained_model(Oct 25, 2021)

Results(Oct 10, 2021)

Models(Oct 10, 2021)