PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Last update: Dec 11, 2022

Overview

2017 VQA Challenge Winner (CVPR'17 Workshop)

PyTorch implementation of "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" by Teney et al.

Prerequisites

python 3.6+
numpy
pytorch 0.4
tqdm
nltk
pandas
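
A quick way to confirm that the packages above are importable before running the scripts is sketched below. The helper name `find_missing` is hypothetical and not part of this repo; the only assumption about the packages is that PyTorch is imported as `torch`. The check does not verify versions.

```python
import importlib.util

# Import names for the prerequisites listed above (PyTorch is imported as "torch").
REQUIRED = ["numpy", "torch", "tqdm", "nltk", "pandas"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```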

Data

Preparation

To download and extract the VQA v2 dataset, GloVe embeddings, and pretrained visual features:
```
bash scripts/download_extract.sh
```
To prepare data for training:
```
python scripts/preproc.py
```

The structure of the data/ directory should look like this:

- data/
  - zips/
    - v2_XXX...zip
    - ...
    - glove...zip
    - trainval_36.zip
  - glove/
    - glove...txt
    - ...
  - v2_XXX.json
  - ...
  - trainval_resnet...tsv
  (The above are files created after executing scripts/download_extract.sh)
  - tokenizers/
    - ...
  - dict_ans.pkl
  - dict_q.pkl
  - glove_pretrained_300.npy
  - train_qa.pkl
  - val_qa.pkl
  - train_vfeats.pkl
  - val_vfeats.pkl
  (The above are files created after executing scripts/preproc.py)
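
To check that preprocessing produced every file in the listing above, a minimal sketch follows. The helper `missing_preprocessed` is hypothetical and not part of this repo; it only tests for the file names shown in the listing and assumes nothing about their contents.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = [
    "dict_ans.pkl",
    "dict_q.pkl",
    "glove_pretrained_300.npy",
    "train_qa.pkl",
    "val_qa.pkl",
    "train_vfeats.pkl",
    "val_vfeats.pkl",
]


def missing_preprocessed(data_dir="data"):
    """Return the expected preprocessed files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_preprocessed()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All preprocessed files are present.")
```

If the list is non-empty, re-running scripts/preproc.py is the likely next step.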

Train

Use default parameters:

bash scripts/train.sh

Notes

Major refactor (especially of data preprocessing), tested with PyTorch 0.4.1 and Python 3.6.
Training for 20 epochs reaches around 50% training accuracy (the model seems to be buggy in my implementation).
After all preprocessing, the data/ directory may take up 38 GB or more.
Parts of preproc.py and utils.py are based on this repo.
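
Given the disk footprint mentioned in the notes, it may be worth checking free space before downloading. The sketch below uses the standard library's `shutil.disk_usage`; the helper name `has_enough_space` and the 38 GB threshold (taken from the note above, and only approximate) are illustrative assumptions.

```python
import shutil

REQUIRED_GB = 38  # approximate size of data/ reported in the notes above


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least required_gb free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb


if __name__ == "__main__":
    if has_enough_space():
        print("Enough free disk space for data/.")
    else:
        print("Less than", REQUIRED_GB, "GB free; preprocessing may run out of space.")
```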


Owner

Mark Dong
