An easy-to-use app to visualise attentions of various VQA models.

Last update: Nov 13, 2022

Overview

Ask Me Anything: A tool for visualising Visual Question Answering (AMA)

AMA is an easy-to-use app for visualising the attention of various VQA models. A live demo of the app is also available.

• Models
• Requirements
• Installation
• How to run
• How to use
• Contributing
• Acknowledgements

Models

• MFB - Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao
arXiv

• (Coming soon) MCAN - Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
arXiv
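To give a rough idea of what visualising attention involves, here is a minimal sketch. It is not the app's code: the function names and the grid size are hypothetical, and it assumes the model produces one raw score per image region. The scores are normalised with a softmax, arranged into a grid, and enlarged with nearest-neighbour upsampling so that a plotting library could blend the result over the image.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_grid(weights, rows, cols):
    """Arrange a flat list of region weights into a rows x cols grid."""
    if len(weights) != rows * cols:
        raise ValueError("number of weights must equal rows * cols")
    return [weights[r * cols:(r + 1) * cols] for r in range(rows)]


def upsample_nearest(grid, factor):
    """Repeat every cell factor times in both directions."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    # Hypothetical scores for a 2 x 2 grid of image regions.
    weights = softmax([0.1, 2.0, 0.3, 0.5])
    heatmap = upsample_nearest(to_grid(weights, 2, 2), 2)
    for row in heatmap:
        print(" ".join(f"{value:.2f}" for value in row))
```

The regions with the largest weights are the ones the model attended to most when answering the question.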

Requirements

Please check the requirements.txt file for the version numbers.

opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1
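If you want to confirm that an existing environment matches these pins, a small script along the following lines may help. It is a sketch, not part of the repository: the helper names `parse_pins` and `check_pins` are hypothetical, and it assumes Python 3.8 or newer, where `importlib.metadata` is available.

```python
from importlib import metadata

# The pins listed above, copied verbatim.
PINNED = """\
opencv_python==4.4.0.46
numpy==1.19.4
pandas==1.1.4
torch==1.4.0
matplotlib==3.3.2
gdown==3.12.2
seaborn==0.11.0
dotmap==1.3.23
streamlit==0.70.0
Pillow==8.0.1
PyYAML==5.3.1
"""


def parse_pins(text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if sep:
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {package: (wanted, installed)} for mismatches only.

    `installed` is None when the package is not installed.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


if __name__ == "__main__":
    for name, (wanted, installed) in check_pins(parse_pins(PINNED)).items():
        print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

The script prints nothing when every pinned package is installed at the listed version.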

Installation

Install Anaconda.
Clone this repository and cd into it:
git clone https://github.com/apugoneappu/ask_me_anything.git && cd ask_me_anything
In a new environment (new_env), install the dependencies:
pip install -r requirements.txt

How to run

From the root directory of this repository, run the following:

conda activate new_env
streamlit run main.py
In a browser tab, open the Network URL displayed in your terminal.

Done! 🎉

How to use

Contributing

First of all, thank you for wanting to contribute to this work! I will try to make your job as easy as possible. Detailed instructions are coming soon.

Acknowledgements

This repository has been built by modifying the OpenVQA repository.

I would also like to thank Yash Khandelwal, Nikhil Shah and Chinmay Singh for their support and amazing suggestions!

Huge thanks to Streamlit for making all of this possible and for Streamlit Sharing that enables free hosting of this app! ❤️


Owner

Apoorve
