This repository contains the official implementation of the paper "Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis", accepted at EMNLP 2021.

Last update: Dec 26, 2022

Overview

MultiModal-InfoMax

🔥 If you are interested in other multimodal work from our DeCLaRe Lab, you are welcome to visit the clustered repository.

Introduction

MultiModal-InfoMax (MMIM) synthesizes fusion results from multimodal input through two-level mutual information (MI) maximization. We use the BA (Barber-Agakov) lower bound and contrastive predictive coding as the objective functions to be maximized. To facilitate computing the BA lower bound and the training process, we design an entropy estimation module with an associated history data memory.
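To give a feel for the contrastive predictive coding objective mentioned above, here is a minimal sketch of an InfoNCE-style loss in plain Python. It is not the code from this repository: the function names (dot, info_nce_loss, mi_lower_bound), the dot-product scoring function, and the use of in-batch negatives are all assumptions made for illustration. Minimizing this loss maximizes the quantity log(N) - loss, which is the standard InfoNCE lower bound on the mutual information between the two sets of paired vectors.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors (hypothetical scoring function)."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(queries, keys):
    """InfoNCE loss for paired vectors.

    queries[i] and keys[i] form the positive pair; every other key in the
    batch is treated as a negative for queries[i] (an assumption of this
    sketch, not necessarily how the repository samples negatives).
    """
    n = len(queries)
    total = 0.0
    for i, q in enumerate(queries):
        scores = [dot(q, k) for k in keys]
        # Numerically stable log-sum-exp over all candidate keys.
        m = max(scores)
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability of picking the true partner.
        total += log_sum - scores[i]
    return total / n


def mi_lower_bound(queries, keys):
    """InfoNCE lower bound on mutual information: log(N) - loss."""
    return math.log(len(queries)) - info_nce_loss(queries, keys)


if __name__ == "__main__":
    aligned = [[4.0, 0.0], [0.0, 4.0]]
    swapped = [[0.0, 4.0], [4.0, 0.0]]
    print("loss, matching pairs:  ", info_nce_loss(aligned, aligned))
    print("loss, mismatched pairs:", info_nce_loss(aligned, swapped))
    print("MI bound, matching pairs:", mi_lower_bound(aligned, aligned))
```

With a batch of N pairs the bound can never exceed log(N), so larger batches allow a tighter estimate. The BA lower bound used for the other level of MI maximization has a different form and is not covered by this sketch.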

Usage

Download the CMU-MOSI and CMU-MOSEI datasets from Google Drive or Baidu Disk (extraction code: g3m2) and place them under the folder Multimodal-Infomax/datasets
Set up the environment (requires conda)

conda env create -f environment.yml
conda activate MMIM

Start training

python main.py --dataset mosi --contrast

Citation

Please cite our paper if you find our work useful for your research:

@article{han2021improving,
  title={Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis},
  author={Han, Wei and Chen, Hui and Poria, Soujanya},
  journal={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2021}
}

Contact

Should you have any questions, feel free to contact me through [email protected]
