Unimodal Face Classification with Multimodal Training

This is a PyTorch implementation of the following paper:

Unimodal Face Classification with Multimodal Training

Wenbin Teng (Boston University), Chongyang Bai (Dartmouth College)

Abstract: We propose a Multimodal Training Unimodal Test (MTUT) framework for robust face classification, which exploits the cross-modality relationship during training and uses it to complement the imperfect single-modality input during testing. Technically, during training, the framework (1) builds both intra-modality and cross-modality autoencoders with the aid of facial attributes to learn latent embeddings as multimodal descriptors, and (2) proposes a novel multimodal embedding divergence loss to align the heterogeneous features from different modalities, which also adaptively prevents a useless modality (if any) from confusing the model. This way, the learned autoencoders can generate robust embeddings for single-modality face classification at test time. We evaluate our framework on two face classification datasets with two kinds of test input: (1) poor-condition images and (2) point clouds or 3D face meshes, when both 2D and 3D modalities are available for training.

The proposed method uses a 2D encoder and a 3D encoder to extract an embedding from each modality. The divergence between the two embeddings is minimized adaptively, guided by the classification loss. Depending on the test modality, the corresponding decoder reconstructs the 2D and 3D inputs from the feature embeddings. An overview of the proposed network is shown in the following picture:

[Figure: overview of the proposed network]
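To make the idea of adaptively minimizing the divergence between the 2D and 3D embeddings more concrete, here is a minimal PyTorch sketch. It is not the paper's implementation: `TinyEncoder`, `adaptive_divergence_loss`, the `beta` parameter, the exponential weighting, and all tensor shapes are hypothetical choices made for illustration. The only thing taken from the description above is the general principle that the classification losses decide how strongly one modality's embedding is pulled toward the other's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Hypothetical stand-in encoder: flattens the input and maps it to an embedding."""

    def __init__(self, in_features: int, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), nn.Linear(in_features, embed_dim), nn.ReLU()
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def adaptive_divergence_loss(z_a, z_b, cls_loss_a, cls_loss_b, beta=1.0):
    """Pull z_a toward z_b only when modality b currently classifies better.

    Assumed weighting (not taken from the paper):
        w = 1 - exp(-beta * max(cls_loss_a - cls_loss_b, 0))
    The target embedding and the weight are detached, so only the encoder
    that produced z_a receives gradient from this term.
    """
    gap = (cls_loss_a - cls_loss_b).detach().clamp(min=0.0)
    weight = 1.0 - torch.exp(-beta * gap)
    return weight * F.mse_loss(z_a, z_b.detach())


torch.manual_seed(0)

# Toy shapes (assumptions): 16x16 RGB images and 64-point clouds, 5 classes.
enc_2d = TinyEncoder(3 * 16 * 16)
enc_3d = TinyEncoder(64 * 3)
clf_2d = nn.Linear(32, 5)
clf_3d = nn.Linear(32, 5)

images = torch.randn(8, 3, 16, 16)
points = torch.randn(8, 64, 3)
labels = torch.randint(0, 5, (8,))

z_2d, z_3d = enc_2d(images), enc_3d(points)
loss_2d = F.cross_entropy(clf_2d(z_2d), labels)
loss_3d = F.cross_entropy(clf_3d(z_3d), labels)

# Each modality is aligned to the other only if the other one is doing better.
total = (
    loss_2d
    + loss_3d
    + adaptive_divergence_loss(z_2d, z_3d, loss_2d, loss_3d)
    + adaptive_divergence_loss(z_3d, z_2d, loss_3d, loss_2d)
)
total.backward()
```

With this weighting, a modality that currently classifies worse contributes no pull on the better one, which is one simple way to keep an uninformative modality from dragging the other embedding toward it. The reconstruction terms of the autoencoders and the facial-attribute supervision mentioned in the abstract are left out of this sketch.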


