Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation

Last update: Dec 10, 2022

Our paper has been accepted to ICCV 2021.

Picture: Overview of the proposed Plug-and-Play (PnP) adaptation framework for generalizing gaze estimation to a new domain.

Picture: The proposed architecture.

Results

Input	Method	D_E→D_M	D_E→D_D	D_G→D_M	D_G→D_D
Face	Baseline	8.767	8.578	7.662	8.977
Face	Baseline + PnP-GA	5.529 ↓36.9%	5.867 ↓31.6%	6.176 ↓19.4%	7.922 ↓11.8%
Face	ResNet50	8.017	8.310	8.328	7.549
Face	ResNet50 + PnP-GA	6.000 ↓25.2%	6.172 ↓25.7%	5.739 ↓31.1%	7.042 ↓6.7%
Face	SWCNN	10.939	24.941	10.021	13.473
Face	SWCNN + PnP-GA	8.139 ↓25.6%	15.794 ↓36.7%	8.740 ↓12.8%	11.376 ↓15.6%
Face + Eye	CA-Net	--	--	21.276	30.890
Face + Eye	CA-Net + PnP-GA	--	--	17.597 ↓17.3%	16.999 ↓44.9%
Face + Eye	Dilated-Net	--	--	16.683	18.996
Face + Eye	Dilated-Net + PnP-GA	--	--	15.461 ↓7.3%	16.835 ↓11.4%
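
Each percentage in the table is the relative reduction of the adapted model's value with respect to the un-adapted row directly above it. A minimal sketch that reproduces the Baseline rows (the function name relative_reduction and the ASCII task keys are ours, chosen for illustration; they are not part of the repository):

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100.0


# Values copied from the "Baseline" and "Baseline + PnP-GA" rows of the table.
baseline = {"D_E->D_M": 8.767, "D_E->D_D": 8.578, "D_G->D_M": 7.662, "D_G->D_D": 8.977}
adapted = {"D_E->D_M": 5.529, "D_E->D_D": 5.867, "D_G->D_M": 6.176, "D_G->D_D": 7.922}

reductions = {
    task: round(relative_reduction(baseline[task], adapted[task]), 1)
    for task in baseline
}

for task, value in reductions.items():
    print(f"{task}: {value}%")
```

For the first column this gives (8.767 - 5.529) / 8.767, which rounds to 36.9%, matching the table entry.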

This repository contains the official PyTorch implementation of the following paper:

Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation
Yunfei Liu, Ruicong Liu, Haofei Wang, Feng Lu

Abstract: Deep neural networks have significantly improved appearance-based gaze estimation accuracy. However, it still suffers from unsatisfactory performance when generalizing the trained model to new domains, e.g., unseen environments or persons. In this paper, we propose a plug-and-play gaze adaptation framework (PnP-GA), which is an ensemble of networks that learn collaboratively with the guidance of outliers. Since our proposed framework does not require ground-truth labels in the target domain, the existing gaze estimation networks can be directly plugged into PnP-GA and generalize the algorithms to new domains. We test PnP-GA on four gaze domain adaptation tasks, ETH-to-MPII, ETH-to-EyeDiap, Gaze360-to-MPII, and Gaze360-to-EyeDiap. The experimental results demonstrate that the PnP-GA framework achieves considerable performance improvements of 36.9%, 31.6%, 19.4%, and 11.8% over the baseline system. The proposed framework also outperforms the state-of-the-art domain adaptation approaches on gaze domain adaptation tasks.

Resources

Material related to our paper is available via the following links:

Paper: https://arxiv.org/abs/2107.13780
Project: https://liuyunfei.net/publication/iccv2021_pnp-ga/
Code: https://github.com/DreamtaleCore/PnP-GA

System requirements

Only Linux has been tested; Windows support is still being tested.
64-bit Python 3.6 installation.

Playing with pre-trained networks and training

Config

Edit config.yaml first, in particular the xxx/image, xxx/label, and xxx_pretrains parameters.

xxx/image is the path to the label file.

xxx/root is the root directory of the image files.

xxx_pretrains is the path to the pretrained models.

An example label file is provided in the data folder. Each line of the label file is structured as:

p00/face/1.jpg 0.2558059438789034,-0.05467275933864655 -0.05843388117618364,0.46745964684693614 ... ...

Here, our code reads the image data from os.path.join(xxx/root, "p00/face/1.jpg") and reads the ground-truth gaze direction labels from the remaining fields of the line.
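
The line layout described above can be read with a few lines of Python. This is only a sketch under stated assumptions, not the loader used in this repository: the function name parse_label_line and the root path are hypothetical, and the example line contains only the two numeric groups shown above (the elided fields are left out). Which group holds which quantity depends on the label file, so the sketch keeps them as anonymous tuples.

```python
import os
from typing import List, Tuple


def parse_label_line(line: str, root: str) -> Tuple[str, List[Tuple[float, ...]]]:
    """Split one label line into a full image path and its numeric fields.

    Hypothetical helper: the first whitespace-separated token is treated as the
    image path relative to `root`; every other token is treated as a
    comma-separated group of floats.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty label line")
    image_path = os.path.join(root, tokens[0])
    fields = [tuple(float(v) for v in token.split(",")) for token in tokens[1:]]
    return image_path, fields


# Example line truncated to the fields visible in this README.
example = (
    "p00/face/1.jpg 0.2558059438789034,-0.05467275933864655 "
    "-0.05843388117618364,0.46745964684693614"
)
image_path, fields = parse_label_line(example, "/path/to/root")
print(image_path)
print(fields)
```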

Train

We provide three optional arguments: --oma2, --js, and --sg. They represent three different network components, which are described in our paper.

--source and --target represent the datasets used as the source domain and the target domain. You can choose among eth, gaze360, mpii, edp.

--i represents the index of the person used as the training set. Set it to -1 to use all persons as the training set.

--pics represents the number of target domain samples for adaptation.

We also provide other arguments for adjusting the hyperparameters of our PnP-GA architecture, which are described in our paper.

For example, you can run the code like:

python3 adapt.py --i 0 --pics 10 --savepath path/to/save --source eth --target mpii --gpu 0 --js --oma2 --sg
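
When running adaptation for several persons or dataset pairs, it can help to assemble this command programmatically. The sketch below uses only the flags and dataset names listed above; the helper build_adapt_command is hypothetical and not part of this repository, and it always enables --js, --oma2, and --sg as in the example command.

```python
from typing import List

# Dataset names accepted by --source and --target, as listed above.
DATASETS = ("eth", "gaze360", "mpii", "edp")


def build_adapt_command(person: int, pics: int, savepath: str,
                        source: str, target: str, gpu: int = 0) -> List[str]:
    """Build the argument list for adapt.py (hypothetical helper)."""
    for name in (source, target):
        if name not in DATASETS:
            raise ValueError(f"unknown dataset {name!r}; choose from {DATASETS}")
    return [
        "python3", "adapt.py",
        "--i", str(person),
        "--pics", str(pics),
        "--savepath", savepath,
        "--source", source,
        "--target", target,
        "--gpu", str(gpu),
        "--js", "--oma2", "--sg",
    ]


cmd = build_adapt_command(0, 10, "path/to/save", "eth", "mpii")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root to launch the run.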

Test

--i, --savepath, --target are the same as training.

--p represents the index of the person used as the training set during the adaptation process.

For example, you can run the code like:

python3 test.py --i -1 --p 0 --savepath path/to/save --target mpii

Citation

If you find this work or code helpful in your research, please cite:

@inproceedings{liu2021PnP_GA,
  title={Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation},
  author={Liu, Yunfei and Liu, Ruicong and Wang, Haofei and Lu, Feng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Contact

If you have any questions, feel free to email me at: lyunfei(at)buaa.edu.cn
