[ICCV2021] IICNet: A Generic Framework for Reversible Image Conversion

Related tags

Deep LearningIICNet
Overview

IICNet - Invertible Image Conversion Net

Official PyTorch Implementation for IICNet: A Generic Framework for Reversible Image Conversion (ICCV2021). Demo Video | Supplements

Introduction

Reversible image conversion (RIC) aims to build a reversible transformation between specific visual content (e.g., short videos) and an embedding image, where the original content can be restored from the embedding when necessary. This work develops Invertible Image Conversion Net (IICNet) as a generic solution to various RIC tasks due to its strong capacity and task-independent design. Unlike previous encoder-decoder based methods, IICNet maintains a highly invertible structure based on invertible neural networks (INNs) to better preserve the information during conversion. We use a relation module and a channel squeeze layer to improve the INN nonlinearity to extract cross-image relations and the network flexibility, respectively. Experimental results demonstrate that IICNet outperforms the specifically-designed methods on existing RIC tasks and can generalize well to various newly-explored tasks. With our generic IICNet, we no longer need to hand-engineer task-specific embedding networks for rapidly occurring visual content.

Installation

Clone this repository and set up the environment.

git clone https://github.com/felixcheng97/IICNet.git
cd IICNet/
conda env create -f iic.yml

Dataset Preparation

We conduct experments on 5 multiple-and-single RIC tasks in the main paper and 2 single-and-single RIC tasks in the supplements. Note that all the datasets are placed under the ./datasets directory.

Task 1: Spatial-Temporal Video Embedding

We use the high-quality DAVIS 2017 video dataset in this task. You could download the Semi-supervised 480p dataset through this link. Unzip, rename, and place them under the dataset directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |   |-- DAVIS-2017-test-challenge (rename the DAVIS folder from DAVIS-2017-test-challenge-480p.zip)
    |   |-- DAVIS-2017-test-dev       (rename the DAVIS folder from DAVIS-2017-test-dev-480p.zip)
    |   `-- DAVIS-2017-trainval       (rename the DAVIS folder from DAVIS-2017-trainval-480p.zip)
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python davis_annotation.py

Task 2: Mononizing Binocular Images

We use the Flickr1024 dataset with the official train and test splits in this task. You could download the dataset through this link. Place the dataset under the dataset directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |   |-- Test
    |   |-- Train_1
    |   |-- Train_2
    |   |-- Train_3
    |   |-- Train_4
    |   `-- Validation
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python flicker1024_annotation.py

Task 3: Embedding Dual-View Images

We use the DIV2K dataset in this task. You could download the dataset through this link. Download the corresponding datasets and place them under the dataset directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |   |-- DIV2K_train_HR
    |   |-- DIV2K_train_LR_bicubic
    |   |   |-- X2
    |   |   |-- X4
    |   |   |-- X8
    |   |-- DIV2K_valid_HR
    |   `-- DIV2K_valid_LR_bicubic
    |       |-- X2
    |       |-- X4
    |       `-- X8
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python div2kddual_annotation.py

Task 4: Embedding Multi-Layer Images / Composition and Decomposition

We use the Adobe Deep Matting dataset and the Real Matting dataset in this task. You could download the Adobe Deep Matting dataset according to their instructions through this link. You could download the Real Matting dataset on its official GitHub page or through this direct link. Place the downloaded datasets under the dataset directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |   |-- Addobe_Deep_Matting_Dataset.zip
    |   |-- train2014.zip
    |   |-- VOC2008test.tar
    |   `-- VOCtrainval_14-Jul-2008.tar
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    |   |-- fixed-camera
    |   `-- hand-held
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts

# process the Adobe Matting dataset
python adobe_process.py
python adobe_annotation.py

# process the Real Matting dataset
python real_process.py
python real_annotation.py

Task 5: Hiding Images in an Image

We use the Flicker 2W dataset in this task. You could download the dataset on its official GitHub page through this link. Place the unzipped dataset under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |   `-- flicker_2W_images
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit

Then run the following scripts for annotation.

cd codes/scripts
python flicker_annotation.py

Task 6 (supp): Invertible Grayscale

We use the VOC2012 dataset in this task. You could download the training/validation dataset through this link. Place the unzipped dataset under the datasets directory with the following structure.

.
`-- datasets
    |-- Adobe-Matting
    |-- DAVIS-2017
    |-- DIV2K
    |-- flicker
    |-- flicker1024
    |-- Real-Matting
    `-- VOCdevkit
        `-- VOC2012

Then run the following scripts for annotation

cd codes/scripts
python voc2012_annotation.py

Task 7 (supp): Invertible Image Rescaling

We use the DIV2K dataset in this task. Please check Task 3: Embedding Dual-View Images to download the corresponding dataset. Then run the following scripts for annotation.

cd codes/scripts
python div2ksr_annotation.py

Training

To train a model for a specific task, run the following script:

cd codes
OMP_NUM_THREADS=4 python train.py -opt ./conf/train/<xxx>.yml

To enable distributed training with multiple GPUs for a specific task, simply assign a list of gpu_ids in the yml file and run the following script. Note that since training with multiple GPU is not tested yet, we suggest to train a model with a single GPU.

cd codes
OMP_NUM_THREADS=4 python -m torch.distributed.launch --nproc_per_node=4 --master_port 29501 train.py -opt ./conf/train/<xxx>.yml

Testing

We provide our trained models in our paper for your reference. Download all the pretrained weights of our models from Google Drive or Baidu Drive (extraction code: e377). Unzip the zip file and place pretrained models under the ./experiments directory.

To test a model for a specific task, run the following script:

cd codes
OMP_NUM_THREADS=4 python test.py -opt ./conf/test/<xxx>.yml

Acknowledgement

Some codes of this repository benefits from Invertible Image Rescaling (IRN).

Citation

If you find this work useful, please cite our paper:

@inproceedings{cheng2021iicnet,
    title = {IICNet: A Generic Framework for Reversible Image Conversion}, 
    author = {Ka Leong Cheng and Yueqi Xie and Qifeng Chen},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    year = {2021}
}

Contact

Feel free to open an issue if you have any question. You could also directly contact us through email at [email protected] (Ka Leong Cheng) and [email protected] (Yueqi Xie).

Owner
felixcheng97
felixcheng97
Official Pytorch Code for the paper TransWeather

TransWeather Official Code for the paper TransWeather, Arxiv Tech Report 2021 Paper | Website About this repo: This repo hosts the implentation code,

Jeya Maria Jose 81 Dec 30, 2022
Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences

Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences 1. Introduction This project is for paper Model-free Vehicle Tracking and St

TuSimple 92 Jan 03, 2023
This package is for running the semantic SLAM algorithm using extracted planar surfaces from the received detection

Semantic SLAM This package can perform optimization of pose estimated from VO/VIO methods which tend to drift over time. It uses planar surfaces extra

Hriday Bavle 125 Dec 02, 2022
Unofficial pytorch-lightning implement of Mip-NeRF

mipnerf_pl Unofficial pytorch-lightning implement of Mip-NeRF, Here are some results generated by this repository (pre-trained models are provided bel

Jianxin Huang 159 Dec 23, 2022
My coursework for Machine Learning (2021 Spring) at National Taiwan University (NTU)

Machine Learning 2021 Machine Learning (NTU EE 5184, Spring 2021) Instructor: Hung-yi Lee Course Website : (https://speech.ee.ntu.edu.tw/~hylee/ml/202

100 Dec 26, 2022
Source code for deep symbolic optimization.

Update July 10, 2021: This repository now supports an additional symbolic optimization task: learning symbolic policies for reinforcement learning. Th

Brenden Petersen 290 Dec 25, 2022
Fiddle is a Python-first configuration library particularly well suited to ML applications.

Fiddle Fiddle is a Python-first configuration library particularly well suited to ML applications. Fiddle enables deep configurability of parameters i

Google 227 Dec 26, 2022
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"

FLASH - Pytorch Implementation of the Transformer variant proposed in the paper Transformer Quality in Linear Time Install $ pip install FLASH-pytorch

Phil Wang 209 Dec 28, 2022
Exploiting Robust Unsupervised Video Person Re-identification

Exploiting Robust Unsupervised Video Person Re-identification Implementation of the proposed uPMnet. For the preprint, please refer to [Arxiv]. Gettin

1 Apr 09, 2022
ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for which no expressive speech corpus is available.

ERISHA: Multilingual Multispeaker Expressive Text-to-Speech Library ERISHA is a multilingual multispeaker expressive speech synthesis framework. It ca

Ajinkya Kulkarni 43 Nov 27, 2022
Regularizing Generative Adversarial Networks under Limited Data (CVPR 2021)

Regularizing Generative Adversarial Networks under Limited Data [Project Page][Paper] Implementation for our GAN regularization method. The proposed r

Google 148 Nov 18, 2022
A PyTorch version of You Only Look at One-level Feature object detector

PyTorch_YOLOF A PyTorch version of You Only Look at One-level Feature object detector. The input image must be resized to have their shorter side bein

Jianhua Yang 25 Dec 30, 2022
This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by Divam Gupta, Wei Pu, Trenton Tabor, Jeff Schneider

SBEVNet: End-to-End Deep Stereo Layout Estimation This repository contains the code for "SBEVNet: End-to-End Deep Stereo Layout Estimation" paper by D

Divam Gupta 19 Dec 17, 2022
Submission to Twitter's algorithmic bias bounty challenge

Twitter Ethics Challenge: Pixel Perfect Submission to Twitter's algorithmic bias bounty challenge, by Travis Hoppe (@metasemantic). Abstract We build

Travis Hoppe 4 Aug 19, 2022
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

vanint 18 Dec 17, 2022
A simple editor for captions in .SRT file extension

WaySRT A simple editor for captions in .SRT file extension The program doesn't use any external dependecies, just run: python way_srt.py {file_name.sr

Gustavo Lopes 3 Nov 16, 2022
Code repository for "Stable View Synthesis".

Stable View Synthesis Code repository for "Stable View Synthesis". Setup Install the following Python packages in your Python environment - numpy (1.1

Intelligent Systems Lab Org 195 Dec 24, 2022
A modular application for performing anomaly detection in networks

Deep-Learning-Models-for-Network-Annomaly-Detection The modular app consists for mainly three annomaly detection algorithms. The system supports model

Shivam Patel 1 Dec 09, 2021
A python library for face detection and features extraction based on mediapipe library

FaceAnalyzer A python library for face detection and features extraction based on mediapipe library Introduction FaceAnalyzer is a library based on me

Saifeddine ALOUI 14 Dec 30, 2022
PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric

PyTorch implementation of MSBG hearing loss model and MBSTOI intelligibility metric This repository contains the implementation of MSBG hearing loss m

BUT <a href=[email protected]"> 9 Nov 08, 2022