[ACMMM 2021 Oral] Enhanced Invertible Encoding for Learned Image Compression

Last update: Nov 30, 2022

Related tags

Deep Learning pytorch

Overview

InvCompress

Official Pytorch Implementation for "Enhanced Invertible Encoding for Learned Image Compression", ACMMM 2021 (Oral)

Figure: Our framework

Acknowledgement

The framework is based on CompressAI, we add our model in compressai.models.ours, compressai.models.our_utils. We modify compressai.utils, compressai.zoo, compressai.layers and examples/train.py for usage. Part of the codes benefit from Invertible-Image-Rescaling.

Introduction

In this paper, we target at structuring a better transformation between the image space and the latent feature space. Instead of employing previous autoencoder style networks to build this transformation, we propose an enhanced Invertible Encoding Network with invertible neural networks (INNs) to largely mitigate the information loss problem for better compression. To solve the challenge of unstable training with INN, we propose an attentive channel squeeze layer to flexibly adjust the feature dimension for a lower bit rate. We also present a feature enhancement module with same-resolution transforms and residual connections to improve the network nonlinear representation capacity.

[Paper]

Figure: Our results

Installation

As mentioned in CompressAI, "A C++17 compiler, a recent version of pip (19.0+), and common python packages are also required (see setup.py for the full list)."

git clone https://github.com/xyq7/InvCompress.git
cd InvCompress/codes/
conda create -n invcomp python=3.7 
conda activate invcomp
pip install -U pip && pip install -e .
conda install -c conda-forge tensorboard

Usage

Evaluation

If you want evaluate with pretrained model, please download from Google drive or Baidu cloud (code: a7jd) and put in ./experiments/

Some evaluation dataset can be downloaded from kodak dataset, CLIC

Note that as mentioned in original CompressAI, "Inference on GPU is not recommended for the autoregressive models (the entropy coder is run sequentially on CPU)." So for inference of our model, please run on CPU.

python -m compressai.utils.eval_model checkpoint $eval_data_dir -a invcompress -exp $exp_name -s $save_dir

An example: to evaluate model of quality 1 optimized with mse on kodak dataset.

python -m compressai.utils.eval_model checkpoint ../data/kodak -a invcompress -exp exp_01_mse_q1 -s ../results/exp_01

If you want to evaluate your trained model on own data, please run update before evaluation. An example:

python -m compressai.utils.update_model -exp $exp_name -a invcompress
python -m compressai.utils.eval_model checkpoint $eval_data_dir -a invcompress -exp $exp_name -s $save_dir

Train

We use the training dataset processed in the repo. We further preprocess with /codes/scripts/flicker_process.py Training setting is detailed in the paper. You can also use your own data for training.

python examples/train.py -exp $exp_name -m invcompress -d $train_data_dir --epochs $epoch_num -lr $lr --batch-size $batch_size --cuda --gpu_id $gpu_id --lambda $lamvda --metrics $metric --save

An example: to train model of quality 1 optimized with mse metric.

python examples/train.py -exp exp_01_mse_q1 -m invcompress -d ../data/flicker --epochs 600 -lr 1e-4 --batch-size 8 --cuda --gpu_id 0 --lambda 0.0016 --metrics mse --save

Other usage please refer to the original library CompressAI

Citation

If you find this work useful for your research, please cite:

@inproceedings{xie2021enhanced,
    title = {Enhanced Invertible Encoding for Learned Image Compression}, 
    author = {Yueqi Xie and Ka Leong Cheng and Qifeng Chen},
    booktitle = {Proceedings of the ACM International Conference on Multimedia},
    year = {2021}
}

Contact

Feel free to contact us if there is any question. (YueqiXIE, [email protected]; Ka Leong Cheng, [email protected])

[ACMMM 2021 Oral] Enhanced Invertible Encoding for Learned Image Compression

Related tags

Overview

InvCompress

Acknowledgement

Introduction

Installation

Usage

Evaluation

Train

Citation

Contact

Owner

A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Implementation for ACProp ( Momentum centering and asynchronous update for adaptive gradient methdos, NeurIPS 2021)

Fine-Tune EleutherAI GPT-Neo to Generate Netflix Movie Descriptions in Only 47 Lines of Code Using Hugginface And DeepSpeed

PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

Spatial Contrastive Learning for Few-Shot Classification (SCL)

ObsPy: A Python Toolbox for seismology/seismological observatories.

An open source library for face detection in images. The face detection speed can reach 1000FPS.

Real-time Neural Representation Fusion for Robust Volumetric Mapping

Data and code for the paper "Importance of Kernel Bandwidth in Quantum Machine Learning"

Official page of Patchwork (RA-L'21 w/ IROS'21)

MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images (ISBI 2021, MELBA 2021)

🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.

Finite-temperature variational Monte Carlo calculation of uniform electron gas using neural canonical transformation.

[Arxiv preprint] Causality-inspired Single-source Domain Generalization for Medical Image Segmentation (code&data-processing pipeline)

Neural network for recognizing the gender of people in photos

Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss