Reference PyTorch implementation of "End-to-end optimized image compression with competition of prior distributions"

Overview

PyTorch reference implementation of "End-to-end optimized image compression with competition of prior distributions" by Benoit Brummer and Christophe De Vleeschouwer ( https://github.com/trougnouf/Manypriors )

Forked from PyTorch implementation of "Variational image compression with a scale hyperprior" by Jiaheng Liu ( https://github.com/liujiaheng/compression )

This code is experimental.

Requirements

TODO torchac should be switched to the standalone release on https://github.com/fab-jul/torchac (which was not yet released at the time of writing this code)

Arch

pacaur -S python-tqdm python-pytorch-torchac python-configargparse python-yaml python-ptflops python-colorspacious python-pypng python-pytorch-piqa-git

Ubuntu / Slurm cluster / misc:

TMPDIR=tmp pip3 install --user torch==1.7.0+cu92 torchvision==0.8.1+cu92 -f https://download.pytorch.org/whl/torch_stable.html
TMPDIR=tmp pip3 install --user tqdm matplotlib tensorboardX scipy scikit-image scikit-video ConfigArgParse pyyaml h5py ptflops colorspacious pypng piqa

torchac must be compiled and installed per https://github.com/trougnouf/L3C-PyTorch/tree/master/src/torchac

torchac $ COMPILE_CUDA=auto python3 setup.py build
torchac $ python3 setup.py install --optimize=1 --skip-build

or (untested)

torchac $ pip install .

Once Ubuntu updates PyTorch then tensorboardX won't be required

Dataset gathering

Copy the kodak dataset into datasets/test/kodak

cd ../common
python tools/wikidownloader.py --category "Category:Featured pictures on Wikimedia Commons"
python tools/wikidownloader.py --category "Category:Formerly featured pictures on Wikimedia Commons"
python tools/wikidownloader.py --category "Category:Photographs taken on Ektachrome and Elite Chrome film"
mv "../../datasets/Category:Featured pictures on Wikimedia Commons" ../../datasets/FeaturedPictures
mv "../../datasets/Category:Formerly featured pictures on Wikimedia Commons" ../../datasets/Formerly_featured_pictures_on_Wikimedia_Commons
mv "../../datasets/Category:Photographs taken on Ektachrome and Elite Chrome film" ../../datasets/Photographs_taken_on_Ektachrome_and_Elite_Chrome_film
python tools/verify_images.py ../../datasets/FeaturedPictures/
python tools/verify_images.py ../../datasets/Formerly_featured_pictures_on_Wikimedia_Commons/
python tools/verify_images.py ../../datasets/Photographs_taken_on_Ektachrome_and_Elite_Chrome_film/

# TODO make a list of train/test img automatically s.t. images don't have to be copied over the network

Crop images to 1024*1024. from src/common: (in python)

import os
from libs import libdsops
for ads in ['Formerly_featured_pictures_on_Wikimedia_Commons', 'Photographs_taken_on_Ektachrome_and_Elite_Chrome_film', 'FeaturedPictures']:
    libdsops.split_traintest(ads)
    libdsops.crop_ds_dpath(ads, 1024, root_ds_dpath=os.path.join(libdsops.ROOT_DS_DPATH, 'train'), num_threads=os.cpu_count()//2)

#verify crops
python3 tools/verify_images.py ../../datasets/train/resized/1024/FeaturedPictures/
python3 tools/verify_images.py ../../datasets/train/resized/1024/Formerly_featured_pictures_on_Wikimedia_Commons/
python3 tools/verify_images.py ../../datasets/train/resized/1024/Photographs_taken_on_Ektachrome_and_Elite_Chrome_film/
# use the --save_img flag at the end of verify_images.py commands if training fails after the simple verification

Move a small subset of the training cropped images to a matching test directory and use it as args.val_dpath

JPEG/BPG compression of the Commons Test Images is done with common/tools/bpg_jpeg_compress_commons.py and comp/tools/bpg_jpeg_test_commons.py

Loading

Loading a model: provide all necessary (non-default) parameters s.a. arch, num_distributions, etc. Saved yaml can be used iff the ConfigArgParse patch from https://github.com/trougnouf/ConfigArgParse is applied, otherwise unset values are overwritten with the "None" string.

Training

Train a base model (given arch and num_distributions) for 6M steps at train_lambda=4096, fine-tune for 4M steps with lower train_lambda and/or msssim lossf Set arch to Manypriors for this work, use num_distributions 1 for Balle2017, or set arch to Balle2018PTTFExp for Balle2018 (hyperprior) egrun:

python train.py --num_distributions 64 --arch ManyPriors --train_lambda 4096 --expname mse_4096_manypriors_64_CLI
# and/or
python train.py --config configs/mse_4096_manypriors_64pr.yaml
# and/or
python train.py --config configs/mse_2048_manypriors_64pr.yaml --pretrain mse_4096_manypriors_64pr --reset_lr --reset_global_step # --reset_optimizer
# and/or
python train.py --config configs/mse_4096_hyperprior.yaml

--passthrough_ae is now activated by default. It was not used in the paper, but should result in better rate-distortion. To turn it off, change config/defaults.yaml or use --no_passthrough_ae

Tests

egruns: Test complexity:

python tests.py --complexity --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64

Test timing:

python tests.py --timing "../../datasets/test/Commons_Test_Photographs" --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64

Segment the images in commons_test_dpath by distribution index:

python tests.py --segmentation --commons_test_dpath "../../datasets/test/Commons_Test_Photographs" --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64

Visualize cumulative distribution functions:

python tests.py --plot --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64

Test on kodak images:

python tests.py --encdec_kodak --test_dpath "../../datasets/test/kodak/" --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64

Test on commons images (larger, uses CPU):

python tests.py --encdec_commons --test_commons_dpath "../../datasets/test/Commons_Test_Photographs/" --pretrain checkpoints/mse_4096_manypriors_64pr/saved_models/checkpoint.pth --arch ManyPriors --num_distributions 64

Encode an image:

python tests.py --encode "../../datasets/test/Commons_Test_Photographs/Garden_snail_moving_down_the_Vennbahn_in_disputed_territory_(DSCF5879).png" --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64 --device -1

Decode that image:

python tests.py --decode "checkpoints/mse_4096_manypriors_64pr/encoded/Garden_snail_moving_down_the_Vennbahn_in_disputed_territory_(DSCF5879).png" --pretrain mse_4096_manypriors_64pr --arch ManyPriors --num_distributions 64 --device -1
Owner
Benoit Brummer
BS CpE at @UCF (2016), MS CS (AI) @uclouvain (2019), PhD student @uclouvain w/ intoPIX
Benoit Brummer
Asterisk is a framework to generate high-quality training datasets at scale

Asterisk is a framework to generate high-quality training datasets at scale

Mona Nashaat 44 Apr 25, 2022
Implementation of "JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion Retargeting"

JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion Retargeting Pytorch implementation for the paper "JOKR: Joint Keypoint Repres

45 Dec 25, 2022
The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.

This repository is the official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness. Requirements pip install -r requi

Jie Ren 17 Dec 12, 2022
CLOOB training (JAX) and inference (JAX and PyTorch)

cloob-training Pretrained models There are two pretrained CLOOB models in this repo at the moment, a 16 epoch and a 32 epoch ViT-B/16 checkpoint train

Katherine Crowson 64 Nov 27, 2022
Pytorch implementation of our paper accepted by NeurIPS 2021 -- Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021) (Link) Overview Prerequisites Linu

Shaojie Li 34 Mar 31, 2022
Code for paper: "Spinning Language Models for Propaganda-As-A-Service"

Spinning Language Models for Propaganda-As-A-Service This is the source code for the Arxiv version of the paper. You can use this Google Colab to expl

Eugene Bagdasaryan 16 Jan 03, 2023
PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Introduction This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection. Up

133 Dec 29, 2022
Distributed Evolutionary Algorithms in Python

DEAP DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data stru

Distributed Evolutionary Algorithms in Python 4.9k Jan 05, 2023
Self Driving RC Car Code

Derp Learning Derp Learning is a Python package that collects data, trains models, and then controls an RC car for track racing. Hardware You will nee

Not Karol 39 Dec 07, 2022
PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021.

GCResNet PyTorch implementation of Graph Convolutional Networks in Feature Space for Image Deblurring and Super-resolution, IJCNN 2021. The code will

11 May 19, 2022
Code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in Video".

Consistent Depth of Moving Objects in Video This repository contains training code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in

Google 203 Jan 05, 2023
Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021)

Transferable Semantic Augmentation for Domain Adaptation Code release for "Transferable Semantic Augmentation for Domain Adaptation" (CVPR 2021) Paper

66 Dec 16, 2022
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung

Seung Min Lee 54 Dec 08, 2022
这是一个yolox-keras的源码,可以用于训练自己的模型。

YOLOX:You Only Look Once目标检测模型在Keras当中的实现 目录 性能情况 Performance 实现的内容 Achievement 所需环境 Environment 小技巧的设置 TricksSet 文件下载 Download 训练步骤 How2train 预测步骤 Ho

Bubbliiiing 64 Nov 10, 2022
Empower Sequence Labeling with Task-Aware Language Model

LM-LSTM-CRF Check Our New NER Toolkit 🚀 🚀 🚀 Inference: LightNER: inference w. models pre-trained / trained w. any following tools, efficiently. Tra

Liyuan Liu 838 Jan 05, 2023
Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

Deep Optics for Single-shot High-dynamic-range Imaging Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging" CVPR, 2

Stanford Computational Imaging Lab 40 Dec 12, 2022
Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

A Latent Transformer for Disentangled Face Editing in Images and Videos Official implementation for paper: A Latent Transformer for Disentangled Face

InterDigital 108 Dec 09, 2022
One line to host them all. Bootstrap your image search case in minutes.

One line to host them all. Bootstrap your image search case in minutes. Survey NOW gives the world access to customized neural image search in just on

Jina AI 403 Dec 30, 2022
Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

ZePHyR: Zero-shot Pose Hypothesis Rating ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compare

R-Pad - Robots Perceiving and Doing 18 Aug 22, 2022
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos

Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This rep

Grant Sanderson 49k Jan 09, 2023