A lossless neural compression framework built on top of JAX.

Overview

Kompressor

GitHub

Branch CI Coverage
main (active) Build codecov
main Build codecov
development Build codecov

A neural compression framework built on top of JAX.

Install

setup.py assumes a compatible version of JAX and JAXLib are already installed. Automated build is tested for a cuda:11.1-cudnn8-runtime-ubuntu20.04 environment with jaxlib==0.1.76+cuda11.cudnn82.

git clone https://github.com/rosalindfranklininstitute/kompressor.git
cd kompressor
pip install -e .

# Run tests
python -m pytest --cov=src/kompressor tests/

Install & Run through Docker environment

Docker image for the Kompressor dependencies are provided in the quay.io/rosalindfranklininstitute/kompressor:main Quay.io image.

# Run the container for the Kompressor environment
docker run --rm quay.io/rosalindfranklininstitute/kompressor:main \
    python -m pytest --cov=/usr/local/kompressor/src/kompressor /usr/local/kompressor/tests

Install & Run through Singularity environment

Singularity image for the Kompressor dependencies are provided in the rosalindfranklininstitute/kompressor/kompressor:main cloud.sylabs.io image.

singularity pull library://rosalindfranklininstitute/kompressor/kompressor:main
singularity run kompressor_main.sif \
    python -m pytest --cov=/usr/local/kompressor/src/kompressor /usr/local/kompressor/tests
Comments
  • Refactor map tuples to dicts

    Refactor map tuples to dicts

    Closes #14. Functions which currently return an ordered tuple of maps (lrmap, udmap, cmap, ...) now return keyed dictionaries { 'lrmap': lrmap, 'udmap': udmap, 'cmap': cmap, ... } so that order/usage is explicitly enforced.

    List comprehensions over the tuples now use jax.tree_map and jax.tree_multimap to ensure key safety.

    @GMW99, this will break the current implementation of the Metrics Callback class which iterates over a zip of the hardcoded map names and the maps tuple. This iteration can be replaced by iterating over maps.items() since it is now a dict already.

    enhancement 
    opened by JossWhittle 1
  • Ensure jax.jit static_argnums is refactored to static_argnames

    Ensure jax.jit static_argnums is refactored to static_argnames

    Functions that currently mark static_argnums=(0, 1, 2) should be updated to use the safer static_argnames=('tom', 'dick', 'harry') that is now available.

    enhancement high priority 
    opened by JossWhittle 1
  • Update development examples

    Update development examples

    • Splits docker image into JAX base image and Kompressor dependency and install image
    • JAX image installs JAX from source to ensure correct CUDA / CUDNN versions
    • Adjust setup.py to install dependencies from requirement.txt
    • Refactors a how submodules are imported (within the kom.image submodule. Need to check volumes matches)
    • Add kom.image.data submodule for dealing with tensorflow data pipelines
    • Fixed pooling in the total variation losses (used as metrics in the example notebooks)
    • Move all the encoding/decoding functions for the maps into a kom.mapping submodule
    • Add within-k and run-length metrics to kom.image.metrics for example notebooks
    • Added example notebooks for interacting with the maps and training a basic Haiku compression model
    feature 
    opened by JossWhittle 0
  • Add mapping encode/decode functions for float32 data

    Add mapping encode/decode functions for float32 data

    Will need a bit of thinking to get right. We probably need to consider similar tricks that we used for applying Radix Sort on float32 data to make the compression numerically stable and portable between machines.

    enhancement low priority 
    opened by JossWhittle 0
  • Add mapping encode/decode functions for uint32 data

    Add mapping encode/decode functions for uint32 data

    Some of our data is uint32 volumes.

    Will need to trace through the full compression implementation and make sure intermediate value dtypes are large enough to avoid uint32 overflow when needed.

    enhancement low priority 
    opened by JossWhittle 0
  • Modify core encode decode functions to pass a dict to the prediction function

    Modify core encode decode functions to pass a dict to the prediction function

    Currently the lowres inputs are passed directly to the prediction_fn as the only input.

    • Modify to accept a dict that has at least one key for the lowres input.

    • Provide boolean flag to also pass a positional encoding tensor along with the lowres which the model can use if needed.

    • Chunked encode decode will need to generate the correct chunks of the positional encoding for the current chunk.

    • Model can choose how to use positional encodings.

      • Image case would receive (B, H, W, 2) tensor containing the Y and X coordinates of each pixel in the trailing axis.
      • Volume case would receive (B, D, H, W, 3) tensor containing the Z, Y, and X coordinates of each voxel in the trailing axis.
    enhancement high priority 
    opened by JossWhittle 0
  • Look at decompressing sliced chunks

    Look at decompressing sliced chunks

    Decompress sliced chunk of image or volume without needing to decompress the entire data element.

    • May require applying secondary compression in blocks to avoid needing to decompress the full level maps, only to apply the predictor to the target slice.

    • Instead unpack just the blocks needed for the slice then trim.

    • A kompressor (or stack of) trained to secondary compress the maps from the primary kompressor (or stack of) would be able to naturally handle slice chunked decoding.

      • Could such a secondary compressor be shared between levels? Between multiple kompressors in the primary stack?
    experiment low priority 
    opened by JossWhittle 0
  • Look at compressing timeseries data

    Look at compressing timeseries data

    • Experiment with implementing the 1D case for compressing signals.
    • Video as sequence of 2D frames using the 3D volume code directly.
    • Look at compressing within timestep using information from neighbouring timesteps without actually compressing (dropping frames) the temporal axis.
    experiment low priority 
    opened by JossWhittle 0
Releases(v0.0.0)
Owner
Rosalind Franklin Institute
The Rosalind Franklin Institute is dedicated to transforming life science through interdisciplinary research and technology development
Rosalind Franklin Institute
IndoNLI: A Natural Language Inference Dataset for Indonesian

IndoNLI: A Natural Language Inference Dataset for Indonesian This is a repository for data and code accompanying our EMNLP 2021 paper "IndoNLI: A Natu

15 Feb 10, 2022
Experiments with differentiable stacks and queues in PyTorch

Please use stacknn-core instead! StackNN This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in s

Will Merrill 141 Oct 06, 2022
2021搜狐校园文本匹配算法大赛 分比我们低的都是帅哥队

sohu_text_matching 2021搜狐校园文本匹配算法大赛Top2:分比我们低的都是帅哥队 本repo包含了本次大赛决赛环节提交的代码文件及答辩PPT,提交的模型文件可在百度网盘获取(链接:https://pan.baidu.com/s/1T9FtwiGFZhuC8qqwXKZSNA ,

hflserdaniel 43 Oct 01, 2022
[ICCV'2021] "SSH: A Self-Supervised Framework for Image Harmonization", Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

SSH: A Self-Supervised Framework for Image Harmonization (ICCV 2021) code for SSH Representative Examples Main Pipeline RealHM DataSet Google Drive Pr

VITA 86 Dec 02, 2022
A convolutional recurrent neural network for classifying A/B phases in EEG signals recorded for sleep analysis.

CAP-Classification-CRNN A deep learning model based on Inception modules paired with gated recurrent units (GRU) for the classification of CAP phases

Apurva R. Umredkar 2 Nov 25, 2022
Segmentation-Aware Convolutional Networks Using Local Attention Masks

Segmentation-Aware Convolutional Networks Using Local Attention Masks [Project Page] [Paper] Segmentation-aware convolution filters are invariant to b

144 Jun 29, 2022
We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

ConTNet Introduction ConTNet (Convlution-Tranformer Network) is proposed mainly in response to the following two issues: (1) ConvNets lack a large rec

93 Nov 08, 2022
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Jiaxi Jiang 282 Jan 02, 2023
PyTorch implementation of "VRT: A Video Restoration Transformer"

VRT: A Video Restoration Transformer Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool Computer

Jingyun Liang 837 Jan 09, 2023
Piotr - IoT firmware emulation instrumentation for training and research

Piotr: Pythonic IoT exploitation and Research Introduction to Piotr Piotr is an emulation helper for Qemu that provides a convenient way to create, sh

Damien Cauquil 51 Nov 09, 2022
Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution

nnvolterra Run Code Compile first: make compile Run all codes: make all Test xconv: make npxconv_test MNIST dataset needs to be downloaded, converted

1 May 24, 2022
Game Agent Framework. Helping you create AIs / Bots that learn to play any game you own!

Serpent.AI - Game Agent Framework (Python) Update: Revival (May 2020) Development work has resumed on the framework with the aim of bringing it into 2

Serpent.AI 6.4k Jan 05, 2023
Collective Multi-type Entity Alignment Between Knowledge Graphs (WWW'20)

CG-MuAlign A reference implementation for "Collective Multi-type Entity Alignment Between Knowledge Graphs", published in WWW 2020. If you find our pa

Bran Zhu 28 Dec 11, 2022
3D dataset of humans Manipulating Objects in-the-Wild (MOW)

MOW dataset [Website] This repository maintains our 3D dataset of humans Manipulating Objects in-the-Wild (MOW). The dataset contains 512 images in th

Zhe Cao 28 Nov 06, 2022
Database Reasoning Over Text project for ACL paper

Database Reasoning over Text This repository contains the code for the Database Reasoning Over Text paper, to appear at ACL2021. Work is performed in

Facebook Research 320 Dec 12, 2022
Detectron2 for Document Layout Analysis

Detectron2 trained on PubLayNet dataset This repo contains the training configurations, code and trained models trained on PubLayNet dataset using Det

Himanshu 163 Nov 21, 2022
Code for NeurIPS2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints"

This repository is the code for NeurIPS 2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints". Edit 2021/

10 Dec 20, 2022
Google Recaptcha solver.

byerecaptcha - Google Recaptcha solver. Model and some codes takes from embium's repository -Installation- pip install byerecaptcha -How to use- from

Vladislav Zenkevich 21 Dec 19, 2022
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

CoCa - Pytorch Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch. They were able to elegantly fit in contras

Phil Wang 565 Dec 30, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022