Spherical CNNs

Related tags

Deep Learnings2cnn
Overview

Spherical CNNs

Equivariant CNNs for the sphere and SO(3) implemented in PyTorch

Equivariance

Overview

This library contains a PyTorch implementation of the rotation equivariant CNNs for spherical signals (e.g. omnidirectional images, signals on the globe) as presented in [1]. Equivariant networks for the plane are available here.

Dependencies

(commands to install all the dependencies on a new conda environment)

conda create --name cuda9 python=3.6 
conda activate cuda9

# s2cnn deps
#conda install pytorch torchvision cuda90 -c pytorch # get correct command line at http://pytorch.org/
conda install -c anaconda cupy  
pip install pynvrtc joblib

# lie_learn deps
conda install -c anaconda cython  
conda install -c anaconda requests  

# shrec17 example dep
conda install -c anaconda scipy  
conda install -c conda-forge rtree shapely  
conda install -c conda-forge pyembree  
pip install "trimesh[easy]"  

Installation

To install, run

$ python setup.py install

Usage

Please have a look at the examples.

Please cite [1] in your work when using this library in your experiments.

Design choices for Spherical CNN Architectures

Spherical CNNs come with different choices of grids and grid hyperparameters which are on the first look not obviously related to those of conventional CNNs. The s2_near_identity_grid and so3_near_identity_grid are the preferred choices since they correspond to spatially localized kernels, defined at the north pole and rotated over the sphere via the action of SO(3). In contrast, s2_equatorial_grid and so3_equatorial_grid define line-like (or ring-like) kernels around the equator.

To clarify the possible parameter choices for s2_near_identity_grid:

max_beta:

Adapts the size of the kernel as angle measured from the north pole. Conventional CNNs on flat space usually use a fixed kernel size but pool the signal spatially. This spatial pooling gives the kernels in later layers an effectively increased field of view. One can emulate a pooling by a factor of 2 in spherical CNNs by decreasing the signal bandwidth by 2 and increasing max_beta by 2.

n_beta:

Number of rings of the kernel around the equator, equally spaced in [β=0, β=max_beta]. The choice n_beta=1 corresponds to a small 3x3 kernel in conv2d since in both cases the resulting kernel consists of one central pixel and one ring around the center.

n_alpha:

Gives the number of learned parameters of the rings around the pole. These values are per default equally spaced on the azimuth. A sensible number of values depends on the bandwidth and max_beta since a higher resolution or spatial extent allow to sample more fine kernels without producing aliased results. In practice this value is typically set to a constant, low value like 6 or 8. A reduced bandwidth of the signal is thereby counteracted by an increased max_beta to emulate spatial pooling.

The so3_near_identity_grid has two additional parameters max_gamma and n_gamma. SO(3) can be seen as a (principal) fiber bundle SO(3)→S² with the sphere S² as base space and fiber SO(2) attached to each point. The additional parameters control the grid on the fiber in the following way:

max_gamma:

The kernel spans over the fiber SO(2) between γ∈[0, max_gamma]. The fiber SO(2) encodes the kernel responses for every sampled orientation at a given position on the sphere. Setting max_gamma≨2π results in the kernel not seeing the responses of all kernel orientations simultaneously and is in general unfavored. Steerable CNNs [3] usually always use max_gamma=2π.

n_gamma:

Number of learned parameters on the fiber. Typically set equal to n_alpha, i.e. to a low value like 6 or 8.

See the deep model of the MNIST example for an example of how to adapt these parameters over layers.

Feedback

For questions and comments, feel free to contact us: geiger.mario (gmail), taco.cohen (gmail), jonas (argmin.xyz).

License

MIT

References

[1] Taco S. Cohen, Mario Geiger, Jonas Köhler, Max Welling, Spherical CNNs. International Conference on Learning Representations (ICLR), 2018.

[2] Taco S. Cohen, Mario Geiger, Jonas Köhler, Max Welling, Convolutional Networks for Spherical Signals. ICML Workshop on Principled Approaches to Deep Learning, 2017.

[3] Taco S. Cohen, Mario Geiger, Maurice Weiler, Intertwiners between Induced Representations (with applications to the theory of equivariant neural networks), ArXiv preprint 1803.10743, 2018.

Owner
Jonas Köhler
Jonas Köhler
Evaluating Cross-lingual Sentence Representations

XNLI: The Cross-Lingual NLI Corpus XNLI is an evaluation corpus for language transfer and cross-lingual sentence classification in 15 languages. New:

Meta Research 395 Dec 19, 2022
Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Yaoming Cai 5 Jul 18, 2022
Practical tutorials and labs for TensorFlow used by Nvidia, FFN, CNN, RNN, Kaggle, AE

TensorFlow Tutorial - used by Nvidia Learn TensorFlow from scratch by examples and visualizations with interactive jupyter notebooks. Learn to compete

Alexander R Johansen 1.9k Dec 19, 2022
Google AI Open Images - Object Detection Track: Open Solution

Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c

minerva.ml 46 Jun 22, 2022
StarGAN2 for practice

StarGAN2 for practice This version of StarGAN2 (coined as 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scie

vadim epstein 87 Sep 24, 2022
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is

Federico Galatolo 172 Dec 22, 2022
U-2-Net: U Square Net - Modified for paired image training of style transfer

U2-Net: U Square Net Modified for paired image training of style transfer This is an unofficial repo making use of the code which was made available b

Doron Adler 43 Oct 03, 2022
Paper list of log-based anomaly detection

Paper list of log-based anomaly detection

Weibin Meng 411 Dec 05, 2022
Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021)

Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021) Paper Video Instance Segmentation using Inter-Frame Communicat

Sukjun Hwang 81 Dec 29, 2022
A Transformer-Based Siamese Network for Change Detection

ChangeFormer: A Transformer-Based Siamese Network for Change Detection (Under review at IGARSS-2022) Wele Gedara Chaminda Bandara, Vishal M. Patel Her

Wele Gedara Chaminda Bandara 214 Dec 29, 2022
Code for testing convergence rates of Lipschitz learning on graphs

📈 LipschitzLearningRates The code in this repository reproduces the experimental results on convergence rates for k-nearest neighbor graph infinity L

2 Dec 20, 2021
这是一个yolox-pytorch的源码,可以用于训练自己的模型。

YOLOX:You Only Look Once目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 实现的内容 Achievement 所需环境 Environment 小技巧的设置 TricksSet 文件下载 Download 训练步骤 How2train 预测步骤

Bubbliiiing 613 Jan 05, 2023
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

Tao Ruijie 277 Dec 31, 2022
Trax — Deep Learning with Clear Code and Speed

Trax — Deep Learning with Clear Code and Speed Trax is an end-to-end library for deep learning that focuses on clear code and speed. It is actively us

Google 7.3k Dec 26, 2022
Code for Universal Semi-Supervised Semantic Segmentation models paper accepted in ICCV 2019

USSS_ICCV19 Code for Universal Semi Supervised Semantic Segmentation accepted to ICCV 2019. Full Paper available at https://arxiv.org/abs/1811.10323.

Tarun K 68 Nov 24, 2022
SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

SSL_SLAM2 Lightweight 3-D Localization and Mapping for Solid-State LiDAR (Intel Realsense L515 as an example) This repo is an extension work of SSL_SL

Wang Han 王晗 1.3k Jan 08, 2023
SMIS - Semantically Multi-modal Image Synthesis(CVPR 2020)

Semantically Multi-modal Image Synthesis Project page / Paper / Demo Semantically Multi-modal Image Synthesis(CVPR2020). Zhen Zhu, Zhiliang Xu, Anshen

316 Dec 01, 2022
MISSFormer: An Effective Medical Image Segmentation Transformer

MISSFormer Code for paper "MISSFormer: An Effective Medical Image Segmentation Transformer". Please read our preprint at the following link: paper_add

Fong 22 Dec 24, 2022
Evolving neural network parameters in JAX.

Evolving Neural Networks in JAX This repository holds code displaying techniques for applying evolutionary network training strategies in JAX. Each sc

Trevor Thackston 6 Feb 12, 2022
Principled Detection of Out-of-Distribution Examples in Neural Networks

ODIN: Out-of-Distribution Detector for Neural Networks This is a PyTorch implementation for detecting out-of-distribution examples in neural networks.

189 Nov 29, 2022