PyTorch implementation of NIPS 2017 paper Dynamic Routing Between Capsules

Last update: Dec 24, 2022

Overview

Dynamic Routing Between Capsules - PyTorch implementation

PyTorch implementation of NIPS 2017 paper Dynamic Routing Between Capsules from Sara Sabour, Nicholas Frosst and Geoffrey E. Hinton.

The hyperparameters and data augmentation strategy strictly follow the paper.

Requirements

Only PyTorch with torchvision is required (tested on pytorch 0.2.0 and 0.3.0). Jupyter and matplotlib is required to run the notebook with visualizations.

Usage

Train the model by running

python net.py

Optional arguments and default values:

  --batch-size N          input batch size for training (default: 128)
  --test-batch-size N     input batch size for testing (default: 1000)
  --epochs N              number of epochs to train (default: 250)
  --lr LR                 learning rate (default: 0.001)
  --no-cuda               disables CUDA training
  --seed S                random seed (default: 1)
  --log-interval N        how many batches to wait before logging training
                          status (default: 10)
  --routing_iterations    number of iterations for routing algorithm (default: 3)
  --with_reconstruction   should reconstruction layers be used

MNIST dataset will be downloaded automatically.

Results

The network trained with reconstruction and 3 routing iterations on MNIST dataset achieves 99.65% accuracy on test set. The test loss is still slightly decreasing, so the accuracy could probably be improved with more training and more careful learning rate schedule.

Visualizations

We can create visualizations of digit reconstructions from DigitCaps (e.g. Figure 3 in the paper)

We can also visualize what each dimension of digit capsule represents (Section 5.1, Figure 4 in the paper).

Below, each row shows the reconstruction when one of the 16 dimensions in the DigitCaps representation is tweaked by intervals of 0.05 in the range [−0.25, 0.25].

We can see what individual dimensions represent for digit 7, e.g. dim6 - stroke thickness, dim11 - digit width, dim 15 - vertical shift.

Visualization examples are provided in a jupyter notebook

PyTorch implementation of NIPS 2017 paper Dynamic Routing Between Capsules

Related tags

Overview

Dynamic Routing Between Capsules - PyTorch implementation

Requirements

Usage

Results

Visualizations

Owner

Adam Bielski

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

[peer review] An Arbitrary Scale Super-Resolution Approach for 3D MR Images using Implicit Neural Representation

LegoDNN: a block-grained scaling tool for mobile vision systems

FluidNet re-written with ATen tensor lib

TreeSubstitutionCipher - Encryption system based on trees and substitution

Benchmarks for Model-Based Optimization

A free, multiplatform SDK for real-time facial motion capture using blendshapes, and rigid head pose in 3D space from any RGB camera, photo, or video.

Code for Reciprocal Adversarial Learning for Brain Tumor Segmentation: A Solution to BraTS Challenge 2021 Segmentation Task

Pretraining on Dynamic Graph Neural Networks

Research into Forex price prediction from price history using Deep Sequence Modeling with Stacked LSTMs.

Alfred-Restore-Iterm-Arrangement - An Alfred workflow to restore iTerm2 window Arrangements

Adversarial-autoencoders - Tensorflow implementation of Adversarial Autoencoders

(JMLR' 19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)

Hardware-accelerated DNN model inference ROS2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

Generative Adversarial Networks(GANs)

Pure python implementation reverse-mode automatic differentiation

simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

A programming language written with python

Omnidirectional camera calibration in python

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"