Template repository for managing machine learning research projects built with PyTorch-Lightning

Overview

Mjolnir

Mjolnir: Thor's hammer, a divine instrument making its holder worthy of wielding lightning.

Template repository for managing machine learning research projects built with PyTorch-Lightning, using Anaconda for Python Dependencies and Sane Quality Defaults (Black, Flake, isort).

Template created by Sidd Karamcheti.


Contributing

This section matters most if this is a shared research project (i.e., one with other collaborators). Detailed instructions usually belong in CONTRIBUTING.md. Notably, before committing to the repository, make sure to set up your dev environment and install the pre-commit hooks (pre-commit install)!

Here are sample contribution guidelines (high-level):

  • Install and activate the Conda Environment using the QUICKSTART instructions below.

  • After installing new dependencies (via pip or conda), update the environment-<arch>.yaml files with the following command (note that you need to create the environment-cpu.yaml file separately, by exporting from your local development environment!):

    make serialize-env --arch=
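
For the separate CPU export mentioned above, the following is a minimal sketch of how the export command could be assembled. It is an assumption about what serialization might look like, not the contents of the template's Makefile: the helper name export_command is hypothetical, and the output path simply follows the environments/environment-<arch>.yaml naming used in this README.

```python
# Hypothetical helper: builds (but does not run) a `conda env export` command.
ARCHS = ("cpu", "gpu")  # the two environment files this README mentions


def export_command(arch, env_name="mjolnir"):
    """Return the argument list for exporting `env_name` to the arch-specific file."""
    if arch not in ARCHS:
        raise ValueError(f"arch must be one of {ARCHS}, got {arch!r}")
    return [
        "conda", "env", "export",
        "--name", env_name,
        "--file", f"environments/environment-{arch}.yaml",
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(" ".join(export_command("cpu")))
```

Passing the returned list to subprocess.run would perform the export; printing it first makes it easy to confirm the target file before anything is overwritten.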


Quickstart

Note: Replace instances of mjolnir and other instructions with instructions specific to your repository!

The steps below clone mjolnir to the working directory, then walk through dependency setup, mostly leveraging the environment-<arch>.yaml files.

Shared Environment (for Clusters w/ Centralized Conda)

Note: The presence of this subsection depends on your setup. With the way the Stanford NLP Cluster has been set up, and the way I've set up the ILIAD Cluster, this section makes it really easy to maintain dependencies across multiple users via centralized conda environments, but YMMV.

@Sidd (or central repository maintainer) has already set up the conda environments in Stanford-NLP/ILIAD. The only necessary steps for you to take are cloning the repo, activating the appropriate environment, and running pre-commit install to start developing.

Local Development - Linux w/ GPU & CUDA 11.0

Note: Assumes that conda (Miniconda or Anaconda are both fine) is installed and on your path.

Ensure that you're using the appropriate environment file. If PyTorch doesn't build properly for your setup, checking the CUDA Toolkit version is usually a good place to start. We provide an environment file for CUDA 11.0 (environments/environment-gpu.yaml); support for other CUDA Toolkit versions can be added (file an issue if necessary).

git clone https://github.com/pantheon-616/mjolnir.git
cd mjolnir
conda env create -f environments/environment-gpu.yaml  # Choose the CUDA version based on your hardware - the default file targets CUDA 11.0!
conda activate mjolnir
pre-commit install  # Important!

Local Development - CPU (Mac OS & Linux)

Note: Assumes that conda (Miniconda or Anaconda are both fine) is installed and on your path. Use the -cpu environment file.

git clone https://github.com/pantheon-616/mjolnir.git
cd mjolnir
conda env create -f environments/environment-cpu.yaml
conda activate mjolnir
pre-commit install  # Important!
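
Once the environment is active, a quick sanity check can confirm that the packages this README installs are visible. The sketch below is an illustration, not part of the template: the function names are hypothetical, and the import names (e.g., pytorch_lightning for pytorch-lightning) are assumed from the package list in this README.

```python
import importlib.util
import shutil

# Import names assumed from the conda/pip packages listed in this README.
MODULES = ["torch", "torchvision", "torchaudio", "pytorch_lightning",
           "matplotlib", "quinine", "wandb"]
# Command-line tools that should be on PATH inside the activated environment.
TOOLS = ["black", "flake8", "isort", "pre-commit"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return the executables that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing modules:", missing_modules(MODULES) or "none")
    print("Missing tools:", missing_tools(TOOLS) or "none")
```

The check only looks modules up rather than importing them, so it stays fast and does not initialize CUDA.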

Usage

This repository comes with sane defaults for black, isort, and flake8 for formatting and linting. It additionally defines a bare-bones Makefile (to be extended for your specific build/run needs) for formatting/checking, and dumping updated versions of the dependencies (after installing new modules).
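
As an illustration of the check step, here is a minimal sketch that runs the three tools in check-only mode and collects their exit codes. It is not the template's Makefile: the names CHECKS, run_checks, and main are hypothetical, and running each tool against the current directory is an assumption.

```python
import shutil
import subprocess
import sys

# Check-only invocations: none of these commands rewrite files.
CHECKS = {
    "black": ["black", "--check", "."],
    "isort": ["isort", "--check-only", "."],
    "flake8": ["flake8", "."],
}


def run_checks(checks):
    """Run each command; map its name to the exit code, or None if not installed."""
    results = {}
    for name, cmd in checks.items():
        if shutil.which(cmd[0]) is None:
            results[name] = None  # executable not found on PATH
            continue
        results[name] = subprocess.run(cmd).returncode
    return results


def main():
    """Report failing or missing checks on stderr; return 0 only if all passed."""
    failed = {name: code for name, code in run_checks(CHECKS).items() if code != 0}
    for name, code in failed.items():
        status = "not installed" if code is None else f"exit code {code}"
        print(f"{name}: {status}", file=sys.stderr)
    return 1 if failed else 0
```

Calling sys.exit(main()) from a script would make the result usable in CI or as a Makefile target.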

Other repository-specific usage notes should go here (e.g., training models, running a saved model, running a visualization, etc.).

Repository Structure

High-level overview of the repository file-tree (expand on this as you build out your project). This is meant to be brief; more detailed implementation/architectural notes should go in ARCHITECTURE.md.

  • conf - Quinine Configurations (.yaml) for various runs (used in lieu of argparse or typed-argument-parser)
  • environments - Serialized Conda Environments for both CPU and GPU (CUDA 11.0). Other architectures/CUDA toolkit environments can be added here as necessary.
  • src/ - Source Code - has all preprocessing code, Lightning Model definitions, and utilities.
    • preprocessing/ - Preprocessing Code (fill in details for specific project).
    • models/ - Lightning Modules (fill in details for specific project).
  • tests/ - Tests - Please test your code... just, please (more details to come).
  • train.py - Top-Level (main) entry point to repository, for training and evaluating models. Can define additional top-level scripts as necessary.
  • Makefile - Top-level Makefile (by default, supports conda serialization, and linting). Expand to your needs.
  • .flake8 - Flake8 Configuration File (Sane Defaults).
  • .pre-commit-config.yaml - Pre-Commit Configuration File (Sane Defaults).
  • pyproject.toml - Black and isort Configuration File (Sane Defaults).
  • ARCHITECTURE.md - Write up of repository architecture/design choices, how to extend and re-work for different applications.
  • CONTRIBUTING.md - Detailed instructions for contributing to the repository, in furtherance of the default instructions above.
  • README.md - You are here!
  • LICENSE - By default, research code is made available under the MIT License. Change as you see fit, but think deeply about why!
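
When adapting the template, it can help to verify that a checkout still matches the layout listed above. The following is a minimal sketch of such a check; the function name missing_entries is hypothetical, and the EXPECTED list is copied from the file-tree in this README, so it should be updated as your project's structure changes.

```python
import tempfile
from pathlib import Path

# Paths taken from the file-tree overview in this README.
EXPECTED = [
    "conf", "environments", "src/preprocessing", "src/models", "tests",
    "train.py", "Makefile", ".flake8", ".pre-commit-config.yaml",
    "pyproject.toml", "ARCHITECTURE.md", "CONTRIBUTING.md", "README.md",
    "LICENSE",
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Demo on a throwaway directory containing only part of the layout.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "src" / "models").mkdir(parents=True)
        (Path(tmp) / "train.py").touch()
        print(missing_entries(tmp))
```

Running missing_entries(".") from the repository root lists anything that has been renamed or removed.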

Start-Up (from Scratch)

Use these commands if you're starting a repository from scratch (this shouldn't be necessary for your collaborators, since you'll be setting things up, but I like to keep this in the README in case things break in the future). Generally, if you're just trying to run/use this code, look at the Quickstart section above.

GPU & Cluster Environments (CUDA 11.0)

conda create --name mjolnir python=3.8
conda activate mjolnir  # Activate first so the installs below land in the new environment
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch   # CUDA=11.0 on most of Cluster!
conda install ipython
conda install pytorch-lightning -c conda-forge

pip install black flake8 isort matplotlib pre-commit quinine wandb

# Install other dependencies via pip below -- conda dependencies should be added above (always conda before pip!)
...

CPU Environments (Usually for Local Development -- Geared for Mac OS & Linux)

Similar to the above, but installs the CPU-only versions of Torch and similar dependencies.

conda create --name mjolnir python=3.8
conda activate mjolnir  # Activate first so the installs below land in the new environment
conda install pytorch torchvision torchaudio -c pytorch
conda install ipython
conda install pytorch-lightning -c conda-forge

pip install black flake8 isort matplotlib pre-commit quinine wandb

# Install other dependencies via pip below -- conda dependencies should be added above (always conda before pip!)
...

Containerized Setup

Support for running mjolnir inside of a Docker or Singularity container is TBD. If this support is urgently required, please file an issue.

Owner
Sidd Karamcheti
PhD Student at Stanford & Research Intern at Hugging Face 🤗