Efficient Deep Learning Systems

This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Science of HSE University and the Yandex School of Data Analysis.

Syllabus

- Week 1: Introduction
  - Lecture: Course overview and organizational details. Core concepts of GPU architecture and the CUDA API.
  - Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
- Week 2: Basics of distributed ML
  - Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.
  - Seminar: Multiprocessing basics. Parallel GloVe training.
- Week 3: Data-parallel training and All-Reduce
  - Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
  - Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.
- Week 4: Memory-efficient and model-parallel training
- Week 5: Profiling DL code, training-time optimizations
- Week 6: Basics of Python application deployment
- Week 7: Software for serving neural networks
- Week 8: Optimizing models for faster inference
- Week 9: Experiment tracking, model and data versioning
- Week 10: Testing, debugging and monitoring of models

Grading

There will be four home assignments in total, some of them spanning several weeks. The final grade is a weighted sum of the per-assignment grades. Please refer to the course page of your institution for details.

Owner

Max Ryabinin
