GPU Programming with Julia - course at the Swiss National Supercomputing Centre (CSCS), ETH Zurich

Overview

Course title page

Course Description

The programming language Julia is being more and more adopted in High Performance Computing (HPC) due to its unique way to combine performance with simplicity and interactivity, enabling unprecedented productivity in HPC development. This course will discuss both basic and advanced topics relevant for single and Multi-GPU computing with Julia. It will focus on the CUDA.jl package, which enables writing native Julia code for GPUs. Topics covered include the following:

  • GPU array programming;
  • GPU kernel programming;
  • kernel launch parameters;
  • usage of on-chip memory;
  • Multi-GPU computing;
  • code reflection and introspection; and
  • diverse advanced optimization techniques.

This course combines lectures and hands-on sessions.

Target audience

This course addresses scientists interested in doing HPC using Julia. Previous Julia or GPU computing knowledge is not needed, but a good general understanding of programming is advantageous.

Instructors

  • Dr. Tim Besard (Lead developer of CUDA.jl, Julia Computing Inc.)
  • Dr. Samuel Omlin (Computational Scientist | Responsible for Julia computing, CSCS)

Course material

This git repository contains the material of day 1 and 2 (speaker: Dr. Samuel Omlin, CSCS). The material of day 3 and 4 is found in this git repository (speaker: Dr. Tim Besard, Julia Computing Inc.).

Course recording

The edited course recording is found here. The following list provides key entry points into the video.

Day 1:

00:00: Introduction to the course

05:02: General introduction to supercomputing

14:06: High-speed introduction to GPU computing

32:57: Walk through introduction notebook on memory copy and performance evaluation

Day 2:

1:24:53: Introduction to day 2

1:39:12: Walk through solutions of exercise 1 and 2 (data "transfer" optimisations)

2:34:12: Walk through solutions of exercise 3 and 4 (data "transfer" optimisations and distributed parallelization)

Day 3:

03:31:57: Introduction to day 3

03:32:59: Presentation of notebook 1: cuda libraries

04:24:31: Presentation of notebook 2: programming models

05:30:46: Presentation of notebook 3: memory management

06:03:48: Presentation of notebook 4: concurrent computing

Day 4:

06:27:15: Introduction to day 4

06:28:13: Presentation of notebook 5: application analysis and optimisation

07:35:08: Presentation of notebook 6: kernel analysis and optimisation

Owner
Samuel Omlin
Computational Scientist | Responsible for Julia computing, CSCS - Swiss National Supercomputing Centre
Samuel Omlin
Extreme Rotation Estimation using Dense Correlation Volumes

Extreme Rotation Estimation using Dense Correlation Volumes This repository contains a PyTorch implementation of the paper: Extreme Rotation Estimatio

Ruojin Cai 29 Nov 18, 2022
Out of Distribution Detection on Natural Adversarial Examples

OOD-on-NAE Research project on out of distribution detection for the Computer Vision course by Prof. Rob Fergus (CSCI-GA 2271) Paper out on arXiv - ht

Anugya 1 Jun 08, 2022
Code for Understanding Pooling in Graph Neural Networks

Select, Reduce, Connect This repository contains the code used for the experiments of: "Understanding Pooling in Graph Neural Networks" Setup Install

Daniele Grattarola 37 Dec 13, 2022
OMNIVORE is a single vision model for many different visual modalities

Omnivore: A Single Model for Many Visual Modalities [paper][website] OMNIVORE is a single vision model for many different visual modalities. It learns

Meta Research 451 Dec 27, 2022
Crossover Learning for Fast Online Video Instance Segmentation (ICCV 2021)

TL;DR: CrossVIS (Crossover Learning for Fast Online Video Instance Segmentation) proposes a novel crossover learning paradigm to fully leverage rich c

Hust Visual Learning Team 79 Nov 25, 2022
This is the code related to "Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation" (ICCV 2021).

Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation This is the code relat

39 Sep 23, 2022
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences

Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point

Fangzhou Hong 112 Dec 23, 2022
TensorFlow implementation of ENet, trained on the Cityscapes dataset.

segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e

Fredrik Gustafsson 248 Dec 16, 2022
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Lixiang Ru 33 Dec 12, 2022
URIE: Universal Image Enhancementfor Visual Recognition in the Wild

URIE: Universal Image Enhancementfor Visual Recognition in the Wild This is the implementation of the paper "URIE: Universal Image Enhancement for Vis

Taeyoung Son 43 Sep 12, 2022
Dashboard for the COVID19 spread

COVID-19 Data Explorer App A streamlit Dashboard for the COVID-19 spread. The app is live at: [https://covid19.cwerner.ai]. New data is queried from G

Christian Werner 22 Sep 29, 2022
How to train a CNN to 99% accuracy on MNIST in less than a second on a laptop

Training a NN to 99% accuracy on MNIST in 0.76 seconds A quick study on how fast you can reach 99% accuracy on MNIST with a single laptop. Our answer

Tuomas Oikarinen 42 Dec 10, 2022
This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning"

CSP_Deep_EEG This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning" {https://www

Seyed Mahdi Roostaiyan 2 Nov 08, 2022
PyTorch implementation of our method for adversarial attacks and defenses in hyperspectral image classification.

Self-Attention Context Network for Hyperspectral Image Classification PyTorch implementation of our method for adversarial attacks and defenses in hyp

22 Dec 02, 2022
Use of Attention Gates in a Convolutional Neural Network / Medical Image Classification and Segmentation

Attention Gated Networks (Image Classification & Segmentation) Pytorch implementation of attention gates used in U-Net and VGG-16 models. The framewor

Ozan Oktay 1.6k Dec 30, 2022
Software & Hardware to do multi color printing with Sharpies

3D Print Colorizer is a combination of 3D printed parts and a Cura plugin which allows anyone with an Ender 3 like 3D printer to produce multi colored

343 Jan 06, 2023
MISSFormer: An Effective Medical Image Segmentation Transformer

MISSFormer Code for paper "MISSFormer: An Effective Medical Image Segmentation Transformer". Please read our preprint at the following link: paper_add

Fong 22 Dec 24, 2022
This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans

This repository contains the code for TABS, a 3D CNN-Transformer hybrid automated brain tissue segmentation algorithm using T1w structural MRI scans. TABS relies on a Res-Unet backbone, with a Vision

6 Nov 07, 2022
Coded illumination for improved lensless imaging

CodedCam Coded Illumination for Improved Lensless Imaging Paper | Supplementary results | Data and Code are available. Coded illumination for improved

Computational Sensing and Information Processing Lab 1 Nov 29, 2021
Semi-Autoregressive Transformer for Image Captioning

Semi-Autoregressive Transformer for Image Captioning Requirements Python 3.6 Pytorch 1.6 Prepare data Please use git clone --recurse-submodules to clo

YE Zhou 23 Dec 09, 2022