STAR

Official implementation of Sparse Transformer-based Action Recognition

Dataset

Download the 2D/3D skeleton data of the NTU RGB+D 60 action recognition dataset from http://rose1.ntu.edu.sg/datasets/actionRecognition.asp

Alternatively, use the Google Drive copies:

NTU60 NTU120

Unzip the data into one of the following layouts: $(project_folder)/raw/*.skeleton or $(project_folder)/dataset/raw/*.skeleton. That is, create a "raw" folder under $(project_folder) or $(project_folder)/dataset, then put the raw skeleton files in that "raw" folder.
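Before generating the dataset, it can help to confirm that the skeleton files landed in one of the two expected locations. The snippet below is a minimal sketch, not part of the repository: the helper name find_raw_dir is hypothetical, and it assumes the raw files carry the .skeleton extension.

```python
from pathlib import Path


def find_raw_dir(project_folder):
    """Return the first 'raw' folder holding .skeleton files, or None.

    Hypothetical helper for illustration; it checks the two layouts
    described above: <project>/raw and <project>/dataset/raw.
    """
    root = Path(project_folder)
    for candidate in (root / "raw", root / "dataset" / "raw"):
        if candidate.is_dir() and any(candidate.glob("*.skeleton")):
            return candidate
    return None


if __name__ == "__main__":
    raw_dir = find_raw_dir(Path.cwd())
    if raw_dir is None:
        print("No .skeleton files found under raw/ or dataset/raw/")
    else:
        count = sum(1 for _ in raw_dir.glob("*.skeleton"))
        print(f"Found {count} skeleton files in {raw_dir}")
```

Run it from the project folder. If it reports no files, the archive was probably extracted one directory level too deep or too shallow.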

Run the command below to generate the dataset:

python datagen.py

Training

Fetch and check out the "distributed" branch, then start training:

python train_dist.py  # distributed training

Configuration

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),  # or dataset_root=osp.join(os.getcwd(), 'dataset')
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=64,
                    out_channels=64,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)

parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=128,
                    out_channels=128,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)
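The two blocks above differ only in hid_channels and out_channels (64 versus 128) and in the comment on dataset_root. As a rough illustration of how such defaults behave, the sketch below assumes that parser is a standard argparse.ArgumentParser. The flag declarations are hypothetical and cover only a few of the options; the training script may declare them differently.

```python
import argparse
import os
import os.path as osp


def build_parser():
    """Build a small parser mirroring a subset of the defaults above."""
    parser = argparse.ArgumentParser(description="Illustrative STAR options")
    # Hypothetical flag declarations; only set_defaults appears in the README.
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--hid_channels", type=int)
    parser.add_argument("--out_channels", type=int)
    parser.add_argument("--dataset_root", type=str)
    # set_defaults supplies values used when a flag is not given.
    parser.set_defaults(batch_size=128,
                        hid_channels=64,
                        out_channels=64,
                        dataset_root=osp.join(os.getcwd()))
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # No flags given: the values from set_defaults apply.
    defaults = parser.parse_args([])
    print(defaults.batch_size, defaults.hid_channels, defaults.out_channels)
    # Flags given: they take precedence over set_defaults.
    wider = parser.parse_args(["--hid_channels", "128", "--out_channels", "128"])
    print(wider.hid_channels, wider.out_channels)
```

If the script exposes its options this way, the wider variant could be selected from the command line rather than by editing the defaults in the source.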

Owner

Chonghan_Lee
