Alleviating Over-segmentation Errors by Detecting Action Boundaries

Last update: Dec 12, 2022

Overview

Alleviating Over-segmentation Errors by Detecting Action Boundaries

Forked from the official ASRF code. This repo is an implementation that replaces the original MS-TCN backbone with ASFormer.

Dataset

GTEA, 50Salads, Breakfast

You can download the features and ground-truth labels of these datasets from this repository.
Alternatively, you can extract the features yourself using this repository.

Requirements

Python >= 3.7
pytorch >= 1.0
torchvision
pandas
numpy
Pillow
PyYAML

You can install the packages using requirements.txt.

```
pip install -r requirements.txt
```

Directory Structure

```
root ── csv/
      ├─ libs/
      ├─ imgs/
      ├─ result/
      ├─ utils/
      ├─ dataset ─── 50salads/...
      │           ├─ breakfast/...
      │           └─ gtea ─── features/
      │                    ├─ groundTruth/
      │                    ├─ splits/
      │                    └─ mapping.txt
      ├─ .gitignore
      ├─ README.md
      ├─ requirements.txt
      ├─ save_pred.py
      ├─ train.py
      └─ evaluate.py
```

The csv directory contains the csv files required for training and testing.
The image in imgs comes from PascalVOC and is used as a color palette for visualizing outputs.
Experimental results are stored in the result directory.
Scripts in utils are not used directly by train.py and evaluate.py, but are needed for converting labels, generating configurations, visualization, and so on.
Scripts in libs are required for training and evaluation (e.g., models, loss functions, and dataset classes).
The datasets downloaded from this repository are stored in dataset. You can put them in another directory, but you then need to specify the path in the configuration files.
train.py is a script for training networks.
evaluate.py is a script for evaluation.
save_pred.py is a script for saving model predictions.

How to use

Please also check scripts/experiment.sh, which runs all of the experiments described below.

First, download the features and ground-truth labels of these datasets from this repository.
The features and ground-truth labels need to be converted to numpy arrays. This repository does not provide boundary ground-truth labels, so you have to generate them as well. Please run the following commands, where [DATASET_DIR] is the path to your dataset directory.
```
python utils/generate_gt_array.py --dataset_dir [DATASET_DIR]
python utils/generate_boundary_array.py --dataset_dir [DATASET_DIR]
```
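
To give a rough idea of what a boundary ground-truth array is, the sketch below derives one from frame-wise action labels. It is not the code in utils/generate_boundary_array.py: the function name `labels_to_boundaries` is hypothetical, and marking the first frame as a boundary is an assumption made for this illustration.

```python
import numpy as np


def labels_to_boundaries(labels: np.ndarray) -> np.ndarray:
    """Return a 0/1 array that marks frames where a new action segment starts.

    Hypothetical helper for illustration only; the repository's own script
    may define boundaries differently.
    """
    labels = np.asarray(labels)
    boundaries = np.zeros(len(labels), dtype=np.float32)
    if len(labels) == 0:
        return boundaries
    # Assumption: the first frame counts as the start of a segment.
    boundaries[0] = 1.0
    # A frame is a boundary when its label differs from the previous frame.
    boundaries[1:] = (labels[1:] != labels[:-1]).astype(np.float32)
    return boundaries


if __name__ == "__main__":
    frame_labels = np.array([0, 0, 0, 2, 2, 1, 1, 1])
    print(labels_to_boundaries(frame_labels))
```

Each run of identical labels is one action segment, so the boundary array has exactly one positive entry per segment under these assumptions.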
In this implementation, csv files are used to keep track of the training and test data. You can run the command below to generate the csv files, but we suggest using the csv files provided in the repo.
```
python utils/make_csv_files.py --dataset_dir [DATASET_DIR]
```

You can automatically generate experiment configuration files by running the following commands, which create directories and configuration files in root_dir. However, we suggest using the config files provided in the repo.

```
python utils/make_config.py --root_dir ./result/50salads --dataset 50salads --split 1 2 3 4 5
python utils/make_config.py --root_dir ./result/gtea --dataset gtea --split 1 2 3 4
python utils/make_config.py --root_dir ./result/breakfast --dataset breakfast --split 1 2 3 4
```

If you want to add other configurations, please add command-line options like:

```
python utils/make_config.py --root_dir ./result/50salads --dataset 50salads --split 1 2 3 4 5 --learning_rate 0.1 0.01 0.001 0.0001
```

Please see libs/config.py for the available configuration options.
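
Passing several values per option presumably yields one configuration per combination of values. The sketch below illustrates that idea with a Cartesian product. It is not the code in utils/make_config.py: `expand_configs` and `config_dir_name` are hypothetical helpers, and only the `dataset-50salads_split-1` naming pattern is taken from the paths used in this README. How the real script names directories when extra options are given is an assumption here.

```python
import itertools
from typing import Any, Dict, List


def expand_configs(options: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Build one config dict per combination of option values (hypothetical helper)."""
    keys = list(options)
    value_lists = [options[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def config_dir_name(config: Dict[str, Any]) -> str:
    """Join 'key-value' pairs with underscores, e.g. 'dataset-50salads_split-1'.

    Assumption: extra options are appended in the same 'key-value' form.
    """
    return "_".join(f"{key}-{value}" for key, value in config.items())


if __name__ == "__main__":
    grid = expand_configs(
        {
            "dataset": ["50salads"],
            "split": [1, 2, 3, 4, 5],
            "learning_rate": [0.1, 0.01, 0.001, 0.0001],
        }
    )
    print(len(grid))  # 5 splits x 4 learning rates
    for config in grid[:3]:
        print(config_dir_name(config))
```

With five splits and four learning rates, this grid contains 20 configurations, so the number of experiments grows quickly as options are added.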

You can train and evaluate models by specifying a configuration file generated in the above process. For example, the config.yaml for the 50salads dataset trains for 80 epochs.
```
python train.py ./result/50salads/dataset-50salads_split-1/config.yaml
python evaluate.py ./result/50salads/dataset-50salads_split-1/config.yaml test
```

You can also save model predictions as numpy arrays by running:

```
python save_pred.py ./result/50salads/dataset-50salads_split-1/config.yaml test
```

If you want to visualize the saved model predictions, please run:

```
python utils/convert_arr2img.py ./result/50salads/dataset-50salads_split-1/predictions
```
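
As a rough picture of what such a visualization involves, the sketch below maps a frame-wise label array to a colored strip. It is not the code in utils/convert_arr2img.py. The repo takes its palette from a PascalVOC image in imgs, whereas this sketch computes the standard PascalVOC colormap directly, and `voc_palette` and `labels_to_strip` are hypothetical names.

```python
import numpy as np


def voc_palette(n_colors: int = 256) -> np.ndarray:
    """Compute the standard PascalVOC colormap as an (n_colors, 3) uint8 array."""
    palette = np.zeros((n_colors, 3), dtype=np.uint8)
    for index in range(n_colors):
        code = index
        r = g = b = 0
        # Spread the bits of the class index over the high bits of R, G and B.
        for shift in range(8):
            r |= ((code >> 0) & 1) << (7 - shift)
            g |= ((code >> 1) & 1) << (7 - shift)
            b |= ((code >> 2) & 1) << (7 - shift)
            code >>= 3
        palette[index] = (r, g, b)
    return palette


def labels_to_strip(labels, height: int = 20) -> np.ndarray:
    """Turn a 1-D label sequence into a (height, T, 3) RGB strip (hypothetical helper)."""
    colors = voc_palette()[np.asarray(labels)]  # shape (T, 3)
    return np.tile(colors[np.newaxis], (height, 1, 1))


if __name__ == "__main__":
    prediction = np.array([0, 0, 1, 1, 1, 2, 2])
    strip = labels_to_strip(prediction, height=4)
    print(strip.shape)
```

The resulting array can be passed to an image library such as Pillow to write it to disk. Each column is one frame, so over-segmentation shows up as many narrow color bands.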

License

This repository is released under the MIT License.

Citation

@inproceedings{chinayi_ASformer,
  author={Fangqiu Yi and Hongyu Wen and Tingting Jiang},
  booktitle={The British Machine Vision Conference (BMVC)},
  title={ASFormer: Transformer for Action Segmentation},
  year={2021},
}

Reference

Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka, "Alleviating Over-segmentation Errors by Detecting Action Boundaries", in WACV 2021.
Colin Lea et al., "Temporal Convolutional Networks for Action Segmentation and Detection", in CVPR 2017.
Yazan Abu Farha et al., "MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation", in CVPR 2019.
