3D ResNets for Action Recognition (CVPR 2018)

Overview

3D ResNets for Action Recognition

Update (2020/4/13)

We published a paper on arXiv.

Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, and Yutaka Satoh,
"Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs",
arXiv preprint, arXiv:2004.04968, 2020.

We uploaded the pretrained models described in this paper, including ResNet-50 pretrained on the combined Kinetics-700 and Moments in Time dataset.

Update (2020/4/10)

We significantly updated our scripts. If you want to use older versions to reproduce our CVPR2018 paper, you should use the scripts in the CVPR2018 branch.

This update includes the following:

  • Refactoring the whole project
  • Supporting newer PyTorch versions
  • Supporting distributed training
  • Supporting training and testing on the Moments in Time dataset
  • Adding R(2+1)D models
  • Uploading 3D ResNet models trained on the Kinetics-700, Moments in Time, and STAIR-Actions datasets

Summary

This is the PyTorch code for the following papers:

Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, and Yutaka Satoh,
"Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs",
arXiv preprint, arXiv:2004.04968, 2020.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Towards Good Practice for Action Recognition with Spatiotemporal 3D Convolutions",
Proceedings of the International Conference on Pattern Recognition, pp. 2516-2521, 2018.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?",
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6546-6555, 2018.

Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh,
"Learning Spatio-Temporal Features with 3D Residual Networks for Action Recognition",
Proceedings of the ICCV Workshop on Action, Gesture, and Emotion Recognition, 2017.

This code includes training, fine-tuning and testing on Kinetics, Moments in Time, ActivityNet, UCF-101, and HMDB-51.

Citation

If you use this code or pre-trained models, please cite the following:

@inproceedings{hara3dcnns,
  author={Kensho Hara and Hirokatsu Kataoka and Yutaka Satoh},
  title={Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={6546--6555},
  year={2018},
}

Pre-trained models

Pre-trained models are available here.
All models are trained on Kinetics-700 (K), Moments in Time (M), STAIR-Actions (S), or merged datasets combining them (KM, KS, MS, KMS).
If you want to fine-tune a model on your own dataset, specify the following options (a worked example follows the list).

r3d18_K_200ep.pth: --model resnet --model_depth 18 --n_pretrain_classes 700
r3d18_KM_200ep.pth: --model resnet --model_depth 18 --n_pretrain_classes 1039
r3d34_K_200ep.pth: --model resnet --model_depth 34 --n_pretrain_classes 700
r3d34_KM_200ep.pth: --model resnet --model_depth 34 --n_pretrain_classes 1039
r3d50_K_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 700
r3d50_KM_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 1039
r3d50_KMS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 1139
r3d50_KS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 800
r3d50_M_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 339
r3d50_MS_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 439
r3d50_S_200ep.pth: --model resnet --model_depth 50 --n_pretrain_classes 100
r3d101_K_200ep.pth: --model resnet --model_depth 101 --n_pretrain_classes 700
r3d101_KM_200ep.pth: --model resnet --model_depth 101 --n_pretrain_classes 1039
r3d152_K_200ep.pth: --model resnet --model_depth 152 --n_pretrain_classes 700
r3d152_KM_200ep.pth: --model resnet --model_depth 152 --n_pretrain_classes 1039
r3d200_K_200ep.pth: --model resnet --model_depth 200 --n_pretrain_classes 700
r3d200_KM_200ep.pth: --model resnet --model_depth 200 --n_pretrain_classes 1039
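
For example, to fine-tune r3d50_KM_200ep.pth on UCF-101, the command would look roughly like this (assuming the checkpoint has been downloaded to ~/data/models/ and UCF-101 has been prepared as described below):

python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 1039 \
--pretrain_path models/r3d50_KM_200ep.pth --ft_begin_module fc \
--model resnet --model_depth 50 --batch_size 128 --n_threads 4 --checkpoint 5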

Old pretrained models are still available here.
However, some modifications are required to use the old pretrained models in the current scripts.
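
If you only want to inspect a downloaded checkpoint before adapting it (for example, one of the old models), a minimal sketch in plain PyTorch follows; the file name is just an example, and the exact keys ('arch', 'state_dict') are an assumption based on how main.py resumes checkpoints, not a guaranteed format.

import torch

# Hypothetical file name; use whichever pretrained model you downloaded.
checkpoint = torch.load('r3d50_KM_200ep.pth', map_location='cpu')
print(list(checkpoint.keys()))   # assumed to include e.g. 'arch' and 'state_dict'
print(checkpoint.get('arch'))    # architecture string that main.py compares against when resuming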

Requirements

  • PyTorch (with torchvision), e.g. installed via conda:

conda install pytorch torchvision cudatoolkit=10.1 -c soumith

  • FFmpeg, FFprobe

  • Python 3

Preparation

ActivityNet

  • Download videos using the official crawler.
  • Convert from mp4 to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs mp4_video_dir_path jpg_video_dir_path activitynet
  • Add fps information to the json file using util_scripts/add_fps_into_activitynet_json.py
python -m util_scripts.add_fps_into_activitynet_json mp4_video_dir_path json_file_path

Kinetics

  • Download videos using the official crawler.
    • Locate test set in video_directory/test.
  • Convert from mp4 to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs mp4_video_dir_path jpg_video_dir_path kinetics
  • Generate annotation file in json format similar to ActivityNet using util_scripts/kinetics_json.py
    • The CSV files (kinetics_{train, val, test}.csv) are included in the crawler.
python -m util_scripts.kinetics_json csv_dir_path 700 jpg_video_dir_path jpg dst_json_path

UCF-101

  • Download videos and train/test splits here.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs avi_video_dir_path jpg_video_dir_path ucf101
  • Generate annotation file in json format similar to ActivityNet using util_scripts/ucf101_json.py
    • annotation_dir_path includes classInd.txt, trainlist0{1, 2, 3}.txt, testlist0{1, 2, 3}.txt
python -m util_scripts.ucf101_json annotation_dir_path jpg_video_dir_path dst_json_path

HMDB-51

  • Download videos and train/test splits here.
  • Convert from avi to jpg files using util_scripts/generate_video_jpgs.py
python -m util_scripts.generate_video_jpgs avi_video_dir_path jpg_video_dir_path hmdb51
  • Generate annotation file in json format similar to ActivityNet using util_scripts/hmdb51_json.py
    • annotation_dir_path includes brush_hair_test_split1.txt, ...
python -m util_scripts.hmdb51_json annotation_dir_path jpg_video_dir_path dst_json_path

Running the code

Assume the structure of data directories is the following:

~/
  data/
    kinetics_videos/
      jpg/
        .../ (directories of class names)
          .../ (directories of video names)
            ... (jpg files)
    results/
      save_100.pth
    kinetics.json
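
Note that --video_path, --annotation_path, and --result_path are interpreted relative to --root_path. A small sketch of how the options in the commands below map onto the tree above (the joining behaviour is inferred from the example commands, so treat it as illustrative):

from pathlib import Path

# --root_path ~/data, --video_path kinetics_videos/jpg, --annotation_path kinetics.json, --result_path results
root_path = Path('~/data').expanduser()
video_path = root_path / 'kinetics_videos/jpg'
annotation_path = root_path / 'kinetics.json'
result_path = root_path / 'results'
print(video_path, annotation_path, result_path)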

Confirm all options.

python main.py -h

Train ResNet-50 on the Kinetics-700 dataset (700 classes) with 4 CPU threads (for data loading).
The batch size is 128.
Models are saved every 5 epochs. All GPUs are used for training; to use only some of them, set CUDA_VISIBLE_DEVICES=... (see the example after the command).

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --model resnet \
--model_depth 50 --n_classes 700 --batch_size 128 --n_threads 4 --checkpoint 5
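
To use only some of the GPUs, prefix the same command with CUDA_VISIBLE_DEVICES; the GPU indices below are just an example:

CUDA_VISIBLE_DEVICES=0,1 python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --model resnet \
--model_depth 50 --n_classes 700 --batch_size 128 --n_threads 4 --checkpoint 5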

Continue training from epoch 101 (~/data/results/save_100.pth is loaded).

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_100.pth \
--model_depth 50 --n_classes 700 --batch_size 128 --n_threads 4 --checkpoint 5

Calculate the top-5 class probabilities of each video using a trained model (~/data/results/save_200.pth).
Note that inference_batch_size should be small because the actual batch size is inference_batch_size * (n_video_frames / inference_stride); a rough numerical illustration follows the command.

python main.py --root_path ~/data --video_path kinetics_videos/jpg --annotation_path kinetics.json \
--result_path results --dataset kinetics --resume_path results/save_200.pth \
--model_depth 50 --n_classes 700 --n_threads 4 --no_train --no_val --inference --output_topk 5 --inference_batch_size 1
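
As a rough illustration of why inference_batch_size should stay small (the frame count below is made up; inference_stride defaults to 16):

# Hypothetical 320-frame validation video with the default inference_stride of 16.
n_video_frames = 320
inference_stride = 16
inference_batch_size = 1

n_clips = n_video_frames // inference_stride        # 20 sliding windows over the video
actual_batch_size = inference_batch_size * n_clips  # 20 clips pushed through the model at once
print(actual_batch_size)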

Evaluate top-1 video accuracy of a recognition result (~/data/results/val.json).

python -m util_scripts.eval_accuracy ~/data/kinetics.json ~/data/results/val.json --subset val -k 1 --ignore

Fine-tune fc layers of a pretrained model (~/data/models/resnet-50-kinetics.pth) on UCF-101.

python main.py --root_path ~/data --video_path ucf101_videos/jpg --annotation_path ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 700 \
--pretrain_path models/resnet-50-kinetics.pth --ft_begin_module fc \
--model resnet --model_depth 50 --batch_size 128 --n_threads 4 --checkpoint 5

Comments
  • question about the 'Temporal duration of inputs'

    [email protected], in opts.py, can I change the temporal duration of inputs in parser.add_argument('--sample_duration', default=16, type=int, help='Temporal duration of inputs'), e.g. to 32 or 64 frames? Have you run similar experiments? I would really appreciate your reply. Thanks.

    opened by sophiazy 25
  • Performance of pretrained weights on UCF101

    Hi, nice work! I have a question about your results on UCF101 split 1. I evaluated your pretrained weights "resnext-101-kinetics-ucf101_split1.pth" on UCF101 split 1 and got an accuracy of ~85.99. I'm wondering whether that is the expected accuracy. Would you please provide the accuracies of the pretrained models?

    opened by MohsenFayyaz89 21
  • Train from scratch on UCF101 using ResNet18 and get 10% gain without doing anything

    I trained and evaluated the code and got a 10% gain without changing anything. Here is my process:

    1. Parse the video data using the scripts from the README.
    2. Train the model using python3 main.py --root_path ./datasets/ --video_path UCF101/jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --model resnet --model_depth 18 --n_classes 101 --batch_size 16 and get the resulting model datasets/results/save_200.pth.
    3. Test the dataset using python3 main.py --root_path ./datasets/ --video_path UCF101/jpg --annotation_path ucf101_01.json --result_path results --dataset ucf101 --resume_path results/save_200.pth --model resnet --model_depth 18 --n_classes 101 --batch_size 16 --no_train --test and get the results in val.json; the console reports a clip accuracy of 0.346.
    4. Compute the accuracy using eval_ucf101.py to compare ucf101_01.json with val.json; the top-1 video accuracy is 52.66%, which is about 10% above the 42.4% reported in the paper.

    I only use UCF101 split 01, so there are no overlapping videos between the train and test data. It is a little strange; could it be that the training in the paper did not run for 200 epochs? My platform is PyTorch 0.4, and I only modified one place to avoid an error, which is reported in another issue.

    opened by BestJuly 16
  • RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

    Hi, I need help. When running main.py, everything goes well up to the dataset loading, as shown below:

    model generated
    dataset loading [0/9537]
    dataset loading [1000/9537]
    dataset loading [2000/9537]
    dataset loading [3000/9537]
    dataset loading [4000/9537]
    dataset loading [5000/9537]
    dataset loading [6000/9537]
    dataset loading [7000/9537]
    dataset loading [8000/9537]
    dataset loading [9000/9537]
    dataset loading [0/3783]
    dataset loading [1000/3783]
    dataset loading [2000/3783]
    dataset loading [3000/3783]
    run

    The error occurred here:

    train at epoch 1
    Traceback (most recent call last):
      File "/media/psrana/New Volume/chandni/HAR_3D_TU/main.py", line 139, in <module>
        train_logger, train_batch_logger)
      File "/media/psrana/New Volume/chandni/HAR_3D_TU/train.py", line 22, in train_epoch
        for i, (inputs, targets) in enumerate(data_loader):
      File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 417, in __iter__
        return DataLoaderIter(self)
      File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 242, in __init__
        self._put_indices()
      File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 290, in _put_indices
        indices = next(self.sample_iter, None)
      File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
        for idx in self.sampler:
      File "/home/psrana/anaconda3/envs/har_chandni/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
        return iter(torch.randperm(len(self.data_source)).long())
    RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

    What could be the problem?

    opened by chandnikathuria1992 12
  • Very Slow Training

    I am training ResNet with depth 34 on the Kinetics dataset, but the training procedure is not improving anything. How long does it take until the model starts improving? I have attached a screenshot; currently I am at epoch 34, but the loss is still 5.99 and not decreasing, and the accuracy is very volatile.

    [attached screenshot of the training log]

    opened by cryptedp 11
  • RuntimeError: expected a non-empty list of Tensors

    Traceback (most recent call last):
      File "main.py", line 129, in <module>
        train_logger, train_batch_logger)
      File "/home/hareesh/Downloads/3D-ResNets-PyTorch-master/train.py", line 22, in train_epoch
        for i, (inputs, targets) in enumerate(data_loader):
      File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 286, in __next__
        return self._process_next_batch(batch)
      File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
        raise batch.exc_type(batch.exc_msg)
    RuntimeError: Traceback (most recent call last):
      File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/hareesh/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/hareesh/Downloads/3D-ResNets-PyTorch-master/datasets/ucf101.py", line 193, in __getitem__
        clip = torch.stack(clip, 0).permute(1, 0, 2, 3)
    RuntimeError: expected a non-empty list of Tensors

    Please let me know the cause of this error.

    opened by hareeshdevarakonda 10
  • Input of Densenet

    Thank you for your wonderful work. I just read the paper, and it notes that each clip contains 16 frames. I have read two other papers whose authors claim that 32-frame input would be better; have you tried 32-frame input? If you have trained such models, could you please release the pretrained weights?

    opened by Tord-Zhang 10
  • Asking about the using of 3D ResNet on video sequence

    Hello,

    I'm new to this kind of 3D convolution, so I'm trying to understand how it works. My dataset (UNBC-McMaster) includes videos containing sequences of frames, and each frame has one pain intensity level. Now I want to use 3D ResNet to predict the pain level as a regression problem. So, say we have a sequence of 32 frames, which means I have 32 labels for this sequence. Normally, with CNN + LSTM, I would use the CNN to extract features, put them through the LSTM, and take the output and the label of the last frame to compute the loss. So, for 3D ResNet, should I take the output of the model and the label of the last frame to calculate the loss?

    opened by glmanhtu 9
  • i have some problem do a test

    I would like to test the network on UCF101 after fine-tuning on UCF101 using the pretrained Kinetics model you provided.

    I use ResNet-18. I want to get per-video accuracy, not per-clip accuracy.

    I use the command below:

    python main.py --root_path UCF101 --video_path jpg --annotation_path ucf101_01.json --result_path test --dataset ucf101 --model resnet --model_depth 18 --n_classes 101 --batch_size 64 --n_threads 4 --pretrain_path 18result1s/save_200.pth --no_train --no_val --test --test_subset val --n_finetune_classes 101

    jpg has the same layout as your data directory. 18result1s/save_200.pth is the network pretrained on Kinetics and fine-tuned on UCF101.

    It raises an error:

    run
    dataset loading [0/3783]
    dataset loading [1000/3783]
    dataset loading [2000/3783]
    dataset loading [3000/3783]
    test
    test.py:42: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
      inputs = Variable(inputs, volatile=True)
    test.py:45: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
      outputs = F.softmax(outputs)
    Traceback (most recent call last):
      File "main.py", line 162, in <module>
        test.test(test_loader, model, opt, test_data.class_names)
      File "test.py", line 50, in test
        test_results, class_names)
      File "test.py", line 20, in calculate_video_results
        'label': class_names[locs[i]],
    KeyError: tensor(12)

    KeyError: tensor(12). If I change my command, the number in parentheses changes, I think.

    can you help me?

    opened by lee2h 9
  • Performance of fine-tuning on UCF101

    I downloaded the ResNet-101 network pretrained on Kinetics and fine-tuned it on UCF101 following the example script. However, I can only get 82.5 by averaging the three splits, while the paper reports 88.9. Any suggestions?

    opened by zhihuilics 9
  • Pretrain models cannot download

    Hello, I am making a demo with a 3D convolutional network. After reading your CVPR paper, I would be happy to use your pretrained models. However, when I try to open your link, I get "The folder has been put in the recycle bin.". These pretrained models are extremely important for my program, because I don't have enough GPUs to train the network from scratch. Please give me a chance to use the 3D ResNet... @kenshohara

    opened by KeCh96 8
  • main.py: error: unrecognized arguments: hmdb51_3.json

    I'm trying to train ResNet on HMDB51, so I have 3 json files. But whichever one I pass with the --annotation_path argument, it keeps giving this error. Please help!

    opened by soumyadbanik 2
  • ABOUT ucf101_json

    I used the Python script to generate the three json files for UCF101: ucf101_01.json, ucf101_02.json, and ucf101_03.json. I ran the main function to train ResNet:

    python main.py --root_path ./data --video_path ucf101_jpg/ --annotation_path ucf101_json/ucf101_01.json \
    --result_path results --dataset ucf101 --n_classes 101 --model resnet --model_depth 50 --batch_size 64 \
    --n_threads 4 --checkpoint 5

    I want to know when ucf101_02.json and ucf101_03.json should be used. Thank you very much!

    opened by theones-g 0
  • Image Resolution 112*112

    I wanted to be able to input larger image resolutions. However, when I use an input image size of 480*480, it takes almost 10 minutes to process a tiny 10-second clip.

    It seems that when I increase the image size, the model's inference runtime grows dramatically.

    Crucial motion information is lost when I downscale my images to 112*112, and it is affecting the precision of the model on my test sets.

    Is there any alternative model or method that will allow me to proceed with larger image resolutions using the 3D-ResNet model?

    Is it practical to use 3D-CNN with input sizes of 480*480 images for video classification tasks?

    opened by darshvirbelandis 1
  • Why is opt.n_val_samples 3

    I found that when fine-tuning on UCF101 with the split-1 partition, the number of validation samples is 11349 instead of 3783. Why is the validation batch size opt.batch_size // opt.n_val_samples?

    opened by YTHmamba 0
  • AssertionError when I inference

    I used r2p1d18_K_200ep.pth and fine-tuned it on the HMDB51 dataset, and when I try to use it for inference there is an AssertionError.

    CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --root_path /home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data --video_path hmdb51-videos/jpg --annotation_path hmdb51_1.json \
    --result_path results --dataset hmdb51 --resume_path results/save_200.pth \
    --model_depth 18 --n_classes 51 --n_threads 4 --no_train --no_val --inference --output_topk 5 --inference_batch_size 1

    Namespace(accimage=False, annotation_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/hmdb51_1.json'), arch='resnet-18', batch_size=128, batchnorm_sync=False, begin_epoch=1, checkpoint=10, colorjitter=False, conv1_t_size=7, conv1_t_stride=1, dampening=0.0, dataset='hmdb51', dist_url='tcp://127.0.0.1:23456', distributed=False, file_type='jpg', ft_begin_module='', inference=True, inference_batch_size=1, inference_crop='center', inference_no_average=False, inference_stride=16, inference_subset='val', input_type='rgb', learning_rate=0.1, lr_scheduler='multistep', manual_seed=1, mean=[0.4345, 0.4051, 0.3775], mean_dataset='kinetics', model='resnet', model_depth=18, momentum=0.9, multistep_milestones=[50, 100, 150], n_classes=51, n_epochs=200, n_input_channels=3, n_pretrain_classes=0, n_threads=4, n_val_samples=3, nesterov=False, no_cuda=False, no_hflip=False, no_max_pool=False, no_mean_norm=False, no_std_norm=False, no_train=True, no_val=True, optimizer='sgd', output_topk=5, overwrite_milestones=False, plateau_patience=10, pretrain_path=None, resnet_shortcut='B', resnet_widen_factor=1.0, resnext_cardinality=32, result_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results'), resume_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results/save_200.pth'), root_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data'), sample_duration=16, sample_size=112, sample_t_stride=1, std=[0.2768, 0.2713, 0.2737], tensorboard=False, train_crop='random', train_crop_min_ratio=0.75, train_crop_min_scale=0.25, train_t_crop='random', value_scale=1, video_path=PosixPath('/home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/hmdb51-videos/jpg'), weight_decay=0.001, wide_resnet_k=2, world_size=-1)

    loading checkpoint /home/pubNAS/jianfei/3D-ResNets-PyTorch-master/data/results/save_200.pth model
    Traceback (most recent call last):
      File "main.py", line 428, in <module>
        main_worker(-1, opt)
      File "main.py", line 345, in main_worker
        model = resume_model(opt.resume_path, opt.arch, model)
      File "main.py", line 89, in resume_model
        assert arch == checkpoint['arch']
    AssertionError

    opened by z369437558 0