Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, among other papers.

Overview

An Image Captioning codebase

This is a codebase for image captioning research.

It supports a range of captioning models (FC, attention-based, Transformer) and training objectives (cross-entropy, self-critical, structure loss); see the sections below and MODEL_ZOO.md.

A simple demo Colab notebook is available here.

Requirements

  • Python 3
  • PyTorch 1.3+ (along with torchvision)
  • cider (included as a submodule)
  • coco-caption (included as a submodule; remember to follow the initialization steps in coco-caption/README.md)
  • yacs
  • lmdbdict

Install

If you have difficulty running the training scripts in tools, you can try installing this repo as a Python package:

python -m pip install -e .
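
After installing, you can quickly check that the package is importable (the package name captioning is assumed from the repository layout):

$ python -c "import captioning"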

Pretrained models

Check out MODEL_ZOO.md.

If you only want to run evaluation, you can follow this section after downloading the pretrained models (plus either the pretrained resnet101 or the precomputed bottom-up features; see data/README.md).

Train your own network on COCO/Flickr30k

Prepare data.

We now support both flickr30k and COCO. See details in data/README.md. (Note: the later sections assume the COCO dataset; it should be trivial to use flickr30k instead.)

Start training

$ python tools/train.py --id fc --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_fc --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30

or

$ python tools/train.py --cfg configs/fc.yml --id fc

The train script will dump checkpoints into the folder specified by --checkpoint_path (default = log_$id/). By default, only the best-performing checkpoint on validation and the latest checkpoint are saved, to conserve disk space. You can also set --save_history_ckpt to 1 to save every checkpoint.

To resume training, set the --start_from option to the path containing infos.pkl and model.pth (usually you can just set --start_from and --checkpoint_path to the same directory).
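
For example, to resume the fc run from above (a sketch assuming the default log_fc directory produced by the earlier command):

$ python tools/train.py --cfg configs/fc.yml --id fc --start_from log_fc --checkpoint_path log_fc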

To inspect the training and validation curves, you can use TensorBoard. The loss histories are automatically dumped into --checkpoint_path.
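
For example, assuming checkpoints were written to log_fc:

$ tensorboard --logdir log_fc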

The command above uses scheduled sampling; you can set --scheduled_sampling_start to -1 to turn scheduled sampling off.

If you'd like to evaluate BLEU/METEOR/CIDEr scores during training in addition to the validation cross-entropy loss, use the --language_eval 1 option, but don't forget to pull the coco-caption submodule (see the command below).
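
If you haven't pulled the submodules yet, the standard git command should fetch them (remember to also follow the setup steps in coco-caption/README.md):

$ git submodule update --init --recursive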

For all the arguments, you can specify them in a yaml file and pass it via --cfg. Command-line arguments override the cfg file when they conflict.
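
For example, the following sketch keeps the settings from configs/fc.yml but overrides the batch size on the command line (the value here is only for illustration):

$ python tools/train.py --cfg configs/fc.yml --id fc --batch_size 16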

For more options, see opts.py.

Train using self critical

First you should preprocess the dataset and build the n-gram cache used for computing the CIDEr score:

$ python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train

Then copy the model pretrained with cross entropy. (Copying is not mandatory; it just keeps a backup.)

$ bash scripts/copy_model.sh fc fc_rl

Then run:

$ python tools/train.py --id fc_rl --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-5 --start_from log_fc_rl --checkpoint_path log_fc_rl --save_checkpoint_every 6000 --language_eval 1 --val_images_use 5000 --self_critical_after 30 --cached_tokens coco-train-idxs --max_epochs 50 --train_sample_n 5

or

$ python tools/train.py --cfg configs/fc_rl.yml --id fc_rl

You should see a large boost in CIDEr score. :)

A few notes on training: starting self-critical training after 30 epochs, the CIDEr score goes up to 1.05 after 600k iterations (including the 30 epochs of pretraining).

Generate image captions

Evaluate on raw images

Note: this doesn't work for models trained with bottom-up features. Place all your images of interest into a folder, e.g. blah, and run the eval script:

$ python tools/eval.py --model model.pth --infos_path infos.pkl --image_folder blah --num_images 10

This tells the eval script to run on up to 10 images from the given folder. If you have a big GPU you can speed up the evaluation by increasing batch_size. Use --num_images -1 to process all images. The eval script will create a vis.json file inside the vis folder, which can then be visualized with the provided HTML interface:

$ cd vis
$ python -m http.server  # on Python 2, use: python -m SimpleHTTPServer

Now visit localhost:8000 in your browser and you should see your predicted captions.

Evaluate on Karpathy's test split

$ python tools/eval.py --dump_images 0 --num_images 5000 --model model.pth --infos_path infos.pkl --language_eval 1 

The default split to evaluate is test. The default inference method is greedy decoding (--sample_method greedy); to sample from the posterior, set --sample_method sample.

Beam Search. Beam search can improve performance over greedy decoding by roughly 5%, at some extra computational cost. To turn on beam search, use --beam_size N, where N is greater than 1.
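
For example, a sketch of evaluating the Karpathy test split with a beam of 5, based on the command above:

$ python tools/eval.py --dump_images 0 --num_images 5000 --model model.pth --infos_path infos.pkl --language_eval 1 --beam_size 5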

Evaluate on COCO test set

$ python tools/eval.py --input_json cocotest.json --input_fc_dir data/cocotest_bu_fc --input_att_dir data/cocotest_bu_att --input_label_h5 none --num_images -1 --model model.pth --infos_path infos.pkl --language_eval 0

You can download the preprocessed files cocotest.json, cocotest_bu_att, and cocotest_bu_fc from link.

Miscellanea

Using CPU. The code currently uses GPU by default; there is no option for switching. If someone really needs a CPU model, please open an issue; I can potentially create a CPU checkpoint and modify eval.py to run the model on CPU. However, there is no point in using CPUs to train the model.

Train on other datasets. It should be trivial to port if you can create a file like dataset_coco.json for your own dataset.

Live demo. Not supported yet; pull requests are welcome.

For more advanced features:

Check out ADVANCED.md.

Reference

If you find this repo useful, please consider citing (no obligation at all):

@article{luo2018discriminability,
  title={Discriminability objective for training descriptive captions},
  author={Luo, Ruotian and Price, Brian and Cohen, Scott and Shakhnarovich, Gregory},
  journal={arXiv preprint arXiv:1803.04376},
  year={2018}
}

Of course, please cite the original papers of the models you are using (you can find references in the model files).

Acknowledgements

Thanks to the original neuraltalk2 and the awesome PyTorch team.

Comments
  • A question about the bottom-up and top-down model

    Hi, the bottom-up and top-down paper reports reaching CIDEr 120.1 after 60k iterations (about 9 hours of training). I changed your model's hyperparameters to the ones in the paper and changed the attention LSTM's input to the mean of the per-bounding-box image features, but I still cannot reach the paper's results. In your experience, what could be the reason?

    opened by JimLee4530 26
  • The CIDEr score gets smaller than the pretrained model

    Hi, I am trying to reproduce your code in TensorFlow. I wrote my code almost the same as yours, but I find the CIDEr score keeps getting smaller. At the same time, the sampled sentences also get shorter. Can you give me some advice?

    opened by wjb123 23
  • Structure loss

    Hi,

    I have a question regarding the structure loss in su+bu+structure. Is the structure loss weight similar to the lambda in your discriminability paper, i.e. does it scale the reward against the cross-entropy term? And if so, can each of the rewards have a different weight?

    Thank you.

    opened by mememimis 20
  • About multi-GPU training

    Hi, when I tried to train the fc model on 2 GPUs by setting the environment variable CUDA_VISIBLE_DEVICES=0,1, I got an error at train.py#125, but there were no errors when running your code on a single GPU.

    Does this repository support multi-GPU training?

    opened by xuyan1115 16
  • Evaluation error using pre-trained model

    Hi Ruotian,

    I tried to run the following script: python eval.py --model resnet50.pth --infos_path infos.pkl --image_folder ./image/val2014_coco/ --num_images 1

    The model was resnet50 (and resnet101) downloaded from your Google Drive. But I got the error:

        Traceback (most recent call last):
          File "eval.py", line 102, in <module>
            model.load_state_dict(torch.load(opt.model))
          File "/home/wentong/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 522, in load_state_dict
            .format(name))
        KeyError: 'unexpected key "conv1.weight" in state_dict'

    I have searched online but there was little information about that. I guess you have used multiple gpus.

    Any advice? Thank you for your implementation.

    opened by Wentong-DST 15
  • CUDA out of memory in self-critical training but not in xe

    Hello. I am using GTX2080 Ti with 11GB memory. I trained the model using xe and it works fine. But then when I run self-critical, I get CUDA out of memory. How come the model can fit in xe training but not in rl training? It is the same model with the same number of parameters. Any advice would help

    opened by homelifes 12
  • Issue when training

    I am trying to train the model as per the instructions given in the repo. I am getting the error below:

    [[email protected] self-critical.pytorch]$ python tools/train.py --id fc --caption_model newfc --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_fc --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30
    Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
    meshed-memory-transformer not installed; please run pip install git+https://github.com/ruotianluo/meshed-memory-transformer.git
    DataLoader loading json file: data/cocotalk.json
    vocab size is 9487
    DataLoader loading h5 file: data/cocotalk_fc data/cocotalk_att data/cocotalk_box data/cocotalk_label.h5
    max sequence length in data is 16
    read 123287 image features
    assigned 113287 images to split train
    assigned 5000 images to split val
    assigned 5000 images to split test
    /home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/data/dataloader.py:291: RuntimeWarning: Mean of empty slice.
      fc_feat = att_feat.mean(0)
    (the warning above is repeated many times)
    2020-07-27 22:49:15.536672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    Read data: 0.0002117156982421875
    Traceback (most recent call last):
      File "tools/train.py", line 289, in <module>
        train(opt)
      File "tools/train.py", line 182, in train
        model_out = dp_lw_model(fc_feats, att_feats, labels, masks, att_masks, data['gts'], torch.arange(0, len(data['gts'])), sc_flag, struc_flag)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
        output.reraise()
      File "/usr/local/lib64/python3.6/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    StopIteration: Caught StopIteration in replica 0 on device 0.
    Original Traceback (most recent call last):
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
        output = module(*input, **kwargs)
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/modules/loss_wrapper.py", line 45, in forward
        loss = self.crit(self.model(fc_feats, att_feats, labels[..., :-1], att_masks), labels[..., 1:], masks[..., 1:])
      File "/usr/local/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/CaptionModel.py", line 33, in forward
        return getattr(self, '_'+mode)(*args, **kwargs)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/AttModel.py", line 128, in _forward
        state = self.init_hidden(batch_size*seq_per_img)
      File "/home/default/ephemeral_drive/work/image_captioning/self-critical.pytorch/captioning/models/AttModel.py", line 99, in init_hidden
        weight = next(self.parameters())
    StopIteration

    PyTorch version: '1.5.0+cu101'

    I saw a related bug report, https://github.com/huggingface/transformers/issues/3936, which describes a similar issue. Not sure if it is related.

    opened by gsrivas4 11
  • Transformer performance

    Hello, training with transformer.yml I can only reach Bleu_1: 0.749 Bleu_2: 0.584 Bleu_3: 0.443 Bleu_4: 0.336 METEOR: 0.271 ROUGE_L: 0.553 CIDEr: 1.092, which is far below the scores you mention. How did you set your training parameters?

    opened by hasky123 10
  • Type error while training

    @ruotianluo Sorry again, after fixing the file problem, I got an error: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)

    run tools/train.py --cfg configs/fc_rl.yml --id fc_rl
    DataLoader loading json file: data/cocotalk.json
    vocab size is 9487
    DataLoader loading h5 file: data/cocotalk_fc data/cocotalk_att data/cocotalk_box data/cocotalk_label.h5
    max sequence length in data is 16
    read 123287 image features
    assigned 113287 images to split train
    assigned 5000 images to split val
    assigned 5000 images to split test
    Read data: 0.003994464874267578
    Save ckpt on exception ...
    model saved to ./log_fc_rl\model.pth
    Save ckpt done.
    Traceback (most recent call last):
      File "D:\Stephan\Final project\self-critical.pytorch-master\tools\train.py", line 183, in train
        model_out = dp_lw_model(fc_feats, att_feats, labels, masks, att_masks, data['gts'], torch.arange(0, len(data['gts'])), sc_flag, struc_flag)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 150, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\modules\loss_wrapper.py", line 45, in forward
        loss = self.crit(self.model(fc_feats, att_feats, labels[..., :-1], att_masks), labels[..., 1:], masks[..., 1:])
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\CaptionModel.py", line 33, in forward
        return getattr(self, '_'+mode)(*args, **kwargs)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\AttModel.py", line 160, in _forward
        output, state = self.get_logprobs_state(it, p_fc_feats, p_att_feats, pp_att_feats, p_att_masks, state)
      File "D:\Stephan\Final project\self-critical.pytorch-master\captioning\models\AttModel.py", line 167, in get_logprobs_state
        xt = self.embed(it)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\modules\sparse.py", line 114, in forward
        self.norm_type, self.scale_grad_by_freq, self.sparse)
      File "C:\Users\ncku_ailab\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1484, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)

    opened by stephancheng 10
  • Question regarding self-critical reward

    Hi Ruotian, I'd like to ask about the nature of the reward when training with self-critical. Is it normal for it to start at negative or zero for the first few epochs? I am getting the following for the first epoch with self-critical (after training with XE for 13 epochs):

    [screenshot of the reward log omitted]

    Is this normal? And what is the maximum reward you achieved after training with self-critical?

    Also, I'm using the py3 branch. I saw there are a lot of differences between the py3 and py2 branches. So is the py3 branch reliable to use?

    Looking forward to your answer!

    opened by fawazsammani 9
  • ZeroDivisionError: division by zero

    Hi Dr. Luo, I am trying to evaluate a model, but I get an error.

    error info:

    python eval.py --model data/pretrain/fc/model-best.pth  --infos_path data/pretrain/fc/infos_fc-best.pkl --image_folder blah --num_images -1
    loading annotations into memory...
    0:00:00.498650
    creating index...
    index created!
    Traceback (most recent call last):
      File "eval.py", line 71, in <module>
        lang_stats = eval_utils.language_eval(opt.input_json, predictions, n_predictions, vars(opt), opt.split)
      File "/home/andrewcao95/workspace/pycharm_ws/self-critical.pytorch/eval_utils.py", line 83, in language_eval
        mean_perplexity = sum([_['perplexity'] for _ in preds_filt]) / len(preds_filt)
    ZeroDivisionError: division by zero
    
    

    source code:

    # filter results to only those in MSCOCO validation set
        preds_filt = [p for p in preds if p['image_id'] in valids]
        mean_perplexity = sum([_['perplexity'] for _ in preds_filt]) / len(preds_filt)
        mean_entropy = sum([_['entropy'] for _ in preds_filt]) / len(preds_filt)
        print('using %d/%d predictions' % (len(preds_filt), len(preds)))
        json.dump(preds_filt, open(cache_path, 'w')) # serialize to temporary json file. Sigh, COCO API...
    
        cocoRes = coco.loadRes(cache_path)
        cocoEval = COCOEvalCap(coco, cocoRes)
        cocoEval.params['image_id'] = cocoRes.getImgIds()
        cocoEval.evaluate()
    
        for metric, score in cocoEval.eval.items():
            out[metric] = score
        # Add mean perplexity
        out['perplexity'] = mean_perplexity
        out['entropy'] = mean_entropy
    
        imgToEval = cocoEval.imgToEval
        for k in list(imgToEval.values())[0]['SPICE'].keys():
            if k != 'All':
                out['SPICE_'+k] = np.array([v['SPICE'][k]['f'] for v in imgToEval.values()])
                out['SPICE_'+k] = (out['SPICE_'+k][out['SPICE_'+k]==out['SPICE_'+k]]).mean()
        for p in preds_filt:
            image_id, caption = p['image_id'], p['caption']
            imgToEval[image_id]['caption'] = caption
    
    opened by andrewcao95 8
  • How can I get my dataset's image features for scripts/prepro_feats.py

    Thanks for your work! When preparing the data, I found that I have not pre-extracted the image features for my dataset. How can I get these image features? Could you point me to some code?

    opened by ShanZard 0
  • How to set up the model ensemble?

    Hello, I see the model ensembling method in the paper, which helps improve model performance. I want to know how to run model ensembling with this code. Thanks very much.

    opened by fjqfjqfjqfjqfjqfjqfjq 0
  • transformer_nsc

    Hi, when running transformer_nsc.yml I ran into the following problem; how can I solve it?

    Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
    meshed-memory-transformer not installed; please run pip install git+https://github.com/ruotianluo/meshed-memory-transformer.git
    Warning: coco-caption not available

    opened by bai-24 11
  • Running inference using CPU only

    I am not able to run inference on the model provided in the demo using CPU only. I am getting a "CUDA device not available" error. I tried removing all GPU and CUDA references in the code and replacing them with CPU, but it still does not work. It would be really helpful if you could explain how to run the demo code without a GPU.

    Thanks.

    opened by harindercnvrg 0
Releases (3.2)
  • 3.2(May 29, 2020)

    1. Faster beam search.
    2. Support h5 feature files.
    3. Allow beam search + SCST (doesn't work as well though).
    4. Add a few models: BertCapModel and m2transformer (usefulness still in question).
    5. Add projects.
  • v3.1(Jan 10, 2020)

    1. Since it's 2020, py3 is officially supported. Open an issue if there is still something wrong.
    2. Finally, there is a model zoo which is relatively complete. Feel free to try the provided models.
  • 3(Dec 31, 2019)

    1. Add structure loss inspired by Classical Structured Prediction Losses for Sequence to Sequence Learning.
    2. Add a function to sample n captions, supporting the methods described in https://www.dropbox.com/s/tdqr9efrjdkeicz/iccv.pdf?dl=0.
    3. More PyTorch-like dataloader design. The dataloader no longer repeats image features according to seq_per_img; the repeating is now done in the model's forward function.
    4. Add multi-sentence sampling evaluation metrics such as mBleu and Self-CIDEr (those described in https://www.dropbox.com/s/tdqr9efrjdkeicz/iccv.pdf?dl=0).
    5. Use detectron-style configs to set up experiments.
    6. A better self-critical objective (now named new_self_critical). Use config ymls ending with nsc to test its performance. A technical report will be out soon. Basically, it performs better than the original SCST on all metrics (by a small margin) and is also slightly faster.
  • 2.2(Jun 25, 2019)

    1. Refactor the code a little bit.
    2. Add BPE (didn't seem to make much difference).
    3. Add nucleus sampling, top-k and Gumbel-softmax sampling.
    4. Make AttEnsemble compatible with the transformer.
    5. Add "remove bad ending" from Improving Reinforcement Learning Based Image Captioning with Natural Language Prior.

  • 2.1(Jun 25, 2019)

  • 2.0.0(Apr 29, 2018)

    1. Add support for bleu4 optimization or combination of bleu4 and cider
    2. Add bottom-up feature support
    3. Add ensemble during evaluation.
    4. Add multi-gpu support.
    5. Add miscellaneous things. (box features; experimental models etc.)
  • 1.0(Apr 28, 2018)

Owner
Ruotian(RT) Luo
PhD student at TTIC