This is an official implementation of the High-Resolution Transformer for Dense Prediction.

Overview

High-Resolution Transformer for Dense Prediction

Introduction

This is the official implementation of the High-Resolution Transformer (HRT). We present a High-Resolution Transformer (HRT) that learns high-resolution representations for dense prediction tasks, in contrast to the original Vision Transformer, which produces low-resolution representations at high memory and computational cost. We take advantage of the multi-resolution parallel design introduced in high-resolution convolutional networks (HRNet), along with local-window self-attention that performs self-attention over small non-overlapping image windows, to improve memory and computation efficiency. In addition, we introduce a convolution into the FFN to exchange information across the disconnected image windows. We demonstrate the effectiveness of the High-Resolution Transformer on human pose estimation and semantic segmentation tasks.

  • The High-Resolution Transformer architecture:

[teaser figure: the multi-resolution parallel HRT architecture]
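
To make the two key ideas concrete, here is a minimal, hypothetical PyTorch sketch of window partitioning for local self-attention and of an FFN with a depth-wise 3x3 convolution; module names, shapes, and the use of nn.MultiheadAttention are illustrative assumptions, not the repository's actual implementation.

    import torch
    import torch.nn as nn

    def window_partition(x, ws):
        """Split a (B, H, W, C) map into non-overlapping ws x ws windows."""
        B, H, W, C = x.shape
        x = x.view(B, H // ws, ws, W // ws, ws, C)
        # each window becomes one independent attention "sequence"
        return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, ws * ws, C)

    class ConvFFN(nn.Module):
        """FFN with a 3x3 depth-wise conv that exchanges information across
        neighboring windows (which plain window attention cannot do)."""
        def __init__(self, dim, hidden):
            super().__init__()
            self.fc1 = nn.Conv2d(dim, hidden, 1)
            self.dw = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)
            self.fc2 = nn.Conv2d(hidden, dim, 1)
            self.act = nn.GELU()
        def forward(self, x):  # x: (B, C, H, W)
            return self.fc2(self.act(self.dw(self.act(self.fc1(x)))))

    B, H, W, C, ws = 2, 56, 56, 32, 7
    feat = torch.randn(B, H, W, C)
    windows = window_partition(feat, ws)          # (B * 64, 49, 32)
    attn = nn.MultiheadAttention(C, num_heads=4, batch_first=True)
    out, _ = attn(windows, windows, windows)      # attention within each window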

Pose estimation

2D Human Pose Estimation

Results on COCO val2017, using a person detector with 56.4 human AP on COCO val2017

| Backbone | Input Size | AP | AP50 | AP75 | ARM | ARL | AR | ckpt | log | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HRT-S | 256x192 | 74.0% | 90.2% | 81.2% | 70.4% | 80.7% | 79.4% | ckpt | log | script |
| HRT-S | 384x288 | 75.6% | 90.3% | 82.2% | 71.6% | 82.5% | 80.7% | ckpt | log | script |
| HRT-B | 256x192 | 75.6% | 90.8% | 82.8% | 71.7% | 82.6% | 80.8% | ckpt | log | script |
| HRT-B | 384x288 | 77.2% | 91.0% | 83.6% | 73.2% | 84.2% | 82.0% | ckpt | log | script |

Results on COCO test-dev, using a person detector with 56.4 human AP on COCO val2017

| Backbone | Input Size | AP | AP50 | AP75 | ARM | ARL | AR | ckpt | log | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HRT-S | 384x288 | 74.5% | 92.3% | 82.1% | 70.7% | 80.6% | 79.8% | ckpt | log | script |
| HRT-B | 384x288 | 76.2% | 92.7% | 83.8% | 72.5% | 82.3% | 81.2% | ckpt | log | script |

The models are first pre-trained on the ImageNet-1K dataset and then fine-tuned on the COCO train2017 dataset.

Semantic segmentation

Cityscapes

Performance on the Cityscapes dataset. The models are trained with an input size of 512x1024 and tested with 1024x2048.

| Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCRNet | HRT-S | 7x7 | Train | Val | 80000 | 8 | Yes | 80.0 | 81.0 | log | ckpt | script |
| OCRNet | HRT-B | 7x7 | Train | Val | 80000 | 8 | Yes | 81.4 | 82.0 | log | ckpt | script |
| OCRNet | HRT-B | 15x15 | Train | Val | 80000 | 8 | Yes | 81.9 | 82.6 | log | ckpt | script |

PASCAL-Context

The models are trained with an input size of 520x520 and tested at the original image size.

| Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCRNet | HRT-S | 7x7 | Train | Val | 60000 | 16 | Yes | 53.8 | 54.6 | log | ckpt | script |
| OCRNet | HRT-B | 7x7 | Train | Val | 60000 | 16 | Yes | 56.3 | 57.1 | log | ckpt | script |
| OCRNet | HRT-B | 15x15 | Train | Val | 60000 | 16 | Yes | 57.6 | 58.5 | log | ckpt | script |

COCO-Stuff

The models are trained with an input size of 520x520 and tested at the original image size.

| Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCRNet | HRT-S | 7x7 | Train | Val | 60000 | 16 | Yes | 37.9 | 38.9 | log | ckpt | script |
| OCRNet | HRT-B | 7x7 | Train | Val | 60000 | 16 | Yes | 41.6 | 42.5 | log | ckpt | script |
| OCRNet | HRT-B | 15x15 | Train | Val | 60000 | 16 | Yes | 42.4 | 43.3 | log | ckpt | script |

ADE20K

The models are trained with an input size of 520x520 and tested at the original image size. The results with window size 15x15 will be updated later.

| Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCRNet | HRT-S | 7x7 | Train | Val | 150000 | 8 | Yes | 44.0 | 45.1 | log | ckpt | script |
| OCRNet | HRT-B | 7x7 | Train | Val | 150000 | 8 | Yes | 46.3 | 47.6 | log | ckpt | script |
| OCRNet | HRT-B | 13x13 | Train | Val | 150000 | 8 | Yes | 48.7 | 50.0 | log | ckpt | script |
| OCRNet | HRT-B | 15x15 | Train | Val | 150000 | 8 | Yes | - | - | - | - | - |

Classification

Results on ImageNet-1K

| Backbone | Top-1 Acc | Top-5 Acc | #params | FLOPs | ckpt | log | script |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HRT-T | 78.6% | 94.2% | 8.0M | 1.83G | ckpt | log | script |
| HRT-S | 81.2% | 95.6% | 13.5M | 3.56G | ckpt | log | script |
| HRT-B | 82.8% | 96.3% | 50.3M | 13.71G | ckpt | log | script |

Citation

If you find this project useful in your research, please consider citing:

@article{YuanFHZCW21,
  title={HRT: High-Resolution Transformer for Dense Prediction},
  author={Yuhui Yuan and Rao Fu and Lang Huang and Chao Zhang and Xilin Chen and Jingdong Wang},
  journal={arXiv},
  year={2021}
}

Acknowledgment

This project is developed based on the Swin-Transformer, openseg.pytorch, and mmpose.

Comments
  • Question about Local Self-Attention of your code

    Hi, I'm very interested in your work on local self-attention and feature fusion in Transformers, but I have a question. The input image size for the image classification task in the source code is fixed at 224 or 384, i.e., an integer multiple of 32. If the input size is not fixed, as in a detection task where the input is 800x1333, the feature map can still be divided into window-sized tiles by padding; but how should the key_padding_mask be handled in that case?

    The shape of the attention weight map is [bs x H/7 x W/7, 49, 49] (the default window size is 7), but the key padding mask has shape [1, HW], so how can I convert this mask to match the attention weight map?

    I sincerely hope you can give me some advice on this question. Thanks!
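
    For anyone facing the same mismatch, here is a minimal sketch (not code from this repository) of one way to window a dense padding mask so that it lines up with the [bs x H/7 x W/7, 49, 49] attention shape; the mask contents and the use of nn.MultiheadAttention are illustrative assumptions:

        import torch

        bs, H, W, ws = 2, 14, 21, 7                   # H, W already padded to multiples of ws
        pad_mask = torch.zeros(bs, H, W, dtype=torch.bool)
        pad_mask[:, :, 16:] = True                    # True marks padded pixels (e.g. right padding)

        # (bs, H, W) -> (bs * num_windows, ws * ws): one mask row per window
        m = pad_mask.view(bs, H // ws, ws, W // ws, ws)
        key_padding_mask = m.permute(0, 1, 3, 2, 4).reshape(-1, ws * ws)

        # nn.MultiheadAttention broadcasts this (N, 49) mask over the (N, 49, 49)
        # attention logits, masking padded keys for every query in the window.
        # Caveat: a window that is entirely padding yields an all-True row and
        # NaN outputs, so such windows need special handling.
        attn = torch.nn.MultiheadAttention(32, 4, batch_first=True)
        x = torch.randn(bs * (H // ws) * (W // ws), ws * ws, 32)
        out, _ = attn(x, x, x, key_padding_mask=key_padding_mask)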

    opened by Huzhen757 4
  • about pose training speed

    The computation cost of HRT-S at 256x192 is about 2.8 GFLOPs, but when I train it, I find it significantly slower than HRNet, which costs about 7.9 GFLOPs. Do you know how to solve this? Thanks.
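
    Worth noting: FLOP counts ignore memory-bound operations (window reshapes, attention softmax, many small kernels), so wall-clock timing is the more reliable comparison. A generic sketch for measuring it (any nn.Module works; the warmup and iteration counts are arbitrary choices):

        import time
        import torch

        def gpu_latency(model, x, warmup=10, iters=50):
            """Average forward-pass latency in seconds on the current GPU."""
            model.eval().cuda()
            x = x.cuda()
            with torch.no_grad():
                for _ in range(warmup):          # let cuDNN pick kernels, warm caches
                    model(x)
                torch.cuda.synchronize()         # timing must bracket all queued work
                t0 = time.time()
                for _ in range(iters):
                    model(x)
                torch.cuda.synchronize()
            return (time.time() - t0) / iters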

    opened by maowayne123 4
  • Is the padding module wrong?

    Hello, I observe that in the class PadBlock, the operation you perform is "n (qh ph) (qw pw) c -> (ph pw) (n qh qw) c", which puts the padding groups into the batch dim. This may cause a problem: the pad-group-wise attention is computed across all samples in the batch. Do you think the permutation should be "n (qh ph) (qw pw) c -> (n ph pw) (qh qw) c" instead?
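
    To make the two layouts concrete, a small einops sketch (shapes are illustrative; this is not the repository's code):

        import torch
        from einops import rearrange

        n, qh, ph, qw, pw, c = 2, 3, 7, 3, 7, 8
        x = torch.randn(n, qh * ph, qw * pw, c)

        # PadBlock's pattern: the ph*pw groups become the batch dim, so each
        # group's sequence mixes tokens gathered from all n samples.
        a = rearrange(x, "n (qh ph) (qw pw) c -> (ph pw) (n qh qw) c", ph=ph, pw=pw)
        print(a.shape)  # torch.Size([49, 18, 8]) -- sequence spans the whole batch

        # Suggested pattern: groups stay per-sample; attention only mixes the
        # qh*qw tokens belonging to one sample.
        b = rearrange(x, "n (qh ph) (qw pw) c -> (n ph pw) (qh qw) c", ph=ph, pw=pw)
        print(b.shape)  # torch.Size([98, 9, 8])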

    opened by UBCIntelliview 3
  • Need pre-trained model on ImageNet-1K

    Hi, thanks for your work! I'm trying to train your model in custom config from scratch, but have not found any pre-trained model on ImageNet-1K. Do you plan to share these models?

    opened by WinstonDeng 2
  • undefined symbol: _Z13__THCudaCheck9cudaErrorPKci

    FutureWarning, WARNING:torch.distributed.run:

    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

        Traceback (most recent call last):
          File "tools/train.py", line 168, in <module>
            main()
          File "tools/train.py", line 122, in main
            env_info_dict = collect_env()
          File "/dataset/wh/wh_code/HRFormer-main/pose/mmpose/utils/collect_env.py", line 8, in collect_env
            env_info = collect_basic_env()
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/utils/env.py", line 85, in collect_env
            from mmcv.ops import get_compiler_version, get_compiling_cuda_version
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/__init__.py", line 1, in <module>
            from .bbox import bbox_overlaps
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/bbox.py", line 3, in <module>
            ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/utils/ext_loader.py", line 12, in load_ext
            ext = importlib.import_module('mmcv.' + name)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/importlib/__init__.py", line 127, in import_module
            return _bootstrap._gcd_import(name[level:], package, level)
        ImportError: /home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _Z13__THCudaCheck9cudaErrorPKci
        ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 42674) of binary: /home/celia/anaconda3/envs/open-mmlab/bin/python
        Traceback (most recent call last):
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
            "__main__", mod_spec)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
            exec(code, run_globals)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
            main()
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
            launch(args)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
            run(args)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/run.py", line 718, in run
            )(*cmd_args)
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
            return launch_agent(self._config, self._entrypoint, list(args))
          File "/home/celia/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
            failures=result.failures,
        torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

        tools/train.py FAILED

        Failures:
        [1]: time: 2022-10-24_10:03:43, host: omnisky, rank: 1 (local_rank: 1), exitcode: 1 (pid: 42675), error_file: <N/A>, traceback: To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
        [2]: time: 2022-10-24_10:03:43, host: omnisky, rank: 2 (local_rank: 2), exitcode: 1 (pid: 42676), error_file: <N/A>, traceback: To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
        [3]: time: 2022-10-24_10:03:43, host: omnisky, rank: 3 (local_rank: 3), exitcode: 1 (pid: 42677), error_file: <N/A>, traceback: To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

        Root Cause (first observed failure):
        [0]: time: 2022-10-24_10:03:43, host: omnisky, rank: 0 (local_rank: 0), exitcode: 1 (pid: 42674), error_file: <N/A>, traceback: To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

    opened by yzew 1
  • Pretrained model for cityscapes

    Thanks for your great work. I have some trouble reproducing the segmentation results on Cityscapes. I checked the log and found that it might be a problem with the pretrained models; for now I use the released ImageNet model as the pretrained weights. Can you release the pretrained model for Cityscapes? Thanks a lot!

    opened by devillala 1
  • Cuda out of memory on resume (incl. fix)

    It ran out of memory on resume with the exact same params as in training (which had worked). Loading the checkpoint to CPU first fixes the problem:

    resume_dict = torch.load(self.configer.get('network', 'resume'), map_location='cpu')

    Maybe it helps somebody.
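
    A slightly fuller, self-contained sketch of the same workaround (the 'state_dict' key is an assumption about the checkpoint layout):

        import torch

        def load_resume(model, resume_path):
            # map_location='cpu' keeps the checkpoint tensors off the GPU that
            # saved them, which is what triggers the spurious OOM on resume.
            ckpt = torch.load(resume_path, map_location='cpu')
            state = ckpt.get('state_dict', ckpt)  # tolerate raw state_dict files
            model.load_state_dict(state)          # copies onto the model's (GPU) params
            return ckpt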

        2021-08-25 14:51:29,793 INFO [data_helper.py, 126] Input keys: ['img']
        2021-08-25 14:51:29,793 INFO [data_helper.py, 127] Target keys: ['labelmap']
        Traceback (most recent call last):
          File "/home/rsa-key-20190908/HRFormer/seg/main.py", line 541, in <module>
            model.train()
          File "/home/rsa-key-20190908/HRFormer/seg/segmentor/trainer.py", line 438, in train
            self.__train()
          File "/home/rsa-key-20190908/HRFormer/seg/segmentor/trainer.py", line 187, in __train
            outputs = self.seg_net(*inputs)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
            output = self.module(*inputs[0], **kwargs[0])
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/nets/hrt.py", line 117, in forward
            x = self.backbone(x)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/hrt_backbone.py", line 579, in forward
            y_list = self.stage3(x_list)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
            input = module(input)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/hrt_backbone.py", line 282, in forward
            x[i] = self.branches[i](x[i])
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
            input = module(input)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/modules/transformer_block.py", line 103, in forward
            x = x + self.drop_path(self.attn(self.norm1(x), H, W))
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/modules/multihead_isa_pool_attention.py", line 41, in forward
            out, _, _ = self.attn(x_permute, x_permute, x_permute, rpe=self.with_rpe, **kwargs)
          File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
            result = self.forward(*input, **kwargs)
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/modules/multihead_isa_attention.py", line 116, in forward
            rpe=rpe,
          File "/home/rsa-key-20190908/HRFormer/seg/lib/models/backbones/hrt/modules/multihead_isa_attention.py", line 311, in multi_head_attention_forward
            ) + relative_position_bias.unsqueeze(0)
        RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.78 GiB total capacity; 6.64 GiB already allocated; 27.25 MiB free; 6.66 GiB reserved in total by PyTorch)
        Killing subprocess 6170

    opened by marcok 1
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this project's lead researcher, Kasimir Schulz.
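
    The gist of such a patch, sketched generically here (this is not the exact pull-request code): resolve each member's destination and refuse to extract anything that escapes the target directory.

        import os
        import tarfile

        def safe_extractall(tar_path, dest="."):
            with tarfile.open(tar_path) as tar:
                dest_root = os.path.realpath(dest)
                for member in tar.getmembers():
                    target = os.path.realpath(os.path.join(dest, member.name))
                    # a traversal member like "../../etc/passwd" resolves outside dest
                    if os.path.commonpath([dest_root, target]) != dest_root:
                        raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
                tar.extractall(dest)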

    opened by TrellixVulnTeam 0
  • Cannot reproduce the test accuracy.

    I tried to run the test of HRFormer on ImageNet-1K, but the test result is strange: the top-1 accuracy is about 2.0%.

    Test command

    bash run_eval.sh hrt/hrt_tiny ~/Downloads/hrt_tiny_imagenet_pretrained_top1_786.pth  ~/data/imagenet
    

    Test output

    [2022-09-06 15:00:15 hrt_tiny](main.py 157): INFO number of params: 8035820
    All checkpoints founded in output/hrt_tiny/default: []
    [2022-09-06 15:00:15 hrt_tiny](main.py 184): INFO no checkpoint found in output/hrt_tiny/default, ignoring auto resume
    [2022-09-06 15:00:15 hrt_tiny](utils.py 21): INFO ==============> Resuming form /home/mzr/Downloads/hrt_tiny_imagenet_pretrained_top1_786.pth....................
    [2022-09-06 15:00:15 hrt_tiny](utils.py 31): INFO <All keys matched successfully>
    [2022-09-06 15:00:19 hrt_tiny](main.py 389): INFO Test: [0/391]	Time 4.122 (4.122)	Loss 8.9438 (8.9438)	Acc@1 2.344 (2.344)	Acc@5 4.688 (4.688)	Mem 2309MB
    [2022-09-06 15:00:29 hrt_tiny](main.py 389): INFO Test: [10/391]	Time 1.028 (1.279)	Loss 9.0749 (9.3455)	Acc@1 5.469 (2.486)	Acc@5 12.500 (7.031)	Mem 2309MB
    [2022-09-06 15:00:39 hrt_tiny](main.py 389): INFO Test: [20/391]	Time 1.027 (1.159)	Loss 9.9610 (9.3413)	Acc@1 0.781 (2.269)	Acc@5 4.688 (7.403)	Mem 2309MB
    [2022-09-06 15:00:49 hrt_tiny](main.py 389): INFO Test: [30/391]	Time 0.952 (1.103)	Loss 9.1598 (9.3309)	Acc@1 1.562 (2.293)	Acc@5 7.812 (7.359)	Mem 2309MB
    [2022-09-06 15:00:59 hrt_tiny](main.py 389): INFO Test: [40/391]	Time 0.951 (1.071)	Loss 9.3239 (9.3605)	Acc@1 0.781 (2.210)	Acc@5 4.688 (7.241)	Mem 2309MB
    [2022-09-06 15:01:09 hrt_tiny](main.py 389): INFO Test: [50/391]	Time 0.952 (1.049)	Loss 9.7051 (9.3650)	Acc@1 0.781 (2.191)	Acc@5 3.125 (7.200)	Mem 2309MB
    [2022-09-06 15:01:18 hrt_tiny](main.py 389): INFO Test: [60/391]	Time 0.951 (1.035)	Loss 9.5935 (9.3584)	Acc@1 1.562 (2.075)	Acc@5 7.812 (7.095)	Mem 2309MB
    ...
    

    The environment is brand new, set up according to the install instructions, and the checkpoint is from https://github.com/HRNet/HRFormer/releases/tag/v1.0.0. The only change is that I disabled AMP.

    opened by mzr1996 0
  • cocostuff dataset validation bug

    In the segmentation folder, segmentation_val/segmentor/tester.py, line 183:

    def __relabel(self, label_map):
        height, width = label_map.shape
        label_dst = np.zeros((height, width), dtype=np.uint8)
        for i in range(self.configer.get('data', 'num_classes')):
            label_dst[label_map == i] = self.configer.get('data', 'label_list')[i]
      
        label_dst = np.array(label_dst, dtype=np.uint8)
      
        return label_dst
    
    if self.configer.exists('data', 'reduce_zero_label') and self.configer.get('data', 'reduce_zero_label'):
        label_img = label_img + 1
        label_img = label_img.astype(np.uint8)
    if self.configer.exists('data', 'label_list'):
        label_img_ = self.__relabel(label_img)
    else:
        label_img_ = label_img
    

    For the COCO-Stuff dataset (171 classes), the original predicted classes range from 0-170. After the +1 shift they range from 1-171, and label_img is then fed into the __relabel() function. However, the loop in __relabel() only ranges over 0-170, so class 171 is never remapped.
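
    One possible fix, sketched (assuming label_list has num_classes entries and the +1 shift above has been applied): iterate over the shifted ids so the last class is also remapped.

        import numpy as np

        def relabel_shifted(label_map, label_list, num_classes):
            label_dst = np.zeros(label_map.shape, dtype=np.uint8)
            for i in range(num_classes):
                # ids were shifted by +1, so class i now appears as i + 1
                label_dst[label_map == i + 1] = label_list[i]
            return label_dst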

    opened by chencheng1203 0
  • missing `mmpose/version.py`

    Hi,

    When I installed mmpose in this repo, I found there is no mmpose/version.py file.

        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/home/chenshoufa/workspace/HRFormer/pose/setup.py", line 105, in <module>
            version=get_version(),
          File "/home/chenshoufa/workspace/HRFormer/pose/setup.py", line 14, in get_version
            with open(version_file, 'r') as f:
        FileNotFoundError: [Errno 2] No such file or directory: 'mmpose/version.py'
    
    
    opened by ShoufaChen 2
  • Inference speed

    What is the inference speed for, e.g., semantic segmentation with 1024x1024 input (referring to Table 5)? Measured on a GPU of your choice, just to get a feeling.

    opened by UrskaJ 0