ICON: Implicit Clothed humans Obtained from Normals

Last update: Dec 30, 2022

Overview

arXiv, December 2021.
Yuliang Xiu · Jinlong Yang · Dimitrios Tzionas · Michael J. Black

Table of Contents

Who needs ICON
More Qualitative Results
Introduction Video
Citation
Acknowledgments
Disclosure
Contact

Who needs ICON?

If you want to reconstruct 3D clothed humans in unconstrained poses from in-the-wild images, together with:
- the body under clothing (e.g. SMPL, SMPL-X)
- clothed-body normal maps (front/back) predicted from images


ICON's outputs from single RGB image

If you want to obtain a realistic and animatable 3D clothed avatar directly from video or a sequence of monocular images, one that is:
- fully textured with per-vertex color
- animatable by SMPL pose parameters
- equipped with pose-dependent clothing deformation


3D Clothed Avatar, created from 400+ images using ICON+SCANimate, animated by AIST++

More Qualitative Results


Comparison with other state-of-the-art methods

Reconstruction on in-the-wild photos with extreme poses (GIF)

Reconstruction on in-the-wild photos with extreme poses (PNG)

Predicted normals on in-the-wild images with extreme poses

Introduction Video

ICON.mp4

Citation

@misc{xiu2021icon,
      title={ICON: Implicit Clothed humans Obtained from Normals}, 
      author={Yuliang Xiu and Jinlong Yang and Dimitrios Tzionas and Michael J. Black},
      year={2021},
      eprint={2112.09127},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgments

We thank Yao Feng, Soubhik Sanyal, Qianli Ma, Hongwei Yi, Chun-Hao Paul Huang, Weiyang Liu, and Xu Chen for their feedback and discussions, Tsvetelina Alexiadis for her help with the AMT perceptual study, Taylor McConnell for her voice-over, and Yuanlu Xu for his help in comparing with ARCH and ARCH++.

Here are some great resources we benefit from:

MonoPortDataset for Data Processing
PaMIR, PIFu, PIFuHD, and MonoPort for Benchmark
SCANimate and AIST++ for Animation
rembg for Human Segmentation
torch-mesh-isect for BVH Computation
smplx, PARE, PyMAF, and PIXIE for Human Pose & Shape Estimation
CAPE and THuman for Dataset
PyTorch3D for Differentiable Rendering

Some images used in the qualitative examples come from pinterest.com.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No.860768 (CLIPE Project).

Disclosure

MJB has received research gift funds from Adobe, Intel, Nvidia, Facebook, and Amazon. While MJB is a part-time employee of Amazon, his research was performed solely at, and funded solely by, Max Planck. MJB has financial interests in Amazon, Datagen Technologies, and Meshcapade GmbH.

Contact

For more questions, please contact [email protected]

Comments

OpenGL.raw.EGL._errors.EGLError: EGLError( )

When I run the command "bash render_batch.sh debug", it gives the following error:

OpenGL.raw.EGL._errors.EGLError: EGLError( err = EGL_NOT_INITIALIZED, baseOperation = eglInitialize, cArguments = ( <OpenGL._opaque.EGLDisplay_pointer object at 0x7f7b3d0ee2c0>, <importlib._bootstrap.LP_c_int object at 0x7f7b3d0ee440>, <importlib._bootstrap.LP_c_int object at 0x7f7b3d106bc0>, ), result = 0 )

How can I fix this?
documentation CUDA or OpenGL Dataset

opened by Yuhuoo 16
ConnectionError: HTTPSConnectionPool

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='drive.google.com', port=443): Max retries exceeded with url: /uc?id=1tCU5MM1LhRgGou5OpmpjBQbSrYIUoYab (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fbc4ad7deb0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
documentation rembg

opened by shuoshuoxu 12

Trouble getting ICON results

After installing all packages, I got results successfully for PIFu and PaMIR, but I hit a runtime error when trying to get the ICON demo result. Could you advise which setting is wrong?

$ python infer.py -cfg ../configs/icon-filter.yaml -gpu 0 -in_dir ../examples -out_dir ../results

Traceback (most recent call last):
  File "infer.py", line 304, in <module>
    verts_pr, faces_pr, _ = model.test_single(in_tensor)
  File "./ICON/apps/ICON.py", line 738, in test_single
    sdf = self.reconEngine(opt=self.cfg,
  File "./.virtualenvs/icon/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../lib/common/seg3d_lossless.py", line 148, in forward
    return self._forward_faster(**kwargs)
  File "../lib/common/seg3d_lossless.py", line 170, in _forward_faster
    occupancys = self.batch_eval(coords, **kwargs)
  File "../lib/common/seg3d_lossless.py", line 139, in batch_eval
    occupancys = self.query_func(**kwargs, points=coords2D)
  File "../lib/common/train_util.py", line 338, in query_func
    preds = netG.query(features=features,
  File "../lib/net/HGPIFuNet.py", line 285, in query
    smpl_sdf, smpl_norm, smpl_cmap, smpl_ind = cal_sdf_batch(
  File "../lib/dataset/mesh_util.py", line 231, in cal_sdf_batch
    residues, normals, pts_cmap, pts_ind = func(
  File "./.virtualenvs/icon/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "./.virtualenvs/icon/lib/python3.8/site-packages/bvh_distance_queries/mesh_distance.py", line 79, in forward
    output = self.search_tree(triangles, points)
  File "./.virtualenvs/icon/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "./.virtualenvs/icon/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "./.virtualenvs/icon/lib/python3.8/site-packages/bvh_distance_queries/bvh_search_tree.py", line 109, in forward
    output = BVHFunction.apply(
  File "./.virtualenvs/icon/lib/python3.8/site-packages/bvh_distance_queries/bvh_search_tree.py", line 42, in forward
    outputs = bvh_distance_queries_cuda.distance_queries(
RuntimeError: after reduction step 1: cudaErrorInvalidDevice: invalid device ordinal

CUDA or OpenGL

opened by Samiepapa 12

THuman Dataset preprocess

Hi, I found that the program runs very slowly when I run bash render_batch.sh debug all. I figured out that it stalls at hits = mesh.ray.intersects_any(origins + delta * normals, vectors), where the number of rays is in the millions. Is that the reason it is so slow?
documentation Dataset

opened by mmmcn 11
Colab main cell doing nothing

Hi, first of all, thanks for your code. In the Colab version, when running the cell that starts with

run the test on examples

the execution is very fast and reports no errors, but nothing happens. The next cell then fails with: FileNotFoundError: [Errno 2] No such file or directory: '/content/ICON/results/icon-filter/vid/22097467bffc92d4a5c4246f7d4edb75_display.mp4'

many thanks

opened by smithee77 9
Error: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv

Getting this error when installing locally on my workstation via colab bash script.

.../ICON/pytorch3d/pytorch3d/_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv

This happens after installing pytorch3d locally as recommended. Conda has too many conflicts and never resolves.

Installing torch through pip works (1.8.2+cu111) up until the next steps of infer.py because bvh_distance_queries only supports cuda 11.0. This would most likely require compiling against 11.0, but it will probably lead to more errors as I don't know what this repository's dependencies require as far as torch goes.
CUDA or OpenGL

opened by ExponentialML 9
The side face seems not good in training phase.

I followed the instructions to process the THuman2.0 dataset and train ICON. After 10 epochs, the results on TensorBoard do not look good, especially the reconstruction of the side of the face. I can't find where the problem is.

The change of loss during training is

opened by River-Zhang 7
Question for Reproducing Result

Hi, I reproduced ICON and trained with THuman2.0.

After my training, only this model produces a lot of noise and isolated mesh pieces.

Could you give me some advice?

I used [-1, 1] SMPL vertices, query points, and 15.0 SDF clipping.

I believe that ICON has great potential and generalization performance.

But I don't know what the problem is.

The results below are about unseen data, and I used pre-trained GCMR just like PaMIR for SMPL prediction.

Thank you.

opened by EadCat 7
Problem about SMPL refining loss.
Many thanks to the authors for this work. I have a question that came up while reading the paper and the code. In the Refining SMPL section, the paper explains that the SMPL fit can be iteratively optimized during inference. The loss function has two parts: the L1 difference between the unclothed normal map and the normal map predicted by the model, and the L1 difference between the mask of the SMPL normal map and the mask of the original image. However, the code does not seem to implement it this way. What is the reason for this? Is the existing code more efficient than the formulation in the paper?

# silhouette loss
smpl_arr = torch.cat([T_mask_F, T_mask_B], dim=-1)[0]  # smpl mask maps
gt_arr = torch.cat(  # clothed normal maps
    [in_tensor['normal_F'][0], in_tensor['normal_B'][0]],
    dim=2).permute(1, 2, 0)
gt_arr = ((gt_arr + 1.0) * 0.5).to(device)
bg_color = torch.Tensor([0.5, 0.5, 0.5]).unsqueeze(0).unsqueeze(0).to(device)
gt_arr = ((gt_arr - bg_color).sum(dim=-1) != 0.0).float()
diff_S = torch.abs(smpl_arr - gt_arr)
losses['silhouette']['value'] = diff_S.mean()
HPS Discussion
opened by SongYupei 7
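
For reference, the two terms described in the post above can be written compactly. The notation is assumed here for illustration and is not taken from the code: N^b and S^b denote the normal map and silhouette rendered from the SMPL body, N^c the predicted clothed normal map, S^c the silhouette of the clothed human, and lambda_N a weighting factor that the post does not mention.

```latex
\mathcal{L}_{\mathrm{SMPL}} = \lambda_N \, \mathcal{L}_N + \mathcal{L}_S,
\qquad
\mathcal{L}_N = \bigl\lvert \mathcal{N}^{b} - \hat{\mathcal{N}}^{c} \bigr\rvert,
\qquad
\mathcal{L}_S = \bigl\lvert \mathcal{S}^{b} - \hat{\mathcal{S}}^{c} \bigr\rvert
```

Read this way, the quoted snippet corresponds to the silhouette term: it concatenates the front and back SMPL masks, derives a foreground mask from the clothed normal maps by comparing them against the gray background value, and averages the absolute difference.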
Some questions about training

I would like to know some details about training. Is the Ground-Truth SMPL or the predicted SMPL used in training ICON? Also, what about normal images? According to my understanding of the paper and practice, ICON should train the normal network first and then train the implicit reconstruction network. When I reproduce ICON, I don't know whether to choose the Ground-Truth or the predicted data for SMPL model and normal images, respectively.
Dataset Training

opened by sunjc0306 7
ModuleNotFoundError: No module named 'bvh_distance_queries_cuda'

Hi, thank you so much for the wonderful work and corresponding codes. I am facing the following issue: https://github.com/YuliangXiu/ICON/blob/0045bd10f076bf367d25b7dac41d0d5887b8694f/lib/bvh-distance-queries/bvh_distance_queries/bvh_search_tree.py#L27

Is there any .py file called bvh_distance_queries_cuda ? Please let me know a possible solution. Thank you for your effort and help :) :) :)
CUDA or OpenGL

opened by Pallab38 7
Question about training

After training the implicit MLP, I got quite weird results. The reconstructed meshes are poor. The evaluation results show that NC is very low, but Chamfer and P2S are very high. Do you know where the problem is? I would appreciate it a lot if you could give me some suggestions!

opened by Zhangjzh 4
Question of cloth refinement

Hi, I'm confused about local_affine_model in the cloth refinement. I see that you use LocalAffine() to convert the vertices of the refined mesh into affine-transformed vertices, and this step seems to train an affine model. I understand that we need the loss between P_normal from the mesh and the normal map of the image to optimize the vertices, but I don't understand why we need to introduce the affine model. Does the LocalAffine procedure mean obtaining the affine matrix in the camera coordinate system? I'm confused about this.

opened by Yuhuoo 1
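
As background for the question above: per-vertex (local) affine models are a common device in non-rigid mesh fitting. Each vertex gets its own 3x3 matrix and offset, initialized to the identity, and a smoothness term keeps the transforms of neighbouring vertices similar, so that an image-space loss deforms the surface coherently instead of moving every vertex independently. The sketch below illustrates only that general idea in plain Python. It is not the repository's LocalAffine class, and the names apply_local_affine and stiffness are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = List[float]
Mat3 = List[List[float]]


def identity3() -> Mat3:
    """Return a 3x3 identity matrix as nested lists."""
    return [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]


def apply_local_affine(verts: Sequence[Vec3],
                       mats: Sequence[Mat3],
                       offsets: Sequence[Vec3]) -> List[Vec3]:
    """Deform each vertex with its own transform: v_i' = A_i v_i + b_i."""
    out = []
    for v, a, b in zip(verts, mats, offsets):
        out.append([sum(a[r][c] * v[c] for c in range(3)) + b[r]
                    for r in range(3)])
    return out


def stiffness(mats: Sequence[Mat3], edges: Sequence[Tuple[int, int]]) -> float:
    """Sum of squared differences between matrices of connected vertices.

    Adding this to the data loss penalizes neighbouring vertices that
    deform very differently, which keeps the surface smooth.
    """
    total = 0.0
    for i, j in edges:
        total += sum((mats[i][r][c] - mats[j][r][c]) ** 2
                     for r in range(3) for c in range(3))
    return total


# Toy triangle; with identity matrices and zero offsets nothing moves.
verts = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 0)]
mats = [identity3() for _ in verts]
offsets = [[0.0, 0.0, 0.0] for _ in verts]
moved = apply_local_affine(verts, mats, offsets)
```

In an optimization loop, the matrices and offsets would be the trainable parameters, and the normal-map loss mentioned in the question would be evaluated on the deformed vertices. The transforms act on vertex coordinates in whatever space the mesh is stored in, so nothing in this formulation ties them to the camera coordinate system.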
Can't setup ICON on Ubuntu / Colab

Unfortunately, I can't manage to install ICON locally. The Colab Notebook and the model on Huggingface also seem to be broken. Would it be possible to update the documentation / environment or share a Docker file?

opened by c6s0 4
how to get real depth value from depth map

Hi, recently I've been trying to use the depth map as prior information, but how can I generate a point cloud in camera coordinate space or model space from the given depth map (generated by render_single.py)? For example, how can I generate a point cloud in camera/view space from depth_F/xxx.png below, i.e. how can I get the real depth value in camera space? I looked at the coordinate transformations in the code, and it seems that I can get the perspective matrix and the model_view matrix in lib/render/camer.py, which may transform the model from local space into clip space. But I am still confused about how to convert the depth map to another space.

It would be much appreciated if anyone could give me some advice!

opened by mmmcn 2
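
As a generic illustration of the back-projection asked about above, the sketch below turns a depth image into camera-space points under two common camera models. It rests on assumptions that are not confirmed by the repository: the depth values are taken to be already decoded into linear camera-space z (how the PNG encodes depth is not covered), background pixels are taken to have depth 0, and the parameters fx, fy, cx, cy and units_per_pixel are hypothetical stand-ins for whatever the renderer's camera actually uses.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def unproject_pinhole(depth: Sequence[Sequence[float]],
                      fx: float, fy: float,
                      cx: float, cy: float) -> List[Point]:
    """Back-project a depth image to camera space with a pinhole model.

    Pixels with depth <= 0 are treated as background and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):      # v: row index (image y, pointing down)
        for u, z in enumerate(row):      # u: column index (image x)
            if z <= 0.0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def unproject_orthographic(depth: Sequence[Sequence[float]],
                           units_per_pixel: float,
                           cx: float, cy: float) -> List[Point]:
    """Back-project with an orthographic camera: x and y do not scale with z."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue
            points.append(((u - cx) * units_per_pixel,
                           (v - cy) * units_per_pixel,
                           z))
    return points


# Tiny 2x2 depth image with two foreground pixels.
depth = [[0.0, 2.0],
         [2.0, 0.0]]
pinhole_pts = unproject_pinhole(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
ortho_pts = unproject_orthographic(depth, units_per_pixel=0.25, cx=0.5, cy=0.5)
```

Going from camera space to model space would additionally require applying the inverse of the model-view matrix used at render time. The image y axis points down in this sketch, so a sign flip may be needed depending on the renderer's convention.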

scripts/render_single.sh: line 33: 137550 Killed

ubuntu 22.04 + NVIDIA 2080Ti.

I encountered this error when executing the script "bash scripts/render_batch.sh debug all".

thuman2 START----------
Debug renderer
Rendering thuman2 0001
/home/hcp/anaconda3/envs/metaverse/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.3
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"

scripts/render_single.sh: line 33: 137550 Killed                  python $PYTHON_SCRIPT -s $SUBJECT -o $SAVE_DIR -r $NUM_VIEWS -w $SIZE
thuman2 END----------

opened by hcp6897 1

A weird bug: The data is not on the same device.

https://github.com/YuliangXiu/ICON/blob/ece5a09aa2d56aec28017430e65a0352622a0f30/lib/dataset/mesh_util.py#L283

print(triangles.device)  # cuda:1
print(points.device)  # cuda:1
residues, pts_ind, _ = point_to_mesh_distance(points, triangles)
print(triangles.device)  # cuda:1
print(pts_ind.device)  # cuda:0
print(residues.device)  # cuda:0

Command: python -m apps.infer -cfg ./configs/icon-filter.yaml -gpu 1 -in_dir {*} -out_dir {*}
Setting 'CUDA_VISIBLE_DEVICES=1' doesn't work either.
CUDA or OpenGL

opened by caiyongqi 11
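
One detail worth checking in reports like the one above: CUDA_VISIBLE_DEVICES renumbers the GPUs that a process can see. If only physical GPU 1 is exposed, it appears inside the process as device 0, and requesting device 1 then refers to a GPU that is no longer visible. The sketch below builds the environment and the command from the post with that remapping in mind. The helper name single_gpu_launch and the input/output paths are placeholders, and whether this resolves the device mismatch reported above is not confirmed.

```python
import os
from typing import Dict, List, Optional, Tuple


def single_gpu_launch(physical_gpu: int,
                      in_dir: str,
                      out_dir: str,
                      base_env: Optional[Dict[str, str]] = None
                      ) -> Tuple[List[str], Dict[str, str]]:
    """Build a command and environment that expose exactly one physical GPU.

    Because only one GPU is visible, the process must address it as
    index 0 regardless of its physical index.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu)
    cmd = [
        "python", "-m", "apps.infer",
        "-cfg", "./configs/icon-filter.yaml",
        "-gpu", "0",                 # logical index after remapping
        "-in_dir", in_dir,
        "-out_dir", out_dir,
    ]
    return cmd, env


# Placeholder paths; substitute the real input and output directories.
cmd, env = single_gpu_launch(1, "./examples", "./results")

# To actually launch from the repository root:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

The variable has to be set before the CUDA runtime is initialized in the target process, which is why it is passed through the environment of the launched command rather than set after the framework has been imported.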

Releases(v.1.1.0)

v.1.1.0(Aug 5, 2022)
Important updates:

Support normal network training

New interactive demo deployed on HuggingFace space

Faster Google Colab environment setup

Improved clothing refinement module

Fix several bugs and refactor some messy functions

v.1.0.0(Jun 15, 2022)
The first stable version (ICON v.1.0.0) comes out!

Dataset: support THuman2.0

Inference: support PyMAF, PIXIE, PARE, HybrIK, BEV

Training & Evaluation: support PIFu, PaMIR, ICON

Add-on: garment extraction from fashion images

v.1.0.0-rc2(Mar 7, 2022)
Some updates:

HPS support: PyMAF (SMPL), PARE (SMPL), PIXIE (SMPL-X)

Google Colab support

Replace bvh-distance-queries with PyTorch3D and Kaolin to improve CUDA compatibility

Fix some issues

v.1.0.0-rc1(Jan 30, 2022)
First commit of ICON:

image-based inference code

pretrained model of ICON, PIFu*, PaMIR* (*: self-implementation)

homepage: https://icon.is.tue.mpg.de


Owner

Yuliang Xiu

Ph.D. Student in Graphics & Vision
