Fashion Landmark Estimation with HRNet

Overview

HRNet for Fashion Landmark Estimation

(Modified from deep-high-resolution-net.pytorch)

Introduction

This code applies the HRNet (Deep High-Resolution Representation Learning for Human Pose Estimation) onto fashion landmark estimation task using the DeepFashion2 dataset. HRNet maintains high-resolution representations throughout the forward path. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise.

Illustrating the architecture of the proposed HRNet

Please note that every image in DeepFashion2 contains multiple fashion items, while our model assumes that there exists only one item in each image. Therefore, what we feed into the HRNet is not the original image but the cropped ones provided by a detector. In experiments, one can either use the ground truth bounding box annotation to generate the input data or use the output of a detecter.

Main Results

Landmark Estimation Performance on DeepFashion2 Test set

We won the third place in the "DeepFashion2 Challenge 2020 - Track 1 Clothes Landmark Estimation" competition. DeepFashion2 Challenge 2020 - Track 1 Clothes Landmark Estimation

Landmark Estimation Performance on DeepFashion2 Validation Set

Arch BBox Source AP Ap .5 AP .75 AP (M) AP (L) AR AR .5 AR .75 AR (M) AR (L)
pose_hrnet Detector 0.579 0.793 0.658 0.460 0.581 0.706 0.939 0.784 0.548 0.708
pose_hrnet GT 0.702 0.956 0.801 0.579 0.703 0.740 0.965 0.827 0.592 0.741

Quick start

Installation

  1. Install pytorch >= v1.2 following official instruction. Note that if you use pytorch's version < v1.0.0, you should follow the instruction at https://github.com/Microsoft/human-pose-estimation.pytorch to disable cudnn's implementations of BatchNorm layer. We encourage you to use higher pytorch's version(>=v1.0.0)

  2. Clone this repo, and we'll call the directory that you cloned as ${POSE_ROOT}.

  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Make libs:

    cd ${POSE_ROOT}/lib
    make
    
  5. Init output(training model output directory) and log(tensorboard log directory) directory:

    mkdir output 
    mkdir log
    

    Your directory tree should look like this:

    ${POSE_ROOT}
    |-- lib
    |-- tools 
    |-- experiments
    |-- models
    |-- data
    |-- log
    |-- output
    |-- README.md
    `-- requirements.txt
    
  6. Download pretrained models from our Onedrive Cloud Storage

Data preparation

Our experiments were conducted on DeepFashion2, clone this repo, and we'll call the directory that you cloned as ${DF2_ROOT}.

1) Download the dataset

Extract the dataset under ${POSE_ROOT}/data.

2) Convert annotations into coco-type

The above code repo provides a script to convert annotations into coco-type.

We uploaded our converted annotation file onto OneDrive named as train/val-coco_style.json. We also made truncated json files such as train-coco_style-32.json meaning the first 32 samples in the dataset to save the loading time during development period.

3) Install the deepfashion_api

Enter ${DF2_ROOT}/deepfashion2_api/PythonAPI and run

python setup.py install

Note that the deepfashion2_api is modified from the cocoapi without changing the package name. Therefore, conflicts occur if you try to install this package when you have installed the original cocoapi in your computer. We provide two feasible solutions: 1) run our code in a virtualenv 2) use the deepfashion2_api as a local pacakge. Also note that deepfashion2_api is different with cocoapi mainly in the number of classes and the values of standard variations for keypoints.

At last the directory should look like this:

${POSE_ROOT}
|-- data
`-- |-- deepfashion2
    `-- |-- train
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        |-- validation
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- val-coco_style.json             (converted annotation file)
        |   `-- val-coco_style-64.json        (truncated for fast debugging)
        `-- json_for_test
            `-- keypoints_test_information.json

Training and Testing

Note that the GPUS parameter in the yaml config file is deprecated. To select GPUs, use the environment varaible:

 export CUDA_VISIBLE_DEVICES=1

Testing on DeepFashion2 dataset with BBox from ground truth using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.USE_GT_BBOX True

Testing on DeepFashion2 dataset with BBox from a detector using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.DEEPFASHION2_BBOX_FILE data/bbox_result_val.pkl \

Training on DeepFashion2 dataset using pretrained models:

python tools/train.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
     MODEL.PRETRAINED models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth

Other options

python tools/test.py \
    ... \
    DATASET.MINI_DATASET True \ # use a subset of the annotation to save loading time
    TAG 'experiment description' \ # this info will appear in the output directory name
    WORKERS 4 \ # num_of_worker for the dataloader
    TEST.BATCH_SIZE_PER_GPU 8 \
    TRAIN.BATCH_SIZE_PER_GPU 8 \

OneDrive Cloud Storage

OneDrive

We provide the following files:

  • Model checkpoint files
  • Converted annotation files in coco-type
  • Bounding box results from our self-implemented detector in a pickle file.
hrnet-for-fashion-landmark-estimation.pytorch
|-- models
|   `-- pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth
|
|-- data
|   |-- bbox_result_val.pkl
|   |
`-- |-- deepfashion2
    `---|-- train
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        `-- validation
            |-- val-coco_style.json             (converted annotation file)
            `-- val-coco_style-64.json        (truncated for fast debugging)
        

Discussion

Experiment Configuration

  • For the regression target of keypoint heatmaps, we tuned the standard deviation value sigma and finally set it to 2.
  • During training, we found that the data augmentation from the original code was too intensive which makes the training process unstable. We weakened the augmentation parameters and observed performance gain.
  • Due to the imbalance of classes in DeepFashion2 dataset, the model's performance on different classes varies a lot. Therefore, we adopted a weighted sampling strategy rather than the naive random shuffling strategy, and observed performance gain.
  • We expermented with the value of weight decay, and found that either 1e-4 or 1e-5 harms the performance. Therefore, we simply set weight decay to 0.
Owner
SVIP Lab
ShanghaiTech Vision and Intelligent Perception Lab
SVIP Lab
Explicable Reward Design for Reinforcement Learning Agents [NeurIPS'21]

Explicable Reward Design for Reinforcement Learning Agents [NeurIPS'21]

3 May 12, 2022
Deep Halftoning with Reversible Binary Pattern

Deep Halftoning with Reversible Binary Pattern ICCV Paper | Project Website | BibTex Overview Existing halftoning algorithms usually drop colors and f

Menghan Xia 17 Nov 22, 2022
Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers

Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers This is the repo used for human motion prediction with non-autoregress

Idiap Research Institute 26 Dec 14, 2022
Task Transformer Network for Joint MRI Reconstruction and Super-Resolution (MICCAI 2021)

T2Net Task Transformer Network for Joint MRI Reconstruction and Super-Resolution (MICCAI 2021) [Paper][Code] Dependencies numpy==1.18.5 scikit_image==

64 Nov 23, 2022
Library for 8-bit optimizers and quantization routines.

bitsandbytes Bitsandbytes is a lightweight wrapper around CUDA custom functions, in particular 8-bit optimizers and quantization functions. Paper -- V

Facebook Research 687 Jan 04, 2023
[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021

Pedestron Pedestron is a MMdetection based repository, that focuses on the advancement of research on pedestrian detection. We provide a list of detec

Irtiza Hasan 594 Jan 05, 2023
šŸ•µ Artificial Intelligence for social control of public administration

Non-tech crash course into Operação Serenata de Amor Tech crash course into Operação Serenata de Amor Contributing with code and tech skills Supportin

Open Knowledge Brasil - Rede pelo Conhecimento Livre 4.4k Dec 31, 2022
Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud This repository contains a reference implementation of our Part-Aware Data Augment

Jaeseok Choi 62 Jan 03, 2023
Artifacts for paper "MMO: Meta Multi-Objectivization for Software Configuration Tuning"

MMO: Meta Multi-Objectivization for Software Configuration Tuning This repository contains the data and code for the following paper that is currently

0 Nov 17, 2021
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
Code release for SLIP Self-supervision meets Language-Image Pre-training

SLIP: Self-supervision meets Language-Image Pre-training What you can find in this repo: Pre-trained models (with ViT-Small, Base, Large) and code to

Meta Research 621 Dec 31, 2022
MPRNet-Cloud-removal: Progressive cloud removal

MPRNet-Cloud-removal Progressive cloud removal Requirements 1.Pytorch = 1.0 2.Python 3 3.NVIDIA GPU + CUDA 9.0 4.Tensorboard Installation 1.Clone the

Semi 95 Dec 18, 2022
An official implementation of "Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation" (ICCV 2021) in PyTorch.

Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation This is an official implementation of the paper "Exploiting a Joint

CV Lab @ Yonsei University 35 Oct 26, 2022
The AugNet Python module contains functions for the fast computation of image similarity.

AugNet AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation arxiv link In our work, we propose AugNet, a new deep le

Ming 74 Dec 28, 2022
CoaT: Co-Scale Conv-Attentional Image Transformers

CoaT: Co-Scale Conv-Attentional Image Transformers Introduction This repository contains the official code and pretrained models for CoaT: Co-Scale Co

mlpc-ucsd 191 Dec 03, 2022
Remote sensing change detection using PaddlePaddle

Change Detection Laboratory Developing and benchmarking deep learning-based remo

Lin Manhui 15 Sep 23, 2022
Backdoor Attack through Frequency Domain

Backdoor Attack through Frequency Domain DEPENDENCIES python==3.8.3 numpy==1.19.4 tensorflow==2.4.0 opencv==4.5.1 idx2numpy==1.2.3 pytorch==1.7.0 Data

5 Jun 18, 2022
a reimplementation of Holistically-Nested Edge Detection in PyTorch

pytorch-hed This is a personal reimplementation of Holistically-Nested Edge Detection [1] using PyTorch. Should you be making use of this work, please

Simon Niklaus 375 Dec 06, 2022
A pytorch implementation of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

RE2 This is a pytorch implementation of the ACL 2019 paper "Simple and Effective Text Matching with Richer Alignment Features". The original Tensorflo

287 Dec 21, 2022