Cascaded Pyramid Network (CPN) based on Keras (TensorFlow backend)

Last update: Nov 22, 2021

Related tags

Deep Learning CPN_KR

Overview

ML2 Takehome Project

Reimplementing the paper: Cascaded Pyramid Network for Multi-Person Pose Estimation

Dataset

The model is trained on the COCO dataset, which can be downloaded by running:

chmod +x coco.sh
./coco.sh

The data is saved in the coco/ folder.

I misunderstood the assignment at first and didn't realize it until I looked up a PyTorch implementation on GitHub for reference.

The data doesn't need to be cropped from the original images ahead of time. That is, the crops don't need to be saved as separate image files; the program only needs to cut them out during training. Anyway, I did the cropping and saved the necessary information, such as the keypoints and the visibility flag (0, 1, 2), to a dataframe for both the training and validation data.

python dataprocessing/process_data.py
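As a rough illustration of the on-the-fly alternative described above, the sketch below crops one person out of an image and shifts the keypoints into the crop's coordinate frame. It is not the code in dataprocessing/process_data.py; the function name crop_person and its arguments are hypothetical. It assumes the COCO annotation layout: a bounding box given as (x, y, width, height) and keypoints stored as a flat list of (x, y, v) triplets, where v = 0 means not labeled, v = 1 labeled but not visible, and v = 2 labeled and visible.

```python
import numpy as np


def crop_person(image, bbox, keypoints):
    """Crop one person and move the keypoints into crop coordinates.

    Hypothetical helper, assuming COCO conventions:
      image:     H x W x C array
      bbox:      (x, y, width, height) in pixels
      keypoints: flat list [x1, y1, v1, x2, y2, v2, ...]
    """
    img_h, img_w = image.shape[:2]
    x, y, w, h = bbox

    # Clamp the box to the image so the slice never leaves the array.
    x0 = max(int(np.floor(x)), 0)
    y0 = max(int(np.floor(y)), 0)
    x1 = min(int(np.ceil(x + w)), img_w)
    y1 = min(int(np.ceil(y + h)), img_h)
    crop = image[y0:y1, x0:x1]

    # One row per keypoint: (x, y, visibility).
    kps = np.asarray(keypoints, dtype=np.float32).reshape(-1, 3).copy()

    # Only shift annotated keypoints; unlabeled ones (v == 0) stay at (0, 0).
    labeled = kps[:, 2] > 0
    kps[labeled, 0] -= x0
    kps[labeled, 1] -= y0
    return crop, kps


if __name__ == "__main__":
    image = np.zeros((100, 200, 3), dtype=np.uint8)
    crop, kps = crop_person(image, (50, 20, 40, 60), [60, 30, 2, 0, 0, 0, 70, 50, 1])
    print(crop.shape)
    print(kps)
```

A function like this could be called from inside a data generator during training, so that only the original images and the annotation table need to be kept on disk.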

Training

python train.py

Test

Download the checkpoint here and unzip it.

python test.py

The results are shown below. They are not perfect, but I think the model would improve with more training time.

Input	Prediction

Failed cases

Input	Prediction

Notes

The model has not finished training yet, so I was not able to test it properly.
There was a typo in the code that creates the dataset. I only found it on Friday, so training had to start over from scratch. I will keep training and will update the weight file, the test code, and the results.

Reference

The repo is heavily based on the PyTorch version, the TensorFlow version, and the official Keras tutorial on keypoint estimation.

Owner

Vo Van Tu
