[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

Last update: Dec 15, 2022

Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

This is the official PyTorch code repository for the paper "Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks" (ICCV, 2021).

Here, we present a versatile point cloud processing block that yields state-of-the-art results on many tasks.
The key idea is to process point clouds with many cheap, distinct low-dimensional projections, each followed by standard convolutions. These projections are applied both in parallel and sequentially.

Datasets

We provide links to the datasets we used for training and evaluation. After unpacking and preparing them, please edit the dataset path (the data:path field) in configs/*.yaml.

Pre-trained models

We provide our pre-trained models' weights in a single archive.

Building Dependencies

To install and build all the modules required, please run:

bash ./install_deps.sh

Code Structure

The core operations, rasterization (Splat) and de-rasterization (Slice), are implemented in layers/cloud_transform.py. Slightly different versions of the Multi-Headed Cloud Transform (MHCT) are provided in layers/mutihead_ct_*.py.
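To give a feel for what Splat and Slice do, here is a minimal pure-Python sketch. It is not the code from layers/cloud_transform.py: the function names (bilinear_weights, splat, slice_grid), the 2D grid, the scalar per-point values, and the use of bilinear weights are all assumptions made for illustration. In a full block, standard convolutions would presumably be applied to the grid between the two steps.

```python
import math

def bilinear_weights(x, y, size):
    """Return the four grid cells around (x, y) with their bilinear weights.

    Assumes 0 <= x, y <= size - 1 and size >= 2 (illustrative convention).
    """
    x0 = min(int(math.floor(x)), size - 2)
    y0 = min(int(math.floor(y)), size - 2)
    fx, fy = x - x0, y - y0
    return [
        (x0, y0, (1 - fx) * (1 - fy)),
        (x0 + 1, y0, fx * (1 - fy)),
        (x0, y0 + 1, (1 - fx) * fy),
        (x0 + 1, y0 + 1, fx * fy),
    ]

def splat(keys, values, size):
    """Rasterize: scatter-add each point's value into a size x size grid."""
    grid = [[0.0] * size for _ in range(size)]
    for (x, y), v in zip(keys, values):
        for cx, cy, w in bilinear_weights(x, y, size):
            grid[cy][cx] += w * v
    return grid

def slice_grid(grid, keys):
    """De-rasterize: read the grid back at each point's position."""
    size = len(grid)
    return [
        sum(w * grid[cy][cx] for cx, cy, w in bilinear_weights(x, y, size))
        for x, y in keys
    ]

# Two points with hypothetical 2D positions (keys) and scalar features (values).
keys = [(0.5, 0.5), (2.0, 1.0)]
values = [4.0, 3.0]
grid = splat(keys, values, size=4)   # point cloud -> regular grid
features = slice_grid(grid, keys)    # regular grid -> per-point features
```

Splatting preserves the total "mass" of the values on the grid, and slicing returns one feature per input point, so the pair can wrap any grid operation while keeping the point cloud's size unchanged.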

The model zoo is located in model_zoo, where the models for the corresponding tasks are built from Multi-Headed Cloud Transforms.

Run

We train our models in a multi-GPU setting using DistributedDataParallel. To train on n GPUs, please run the following commands:

python train_${SCRIPT_NAME}.py ${EXP_NAME} -c configs/${CONFIG_NAME}.yaml --master localhost:3315 --rank 0 --num_nodes n
...
python train_${SCRIPT_NAME}.py ${EXP_NAME} -c configs/${CONFIG_NAME}.yaml --master localhost:3315 --rank  --num_nodes n
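Since one command is needed per process, it can be convenient to generate the command lines instead of typing them. The helper below is a small sketch: build_train_commands is a hypothetical name, and it assumes that ranks are numbered 0 through n - 1, which the commands above do not state explicitly. Only the flags shown above are used.

```python
def build_train_commands(script_name, exp_name, config_name, n,
                         master="localhost:3315"):
    """Return one training command per process.

    Assumption: ranks run from 0 to n - 1, one per GPU process.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [
        f"python train_{script_name}.py {exp_name} "
        f"-c configs/{config_name}.yaml "
        f"--master {master} --rank {rank} --num_nodes {n}"
        for rank in range(n)
    ]

# Placeholder names; substitute the real script, experiment, and config names.
for command in build_train_commands("SCRIPT_NAME", "EXP_NAME", "CONFIG_NAME", n=2):
    print(command)
```

Each printed line can then be started in its own shell or passed to a process launcher.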

The evaluation scripts are invoked in almost the same way:

python eval_${SCRIPT_NAME}.py ${EXP_NAME} -c configs/eval/${CONFIG_NAME}.yaml

Cite

If you find our work helpful, please do not hesitate to cite us.

@inproceedings{mazur2021cloudtransformers,
  title={Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks},
  author={Mazur, Kirill and Lempitsky, Victor},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}

Owner

Visual Understanding Lab @ Samsung AI Center Moscow
