LETR: Line Segment Detection Using Transformers without Edges

Last update: Jan 06, 2023

Introduction

This repository contains the official code and pretrained models for "Line Segment Detection Using Transformers without Edges" by Yifan Xu*, Weijian Xu*, David Cheung, and Zhuowen Tu (CVPR 2021, Oral).

In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that requires neither post-processing nor heuristics-guided intermediate processing (edge/junction/region detection). Our method, named LinE segment TRansformers (LETR), takes advantage of integrated tokenized queries, a self-attention mechanism, and an encoding-decoding strategy within Transformers, skipping the standard heuristic designs for edge element detection and perceptual grouping. We equip Transformers with a multi-scale encoder/decoder strategy to perform fine-grained line segment detection under a direct endpoint distance loss. This loss term is particularly suitable for detecting geometric structures such as line segments, which are not conveniently represented by standard bounding boxes. The Transformers learn to gradually refine line segments through layers of self-attention.
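To make the endpoint distance loss more concrete, here is a minimal sketch. It assumes an L1 distance over the four endpoint coordinates and assumes predictions have already been paired with their targets; both are assumptions for illustration, and the function names are hypothetical rather than taken from this repository.

```python
from typing import Sequence

# A line segment as (x1, y1, x2, y2); coordinates are assumed to be normalized.
Line = Sequence[float]


def endpoint_l1_loss(pred: Line, target: Line) -> float:
    """Sum of absolute coordinate differences between corresponding endpoints."""
    if len(pred) != 4 or len(target) != 4:
        raise ValueError("each line must be (x1, y1, x2, y2)")
    return sum(abs(p - t) for p, t in zip(pred, target))


def mean_endpoint_l1_loss(preds: Sequence[Line], targets: Sequence[Line]) -> float:
    """Average the per-line loss over already-matched prediction/target pairs."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must be paired one-to-one")
    if not preds:
        return 0.0
    return sum(endpoint_l1_loss(p, t) for p, t in zip(preds, targets)) / len(preds)


if __name__ == "__main__":
    # Two diagonals of the unit square share the same bounding box,
    # yet their endpoints differ, so the endpoint loss separates them.
    diagonal = (0.0, 0.0, 1.0, 1.0)
    anti_diagonal = (0.0, 1.0, 1.0, 0.0)
    print(endpoint_l1_loss(diagonal, anti_diagonal))
```

The two diagonals in the usage example illustrate why a bounding box is an awkward representation for a line segment: both segments have an identical box, while their endpoint coordinates are clearly different.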

Changelog

05/07/2021: Code for the LETR Basic Usage Demo is released.

04/30/2021: Code and pre-trained checkpoint for LETR are released.

Results and Checkpoints

Dataset	sAP10	sAP15	sF10	sF15	Checkpoint
Wireframe	65.6	68.0	66.1	67.4	LETR-R101
YorkUrban	29.6	32.0	40.5	42.1	LETR-R50

Reproducing Results

Step1: Code Preparation

git clone https://github.com/mlpc-ucsd/LETR.git

Step2: Environment Installation

mkdir -p data
mkdir -p evaluation/data
mkdir -p exp


conda create -n letr python anaconda
conda activate letr
conda install -c pytorch pytorch torchvision
conda install cython scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
pip install docopt
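After installation, a quick check like the following can confirm that the packages are importable. This is a small sketch and not part of the repository; the helper name is hypothetical, and the list contains the import names of the packages installed above (pycocotools is the module provided by cocoapi).

```python
import importlib.util

# Import names for the packages installed in this step.
REQUIRED_MODULES = ["torch", "torchvision", "scipy", "Cython", "pycocotools", "docopt"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated letr environment; any name it lists still needs to be installed.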

Step3: Data Preparation

To reproduce our results, you need to process two datasets: ShanghaiTech (Wireframe) and YorkUrban. The scripts ./helper/wireframe.py and ./helper/york.py, both adapted from the L-CNN code, process the raw downloaded data. Judging by the relative paths they use, the command blocks below each appear to start from the repository root.

ShanghaiTech Train Data

To Download (adapted from L-CNN)

cd data
bash ../helper/gdrive-download.sh 1BRkqyi5CKPQF6IYzj_dQxZFQl0OwbzOf wireframe_raw.tar.xz
tar xf wireframe_raw.tar.xz
rm wireframe_raw.tar.xz
python ../helper/wireframe.py ./wireframe_raw ./wireframe_processed

YorkUrban Train Data

To Download

cd data
wget https://www.dropbox.com/sh/qgsh2audfi8aajd/AAAQrKM0wLe_LepwlC1rzFMxa/YorkUrbanDB.zip
unzip YorkUrbanDB.zip 
python ../helper/york.py ./YorkUrbanDB ./york_processed

Processed Evaluation Data

bash ./helper/gdrive-download.sh 1T4_6Nb5r4yAXre3lf-zpmp3RbmyP1t9q ./evaluation/data/wireframe.tar.xz
bash ./helper/gdrive-download.sh 1ijOXv0Xw1IaNDtp1uBJt5Xb3mMj99Iw2 ./evaluation/data/york.tar.xz
tar -vxf ./evaluation/data/wireframe.tar.xz -C ./evaluation/data/.
tar -vxf ./evaluation/data/york.tar.xz -C ./evaluation/data/.
rm ./evaluation/data/wireframe.tar.xz
rm ./evaluation/data/york.tar.xz
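Before training, it can help to confirm that the directories created in the steps above are in place. The sketch below is a hypothetical helper, not part of the repository; it only checks the paths named in the commands above and makes no assumption about what the evaluation archives contain.

```python
from pathlib import Path

# Directories named in the setup and data preparation commands above.
EXPECTED_DIRS = [
    "data/wireframe_processed",
    "data/york_processed",
    "evaluation/data",
    "exp",
]


def missing_dirs(repo_root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it from the repository root; an empty result means every listed directory exists.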

Step4: Train Script Examples

Train a coarse-model (a.k.a. stage1 model).

# Usage: bash script/*/*.sh [exp name]
bash script/train/a0_train_stage1_res50.sh  res50_stage1 # LETR-R50  
bash script/train/a1_train_stage1_res101.sh res101_stage1 # LETR-R101

Train a fine-model (a.k.a. stage2 model).

# Usage: bash script/*/*.sh [exp name]
bash script/train/a2_train_stage2_res50.sh  res50_stage2  # LETR-R50
bash script/train/a3_train_stage2_res101.sh res101_stage2 # LETR-R101

Fine-tune the fine-model with focal loss (a.k.a. stage2_focal model).

# Usage: bash script/*/*.sh [exp name]
bash script/train/a4_train_stage2_focal_res50.sh   res50_stage2_focal # LETR-R50
bash script/train/a5_train_stage2_focal_res101.sh  res101_stage2_focal # LETR-R101
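The three stages above are run in order for a given backbone. As a sketch, the hypothetical helper below lists the commands for one backbone using the script names and experiment names from the examples above. It only builds the command strings and does not run anything. Whether a later stage loads its checkpoint from a specific earlier experiment name is not stated here, so check the scripts before choosing different names.

```python
# Script file names exactly as they appear in the examples above.
STAGE_SCRIPTS = {
    "res50": [
        "a0_train_stage1_res50.sh",
        "a2_train_stage2_res50.sh",
        "a4_train_stage2_focal_res50.sh",
    ],
    "res101": [
        "a1_train_stage1_res101.sh",
        "a3_train_stage2_res101.sh",
        "a5_train_stage2_focal_res101.sh",
    ],
}

# Stage suffixes used for the experiment names in the examples above.
STAGE_NAMES = ["stage1", "stage2", "stage2_focal"]


def training_commands(backbone):
    """Return the stage1, stage2, and stage2_focal commands for one backbone."""
    if backbone not in STAGE_SCRIPTS:
        raise ValueError(f"unknown backbone: {backbone!r}")
    return [
        f"bash script/train/{script} {backbone}_{stage}"
        for script, stage in zip(STAGE_SCRIPTS[backbone], STAGE_NAMES)
    ]


if __name__ == "__main__":
    for command in training_commands("res50"):
        print(command)
```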

Step5: Evaluation

Evaluate models.

# Evaluate sAP^10, sAP^15, sF^10, sF^15 (both Wireframe and YorkUrban datasets).
bash script/evaluation/eval_stage1.sh [exp name]
bash script/evaluation/eval_stage2.sh [exp name]
bash script/evaluation/eval_stage2_focal.sh [exp name]
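The superscripts in sAP^10 and sAP^15 are distance thresholds. The sketch below shows the matching test behind such a threshold, assuming the structural AP definition used by L-CNN: a predicted segment matches a ground truth segment when the sum of squared distances between their endpoints, taken over the better of the two endpoint orderings, is below the threshold, with coordinates rescaled to a 128x128 image. This is an illustration under that assumption, not the evaluation code from this repository, and the function names are hypothetical.

```python
def structural_distance(pred, gt):
    """Summed squared endpoint distance, minimized over both endpoint orderings.

    Each segment is (x1, y1, x2, y2); coordinates are assumed to be already
    rescaled to the evaluation resolution.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    same_order = (px1 - gx1) ** 2 + (py1 - gy1) ** 2 + (px2 - gx2) ** 2 + (py2 - gy2) ** 2
    swapped = (px1 - gx2) ** 2 + (py1 - gy2) ** 2 + (px2 - gx1) ** 2 + (py2 - gy1) ** 2
    return min(same_order, swapped)


def is_match(pred, gt, threshold):
    """True when the prediction is close enough to count as a candidate match."""
    return structural_distance(pred, gt) < threshold


if __name__ == "__main__":
    prediction = (0, 0, 10, 0)
    ground_truth = (0, 3, 10, 2)
    # Structural distance is 3**2 + 2**2 = 13: too far at threshold 10, a match at 15.
    print(is_match(prediction, ground_truth, 10), is_match(prediction, ground_truth, 15))
```

A full sAP computation would additionally rank predictions by confidence and allow each ground truth segment to be matched only once; that part is omitted here.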

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Xu_2021_CVPR,
    author    = {Xu, Yifan and Xu, Weijian and Cheung, David and Tu, Zhuowen},
    title     = {Line Segment Detection Using Transformers Without Edges},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {4257-4266}
}

Acknowledgments

This code is based on the implementation of DETR: End-to-End Object Detection with Transformers.
