CrossMLP

Last update: Jul 27, 2022

Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation
Bin Ren¹, Hao Tang², Nicu Sebe¹.
¹University of Trento, Italy, ²ETH, Switzerland.
In BMVC 2021 Oral.
The repository offers the official implementation of our paper in PyTorch.

🦖 News! We have updated the proposed CrossMLP (December 9th, 2021)!

Installation

Step 1: Create a new virtual environment with Anaconda

conda create -n crossmlp python=3.6

Step 2: Install the required libraries

pip install -r requirement.txt

Dataset Preparation

For Dayton and CVUSA, the datasets must be downloaded beforehand from their respective webpages. In addition, we provide a few sample images in this repository (see data samples). Please cite their papers if you use the data.

Preparing Ablation Dataset. We conduct the ablation study in the a2g (aerial-to-ground) direction on the Dayton dataset. To reduce training time, we randomly select 1/3 of the full 55,000/21,048 train/test samples, i.e. around 18,334 samples for training and 7,017 samples for testing. The training and testing splits can be downloaded here.
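A one-third subset like the one described above could be drawn along the following lines. This is only a sketch: the helper name `sample_fraction`, the fixed seed, and the rounding are assumptions rather than the authors' procedure, so the resulting counts and membership will not match the released splits. Use the downloadable splits to reproduce the paper's numbers.

```python
import random


def sample_fraction(items, fraction=1 / 3, seed=0):
    """Return a reproducible random subset of `items` (hypothetical helper)."""
    count = round(len(items) * fraction)
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return sorted(rng.sample(list(items), count))


# Stand-in IDs; the real Dayton file names are not reproduced here.
train_ids = [f"train_{i:05d}" for i in range(55000)]
subset = sample_fraction(train_ids)
print(len(subset))
```

Fixing the seed keeps the subset identical across runs, which matters when several ablation variants must be trained on the same data.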

Preparing Dayton Dataset. The dataset can be downloaded here. In particular, you will need to download dayton.zip. Ground-truth semantic maps are not available for this dataset. We use RefineNet trained on the Cityscapes dataset to generate semantic maps and use them as training data in our experiments. Please cite their papers if you use this dataset. Train/test splits for the Dayton dataset can be downloaded from here.

Preparing CVUSA Dataset. The dataset can be downloaded here. After unzipping the dataset, prepare the training and testing data as discussed in our CrossMLP paper. We also convert the semantic maps to color ones using this script. Since there are no semantic maps for the aerial images in this dataset, we use black images as aerial semantic maps for placeholder purposes.
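The black placeholder maps mentioned above need nothing more than an all-zero RGB image. The sketch below writes one with the Python standard library only. The helper name `black_png_bytes`, the 256x256 size, and the output file name are assumptions for illustration; the directory layout expected by the data loader is not shown here.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(tag, data):
    """Build one PNG chunk: length, tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def black_png_bytes(width, height):
    """Encode an all-black 8-bit RGB image as PNG bytes (hypothetical helper)."""
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), then
    # compression, filter, and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline is a filter-type byte (0 = none) followed by RGB zeros.
    scanline = b"\x00" + b"\x00" * (3 * width)
    idat = zlib.compress(scanline * height)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


if __name__ == "__main__":
    # Assumed size and file name; adjust both to match your aerial images.
    with open("placeholder_semantic_map.png", "wb") as handle:
        handle.write(black_png_bytes(256, 256))
```

With Pillow installed, `Image.new("RGB", (256, 256))` produces the same all-black image in one call.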

🌲 Note that for your convenience we also provide download scripts:

bash ./datasets/download_selectiongan_dataset.sh [dataset_name]

[dataset_name] can be:

dayton_ablation : 5.7 GB
dayton: 17.0 GB
cvusa: 1.3 GB
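Before starting a download, it can help to validate the dataset name and confirm there is enough free disk space. The sketch below is one way to do that from Python; the helper names `download_command` and `has_room` are hypothetical, and the check covers only the archive sizes listed above, so the unpacked data may need more room.

```python
import shutil

# Download sizes in GB, as listed above.
DATASET_SIZES_GB = {"dayton_ablation": 5.7, "dayton": 17.0, "cvusa": 1.3}


def download_command(dataset_name):
    """Return the argv list for the download script, rejecting unknown names."""
    if dataset_name not in DATASET_SIZES_GB:
        raise ValueError(f"unknown dataset: {dataset_name!r}")
    return ["bash", "./datasets/download_selectiongan_dataset.sh", dataset_name]


def has_room(dataset_name, path="."):
    """Check that free space at `path` covers the listed download size."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= DATASET_SIZES_GB[dataset_name]


print(download_command("cvusa"))
print(has_room("cvusa"))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.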

Training

Run train_crossMlp.sh, whose content is shown below:

python train.py --dataroot [path_to_dataset] \
	--name [experiment_name] \
	--model crossmlpgan \
	--which_model_netG unet_256 \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm batch \
	--gpu_ids 0 \
	--batchSize [BS] \
	--loadSize [LS] \
	--fineSize [FS] \
	--no_flip \
	--display_id 0 \
	--lambda_L1 100 \
	--lambda_L1_seg 1

For the dayton or dayton_ablation dataset, use [BS,LS,FS]=[4,286,256] and set --niter 20 --niter_decay 15.
For the cvusa dataset, use [BS,LS,FS]=[4,286,256] and set --niter 15 --niter_decay 15.
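The per-dataset settings above can be kept in one place and turned into a full argument list. This is a minimal sketch, not part of the repository: the helper name `train_args` and the example data root and experiment name are assumptions, while every flag and value comes from the command and settings listed above.

```python
# Per-dataset schedule values taken from the settings listed above.
SCHEDULES = {
    "dayton": {"niter": 20, "niter_decay": 15},
    "dayton_ablation": {"niter": 20, "niter_decay": 15},
    "cvusa": {"niter": 15, "niter_decay": 15},
}
BATCH_SIZE, LOAD_SIZE, FINE_SIZE = 4, 286, 256


def train_args(dataset, dataroot, name):
    """Assemble the train.py argument list for one dataset (hypothetical helper)."""
    schedule = SCHEDULES[dataset]  # KeyError for an unknown dataset name
    return [
        "python", "train.py",
        "--dataroot", dataroot,
        "--name", name,
        "--model", "crossmlpgan",
        "--which_model_netG", "unet_256",
        "--which_direction", "AtoB",
        "--dataset_mode", "aligned",
        "--norm", "batch",
        "--gpu_ids", "0",
        "--batchSize", str(BATCH_SIZE),
        "--loadSize", str(LOAD_SIZE),
        "--fineSize", str(FINE_SIZE),
        "--no_flip",
        "--display_id", "0",
        "--lambda_L1", "100",
        "--lambda_L1_seg", "1",
        "--niter", str(schedule["niter"]),
        "--niter_decay", str(schedule["niter_decay"]),
    ]


# Example path and experiment name are placeholders, not values from the repo.
args = train_args("cvusa", "./datasets/cvusa", "crossMlp_cvusa")
print(" ".join(args))
```

Building the command as a list avoids shell quoting issues and makes it straightforward to launch with `subprocess.run(args, check=True)`.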

There are many options you can specify; run python train.py --help to list them. The specified options are printed to the console. To choose which GPUs are used, run export CUDA_VISIBLE_DEVICES=[GPU_ID]. Training takes about 3 days for dayton, less than 2 days for dayton_ablation, and less than 3 days for cvusa with the default --batchSize on one TITAN Xp GPU (12G). We therefore suggest using a larger --batchSize, although performance has not been tested with larger values.

To view training results and loss plots on a local machine, set --display_id to a non-zero value, run python -m visdom.server in a new terminal, and open http://localhost:8097. On a remote server, replace localhost with your server's name, such as http://server.trento.cs.edu:8097.

Testing

Run test_crossMlp.sh, whose content is shown below:

python test.py --dataroot [path_to_dataset] \
--name crossMlp_dayton_ablation \
--model crossmlpgan \
--which_model_netG unet_256 \
--which_direction AtoB \
--dataset_mode aligned \
--norm batch \
--gpu_ids 0 \
--batchSize 8 \
--loadSize 286 \
--fineSize 256 \
--saveDisk \
--no_flip --eval

By default, it loads the latest checkpoint. It can be changed using --which_epoch.

We also provide the image IDs used in our paper here for further qualitative comparison.

Evaluation

Coming soon

Generating Images Using Pretrained Model

Coming soon

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue or pull request, or e-mail the author Bin Ren ([email protected]).

Acknowledgments

This source code borrows heavily from Pix2pix and SelectionGAN. We also thank the authors of X-Fork & X-Seq for providing the evaluation code. This work was supported by the EU H2020 AI4Media No. 951911 project and by the PRIN project PREVUE.
