Winners of DrivenData's Overhead Geopose Challenge

Overview



Banner Image

Images shown are from the public Urban Semantic 3D Dataset, provided courtesy of DigitalGlobe

Goal of the Competition

Overhead satellite imagery provides critical, time-sensitive information for use in arenas such as disaster response, navigation, and security. Most current methods for using aerial imagery assume images are taken from directly overhead, known as near-nadir. However, the first images available are often taken from an angle — they are oblique. Effects from these camera orientations complicate useful tasks such as change detection, vision-aided navigation, and map alignment.

In this challenge, participants made satellite imagery taken from a significant angle more useful for time-sensitive applications such as disaster and emergency response

What's in This Repository

This repository contains code from winning competitors in the Overhead Geopose Challenge.

Winning code for other DrivenData competitions is available in the competition-winners repository.

Winning Submissions

Prediction Contest

All of the models below build on the solution provided in the benchmark blog post: Overhead Geopose Challenge - Benchmark. Additional solution details can be found in the reports folder inside the directory for each submission.

The weights for each winning model can be downloaded from the National Geospatial-Intelligence Agency's (NGA's) DataPort page.

Place Team or User Public Score Private Score Summary of Model
1 selim_sef 0.902184 0.902459 An EfficientNet V2 L encoder is used instead of the Resnet34 encoder because it has a huge capacity and is less prone to overfitting. The decoder is a UNet with more filters and additional convolution blocks for better handling of fine-grained details. MSE loss would produce imbalance for different cities, depending on building heights. The model is trained with an R2 loss for AGL/MAG outputs, which reflects the final competition metric and is more robust to noisy training data.
2 bloodaxe 0.889955 0.891393 I’ve trained a bunch of UNet-like models and averaged their predictions. Sounds simple, yet I used quite heavy encoders (B6 & B7) and custom-made decoders to produce very accurate height map predictions at original resolution. Another crucial part of the solution was extensive custom data augmentation for height, orientation, scale, GSD, and image RGB values.
3 o__@ 0.882882 0.882801 I ensembled the VFlow-UNet model using a large input resolution and a large backbone without downsampling. Better results were obtained when the model was trained on all images from the training set. The test set contains images of the same location as the images in the training set. This overlap was identified by image matching to improve the prediction results.
4 kbrodt 0.872775 0.873057 The model uses a UNet architecture with various encoders (efficientnet-b{6,7} and senet154) and has only one above-ground level (AGL) head and two heads in the bottleneck for scale and angle. The features are a random 512x512 crop of an aerial image, the city's one hot encoding, and ground sample distance (GSD). The model is trained with mean squared error (MSE) loss function for all targets (AGL, scale, angle) using AdamW optimizer with 1e-4 learning rate.

Model Write-up Bonus

Prediction rank Team or User Public Score Private Score Summary of Model
2 bloodaxe 0.889955 0.891393 See the "Prediction Contest" section above
5 chuchu 0.856847 0.855636 We conducted an empirical upper bound analysis, which suggested that the main errors are from height prediction and the rest are from angle prediction. To overcome the bottlenecks we proposed HR-VFLOW, which takes HRNet as backbone and adopts simple multi-scale fusion as multi-task decoders to predict height, magnitude, angle, and scale simultaneously. To handle the height variance, we first pretrained the model on all four cities and then transferred the pretrained model to each specific city for better city-wise performance.
7 vecxoz 0.852948 0.851828 First, I implemented training with automatic mixed precision in order to speed up training and facilitate experiments with the large architectures. Second, I implemented 7 popular decoder architectures and conducted extensive preliminary research of different combinations of encoders and decoders. For the most promising combinations I ran long training for at least 200 epochs to study best possible scores and training dynamics. Third, I implemented an ensemble using weighted average for height and scale target and circular average for angle target.

Approved for public release, 21-943

Owner
DrivenData
Data science competitions for social good.
DrivenData
Public implementation of "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression" from CoRL'21

Self-Supervised Reward Regression (SSRR) Codebase for CoRL 2021 paper "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression "

19 Dec 12, 2022
Gauge equivariant mesh cnn

Geometric Mesh CNN The code in this repository is an implementation of the Gauge Equivariant Mesh CNN introduced in the paper Gauge Equivariant Mesh C

50 Dec 18, 2022
Rethinking Portrait Matting with Privacy Preserving

Rethinking Portrait Matting with Privacy Preserving This is the official repository of the paper Rethinking Portrait Matting with Privacy Preserving.

184 Jan 03, 2023
A Python library for Deep Probabilistic Modeling

Abstract DeeProb-kit is a Python library that implements deep probabilistic models such as various kinds of Sum-Product Networks, Normalizing Flows an

DeeProb-org 46 Dec 26, 2022
QI-Q RoboMaster2022 CV Algorithm

QI-Q RoboMaster2022 CV Algorithm

2 Jan 10, 2022
Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.

Pose Detection Project Description: Human pose estimation from video plays a critical role in various applications such as quantifying physical exerci

Hassan Shahzad 2 Jan 17, 2022
Blender Python - Node-based multi-line text and image flowchart

MindMapper v0.8 Node-based text and image flowchart for Blender Mindmap with shortcuts visible: Mindmap with shortcuts hidden: Notes This was requeste

SpectralVectors 58 Oct 08, 2022
adversarial_multi_armed_bandit_variable_plays

Adversarial Multi-Armed Bandit with Variable Plays This code is for paper: Adversarial Online Learning with Variable Plays in the Evasion-and-Pursuit

Yiyang Wang 1 Oct 28, 2021
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this

Nervana 3.9k Dec 20, 2022
Scheduling BilinearRewards

Scheduling_BilinearRewards Requirement Python 3 =3.5 Structure main.py This file includes the main function. For getting the results in Figure 1, ple

junghun.kim 0 Nov 25, 2021
Code for the published paper : Learning to recognize rare traffic sign

Improving traffic sign recognition by active search This repo contains code for the paper : "Learning to recognise rare traffic signs" How to use this

samsja 4 Jan 05, 2023
v objective diffusion inference code for PyTorch.

v-diffusion-pytorch v objective diffusion inference code for PyTorch, by Katherine Crowson (@RiversHaveWings) and Chainbreakers AI (@jd_pressman). The

Katherine Crowson 635 Dec 30, 2022
This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"

Differentiable Volumetric Rendering Paper | Supplementary | Spotlight Video | Blog Entry | Presentation | Interactive Slides | Project Page This repos

697 Jan 06, 2023
Streamlit component for TensorBoard, TensorFlow's visualization toolkit

streamlit-tensorboard This is a work-in-progress, providing a function to embed TensorBoard, TensorFlow's visualization toolkit, in Streamlit apps. In

Snehan Kekre 27 Nov 13, 2022
Finding all things on-prem Microsoft for password spraying and enumeration.

msprobe About Installing Usage Examples Coming Soon Acknowledgements About Finding all things on-prem Microsoft for password spraying and enumeration.

205 Jan 09, 2023
N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting

N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting Recent progress in neural forecasting instigated significant improvements in the

Cristian Challu 82 Jan 04, 2023
Vignette is a face tracking software for characters using osu!framework.

Vignette is a face tracking software for characters using osu!framework. Unlike most solutions, Vignette is: Made with osu!framework, the game framewo

Vignette 412 Dec 28, 2022
DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng

Automation and Intelligence for Civil Engineering (AI4CE) Lab @ NYU 98 Dec 21, 2022
J.A.R.V.I.S is an AI virtual assistant made in python.

J.A.R.V.I.S is an AI virtual assistant made in python. Running JARVIS Without Python To run JARVIS without python: 1. Head over to our installation pa

somePythonProgrammer 16 Dec 29, 2022
FedTorch is an open-source Python package for distributed and federated training of machine learning models using PyTorch distributed API

FedTorch is a generic repository for benchmarking different federated and distributed learning algorithms using PyTorch Distributed API.

Machine Learning and Optimization Lab @PennState 136 Dec 23, 2022