Towards Representation Learning for Atmospheric Dynamics (AtmoDist)

The prediction of future climate scenarios under anthropogenic forcing is critical to understand climate change and to assess the impact of potentially counter-acting technologies. Machine learning and hybrid techniques for this prediction rely on informative metrics that are sensitive to pertinent but often subtle influences. For atmospheric dynamics, a critical part of the climate system, no well-established metric exists and visual inspection is still often used in practice. However, this "eyeball metric" cannot be used for machine learning, where an algorithmic description is required. Motivated by the success of intermediate neural network activations as a basis for learned metrics, e.g. in computer vision, we present a novel, self-supervised representation learning approach specifically designed for atmospheric dynamics. Our approach, called AtmoDist, trains a neural network on a simple, auxiliary task: predicting the temporal distance between elements of a randomly shuffled sequence of atmospheric fields (e.g. the components of the wind field from reanalysis or simulation). The task forces the network to learn important intrinsic aspects of the data as activations in its layers, from which a discriminative metric can then be obtained. We demonstrate this by using AtmoDist to define a metric for GAN-based super-resolution of vorticity and divergence. Our upscaled data closely matches a high-resolution reference, both visually and in terms of its statistics, and significantly outperforms the state of the art based on mean squared error. Since AtmoDist is unsupervised, only requires a temporal sequence of fields, and uses a simple auxiliary task, it has the potential to be of utility in a wide range of applications.

Original implementation of

Hoffmann, Sebastian, and Christian Lessig. "Towards Representation Learning for Atmospheric Dynamics." arXiv preprint arXiv:2109.09076 (2021). https://arxiv.org/abs/2109.09076

presented as part of the NeurIPS 2021 Workshop on Tackling Climate Change with Machine Learning

We would like to thank Stengel et al. for openly making available their implementation (https://github.com/NREL/PhIRE) of Adversarial super-resolution of climatological wind and solar data on which we directly based the super-resolution part of this work.


Requirements

  • tensorflow 1.15.5
  • pyshtools (for SR evaluation)
  • pyspharm (for SR evaluation)
  • h5py
  • hdf5plugin
  • dask.array

Installation

pip install -e ./

This also makes available multiple command-line tools that provide easy access to the preprocessing, training, and evaluation routines. It is recommended to install the project in a virtual environment so as not to pollute the global PATH.


CLI Tools

The provided CLI tools don't accept parameters but rather act as a shortcut to execute the corresponding script files. All parameters controlling the behaviour of the training etc. should thus be adjusted in the script files directly. We list both the command-line command, as well as the script file the command executes.

  • rplearn-data (python/phire/data_tool.py)
    • Samples patches and generates .tfrecords files from HDF5 data for the self-supervised representation-learning task.
  • rplearn-train (python/phire/rplearn/train.py)
    • Trains the representation network. By toggling comments, the same script is also used for evaluation of the trained network.
  • phire-data (python/phire/data_tool.py)
    • Samples patches and generates .tfrecords files from HDF5 data for the super-resolution task.
  • phire-train (python/phire/main.py)
    • Trains the SRGAN model using either MSE or a content-loss based on AtmoDist.
  • phire-eval (python/phire/evaluation/cli.py)
    • Evaluates trained SRGAN models using various metrics (e.g. energy spectrum, semivariogram, etc.). Generation of images is also part of this.

Project Structure

  • python/phire
    • Mostly preserved from the Stengel et al. implementation, this directory contains the code for the SR training. sr_network.py contains the actual GAN model, whereas PhIREGANs.py contains the main training loop, the data pipeline, and the inference procedure.
  • python/phire/rplearn
    • Contains everything related to the representation learning task, i.e. AtmoDist. The actual ResNet models are defined in resnet.py, while the training procedure can be found in train.py.
  • python/phire/evaluation
    • Dedicated to the evaluation of the super-resolved fields. The main configuration of the evaluation is done in cli.py, while the other files mostly correspond to specific evaluation metrics.
  • python/phire/data
    • Static data shipped with the python package.
  • python/phire/jetstream
    • WiP: Prediction of jetstream latitude as downstream task.
  • scripts/
    • Various utility scripts, e.g. used to generate some of the figures seen in the paper.

Preparing the data

AtmoDist is trained on vorticity and divergence fields from ERA5 reanalysis data. The data was obtained directly as spherical harmonic coefficients from model level 120 and then converted to regular lat-lon grids (1280 x 2560) using pyshtools (this conversion step is currently not included in this repository).

We assume this gridded data to be stored in one HDF5 file each for training and evaluation, each containing a single dataset /data with dimensions C x T x H x W. These dimensions correspond to channel (i.e. variable), time, height, and width, respectively. Patches are then sampled from this HDF5 data and stored in .tfrecords files for training.
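The layout described above can be illustrated with a small patch sampler. This is only a sketch: the function name sample_patch, the patch size, and the in-memory NumPy array standing in for the /data dataset are assumptions, and it is not the logic used by phire/data_tool.py.

```python
import numpy as np


def sample_patch(data, patch_h, patch_w, rng):
    """Cut one random patch from an array laid out as C x T x H x W.

    Hypothetical helper for illustration only. `data` can be any object
    that supports NumPy-style slicing and exposes a 4-tuple `.shape`.
    """
    c, t, h, w = data.shape
    ti = int(rng.integers(0, t))                # random time step
    y = int(rng.integers(0, h - patch_h + 1))   # top-left corner (row)
    x = int(rng.integers(0, w - patch_w + 1))   # top-left corner (column)
    # Keep all channels, fix the time step, crop the spatial window.
    return np.asarray(data[:, ti, y:y + patch_h, x:x + patch_w])


# Small stand-in for /data: 2 channels (e.g. vorticity and divergence),
# 4 time steps, and a 16 x 32 grid instead of 1280 x 2560.
rng = np.random.default_rng(0)
data = rng.normal(size=(2, 4, 16, 32))
patch = sample_patch(data, 8, 8, rng)
print(patch.shape)
```

The resulting patch keeps the channel axis first, i.e. it has shape C x patch_h x patch_w.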

In practice, these "master" files contained virtual datasets, while the actual data was stored as one HDF5 file per year. This is, however, not a hard requirement. The script to create these virtual datasets is currently not included in the repository but may be added at a later point in time.
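Since the original script is not available, the following is a minimal sketch of how such a "master" file could be assembled with h5py's virtual dataset API (h5py.VirtualLayout, h5py.VirtualSource, create_virtual_dataset). The function name build_master, the file names, and the choice to concatenate the yearly files along the time axis are assumptions.

```python
import os
import tempfile

import h5py
import numpy as np


def build_master(master_path, yearly_paths, dataset="data"):
    """Expose several C x T x H x W files as one virtual dataset.

    Hypothetical helper: the yearly files are stitched together along
    the time axis (axis 1) without copying any data.
    """
    shapes = []
    for path in yearly_paths:
        with h5py.File(path, "r") as f:
            shapes.append(f[dataset].shape)
            dtype = f[dataset].dtype
    c, _, h, w = shapes[0]
    total_t = sum(s[1] for s in shapes)

    layout = h5py.VirtualLayout(shape=(c, total_t, h, w), dtype=dtype)
    offset = 0
    for path, shape in zip(yearly_paths, shapes):
        source = h5py.VirtualSource(path, dataset, shape=shape)
        layout[:, offset:offset + shape[1], :, :] = source
        offset += shape[1]

    with h5py.File(master_path, "w", libver="latest") as f:
        f.create_virtual_dataset(dataset, layout, fillvalue=0)


# Two tiny "yearly" files with 3 and 2 time steps, filled with 0 and 1.
tmp = tempfile.mkdtemp()
paths = []
for i, t in enumerate([3, 2]):
    path = os.path.join(tmp, f"year{i}.h5")
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=np.full((2, t, 4, 8), i, dtype="f4"))
    paths.append(path)

master = os.path.join(tmp, "master.h5")
build_master(master, paths)
with h5py.File(master, "r") as f:
    master_shape = f["data"].shape
    last_value = float(f["data"][0, 4, 0, 0])  # comes from the second file
print(master_shape, last_value)
```

Readers of the master file see a single /data dataset, while the values are still read from the per-year files.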

To sample patches for training or evaluation, run rplearn-data and phire-data for the representation-learning and super-resolution tasks, respectively.

Normalization

Normalization is done by the phire/data_tool.py script. This procedure is opaque to the models and data is only de-normalized during evaluation. The mean and standard deviations used for normalization can be specified using DataSampler.mean, DataSampler.std, DataSampler.mean_log1p, DataSampler.std_log1p. If specified as None, then these statistics will be calculated from the dataset using dask (this will take some time).
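As an illustration of the normalize/de-normalize round trip, here is a sketch of per-channel standardization for data laid out as C x T x H x W. The function names are hypothetical, NumPy is used in place of dask to keep the example small, and the log1p variants (mean_log1p, std_log1p) are not covered because their exact transform is not described here.

```python
import numpy as np


def channel_stats(data):
    """Per-channel mean and standard deviation of a C x T x H x W array."""
    mean = data.mean(axis=(1, 2, 3), keepdims=True)
    std = data.std(axis=(1, 2, 3), keepdims=True)
    return mean, std


def normalize(data, mean, std):
    """Standardize each channel to zero mean and unit variance."""
    return (data - mean) / std


def denormalize(z, mean, std):
    """Invert normalize(), e.g. before computing evaluation metrics."""
    return z * std + mean


rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=(2, 3, 4, 5))

mean, std = channel_stats(data)
z = normalize(data, mean, std)
restored = denormalize(z, mean, std)
print(np.allclose(restored, data))
```

With keepdims=True the statistics have shape C x 1 x 1 x 1, so they broadcast over time and space without further reshaping.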


Training the AtmoDist model

  1. Specify dataset location, model name, output location, and number of classes (i.e. max delta T) in phire/rplearn/train.py
  2. Run training using rplearn-train
  3. Switch to evaluation by calling evaluate_all() (toggle the corresponding comments in train.py) and compute metrics on the evaluation set
  4. Find optimal epoch and calculate normalization factors (for specific layer) using calculate_loss()
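To make the role of the number of classes (max delta T) more concrete, the following sketch samples pairs of time indices together with a class label for the temporal-distance task. The function name sample_pair, the zero-based label convention (label = delta - 1), and the uniform sampling of the offset are assumptions, not necessarily what train.py or the data tool does.

```python
import random


def sample_pair(num_timesteps, max_delta, rng):
    """Return (t1, t2, label) with 1 <= t2 - t1 <= max_delta.

    Hypothetical helper: the label is the zero-based class index of the
    temporal distance, so there are exactly `max_delta` classes.
    """
    delta = rng.randint(1, max_delta)              # inclusive on both ends
    t1 = rng.randrange(0, num_timesteps - delta)   # keeps t2 inside the data
    return t1, t1 + delta, delta - 1


rng = random.Random(0)
pairs = [sample_pair(1000, 8, rng) for _ in range(5)]
for t1, t2, label in pairs:
    print(t1, t2, label)
```

A network trained on this task would receive the two fields at t1 and t2 and be asked to classify which of the max_delta offsets separates them.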

Training the SRGAN model

  1. Specify dataset location, model name, AtmoDist model to use, and training regimen in phire/main.py
  2. Run training using phire-train
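The difference between the two training regimes can be sketched as follows: a pixel-wise MSE compares the fields directly, whereas a content loss compares their feature representations. In this sketch the feature extractor is a toy 2 x 2 average pooling that merely stands in for the activations of a trained AtmoDist network, and dividing by a scale factor is an assumption about how a per-layer normalization factor might be applied.

```python
import numpy as np


def pixel_mse(sr, hr):
    """Plain mean squared error between super-resolved and reference fields."""
    return float(np.mean((sr - hr) ** 2))


def content_loss(features, sr, hr, scale=1.0):
    """MSE in feature space, divided by a (hypothetical) normalization factor."""
    diff = features(sr) - features(hr)
    return float(np.mean(diff ** 2) / scale)


def toy_features(x):
    """Stand-in feature extractor: 2 x 2 average pooling over the last two axes."""
    h, w = x.shape[-2:]
    pooled = x.reshape(*x.shape[:-2], h // 2, 2, w // 2, 2)
    return pooled.mean(axis=(-3, -1))


rng = np.random.default_rng(0)
hr = rng.normal(size=(2, 8, 8))               # C x H x W reference patch
sr = hr + 0.1 * rng.normal(size=hr.shape)     # perturbed "super-resolved" patch

print(pixel_mse(sr, hr))
print(content_loss(toy_features, sr, hr))
```

In the actual project the features would come from a chosen layer of the trained representation network rather than from a pooling operation.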

Evaluating the SRGAN models

  1. Specify dataset location, models to evaluate, output location, and metrics to calculate in phire/evaluation/cli.py
  2. Evaluate using phire-eval
  3. Toggle the if-statement to generate comparison plots and data between different models, then rerun phire-eval
Owner
Sebastian Hoffmann