This module is used to create Convolutional AutoEncoders for Variational Data Assimilation

Last update: Dec 16, 2022

VarDACAE

This module is used to create Convolutional AutoEncoders for Variational Data Assimilation. A user can define, create and train an AE for Data Assimilation with just a few lines of code. It is the accompanying code to the paper here, published in Computer Methods in Applied Mechanics and Engineering.

Introduction

Data Assimilation (DA) is an uncertainty quantification technique that reduces the error in predictions by combining forecast data with observations of the state. The most common DA techniques are Variational approaches and Kalman Filters.

In this work, we propose a method of using Autoencoders to model the Background error covariance matrix, to greatly reduce the computational cost of solving 3D Variational DA while increasing the quality of the Data Assimilation.
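For readers new to 3D Variational DA, the sketch below evaluates the textbook 3D-Var cost function on a toy problem and solves for its minimiser. It is not taken from the VarDACAE code base: the function name `var3d_cost`, the variable names and the identity covariances are all assumptions chosen for illustration, and the observation operator is assumed to be linear.

```python
import numpy as np

def var3d_cost(x, x_b, y, H, B_inv, R_inv):
    """Textbook 3D-Var cost: background misfit plus observation misfit."""
    dx = x - x_b      # departure from the background (forecast) state
    dy = y - H @ x    # departure from the observations
    return 0.5 * dx @ B_inv @ dx + 0.5 * dy @ R_inv @ dy

# Toy problem (illustrative values): 3 state variables, 2 observed directly.
x_b = np.array([1.0, 2.0, 3.0])      # background state
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])      # linear observation operator
y = np.array([1.5, 2.5])             # observations
B_inv = np.eye(3)                    # inverse background error covariance
R_inv = np.eye(2)                    # inverse observation error covariance

# For linear H the minimiser solves
# (B^-1 + H^T R^-1 H) x = B^-1 x_b + H^T R^-1 y
A = B_inv + H.T @ R_inv @ H
rhs = B_inv @ x_b + H.T @ R_inv @ y
x_a = np.linalg.solve(A, rhs)        # analysis state

print("analysis state:", x_a)
print("cost at background:", var3d_cost(x_b, x_b, y, H, B_inv, R_inv))
print("cost at analysis:  ", var3d_cost(x_a, x_b, y, H, B_inv, R_inv))
```

In realistic problems the state vector is very large, so forming and inverting the background error covariance matrix dominates the cost. That matrix is the term this work models with an autoencoder.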

Data

The data used in this paper is owned by the Data Science Institute, Imperial College, London. If you do not have access to this data, please see the section below on training a model with your own data.

Installation

Install vtk by navigating to this link and installing the version applicable to your system.
Navigate to the project home directory and run:
```
pip install -e .
```
Run pytest from the project home directory to confirm that the installation is correct.

Tests

From the project home directory run pytest.

Getting Started

To train and evaluate a Tucodec model on Fluidity data:

```
from VarDACAE import TrainAE, BatchDA
from VarDACAE.settings.models.CLIC import CLIC

model_kwargs = {"model_name": "Tucodec", "block_type": "NeXt", "Cstd": 64}

settings = CLIC(**model_kwargs)    # settings describing experimental setup
expdir = "experiments/expt1/"      # dir to save results data and models

trainer = TrainAE(settings, expdir, batch_sz=16)
model = trainer.train(num_epochs=150)   # this will take approximately 8 hrs on a K80

# evaluate DA on the test set:
results_df = BatchDA(settings, AEModel=model).run()
```

Settings Instance

The API is built around a monolithic settings object that defines every configuration parameter, from the model definition to the seed. Using this single source of truth means that an experiment can be repeated exactly by loading a pickled settings object. All key classes, such as TrainAE and BatchDA, require a settings object at initialisation.
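The pickle round trip mentioned above can be sketched as follows. To keep the example self-contained it uses a `types.SimpleNamespace` as a stand-in for a real settings object; the attribute names (including `seed`) and the file name `settings.pkl` are assumptions for illustration, not the library's actual layout.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace

# Hypothetical stand-in for a settings object; attribute names are illustrative.
settings = SimpleNamespace(model_name="Tucodec", block_type="NeXt", Cstd=64, seed=42)

# Save the settings alongside the experiment outputs.
path = os.path.join(tempfile.mkdtemp(), "settings.pkl")
with open(path, "wb") as f:
    pickle.dump(settings, f)

# Later: reload the exact configuration to repeat the experiment.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == settings)
```

With a real settings class, the class definition must be importable when the pickle is loaded, because pickle stores a reference to the class rather than its code.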

Train a model on your own data

To train a model on your own 3D data you must do the following:

Override the default get_X(...) method in the GetData loader class:

```
from VarDACAE import GetData

class NewLoaderClass(GetData):
    def get_X(self, settings):
        """
        Arguments:
            settings: a settings.Config instance
        Returns:
            np.array of dimensions B x nx x ny x nz
        """
        # ... calculate, load or download X here (X is a placeholder).
        # For an example see VarDACAE.data.load.GetData.get_X
        return X
```

Create a new settings class that inherits from your desired model's settings class (e.g. VarDACAE.settings.models.CLIC.CLIC) and update the data dimensions:

```
from VarDACAE.settings.models.CLIC import CLIC

class NewConfig(CLIC):
    def __init__(self, CLIC_kwargs, opt_kwargs):
        super().__init__(**CLIC_kwargs)  # initialise the parent CLIC settings
        self.n3d = (100, 200, 300)  # Define input domain size
                                    # This is used by ConvScheduler
        self.X_FP = "SET_IF_REQ_BY_get_X"
        # ... use opt_kwargs as desired

CLIC_kwargs = {"model_name": "Tucodec", "block_type": "NeXt",
               "Cstd": 64, "loader": NewLoaderClass}
               # NOTE: pass the class itself, do not instantiate NewLoaderClass

opt_kwargs = {}  # any extra options for NewConfig (empty placeholder)

settings = NewConfig(CLIC_kwargs, opt_kwargs)
```

This settings object can now be used to train a model with the TrainAE class, as shown in Getting Started.
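Before wiring in real data, it can help to check that the array returned by a custom get_X has the expected B x nx x ny x nz layout. The sketch below builds a random placeholder array of that shape; the helper name `make_synthetic_X` and the small domain size are assumptions for illustration, and random noise is only useful for testing the plumbing, not for training a meaningful model.

```python
import numpy as np

def make_synthetic_X(n_samples, n3d, seed=0):
    """Return a random array of shape (B, nx, ny, nz) for pipeline testing."""
    rng = np.random.default_rng(seed)           # seeded for repeatability
    return rng.standard_normal((n_samples, *n3d))

n3d = (10, 20, 30)                              # illustrative domain size
X = make_synthetic_X(n_samples=4, n3d=n3d)

# The trailing three dimensions should match the n3d set on the settings object.
assert X.shape[1:] == n3d
print(X.shape)
```

A get_X override could return such an array directly while the rest of the configuration is being debugged, then switch to loading the real data once training runs end to end.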

Owner

Julian Mack
