CrayLabs and user-contributed examples of using SmartSim for various simulation and machine learning applications.

Last update: Mar 30, 2022

Overview

SmartSim Example Zoo

This repository contains CrayLabs and user-contributed examples of using SmartSim for various simulation and machine learning applications.

The CrayLabs team will attempt to keep examples updated with current releases, but all user-contributed examples should specify the release they were created with.

Contributing Examples

We welcome any and all contributions to this repository. The CrayLabs team will do their best to review them in a timely manner. If you contribute an example, please include a description and references to the code, relevant previous implementations, or open source code that the work is based on, for the benefit of anyone who would like to try out your example.

Examples by Paper

The following examples are implemented based on existing research papers. Each example lists the paper, previous works, and links to the implementation (stored either within this repository or in a separate repository).

1. DeepDriveMD

Contributing User: CrayLabs
Tags: OpenMM, CVAE, online inference, unsupervised online learning, PyTorch, ensemble

This use case highlights many features of SmartSim and SmartRedis and shows how, together, they can be used to orchestrate complex workflows with coupled applications without using the filesystem to exchange information.

More specifically, this use case is based on the original DeepDriveMD work, which was later extended with an asynchronous streaming version. SmartSim builds on the streaming implementation by using the SmartSim architecture. The main difference between the SmartSim implementation and the previous implementations is that neither the ML models nor the Molecular Dynamics (MD) intermediate results are stored on the file system. Additionally, the inference portion of the workflow takes place inside the database instead of in a separate task launched on the system.

2. TensorFlowFoam

Contributing User: CrayLabs
Tags: Online Inference, TensorFlow, OpenFOAM, supervised learning

This example shows how to use TensorFlow inside OpenFOAM simulations using SmartSim.

More specifically, this SmartSim use case adapts the TensorFlowFoam work, which used a deep neural network to predict steady-state turbulent viscosities of the Spalart-Allmaras (SA) model. This use case highlights that a machine learning model can be evaluated using SmartSim from within a simulation with minimal external library code. For the OpenFOAM use case herein, only four SmartRedis client API calls are needed: one each to initialize a client connection, send tensor data for evaluation, execute the TensorFlow model, and retrieve the model inference result.
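The four-call pattern described above can be sketched in Python. This is only an illustration, not the OpenFOAM implementation: `InMemoryClient` is a hypothetical stand-in that keeps everything in a dictionary so the sketch runs without a database, and `toy_viscosity_model`, the key names, and `set_model` are assumptions made for the example. The method names `put_tensor`, `run_model`, and `get_tensor` are chosen to mirror the SmartRedis client; consult the SmartRedis documentation for the real signatures.

```python
from typing import Callable, Dict, List, Sequence

Tensor = List[float]


class InMemoryClient:
    """Hypothetical stand-in for a SmartRedis client (no database needed)."""

    def __init__(self) -> None:
        self._tensors: Dict[str, Tensor] = {}
        self._models: Dict[str, Callable[[Tensor], Tensor]] = {}

    def set_model(self, name: str, fn: Callable[[Tensor], Tensor]) -> None:
        # Assumed to be done once by the driver script, not by the simulation.
        self._models[name] = fn

    def put_tensor(self, name: str, data: Sequence[float]) -> None:
        self._tensors[name] = list(data)

    def run_model(self, name: str, inputs: Sequence[str],
                  outputs: Sequence[str]) -> None:
        # The model runs where the data lives; only key names cross this call.
        result = self._models[name](self._tensors[inputs[0]])
        self._tensors[outputs[0]] = result

    def get_tensor(self, name: str) -> Tensor:
        return list(self._tensors[name])


def toy_viscosity_model(features: Tensor) -> Tensor:
    # Placeholder for the trained network: doubles each input value.
    return [2.0 * x for x in features]


def infer_once(client: InMemoryClient, features: Sequence[float]) -> Tensor:
    client.put_tensor("features", features)                          # 2. send
    client.run_model("nut_model", inputs=["features"], outputs=["nut"])  # 3. execute
    return client.get_tensor("nut")                                  # 4. retrieve


client = InMemoryClient()                                            # 1. initialize
client.set_model("nut_model", toy_viscosity_model)
prediction = infer_once(client, [0.5, 1.0, 1.5])
print(prediction)  # [1.0, 2.0, 3.0]
```

The simulation side only ever touches the client and a few key names, which is why so little external library code is needed inside the solver.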

In general, this example provides a useful driver script for those looking to run OpenFOAM with SmartSim.

3. ML-EKE

Contributing User: CrayLabs
Tags: Online inference, MOM6, climate modeling, ensemble, parameterization replacement

This example was a collaboration between CrayLabs (HPE), NCAR, and the University of Victoria. Using SmartSim, this example shows how to run an ensemble of simulations, all using the SmartSim architecture to replace a parameterization (MEKE) within each global ocean simulation (MOM6).

Paper Abstract:

We demonstrate the first climate-scale, numerical ocean simulations improved through distributed, online inference of Deep Neural Networks (DNN) using SmartSim. SmartSim is a library dedicated to enabling online analysis and Machine Learning (ML) for traditional HPC simulations. In this paper, we detail the SmartSim architecture and provide benchmarks including online inference with a shared ML model on heterogeneous HPC systems. We demonstrate the capability of SmartSim by using it to run a 12-member ensemble of global-scale, high-resolution ocean simulations, each spanning 19 compute nodes, all communicating with the same ML architecture at each simulation timestep. In total, 970 billion inferences are collectively served by running the ensemble for a total of 120 simulated years. Finally, we show our solution is stable over the full duration of the model integrations, and that the inclusion of machine learning has minimal impact on the simulation runtimes.

Since this is original research done by CrayLabs, there is no previous implementation.

Examples by Simulation Model

LAMMPS

SmartSim examples with LAMMPS, a Molecular Dynamics simulation model.

1. Online Analysis of Atom Position

Contributing User: CrayLabs
Tags: Molecular Dynamics, online analysis, visualizations

LAMMPS has dump styles, which are custom I/O methods that can be implemented by users. CrayLabs implemented a SMARTSIM dump style that uses SmartRedis clients to stream data to an Orchestrator database created by SmartSim.

Once the data is in the database, any application with a SmartRedis client can consume that data. For this example, we have a simple Python script that uses iPyVolume to plot the data every 100 iterations.
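As a rough sketch of that producer/consumer split, the Python snippet here uses a hypothetical in-memory store in place of the Orchestrator database. The key scheme in `position_key`, the `simulate` producer, and the placeholder positions are all assumptions for illustration; the actual dump style and plotting script may organize keys and data differently.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


class InMemoryStore:
    """Hypothetical stand-in for the Orchestrator database."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Position]] = {}

    def put_tensor(self, key: str, positions: List[Position]) -> None:
        self._data[key] = list(positions)

    def tensor_exists(self, key: str) -> bool:
        return key in self._data

    def get_tensor(self, key: str) -> List[Position]:
        return list(self._data[key])


def position_key(step: int) -> str:
    # Assumed key scheme; the real dump style may name keys differently.
    return f"atom_positions_step_{step}"


def simulate(store: InMemoryStore, n_steps: int) -> None:
    # Producer: stands in for the simulation streaming data each iteration.
    for step in range(n_steps):
        store.put_tensor(position_key(step), [(float(step), 0.0, 0.0)])


def steps_to_plot(store: InMemoryStore, n_steps: int,
                  every: int = 100) -> List[Tuple[int, int]]:
    # Consumer: visit every `every`-th iteration that is present in the store.
    plotted = []
    for step in range(0, n_steps, every):
        key = position_key(step)
        if store.tensor_exists(key):
            positions = store.get_tensor(key)
            # A real script would pass `positions` to a plotting library here.
            plotted.append((step, len(positions)))
    return plotted


store = InMemoryStore()
simulate(store, n_steps=250)
print(steps_to_plot(store, n_steps=250))  # [(0, 1), (100, 1), (200, 1)]
```

Because the consumer only reads keys from the store, it can run as a separate process from the simulation and be swapped for any other analysis without touching the producer.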

Examples by System

High Performance Computing systems are a bit like snowflakes: they are all different. Since each one has its own quirks, examples for specific, popular systems can benefit new users.

National Center for Atmospheric Research (NCAR)

1. Cheyenne

Contributing User: CrayLabs
Implementation (this repo)
WLM: PBSPro
System: SGI 8600
CPU: Intel
GPU: None

2. Casper

Contributing User: @jedwards4b
Implementation (this repo)
WLM: PBSPro
GPU: Nvidia
CPU: Intel
SmartSim Version: 0.3.2
SmartRedis Version: 0.2.0

Oak Ridge National Lab

1. Summit

Contributing User: CrayLabs
Implementation (this repo)
System:
OS: Red Hat Enterprise Linux (RHEL)
CPU: Power9
GPU: Nvidia V100

Owner

Cray Labs
