Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed it on your personal computer!

Last update: Sep 11, 2022

Motivation

Would you like fully reproducible research or reusable workflows that run seamlessly on HPC clusters? Tired of writing and managing large Slurm submission scripts? Do you have to comment out large parts of your pipeline whenever its results have already been generated? Don't waste your precious time! awflow allows you to describe complex pipelines directly in Python that run on both your personal computer and large HPC clusters.

import awflow as aw
import glob
import numpy as np

n = 100000
tasks = 10

@aw.cpus(4)  # Request 4 CPU cores
@aw.memory("4GB")  # Request 4 GB of RAM
@aw.postcondition(aw.num_files('pi-*.npy', 10))
@aw.tasks(tasks)  # Requests '10' parallel tasks
def estimate(task_index):
    print("Executing task {} / {}.".format(task_index + 1, tasks))
    x = np.random.random(n)
    y = np.random.random(n)
    pi_estimate = (x**2 + y**2 <= 1)
    np.save('pi-' + str(task_index) + '.npy', pi_estimate)

@aw.dependency(estimate)
def merge():
    files = glob.glob('pi-*.npy')
    stack = np.vstack([np.load(f) for f in files])
    np.save('pi.npy', stack.sum() / (n * tasks) * 4)

@aw.dependency(merge)
@aw.postcondition(aw.exists('pi.npy'))  # Prevent execution if postcondition is satisfied.
def show_result():
    print("Pi:", np.load('pi.npy'))

aw.execute()

Executing this Python program (python examples/pi.py) on a Slurm HPC cluster launches the following jobs:

           1803299       all    merge username PD       0:00      1 (Dependency)
           1803300       all show_res username PD       0:00      1 (Dependency)
     1803298_[6-9]       all estimate username PD       0:00      1 (Resources)
         1803298_3       all estimate username  R       0:01      1 compute-xx
         1803298_4       all estimate username  R       0:01      1 compute-xx
         1803298_5       all estimate username  R       0:01      1 compute-xx

Check the examples directory and guide to explore the functionality.
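
The postconditions in the example above let a task be skipped once its outputs already exist. The following is a minimal, self-contained sketch of that idea only; it is not awflow's implementation. The helpers `file_exists`, `file_count`, and `run_unless_satisfied` are hypothetical stand-ins written for this illustration, loosely mirroring `aw.exists` and `aw.num_files`.

```python
import glob
import os
import tempfile


def file_exists(path):
    """Hypothetical check: true once `path` exists."""
    return lambda: os.path.exists(path)


def file_count(pattern, count):
    """Hypothetical check: true once exactly `count` files match `pattern`."""
    return lambda: len(glob.glob(pattern)) == count


def run_unless_satisfied(task, postconditions):
    """Run `task` unless every postcondition already holds.

    Returns True when the task was executed and False when it was skipped.
    """
    if postconditions and all(check() for check in postconditions):
        return False  # outputs are already present, nothing to do
    task()
    return True


# Use a temporary directory so the sketch leaves no files behind in the project.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "pi.txt")


def write_result():
    with open(target, "w") as handle:
        handle.write("3.14")


first = run_unless_satisfied(write_result, [file_exists(target)])
second = run_unless_satisfied(write_result, [file_exists(target)])
print(first, second)  # True False
```

The first call runs the task because the file is missing; the second call is skipped because the postcondition now holds. This is what makes it unnecessary to comment out finished parts of a pipeline by hand.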

Installation

The awflow package is available on PyPI, which means it is installable via pip.

$ pip install awflow

If you would like the latest features, you can install it directly from the Git repository.

$ pip install git+https://github.com/JoeriHermans/awflow

If you would like to run the examples as well, be sure to install the optional example dependencies.

$ pip install 'awflow[examples]'

Usage

The core concept in awflow is the notion of a task. Essentially, this is a function that is executed as part of your workflow. Each task is represented as a node in a directed acyclic graph, which makes it easy to specify dependencies between tasks. In addition, properties can be attributed to tasks using decorators defined by awflow, such as the number of CPU cores, GPUs, and even postconditions. Follow the guide for additional examples and descriptions.

Decorators

TODO

Workflow storage

By default, workflows are stored in the current working directory within the ./workflows folder. If desired, a central storage directory can be used instead by setting the AWFLOW_STORAGE environment variable.
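
The lookup described above can be sketched as follows. This is an illustration under stated assumptions, not awflow's actual code: `resolve_storage_dir` is a hypothetical helper, and the default folder name is taken from the text and exposed as a parameter.

```python
import os
from pathlib import Path


def resolve_storage_dir(environ=None, default_name="workflows"):
    """Return the directory in which workflows would be stored.

    Hypothetical helper: prefers the AWFLOW_STORAGE environment variable
    and falls back to a folder inside the current working directory.
    """
    environ = os.environ if environ is None else environ
    central = environ.get("AWFLOW_STORAGE")
    if central:
        return Path(central).expanduser()
    return Path.cwd() / default_name


# Without the variable, storage is local to the working directory.
local_dir = resolve_storage_dir({})

# With the variable set, all workflows share one central location.
central_dir = resolve_storage_dir({"AWFLOW_STORAGE": "/scratch/shared/awflow"})

print(local_dir)
print(central_dir)
```

Passing the environment as an argument keeps the sketch easy to test; in practice the variable would be exported in the shell before the pipeline is submitted.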

The `awflow` utility

This package comes with a utility program to manage submitted, failed, and pending workflows. Its functionality can be inspected by executing awflow -h. In addition, to streamline the management of workflows, we recommend giving every workflow a specific name so that it is easy to identify. This name does not have to be unique for every distinct workflow execution.

aw.execute(name=r'Some name')

Executing awflow list after submitting the pipeline with python pipeline.py [args] yields the following output.

$ awflow list
  Postconditions      Status      Backend     Name          Location
 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  50%                 Running     Slurm       Some name     /home/jhermans/awflow/examples/.workflows/tmpntmc712a

Modules

$ awflow cancel [workflow] TODO

$ awflow clear TODO

$ awflow list TODO

$ awflow inspect [workflow] TODO

Contributing

See CONTRIBUTING.md.

Roadmap

Documentation
README

License

As described in the LICENSE file.

Comments

[BUG] conda activation crashes standalone execution
Issue description

In the standalone backend on Unix systems, the os.system(command) used here

https://github.com/JoeriHermans/awflow/blob/1fcf255debfbc18d39a6b2baa387bbc85050209d/awflow/backends/standalone/executor.py#L53-L60

actually calls /bin/sh. On some operating systems, such as Ubuntu, sh links to dash, which does not support the scripting features required by conda activation scripts. This results in runtime errors like

sh: 5: /home/username/miniconda3/envs/envname/etc/conda/activate.d/activate-binutils_linux-64.sh: Syntax error: "(" unexpected

Proposed solution

A solution would be to change the shell with which the commands are called. This is possible thanks to the subprocess package. A good default would be bash as almost all Unix systems use it.

if node.tasks > 1:
    for task_index in range(node.tasks):
        task_command = command + ' ' + str(task_index)
        return_code = subprocess.call(task_command, shell=True, executable='/bin/bash')
else:
    return_code = subprocess.call(command, shell=True, executable='/bin/bash')

One could also add a way to change this default. Additionally, wouldn't it be better to launch the tasks as background jobs for the standalone backend (simply add & at the end of the command)?
bug
opened by francois-rozet 1

[BUG] pip install fails for version 0.0.4

$ pip install awflow==0.0.4
Collecting awflow==0.0.4
  Using cached awflow-0.0.4.tar.gz (19 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/francois/awf/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ou4rxs3q/awflow/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ou4rxs3q/awflow/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-ou4rxs3q/awflow/pip-egg-info
         cwd: /tmp/pip-install-ou4rxs3q/awflow/
    Complete output (7 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-ou4rxs3q/awflow/setup.py", line 54, in <module>
        'examples': _load_requirements('requirements_examples.txt')
      File "/tmp/pip-install-ou4rxs3q/awflow/setup.py", line 17, in _load_requirements
        with open(file_name, 'r') as file:
    FileNotFoundError: [Errno 2] No such file or directory: 'requirements_examples.txt'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

bug high priority

opened by francois-rozet 1

Jobs submitted with awflow doesn't work with Multiprocessing.pool

Hi,

I tried submitting a few jobs with awflow, but each time I run them with the Slurm backend, the pool.starmap call never produces anything and the process simply times out on the cluster. Here is an example of what happens on the cluster, where the processes are spawned but each process uses 0% of the CPU:

0 0 8196756 5.1g 85664 S 0.0 1.0 2:12.27 python
790517 rnath 20 0 7953388 5.0g 12020 S 0.0 1.0 0:01.66 python
790518 rnath 20 0 7953388 5.0g 12020 S 0.0 1.0 0:01.45 python
790519 rnath 20 0 7953388 5.0g 12020 S 0.0 1.0 0:01.76 python
790520 rnath 20 0 7953388 5.0g 12020 S 0.0 1.0 0:02.02 python
790521 rnath 20 0 7953388 5.0g 12020 S 0.0 1.0 0:01.99 python

slurmstepd: error: *** JOB 1933332 ON compute-04 CANCELLED AT 2022-04-08T19:33:26 DUE TO TIME LIMIT ***

opened by digirak 0

Releases (0.1.0)

0.1.0 (Jan 10, 2022)
New interface proposed by @francois-rozet.

Allows user to define arbitrary workflow graphs.

Post-conditions on an array-level.

Intuitive and extensible compute backend implementations.

Arbitrary starting points in compute graph.

Updated job pruning code.

Available on https://pypi.org/project/awflow/
0.0.4 (Dec 19, 2021)

First fully functional release with a CLI tool to manage workflows.

Available on PyPI.

Owner

Joeri Hermans

Combining Machine Learning and Physics to automate science.
