Python Research Framework

Last update: Dec 13, 2022

pyfra

The Python Research Framework.

Design Philosophy

Research code has some of the fastest-shifting requirements of any type of code. It is nearly impossible to plan the proper abstractions ahead of time, because what you originally thought was your main focus will very likely stop being so over the course of the project. Further, research code (especially in ML) often involves big, complicated pipelines spread across many machines, which are run either by hand or by shell scripts far more complicated than any shell script should be.

Therefore, the objective of pyfra is to make it as fast and low-friction as possible to write research code involving complex pipelines over many machines. This entails making it as easy as possible to implement a research idea in reality, at the cost of fine-grained control and the long-term maintainability of the system. In other words, pyfra expects that code will either be rapidly obsoleted by newer code, or rewritten using some other framework once it is no longer a research project and requirements have settled down.

Pyfra is in its very early stages of development. The interface may change rapidly and without warning.

Features:

Spin up an internal webserver complete with a permissions system using only a few lines of code
Extremely elegant shell integration—run commands on any server seamlessly. All the best parts of bash and python combined
Automated remote environment setup, so you never have to worry about provisioning machines by hand again
(WIP) Tools for painless functional programming in python
(Coming soon) High level API for experiment management/scheduling and resource provisioning
(Coming soon) Idempotent resumable data pipelines with no cognitive overhead

Example code

from pyfra import *

loc = Remote()
rem = Remote("[email protected]")
nas = Remote("[email protected]")

@page("Run experiment", dropdowns={'server': ['local', 'remote']})
def run_experiment(server: str, config_file: str, some_numerical_value: int, some_checkbox: bool):
    r = loc if server == 'local' else rem

    r.sh("git clone https://github.com/EleutherAI/gpt-neox")
    
    # rsync as a function can do local-local, local-remote, and remote-remote
    rsync(config_file, r.file("gpt-neox/configs/my-config.yml"))
    rsync(nas.file('some_data_file'), r.file('gpt-neox/data/whatever'))
    
    return r.sh('cd gpt-neox; python3 main.py')

@page("Write example file and copy")
def example():
    rem.fwrite("testing123.txt", "hello world")
    
    # local files can be specified as just a string
    rsync(rem.file('testing123.txt'), 'test1.txt')
    rsync(rem.file('testing123.txt'), loc.file('test2.txt'))

    loc.sh('cat test1.txt')
    
    assert fread('test1.txt') == fread('test2.txt')
    
    # fread, fwrite, etc can take a `rem.file` instead of a string filename.
    # you can also use all *read and *write functions directly on the remote too.
    assert fread('test1.txt') == fread(rem.file('testing123.txt'))
    assert fread('test1.txt') == rem.fread('testing123.txt')

    # ls as a function returns a list of files (with absolute paths) on the selected remote.
    # the returned value is displayed on the webpage.
    return '\n'.join(rem.ls('/'))

@page("List files in some directory")
def list_files(directory):
    return sh(f"ls -la {directory | quote}")


# start internal webserver
webserver()
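
The `list_files` page above passes a user-supplied directory into a shell command, which is why the value is quoted first. As a minimal sketch of what that quoting buys you, here is the same idea using only the standard library's `shlex.quote`. The helper name `build_ls_command` is hypothetical, and whether pyfra's `quote` behaves exactly like `shlex.quote` is an assumption, not something this README confirms.

```python
import shlex


def build_ls_command(directory: str) -> str:
    """Build an `ls -la` command line with the directory safely quoted.

    Hypothetical helper for illustration; it is not part of pyfra.
    """
    return f"ls -la {shlex.quote(directory)}"


# A plain path contains only safe characters and passes through unchanged.
plain = build_ls_command("/tmp")

# A hostile value is wrapped in single quotes, so the shell treats it as one
# argument to `ls` instead of a second command.
hostile = build_ls_command("/tmp; rm -rf ~")

print(plain)
print(hostile)
```

Without the quoting step, the second value would be interpreted by the shell as two separate commands.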

Installation

pip3 install git+https://github.com/EleutherAI/pyfra/

The version on PyPI is not up to date; do not use it.

Webserver screenshots

Comments

Try to install sudo in _install

Sudo is installed in setup.apt(), which is not run when python_version=None is set for an env. This PR tries to install the sudo package in _install, which solves this issue.

opened by kurumuz 1

Styling updates 2

This should fix some issues that were noticed recently.

increases the width of the content in the middle

all button icons are now the same (until we figure out better solution)

content that is overflowing should now be scrollable

opened by jprester 0

Update styling

I made some updates to styling for the admin dashboard pages.

Stuff I did:

changed the styling to look like design mockup

moved ids to classes in css. Ids should be used for javascript selector

added some svg icons

made the UI somewhat responsive

opened by jprester 0

docs: docs are empty

Screenshot from the RTD page:

I recommend checking the raw output of the build on the RTD dashboard.

Probably some library installation issue when running setup.

opened by TomFrederik 0

Type annotations

Type annotations are a must-have for public facing library exports, as they allow users to infer a lot of information about calls/return values independent of documentation, as well as help with code completions.

opened by hugbubby 0
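
To make the point of this issue concrete, here is a small sketch of what an annotated export gives a caller without any documentation. `run_on_host` is a hypothetical function invented for illustration and is not part of pyfra. Its body is a stand-in that runs nothing.

```python
import inspect
from typing import get_type_hints


def run_on_host(host: str, command: str, timeout_s: float = 30.0) -> str:
    """Hypothetical library export; the body is a stand-in and runs nothing."""
    return f"[{host}] would run {command!r} (timeout {timeout_s}s)"


# Editors and type checkers read the same metadata that is inspectable here:
# parameter types, the default value, and the return type.
hints = get_type_hints(run_on_host)
signature = inspect.signature(run_on_host)

print(hints)
print(signature)
```

The same information drives code completion and lets a type checker flag a call such as `run_on_host("box", 5)` before it runs.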

Releases (v0.3.0)

v0.3.0 (Dec 9, 2021)

What's new

Envs now resume where they left off (and Remotes have an option for turning this behaviour on)

@stage caching added
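
The release notes do not describe how `@stage` caching works, so the following is only a rough sketch of the general idea behind resumable, cached pipeline stages: persist each result on disk, keyed by the function and its arguments, and skip the work on a rerun. The decorator name `cached_stage`, the key scheme, and the pickle storage are all assumptions made for illustration. This is not pyfra's implementation.

```python
import functools
import hashlib
import json
import os
import pickle
import tempfile


def cached_stage(cache_dir: str):
    """Hypothetical decorator: store a function's result on disk, keyed by its
    name and arguments, so a rerun skips work that already finished."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Arguments must be JSON-serializable for this simple key scheme.
            key_src = json.dumps([fn.__name__, args, kwargs], sort_keys=True)
            key = hashlib.sha256(key_src.encode()).hexdigest()
            path = os.path.join(cache_dir, f"{key}.pkl")
            if os.path.exists(path):
                with open(path, "rb") as f:
                    return pickle.load(f)
            result = fn(*args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            # Write to a temp file first so a crash never leaves a partial entry.
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(result, f)
            os.replace(tmp, path)
            return result
        return wrapper
    return decorator


calls = []
cache_dir = tempfile.mkdtemp()


@cached_stage(cache_dir)
def tokenize(text: str) -> list:
    calls.append(text)  # records how often the body really runs
    return text.split()


first = tokenize("hello research world")
second = tokenize("hello research world")  # served from the on-disk cache
print(first, len(calls))
```

Because the cache lives on disk rather than in memory, a script that crashes halfway can be restarted and will pick up after the last stage that completed.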

Breaking Changes

delegation promoted to full submodule and experiment removed

pyfra.functional removed

pyfra.web deprecated and moved to contrib

contrib revamp

Full Changelog: https://github.com/EleutherAI/pyfra/compare/8e775df36ca8f2ae39b0b7add9c30eab446207b1...9616e835578f8ad04a6d9c3b405777fc4b7e0853

v0.3.0rc6 (Sep 1, 2021)



Owner

EleutherAI
