Pipetools enables function composition similar to using Unix pipes.

Overview

Pipetools

tests-badge coverage-badge pypi-badge

Complete documentation

pipetools enables function composition similar to using Unix pipes.

It allows forward-composition and piping of arbitrary functions - no need to decorate them or do anything extra.

It also packs a bunch of utils that make common operations more convenient and readable.

Source is on github.

Why?

Piping and function composition are some of the most natural operations there are for plenty of programming tasks. Yet Python doesn't have a built-in way of performing them. That forces you to either deep nesting of function calls or adding extra glue code.

Example

Say you want to create a list of python files in a given directory, ordered by filename length, as a string, each file on one line and also with line numbers:

>>> print(pyfiles_by_length('../pipetools'))
1. ds_builder.py
2. __init__.py
3. compat.py
4. utils.py
5. main.py

All the ingredients are already there, you just have to glue them together. You might write it like this:

def pyfiles_by_length(directory):
    all_files = os.listdir(directory)
    py_files = [f for f in all_files if f.endswith('.py')]
    sorted_files = sorted(py_files, key=len, reverse=True)
    numbered = enumerate(py_files, 1)
    rows = ("{0}. {1}".format(i, f) for i, f in numbered)
    return '\n'.join(rows)

Or perhaps like this:

def pyfiles_by_length(directory):
    return '\n'.join('{0}. {1}'.format(*x) for x in enumerate(reversed(sorted(
        [f for f in os.listdir(directory) if f.endswith('.py')], key=len)), 1))

Or, if you're a mad scientist, you would probably do it like this:

pyfiles_by_length = lambda d: (reduce('{0}\n{1}'.format,
    map(lambda x: '%d. %s' % x, enumerate(reversed(sorted(
        filter(lambda f: f.endswith('.py'), os.listdir(d)), key=len))))))

But there should be one -- and preferably only one -- obvious way to do it.

So which one is it? Well, to redeem the situation, pipetools give you yet another possibility!

pyfiles_by_length = (pipe
    | os.listdir
    | where(X.endswith('.py'))
    | sort_by(len).descending
    | (enumerate, X, 1)
    | foreach("{0}. {1}")
    | '\n'.join)

Why would I do that, you ask? Comparing to the native Python code, it's

  • Easier to read -- minimal extra clutter
  • Easier to understand -- one-way data flow from one step to the next, nothing else to keep track of
  • Easier to change -- want more processing? just add a step to the pipeline
  • Removes some bug opportunities -- did you spot the bug in the first example?

Of course it won't solve all your problems, but a great deal of code can be expressed as a pipeline, giving you the above benefits. Read on to see how it works!

Installation

$ pip install pipetools

Uh, what's that?

Usage

The pipe

The pipe object can be used to pipe functions together to form new functions, and it works like this:

from pipetools import pipe

f = pipe | a | b | c

# is the same as:
def f(x):
    return c(b(a(x)))

A real example, sum of odd numbers from 0 to x:

from functools import partial
from pipetools import pipe

odd_sum = pipe | range | partial(filter, lambda x: x % 2) | sum

odd_sum(10)  # -> 25

Note that the chain up to the sum is lazy.

Automatic partial application in the pipe

As partial application is often useful when piping things together, it is done automatically when the pipe encounters a tuple, so this produces the same result as the previous example:

odd_sum = pipe | range | (filter, lambda x: x % 2) | sum

As of 0.1.9, this is even more powerful, see X-partial.

Built-in tools

Pipetools contain a set of pipe-utils that solve some common tasks. For example there is a shortcut for the filter class from our example, called where():

from pipetools import pipe, where

odd_sum = pipe | range | where(lambda x: x % 2) | sum

Well that might be a bit more readable, but not really a huge improvement, but wait!

If a pipe-util is used as first or second item in the pipe (which happens quite often) the pipe at the beginning can be omitted:

odd_sum = range | where(lambda x: x % 2) | sum

See pipe-utils' documentation.

OK, but what about the ugly lambda?

where(), but also foreach(), sort_by() and other pipe-utils can be quite useful, but require a function as an argument, which can either be a named function -- which is OK if it does something complicated -- but often it's something simple, so it's appropriate to use a lambda. Except Python's lambdas are quite verbose for simple tasks and the code gets cluttered...

X object to the rescue!

from pipetools import where, X

odd_sum = range | where(X % 2) | sum

How 'bout that.

Read more about the X object and it's limitations.

Automatic string formatting

Since it doesn't make sense to compose functions with strings, when a pipe (or a pipe-util) encounters a string, it attempts to use it for (advanced) formatting:

>>> countdown = pipe | (range, 1) | reversed | foreach('{}...') | ' '.join | '{} boom'
>>> countdown(5)
'4... 3... 2... 1... boom'

Feeding the pipe

Sometimes it's useful to create a one-off pipe and immediately run some input through it. And since this is somewhat awkward (and not very readable, especially when the pipe spans multiple lines):

result = (pipe | foo | bar | boo)(some_input)

It can also be done using the > operator:

result = some_input > pipe | foo | bar | boo

Note

Note that the above method of input won't work if the input object defines __gt__ for any object - including the pipe. This can be the case for example with some objects from math libraries such as NumPy. If you experience strange results try falling back to the standard way of passing input into a pipe.

But wait, there is more

Checkout the Maybe pipe, partial application on steroids or automatic data structure creation in the full documentation.

pipeline for migrating lichess data into postgresql

How Long Does It Take Ordinary People To "Get Good" At Chess? TL;DR: According to 5.5 years of data from 2.3 million players and 450 million games, mo

Joseph Wong 182 Nov 11, 2022
Python package to transfer data in a fast, reliable, and packetized form.

pySerialTransfer Python package to transfer data in a fast, reliable, and packetized form.

PB2 101 Dec 07, 2022
An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

qgrid Qgrid is a Jupyter notebook widget which uses SlickGrid to render pandas DataFrames within a Jupyter notebook. This allows you to explore your D

Quantopian, Inc. 2.9k Jan 08, 2023
pyETT: Python library for Eleven VR Table Tennis data

pyETT: Python library for Eleven VR Table Tennis data Documentation Documentation for pyETT is located at https://pyett.readthedocs.io/. Installation

Tharsis Souza 5 Nov 19, 2022
PyPSA: Python for Power System Analysis

1 Python for Power System Analysis Contents 1 Python for Power System Analysis 1.1 About 1.2 Documentation 1.3 Functionality 1.4 Example scripts as Ju

758 Dec 30, 2022
Hidden Markov Models in Python, with scikit-learn like API

hmmlearn hmmlearn is a set of algorithms for unsupervised learning and inference of Hidden Markov Models. For supervised learning learning of HMMs and

2.7k Jan 03, 2023
Data collection, enhancement, and metrics calculation.

l3_data_collection Data collection, enhancement, and metrics calculation. Summary Repository containing code for QuantDAO's JDT data collection task.

Ruiwyn 3 Dec 23, 2022
Includes all files needed to satisfy hw02 requirements

HW 02 Data Sets Mean Scale Score for Asian and Hispanic Students, Grades 3 - 8 This dataset provides insights into the New York City education system

7 Oct 28, 2021
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Dec 25, 2022
Implementation in Python of the reliability measures such as Omega.

OmegaPy Summary Simple implementation in Python of the reliability measures: Omega Total, Omega Hierarchical and Omega Hierarchical Total. Name Link O

Rafael Valero Fernández 2 Apr 27, 2022
The repo for mlbtradetrees.com. Analyze any trade in baseball history!

The repo for mlbtradetrees.com. Analyze any trade in baseball history!

7 Nov 20, 2022
Learn machine learning the fun way, with Oracle and RedBull Racing

Red Bull Racing Analytics Hands-On Labs Introduction Are you interested in learning machine learning (ML)? How about doing this in the context of the

Oracle DevRel 55 Oct 24, 2022
COVID-19 deaths statistics around the world

COVID-19-Deaths-Dataset COVID-19 deaths statistics around the world This is a daily updated dataset of COVID-19 deaths around the world. The dataset c

Nisa Efendioğlu 4 Jul 10, 2022
This repo contains a simple but effective tool made using python which can be used for quality control in statistical approach.

This repo contains a powerful tool made using python which is used to visualize, analyse and finally assess the quality of the product depending upon the given observations

SasiVatsal 8 Oct 18, 2022
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.

PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand

Melwin Varghese P 4 Jun 05, 2022
Python data processing, analysis, visualization, and data operations

Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio

FangWei 1 Jan 16, 2022
Data analysis and visualisation projects from a range of individual projects and applications

Python-Data-Analysis-and-Visualisation-Projects Data analysis and visualisation projects from a range of individual projects and applications. Python

Tom Ritman-Meer 1 Jan 25, 2022
Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format

Finds, downloads, parses, and standardizes public bikeshare data into a standard pandas dataframe format.

Brady Law 2 Dec 01, 2021
Weather analysis with Python, SQLite, SQLAlchemy, and Flask

Surf's Up Weather analysis with Python, SQLite, SQLAlchemy, and Flask Overview The purpose of this analysis was to examine weather trends (precipitati

Art Tucker 1 Sep 05, 2021
Data processing with Pandas.

Processing-data-with-python This is a simple example showing how to use Pandas to create a dataframe and the processing data with python. The jupyter

1 Jan 23, 2022