Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed

Overview

iterable-subprocess CircleCI Test Coverage

Python utility function to communicate with a subprocess using iterables: for when data is too big to fit in memory and has to be streamed.

Data is sent to a subprocess's standard input via an iterable, and extracted from its standard output via another iterable. This allows an external subprocess to be naturally placed in a chain of iterables for streaming processing.

Installation

pip install iterable-subprocess

Usage

A single function iterable_subprocess is exposed. The first parameter is the args argument passed to the Popen Constructor, and the second is an iterable whose items must be bytes instances and are sent to the subprocess's standard input.

Returned from the function is an iterable whose items are bytes instances of the process's standard output.

from iterable_subprocess import iterable_subprocess

def yield_input():
    # In a real case could read from the filesystem or the network
    yield b'first\n'
    yield b'second\n'
    yield b'third\n'

output = iterable_subprocess(['cat'], yield_input())

for chunk in output:
    print(chunk)

Usage: unzip the first file of a ZIP archive while downloading

While its not typically possible to completely unzip an arbitrary ZIP file on-the-fly, it is possible to unzip the first file in a ZIP archive using funzip, as in the following example.

from iterable_subprocess import iterable_subprocess
import httpx

def zipped_chunks():
    with httpx.stream('GET', 'https://www.example.com/my.zip') as r:
        yield from r.iter_bytes()

unzipped_chunks = iterable_subprocess(['funzip'], zipped_chunks())

for chunk in unzipped_chunks:
    print(chunk)

Ideally Python's zipfile module or Python's zlib module would be able to do this without calling into funzip. However, at the time of writing this does not appear easily possible.

Owner
Department for International Trade
Department for International Trade
Jenkins-AWS-CICD - Implement Jenkins CI/CD with AWS CodeBuild and AWS CodeDeploy, build a python flask web application.

Jenkins-AWS-CICD - Implement Jenkins CI/CD with AWS CodeBuild and AWS CodeDeploy, build a python flask web application.

Ning 1 Jan 01, 2022
ZeroMQ bindings for Twisted

Twisted bindings for 0MQ Introduction txZMQ allows to integrate easily ØMQ sockets into Twisted event loop (reactor). txZMQ supports both CPython and

Andrey Smirnov 149 Dec 08, 2022
Python IMDB Docker - A docker tutorial to containerize a python script.

Python_IMDB_Docker A docker tutorial to containerize a python script. Build the docker in the current directory: docker build -t python-imdb . Run the

Sarthak Babbar 1 Dec 30, 2021
Travis CI testing a Dockerfile based on Palantir's remix of Apache Cassandra, testing IaC, and testing integration health of Debian

Testing Palantir's remix of Apache Cassandra with Snyk & Travis CI This repository is to show Travis CI testing a Dockerfile based on Palantir's remix

Montana Mendy 1 Dec 20, 2021
More than 130 check plugins for Icinga and other Nagios-compatible monitoring applications. Each plugin is a standalone command line tool (written in Python) that provides a specific type of check.

Python-based Monitoring Check Plugins Collection This Enterprise Class Check Plugin Collection offers a package of more than 130 Python-based, Nagios-

Linuxfabrik 119 Dec 27, 2022
Quick & dirty controller to schedule Kubernetes Jobs later (once)

K8s Jobber Operator Quickly implemented Kubernetes controller to enable scheduling of Jobs at a later time. Usage: To schedule a Job later, Set .spec.

Jukka Väisänen 2 Feb 11, 2022
Hackergame nc 类题目的 Docker 容器资源限制、动态 flag、网页终端

Hackergame nc 类题目的 Docker 容器资源限制、动态 flag、网页终端 快速入门 配置证书 证书用于验证用户 Token。请确保这里的证书文件(cert.pem)与 Hackergame 平台 配置的证书相同,这样 Hackergame 平台为每个用户生成的 Token 才可以通

USTC Hackergame 68 Nov 09, 2022
Official Python client library for kubernetes

Kubernetes Python Client Python client for the kubernetes API. Installation From source: git clone --recursive https://github.com/kubernetes-client/py

Kubernetes Clients 5.4k Jan 02, 2023
Micro Data Lake based on Docker Compose

Micro Data Lake based on Docker Compose This is the implementation of a Minimum Data Lake

Abel Coronado 15 Jan 07, 2023
Helperpod - A CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster

Helperpod is a CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster.

Atakan Tatlı 2 Feb 05, 2022
Big data on k8s

# microsoft azure # https://docs.microsoft.com/en-us/cli/azure/install-azure-cli az account set --subscription [] az aks get-credentials --resource-g

Luan Moreno 22 Dec 24, 2022
CDK Template of Table Definition AWS Lambda for RDB

CDK Template of Table Definition AWS Lambda for RDB Overview This sample deploys Amazon Aurora of PostgreSQL or MySQL with AWS Lambda that can define

AWS Samples 5 May 16, 2022
Remote Desktop Protocol in Twisted Python

RDPY Remote Desktop Protocol in twisted python. RDPY is a pure Python implementation of the Microsoft RDP (Remote Desktop Protocol) protocol (client a

Sylvain Peyrefitte 1.6k Dec 30, 2022
DataOps framework for Machine Learning projects.

Noronha DataOps Noronha is a Python framework designed to help you orchestrate and manage ML projects life-cycle. It hosts Machine Learning models ins

52 Oct 30, 2022
Visual disk-usage analyser for docker images

whaler What? A command-line tool for visually investigating the disk usage of docker images Why? Large images are slow to move and expensive to store.

Treebeard Technologies 194 Sep 01, 2022
docker-compose工程部署时的辅助脚本

okta-cmd Introduction docker-compose 辅助脚本

完美风暴666 4 Dec 09, 2021
SSH tunnels to remote server.

Author: Pahaz Repo: https://github.com/pahaz/sshtunnel/ Inspired by https://github.com/jmagnusson/bgtunnel, which doesn't work on Windows. See also: h

Pavel White 1k Dec 28, 2022
Define and run multi-container applications with Docker

Docker Compose Docker Compose is a tool for running multi-container applications on Docker defined using the Compose file format. A Compose file is us

Docker 28.2k Jan 08, 2023
Caboto, the Kubernetes semantic analysis tool

Caboto Caboto, the Kubernetes semantic analysis toolkit. It contains a lightweight Python library for semantic analysis of plain Kubernetes manifests

Michael Schilonka 8 Nov 26, 2022