A synchronous, single-threaded interface for starting processes on Linux

Related tags

Miscellaneoussfork
Overview

Summary

sfork is a prototype for a new system call on Linux which provides a synchronous, single-threaded interface for starting processes.

sfork can be viewed as a variation on vfork which does the minimal amount of work required to make vfork actually useful and usable. In particular, sfork removes all the traditional restrictions vfork has on what you can do in the child process.

Interface

The raw interface is identical to the usual prototypes on Linux for vfork, exit, and execveat:

int sfork();
int sfork_exit(int status);
int sfork_execveat(int dirfd, const char* pathname, char *const argv[],
                   char *const envp[], int flags);

However, unlike traditional fork and vfork, sfork only ever returns once. sfork always returns 0 on success, or a negative value if forking failed for any of the usual reasons, like a cap on the number of processes.

The pid, then, is obtained from the return value of exit or execveat. Of course, those system calls don’t usually return, hence the need to wrap them with sfork-supporting equivalents.

In other words, the control flow for sfork is different from the control flow for fork and vfork.

Control flow for fork and vfork proceeds as below. Each line is numbered according to the order in which it is reached. (Error checking is omitted for simplicity)

int ret; // 1
printf("I'm in the parent"); // 2
ret = vfork();  // 3 and 7
if (ret == 0) { // 4 and 8
  printf("I'm in the child"); // 5
  exec(); // 6
} else {
  printf("I'm in the parent once again"); // 9
  printf("Pid of child is %d", ret); // 10
}

Control flow for sfork proceeds like this (again, with error checking omitted):

int ret; // 1
printf("I'm in the parent"); // 2
sfork();  // 3
printf("I'm in the child"); // 4
ret = exec(); // 5
printf("I'm in the parent once again"); // 6
printf("Pid of child is %d", ret); // 7

Control flow works like that naturally in any language that calls sfork, like any other normal function call.

For example, with the Python wrapper, exceptions thrown in the child automatically propagate up. The subprocess() contextmanager in the Python wrapper catches exceptions, automatically calls exit(1) to exit the child process context and re-enter the parent process context, and rethrows the exception. So if a user application encounters an error while setting up the child, the error is naturally and easily propagated up.

A clean way to understand sfork, is to view it as moving a single existing thread of control from an existing process context, the parent, to a new, fresh process context, the child, which starts off sharing its address space with the parent.

In this view, after a call to sfork, exec is an overloaded operation which does three things: Creates a new address space inside the current process context and loads the executable into it, creates a new thread starting at the executable entry point in the current process context and the new address space, and returns the current thread to the parent process context.

And exit, after a call to sfork, just destroys the current process context (setting the exit code), and returns the current thread to the parent process context.

In this view, sfork actually is much more like unshare than fork or vfork. Like unshare, sfork creates a new execution context and moves the current thread into that execution context. Unfortunately, sfork cannot currently be implemented with unshare; see the discussion in appropriate section below.

Userspace implementation

Recall that vfork shares the memory space between the parent process and child process, and blocks the thread in the parent process that executes vfork. The thread in the parent process is unblocked when the child process calls either exec or exit.

The kernel, when implementing vfork, saves the parent process’s registers and restores them after the parent is resumed. To achieve the behavior of sfork, we would rather the kernel just not save and restore the registers at all, but rather, just continue control flow from the point of the child process’s exec.

If you view vfork as just moving a single thread of control between processes, then the fact that the kernel saves the registers of this thread at the point of calling vfork, and then restores them when calling exec or exit, becomes obviously unnecessary: Merely not doing that save and restore gives us sfork. Without that save and restore, we get a single continuous control flow without any jumps.

So all that the sfork wrapper does is perform the exact opposite jump of the kernel: It saves the child process’s registers at the point of exec or exit, and restore those child registers immediately after the parent process is resumed with the parent’s saved registers. This register save/restore exactly counteracts the kernel’s register save/restore.

Possible implementation using unshare

Instead of calling vfork to create a new process context, sfork could call unshare(CLONE_SIGHAND|CLONE_FILES|CLONE_FS) to create a new process context and move the current thread into it.

Then, instead of calling exec, we would call clone(new_stack, CLONE_VM) while inside the new process context, with an appropriately set up new_stack to immediately call exec.

Then to return to the parent process context, we would call setns(procfd, CLONE_SIGHAND|CLONE_FILES|CLONE_FS), where procfd is a file descriptor pointing to the parent process context.

The main missing piece here is that there’s no way to get a file descriptor representing the parent process context, and setns does not support passing any of CLONE_SIGHAND|CLONE_FILES|CLONE_FS, so there’s no way for the thread to return to the parent process.

Also, unshare doesn’t allow calling CLONE_SIGHAND in multi-threaded applications, for good reason. Properly dealing with signals will be tricky.

Also, unshare doesn’t allow calling CLONE_VM in multi-threaded applications, for reasons which are unclear to me. I think that could be changed to be allowed.

Also, calling clone(new_stack, CLONE_VM) will copy the address space, negating one of the main advantages of a vfork style approach. We may need some other specialized system call that runs an executable in a new address space on a new thread, inheriting all the parts of the execution context.

Owner
Spencer Baugh
Spencer Baugh
📽 Streamlit application powered by a PyScaffold project setup

streamlit-demo Streamlit application powered by a PyScaffold project setup. Work in progress: The idea of this repo is to demonstrate how to package a

PyScaffold 2 Oct 10, 2022
Decentralized intelligent voting application.

DiVA Decentralized intelligent voting application. Hack the North 2021. Inspiration Following the previous US election, many voters were fearful that

Ali Shariatmadari 4 Jun 05, 2022
Saturne best tools pour baiser tout le système de discord

Installation | Important | Discord 🌟 Comme Saturne est gratuit, les dons sont vraiment appréciables et maintiennent le développement! Caractéristique

GalackQSM 8 Oct 02, 2022
Monitor the New World login queue and notify when it is about to finish

nwwatch - Monitor the New World queue and notify when it is about to finish Getting Started install python 3.7+ navigate to the directory where you un

14 Jan 10, 2022
A modern Python build backend

trampolim A modern Python build backend. Features Task system, allowing to run arbitrary Python code during the build process (Planned) Easy to use CL

Filipe Laíns 39 Nov 08, 2022
The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.

Feature-Engineering The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete. Business

Ayşe Nur Türkaslan 4 Oct 17, 2021
ThinkPHP全日志扫描工具,命令行版和BurpSuite插件版

ThinkPHP3和5日志扫描工具,提供命令行版和BurpSuite插件版,尽可能全的发掘网站日志信息 命令行版 安装 git clone https://github.com/r3change/TPLogScan.git cd TPLogScan/ pip install -r requireme

119 Dec 27, 2022
Object-oriented programming exercise session held in Petnica.

OOP vežba ⚠️ The code in this repo is used for a OOP practice session held in Petnica. All instructions in the README file are written in Serbian. Ops

Pavle Ćirić 1 Jan 30, 2022
SmartGrid - Een poging tot een optimale SmartGrid oplossing, door Dirk Kuiper & Lars Zwaan

SmartGrid - Een poging tot een optimale SmartGrid oplossing, door Dirk Kuiper & Lars Zwaan

1 Jan 12, 2022
This is a database of 180.000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.

Finance Database As a private investor, the sheer amount of information that can be found on the internet is rather daunting.

Jeroen Bouma 1.4k Dec 31, 2022
An universal linux port of deezer, supporting both Flatpak and AppImage

Deezer for linux This repo is an UNOFFICIAL linux port of the official windows-only Deezer app. Being based on the windows app, it allows downloading

Aurélien Hamy 154 Jan 06, 2023
A set of simple functions to upload and fetch pastes on paste.uploadgram.me

pastegram-py A set of simple functions to upload and fetch pastes on paste.uploadgram.me. API Documentation Methods upload_paste(contents: bytes, file

Uploadgram 3 Sep 13, 2022
An improved version of the common ˙pacman -S˙

BetterPacmanLook An improved version of the common pacman -S. Installation I know that this is probably one of the worst solutions and i will be worki

1 Nov 06, 2021
Install packages with pip as if you were in the past!

A PyPI time machine Do you wish you could just install packages with pip as if you were at some fixed date in the past? If so, the PyPI time machine i

Thomas Robitaille 51 Jan 09, 2023
Simple Denial of Service Program yang di bikin menggunakan bahasa pemograman Python,

Peringatan Tujuan kami share code Indo-DoS hanya untuk bertujuan edukasi / pembelajaran! Dilarang memperjual belikan source ini / memperjual-belikan s

SonLyte 8 Nov 07, 2021
scap is a tool for putting code in places and for other purposes

Scap is the deployment script used by Wikimedia Foundation to publish code and configuration on production web servers.

Wikimedia 7 Nov 02, 2022
This is the improvised version of Dobot Magician which can be implemented for Dobot M1

pydobotM1 This is the edited driver for Dobot M1 version of the original pydobot library intended for use with the Dobot Magician. Here's what you nee

Shaik Abdullah 2 Jul 11, 2022
Keep your company's passwords behind the firewall

TeamVault TeamVault is an open-source web-based shared password manager for behind-the-firewall installation. It requires Python 3.3+ and Postgres (wi

//SEIBERT/MEDIA GmbH 38 Feb 20, 2022
Beatsaber for Python

beatsaber Beatsaber for Python It was automatically generated with mkpylib. If you're reading this message, it m

Shawn Presser 3 Jul 30, 2021
E5 自动续期

请选择跳转 新版本系统 (2021-2-9采用): 以后更新都在AutoApi,采用v0.0版本号覆盖式更新 AutoApi : 最新版 保留1到2个稳定的简易版,防止萌新大范围报错 AutoApi'X' : 稳定版1 ( 即本版AutpApiP ) AutoApiP ( 即v5.0,稳定版 ) —

95 Feb 15, 2021