A synchronous, single-threaded interface for starting processes on Linux

Related tags

Miscellaneoussfork
Overview

Summary

sfork is a prototype for a new system call on Linux which provides a synchronous, single-threaded interface for starting processes.

sfork can be viewed as a variation on vfork which does the minimal amount of work required to make vfork actually useful and usable. In particular, sfork removes all the traditional restrictions vfork has on what you can do in the child process.

Interface

The raw interface is identical to the usual prototypes on Linux for vfork, exit, and execveat:

int sfork();
int sfork_exit(int status);
int sfork_execveat(int dirfd, const char* pathname, char *const argv[],
                   char *const envp[], int flags);

However, unlike traditional fork and vfork, sfork only ever returns once. sfork always returns 0 on success, or a negative value if forking failed for any of the usual reasons, like a cap on the number of processes.

The pid, then, is obtained from the return value of exit or execveat. Of course, those system calls don’t usually return, hence the need to wrap them with sfork-supporting equivalents.

In other words, the control flow for sfork is different from the control flow for fork and vfork.

Control flow for fork and vfork proceeds as below. Each line is numbered according to the order in which it is reached. (Error checking is omitted for simplicity)

int ret; // 1
printf("I'm in the parent"); // 2
ret = vfork();  // 3 and 7
if (ret == 0) { // 4 and 8
  printf("I'm in the child"); // 5
  exec(); // 6
} else {
  printf("I'm in the parent once again"); // 9
  printf("Pid of child is %d", ret); // 10
}

Control flow for sfork proceeds like this (again, with error checking omitted):

int ret; // 1
printf("I'm in the parent"); // 2
sfork();  // 3
printf("I'm in the child"); // 4
ret = exec(); // 5
printf("I'm in the parent once again"); // 6
printf("Pid of child is %d", ret); // 7

Control flow works like that naturally in any language that calls sfork, like any other normal function call.

For example, with the Python wrapper, exceptions thrown in the child automatically propagate up. The subprocess() contextmanager in the Python wrapper catches exceptions, automatically calls exit(1) to exit the child process context and re-enter the parent process context, and rethrows the exception. So if a user application encounters an error while setting up the child, the error is naturally and easily propagated up.

A clean way to understand sfork, is to view it as moving a single existing thread of control from an existing process context, the parent, to a new, fresh process context, the child, which starts off sharing its address space with the parent.

In this view, after a call to sfork, exec is an overloaded operation which does three things: Creates a new address space inside the current process context and loads the executable into it, creates a new thread starting at the executable entry point in the current process context and the new address space, and returns the current thread to the parent process context.

And exit, after a call to sfork, just destroys the current process context (setting the exit code), and returns the current thread to the parent process context.

In this view, sfork actually is much more like unshare than fork or vfork. Like unshare, sfork creates a new execution context and moves the current thread into that execution context. Unfortunately, sfork cannot currently be implemented with unshare; see the discussion in appropriate section below.

Userspace implementation

Recall that vfork shares the memory space between the parent process and child process, and blocks the thread in the parent process that executes vfork. The thread in the parent process is unblocked when the child process calls either exec or exit.

The kernel, when implementing vfork, saves the parent process’s registers and restores them after the parent is resumed. To achieve the behavior of sfork, we would rather the kernel just not save and restore the registers at all, but rather, just continue control flow from the point of the child process’s exec.

If you view vfork as just moving a single thread of control between processes, then the fact that the kernel saves the registers of this thread at the point of calling vfork, and then restores them when calling exec or exit, becomes obviously unnecessary: Merely not doing that save and restore gives us sfork. Without that save and restore, we get a single continuous control flow without any jumps.

So all that the sfork wrapper does is perform the exact opposite jump of the kernel: It saves the child process’s registers at the point of exec or exit, and restore those child registers immediately after the parent process is resumed with the parent’s saved registers. This register save/restore exactly counteracts the kernel’s register save/restore.

Possible implementation using unshare

Instead of calling vfork to create a new process context, sfork could call unshare(CLONE_SIGHAND|CLONE_FILES|CLONE_FS) to create a new process context and move the current thread into it.

Then, instead of calling exec, we would call clone(new_stack, CLONE_VM) while inside the new process context, with an appropriately set up new_stack to immediately call exec.

Then to return to the parent process context, we would call setns(procfd, CLONE_SIGHAND|CLONE_FILES|CLONE_FS), where procfd is a file descriptor pointing to the parent process context.

The main missing piece here is that there’s no way to get a file descriptor representing the parent process context, and setns does not support passing any of CLONE_SIGHAND|CLONE_FILES|CLONE_FS, so there’s no way for the thread to return to the parent process.

Also, unshare doesn’t allow calling CLONE_SIGHAND in multi-threaded applications, for good reason. Properly dealing with signals will be tricky.

Also, unshare doesn’t allow calling CLONE_VM in multi-threaded applications, for reasons which are unclear to me. I think that could be changed to be allowed.

Also, calling clone(new_stack, CLONE_VM) will copy the address space, negating one of the main advantages of a vfork style approach. We may need some other specialized system call that runs an executable in a new address space on a new thread, inheriting all the parts of the execution context.

Owner
Spencer Baugh
Spencer Baugh
Tools I'm building in order to help my investments decisions

b3-tools Tools I'm building in order to help my investments decisions. Based in the REITs I've in my personal portifolio I ran a script that scrapy th

Rafael Cassau 2 Jan 21, 2022
My Solutions to 120 commonly asked data science interview questions.

Data_Science_Interview_Questions Introduction 👋 Here are the answers to 120 Data Science Interview Questions The above answer some is modified based

Milaan Parmar / Милан пармар / _米兰 帕尔马 181 Dec 31, 2022
A promo calculator for sports betting odds.

Sportbetter Calculation Toolkit Parlay Calculator This is a quick parlay calculator that considers some of the common promos offered. It is used to id

Luke Bhan 1 Sep 08, 2022
Alternative StdLib for Nim for Python targets

Alternative StdLib for Nim for Python targets, hijacks Python StdLib for Nim

Juan Carlos 100 Jan 01, 2023
Python module used to generate random facts

Randfacts is a python library that generates random facts. You can use randfacts.get_fact() to return a random fun fact. Disclaimer: Facts are not gua

Tabulate 14 Dec 14, 2022
Traffic flow test platform, especially for reinforcement learning

Traffic Flow Test Platform Traffic flow test platform, especially for reinforcement learning, named TFTP. A traffic signal control framework that can

4 Nov 07, 2022
To check my COVID-19 vaccine appointment, I wrote an infinite loop that sends me a Whatsapp message hourly using Twilio and Selenium. It works on my Raspberry Pi computer.

COVID-19_vaccine_appointment To check my COVID-19 vaccine appointment, I wrote an infinite loop that sends me a Whatsapp message hourly using Twilio a

Ayyuce Demirbas 24 Dec 17, 2022
For my Philips Airpurifier AC3259/10

Philips-Airpurifier For my Philips Airpurifier AC3259/10 I will try to keep this code

AcidSleeper 7 Feb 26, 2022
Modify the value and status of the records KoboToolbox

Modify the value and status of the records KoboToolbox (Modica el valor y status de los registros de KoboToolbox)

1 Oct 30, 2021
KUIZ is a web application quiz where you can create/take a quiz for learning and sharing knowledge from various subjects, questions and answers.

KUIZ KUIZ is a web application quiz where you can create/take a quiz for learning and sharing knowledge from various subjects, questions and answers.

Thanatibordee Sihaboonthong 3 Sep 12, 2022
3x+1 recreated in Python

3x-1 3x+1 recreated in Python If a number is odd it is multiplied by 3 and 1 is added to the product. If a number is even it is divided by 2. These ru

4 Aug 19, 2022
Plock : A stack based programming language

Plock : A stack based programming language

1 Oct 25, 2021
Linux GUI app to codon optimize many single-fasta files with coding sequences , using many taxonomy ids

codon_optimize_cds_with_many_taxids_singlefasta Linux GUI app to codon optimize many single-fasta files with coding sequences, using many taxonomy ids

Olga Tsiouri 1 Jan 23, 2022
Solutions to the language assignment for Internship in JALA Technologies.

Python Assignment Solutions (JALA Technologies) Solutions to the language assignment for Internship in JALA Technologies. Features Properly formatted

Samyak Jain 2 Jan 17, 2022
Monitor the New World login queue and notify when it is about to finish

nwwatch - Monitor the New World queue and notify when it is about to finish Getting Started install python 3.7+ navigate to the directory where you un

14 Jan 10, 2022
Mengzhan (John) code for Closed Loop Control system of Sharp Wave Ripples in Hippocampus CA3 region

ClosedLoopControl_Yu Mengzhan (John) code for Closed Loop Control system of Sharp Wave Ripples in Hippocampus CA3 region Creating Python Virtual Envir

Mengzhan (John) Liufu 1 Jan 22, 2022
Wrappers around the most common maya.cmds and maya.api use cases

Maya FunctionSet (maya_fn) A package that decompose core maya.cmds and maya.api features to a set of simple functions. Tests The recommended approach

Ryan Porter 9 Mar 12, 2022
Python tools for working with Orbit Ephemeris Messages (OEMs).

Python Orbit Ephemeris Message tools Python tools for working with Orbit Ephemeris Messages (OEMs). Development Status Installation The oem package is

Brad Sease 4 Apr 06, 2022
A simple python script that print the Mandelbrot set for every power of the formal formula.

Python Mandelbrot A simple python script that print the Mandelbrot set for every power of the formal formula.

Paride Giunta 2 Apr 15, 2022
A free and powerful system for awareness and research of the American judicial system.

CourtListener Started in 2009, CourtListener.com is the main initiative of Free Law Project. The goal of CourtListener.com is to provide high quality

Free Law Project 332 Dec 25, 2022