CPython calling convention

1. What can be called: `Callable`

We know Python is a strongly typed programming language: every object has a type (stored in the PyTypeObject *ob_type field of the PyObject struct). Technically, we say an object (or, more precisely, its type) is callable if the type's tp_call field is not NULL, i.e., it points to the C routine that is invoked when the Python object is called. The signature of tp_call is:

static PyObject *
type_call(PyObject *self, PyObject *args, PyObject *kwds)

where self is the callee object (e.g., a Python function), and args and kwds hold the positional and keyword arguments as a tuple and a dict, respectively.

The most common callable objects in Python include functions, lambda functions, classes themselves, and instances of classes that define the __call__ dunder method. In this post, I'm going to briefly recap the motivation for introducing a new calling convention into CPython: the vector call.
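From the Python side, the built-in callable() is a convenient way to observe this rule: in CPython it reports whether the object's type has a tp_call slot filled in. The sketch below is only an illustration; the names plain_function, Adder and Point are made up for this example.

```python
def plain_function(x):
    return x + 1


class Adder:
    """Instances are callable because the class defines __call__."""

    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


class Point:
    """No __call__ here, so instances are not callable."""


add_two = Adder(2)

# Each entry records whether callable() accepts the object.
checks = {
    "function": callable(plain_function),
    "lambda": callable(lambda x: x),
    "class object": callable(Adder),          # calling a class creates an instance
    "instance with __call__": callable(add_two),
    "instance without __call__": callable(Point()),
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

Only the last entry is reported as not callable; trying to call such an object raises TypeError.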

2. The conventional calling protocol

Before PEP 590 (provisional in Python 3.8, finalized in Python 3.9), making a call on a callable through its tp_call required several steps (see _PyObject_MakeTpCall):

Access tp_call in the type of the callable object self, and check that it is not NULL
Pack the nargs objects on the evaluation stack into the tuple argstuple with _PyTuple_FromArray
Build a dict instance kwdict from the keyword arguments with _PyStack_AsDict
Call the tp_call C function and retrieve the result
Decref the temporary argstuple and kwdict objects

Intuitively, this is costly for every single call, since the interpreter has to create and destroy a tuple and a dict on the fly to emulate the Python function signature self(*args, **kwargs). It would be better to have a protocol in which the interpreter can pass the positional and keyword arguments directly to the called routine.
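The five steps above can be modeled in pure Python to make the temporary objects visible. This is a toy model, not the C implementation: make_tp_call, stack and greet are hypothetical names, and a Python list stands in for the evaluation stack.

```python
def make_tp_call(func, stack, nargs, kwnames=()):
    """Toy model of the tp_call path: pack the arguments, call, drop temporaries."""
    # Step 1: the callee's type must provide tp_call.
    if not callable(func):
        raise TypeError(f"{type(func).__name__!r} object is not callable")
    # Step 2: a temporary tuple for the positional arguments.
    argstuple = tuple(stack[:nargs])
    # Step 3: a temporary dict for the keyword arguments.
    kwdict = dict(zip(kwnames, stack[nargs:]))
    try:
        # Step 4: the actual call, in the shape self(*args, **kwargs).
        return func(*argstuple, **kwdict)
    finally:
        # Step 5: release the temporaries (Py_DECREF in the C code).
        del argstuple, kwdict


def greet(greeting, name, punctuation="!"):
    return f"{greeting}, {name}{punctuation}"


# Two positional arguments followed by one keyword value.
stack = ["Hello", "world", "?"]
result = make_tp_call(greet, stack, 2, ("punctuation",))
print(result)
```

Every call through this path allocates one tuple and one dict that live only for the duration of the call, which is exactly the overhead the text describes.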

3. Why vector call: the demand for efficient internal calls

As Python has evolved, more and more internal calls happen during interpreter execution in the form of callbacks, such as those registered with sys.monitoring.register_callback and asyncio.Task.add_done_callback. In these scenarios, a callable is registered and is expected to be called once an event happens inside the interpreter. The overhead is therefore non-negligible, as there will be numerous internal calls to Python functions. This creates demand for a calling protocol that speeds up invocations of Python callable objects inside the interpreter runtime.

4. Vector calling convention

PEP 590 introduces the vector calling convention, which enables the CPython interpreter and third-party extension modules to invoke a Python callable object internally in an efficient manner. Compared to the traditional calling protocol, the new argument layout is:

The arguments are laid out in a flat "vector" in memory.

From this layout, we can see why the developers named the protocol "vector". Both the positional arguments and the values of the keyword arguments are stored in a single C array, args, and the number of positional arguments is passed in nargsf (the number of arguments plus a special flag). In addition, the names of the keyword arguments are stored in the tuple kwnames, in the same order as their values in args. Under this convention, there is no need to create a tuple and a dict object every time a callable is called; the interpreter only needs to arrange the argument objects in a C array (i.e., a raw memory region). Callable invocation can therefore be more efficient with such a flat argument layout.
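To make the layout concrete, here is a small Python model of how a callee could read a vector call. It is a sketch under assumptions: a Python list stands in for the C array, split_vector and positional_count are hypothetical helpers, and the flag is modeled after PY_VECTORCALL_ARGUMENTS_OFFSET, which occupies the top bit of a size_t and is stripped by the PyVectorcall_NARGS macro.

```python
import ctypes

# Top bit of a size_t, mirroring PY_VECTORCALL_ARGUMENTS_OFFSET.
SIZE_T_BITS = 8 * ctypes.sizeof(ctypes.c_size_t)
ARGUMENTS_OFFSET_FLAG = 1 << (SIZE_T_BITS - 1)


def positional_count(nargsf):
    """Strip the flag bit, as PyVectorcall_NARGS does in C."""
    return nargsf & ~ARGUMENTS_OFFSET_FLAG


def split_vector(args, nargsf, kwnames=()):
    """Recover positional and keyword arguments from the flat layout."""
    nargs = positional_count(nargsf)
    positional = args[:nargs]
    # Keyword values follow the positional ones, in the order given by kwnames.
    keywords = list(zip(kwnames, args[nargs:nargs + len(kwnames)]))
    return positional, keywords


# Model of the call f(1, 2, sep="-", end=""):
args = [1, 2, "-", ""]             # one flat vector, no tuple or dict per call
kwnames = ("sep", "end")           # names only; the values live in args
nargsf = 2 | ARGUMENTS_OFFSET_FLAG # two positional arguments, flag bit set

positional, keywords = split_vector(args, nargsf, kwnames)
print(positional, keywords)
```

Note that nothing is copied into a new container before the call; the callee simply indexes into the vector it was handed.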
