In Python, we can iterate over an object in a for loop like this:
a = [1, 2, 3]
for i in a:
    print(i)
But not every object can be used like this. So what kind of object can be placed in a for loop? And how can we make a custom class iterable when some of its attributes can be retrieved sequentially? In this article, I'd like to record my understanding of 3 confusing concepts in Python: iterable, iterator and generator.
Iterable vs. iterator
"Iterable" sounds more like a conceptual term: we would say an object is iterable if its elements can be accessed sequentially, i.e., one-by-one, one-after-another.
An object that defines the __iter__ dunder method returning an iterator is iterable.
An iterator is also an object, one which must contain two dunders: __iter__ (which returns the iterator itself) and __next__ (which returns the next element to be accessed).
An iterator serves as the proxy to retrieve the sequential elements of a container (e.g., a list or dict). Actually, the for loop in Python is just syntactic sugar that automatically invokes the __next__ method of the iterator on behalf of the programmer.
Note that iterators are disposable; that is, you cannot recycle or reuse them once the iteration terminates (see StopIteration). You have to obtain a new iterator from the corresponding container c by calling the c.__iter__() dunder, or equivalently, iter(c).
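As a quick illustration of this disposability, here is a minimal sketch using a plain list as the container:

```python
c = [1, 2, 3]
it = iter(c)

print(list(it))   # consumes the iterator: [1, 2, 3]
print(list(it))   # already exhausted, yields nothing: []

it2 = iter(c)     # obtain a fresh iterator from the same container
print(list(it2))  # [1, 2, 3]
```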
Here is a simple example showing how an iterator works:
class NumberIterator:
    def __init__(self, n):
        self.n = n
        self.now = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.now > 0:
            x = self.now % 10
            self.now //= 10
            return x
        else:
            raise StopIteration()

for i in NumberIterator(114514):
    print(i)
It prints the digits of the given number in reversed order. An iterator needs to raise a StopIteration exception when no more items can be iterated.
The magic behind iterators
So what happens when you try to iterate an iterator until it's exhausted?
Let's look at the bytecode. Suppose we have the following Python code to loop over an iterator (e.g., a range object):
it = iter(range(10))
try:
    while True:
        x = next(it)
        print(x)
except StopIteration:
    pass
Then the vital part of its compiled bytecode is:
5 LOAD_NAME 3 (next)
PUSH_NULL
LOAD_NAME 2 (it)
CALL 1
STORE_NAME 4 (x)
Ultimately, the interpreter will invoke the tp_iternext function defined in the PyTypeObject of a PyObject (in this case, _PyRangeIterObject).
So essentially, looping over an iterator (either through next(it) or it.__next__()) is nothing more than a normal function call.
Inside the tp_iternext function, the iterator tracks some internal state, e.g., the start, step and stop values for the range object.
When there are no further elements to iterate, the tp_iternext function raises a StopIteration exception.
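To see that next(it) and it.__next__() go through the same machinery, here is a small sketch; it also shows the optional default argument of next(), which returns a fallback instead of letting StopIteration propagate:

```python
it = iter(range(3))

print(next(it))          # 0, via the built-in
print(it.__next__())     # 1, calling the dunder directly
print(next(it))          # 2
print(next(it, "done"))  # "done": exhausted, so the default is returned
```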
Generator vs. coroutine
In Python, there are 3 kinds of callables whose execution can be interrupted and later resumed from within their bytecode sequence:
- generator
- async generator
- coroutine
In this article, we're going to dive into the CPython internals to see how these 3 concepts are implemented. Actually, the layouts of their underlying C structures are identical:
#define _PyGenObject_HEAD(prefix) \
    PyObject_HEAD \
    /* List of weak reference. */ \
    PyObject *prefix##_weakreflist; \
    /* Name of the generator. */ \
    PyObject *prefix##_name; \
    /* Qualified name of the generator. */ \
    PyObject *prefix##_qualname; \
    _PyErr_StackItem prefix##_exc_state; \
    PyObject *prefix##_origin_or_finalizer; \
    char prefix##_hooks_inited; \
    char prefix##_closed; \
    char prefix##_running_async; \
    /* The frame */ \
    int8_t prefix##_frame_state; \
    _PyInterpreterFrame prefix##_iframe;

struct _PyGenObject {
    /* The gi_ prefix is intended to remind of generator-iterator. */
    _PyGenObject_HEAD(gi)
};

struct _PyCoroObject {
    _PyGenObject_HEAD(cr)
};

struct _PyAsyncGenObject {
    _PyGenObject_HEAD(ag)
};
So basically, they follow a similar processing procedure in the interpreter.
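The shared layout is visible from Python as well: the three kinds of objects expose the same attributes under the gi_/cr_/ag_ prefixes. A minimal sketch:

```python
def g():
    yield

async def c():
    pass

async def ag():
    yield

gen, coro, agen = g(), c(), ag()

# the same slots, just with different prefixes
print(gen.gi_code.co_name)    # g
print(coro.cr_code.co_name)   # c
print(agen.ag_code.co_name)   # ag

coro.close()  # avoid the "coroutine was never awaited" warning
```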
Generator
In Python, we say a callable is a generator1 if it contains a yield statement inside its body:
def gen(init):
    assert init > 0
    while init > 0:
        jump = yield init
        init -= jump
When the generator frame is evaluated, the yield statement forces the generator to "pause", handing the control flow back to its parent frame (the caller that invokes the generator).
g = gen(100)
print(next(g))
jump = 1
try:
    while True:
        print(g.send(jump))
        jump *= 2
except StopIteration:
    pass
You can call next() on a generator to retrieve the yielded value. Additionally, you can call the g.send() method to send a value from the caller to the callee at the same time.
There is another way to declare a generator2:
sum(x*x for x in range(10))
Compared to a list comprehension, a generator expression does not create the materialized object (i.e., the list in this example); instead, it just evaluates the necessary values one at a time. This saves the memory required to evaluate the statement. It actually creates an anonymous generator function in the current scope.
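A quick sketch of both points: the expression produces an ordinary generator object, and like any iterator it can only be consumed once:

```python
gen_exp = (x * x for x in range(10))

print(type(gen_exp).__name__)  # 'generator': an anonymous generator, not a list
print(sum(gen_exp))            # 285, computed one value at a time
print(sum(gen_exp))            # 0: the generator is already exhausted
```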
Return at the beginning: RETURN_GENERATOR
If you inspect the bytecode of a generator function, you can find the first instruction will always be RETURN_GENERATOR
:
1 RETURN_GENERATOR
POP_TOP
L1: RESUME 0
2 LOAD_FAST_BORROW 0 (init)
LOAD_SMALL_INT 0
COMPARE_OP 148 (bool(>))
POP_JUMP_IF_TRUE 3 (to L2)
NOT_TAKEN
LOAD_COMMON_CONSTANT 0 (AssertionError)
RAISE_VARARGS 1
3 L2: LOAD_FAST_BORROW 0 (init)
LOAD_SMALL_INT 0
COMPARE_OP 148 (bool(>))
POP_JUMP_IF_FALSE 15 (to L3)
NOT_TAKEN
4 LOAD_FAST_BORROW 0 (init)
YIELD_VALUE 0
RESUME 5
STORE_FAST 1 (jump)
5 LOAD_FAST_BORROW_LOAD_FAST_BORROW 1 (init, jump)
BINARY_OP 23 (-=)
STORE_FAST 0 (init)
JUMP_BACKWARD 21 (to L2)
3 L3: LOAD_CONST 1 (None)
RETURN_VALUE
-- L4: CALL_INTRINSIC_1 3 (INTRINSIC_STOPITERATION_ERROR)
RERAISE 1
In the bytecode DSL, RETURN_GENERATOR creates a PyGenObject via _Py_MakeCoro from the PyFunctionObject. The generator object contains the original function frame, together with additional frame-execution status (note that a generator may be paused or resumed at the positions of yield).
The frame status could be one of:
typedef enum _framestate {
    FRAME_CREATED = -3,
    FRAME_SUSPENDED = -2,
    FRAME_SUSPENDED_YIELD_FROM = -1,
    FRAME_EXECUTING = 0,
    FRAME_COMPLETED = 1,
    FRAME_CLEARED = 4
} PyFrameState;
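From Python code, a coarse-grained view of this state machine can be observed with inspect.getgeneratorstate(), which maps the internal frame state onto four labels (the rough correspondence in the comments is my reading of the enum above):

```python
import inspect

def gen():
    yield 1

g = gen()
print(inspect.getgeneratorstate(g))  # GEN_CREATED   (~ FRAME_CREATED)

next(g)
print(inspect.getgeneratorstate(g))  # GEN_SUSPENDED (~ FRAME_SUSPENDED)

try:
    next(g)
except StopIteration:
    pass
print(inspect.getgeneratorstate(g))  # GEN_CLOSED    (~ FRAME_COMPLETED/CLEARED)
```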
We say the frame containing the yields is the generator frame, and the frame that invokes the generator is the caller frame. When you call a generator function, e.g., g = gen(100), the interpreter only runs this bytecode and then immediately returns the created generator object to the caller frame.
Inside the RETURN_GENERATOR bytecode, the interpreter indeed copies the generator frame into the generator object (gi_iframe), and sets the PC to the bytecode right after RETURN_GENERATOR. After that, the interpreter puts the produced generator object at the TOS, then lets the caller frame continue to execute. So when the caller frame calls next() on the returned generator for the first time, the interpreter starts executing the generator frame at the second bytecode.
Pause in the middle: YIELD_VALUE
A generator function contains the yield statement in its syntax: jump = yield init, which corresponds to the YIELD_VALUE bytecode.
At a glance, the stack effect of YIELD_VALUE is:
inst(YIELD_VALUE, (retval -- value))
which converts retval at the TOS (init in the example code) into a heap-safe value (still at the TOS) that can be consumed by the caller frame (returned by send).
In detail, YIELD_VALUE detaches the generator frame by unlinking it from the interpreter frame list, and sets the "current frame" to the caller frame.
It sets the caller frame's PC to the bytecode next to SEND, so the caller frame continues to execute the following bytecodes.
Coroutines
At this stage, a new concept comes into our journey: async def, which defines a function with CO_COROUTINE set in its code object flags. In Python, functions defined with async are called coroutine functions, whose control flow can be paused/resumed during execution. Inside their bodies, await expressions are allowed to suspend the current function and hand over the scheduling to the event loop.
For example, suppose we have a function agen like this:
async def agen(x):
    await asyncio.sleep(1)
The eye-catching statement await asyncio.sleep(1) is compiled into the following bytecode:
6 LOAD_GLOBAL 0 (asyncio)
LOAD_ATTR 2 (sleep)
PUSH_NULL
LOAD_SMALL_INT 1
CALL 1
GET_AWAITABLE 0
LOAD_CONST 1 (None)
L2: SEND 3 (to L5)
L3: YIELD_VALUE 1
L4: RESUME 3
JUMP_BACKWARD_NO_INTERRUPT 5 (to L2)
L5: END_SEND
POP_TOP
For coroutines, GET_AWAITABLE simply returns the coroutine itself and puts it on the TOS (for other, more general cases, it returns the iterator extracted from the awaitable object).
So the magic lies in the SEND bytecode, which is defined as:
op(_SEND, (receiver, v -- receiver, retval))
In SEND, the interpreter attempts to convert the receiver on the stack into a generator, and pushes the value to be sent onto the generator's evaluation stack (note that this frame is owned by the generator).
After that, it attaches the generator frame to the interpreter and sets up the corresponding execution status.
When the generator pauses again, the interpreter retrieves the value yielded from the generator.
It either calls tp_iternext (equivalent to __next__ for an iterator), or directly calls the send() method.
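Outside an event loop, you can drive this machinery by hand: a coroutine, like a generator, accepts send(), and its return value travels inside the StopIteration exception. A minimal sketch with no real event loop involved:

```python
async def answer():
    return 42

coro = answer()
try:
    coro.send(None)  # kick off the coroutine, as the event loop would
except StopIteration as e:
    print(e.value)   # the return value rides on the StopIteration
```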
The control flow of the above await
statement looks like this:
┌──────────────────────┐
│ │
│ NOT READY┌────▼───┐ READY
│ ┌────────┼ SEND ┼───────┐
│ │ └────────┘ │
│ │ │
│ ┌──────▼──────┐ ┌─────▼────┐
│ │ YIELD_VALUE │ │ END_SEND │
│ │ │ └──────────┘
│ │ RESUME │
│ │ │
└─┼JUMP_BACKWARD│
└─────────────┘
If the awaitable object (generator) has finished, the caller frame is ready to continue and jumps to END_SEND (indicated by the offset oparg of SEND, which is 3 in the above example).
Otherwise, the callee frame just yields a value from the generator (or event loop) and jumps back to SEND for the next resume.
Async generator: supporting asyncio within a generator
Similar to coroutines, async generators3 are callables that can yield in the middle of execution.
In addition to the yield statements inside the function body, an async generator may also contain await expressions, which allow it to pause for asyncio operations without yielding values to the caller.
The essential factor that distinguishes await from yield is that the pause and resume of coroutines are managed by the event loop, while yield obeys the calling chain.
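A minimal sketch of an async generator combining both forms of suspension (the asyncio.sleep(0) is just an assumption chosen to force a trip through the event loop):

```python
import asyncio

async def agen():
    for i in range(3):
        await asyncio.sleep(0)  # pause via the event loop, no value to the caller
        yield i                 # pause via the calling chain, value to the caller

async def main():
    return [x async for x in agen()]

print(asyncio.run(main()))  # [0, 1, 2]
```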
select
coroutines ┌───┐
┌──────────────┬──────►co0│
│ Thread State │ └───┘
│ ┌──────────┐ │ ┌───┐
│ │event loop│ ┼──────►co1│
│ └──────────┘ │ └───┘
│ current_task │ ┌───┐
└──────────────┴──────►co2│
└───┘
Inside the event loop, a heap is maintained to track the waiting time of each coroutine. For ready tasks (those that can resume execution without blocking), the event loop schedules them in FIFO order. That is, unlike generators, the suspend and resume order is not controlled by the task itself, but is also determined by the other concurrent coroutines.
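The FIFO hand-off can be observed directly. In this sketch, two tasks record their progress; each await asyncio.sleep(0) sends the task to the back of the ready queue, so the starts and ends interleave:

```python
import asyncio

order = []

async def task(name):
    order.append(name + " start")
    await asyncio.sleep(0)       # re-queue ourselves at the back of the ready FIFO
    order.append(name + " end")

async def main():
    await asyncio.gather(task("a"), task("b"))

asyncio.run(main())
print(order)  # ['a start', 'b start', 'a end', 'b end']
```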
Outro
So basically, the underlying working machinery beneath
- generators
- coroutines (or more generally, asyncio.Future)
- async generators
shares the same C struct definition, and depends on SEND, GET_AWAITABLE, YIELD_VALUE and some other bytecode instructions to enable asynchronous program execution.
An async generator is the combination of generator and asyncio syntax: it follows the caller/callee suspend-resume order of the calling chain at yield statements, while handing the scheduling over to the centralized event loop where await occurs.