Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Allison Kaptur speaking at NYC Python in July 2014.

Transcript of Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Page 1: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Byterun: A (C)Python interpreter in Python

Allison Kaptur

github.com/akaptur akaptur.github.io

@akaptur

Page 2: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Byterun Ned Batchelder

Based on

# pyvm2 by Paul Swartz (z3p) from http://www.twistedmatrix.com/users/z3p/

Page 3: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Why would you do such a thing

>>> if a or b:
...     do_stuff()

Page 4: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

out = ""
for i in range(5):
    out = out + str(i)
print(out)

Page 5: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

def fn(a, b=17, c="Hello", d=[]):
    d.append(99)
    print(a, b, c, d)

fn(1)
fn(2, 3)
fn(3, c="Bye")
fn(4, d=["What?"])
fn(5, "b", "c")

Page 6: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

def verbose(func):
    def _wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return _wrapper

@verbose
def add(x, y):
    return x+y

add(7, 3)

Page 7: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

try:
    raise ValueError("oops")
except ValueError as e:
    print("Caught: %s" % e)
print("All done")

Page 8: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

class NullContext(object):
    def __enter__(self):
        l.append('i')
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        l.append('o')
        return False

l = []
for i in range(3):
    with NullContext():
        l.append('w')
        if i % 2:
            break
        l.append('z')
    l.append('e')

l.append('r')
s = ''.join(l)
print("Look: %r" % s)
assert s == "iwzoeiwor"

Page 9: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Some things we can do

g = (x*x for x in range(3))
print(list(g))

Page 10: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

A problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))

Page 11: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

The Python virtual machine:

A bytecode interpreter

Page 12: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: the internal representation of a Python program in the interpreter

Page 13: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans

Page 14: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code

mod: Function
func_code: Code object
co_code: Bytecode

Page 15: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code
'|\x00\x00|\x01\x00\x16}\x02\x00|\x02\x00S'

Page 16: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Bytecode: it’s bytes!

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod.func_code.co_code
'|\x00\x00|\x01\x00\x16}\x02\x00|\x02\x00S'
>>> [ord(b) for b in mod.func_code.co_code]
[124, 0, 0, 124, 1, 0, 22, 125, 2, 0, 124, 2, 0, 83]
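The slides use Python 2, where the code object lives at `func_code` and `co_code` is a `str`. As a rough Python 3 counterpart (an assumption about how a reader might reproduce this today, not part of the talk), the same inspection looks like the sketch below. The exact byte values depend on the interpreter version, so no output is shown.

```python
def mod(a, b):
    ans = a % b
    return ans

# Python 3 spells func_code as __code__; co_code is a bytes object.
raw = mod.__code__.co_code

# Iterating over bytes already yields integers, so ord() is not needed.
as_ints = list(raw)

print(type(raw).__name__, len(as_ints))
```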

Page 17: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 18: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Columns, left to right:

Line number
Index in bytecode
Instruction name, for humans
More bytes, the argument to each instruction
Hint about arguments
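The same columns can be read programmatically. The sketch below uses `dis.get_instructions` from the Python 3 standard library (the slides use Python 2, so this is an assumed modern equivalent); instruction names and offsets vary between interpreter versions, so the output is not reproduced here.

```python
import dis

def mod(a, b):
    ans = a % b
    return ans

rows = []
for ins in dis.get_instructions(mod):
    # offset:  index in bytecode
    # opname:  instruction name, for humans
    # arg:     raw argument to the instruction (None if it takes none)
    # argrepr: hint about the argument, e.g. a variable name
    rows.append((ins.offset, ins.opname, ins.arg, ins.argrepr))

for row in rows:
    print(row)
```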

Page 19: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Data stack on a frame

Before: [whatever, some other thing, something]
After LOAD_FAST: [whatever, some other thing, something, a, b]
After BINARY_MODULO: [whatever, some other thing, something, ans]
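The stack pictures can be turned into a very small interpreter. This is a minimal sketch, not Byterun's code: `run` and `MOD_PROGRAM` are hypothetical names, the instruction list is written by hand to mirror the `dis` output for `mod`, and arguments are variable names rather than the numeric indexes real bytecode uses.

```python
def run(instructions, local_vars):
    """Execute a list of (name, argument) pairs on a single data stack."""
    stack = []
    for name, arg in instructions:
        if name == "LOAD_FAST":
            # Push the value of a local variable.
            stack.append(local_vars[arg])
        elif name == "BINARY_MODULO":
            # Pop two operands and push the result.
            right = stack.pop()
            left = stack.pop()
            stack.append(left % right)
        elif name == "STORE_FAST":
            # Pop the top of the stack into a local variable.
            local_vars[arg] = stack.pop()
        elif name == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown instruction: %s" % name)

# Hand-written stand-in for the disassembly of mod(a, b).
MOD_PROGRAM = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_MODULO", None),
    ("STORE_FAST", "ans"),
    ("LOAD_FAST", "ans"),
    ("RETURN_VALUE", None),
]

result = run(MOD_PROGRAM, {"a": 15, "b": 4})
print(result)  # 3
```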

Page 20: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

c   ---------------------------
a   | bar Frame               | -> blocks: []
l   |  (newest)               | -> data: [1, 2]
l   ---------------------------
    | foo Frame               | -> blocks: []
s   |                         | -> data: [<foo.<lcl>.bar, 1]
t   ---------------------------
a   | main (module) Frame     | -> blocks: []
c   |  (oldest)               | -> data: [<foo>]
k   ---------------------------

Page 21: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

dis, a bytecode disassembler

>>> import dis
>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 22: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython
Page 23: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

/* Main switch on opcode */
READ_TIMESTAMP(inst0);

switch (opcode) {

} /*switch*/

Page 24: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

#ifdef CASE_TOO_BIG
default: switch (opcode) {
#endif

/* Turn this on if your compiler chokes on the big switch: */
/* #define CASE_TOO_BIG 1 */

Page 25: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to that bytecode

>>> dis.dis(mod)
  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_MODULO
              7 STORE_FAST               2 (ans)

  3          10 LOAD_FAST                2 (ans)
             13 RETURN_VALUE

Page 26: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case LOAD_FAST:
    x = GETLOCAL(oparg);
    if (x != NULL) {
        Py_INCREF(x);
        PUSH(x);
        goto fast_next_opcode;
    }
    format_exc_check_arg(PyExc_UnboundLocalError,
        UNBOUNDLOCAL_ERROR_MSG,
        PyTuple_GetItem(co->co_varnames, oparg));
    break;

Page 27: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case BINARY_MODULO:
    w = POP();
    v = TOP();
    if (PyString_CheckExact(v))
        x = PyString_Format(v, w);
    else
        x = PyNumber_Remainder(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;
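In C the dispatch is one large `switch`. In Python, one common way to get the same effect is to look up a handler method by instruction name. The sketch below assumes that approach; `TinyVM`, the `op_` prefix, and `dispatch` are hypothetical names chosen for illustration, and only two instructions are handled.

```python
class TinyVM:
    """Hypothetical VM that looks up one handler method per instruction name."""

    def __init__(self, local_vars):
        self.local_vars = local_vars
        self.stack = []

    def op_LOAD_FAST(self, name):
        # Counterpart of `case LOAD_FAST`: push a local onto the stack.
        self.stack.append(self.local_vars[name])

    def op_BINARY_MODULO(self, _arg):
        # Counterpart of `case BINARY_MODULO`: pop two values, push the result.
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left % right)

    def dispatch(self, name, arg=None):
        # getattr plays the role of the C switch statement.
        handler = getattr(self, "op_" + name, None)
        if handler is None:
            raise NotImplementedError("unsupported instruction: " + name)
        handler(arg)

vm = TinyVM({"a": 15, "b": 4})
for name, arg in [("LOAD_FAST", "a"), ("LOAD_FAST", "b"), ("BINARY_MODULO", None)]:
    vm.dispatch(name, arg)
print(vm.stack)  # [3]
```

Adding an instruction then means adding one method, at the cost of a string lookup per instruction.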

Page 28: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

It’s “dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3

Page 29: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))

Page 30: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))
NYC Python

Page 31: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“Dynamic”

>>> def mod(a, b):
...     ans = a % b
...     return ans
>>> mod(15, 4)
3
>>> mod("%s %s", ("NYC", "Python"))
NYC Python
>>> print "%s %s" % ("NYC", "Python")
NYC Python

Page 32: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

case BINARY_MODULO:
    w = POP();
    v = TOP();
    if (PyString_CheckExact(v))
        x = PyString_Format(v, w);
    else
        x = PyNumber_Remainder(v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x != NULL) continue;
    break;

Page 33: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

>>> class Surprising(object):
...     def __mod__(self, other):
...         print "Surprise!"

>>> s = Surprising()
>>> t = Surprising()
>>> s % t
Surprise!
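The three cases above (integers, string formatting, and a user-defined `__mod__`) all go through the same operation. A short sketch using the standard library's `operator.mod`, which performs `a % b`, shows one call site producing three unrelated behaviours. For the sake of a checkable result, this version of `Surprising` returns the string instead of printing it, which is a change from the slide.

```python
import operator

class Surprising(object):
    def __mod__(self, other):
        # Returns rather than prints, unlike the slide, so the result can be inspected.
        return "Surprise!"

results = [
    operator.mod(15, 4),                       # integer remainder
    operator.mod("%s %s", ("NYC", "Python")),  # string formatting
    operator.mod(Surprising(), Surprising()),  # arbitrary user code
]
print(results)  # [3, 'NYC Python', 'Surprise!']
```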

Page 34: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

“In the general absence of type information, almost every instruction must be treated as INVOKE_ARBITRARY_METHOD.”

- Russell Power and Alex Rubinsteyn, “How Fast Can We Make Interpreted Python?”

Page 35: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to our problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))

Page 36: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

c   ---------------------------
a   | bar Frame               | -> blocks: []
l   |  (newest)               | -> data: [1, 2]
l   ---------------------------
    | foo Frame               | -> blocks: []
s   |                         | -> data: [<foo.<lcl>.bar, 1]
t   ---------------------------
a   | main (module) Frame     | -> blocks: []
c   |  (oldest)               | -> data: [<foo>]
k   ---------------------------

Page 37: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

l   ---------------------------
    | foo Frame               | -> blocks: []
s   |                         | -> data: [3]
t   ---------------------------
a   | main (module) Frame     | -> blocks: []
c   |  (oldest)               | -> data: [<foo>]
k   ---------------------------

Page 38: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

def foo():
    x = 1
    def bar(y):
        z = y + 2  # <--- (3)
        return z
    return bar(x)  # <--- (2)
foo()              # <--- (1)

s
t   ---------------------------
a   | main (module) Frame     | -> blocks: []
c   |  (oldest)               | -> data: [3]
k   ---------------------------
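The three diagrams can be replayed with a small model in which every frame owns its data stack and a return value lands on the caller's stack. This is a simplified sketch under stated assumptions: `Frame`, `push_frame`, and `return_from_frame` are hypothetical names, and the function objects shown on the stacks in the diagrams are left out.

```python
class Frame:
    """Hypothetical frame: a name plus its own data stack and block stack."""

    def __init__(self, name):
        self.name = name
        self.data = []
        self.blocks = []

call_stack = []

def push_frame(name):
    """Create a frame for a call and make it the newest entry on the call stack."""
    frame = Frame(name)
    call_stack.append(frame)
    return frame

def return_from_frame(value):
    """Pop the newest frame and hand its return value to the caller's data stack."""
    call_stack.pop()
    if call_stack:
        call_stack[-1].data.append(value)
    return value

main = push_frame("main (module)")
foo = push_frame("foo")
bar = push_frame("bar")
bar.data.extend([1, 2])            # y and the constant 2, as in the first diagram

return_from_frame(1 + 2)           # bar returns: foo's data stack becomes [3]
snapshot_after_bar = list(foo.data)

return_from_frame(foo.data.pop())  # foo returns: main's data stack becomes [3]
print(snapshot_after_bar, main.data)  # [3] [3]
```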

Page 39: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

Back to our problem

g = (x*x for x in range(5))
h = (y+1 for y in g)
print(list(h))
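The slides do not spell out what goes wrong with this example. One property that makes it demanding for an interpreter can be shown directly: the two generators run interleaved, so each one is paused and resumed repeatedly and has to keep its own state in between. The sketch below rewrites the generator expressions as generator functions (an equivalent form chosen here so that a trace can be recorded); `squares`, `plus_one`, and `trace` are illustrative names.

```python
trace = []

def squares():
    # Same values as (x*x for x in range(5)).
    for x in range(5):
        trace.append("g yields %d" % (x * x))
        yield x * x

def plus_one(source):
    # Same values as (y+1 for y in g).
    for y in source:
        trace.append("h yields %d" % (y + 1))
        yield y + 1

result = list(plus_one(squares()))
print(result)     # [1, 2, 5, 10, 17]
print(trace[:4])  # ['g yields 0', 'h yields 1', 'g yields 1', 'h yields 2']
```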

Page 40: Byterun, a Python bytecode interpreter - Allison Kaptur at NYCPython

More

Great blogs
http://tech.blog.aknin.name/category/my-projects/pythons-innards/ by @aknin
http://eli.thegreenplace.net/ by Eli Bendersky

Contribute! Find bugs!
https://github.com/nedbat/byterun

Apply to Hacker School!
www.hackerschool.com/apply