erg/doc/zh_CN/python/bytecode_specification.md
2022-09-06 09:29:14 +09:00

Python bytecode specification

Format

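  • 0~3 byte(u32): magic number (see common/bytecode.rs for details)
  • 4~7 byte(u32): 0 padding
  • 8~11 byte(u32): timestamp
  • 12~15 byte(u32): source file size (this field follows the timestamp in the CPython 3.7+ .pyc header)
  • 16~ byte(PyCodeObject): code object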
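
The header can be read with fixed-width little-endian fields. The following is a minimal sketch, assuming the 16-byte layout that CPython has used since 3.7 (magic number, flags, timestamp, source size). `build_pyc` and `parse_header` are hypothetical helper names for illustration; they are not part of Erg or CPython.

```python
import importlib.util
import marshal
import struct

def build_pyc(source: str, mtime: int) -> bytes:
    """Assemble an in-memory .pyc image: 16-byte header + marshalled code object."""
    code = compile(source, "<sketch>", "exec")
    header = importlib.util.MAGIC_NUMBER                        # bytes 0~3
    header += struct.pack("<I", 0)                              # bytes 4~7: zero padding
    header += struct.pack("<I", mtime)                          # bytes 8~11: timestamp
    header += struct.pack("<I", len(source.encode("utf-8")))    # bytes 12~15: source size
    return header + marshal.dumps(code)

def parse_header(pyc: bytes) -> dict:
    """Split the image back into its header fields and the marshalled body."""
    flags, timestamp, size = struct.unpack("<III", pyc[4:16])
    return {
        "magic": pyc[0:4],
        "flags": flags,
        "timestamp": timestamp,
        "source_size": size,
        "body": pyc[16:],
    }

image = build_pyc("x = 1 + 2\n", 1662424154)
fields = parse_header(image)

# The body is an ordinary marshalled code object and can be executed directly.
namespace = {}
exec(marshal.loads(fields["body"]), namespace)
print(fields["magic"], fields["timestamp"], namespace["x"])
```

The timestamp passed to `build_pyc` is an arbitrary example value; a real compiler would use the modification time of the source file.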

PyCodeObject

  • 0 byte(u8): '0xe3' (prefix; this is 'c' (0x63) for code, with the 0x80 reference flag set)
  • 01~04 byte(u32): number of args (co_argcount)
  • 05~08 byte(u32): number of position-only args (co_posonlyargcount)
  • 09~12 byte(u32): number of keyword-only args (co_kwonlyargcount)
  • 13~16 byte(u32): number of locals (co_nlocals)
  • 17~20 byte(u32): stack size (co_stacksize)
  • 21~24 byte(u32): flags (co_flags)
  • ? byte(bytes): bytecode instructions (co_code); the sequence ends with '0x53', '0x0' (83, 0), which is RETURN_VALUE
  • ? byte(PyTuple): constants used in the code (co_consts)
  • ? byte(PyTuple): names used in the code (co_names)
  • ? byte(PyTuple): variable names defined in the code, including parameters (co_varnames)
  • ? byte(PyTuple): variables captured from the outer scope (co_freevars)
  • ? byte(PyTuple): variables used in the inner closure (co_cellvars)
  • ? byte(PyUnicode or PyShortAscii): name of the file the code was loaded from (co_filename)
  • ? byte(PyUnicode or PyShortAscii): name of the code object itself; the default is <module> (co_name)
  • ?~?+3 byte(u32): first line number (co_firstlineno)
  • ? byte(bytes): line table, serialized as a PyStringObject (co_lnotab)
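
The fixed-width part of this layout can be decoded in one `struct` call. The sketch below is illustrative only: `parse_code_prefix` is a hypothetical helper, and because the field order after the tag byte differs between Python versions, it decodes hand-built bytes that follow the list above rather than real compiler output. Only the tag check is run against a real marshalled code object.

```python
import marshal
import struct

FLAG_REF = 0x80  # marshal sets this bit on the tag of objects it may refer to again

FIELDS = (
    "co_argcount", "co_posonlyargcount", "co_kwonlyargcount",
    "co_nlocals", "co_stacksize", "co_flags",
)

def parse_code_prefix(buf: bytes) -> dict:
    """Decode the tag byte and the six u32 fields at bytes 1~24."""
    if buf[0] & ~FLAG_REF != ord("c"):
        raise ValueError("not a code object")
    values = struct.unpack_from("<6I", buf, 1)  # six little-endian u32 values
    return dict(zip(FIELDS, values))

# Hand-built example: tag 0xe3 followed by six made-up field values.
sample = bytes([0xE3]) + struct.pack("<6I", 2, 0, 0, 3, 4, 0x43)
prefix = parse_code_prefix(sample)

# A real marshalled code object carries the same tag in its low seven bits.
real_tag = marshal.dumps(compile("pass", "<sketch>", "exec"))[0]
print(prefix, hex(real_tag & 0x7F))
```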

PyTupleObject

  • 0 byte: 0x29 (means ')', a small tuple)
  • 1 byte(u8): number of tuple items (a tuple with more than 255 items uses 0x28 '(' and a u32 count in bytes 1~4)
  • 2~ byte(PyObject): items
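
The tag and item count can be checked against CPython's own `marshal` module. This is a small sketch assuming CPython with the default marshal format version; `tuple_header` is a hypothetical helper name. It masks off the 0x80 reference flag before comparing the tag, since that bit may or may not be set.

```python
import marshal
import struct

def tuple_header(value: tuple) -> tuple:
    """Return (tag character without the reference flag, item count)."""
    data = marshal.dumps(value)
    tag = chr(data[0] & 0x7F)
    if tag == ")":
        # Small tuple: the count is a single byte.
        return tag, data[1]
    # Regular tuple '(': the count is a little-endian u32.
    (count,) = struct.unpack_from("<I", data, 1)
    return tag, count

small = tuple_header((1, 2, 3))
large = tuple_header(tuple(range(300)))
print(small, large)
```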

PyStringObject

  • Strings containing non-ASCII characters, such as "あ", "𠮷", and "α", are serialized as PyUnicode, not as this type.
  • In Python 3 this tag is used for bytes objects.
  • 0 byte: 0x73 (means 's')
  • 1~4 byte(u32): length of string
  • 5~ byte: payload

PyUnicodeObject

  • 0 byte: 0x75 (means 'u')
  • 1~4 byte: length of string
  • 5~ byte: payload
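
Both layouts share the same shape: one tag byte, a u32 length, then the payload. The sketch below, which assumes CPython's `marshal` module and uses a hypothetical helper `split_long_string`, shows a bytes object and a non-ASCII string side by side. For the text case the length counts UTF-8 bytes, not characters, and the tag may come out as 't' instead of 'u' if the interpreter has interned the string.

```python
import marshal
import struct

def split_long_string(data: bytes) -> tuple:
    """Split a marshalled object with a u32 length into (tag, length, payload)."""
    tag = chr(data[0] & 0x7F)                     # drop the 0x80 reference flag
    (length,) = struct.unpack_from("<I", data, 1) # bytes 1~4
    return tag, length, data[5:5 + length]        # bytes 5~

raw = split_long_string(marshal.dumps(b"abc"))
text = split_long_string(marshal.dumps("あ𠮷α"))
print(raw, text[0], text[1], text[2].decode("utf-8"))
```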

PyShortAsciiObject

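  • "Short" refers to the length field: the length is stored in one byte, so this type holds ASCII strings of up to 255 characters. A string of 100 characters is therefore still short.
  • Longer ASCII strings use a separate type, 'a' (0x61), with a u32 length.
  • 0 byte: 0xFA ('z' (0x7A) with the 0x80 reference flag set)
  • 1 byte(u8): length of string
  • 2~ byte: payload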
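
Where "short" ends can be observed directly. The following sketch assumes CPython's `marshal` module with the default format version; `ascii_header` is a hypothetical helper. The tag may appear in its interned variant ('Z' or 'A') depending on whether the interpreter interned the string, so the helper accepts both.

```python
import marshal
import struct

def ascii_header(text: str) -> tuple:
    """Return (tag character without the reference flag, stored length)."""
    data = marshal.dumps(text)
    tag = chr(data[0] & 0x7F)
    if tag in "zZ":
        # Short ASCII: the length is a single byte.
        return tag, data[1]
    # Plain ASCII ('a' or 'A'): the length is a little-endian u32.
    (length,) = struct.unpack_from("<I", data, 1)
    return tag, length

short = ascii_header("x" * 255)
long_ = ascii_header("x" * 256)
print(short, long_)
```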

PyInternedObject

  • Interned objects are registered in a dedicated map, so they can be compared by identity with is.
  • Two interned strings, for example, can be compared in constant time regardless of their length.
  • 0 byte: 0x74 (means 't')
  • 1~4 byte(u32): length of string
  • 5~ byte: payload
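
Interning can be tried out with `sys.intern`. This is a minimal sketch assuming CPython; the two strings are built at run time so that the compiler cannot hand back one shared constant, and they contain a non-ASCII character so that marshal picks the 't' tag rather than a short-ASCII one.

```python
import marshal
import sys

# Two equal strings, each assembled separately at run time.
a = "".join(["byte", "code", "_あ"])
b = "".join(["byte", "code", "_あ"])

equal_by_value = a == b              # compares contents

ia, ib = sys.intern(a), sys.intern(b)
same_object = ia is ib               # identity check, independent of length

tag = chr(marshal.dumps(ia)[0] & 0x7F)  # tag with the 0x80 reference flag removed
print(equal_by_value, same_object, tag)
```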

PyShortAsciiInternedObject

  • 0 byte: 0xDA ('Z' (0x5A) with the 0x80 reference flag set)
  • 1 byte(u8): length of string
  • 2~ byte: payload
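
The string-like objects in this document differ only in their tag and in the width of the length field, so one reader can cover them all. The sketch below is a hypothetical helper (`read_string` and `LENGTH_FIELD` are illustrative names), written on the assumption that the short types use a one-byte length and the others a u32 length, as CPython's marshal format does.

```python
import struct

# Tag (with the 0x80 reference flag cleared) -> width of the length field in bytes.
LENGTH_FIELD = {"s": 4, "u": 4, "t": 4, "z": 1, "Z": 1}

def read_string(buf: bytes, offset: int = 0) -> tuple:
    """Read one string-like object; return (tag, payload, offset of the next object)."""
    tag = chr(buf[offset] & 0x7F)
    width = LENGTH_FIELD[tag]
    start = offset + 1 + width
    length = int.from_bytes(buf[offset + 1:start], "little")
    return tag, buf[start:start + length], start + length

# Hand-built stream: a short interned ASCII string followed by a PyUnicode.
stream = b"\xda\x03abc" + b"u" + struct.pack("<I", 2) + "α".encode("utf-8")
first = read_string(stream)
second = read_string(stream, first[2])
print(first, second)
```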
