Code Generation
By default, Erg scripts are converted to pyc files and executed. In other words, they are executed as Python bytecode rather than Python scripts.
The pyc files are generated from the HIR, which has been desugared (phase 8) and linked with dependencies (phase 9).
The process is handled by the PyCodeGenerator. This structure takes HIR and returns a CodeObj.
The CodeObj corresponds to Python's Code object and contains the sequence of instructions to be executed, objects in the static area, and various other metadata. From the perspective of the Python interpreter, the Code object represents a scope. The Code representing the top-level scope will contain all the information necessary for execution. The CodeObj is serialized into a binary format using the dump_as_pyc method and written to a pyc file.
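As a rough illustration of what serializing to a pyc file involves, the following Python sketch builds a code object with the built-in compile function and writes it in the pyc layout used by CPython 3.7 and later (a 16-byte header followed by the marshalled code object). It is not Erg's dump_as_pyc; the helper name dump_code_as_pyc is hypothetical.

```python
import importlib.util
import marshal
import struct
import time


def dump_code_as_pyc(code, source_size=0):
    """Hypothetical helper: serialize a code object into pyc bytes (CPython 3.7+ layout)."""
    header = importlib.util.MAGIC_NUMBER                        # 4 bytes: interpreter-specific magic
    header += struct.pack("<I", 0)                              # 4 bytes: flags (0 = timestamp-based)
    header += struct.pack("<I", int(time.time()) & 0xFFFFFFFF)  # 4 bytes: source mtime
    header += struct.pack("<I", source_size & 0xFFFFFFFF)       # 4 bytes: source size
    return header + marshal.dumps(code)


# The top-level code object plays the role of the top-level scope.
code = compile("answer = 6 * 7", "<sketch>", "exec")
pyc_bytes = dump_code_as_pyc(code)

# Round trip: skip the 16-byte header, unmarshal, and execute.
namespace = {}
exec(marshal.loads(pyc_bytes[16:]), namespace)
```

Because the magic number is specific to the interpreter version, a pyc file produced this way can only be loaded by the same Python version that wrote it.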
Features Not Present in Python
Erg Runtime
Erg runs on the Python interpreter, but there are various semantic differences from Python. Some features are implemented by the compiler desugaring them into lower-level features, but some can only be implemented at runtime.
Examples include methods that do not exist in Python's built-in types.
Python's built-ins do not have a Nat type, nor do they have a times! method.
These methods are implemented by creating new types that wrap Python's built-in types.
These types are located here.
The generated bytecode first imports _erg_std_prelude.py. This module re-exports the types and functions provided by the Erg runtime.
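The wrapping approach can be sketched in Python as follows. This is a minimal illustration, not the Erg runtime's actual code: the class body, the validation, and the method name times (standing in for times!, since ! is not valid in a Python identifier) are all assumptions.

```python
class Nat(int):
    """Hypothetical wrapper: a non-negative integer built on Python's int."""

    def __new__(cls, value):
        if value < 0:
            raise ValueError("Nat must be non-negative")
        return super().__new__(cls, value)

    def times(self, f):
        # Stand-in for Erg's `times!`: call `f` once per unit of the value.
        for _ in range(self):
            f()


calls = []
Nat(3).times(lambda: calls.append("hi"))
```

Since Nat subclasses int, values of this type still work with ordinary integer arithmetic while gaining the extra method.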
Record
Records are implemented using Python's namedtuple.
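For instance, a record with fields x and y could look roughly like this at runtime. The type name Record and the field names are chosen for illustration only; the names the compiler actually generates may differ.

```python
from collections import namedtuple

# Hypothetical runtime shape of a record with fields x and y.
Record = namedtuple("Record", ["x", "y"])
rec = Record(x=1, y=2)

# Fields are read by attribute name, and the value stays immutable.
total = rec.x + rec.y
```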
Trait
Traits are implemented as Python's ABC (Abstract Base Classes). However, Erg's traits have little meaning at runtime.
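A minimal sketch of the idea, assuming a hypothetical trait Show with a single required method. The names Show, show, and Celsius are illustrative and not taken from the Erg runtime.

```python
from abc import ABC, abstractmethod


class Show(ABC):
    """Hypothetical trait: anything that can render itself as text."""

    @abstractmethod
    def show(self):
        ...


class Celsius(Show):
    def __init__(self, degrees):
        self.degrees = degrees

    def show(self):
        return f"{self.degrees} C"


text = Celsius(21).show()
```

Python refuses to instantiate Show directly because show is abstract, but the Erg compiler has already verified trait conformance statically, which is why the ABC adds little at runtime.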
match
Pattern matching is mostly reduced to a combination of type checks and assignment operations. This is done relatively early in the compilation process.
i, [j, *k] = 1, [2, 3, 4]
↓
_0 = 1, [2, 3, 4]
i = _0[0]
_1 = _0[1]
j = _1[0]
k = _1[1:]
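The same reduction can be checked in plain Python, where the destructuring form happens to be valid syntax as well. The names i2, j2, and k2 are introduced here only to keep the two versions side by side.

```python
# Native Python destructuring, for comparison.
i, [j, *k] = 1, [2, 3, 4]

# The same bindings expressed as indexing and slicing on temporaries.
_0 = 1, [2, 3, 4]
i2 = _0[0]
_1 = _0[1]
j2 = _1[0]
k2 = _1[1:]
```

Both versions bind the same values, so the pattern needs no dedicated runtime support once it has been rewritten this way.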
However, some are delayed until runtime.
x: Int or Str
match x:
    i: Int -> ...
    s: Str -> ...
This pattern match requires a runtime check. This check is performed by in_operator.
Therefore, the desugared code for the above example is as follows. Exhaustiveness checking is performed at compile time.
if in_operator(x, Int):
    ...
else:
    ...
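A runnable approximation in Python is shown below. The in_operator defined here is a simplified stand-in based on isinstance and is an assumption; the real runtime helper handles more kinds of patterns. Python's int and str stand in for Erg's Int and Str, and describe is a hypothetical function name.

```python
def in_operator(value, type_):
    # Simplified stand-in for the runtime helper: a plain instance check.
    return isinstance(value, type_)


def describe(x):
    # The else branch needs no second check because exhaustiveness
    # was already established at compile time.
    if in_operator(x, int):
        i = x
        return f"Int: {i}"
    else:
        s = x
        return f"Str: {s}"
```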
Control-flow Functions
Functions corresponding to Python control-flow constructs, such as for! and if!, are compiled differently depending on whether they can be optimized. Usually they can, and they are reduced to dedicated bytecode instructions.
for! [a, b], i =>
    ...
↓
LOAD_NAME 0(a)
LOAD_NAME 1(b)
BUILD_LIST 2
GET_ITER
FOR_ITER ...
STORE_NAME 2(i)
...
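For comparison, the instructions CPython itself emits for an equivalent Python loop can be inspected with the standard dis module. This sketch looks at Python's own compiler output rather than Erg's, and the exact instruction sequence varies between Python versions.

```python
import dis

# An ordinary Python loop over a two-element list literal.
source = "for i in [a, b]:\n    pass\n"
code = compile(source, "<sketch>", "exec")

# Collect the instruction names without executing the loop.
opnames = [ins.opname for ins in dis.get_instructions(code)]
```

Calling dis.dis(code) prints the full listing, including the GET_ITER and FOR_ITER pair that drives the iteration.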
This is more efficient than function calls. However, there are cases where optimization cannot be performed, as shown below.
f! = [for!, ...].choice!()
f! [1, 2], i =>
    ...
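In such cases the compiler cannot know statically which function is called, so for! and similar functions must exist as actual function objects at runtime. These functions are defined in _erg_control.py.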
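What a first-class version of such a function might look like is sketched below. The names for_ and reversed_for are hypothetical, and the contents of _erg_control.py may differ; the point is only that the callee is chosen at runtime, so no dedicated loop bytecode can be emitted at the call site.

```python
import random


def for_(iterable, body):
    # Hypothetical function form of `for!`: apply `body` to every item.
    for item in iterable:
        body(item)


def reversed_for(iterable, body):
    # A second hypothetical candidate, so that the choice is a real one.
    for item in reversed(list(iterable)):
        body(item)


# The callee is only known at runtime.
f = random.choice([for_, reversed_for])

seen = []
f([1, 2], lambda i: seen.append(i))
```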