gh-126187 Add emscripten.py script to automate emscripten build (#126190)

Add emscripten.py script to automate emscripten build.

This is modeled heavily on `Tools/wasm/wasi.py`. This will form the basis of an Emscripten build bot.
This commit is contained in:
Hood Chatham 2024-11-09 03:12:55 +01:00 committed by GitHub
parent 54c63a32d0
commit f8276bf5f3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 355 additions and 137 deletions

View file

@ -0,0 +1 @@
Introduced ``Tools/wasm/emscripten.py`` to simplify doing Emscripten builds.

View file

@ -1,7 +1,7 @@
# Python WebAssembly (WASM) build # Python WebAssembly (WASM) build
**WASI support is [tier 2](https://peps.python.org/pep-0011/#tier-2).** **WASI support is [tier 2](https://peps.python.org/pep-0011/#tier-2).**
**Emscripten is NOT officially supported as of Python 3.13.** **Emscripten support is [tier 3](https://peps.python.org/pep-0011/#tier-3).**
This directory contains configuration and helpers to facilitate cross This directory contains configuration and helpers to facilitate cross
compilation of CPython to WebAssembly (WASM). Python supports Emscripten compilation of CPython to WebAssembly (WASM). Python supports Emscripten
@ -27,154 +27,57 @@ It comes with a reduced and preloaded stdlib without tests and threading
support. The ``Emscripten/node`` target has threading enabled and can support. The ``Emscripten/node`` target has threading enabled and can
access the file system directly. access the file system directly.
Cross compiling to the wasm32-emscripten platform needs the To cross compile to the ``wasm32-emscripten`` platform you need
[Emscripten](https://emscripten.org/) SDK and a build Python interpreter. [the Emscripten compiler toolchain](https://emscripten.org/),
Emscripten 3.1.19 or newer are recommended. All commands below are relative a Python interpreter, and an installation of Node version 18 or newer. Emscripten
to a repository checkout. version 3.1.42 or newer is recommended. All commands below are relative to a checkout
of the Python repository.
#### Toolchain #### Install [the Emscripten compiler toolchain](https://emscripten.org/docs/getting_started/downloads.html)
##### Container image
Christian Heimes maintains a container image with Emscripten SDK, Python
build dependencies, WASI-SDK, wasmtime, and several additional tools.
From within your local CPython repo clone, run one of the following commands:
```
# Fedora, RHEL, CentOS
podman run --rm -ti -v $(pwd):/python-wasm/cpython:Z -w /python-wasm/cpython quay.io/tiran/cpythonbuild:emsdk3
# other
docker run --rm -ti -v $(pwd):/python-wasm/cpython -w /python-wasm/cpython quay.io/tiran/cpythonbuild:emsdk3
```
##### Manually
###### Install [Emscripten SDK](https://emscripten.org/docs/getting_started/downloads.html)
**NOTE**: Follow the on-screen instructions how to add the SDK to ``PATH``.
You can install the Emscripten toolchain as follows:
```shell ```shell
git clone https://github.com/emscripten-core/emsdk.git /opt/emsdk git clone https://github.com/emscripten-core/emsdk.git --depth 1
/opt/emsdk/emsdk install latest ./emsdk/emsdk install latest
/opt/emsdk/emsdk activate latest ./emsdk/emsdk activate latest
``` ```
To add the Emscripten compiler to your path:
```shell
source ./emsdk/emsdk_env.sh
```
This adds `emcc` and `emconfigure` to your path.
###### Optionally: enable ccache for EMSDK ##### Optionally: enable ccache for EMSDK
The ``EM_COMPILER_WRAPPER`` must be set after the EMSDK environment is The ``EM_COMPILER_WRAPPER`` must be set after the EMSDK environment is
sourced. Otherwise the source script removes the environment variable. sourced. Otherwise the source script removes the environment variable.
```
. /opt/emsdk/emsdk_env.sh
EM_COMPILER_WRAPPER=ccache
```
###### Optionally: pre-build and cache static libraries
Emscripten SDK provides static builds of core libraries without PIC
(position-independent code). Python builds with ``dlopen`` support require
PIC. To populate the build cache, run:
```shell ```shell
. /opt/emsdk/emsdk_env.sh export EM_COMPILER_WRAPPER=ccache
embuilder build zlib bzip2 MINIMAL_PIC
embuilder --pic build zlib bzip2 MINIMAL_PIC
``` ```
### Compile and build Python interpreter ### Compile and build Python interpreter
From within the container, run the following command: You can use `python Tools/wasm/emscripten` to compile and build targetting
Emscripten. You can do everything at once with:
```shell ```shell
./Tools/wasm/wasm_build.py build python Tools/wasm/emscripten build
``` ```
or you can break it out into four separate steps:
The command is roughly equivalent to:
```shell ```shell
mkdir -p builddir/build python Tools/wasm/emscripten configure-build-python
pushd builddir/build python Tools/wasm/emscripten make-build-python
../../configure -C python Tools/wasm/emscripten configure-host
make -j$(nproc) python Tools/wasm/emscripten make-host
popd
``` ```
Extra arguments to the configure steps are passed along to configure. For
#### Cross-compile to wasm32-emscripten for browser instance, to do a debug build, you can use:
```shell ```shell
./Tools/wasm/wasm_build.py emscripten-browser python Tools/wasm/emscripten build --with-py-debug
``` ```
The command is roughly equivalent to:
```shell
mkdir -p builddir/emscripten-browser
pushd builddir/emscripten-browser
CONFIG_SITE=../../Tools/wasm/config.site-wasm32-emscripten \
emconfigure ../../configure -C \
--host=wasm32-unknown-emscripten \
--build=$(../../config.guess) \
--with-emscripten-target=browser \
--with-build-python=$(pwd)/../build/python
emmake make -j$(nproc)
popd
```
Serve `python.html` with a local webserver and open the file in a browser.
Python comes with a minimal web server script that sets necessary HTTP
headers like COOP, COEP, and mimetypes. Run the script outside the container
and from the root of the CPython checkout.
```shell
./Tools/wasm/wasm_webserver.py
```
and open http://localhost:8000/builddir/emscripten-browser/python.html . This
directory structure enables the *C/C++ DevTools Support (DWARF)* to load C
and header files with debug builds.
#### Cross compile to wasm32-emscripten for node
```shell
./Tools/wasm/wasm_build.py emscripten-node-dl
```
The command is roughly equivalent to:
```shell
mkdir -p builddir/emscripten-node-dl
pushd builddir/emscripten-node-dl
CONFIG_SITE=../../Tools/wasm/config.site-wasm32-emscripten \
emconfigure ../../configure -C \
--host=wasm32-unknown-emscripten \
--build=$(../../config.guess) \
--with-emscripten-target=node \
--enable-wasm-dynamic-linking \
--with-build-python=$(pwd)/../build/python
emmake make -j$(nproc)
popd
```
```shell
node --experimental-wasm-threads --experimental-wasm-bulk-memory --experimental-wasm-bigint builddir/emscripten-node-dl/python.js
```
(``--experimental-wasm-bigint`` is not needed with recent NodeJS versions)
### Limitations and issues ### Limitations and issues
Emscripten before 3.1.8 has known bugs that can cause memory corruption and
resource leaks. 3.1.8 contains several fixes for bugs in date and time
functions.
#### Network stack #### Network stack
- Python's socket module does not work with Emscripten's emulated POSIX - Python's socket module does not work with Emscripten's emulated POSIX
@ -241,8 +144,6 @@ functions.
[gh-90548](https://github.com/python/cpython/issues/90548). [gh-90548](https://github.com/python/cpython/issues/90548).
- Python's object allocator ``obmalloc`` is disabled by default. - Python's object allocator ``obmalloc`` is disabled by default.
- ``ensurepip`` is not available. - ``ensurepip`` is not available.
- Some ``ctypes`` features like ``c_longlong`` and ``c_longdouble`` may need
NodeJS option ``--experimental-wasm-bigint``.
#### In the browser #### In the browser
@ -263,15 +164,6 @@ Node builds use ``NODERAWFS``.
- Node RawFS allows direct access to the host file system without need to - Node RawFS allows direct access to the host file system without need to
perform ``FS.mount()`` call. perform ``FS.mount()`` call.
### wasm64-emscripten
- wasm64 requires recent NodeJS and ``--experimental-wasm-memory64``.
- ``EM_JS`` functions must return ``BigInt()``.
- ``Py_BuildValue()`` format strings must match size of types. Confusing 32
and 64 bits types leads to memory corruption, see
[gh-95876](https://github.com/python/cpython/issues/95876) and
[gh-95878](https://github.com/python/cpython/issues/95878).
### Hosting Python WASM builds ### Hosting Python WASM builds
The simple REPL terminal uses SharedArrayBuffer. For security reasons The simple REPL terminal uses SharedArrayBuffer. For security reasons

View file

@ -0,0 +1,325 @@
#!/usr/bin/env python3
import argparse
import contextlib
import functools
import os
try:
from os import process_cpu_count as cpu_count
except ImportError:
from os import cpu_count
from pathlib import Path
import shutil
import subprocess
import sys
import sysconfig
import tempfile
WASM_DIR = Path(__file__).parent.parent
CHECKOUT = WASM_DIR.parent.parent
CROSS_BUILD_DIR = CHECKOUT / "cross-build"
BUILD_DIR = CROSS_BUILD_DIR / "build"
HOST_TRIPLE = "wasm32-emscripten"
HOST_DIR = CROSS_BUILD_DIR / HOST_TRIPLE
LOCAL_SETUP = CHECKOUT / "Modules" / "Setup.local"
LOCAL_SETUP_MARKER = "# Generated by Tools/wasm/emscripten.py\n".encode("utf-8")
def updated_env(updates={}):
"""Create a new dict representing the environment to use.
The changes made to the execution environment are printed out.
"""
env_defaults = {}
# https://reproducible-builds.org/docs/source-date-epoch/
git_epoch_cmd = ["git", "log", "-1", "--pretty=%ct"]
try:
epoch = subprocess.check_output(git_epoch_cmd, encoding="utf-8").strip()
env_defaults["SOURCE_DATE_EPOCH"] = epoch
except subprocess.CalledProcessError:
pass # Might be building from a tarball.
# This layering lets SOURCE_DATE_EPOCH from os.environ takes precedence.
environment = env_defaults | os.environ | updates
env_diff = {}
for key, value in environment.items():
if os.environ.get(key) != value:
env_diff[key] = value
print("🌎 Environment changes:")
for key in sorted(env_diff.keys()):
print(f" {key}={env_diff[key]}")
return environment
def subdir(working_dir, *, clean_ok=False):
"""Decorator to change to a working directory."""
def decorator(func):
@functools.wraps(func)
def wrapper(context):
try:
tput_output = subprocess.check_output(
["tput", "cols"], encoding="utf-8"
)
terminal_width = int(tput_output.strip())
except subprocess.CalledProcessError:
terminal_width = 80
print("" * terminal_width)
print("📁", working_dir)
if clean_ok and getattr(context, "clean", False) and working_dir.exists():
print(f"🚮 Deleting directory (--clean)...")
shutil.rmtree(working_dir)
working_dir.mkdir(parents=True, exist_ok=True)
with contextlib.chdir(working_dir):
return func(context, working_dir)
return wrapper
return decorator
def call(command, *, quiet, **kwargs):
"""Execute a command.
If 'quiet' is true, then redirect stdout and stderr to a temporary file.
"""
print("", " ".join(map(str, command)))
if not quiet:
stdout = None
stderr = None
else:
stdout = tempfile.NamedTemporaryFile(
"w",
encoding="utf-8",
delete=False,
prefix="cpython-emscripten-",
suffix=".log",
)
stderr = subprocess.STDOUT
print(f"📝 Logging output to {stdout.name} (--quiet)...")
subprocess.check_call(command, **kwargs, stdout=stdout, stderr=stderr)
def build_platform():
"""The name of the build/host platform."""
# Can also be found via `config.guess`.`
return sysconfig.get_config_var("BUILD_GNU_TYPE")
def build_python_path():
"""The path to the build Python binary."""
binary = BUILD_DIR / "python"
if not binary.is_file():
binary = binary.with_suffix(".exe")
if not binary.is_file():
raise FileNotFoundError("Unable to find `python(.exe)` in " f"{BUILD_DIR}")
return binary
@subdir(BUILD_DIR, clean_ok=True)
def configure_build_python(context, working_dir):
"""Configure the build/host Python."""
if LOCAL_SETUP.exists():
print(f"👍 {LOCAL_SETUP} exists ...")
else:
print(f"📝 Touching {LOCAL_SETUP} ...")
LOCAL_SETUP.write_bytes(LOCAL_SETUP_MARKER)
configure = [os.path.relpath(CHECKOUT / "configure", working_dir)]
if context.args:
configure.extend(context.args)
call(configure, quiet=context.quiet)
@subdir(BUILD_DIR)
def make_build_python(context, working_dir):
"""Make/build the build Python."""
call(["make", "--jobs", str(cpu_count()), "all"], quiet=context.quiet)
binary = build_python_path()
cmd = [
binary,
"-c",
"import sys; " "print(f'{sys.version_info.major}.{sys.version_info.minor}')",
]
version = subprocess.check_output(cmd, encoding="utf-8").strip()
print(f"🎉 {binary} {version}")
@subdir(HOST_DIR, clean_ok=True)
def configure_emscripten_python(context, working_dir):
"""Configure the emscripten/host build."""
config_site = os.fsdecode(
CHECKOUT / "Tools" / "wasm" / "config.site-wasm32-emscripten"
)
emscripten_build_dir = working_dir.relative_to(CHECKOUT)
python_build_dir = BUILD_DIR / "build"
lib_dirs = list(python_build_dir.glob("lib.*"))
assert (
len(lib_dirs) == 1
), f"Expected a single lib.* directory in {python_build_dir}"
lib_dir = os.fsdecode(lib_dirs[0])
pydebug = lib_dir.endswith("-pydebug")
python_version = lib_dir.removesuffix("-pydebug").rpartition("-")[-1]
sysconfig_data = (
f"{emscripten_build_dir}/build/lib.emscripten-wasm32-{python_version}"
)
if pydebug:
sysconfig_data += "-pydebug"
host_runner = context.host_runner
env_additions = {"CONFIG_SITE": config_site, "HOSTRUNNER": host_runner}
build_python = os.fsdecode(build_python_path())
configure = [
"emconfigure",
os.path.relpath(CHECKOUT / "configure", working_dir),
"CFLAGS=-DPY_CALL_TRAMPOLINE -sUSE_BZIP2",
f"--host={HOST_TRIPLE}",
f"--build={build_platform()}",
f"--with-build-python={build_python}",
"--without-pymalloc",
"--disable-shared",
"--disable-ipv6",
"--enable-big-digits=30",
"--enable-wasm-dynamic-linking",
f"--prefix={HOST_DIR}",
]
if pydebug:
configure.append("--with-pydebug")
if context.args:
configure.extend(context.args)
call(
configure,
env=updated_env(env_additions),
quiet=context.quiet,
)
python_js = working_dir / "python.js"
exec_script = working_dir / "python.sh"
exec_script.write_text(f'#!/bin/sh\nexec {host_runner} {python_js} "$@"\n')
exec_script.chmod(0o755)
print(f"🏃‍♀️ Created {exec_script} ... ")
sys.stdout.flush()
@subdir(HOST_DIR)
def make_emscripten_python(context, working_dir):
"""Run `make` for the emscripten/host build."""
call(
["make", "--jobs", str(cpu_count()), "commoninstall"],
env=updated_env(),
quiet=context.quiet,
)
exec_script = working_dir / "python.sh"
subprocess.check_call([exec_script, "--version"])
def build_all(context):
"""Build everything."""
steps = [
configure_build_python,
make_build_python,
configure_emscripten_python,
make_emscripten_python,
]
for step in steps:
step(context)
def clean_contents(context):
"""Delete all files created by this script."""
if CROSS_BUILD_DIR.exists():
print(f"🧹 Deleting {CROSS_BUILD_DIR} ...")
shutil.rmtree(CROSS_BUILD_DIR)
if LOCAL_SETUP.exists():
with LOCAL_SETUP.open("rb") as file:
if file.read(len(LOCAL_SETUP_MARKER)) == LOCAL_SETUP_MARKER:
print(f"🧹 Deleting generated {LOCAL_SETUP} ...")
def main():
default_host_runner = "node"
parser = argparse.ArgumentParser()
subcommands = parser.add_subparsers(dest="subcommand")
build = subcommands.add_parser("build", help="Build everything")
configure_build = subcommands.add_parser(
"configure-build-python", help="Run `configure` for the " "build Python"
)
make_build = subcommands.add_parser(
"make-build-python", help="Run `make` for the build Python"
)
configure_host = subcommands.add_parser(
"configure-host",
help="Run `configure` for the host/emscripten (pydebug builds are inferred from the build Python)",
)
make_host = subcommands.add_parser("make-host", help="Run `make` for the host/emscripten")
clean = subcommands.add_parser(
"clean", help="Delete files and directories created by this script"
)
for subcommand in build, configure_build, make_build, configure_host, make_host:
subcommand.add_argument(
"--quiet",
action="store_true",
default=False,
dest="quiet",
help="Redirect output from subprocesses to a log file",
)
for subcommand in configure_build, configure_host:
subcommand.add_argument(
"--clean",
action="store_true",
default=False,
dest="clean",
help="Delete any relevant directories before building",
)
for subcommand in build, configure_build, configure_host:
subcommand.add_argument(
"args", nargs="*", help="Extra arguments to pass to `configure`"
)
for subcommand in build, configure_host:
subcommand.add_argument(
"--host-runner",
action="store",
default=default_host_runner,
dest="host_runner",
help="Command template for running the emscripten host"
f"`{default_host_runner}`)",
)
context = parser.parse_args()
dispatch = {
"configure-build-python": configure_build_python,
"make-build-python": make_build_python,
"configure-host": configure_emscripten_python,
"make-host": make_emscripten_python,
"build": build_all,
"clean": clean_contents,
}
if not context.subcommand:
# No command provided, display help and exit
print("Expected one of", ", ".join(sorted(dispatch.keys())), file=sys.stderr)
parser.print_help(sys.stderr)
sys.exit(1)
dispatch[context.subcommand](context)
if __name__ == "__main__":
main()