GH-102404, GH-100956: Document how to do a WASI build (GH-105251)

Also includes a reference shell script that implements what is documented.
Brett Cannon 2023-06-02 15:15:41 -07:00 committed by GitHub
parent e01b04c907
commit 70dc2fb973
3 changed files with 192 additions and 174 deletions


@ -0,0 +1,2 @@
Document how to perform a WASI build on Linux. Also add
Tools/wasm/build_wasi.sh as a reference implementation of the docs.


@ -10,9 +10,15 @@ use WASM runtimes such as *wasmtime*.
Users and developers are encouraged to use the script
`Tools/wasm/wasm_build.py`. The tool automates the build process and provides
assistance with installation of SDKs.
assistance with installation of SDKs, running tests, etc.
## wasm32-emscripten build
**NOTE**: If you are looking for information that is not directly related to
building CPython for WebAssembly (or the resulting build), please see
https://github.com/psf/webassembly for more information.
## wasm32-emscripten
### Build
For now the build system has two target flavors. The ``Emscripten/browser``
target (``--with-emscripten-target=browser``) is optimized for browsers.
@ -25,6 +31,10 @@ Cross compiling to the wasm32-emscripten platform needs the
Emscripten 3.1.19 or newer is recommended. All commands below are relative
to a repository checkout.
#### Toolchain
##### Container image
Christian Heimes maintains a container image with Emscripten SDK, Python
build dependencies, WASI-SDK, wasmtime, and several additional tools.
@ -38,7 +48,42 @@ podman run --rm -ti -v $(pwd):/python-wasm/cpython:Z -w /python-wasm/cpython qua
docker run --rm -ti -v $(pwd):/python-wasm/cpython -w /python-wasm/cpython quay.io/tiran/cpythonbuild:emsdk3
```
### Compile a build Python interpreter
##### Manually
###### Install [Emscripten SDK](https://emscripten.org/docs/getting_started/downloads.html)
**NOTE**: Follow the on-screen instructions on how to add the SDK to ``PATH``.
```shell
git clone https://github.com/emscripten-core/emsdk.git /opt/emsdk
/opt/emsdk/emsdk install latest
/opt/emsdk/emsdk activate latest
```
###### Optionally: enable ccache for EMSDK
The ``EM_COMPILER_WRAPPER`` must be set after the EMSDK environment is
sourced. Otherwise the source script removes the environment variable.
```
. /opt/emsdk/emsdk_env.sh
EM_COMPILER_WRAPPER=ccache
```
###### Optionally: pre-build and cache static libraries
Emscripten SDK provides static builds of core libraries without PIC
(position-independent code). Python builds with ``dlopen`` support require
PIC. To populate the build cache, run:
```shell
. /opt/emsdk/emsdk_env.sh
embuilder build zlib bzip2 MINIMAL_PIC
embuilder build --pic zlib bzip2 MINIMAL_PIC
```
#### Compile a build Python interpreter
From within the container, run the following command:
@ -56,7 +101,7 @@ make -j$(nproc)
popd
```
### Cross-compile to wasm32-emscripten for browser
#### Cross-compile to wasm32-emscripten for browser
```shell
./Tools/wasm/wasm_build.py emscripten-browser
@ -93,7 +138,7 @@ directory structure enables the *C/C++ DevTools Support (DWARF)* to load C
and header files with debug builds.
### Cross compile to wasm32-emscripten for node
#### Cross compile to wasm32-emscripten for node
```shell
./Tools/wasm/wasm_build.py emscripten-browser-dl
@ -123,13 +168,13 @@ node --experimental-wasm-threads --experimental-wasm-bulk-memory --experimental-
(``--experimental-wasm-bigint`` is not needed with recent NodeJS versions)
# wasm32-emscripten limitations and issues
### Limitations and issues
Emscripten before 3.1.8 has known bugs that can cause memory corruption and
resource leaks. 3.1.8 contains several fixes for bugs in date and time
functions.
## Network stack
#### Network stack
- Python's socket module does not work with Emscripten's emulated POSIX
sockets yet. Network modules like ``asyncio``, ``urllib``, ``selectors``,
@ -143,7 +188,7 @@ functions.
- The ``select`` module is limited. ``select.select()`` crashes the runtime
due to lack of ``exceptfds`` support.
## processes, signals
#### processes, signals
- Processes are not supported. System calls like fork, popen, and subprocess
fail with ``ENOSYS`` or ``ENOTSUP``.
@ -153,7 +198,7 @@ functions.
- Resource-related functions like ``os.nice`` and most functions of the
``resource`` module are not available.
## threading
#### threading
- Threading is disabled by default. The ``configure`` option
``--enable-wasm-pthreads`` adds compiler flag ``-pthread`` and
@ -164,7 +209,7 @@ functions.
- It's not advised to enable threading when building for browsers or with
dynamic linking support; there are performance and stability issues.
## file system
#### file system
- Most user, group, and permission related functions and modules are not
supported or don't work as expected, e.g. ``pwd`` module, ``grp`` module,
@ -180,7 +225,7 @@ functions.
- Large file support crashes the runtime and is disabled.
- ``mmap`` module is unstable. flush (``msync``) can crash the runtime.
## Misc
#### Misc
- Heap memory and stack size are limited. Recursion or extensive memory
consumption can crash Python.
@ -198,7 +243,7 @@ functions.
- Some ``ctypes`` features like ``c_longlong`` and ``c_longdouble`` may need
NodeJS option ``--experimental-wasm-bigint``.
## wasm32-emscripten in browsers
#### In the browser
- The interactive shell does not handle copy 'n paste and unicode support
well.
@ -210,14 +255,14 @@ functions.
- Test modules are disabled by default. Use ``--enable-test-modules`` to build
test modules like ``_testcapi``.
## wasm32-emscripten in node
### wasm32-emscripten in node
Node builds use ``NODERAWFS``.
- Node RawFS allows direct access to the host file system without need to
perform ``FS.mount()`` call.
## wasm64-emscripten
### wasm64-emscripten
- wasm64 requires recent NodeJS and ``--experimental-wasm-memory64``.
- ``EM_JS`` functions must return ``BigInt()``.
@ -226,7 +271,7 @@ Node builds use ``NODERAWFS``.
[gh-95876](https://github.com/python/cpython/issues/95876) and
[gh-95878](https://github.com/python/cpython/issues/95878).
# Hosting Python WASM builds
### Hosting Python WASM builds
The simple REPL terminal uses SharedArrayBuffer. For security reasons
browsers only provide the feature in secure environments with cross-origin
@ -234,7 +279,7 @@ isolation. The webserver must send cross-origin headers and correct MIME types
for the JavaScript and WASM files. Otherwise the terminal will fail to load
with an error message like ``Browsers disable shared array buffer``.
## Apache HTTP .htaccess
#### Apache HTTP .htaccess
Place a ``.htaccess`` file in the same directory as ``python.wasm``.
@ -251,76 +296,106 @@ AddType application/wasm wasm
</IfModule>
```
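For local testing without Apache, the same requirement can be sketched with Python's standard ``http.server``. This is a minimal sketch under stated assumptions, not the project's own server: the names ``IsolatedHandler`` and ``make_server`` are hypothetical, and the header values (``same-origin`` and ``require-corp``) are the ones browsers commonly require for cross-origin isolation.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class IsolatedHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds cross-origin isolation headers."""

    # Make sure WASM and JavaScript files are served with the expected MIME types.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".wasm": "application/wasm",
        ".js": "text/javascript",
    }

    def end_headers(self):
        # Assumed header values; browsers need both for SharedArrayBuffer.
        self.send_header("Cross-Origin-Opener-Policy", "same-origin")
        self.send_header("Cross-Origin-Embedder-Policy", "require-corp")
        super().end_headers()


def make_server(host="127.0.0.1", port=8000):
    """Create (but do not start) a server for the current directory."""
    return ThreadingHTTPServer((host, port), IsolatedHandler)
```

Calling ``make_server().serve_forever()`` from the directory that contains ``python.wasm`` would serve it on ``http://127.0.0.1:8000/``.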
# WASI (wasm32-wasi)
## WASI (wasm32-wasi)
WASI builds require [WASI SDK](https://github.com/WebAssembly/wasi-sdk) 16.0+.
WASI builds require the [WASI SDK](https://github.com/WebAssembly/wasi-sdk) 16.0+.
See `.devcontainer/Dockerfile` for an example of how to download and
install the WASI SDK.
## Cross-compile to wasm32-wasi
### Build
The script ``wasi-env`` sets necessary compiler and linker flags as well as
``pkg-config`` overrides. The script assumes that WASI-SDK is installed in
``/opt/wasi-sdk`` or ``$WASI_SDK_PATH``.
```shell
./Tools/wasm/wasm_build.py wasi
```
The command is roughly equivalent to:
There are two scripts you can use to do a WASI build from a source checkout. You can either use:
```shell
mkdir -p builddir/wasi
pushd builddir/wasi
CONFIG_SITE=../../Tools/wasm/config.site-wasm32-wasi \
../../Tools/wasm/wasi-env ../../configure -C \
--host=wasm32-unknown-wasi \
--build=$(../../config.guess) \
--with-build-python=$(pwd)/../build/python
make -j$(nproc)
popd
./Tools/wasm/wasm_build.py wasi build
```
## WASI limitations and issues (WASI SDK 15.0)
or:
```shell
./Tools/wasm/build_wasi.sh
```
A lot of Emscripten limitations also apply to WASI. Noticeable restrictions
are:
The commands are equivalent to the following steps:
- Call stack size is limited. Default recursion limit and parser stack size
are smaller than in regular Python builds.
- ``socket(2)`` cannot create new socket file descriptors. WASI programs can
call read/write/accept on a file descriptor that is passed into the process.
- ``socket.gethostname()`` and host name resolution APIs like
``socket.gethostbyname()`` are not implemented and always fail.
- ``open(2)`` checks flags more strictly. Caller must pass either
``O_RDONLY``, ``O_RDWR``, or ``O_WRONLY`` to ``os.open``. Directory file
descriptors must be created with flags ``O_RDONLY | O_DIRECTORY``.
- ``chmod(2)`` is not available. It's not possible to modify file permissions,
yet. A future version of WASI may provide a limited ``set_permissions`` API.
- User/group related features like ``os.chown()``, ``os.getuid``, etc. are
stubs or fail with ``ENOTSUP``.
- File locking (``fcntl``) is not available.
- ``os.pipe()``, ``os.mkfifo()``, and ``os.mknod()`` are not supported.
- ``process_time`` does not work as expected because it's implemented using
wall clock.
- ``os.umask()`` is a stub.
- ``sys.executable`` is empty.
- ``/dev/null`` / ``os.devnull`` may not be available.
- ``os.utime*()`` is buggy in WASI SDK 15.0, see
[utimensat() with timespec=NULL sets wrong time](https://github.com/bytecodealliance/wasmtime/issues/4184)
- ``os.symlink()`` fails with ``PermissionError`` when attempting to create a
symlink with an absolute path with wasmtime 0.36.0. The wasmtime runtime
uses ``openat2(2)`` syscall with flag ``RESOLVE_BENEATH`` to open files.
The flag causes the syscall to reject symlinks with absolute paths.
- ``os.curdir`` (aka ``.``) seems to behave differently, which breaks some
``importlib`` tests that add ``.`` to ``sys.path`` and indirectly
``sys.path_importer_cache``.
- WASI runtime environments may not provide a dedicated temp directory.
- Make sure `Modules/Setup.local` exists
- Make sure the necessary build tools are installed:
- [WASI SDK](https://github.com/WebAssembly/wasi-sdk) (which ships with `clang`)
- `make`
- `pkg-config` (on Linux)
- Create the build Python
- `mkdir -p builddir/build`
- `pushd builddir/build`
- Get the build platform
- Python: `sysconfig.get_config_var("BUILD_GNU_TYPE")`
- Shell: `../../config.guess`
- `../../configure -C`
- `make all`
- ```PYTHON_VERSION=`./python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")'` ```
- `popd`
- Create the host/WASI Python
- `mkdir builddir/wasi`
- `pushd builddir/wasi`
- `../../Tools/wasm/wasi-env ../../configure -C --host=wasm32-unknown-wasi --build=$(../../config.guess) --with-build-python=../build/python`
- `CONFIG_SITE=../../Tools/wasm/config.site-wasm32-wasi`
- `HOSTRUNNER="wasmtime run --mapdir /::$(dirname $(dirname $(pwd))) --env PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-$PYTHON_VERSION $(pwd)/python.wasm --"`
- Maps the source checkout to `/` in the WASI runtime
- Stdlib gets loaded from `/Lib`
- Gets `_sysconfigdata__wasi_wasm32-wasi.py` on to `sys.path` via `PYTHONPATH`
- Set by `wasi-env`
- `WASI_SDK_PATH`
- `WASI_SYSROOT`
- `CC`
- `CPP`
- `CXX`
- `LDSHARED`
- `AR`
- `RANLIB`
- `CFLAGS`
- `LDFLAGS`
- `PKG_CONFIG_PATH`
- `PKG_CONFIG_LIBDIR`
- `PKG_CONFIG_SYSROOT_DIR`
- `PATH`
- `make all`
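The values that the steps above derive from the build Python can be sketched in Python as follows. This is only an illustration: the helper names ``python_version``, ``build_platform``, and ``wasi_lib_dir`` are hypothetical, and the ``lib.wasi-wasm32-X.Y`` layout is taken from the ``HOSTRUNNER`` line above.

```python
import sys
import sysconfig


def python_version() -> str:
    """Return "major.minor", matching the PYTHON_VERSION step above."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def build_platform():
    """Return the build triple; may be None on builds that do not record it."""
    return sysconfig.get_config_var("BUILD_GNU_TYPE")


def wasi_lib_dir(version: str) -> str:
    """Path, as seen inside the WASI runtime, that is put on PYTHONPATH."""
    return f"/builddir/wasi/build/lib.wasi-wasm32-{version}"
```

Run with the build Python, ``wasi_lib_dir(python_version())`` gives the ``PYTHONPATH`` value used in ``HOSTRUNNER``.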
# Detect WebAssembly builds
### Running
## Python code
If you followed the instructions above, you can run the interpreter via e.g., `wasmtime` from within the `builddir/wasi` directory (make sure to set/change `$PYTHON_VERSION`; the paths are relative to `builddir/wasi` for simplicity only):
```shell
wasmtime run --mapdir /::../.. --env PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-$PYTHON_VERSION python.wasm -- <args>
```
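When driving the interpreter from a script, the same command line can be assembled as an argument list. The sketch below only builds the list and does not run anything; the function name ``wasmtime_argv`` is hypothetical, and the flags are copied from the command above rather than checked against any particular wasmtime release.

```python
from pathlib import PurePosixPath


def wasmtime_argv(checkout: str, version: str, args: list[str]) -> list[str]:
    """Build the wasmtime command for a WASI build in <checkout>/builddir/wasi."""
    root = PurePosixPath(checkout)
    return [
        "wasmtime", "run",
        # Map the source checkout to "/" inside the WASI runtime.
        "--mapdir", f"/::{root}",
        # Put the directory holding the sysconfig data on sys.path.
        "--env", f"PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-{version}",
        str(root / "builddir" / "wasi" / "python.wasm"),
        "--",
        *args,
    ]
```

The resulting list can be passed to ``subprocess.run()`` on a machine where ``wasmtime`` is installed.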
There are also helpers provided by `Tools/wasm/wasm_build.py` as listed below. Also, if you used `Tools/wasm/build_wasi.sh`, a `run_wasi.sh` file will be created in `builddir/wasi` which will run the above command for you (it also uses absolute paths, so it can be executed from anywhere).
#### REPL
```shell
./Tools/wasm/wasm_build.py wasi repl
```
#### Tests
```shell
./Tools/wasm/wasm_build.py wasi test
```
### Debugging
* ``wasmtime run -g`` generates debugging symbols for gdb and lldb. The
feature is currently broken, see
https://github.com/bytecodealliance/wasmtime/issues/4669 .
* The environment variable ``RUST_LOG=wasi_common`` enables debug and
trace logging.
## Detect WebAssembly builds
### Python code
```python
import os, sys
@ -387,118 +462,15 @@ posix.uname_result(
'wasi'
```
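For code that needs a single check, the platform test can be wrapped in a small helper. This is a sketch; the names ``WASM_PLATFORMS`` and ``is_wasm_build`` are hypothetical, and it relies only on ``sys.platform`` being ``"emscripten"`` or ``"wasi"`` on these builds.

```python
import sys

# sys.platform values reported by the two WebAssembly targets.
WASM_PLATFORMS = frozenset({"emscripten", "wasi"})


def is_wasm_build(platform: str = sys.platform) -> bool:
    """Return True when running on an Emscripten or WASI build of Python."""
    return platform in WASM_PLATFORMS
```

A test suite could use ``is_wasm_build()`` to skip tests that depend on processes or sockets.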
## C code
### C code
Emscripten SDK and WASI SDK define several built-in macros. You can dump a
full list of built-ins with ``emcc -dM -E - < /dev/null`` and
``/path/to/wasi-sdk/bin/clang -dM -E - < /dev/null``.
```C
#ifdef __EMSCRIPTEN__
// Python on Emscripten
#endif
```
* WebAssembly ``__wasm__`` (also ``__wasm``)
* wasm32 ``__wasm32__`` (also ``__wasm32``)
* wasm64 ``__wasm64__``
* Emscripten ``__EMSCRIPTEN__`` (also ``EMSCRIPTEN``)
* Emscripten version ``__EMSCRIPTEN_major__``, ``__EMSCRIPTEN_minor__``, ``__EMSCRIPTEN_tiny__``
* WASI ``__wasi__``
Feature detection flags:
* ``__EMSCRIPTEN_PTHREADS__``
* ``__EMSCRIPTEN_SHARED_MEMORY__``
* ``__wasm_simd128__``
* ``__wasm_sign_ext__``
* ``__wasm_bulk_memory__``
* ``__wasm_atomics__``
* ``__wasm_mutable_globals__``
## Install SDKs and dependencies manually
In some cases (e.g. build bots) you may prefer to install build dependencies
directly on the system instead of using the container image. Total disk size
of SDKs and cached libraries is about 1.6 GB.
### Install OS dependencies
```shell
# Debian/Ubuntu
apt update
apt install -y git make xz-utils bzip2 curl python3-minimal ccache
```
```shell
# Fedora
dnf install -y git make xz bzip2 which ccache
```
### Install [Emscripten SDK](https://emscripten.org/docs/getting_started/downloads.html)
**NOTE**: Follow the on-screen instructions on how to add the SDK to ``PATH``.
```shell
git clone https://github.com/emscripten-core/emsdk.git /opt/emsdk
/opt/emsdk/emsdk install latest
/opt/emsdk/emsdk activate latest
```
### Optionally: enable ccache for EMSDK
The ``EM_COMPILER_WRAPPER`` must be set after the EMSDK environment is
sourced. Otherwise the source script removes the environment variable.
```
. /opt/emsdk/emsdk_env.sh
EM_COMPILER_WRAPPER=ccache
```
### Optionally: pre-build and cache static libraries
Emscripten SDK provides static builds of core libraries without PIC
(position-independent code). Python builds with ``dlopen`` support require
PIC. To populate the build cache, run:
```shell
. /opt/emsdk/emsdk_env.sh
embuilder build zlib bzip2 MINIMAL_PIC
embuilder build --pic zlib bzip2 MINIMAL_PIC
```
### Install [WASI-SDK](https://github.com/WebAssembly/wasi-sdk)
**NOTE**: WASI-SDK's clang may show a warning on Fedora:
``/lib64/libtinfo.so.6: no version information available``,
[RHBZ#1875587](https://bugzilla.redhat.com/show_bug.cgi?id=1875587). The
warning can be ignored.
```shell
export WASI_VERSION=16
export WASI_VERSION_FULL=${WASI_VERSION}.0
curl -sSf -L -O https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-${WASI_VERSION}/wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
mkdir -p /opt/wasi-sdk
tar --strip-components=1 -C /opt/wasi-sdk -xvf wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
rm -f wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
```
### Install [wasmtime](https://github.com/bytecodealliance/wasmtime) WASI runtime
wasmtime 0.38 or newer is required.
```shell
curl -sSf -L -o ~/install-wasmtime.sh https://wasmtime.dev/install.sh
chmod +x ~/install-wasmtime.sh
~/install-wasmtime.sh --version v0.38.0
ln -srf -t /usr/local/bin/ ~/.wasmtime/bin/wasmtime
```
### WASI debugging
* ``wasmtime run -g`` generates debugging symbols for gdb and lldb. The
feature is currently broken, see
https://github.com/bytecodealliance/wasmtime/issues/4669 .
* The environment variable ``RUST_LOG=wasi_common`` enables debug and
trace logging.

Tools/wasm/build_wasi.sh Executable file

@ -0,0 +1,44 @@
#!/usr/bin/bash
set -e -x
# Quick check to avoid running configure just to fail in the end.
if [ -f Programs/python.o ]; then
echo "Can't do an out-of-tree build w/ an in-place build pre-existing (i.e., found Programs/python.o)" >&2
exit 1
fi
if [ ! -f Modules/Setup.local ]; then
touch Modules/Setup.local
fi
# TODO: check if `make` and `pkg-config` are installed
# TODO: detect if wasmtime is installed
# Create the "build" Python.
mkdir -p builddir/build
pushd builddir/build
../../configure -C
make -s -j 4 all
export PYTHON_VERSION=`./python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")'`
popd
# Create the "host"/WASI Python.
export CONFIG_SITE="$(pwd)/Tools/wasm/config.site-wasm32-wasi"
export HOSTRUNNER="wasmtime run --mapdir /::$(pwd) --env PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-$PYTHON_VERSION $(pwd)/builddir/wasi/python.wasm --"
mkdir -p builddir/wasi
pushd builddir/wasi
../../Tools/wasm/wasi-env \
../../configure \
-C \
--host=wasm32-unknown-wasi \
--build=$(../../config.guess) \
--with-build-python=../build/python
make -s -j 4 all
# Create a helper script for executing the host/WASI Python.
printf "#!/bin/sh\nexec $HOSTRUNNER \"\$@\"\n" > run_wasi.sh
chmod 755 run_wasi.sh
./run_wasi.sh --version
popd