rcl/ideas/multithreading.md
Ruud van Asseldonk 7d31cb193a Add ideas about multithreading
For a long time I wondered how I could make RCL scale to large
repositories (e.g. Nixpkgs), because evaluation is completely
sequential, and it's hard to parallelize because you only discover the
imports during evaluation, and then you'd have to do some kind of
laziness, add thunks for a "pending evaluation" value, but then that
pollutes the entire evaluator that now has to force thunks. Laziness
also disturbs assertions ... if I import a file but never use it, should
its violated assertions surface? I think they should, but with laziness
they would not.

But now that I'm writing the typechecker, I had a realization. When the
typechecker encounters an import statement, it can go and load the file
in the background, and typecheck it already. It can even evaluate it
already, speculatively.
2024-02-24 21:43:38 +01:00

Multithreading

With a separate typecheck and evaluation phase, the typechecker already finds all imports before evaluation starts. It can kick off loading, lexing, parsing, and typechecking of each imported file in the background, while the foreground thread continues on the assumption that the imported file typechecks. At the very end, we can report errors from all threads (be sure to sort them for reproducibility).
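
A minimal sketch of that pipeline, under loud assumptions: `Diagnostic`, `check_file`, and `check_all` are hypothetical names, not RCL's actual internals, and `check_file` is a toy stand-in for load + lex + parse + typecheck that flags any line containing `!`. The list of imports is also passed in up front here, whereas the real typechecker would discover imports while walking the file. The part the sketch tries to show is the shape: background threads per import, the foreground continuing in the meantime, and a single sorted error report at the end.

```rust
use std::thread;

/// A hypothetical diagnostic. The derived `Ord` compares by file, then
/// offset, then message, which gives a stable report order.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Diagnostic {
    pub file: String,
    pub offset: usize,
    pub message: String,
}

/// Toy stand-in for load + lex + parse + typecheck of one file.
/// Assumption for illustration: a line that contains '!' is an error,
/// and the "offset" is simply the zero-based line index.
fn check_file(name: &str, source: &str) -> Vec<Diagnostic> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains('!'))
        .map(|(i, _)| Diagnostic {
            file: name.to_string(),
            offset: i,
            message: "type error".to_string(),
        })
        .collect()
}

/// Check the root file on the calling thread while every import is
/// checked on its own background thread, then merge and sort the errors.
pub fn check_all(root: (&str, &str), imports: &[(&str, &str)]) -> Vec<Diagnostic> {
    let mut all = thread::scope(|s| {
        // Kick off the background work first.
        let handles: Vec<_> = imports
            .iter()
            .map(|&(name, src)| s.spawn(move || check_file(name, src)))
            .collect();

        // The foreground continues, assuming the imports will typecheck.
        let mut diags = check_file(root.0, root.1);

        // Only at the very end do we collect what the other threads found.
        for handle in handles {
            diags.extend(handle.join().expect("background checker panicked"));
        }
        diags
    });

    // Threads finish in arbitrary order, so sort for reproducible output.
    all.sort();
    all
}
```

Calling `check_all(("main.rcl", "ok\nbad!"), &[("b.rcl", "bad!"), ("a.rcl", "ok\nok\nbad!")])` returns three diagnostics ordered by file name (`a.rcl`, `b.rcl`, `main.rcl`) regardless of which thread finished first. One thread per import is the simplest thing that works for a sketch; a large repository would more likely want a bounded worker pool.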

Evaluation is still sequential, but at least some stages can be pipelined. Note that this assumes the foreground typechecker does not need a type from the imported file; if it did, it would have to wait for that file first. But files that export types would be rare in a big repository where multithreading matters.

It is even possible to optimistically start evaluating imported files, though that might be wasteful because some of the imports might be conditional. There's another opportunity there: the typechecker can track whether the import is conditional or not, and kick off full evaluation of non-conditional imports already.
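
Tracking conditionality could look roughly like the sketch below. Everything in it is hypothetical: `Expr` is a tiny made-up expression tree (RCL's real AST is much richer), and `FoundImport` and `find_imports` are names invented for illustration. The walk carries a `conditional` flag down the tree and sets it when it enters a branch of an if-else. The condition itself is always evaluated, so imports there keep the flag of the surrounding context.

```rust
/// A tiny hypothetical expression tree, just enough to show the idea.
pub enum Expr {
    Import(String),
    Literal(i64),
    List(Vec<Expr>),
    /// Condition, then-branch, else-branch.
    IfElse(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// An import found during the walk, with whether it might not be evaluated.
#[derive(Debug, PartialEq, Eq)]
pub struct FoundImport {
    pub path: String,
    pub conditional: bool,
}

/// Collect all imports in `expr` in source order. `conditional` says
/// whether `expr` itself sits in a position that might be skipped.
pub fn find_imports(expr: &Expr, conditional: bool, out: &mut Vec<FoundImport>) {
    match expr {
        Expr::Import(path) => out.push(FoundImport {
            path: path.clone(),
            conditional,
        }),
        Expr::Literal(_) => {}
        Expr::List(items) => {
            for item in items {
                find_imports(item, conditional, out);
            }
        }
        Expr::IfElse(cond, then_branch, else_branch) => {
            // The condition always runs if the if-else itself runs.
            find_imports(cond, conditional, out);
            // At most one of the branches runs, so both are conditional.
            find_imports(then_branch, true, out);
            find_imports(else_branch, true, out);
        }
    }
}
```

A scheduler built on this could start full speculative evaluation for entries with `conditional == false` and limit itself to loading and typechecking for the rest. A real version would presumably also have to treat other constructs that may skip evaluation, such as function bodies, as conditional.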