Hello everybody,

I'd like to announce a new project to develop a code generator that emits WebAssembly:

https://github.com/remixlabs/wasicaml

With the support of RemixLabs, I have already created a first version that takes OCaml bytecode as input and translates it to WebAssembly. While this approach probably doesn't lead to the fastest code, it is easy to accomplish, and it demonstrates the challenges (and already shows how to solve many of the subproblems along the way).

To be precise, the target of the translator is wasm32-unknown-wasi, i.e. the WASI ABI. This ABI is still in early development, but it already provides the syscalls (or rather, host calls) to access files, get the current time, and read the environment. This is almost enough to run a compiler: I only had to add system() so that ocamlc can start external preprocessors. Also, because current wasm implementations still lack exception handling, I had to assume that the host emulates exceptions (which is easy to provide if the host environment is JavaScript, but not necessarily in other environments).

The translator takes the OCaml bytecode as input, i.e. you first create an executable

    $ ocamlc -o myexec ...

and then make wasm out of it:

    $ wasicaml -o myexec.wasm myexec

If you omit the .wasm suffix, wasicaml will put a preamble in front of the wasm code that starts the execution:

    $ wasicaml -o myexec_wasm myexec
    $ ./myexec_wasm

Because of this trick, many problems of cross-compiling can be avoided.

You may ask what the benefits of yet another "Web" language are. We already have two emitters targeting JavaScript. Isn't that enough?

Well, two answers here. First, WASI is a proper LLVM target. Because of this, you can link code from other languages (e.g. C or Rust) with your executable. So you are not limited to OCaml but can use any language that also targets the WASI ABI. E.g. you can do

    $ wasicaml -o myexec.wasm myexec -ccopt -lfoo

to also link in libfoo.a (which must also be compiled to wasm). So it is multi-lingual from the beginning.

Second, WebAssembly can be used outside the web, too. WASI is aimed more at the command line, server plugins, and OS-independent environments in general. For example, imagine you have an Electron app with a great UI, but for some special functionality you need to include some OCaml code, too. You don't want to give up the OS-independence, and WASI now gives you a natural option to add the OCaml code. And you still have access to the filesystem without hassle.

Another example is edge computing, i.e. when the cloud is extended by computers outside the data center, and the code should be in a form that can run on as many platforms as possible.

All in all, WASI plays well when you need to combine OS-independence with a classic way of organizing the code as a command or as a server function, and you also need predictable performance.

The challenge of translating OCaml to wasm is mainly the garbage collector. Wasm doesn't permit many of the tricks ocamlopt uses to know in which memory (or register) locations OCaml values are stored. In wasm, there are no registers; the closest equivalent is local variables. However, it is not possible to scan these variables from the GC function, which makes it practically impossible to keep OCaml values there while a function is called that might trigger a GC. There is also no really cheap way of obtaining a stack descriptor.

Wasicaml inherits the stack from the bytecode interpreter and uses it as its own shadow stack for OCaml values. Because wasicaml is based on the bytecode representation of the code, the bytecode instructions already ensure that values always live in this stack when the GC might run.
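
As an illustration of the shadow-stack idea only (this is not code from wasicaml, and all names here are hypothetical), the following C sketch keeps values in an explicit array that a collector can walk. The only detail borrowed from OCaml is the tagging convention that immediate integers have their lowest bit set. The "GC" is a toy stand-in that merely shifts every pointer-like slot, to show why a copy held in a wasm local would go stale while the slot on the shadow stack stays valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value type: lowest bit 1 = immediate int, 0 = heap pointer. */
typedef intptr_t value;

#define SHADOW_STACK_SIZE 256

static value shadow_stack[SHADOW_STACK_SIZE];
static size_t shadow_sp = 0;            /* number of live slots */

/* Store a value where the collector can find it. */
static void shadow_push(value v) {
    assert(shadow_sp < SHADOW_STACK_SIZE);
    shadow_stack[shadow_sp++] = v;
}

/* Reload a value after a call that may have triggered a GC. */
static value shadow_pop(void) {
    assert(shadow_sp > 0);
    return shadow_stack[--shadow_sp];
}

/* Root enumeration: count the slots that look like heap pointers. */
static size_t count_pointer_roots(void) {
    size_t n = 0;
    for (size_t i = 0; i < shadow_sp; i++) {
        if ((shadow_stack[i] & 1) == 0) n++;
    }
    return n;
}

/* Toy stand-in for a moving collector: shift every pointer-like slot by
   delta (which must be even to preserve the tag bit). Immediate ints are
   left untouched. */
static void toy_gc_relocate(intptr_t delta) {
    for (size_t i = 0; i < shadow_sp; i++) {
        if ((shadow_stack[i] & 1) == 0) shadow_stack[i] += delta;
    }
}
```

A caller would push every pointer-like value before a call that might allocate and pop it again afterwards, instead of keeping it in a local variable across the call.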
Wasicaml additionally tries to identify values that don't need this special treatment (like ints and bools) and that are preferably stored in local variables, giving the wasm executor the freedom to put them into registers or other high-speed locations. (Unfortunately, most of the type information is already erased in the bytecode, and this is definitely one of the deficiencies of the bytecode approach.)

In order to maximize performance, it is probably best to avoid the stack whenever possible. The current approach of transforming the bytecode has not yet been taken to its conclusion with respect to such optimizations. For example, there could be more analyses that figure out when GC runs are actually possible and when it is safe to use local variables.

Another problem of the bytecode basis is that all function calls are indirect, which prevents the wasm executor from inlining functions.

As a project, I'd like to see wasicaml progress in two directions. First, make the current approach as good as possible: although basing it on the bytecode representation has its downsides, it is easy to understand, and it makes it possible to figure out what the necessary ingredients for fast code are. Second, get an idea of where a possible real wasm backend would fit into the OCaml compiler (maybe it is C--, but maybe this doesn't give us much and it is better to start from lambda).

Anyway, welcome to the new world of WebAssembly!

Gerd

--
PS. If you are interested in WebAssembly and would like to work with me on another Wasm port for some time, there is a position:
https://www.mixtional.de/recruiting/2021-01/index.html

--
PPS. Wasicaml is a project of Figly, Inc., commonly known as RemixLabs, developing a reactive low-code and code collaboration platform.
https://remixlabs.com/

--
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------