Add a new `wit-dylib` crate by alexcrichton · Pull Request #2304 · bytecodealliance/wasm-tools

@alexcrichton

This commit is the integration of a new crate into `wasm-tools` dubbed
`wit-dylib`. This is additionally integrated under a new `wasm-tools
wit-dylib` subcommand. The purpose of this crate is to create a
shared-everything dynamic library from a WIT world which implements the
world in terms of a static interface of functions. The main use case
envisioned for this is for componentizing interpreted programming
languages.

The approach taken here is that a shared-everything dynamic library is
generated purely from the input of a WIT world and some configuration
parameters. This generated dynamic library is then suitable to pass to
`wasm-tools component link` to create a full component. This dynamic
library might be further modified through means such as a GC pass or
some form of pre-initialization. This overall architecture is lifted
from `componentize-py`, which takes a very similar approach, but the
support here is disentangled from any Python specifics. More information
about this can be found in the README of the crate added here.

This new crate is integrated not only as-is but also with the start of
what is supposed to be a comprehensive test suite for the generated code.
Specifically there is a suite of `src/bin/*.rs` files, each of which
pretends to be an "interpreter" through a utility implementation shared
amongst the crates. Effectively this boxes up all WIT values into a
single `Val` representation. This enables testing all the various
runtime behaviors with high-level facilities like println-debugging,
vectors, strings, etc. Tests are modeled after `wit-bindgen test` where
there's a "caller" and a "callee" where the caller imports an interface
and the callee exports the interface. The `wasm-compose` crate composes
these together to produce a component runnable with the `wasmtime` CLI to
complete the test.
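
As a rough illustration of the "single `Val` representation" and the
caller/callee test shape described above, here is a minimal sketch. All
names in it (`Val`'s variants, `Callee`, `call_export`, `caller`) are
hypothetical and are not taken from the crate; in the real tests the
caller and callee are separate components composed together, whereas here
they are plain Rust values in one program.

```rust
/// Hypothetical dynamic value: every WIT value is boxed into one enum so a
/// test "interpreter" can handle any type with ordinary Rust facilities.
#[derive(Debug, Clone, PartialEq)]
pub enum Val {
    Bool(bool),
    U32(u32),
    S64(i64),
    String(String),
    List(Vec<Val>),
    Record(Vec<(String, Val)>),
    Option(Option<Box<Val>>),
}

/// Hypothetical "callee": stands in for the side that exports an interface.
pub struct Callee;

impl Callee {
    /// Dispatch an exported function by name over dynamically typed arguments.
    pub fn call_export(&self, name: &str, args: &[Val]) -> Result<Val, String> {
        match (name, args) {
            // `concat: func(parts: list<string>) -> string` (made-up export)
            ("concat", [Val::List(items)]) => {
                let mut out = String::new();
                for item in items {
                    match item {
                        Val::String(s) => out.push_str(s),
                        other => return Err(format!("expected string, got {other:?}")),
                    }
                }
                Ok(Val::String(out))
            }
            _ => Err(format!("unknown export or bad arguments: {name}")),
        }
    }
}

/// Hypothetical "caller": stands in for the side that imports the interface.
pub fn caller(callee: &Callee) -> Result<Val, String> {
    let args = [Val::List(vec![
        Val::String("wit".to_string()),
        Val::String("-dylib".to_string()),
    ])];
    let result = callee.call_export("concat", &args)?;
    // println-debugging works because everything is an ordinary Rust value.
    println!("callee returned {result:?}");
    Ok(result)
}
```

Calling `caller(&Callee)` passes a list of strings through the dynamic
dispatch and gets a `Val::String` back; a type mismatch or unknown export
surfaces as an `Err` rather than a trap.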

Some possible FAQ-style questions:

* **Why include this in `wasm-tools`?** - this is an interpreter-agnostic
  implementation of a component; for example, nothing is Python-specific.
  It's intended that this is neutral and low-level enough to include in
  `wasm-tools`. Developers won't be using this day in and day out, but
  it's hoped to be an integral part of componentizing interpreted languages.

* **What languages will use this?** - for now, none; it's just starting.
  The Rust crate written at `crates/wit-dylib/test-programs` is intended
  to be suitable for external use but isn't published just yet. I hope
  to dabble with Lua after this lands with the `mlua` crate and my hope
  is to work with Joel to integrate this into `componentize-py`.
  Longer-term I'd also like to integrate this into StarlingMonkey.

* **How is this used?** - the README contains a bit more information,
  but at a high level it's (a) write your interpreter and implement/use
  `wit_dylib.h`, (b) compile your interpreter as a shared library, (c)
  use `wit-dylib` for a WIT world, (d) link these together into a single
  component, and (e) profit. Various compiler flags are required to get
  this all working, but those are the high-level bits.

* **How does Wizer work?** - it doesn't; Wizer only works with core
  modules and not components. The `component-init` phase of
  `componentize-py` will need to be extracted and put somewhere
  (probably Wizer itself). In the meantime this approach of using
  shared-everything dynamic linking is incompatible with Wizer.

* **How different is this from `componentize-py`?** - very, I started
  with the same basic structure but ended up evolving relatively far
  from the specific implementation details of `componentize-py`. At a
  high-enough level the two continue to look the same but you don't have
  to go too far down to see how the implementations differ.

* **Fuzzing?** - I haven't figured this out and this implementation is
  not fuzzed yet. It's still TBD what exactly this would look like. It's
  easy enough to generate an arbitrary world and then generate a dylib
  and assert it's valid but what really wants to happen is to validate
  that the actual generated code is correct. This'll take some more
  integration work. In the meantime it's intended that the test suite is
  comprehensive enough to reproduce any bug that is found and to serve as
  a regression suite.

* **Why now?** - I had an itch and wanted to scratch it. It's expected
  that this will be a lynchpin of componentizing interpreted languages,
  but this is not all that's needed. For example `component-init` and/or
  Wizer integration is still needed. Basically this is a separable
  component I wanted to write, but there's yet more work to be done to
  fully integrate this everywhere.