This commit updates the `walrus` crate used in `wasm-bindgen`. The major
change here is how `walrus` handles element segments, exposing segments
rather than trying to keep a contiguous array of all the elements and
doing the splitting itself. That means that we need to do more logic
here in `wasm-bindgen` to juggle indices, segments, etc.
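
To illustrate the kind of index juggling this implies, here is a minimal
sketch of mapping a flat table index onto a segment and an offset within
it. The `Segment` type and the `locate`/`func_at` helpers are hypothetical
stand-ins written for this example; they are not the `walrus` API.

```rust
/// Hypothetical stand-in for one element segment: a starting table offset
/// plus the function ids stored in consecutive slots from that offset.
#[derive(Debug)]
pub struct Segment {
    pub offset: u32,
    pub funcs: Vec<u32>,
}

/// Find the segment covering `table_index`. Returns the segment's position
/// in `segments` and the index within that segment, or `None` if no
/// segment covers the slot.
pub fn locate(segments: &[Segment], table_index: u32) -> Option<(usize, usize)> {
    segments.iter().enumerate().find_map(|(i, seg)| {
        // Slots below the segment's offset cannot belong to it.
        let rel = table_index.checked_sub(seg.offset)? as usize;
        if rel < seg.funcs.len() {
            Some((i, rel))
        } else {
            None
        }
    })
}

/// Look up the function stored at `table_index`, if any segment covers it.
pub fn func_at(segments: &[Segment], table_index: u32) -> Option<u32> {
    let (seg, rel) = locate(segments, table_index)?;
    Some(segments[seg].funcs[rel])
}
```

With segments exposed individually, a lookup like this has to be done by
the consumer instead of indexing into one flat array.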
This commit adds a test suite for consuming interface types modules as
input and producing a JS polyfill output. The tests are relatively
simple today and don't exercise a ton of functionality, but they should
hopefully cover the breadth of at least some basics of what wasm
interface types supports today.
A few small fixes were applied along the way, such as:
* Don't require modules to have a stack pointer
* Allow passing `*.wat`, `*.wit`, or `*.wasm` files as input to
`wasm-bindgen` instead of always requiring `*.wasm`.
* Add tests for the interface types output of wasm-bindgen
This commit expands the test suite with assertions about the output of
the interface types pass in wasm-bindgen. The goal here is to actually
assert that we produce the right output and have a suite of reference
files to show how the interface types output is changing over time.
The `reference` test suite added in the previous PR has been updated to
work for interface types as well, generating `*.wit` file assertions
which are printed via the `wit-printer` crate on crates.io.
Along the way a number of bugs were fixed with the interface types
output, such as:
* Non-determinism in output caused by iteration of a `HashMap`
* Avoiding JS generation entirely in interface types mode, ensuring that
we don't export extraneous intrinsics that aren't otherwise needed.
* Fixing location of the stack pointer for modules where it's GC'd out.
It's now rooted in the aux section of wasm-bindgen so it's available
to later passes, like the multi-value pass.
* Interface types emission now works in debug mode, meaning the
`--release` flag is no longer required. This previously did not work
because the `__wbindgen_throw` intrinsic was required in debug mode.
This comes about because of the `malloc_failure` and `internal_error`
functions in the anyref pass. The purpose of these functions is to
signal fatal runtime errors, if any, in a way that's usable to the
user. For wasm interface types though we can replace calls to these
functions with `unreachable` to avoid needing to import the
intrinsic. This has the accidental side effect of making
`wasm_bindgen::throw_str` "just work" with wasm interface types by
aborting the program, but that's not actually entirely intended. It's
hoped that a split of a `wasm-bindgen-core` crate would solve this
issue for the future.
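
As an illustration of the `HashMap` non-determinism fix listed above, one
common remedy is to sort entries before emitting them. The sketch below
assumes a simple name-to-id map and a hypothetical `render_sorted` helper;
it is not the code from this PR.

```rust
use std::collections::HashMap;

/// Render entries in sorted key order so the output is identical across
/// runs, regardless of the iteration order of the `HashMap`.
pub fn render_sorted(map: &HashMap<String, u32>) -> String {
    let mut entries: Vec<(&String, &u32)> = map.iter().collect();
    // Sorting by (key, value) gives a stable, reproducible order.
    entries.sort();
    entries
        .iter()
        .map(|(name, id)| format!("{} = {}\n", name, id))
        .collect()
}
```

Switching the container to a `BTreeMap` would achieve the same effect
without the explicit sort.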
* Run the wasm interface types validator in tests
* Add more gc roots for adapter gc
* Improve stack pointer detection
The stack pointer is never initialized to zero, but some other mutable
globals are (TLS, thread ID, etc), so let's filter those out.
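
A rough sketch of this heuristic, using hypothetical `Global` and `ValType`
types rather than the real `walrus` ones. The `i32` restriction and the
refusal to guess when several globals match are assumptions of this
example, not something stated above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ValType {
    I32,
    I64,
}

/// Hypothetical, simplified view of a wasm global.
#[derive(Debug, Clone, Copy)]
pub struct Global {
    pub ty: ValType,
    pub mutable: bool,
    /// Constant initializer, or `None` for an imported global.
    pub init: Option<i64>,
}

/// Return the index of the only global that looks like a stack pointer:
/// mutable, `i32`, and initialized to a non-zero constant. Zero-initialized
/// mutable globals (TLS, thread ID, etc.) are filtered out.
pub fn find_stack_pointer(globals: &[Global]) -> Option<usize> {
    let mut candidates = globals.iter().enumerate().filter(|(_, g)| {
        g.mutable && g.ty == ValType::I32 && matches!(g.init, Some(v) if v != 0)
    });
    let (index, _) = candidates.next()?;
    // Assumption of this sketch: with more than one match, do not guess.
    if candidates.next().is_some() {
        return None;
    }
    Some(index)
}
```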
* Add reference output tests for JS operations
This commit starts adding a test suite which checks in, to the
repository, test assertions for both the JS and wasm file outputs of a
Rust crate compiled with `#[wasm_bindgen]`. These aren't intended to be
exhaustive or large scale tests, but rather micro-tests to help observe
the changes in `wasm-bindgen`'s output over time.
The motivation for this commit is an overhaul of how all the GC passes
work in `wasm-bindgen` today. That reorganization is included in this
commit as well.
Previously `wasm-bindgen` would, in an ad-hoc fashion, run the GC passes
of `walrus` in a bunch of places to ensure that less "garbage" was seen
by future passes. This not only was a source of slowdown but it also was
pretty brittle since `wasm-bindgen` kept breaking if extra items leaked
through.
The strategy taken in this commit is to have one precise location for a
GC pass, and everything goes through there. This is achieved by:
* All internal exports are removed immediately when generating the
nonstandard wasm interface types section. Internal exports,
intrinsics, and runtime support are all referenced by the various
instructions and/or sections that use them. This means that we now
have precise tracking of what an adapter uses.
* This in turn enables us to implement the `add_gc_roots` function for
`walrus` custom sections, which in turn allows walrus GC passes to do
what `unexport_unused_intrinsics` did before. That function is now no
longer necessary, but effectively works the same way. All intrinsics
are unexported at the beginning and then they're selectively
re-imported and re-exported through the JS glue generation pass as
necessary and defined by the bindings.
* Passes like the `anyref` pass are now much more precise about the
intrinsics that they work with. The `anyref` pass also deletes any
internal intrinsics found and also does some rewriting of the adapters
afterwards now to hook up calls to the heap count import to the heap
count intrinsic in the wasm module.
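
The strategy above can be sketched as a single reachability pass whose
roots come from the remaining exports plus whatever the adapter section
reports. This is an illustrative model only: `AdapterSection`, `FuncId`,
and `live_functions` are hypothetical, and the `add_gc_roots` method here
merely mirrors the idea of the `walrus` hook with a simplified signature.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type FuncId = u32;

/// Hypothetical adapter section that precisely tracks which functions
/// its adapters reference.
pub struct AdapterSection {
    pub used: Vec<FuncId>,
}

impl AdapterSection {
    /// Report every function this section references as a GC root.
    pub fn add_gc_roots(&self, roots: &mut BTreeSet<FuncId>) {
        roots.extend(self.used.iter().copied());
    }
}

/// One GC pass: keep exactly the functions reachable from the roots
/// through the call graph `calls` (caller -> callees).
pub fn live_functions(
    calls: &BTreeMap<FuncId, Vec<FuncId>>,
    exports: &[FuncId],
    section: &AdapterSection,
) -> BTreeSet<FuncId> {
    let mut roots: BTreeSet<FuncId> = exports.iter().copied().collect();
    section.add_gc_roots(&mut roots);

    let mut live = BTreeSet::new();
    let mut stack: Vec<FuncId> = roots.into_iter().collect();
    while let Some(func) = stack.pop() {
        // `insert` returns false if the function was already visited.
        if live.insert(func) {
            if let Some(callees) = calls.get(&func) {
                stack.extend(callees.iter().copied());
            }
        }
    }
    live
}
```

Anything not in the returned set can be dropped, which is why an internal
export that is unexported up front survives only if an adapter actually
references it.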
* Fix handling of __wbindgen_realloc
The final user of the `require_internal_export` function was
`__wbindgen_realloc`. This usage has now been removed by updating how we
handle usage of the `realloc` function.
The wasm interface types standard doesn't have a `realloc` function
slot, nor do I think it ever will. This means that as a polyfill for
wasm interface types we'll always have to support the lack of `realloc`.
For direct Rust to JS, however, we can still optionally handle
`realloc`. This is all handled with a few internal changes.
* Custom `StringToMemory` instructions now exist. These have an extra
`realloc` slot to store an intrinsic, if found.
* Our custom instructions are lowered to the standard instructions when
generating an interface types section.
* The `realloc` function, if present, is passed as an argument like the
malloc function when passing strings to wasm. If it's not present we
use a slower fallback, but if it's present we use the faster
implementation.
This should mean that there's little-to-no impact on existing users of
`wasm-bindgen`, but this should continue to work for wasm
interface types polyfills and such. Additionally the GC passes now work
in that they don't delete `__wbindgen_realloc` which we later try to
reference.
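
A minimal model of the optional `realloc` slot, assuming simplified types.
The struct layouts, the `Strategy` enum, and the method names are invented
for illustration and do not reflect the actual `wasm-bindgen` definitions.

```rust
pub type FuncId = u32;

/// Simplified stand-in for the standard instruction, which has no
/// `realloc` slot.
#[derive(Debug, PartialEq)]
pub struct StandardStringToMemory {
    pub malloc: FuncId,
}

/// Simplified stand-in for the custom instruction with an extra,
/// optional `realloc` slot.
#[derive(Debug, PartialEq)]
pub struct StringToMemory {
    pub malloc: FuncId,
    pub realloc: Option<FuncId>,
}

/// Which code path the JS glue would pick (hypothetical names).
#[derive(Debug, PartialEq)]
pub enum Strategy {
    Fast { malloc: FuncId, realloc: FuncId },
    Fallback { malloc: FuncId },
}

impl StringToMemory {
    /// Lowering to the standard form simply drops the `realloc` slot.
    pub fn lower(&self) -> StandardStringToMemory {
        StandardStringToMemory { malloc: self.malloc }
    }

    /// Use the faster path only when a `realloc` intrinsic was found.
    pub fn strategy(&self) -> Strategy {
        match self.realloc {
            Some(realloc) => Strategy::Fast { malloc: self.malloc, realloc },
            None => Strategy::Fallback { malloc: self.malloc },
        }
    }

    /// Functions the GC must keep alive for this instruction.
    pub fn gc_roots(&self) -> Vec<FuncId> {
        std::iter::once(self.malloc).chain(self.realloc).collect()
    }
}
```

Because the slot is part of the instruction, a GC pass that asks each
instruction for its roots keeps `realloc` alive whenever it is used.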
* Add an empty test for the anyref pass
* Precisely track I32FromOptionAnyref's dependencies
This depends on the anyref table and a function to allocate an index if
the anyref pass is running, so be sure to track that in the instruction
itself for GC rooting.
* Trim extraneous exports from nop anyref module
Or if you're otherwise not using anyref slices, don't force some
intrinsics to exist.
* Remove globals from reference tests
Looks like these values adjust in slight but insignificant ways over
time.
* Update the anyref xform tests
This commit is a pretty large scale rewrite of the internals of wasm-bindgen. No user-facing changes are expected as a result of this PR, but due to the scale of changes here it's likely inevitable that at least something will break. I'm hoping to get more testing in though before landing!
The purpose of this PR is to update wasm-bindgen to the current state of the interface types proposal. The wasm-bindgen tool was last updated when the proposal was still called "WebIDL bindings", so it's been a while! All support is now based on https://github.com/bytecodealliance/wasm-interface-types which defines parsers/binary format/writers/etc for wasm interface types.
This is a pretty massive PR and unfortunately can't really be split up any more afaik. I don't really expect realistic review of all the code here (or commits), but some high-level changes are:
* Interface types now consists of a set of "adapter functions". The IR in wasm-bindgen is modeled the same way now.
* Each adapter function has a list of instructions, and these instructions work at a higher level than wasm itself, for example with strings.
* The wasm-bindgen tool has a suite of instructions which are specific to it and not present in the standard (like before with WebIDL bindings).
* The anyref/multi-value transformations are now greatly simplified. They're simply "optimization passes" over adapter functions, removing instructions that are otherwise present. This way we don't have to juggle so much all over the place, and instructions always have the same meaning.
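
To make the "optimization pass over adapter functions" idea concrete, here is a small sketch. The `Instruction` variants, the `Adapter` struct, and `strip_reference_shims` are all hypothetical names chosen for this example; the real instruction set is different and larger.

```rust
/// Hypothetical adapter instructions.
#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    /// Push the adapter argument with the given index.
    ArgGet(u32),
    /// Hypothetical shim: turn a reference into a table index.
    RefToIndex,
    /// Call the core wasm function with the given id.
    CallCore(u32),
    /// Hypothetical shim: turn a table index back into a reference.
    IndexToRef,
}

/// An adapter function is just a list of instructions.
#[derive(Debug, Clone, PartialEq)]
pub struct Adapter {
    pub instructions: Vec<Instruction>,
}

/// A pass in the spirit described above: remove shim instructions that
/// are unnecessary once the module handles references natively.
/// Returns how many instructions were removed.
pub fn strip_reference_shims(adapter: &mut Adapter) -> usize {
    let before = adapter.instructions.len();
    adapter
        .instructions
        .retain(|i| !matches!(i, Instruction::RefToIndex | Instruction::IndexToRef));
    before - adapter.instructions.len()
}
```

The remaining instructions keep their meaning; the pass only deletes entries, which is what keeps such transformations simple.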
This commit switches all of `wasm-bindgen` from the `failure` crate to
`anyhow`. The `anyhow` crate should serve all the purposes that we
previously used `failure` for but has a few advantages:
* It's based on the standard `Error` trait rather than a custom `Fail`
trait, improving ecosystem compatibility.
* We don't need a `#[derive(Fail)]`, which means that's less code to
compile for `wasm-bindgen`. This notably helps the compile time of
`web-sys` itself.
* Using `Result<()>` in `fn main` with `anyhow::Error` produces
human-readable output, so we can use that natively.
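
The first point can be illustrated with the standard library alone. In the
sketch below, `Box<dyn Error>` stands in for `anyhow::Error` so the example
has no dependencies; `MissingExport`, `parse_count`, and `check` are
hypothetical names, not part of `wasm-bindgen`.

```rust
use std::error::Error;
use std::fmt;

/// A custom error that only needs `Display` and the standard `Error`
/// trait; no derive macro is involved.
#[derive(Debug)]
pub struct MissingExport(pub String);

impl fmt::Display for MissingExport {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "missing export `{}`", self.0)
    }
}

impl Error for MissingExport {}

/// Standard library errors convert with `?` because they implement the
/// same `Error` trait.
pub fn parse_count(s: &str) -> Result<u32, Box<dyn Error>> {
    Ok(s.trim().parse::<u32>()?)
}

/// Custom errors flow through the same return type.
pub fn check(exports: &[&str], name: &str) -> Result<(), Box<dyn Error>> {
    if exports.contains(&name) {
        Ok(())
    } else {
        Err(Box::new(MissingExport(name.to_string())))
    }
}
```

Both functions share one return type even though their underlying error
types are unrelated, which is the ecosystem compatibility the first bullet
refers to.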
Turns out #1704 was buggy and ended up never injecting initialization
because the anyref table was never present! This fixes that issue and
this is now tested on CI to ensure that it doesn't regress and that
future changes preserve correctness.
Ensure that we enable the new `parallel` feature in the CLI so our tools all use
parallelized parsing, but none of our specific crates need it for usage.
This commit updates the `walrus` dependency with recent upstream API
changes in `walrus` itself, namely updates to passive segments and how
memory data segments are handled.
We have very few tests today so this starts to add the basics of a test
suite which compiles Cargo projects on-the-fly which will hopefully help
us bolster the amount of assertions we can make about the output.
This commit moves `wasm-bindgen` the CLI tool from internally using
`parity-wasm` for wasm parsing/serialization to instead use `walrus`.
The `walrus` crate is something we've been working on recently with an
aim to replace the usage of `parity-wasm` in `wasm-bindgen` to make the
current CLI tool more maintainable as well as more future-proof.
The `walrus` crate provides a much nicer AST to work with as well as a
structured `Module`, whereas `parity-wasm` provides a very raw interface
to the wasm module which isn't really appropriate for our use case. The
many transformations and tweaks that wasm-bindgen does have a huge
amount of ad-hoc index management to carefully craft a final wasm
binary, but this is all entirely taken care of for us with the `walrus`
crate.
Additionally, `wasm-bindgen` will ingest and rewrite the wasm file,
often changing the binary offsets of functions. Eventually with DWARF
debug information we'll need to be sure to preserve the debug
information throughout the transformations that `wasm-bindgen` does
today. This is practically impossible to do with the `parity-wasm`
architecture, but `walrus` was designed from the get-go to solve this
problem transparently in the `walrus` crate itself. (it doesn't today,
but this is planned work)
It is the intention that this does not end up regressing any
`wasm-bindgen` use cases, either in functionality or in speed. As a
large change and refactoring, however, it's likely that at least
something will arise! We'll want to continue to remain vigilant to any
issues that come up with this commit.
Note that the `gc` crate has been deleted as part of this change, as the
`gc` crate is no longer necessary since `walrus` does it automatically.
Additionally the `gc` crate was one of the main problems with preserving
debug information as it often deletes wasm items!
Finally, this also starts moving crates to the 2018 edition where
necessary since `walrus` requires the 2018 edition, and in general it's
more pleasant to work within the 2018 edition!