This commit is a pretty large scale rewrite of the internals of wasm-bindgen. No user-facing changes are expected as a result of this PR, but due to the scale of changes here it's likely inevitable that at least something will break. I'm hoping to get more testing in though before landing!
The purpose of this PR is to update wasm-bindgen to the current state of the interface types proposal. The wasm-bindgen tool was last updated when it was still called "WebIDL bindings" so it's been awhile! All support is now based on https://github.com/bytecodealliance/wasm-interface-types which defines parsers/binary format/writers/etc for wasm-interface types.
This is a pretty massive PR and unfortunately can't really be split up any more afaik. I don't really expect realistic review of all the code here (or commits), but some high-level changes are:
* Interface types now consists of a set of "adapter functions". The IR in wasm-bindgen is modeled the same way not.
* Each adapter function has a list of instructions, and these instructions work at a higher level than wasm itself, for example with strings.
* The wasm-bindgen tool has a suite of instructions which are specific to it and not present in the standard. (like before with webidl bindings)
* The anyref/multi-value transformations are now greatly simplified. They're simply "optimization passes" over adapter functions, removing instructions that are otherwise present. This way we don't have to juggle so much all over the place, and instructions always have the same meaning.
This commit switches all of `wasm-bindgen` from the `failure` crate to
`anyhow`. The `anyhow` crate should serve all the purposes that we
previously used `failure` for but has a few advantages:
* It's based on the standard `Error` trait rather than a custom `Fail`
trait, improving ecosystem compatibility.
* We don't need a `#[derive(Fail)]`, which means that's less code to
compile for `wasm-bindgen`. This notably helps the compile time of
`web-sys` itself.
* Using `Result<()>` in `fn main` with `anyhow::Error` produces
human-readable output, so we can use that natively.
Turns out #1704 was buggy and ended up never injecting initialization
because the anyref table was never present! This fixes that issue and
this should now be tested on CI to ensure this doesn't regress and
future changes preserve correctness
This commit updates `wasm-bindgen` to the latest version of `walrus`
which transforms all internal IR representations to a list-based IR
instead of a tree-based IR. This isn't a major change other than
cosmetic for `wasm-bindgen` itself, but involves a lot of changes to the
threads/anyref passes.
This commit also updates our CI configuration to actually run all the
anyref tests on CI. This is done by downloading a nightly build of
node.js which is theorized to continue to be there for awhile until the
full support makes its way into releases.
Ensure that we enable the new `parallel` feature in the CLI so our tools all use
parallelized parsing, but none of our specific crates need it for usage.
This commit updates the `walrus` dependency with recent upstream API
changes in `walrus` itself, namely updates to passive segements and how
memory data segments are handled
This commit starts the `wasm-bindgen` CLI tool down the road to being a
true polyfill for WebIDL bindings. This refactor is probably the first
of a few, but is hopefully the largest and most sprawling and everything
will be a bit more targeted from here on out.
The goal of this refactoring is to separate out the massive
`crates/cli-support/src/js/mod.rs` into a number of separate pieces of
functionality. It currently takes care of basically everything
including:
* Binding intrinsics
* Handling anyref transformations
* Generating all JS for imports/exports
* All the logic for how to import and how to name imports
* Execution and management of wasm-bindgen closures
Many of these are separable concerns and most overlap with WebIDL
bindings. The internal refactoring here is intended to make it more
clear who's responsible for what as well as making some existing
operations much more straightforward. At a high-level, the following
changes are done:
1. A `src/webidl.rs` module is introduced. The purpose of this module is
to take all of the raw wasm-bindgen custom sections from the module
and transform them into a WebIDL bindings section.
This module has a placeholder `WebidlCustomSection` which is nowhere
near the actual custom section but if you squint is in theory very
similar. It's hoped that this will eventually become the true WebIDL
custom section, currently being developed in an external crate.
Currently, however, the WebIDL bindings custom section only covers a
subset of the functionality we export to wasm-bindgen users. To avoid
leaving them high and dry this module also contains an auxiliary
custom section named `WasmBindgenAux`. This custom section isn't
intended to have a binary format, but is intended to represent a
theoretical custom section necessary to couple with WebIDL bindings to
achieve all our desired functionality in `wasm-bindgen`. It'll never
be standardized, but it'll also never be serialized :)
2. The `src/webidl.rs` module now takes over quite a bit of
functionality from `src/js/mod.rs`. Namely it handles synthesis of an
`export_map` and an `import_map` mapping export/import IDs to exactly
what's expected to be hooked up there. This does not include type
information (as that's in the bindings section) but rather includes
things like "this is the method of class A" or "this import is from
module `foo`" and things like that. These could arguably be subsumed
by future JS features as well, but that's for another time!
3. All handling of wasm-bindgen "descriptor functions" now happens in a
dedicated `src/descriptors.rs` module. The output of this module is
its own custom section (intended to be immediately consumed by the
WebIDL module) which is in theory what we want to ourselves emit one
day but rustc isn't capable of doing so right now.
4. Invocations and generations of imports are completely overhauled.
Using the `import_map` generated in the WebIDL step all imports are
now handled much more precisely in one location rather than
haphazardly throughout the module. This means we have precise
information about each import of the module and we only modify
exactly what we're looking at. This also vastly simplifies intrinsic
generation since it's all simply a codegen part of the `rust2js.rs`
module now.
5. Handling of direct imports which don't have a JS shim generated is
slightly different from before and is intended to be
future-compatible with WebIDL bindings in its full glory, but we'll
need to update it to handle cases for constructors and method calls
eventually as well.
6. Intrinsic definitions now live in their own file (`src/intrinsic.rs`)
and have a separated definition for their symbol name and signature.
The actual implementation of each intrinsic lives in `rust2js.rs`
There's a number of TODO items to finish before this merges. This
includes reimplementing the anyref pass and actually implementing import
maps for other targets. Those will come soon in follow-up commits, but
the entire `tests/wasm/main.rs` suite is currently passing and this
seems like a good checkpoint.
LLVM's mergefunc pass may mean that the same descriptor function is used
for different closure invocation sites even when the closure itself is
different. This typically only happens with LTO but in theory could
happen at any time!
The assert was tripping when we tried to delete the same function table
entry twice, so instead of a `Vec<usize>` of entries to delete this
commit switches to a `HashSet<usize>` which should do the deduplication
for us and enusre that we delete each descriptor only once.
Closes#1264
This commit moves `wasm-bindgen` the CLI tool from internally using
`parity-wasm` for wasm parsing/serialization to instead use `walrus`.
The `walrus` crate is something we've been working on recently with an
aim to replace the usage of `parity-wasm` in `wasm-bindgen` to make the
current CLI tool more maintainable as well as more future-proof.
The `walrus` crate provides a much nicer AST to work with as well as a
structured `Module`, whereas `parity-wasm` provides a very raw interface
to the wasm module which isn't really appropriate for our use case. The
many transformations and tweaks that wasm-bindgen does have a huge
amount of ad-hoc index management to carefully craft a final wasm
binary, but this is all entirely taken care for us with the `walrus`
crate.
Additionally, `wasm-bindgen` will ingest and rewrite the wasm file,
often changing the binary offsets of functions. Eventually with DWARF
debug information we'll need to be sure to preserve the debug
information throughout the transformations that `wasm-bindgen` does
today. This is practically impossible to do with the `parity-wasm`
architecture, but `walrus` was designed from the get-go to solve this
problem transparently in the `walrus` crate itself. (it doesn't today,
but this is planned work)
It is the intention that this does not end up regressing any
`wasm-bindgen` use cases, neither in functionality or in speed. As a
large change and refactoring, however, it's likely that at least
something will arise! We'll want to continue to remain vigilant to any
issues that come up with this commit.
Note that the `gc` crate has been deleted as part of this change, as the
`gc` crate is no longer necessary since `walrus` does it automatically.
Additionally the `gc` crate was one of the main problems with preserving
debug information as it often deletes wasm items!
Finally, this also starts moving crates to the 2018 edition where
necessary since `walrus` requires the 2018 edition, and in general it's
more pleasant to work within the 2018 edition!