v1.1.0

@github-actions github-actions released this 02 May 02:53
· 656 commits to main since this release
f3f7352

wazero 1.1.0 improves debugging, reduces memory usage and adds new APIs for advanced users.

1.1 includes every change from prior versions. The notes below elaborate on the main differences, brought to you by many excellent engineers who designed, reviewed, implemented and tested this work. Many also contribute to other areas in Go, TinyGo and other languages, as well as specifications. If you are happy with the work wazero is doing for Go and Wasm in general, please star our repo as well as any projects mentioned!

Now, let's dig in!

Debug

This section is about our debug story, which is better than before thanks to several contributors!

Go stack trace in the face of Go runtime errors

Previously, when there was a bug in host code, you would get only a high-level stack trace like the one below:

2023/04/25 10:35:16 runtime error: invalid memory address or nil pointer dereference (recovered by wazero)
wasm stack trace:
        env.hostTalk(i32,i32,i32,i32) i32
        .hello(i32,i32) i64

While helpful, especially for giving the wasm context of the error, this didn't point to the specific function that had the bug. Thanks to @mathetake, we now include the Go stack trace when a Go runtime error occurs. Specifically, you can see the line that erred and quickly fix it!

2023/04/25 10:35:16 runtime error: invalid memory address or nil pointer dereference (recovered by wazero)
wasm stack trace:
        env.hostTalk(i32,i32,i32,i32) i32
        .hello(i32,i32) i64

Go runtime stack trace:
goroutine 1 [running]:
runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x64
github.com/tetratelabs/wazero/internal/wasmdebug.(*stackTrace).FromRecovered(0x140001e78d0?, {0x100285760?, 0x100396360?})
        /Users/mathetake/wazero/internal/wasmdebug/debug.go:139 +0xc4
--snip--

experimental.InternalModule

Stealth Rocket are doing a lot of great work for the Go ecosystem, including implementation of GOOS=wasip1 in Go and various TinyGo improvements such as implementing ReadDir in its wasi target.

Part of that success is the debug story. wazero already has some pretty excellent logging support built into the command-line interface. For example, you can add -hostlogging=filesystem to see a trace of all syscalls made (e.g. via wasi). This is great for debugging. Behind the scenes, this is implemented with an experimental listener.

Recently, @pelletier added StackIterator to the listener, which allows inspection of stack value parameters. This enables folks to build better debugging tools as they can inspect which values caused an exception for example. Since it can walk the stack, propagation isn't required to generate images like this, especially as listeners can access wasm memory.

image

However, in practice, stack values and memory aren't enough. For example, Go maintains its own stack in the linear memory, instead of using the regular wasm stack. The Go runtime stores the stack pointer in global 0. In order to retrieve arguments from the stack, the listener has to read the value of global 0, then the memory. Notably, this global isn't exported.

To work around this, @pelletier exposed an experimental interface, experimental.InternalModule, which can inspect such values given an api.Module. This is experimental as we still aren't quite sure if we should allow custom host code to access unexported globals. However, without this, you can't effectively debug either. If you have an opinion, please join our Slack channel and share it with us! Meanwhile, thank @pelletier and Stealth Rocket in general for all the help in the Go ecosystem, not just their help with wazero!
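
The two-step lookup described above (stack pointer from global 0, then arguments from linear memory) can be sketched roughly as follows. This is a minimal illustration, not wazero code: the byte slice and the sp value are stand-ins for what a listener would obtain from the module's memory and from global 0, readStackArg is a hypothetical helper, and the offset of 8 is only an assumed layout. The one firm fact it relies on is that wasm linear memory is little-endian.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readStackArg reads a little-endian uint64 from linear memory at sp+offset.
// It reports false when the read would fall outside the memory.
// Hypothetical helper for illustration; not part of wazero.
func readStackArg(mem []byte, sp, offset uint32) (uint64, bool) {
	start := uint64(sp) + uint64(offset)
	if start+8 > uint64(len(mem)) {
		return 0, false
	}
	return binary.LittleEndian.Uint64(mem[start : start+8]), true
}

func main() {
	// Stand-ins: a real listener would take sp from global 0 and
	// mem from the module's linear memory.
	mem := make([]byte, 64)
	sp := uint32(16) // hypothetical stack pointer value

	// Assumed layout for this sketch: the first argument lives at sp+8.
	binary.LittleEndian.PutUint64(mem[sp+8:], 42)

	v, ok := readStackArg(mem, sp, 8)
	fmt.Println(v, ok)
}
```

The bounds check matters in a debugging tool: a corrupted or not-yet-initialized stack pointer should produce a failed read, not a panic inside the listener.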

Memory Usage

This section describes an advanced internal change most users don't need to know about. Basically, it makes wazero more memory efficient. If you are interested in details, read on, otherwise thank @mathetake for the constant improvements!

Most users of wazero use the implicit compiler runtime configuration. This memory maps (mmap syscall) the platform-specific code wazero generates from the WebAssembly bytecode. Until now, it called mmap once per function in a module.

One problem with this was page size. Basically, mmap can only allocate in multiples of the page size of the underlying OS. For example, a very simple function can be as small as several bytes, but would still reserve a whole page (marked as executable and not reusable by the Go runtime). Therefore, each function wasted roughly osPageSize - (len(body) % osPageSize) bytes.

The new compiler changes the mmap scope to the module. Even though we still need to align each function on a 16-byte boundary when mmapping per module, the wasted space is much less than before. Moreover, with the code behind functions managed at module scope, it can be cleaned up with the module. We no longer have to abuse runtime.SetFinalizer for cleanup.
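
To make the difference concrete, here is a back-of-the-envelope sketch of the arithmetic. It is not wazero's allocator: the function names are made up for illustration, the 4096-byte page size is an assumption (real code would ask the OS), and the body sizes are invented.

```go
package main

import "fmt"

// roundUp rounds n up to the next multiple of align (align must be > 0).
func roundUp(n, align int) int {
	return (n + align - 1) / align * align
}

// perFunctionMapped returns the bytes mapped when every function body
// gets its own mmap, each rounded up to a whole page.
func perFunctionMapped(bodies []int, pageSize int) int {
	total := 0
	for _, n := range bodies {
		total += roundUp(n, pageSize)
	}
	return total
}

// perModuleMapped returns the bytes mapped when all bodies share one
// region: each body starts on a funcAlign boundary and only the end of
// the region is rounded up to a whole page.
func perModuleMapped(bodies []int, pageSize, funcAlign int) int {
	offset := 0
	for _, n := range bodies {
		offset = roundUp(offset, funcAlign) + n
	}
	return roundUp(offset, pageSize)
}

func main() {
	bodies := []int{10, 40, 100} // invented machine-code sizes in bytes
	const pageSize = 4096        // assumed OS page size
	const funcAlign = 16

	fmt.Println("per function:", perFunctionMapped(bodies, pageSize))
	fmt.Println("per module:  ", perModuleMapped(bodies, pageSize, funcAlign))
}
```

With these invented sizes, three tiny functions occupy three pages when mapped individually and a single page when mapped together; the alignment padding between functions is at most 15 bytes each.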

Those using Go benchmarks should see improved compilation performance, even if they report more allocations than before. One tricky thing about Go benchmarks is they can't report what happens via mmap. The net result for wazero is less wasted memory, even if you see slightly more allocations during compilation than before. These allocations are subject to GC and should be negligible in a long-running program compared to the wasted page problem in the prior implementation, which persisted until the compiled module closed.

In summary, this is another example of close attention to the big picture, including numbers that are hard to track. We're grateful for the studious eyes of @mathetake always looking for ways to improve holistic performance.

Advanced APIs

This section can be skipped unless you are really interested in advanced APIs!

Function.CallWithStack

WebAssembly is a stack-based virtual machine. Parameters and results of functions are pushed and popped from the stack, and in Go, the stack is implemented with a []uint64 slice. Since before 1.0, authors of host functions could implement exported functions with a stack-based API, which both avoids reflection and reduces allocation of these []uint64 slices.

For example, the below is verbose, but appropriate for advanced users who are ok with the technical implementation of WebAssembly functions. What you see is 'x' and 'y' being taken off the stack, and the result placed back on it at position zero.

builder.WithGoFunction(api.GoFunc(func(ctx context.Context, stack []uint64) {
	x, y := api.DecodeI32(stack[0]), api.DecodeI32(stack[1])
	sum := x + y
	stack[0] = api.EncodeI32(sum)
}), []api.ValueType{api.ValueTypeI32, api.ValueTypeI32}, []api.ValueType{api.ValueTypeI32})
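
For readers new to this convention, the following standalone sketch shows the same in-place pattern without depending on wazero. The encodeI32 and decodeI32 helpers are local stand-ins that mimic what api.EncodeI32 and api.DecodeI32 are understood to do (an i32 occupies the low 32 bits of a uint64 slot); treat that as an assumption of the sketch. The sum helper is hypothetical and exists only to drive the example.

```go
package main

import (
	"context"
	"fmt"
)

// encodeI32 and decodeI32 are local stand-ins for the api helpers:
// an i32 is stored in the low 32 bits of a uint64 stack slot.
func encodeI32(v int32) uint64 { return uint64(uint32(v)) }
func decodeI32(v uint64) int32 { return int32(v) }

// add has the same shape as the host function above: params are read
// from stack[0] and stack[1], and the result overwrites stack[0].
func add(_ context.Context, stack []uint64) {
	x, y := decodeI32(stack[0]), decodeI32(stack[1])
	stack[0] = encodeI32(x + y)
}

// sum is a hypothetical driver that builds the stack, calls add and
// decodes the result from position zero.
func sum(x, y int32) int32 {
	stack := []uint64{encodeI32(x), encodeI32(y)}
	add(context.Background(), stack)
	return decodeI32(stack[0])
}

func main() {
	fmt.Println(sum(-3, 5))
}
```

Note that negative values do not sign-extend into the upper 32 bits in this encoding, which is why decoding must go back through a 32-bit type.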

This covers host functions that don't call back into wasm. It doesn't help code that calls exported functions from the host. For example, even though the normal api.Function call doesn't use reflection, the call below allocates one slice for the params and another for the result.

Here's how the current API works to call a calculator function

results, _ := add.Call(ctx, x, y)
sum := results[0]

Specifically, the params are passed as a slice and the result comes back housed in a slice of size one. Code that makes a lot of calls from the host spends resources on these slices. @inkeliz began a design of a way out. @ncruces took the lead on implementation with a lot of feedback from others. This resulted in several designs, with the one below chosen because it aligns with the semantics of how host functions are defined.

wazero now has a second API for calling an exported function, Function.CallWithStack.

Here's how the stack-based API works to call a calculator function

stack := []uint64{x, y}
_ = add.CallWithStack(ctx, stack)
sum := stack[0]

As you can see above, the caller provides the stack, eliminating implicit allocations. The savings are most significant when calls follow a pattern in which the same stack is reused for many calls, like this:

stack := make([]uint64, 4)
for i, search := range searchParams {
	// copy the next params to the stack, as the prior call's
	// result overwrote position zero
	copy(stack, search)
	if err := searchFn.CallWithStack(ctx, stack); err != nil {
		return -1, err
	}
	if stack[0] == 1 { // found
		return i, nil // searchParams[i] matched!
	}
}
return -1, nil // no match
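
A self-contained variant of this loop, runnable without a wasm binary, might look like the sketch below. Everything in it is a stand-in: stackFunc is a hypothetical type, not wazero's api.Function, and fakeSearch is a toy function with four params (a needle and three candidates) and one result. The point it illustrates is that one stack is allocated up front and refilled before every call, because each call overwrites position zero with its result.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// stackFunc is a hypothetical stand-in for an exported function that is
// called with a caller-provided stack. It is not wazero's api.Function.
type stackFunc func(ctx context.Context, stack []uint64) error

// fakeSearch takes four params (needle, a, b, c) and writes 1 to
// stack[0] when the needle equals any candidate, else 0.
func fakeSearch(_ context.Context, stack []uint64) error {
	if len(stack) < 4 {
		return errors.New("stack too small")
	}
	needle, found := stack[0], uint64(0)
	for _, v := range stack[1:4] {
		if v == needle {
			found = 1
		}
	}
	stack[0] = found
	return nil
}

// findFirst returns the index of the first matching search, or -1.
// The stack is allocated once and reused for every call.
func findFirst(ctx context.Context, fn stackFunc, searchParams [][]uint64) (int, error) {
	stack := make([]uint64, 4)
	for i, search := range searchParams {
		if len(search) != len(stack) {
			return -1, errors.New("unexpected param count")
		}
		copy(stack, search) // refill: the prior result overwrote stack[0]
		if err := fn(ctx, stack); err != nil {
			return -1, err
		}
		if stack[0] == 1 {
			return i, nil
		}
	}
	return -1, nil
}

// runDemo is a hypothetical driver that wires the pieces together.
func runDemo(searchParams [][]uint64) (int, error) {
	return findFirst(context.Background(), fakeSearch, searchParams)
}

func main() {
	idx, err := runDemo([][]uint64{{9, 1, 2, 3}, {2, 1, 2, 3}})
	fmt.Println(idx, err)
}
```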

While most end users won't ever see this API, those using shared libraries can get wins for free. For example, @anuraaga updated go-re2 to use this internally and it significantly improved a regex benchmark, reducing its ns/op from 350 to 311 and eliminating all allocations.

We treat wazero's core APIs really seriously and don't plan to add any without a lot of consideration. We're happy to have had the help of @inkeliz and @ncruces, as well as the many other participants including @achille-roussel, @anuraaga, @codefromthecrypt and @mathetake.

emscripten.InstantiateForModule

@jerbob92 has been porting various PDF libraries to compile to wasm. The compilation process uses emscripten, which generates dynamic functions. At first, we hard-coded the set of "invoke_xxx" functions required by PDFium. This was a decision to defer design until we understood how commonly end users would need these. Recently, Jeroen began porting more PDF libraries, including ghostscript and xpdf. These needed different dynamic functions from each other. To solve this, @codefromthecrypt implemented a host function generator, emscripten.InstantiateForModule.

This works by analyzing the compiled module for functions needed, so you need to compile your binary first like this:

emscriptenCompiled, _ := r.CompileModule(ctx, emscriptenWasm)
_, err = emscripten.InstantiateForModule(ctx, r, emscriptenCompiled)

If you are doing porting work, you may find this handy. If so, please drop by our #wazero slack channel and tell us about it!

Minor changes

  • @Pryz added integration tests so that the upcoming Go 1.21 GOOS=wasip1 is tested on each change. Thanks a lot to @mathetake for support on this, too!
  • @evacchi improved the implementation of non-blocking stdin on Windows (via select(2))
  • @ncruces added internal.WazeroOnly to interfaces not open for implementation. Thanks @achille-roussel for the design support!
  • @codefromthecrypt fixed a bug on zero-length reads in WASI
  • @mathetake made numerous improvements to the amd64 compiler