Go 1.7 toolchain improvements

This is a progress report on the Go toolchain improvements during the 1.7 development cycle. All measurements were taken using a Thinkpad x220, Core i5-2520M, running Ubuntu 14.04 linux.

Faster compilation

Since Go 1.5, when the compiler itself was translated from C to Go, compile times are slower than they used to be. Everyone knows it, nobody is happy about it, and we’re working on fixing it.

A huge amount of effort in the 1.7 cycle has gone into reducing the amount of memory and the wall time the compiler uses for various benchmark jobs. The results so far are:

Compile time for full build relative to Go 1.4.3

Compile time for full build relative to Go 1.4.3

Previously I reported the current 1.7 compiler was 2x slower than Go 1.4.3 for the Jujud test. After re-benchmarking everything for this post, the slowdown is closer to 2.2x. Jujud is the largest of the three benchmarks–512 packages, vs 304 and 102 packages respectively–and shows the largest slowdown.

The cmd/go result is a little misleading as the code being compiled changes with every release, vs the fixed codebases of the other benchmarks.

Note: The benchmark scripts for jujud, kube-controller-manager, and gogs are online. Please try them yourself and report your findings.

Improved linker performance

A significant part of the build time improvements observed above come from improvements to the linker. Relative to Go 1.6, the linker is now roughly 66% faster.

Link time relative to Go 1.4.3

Link time relative to Go 1.4.3

Relative to Go 1.4.3, linking is 10% faster for any non trivial binary, and up to 30% faster for large binaries like jujud. These figures are for ELF targets only, Mach-o and PE targets have not improved as much.

This isn’t just useful for building final binaries. A faster linker improves the edit/compile/test cycle as each test is itself a program that must be linked and run. Anecdotally the linker now uses a third less memory, which is valuable when linking large binaries.

Smaller binaries

With code generation and linker improvements, binaries produced by the tip compiler are substantially smaller than Go 1.6. This work has been spearheaded by David Crawshaw.

Binary sizes

Binary size (bytes)

At this point, with the exception of its own cmd/go tool, Go 1.7 produces smaller binaries than Go 1.4.3.

Code generation improvements

The big feature for the Go 1.7 cycle is the new SSA backend for 64bit Intel.

While not the focus of this post, it would be remiss not to include some information about performance improvements in compiled code (not just the compiler’s code). Stressing of course that these are preliminary figures as there is still four months to go before the new backend becomes the default for amd64.

Go 1 benchmarks, Go 1.6 vs 683448a

Go 1 benchmarks, Go 1.6 vs 683448a

These numbers match the figures reported by Keith Randall a few weeks ago, and are in line with his thesis in the SSA design doc.

I think it would be fairly easy to make the generated programs 20% smaller and 10% faster. — khr

These improvements are not just the work of the SSA backend. The standard library and garbage collector continue to see improvements, including a 20% improvement to the fmt package by Martin Möhrmann. These benefits flow to all the platforms that Go supports.

The sole regression above is caused by a current limitation in the register optimiser which manages to registerise one less variable in the Mandelbrot inner loop.

Looking ahead

According to the release schedule, approximately one month remains before the 1.7 change window closes and the dev cycle enters the bug fix phase. There is still lots of work to do, but the improvements so far will easily make Go 1.7 the best Go release to date.