Go is not C, so there is no extremely fast way to merge slices
The builtin make function may be used to allocate a slice. Because Go is a memory-safe language that tries to remove the unspecified behaviors found in C, all the elements allocated by a make call are reset to zero within that call.
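The zeroing guarantee can be observed directly. The following is a minimal sketch; the helper name allZero is made up for this illustration and is not part of the original article. It checks both the elements within the length and the spare elements up to the capacity.

```go
package main

import "fmt"

// allZero is a hypothetical helper: it reports whether
// every element of s holds the zero value.
func allZero(s []int) bool {
	for _, v := range s {
		if v != 0 {
			return false
		}
	}
	return true
}

func main() {
	// Length 4, capacity 8: make zeroes the whole backing array.
	s := make([]int, 4, 8)
	fmt.Println(allZero(s))          // the first len(s) elements
	fmt.Println(allZero(s[:cap(s)])) // the spare capacity as well
}
```

Both checks report true, because the language specification requires newly allocated slice elements to hold zero values.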
Go toolchain 1.15 introduced an optimization that avoids resetting some of the elements allocated by a make call when the call appears in code like the following.
// x is a slice which element type is T.
// n is an integer expression, either constant or not.
y = make([]T, n)
copy(y, x)
// Only the elements within y[len(x):] will be reset.
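The pattern above can be wrapped in a small function. This is only a sketch: cloneWithLen is a hypothetical name chosen for illustration, and whether the compiler actually applies the optimization to a given call is an implementation detail that the program cannot observe. Only the resulting values are visible.

```go
package main

import "fmt"

// cloneWithLen is a hypothetical helper that uses the
// make+copy pattern described above. It returns a new slice
// of length n whose leading elements are copied from x.
// n must not be negative, otherwise make panics.
func cloneWithLen(x []byte, n int) []byte {
	y := make([]byte, n)
	copy(y, x) // copies min(len(y), len(x)) elements
	return y
}

func main() {
	x := []byte{1, 2, 3}
	fmt.Println(cloneWithLen(x, 5)) // [1 2 3 0 0]
	fmt.Println(cloneWithLen(x, 2)) // [1 2]
}
```

With or without the optimization, the elements in y[len(x):] are zero; the optimization only skips the redundant zeroing of elements that copy overwrites anyway.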
In other scenarios, make calls still reset all the allocated elements, which is often unnecessary in practice. One example is the make call in the following code, where every element is overwritten by the subsequent append calls.
func MergeSlices(data ...[]int) []int {
	n := 0
	for _, s := range data {
		n += len(s)
	}
	r := make([]int, 0, n)
	for _, s := range data {
		r = append(r, s...)
	}
	return r
}
That means there is no extremely fast way to merge slices in Go.
How large is the performance penalty caused by the unnecessary element resetting? From 1% to 45%, depending on the sizes of the merged slices and the CPU model. For example, on my machine, the penalty for merging the three slices shown in the following code is about 20%.
package main

import "testing"

var (
	x = make([]byte, 10000)
	y = make([]byte, 20000)
	z = make([]byte, 50000)
	w = MergeSlices(x, y, z)
	r [128][]byte
)

func MergeSlices(data ...[]byte) []byte {
	type _ int
	n := 0
	for _, s := range data {
		n += len(s)
	}
	r := make([]byte, 0, n)
	for _, s := range data {
		r = append(r, s...)
	}
	return r
}

func ClearSlice(data []byte) []byte {
	type _ int
	// The following loop will be optimized
	// as a memclr call internally.
	for i := range data {
		data[i] = 0
	}
	return data
}

func Benchmark_MergeSlices(b *testing.B) {
	for i := 0; i < b.N; i++ {
		r[i&127] = MergeSlices(x, y, z)
	}
}

func Benchmark_ClearSlice(b *testing.B) {
	for i := 0; i < b.N; i++ {
		r[i&127] = ClearSlice(w)
	}
}
The benchmark result:
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
Benchmark_MergeSlices-4     66850    17283 ns/op    81920 B/op    1 allocs/op
Benchmark_ClearSlice-4     375154     2998 ns/op        0 B/op    0 allocs/op
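The overhead is estimated as the clearing time divided by the merge time that would remain without clearing:
2998.0 / (17283 - 2998) == 0.21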
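The arithmetic can be reproduced with a few lines of Go. This is a sketch only; the function name overhead is made up for illustration, and it assumes, as the estimate above does, that the clearing cost measured by Benchmark_ClearSlice can simply be subtracted from the merge time.

```go
package main

import "fmt"

// overhead is a hypothetical helper. Given the measured merge
// time and the measured clearing time (both in ns/op), it
// returns the clearing time as a fraction of the merge time
// that would remain if no clearing happened.
func overhead(mergeNs, clearNs float64) float64 {
	return clearNs / (mergeNs - clearNs)
}

func main() {
	// Figures taken from the benchmark result above.
	fmt.Printf("%.2f\n", overhead(17283, 2998)) // 0.21
}
```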
(The methodology used in this article is rough, so I accept Axel Wagner's criticism.)