A Gentle Introduction to Generics in Go

The release of generics in Go 1.18 is a major change to the language. How do generics work? How do they impact performance? When do they make sense? A beginner-friendly wrap-up.

May 1, 2022 in Go

This article is a beginner-friendly introduction to the fundamental idea of generics and their implementation in Go, as well as a TL;DR for the various performance discussions around them. First, we'll take a look at the core problem addressed by generics.

用中文阅读:简单易懂的 Go 泛型使用和实现原理介绍

The Problem

Let's say we want to implement a simple tree data structure, consisting of nodes. Each node holds a value. Prior to Go 1.18, the typical way to implement this looked as follows:

type Node struct {
    value interface{}
}

This works fine in most cases, but it comes with a few drawbacks.

First, interface{} can be anything. If we wanted to restrict the possible types value can hold, for example integers and floats, we would be forced to check this at runtime:

func (n Node) IsValid() bool {
    switch n.value.(type) {
        case int, float32, float64:
            return true
        default:
            return false
    }
}

There was no possibility to restrict the allowed types at compile time. Type switches like the one above were common practice in many Go libraries. Here's a real-world example.

Second, working with the value in the Node is verbose and error-prone. Doing anything with the value always involves some kind of type assertion, even if you can safely assume that value holds an integer:

number, ok := node.value.(int)
if !ok {
    // ...
}

double := number * 2

These are only some of the inconveniences of working with interface{}, which provides no type safety and bears the risk of critical runtime errors.

The Solution

Instead of accepting every data type or a concrete type, we define a "placeholder type" called T as the type of value. Note that the code won't compile yet.

type Node[T] struct {
    value T
}

The generic type T needs to be declared first, which is done using square brackets after the struct or function name.

At this time, T could be any type. Only when instantiating a Node with an explicit type, T will be that type.

n := Node[int]{
    value: 5,
}

The generic Node is instantiated as a Node[int] (speak: node of integer), so T is an integer.

Type Constraints

In fact, the declaration of T is lacking a required information: A type constraint.

Type constraints are used to further restrict the possible types that can be used as T. Go itself provides some pre-defined type constraints, but self-defined types can be used as well.

type Node[T any] struct {
    value T
}

The any type constraint allows T to be literally any type. In case node values need to be compared at some point, there's a comparable type constraint. Types that fulfill this pre-defined constraint can be compared using ==.

type Node[T comparable] struct {
    value T
}

Any type can be used as a type constraint. Go 1.18 introduced a new interface syntax, where other data types can be embedded:

type Numeric interface {
    int | float32 | float64
}

This means an interface may not only define a set of methods, but also a set of types. Using the Numeric interface as a type constraint implies that value can either be an integer or a float:

type Node[T Numeric] struct {
    value T
}

Regaining Type Safety

The huge advantage of generic type parameters, as opposed to using something like interface{}, is that the eventual type of T will be known at compile time. Defining a type constraint for T entirely removes the need for runtime checks. In case the type used as T doesn't satisfy the type constraint, the code simply won't compile.

When writing generic code, you can act as if you already knew the final type of T:

func (n Node[T]) Value() T {
    return n.value
}

The function above returns n.Value, which is of type T. Therefore, the return value is T – and if T is an integer, the return type is known to be int. The return value can thus directly be used as an integer, without any type assertions.

n := Node[int]{
    value: 5,
}

double := n.Value() * 2

Regaining type safety at compile time makes Go code more reliable and less error-prone.

Use Cases for Generics

Typical use cases for generics are listed in When To Use Generics by Ian Lance Taylor. It comes down to three main scenarios:

  1. Working with built-in container types such as slices, maps, and channels.
  2. Implementing general-purpose data structures such as linked lists or tree structures.
  3. Writing a function whose implementation looks the same for many types, such as a sorting function.

In general, consider using generics when you don't want to make assumptions about the contents of the values you're operating upon. The Node from our example doesn't care too much about the value it holds.

A scenario where generics wouldn't be an appropriate choice is when there are different implementations for different types. Also, don't change function signatures with interfaces like Read(r io.Reader) to generic signatures like Read[T io.Reader](r T).

Performance

To understand the performance of generics and their implementation in Go, you first need to understand the two most common ways of implementing generics in general.

This is a TL;DR of the various performance deep dives and the discussions around them. It is unlikely that the performance characteristics of generics in Go are relevant to you.

Virtual Method Table

One way of implementing generics into a compiler is to use a virtual method table. The generic function is modified in a way that it only accepts pointers as arguments. The values are then allocated on the heap, and pointers to these values are passed to the generic function. This works because a pointer always looks the same, regardless of the type it is pointing to.

If those values are objects, and the generic function needs to invoke methods of these objects, it can't do so any longer. The function only has a pointer to the objects and doesn't know where their methods live. Consequently, it needs a table where the memory addresses of the methods can be looked up: the virtual method table. This so-called dynamic dispatch was already being used by interfaces in Go and languages like Java.

Virtual method tables may not only be used to implement generics, but also other kinds of polymorphism. However, dereferencing those pointers and invoking virtual functions is slower than calling the functions directly, and using virtual method tables prevents the compiler from performing optimizations.

Monomorphization

A much simpler approach is monomorphization, where the compiler generates a copy of the generic function for each data type it is invoked with.

func max[T Numeric](a, b T) T {
    // ...
}

larger := max(3, 5)

Since the max function shown above is invoked with two integers, the compiler will generate a copy of max for int when monomorphizing the code:

func maxInt(a, b int) int {
    // ...
}

larger := maxInt(3, 5)

The great advantage is that monomorphization results in significantly better runtime performance than using virtual method tables. Not only are direct method calls more efficient, they also enable a whole chain of compiler optimizations. The price for this has to be paid at compile time, though. Generating copies of the generic function for all relevant types is time-consuming.

Go's Implementation

Which of these two approaches is best suited for Go? Fast compilation is important, but so is runtime performance. To meet these demands, the Go team decided to take a hybrid approach when implementing generics.

Go uses monomorphization, but tries to reduce the number of function copies that need to be generated. Instead of creating a copy per type, it generates a copy per layout in memory: int, float64, Node, and other so-called "value types" all look different in memory, hence the generic function will be copied for all of these types.

In contrast to value types, pointers and interfaces always have the same layout in memory. The compiler will generate one copy of the generic function for calls with pointers and interfaces. Just as with virtual method tables, the generic function receives pointers and therefore needs a table for dynamically looking up method addresses. What's referred to as dictionaries in the Go implementation comes with the same performance characteristics of virtual method tables.

Conclusion

The consequence of this hybrid approach is that you get the performance benefits of monomorphization for calls with value types, and pay the costs of virtual method tables for calls with pointers or interfaces.

What's often been missed out in the performance discussions is that all these benefits and costs only concern the invocation of a function. Usually, most of the execution time is spent inside the function. The performance overhead of calling methods probably won't be a bottleneck for you, and even if it is, consider optimizing the function body first.

Further reading:
Vicent Marti: Generics can make your Go code slower (PlanetScale)
Andy Arthur: Generics and Value Types in Golang (Dolthub)
Virtual method table (Wikipedia)
Monomorphization (Wikipedia)
Dynamic dispatch (Wikipedia)

Implications on the Standard Library

It was a deliberate decision to not change the standard library as part of Go 1.18. The current plan is to gather experience with generics, learn how to use them appropriately, and figure out reasonable use cases within the standard library.

There are some proposals for generic packages, functions, and data structures:

  • constraints, providing type constraints (#47319)
  • maps, providing generic map functions (#47330)
  • slices, providing generic slice functions (#47203)
  • sort.SliceOf, a generic sort implementation (#47619)
  • sync.PoolOf and other generic concurrent data structures (#47657)
💫graph: A library for creating generic graph data structures and modifying, analyzing, and visualizing them.
💫
graph: A library for creating generic graph data structures and modifying, analyzing, and visualizing them.