Testing Go at Stream

10 min read
Federico R.
Published October 2, 2017 Updated August 31, 2020

Stream’s API is used in production by more than 500 companies and 200 million end users. While we like to move fast, we definitely don’t like to break things.

An extensive test infrastructure enables us to move quickly and deploy code with confidence. A solid testing workflow is essential to staying productive as your team and codebase grow.

Most of the services powering Stream are written in Go. This blog post covers in detail what we learned writing tests for a large Go codebase.

Our testing workflow

Testing is a core part of our development process. Not a single line of code is deployed to a live system before it’s properly tested and peer reviewed. Our workflow looks like this:

  1. Implement the new feature with TDD using Go’s testing package, alongside some goodies from stretchr/testify
  2. Write integration tests and acceptance tests with onsi/ginkgo and our own soon-to-be-released bdd library
  3. Wait for the green light from Travis CI
  4. Check test coverage on Codecov
  5. Ship the new feature to staging

Unit testing

We follow a TDD approach and use Go’s standard library testing package to write all of our tests: its simplicity matches idiomatic Go code and, more importantly, it works perfectly with table tests, which we use extensively.

As experienced programmers already know, table tests are a simple way to run multiple input/output checks against a method or behavior with a small code footprint. Also, thanks to Go’s anonymous and inline structs, the code stays simple and easy to extend:

testCases := []struct {
  input          string
  expectedOutput int
}{
  {
    input:          "aabbcc",
    expectedOutput: 3,
  },
  {
    input:          "abcdefg",
    expectedOutput: 1,
  },
  ...
}
for _, tc := range testCases {
  output := getRepetitions(tc.input)
  assert.Equal(t, tc.expectedOutput, output)
}

Most Gophers will be familiar with Go’s pattern for handling errors, which also applies when writing tests: call a method, grab the error, check whether it’s nil, and act accordingly (happy paths to the left!).

Writing tests with the testing package goes the same way, but we decided to add something on top of it. Let’s say we’re testing an IsValid(*http.Request) (bool, error) function that checks whether the headers of an HTTP request are well-formed, and we’re working with a table-based test:

func TestIsValid(t *testing.T) {
        ...
        for _, tc := range testCases {
                t.Run(tc.name, func(t *testing.T) {
                        valid, err := IsValid(tc.req)
                        if tc.shouldError && err == nil {
                                t.Fatal("expected error, got none")
                        }
                        if !tc.shouldError && err != nil {
                                t.Fatalf("expected no error, got %s", err)
                        }
                        if valid != tc.expected {
                                t.Errorf("expected valid to be %t, got %t", tc.expected, valid)
                        }
                })
        }
}

We found that the usual idiomatic Go error checking adds too much noise in tests, which is why we adopted testify’s assert and require packages across all of our unit tests. The previous snippet becomes:

import (
        "github.com/stretchr/testify/assert"
        "github.com/stretchr/testify/require"
)
func TestIsValid(t *testing.T) {
        ...
        for _, tc := range testCases {
                t.Run(tc.name, func(t *testing.T) {
                        valid, err := IsValid(tc.req)
                        if tc.shouldError {
                                require.Error(t, err)
                        } else {
                                require.NoError(t, err)
                        }
                        assert.Equal(t, tc.expected, valid)
                })
        }
}

The test is now cleaner and easier to extend.

As a general rule, we rely on require when testing preconditions (e.g., checking the error returned by a function before continuing) and assert for actual logic assertions. This ensures that our tests always fail with meaningful results, and we don’t waste time debugging our own tests.

In some places across our codebase we went further and use the full testify/suite package, which helps us write certain specialized tests faster.

Pro Tip: When working with table tests, include a name field in your table containing a brief description of the entry being tested: using it as the name parameter of the testing.T.Run method lets you and your teammates understand tests at a glance, both when reading the code and when looking at the test command output.

func TestTable(t *testing.T) {
        testCases := []struct {
                name        string
                dividend    float64
                divisor     float64
                shouldError bool
        }{
                {
                        name:        "dividing by zero",
                        shouldError: true,
                        dividend:    42,
                        divisor:     0,
                },
                {
                        name:        "dividing by odd numbers",
                        shouldError: false,
                        dividend:    42,
                        divisor:     21,
                },
                ...
        }
        for _, tc := range testCases {
                t.Run(tc.name, func(t *testing.T) {
                        ...
                })
        }
}
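Outside the test harness, the same named table-driven pattern can be exercised with a plain loop. Here is a self-contained sketch, where divide is a hypothetical function (not from the original post) chosen to match the table above:

```go
package main

import (
	"errors"
	"fmt"
)

// divide is a hypothetical function matching the table above:
// it fails when the divisor is zero.
func divide(dividend, divisor float64) (float64, error) {
	if divisor == 0 {
		return 0, errors.New("division by zero")
	}
	return dividend / divisor, nil
}

func main() {
	// The same named table structure used in the test above.
	testCases := []struct {
		name        string
		dividend    float64
		divisor     float64
		shouldError bool
	}{
		{name: "dividing by zero", dividend: 42, divisor: 0, shouldError: true},
		{name: "dividing by odd numbers", dividend: 42, divisor: 21, shouldError: false},
	}
	for _, tc := range testCases {
		_, err := divide(tc.dividend, tc.divisor)
		fmt.Printf("%s: got error=%v, want error=%v\n", tc.name, err != nil, tc.shouldError)
	}
}
```

Adding a new case is a one-line change to the table, which is exactly why this pattern scales so well.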

Mocking

Writing mocks and “sandboxed” behaviors is a vital part of unit testing, which becomes increasingly important as the complexity of your code grows.

We decided to use two approaches to mocking, depending on the "cost" of the process: direct dependency injection or the GoMock framework.

Dependency injection

Dependency injection is a common software design pattern in Go, and it comes in handy when dealing with mocking. By using interfaces, arguably Go's most powerful feature, stubbing behaviors and mocking components becomes quite straightforward.

Let’s say we have this DB component which performs caching on Redis:

type DB struct {
        cache *redis.Client
}
func (d *DB) Get(query string) (string, error) {
        ...
        cached, err := d.cache.Get(query).Result()
        if err == nil {
                // cache hit: return the cached value
                return cached, nil
        }
        ...
}

When testing the actual behavior of the DB.Get method, at some point we probably won’t need a real Redis connection: we then need to mock the behavior of the cache field in our DB struct. We can easily rearrange the previous code so that it uses an interface we can re-implement:

type DB struct {
        cache Cache
}
type Cache interface {
        Get(string) (string, error)
}
type RedisCache struct {
        client *redis.Client
}
func (c *RedisCache) Get(key string) (string, error) {
        v, err := c.client.Get(key).Result()
        if err == redis.Nil {
                return v, nil
        }
        return v, err
}

We then rearrange the affected parts of DB.Get accordingly:

func (d *DB) Get(query string) (string, error) {
        cached, err := d.cache.Get(query)
        if err != nil {
                return "", err
        }
        if cached != "" {
                return cached, nil
        }
        ...
}

We can now mock the cache in our tests by just implementing the Cache interface with a mock structure which behaves as we need:

type mockCache struct{}
func (m mockCache) Get(string) (string, error) {
        return "", nil
}
func TestDBGet(t *testing.T) {
        mock := mockCache{}
        db := &DB{cache: mock}
        db.Get(...)
        ...
}
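The mock can also be made configurable, so each test case controls the cache’s behavior and can assert on the interaction afterwards. A minimal self-contained sketch, where the lookup function is a hypothetical stand-in for the real datastore query (not Stream’s actual code):

```go
package main

import "fmt"

// Cache is the interface from the dependency-injection example above.
type Cache interface {
	Get(query string) (string, error)
}

// mockCache returns canned values configured per test case and records
// every query it receives, so tests can assert on the interaction.
type mockCache struct {
	value   string
	err     error
	queries []string
}

func (m *mockCache) Get(query string) (string, error) {
	m.queries = append(m.queries, query)
	return m.value, m.err
}

// DB mirrors the struct above; lookup is a hypothetical stand-in for
// the real datastore query, injected to keep the sketch self-contained.
type DB struct {
	cache  Cache
	lookup func(query string) (string, error)
}

func (d *DB) Get(query string) (string, error) {
	cached, err := d.cache.Get(query)
	if err != nil {
		return "", err
	}
	if cached != "" {
		return cached, nil
	}
	return d.lookup(query)
}

func main() {
	// On a cache hit the datastore must not be consulted.
	mock := &mockCache{value: "cached-answer"}
	db := &DB{cache: mock, lookup: func(string) (string, error) {
		panic("datastore should not be hit on a cache hit")
	}}
	v, _ := db.Get("some-query")
	fmt.Println(v, mock.queries)
}
```

Because the mock records the queries it receives, a test can verify not just the result but also that the cache was consulted exactly once.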

golang/mock

There are some scenarios where the previous approach would be too complicated and/or take too much time (or, even worse, obfuscate the underlying code). In such cases, we create “proper” mocks with the GoMock framework: with a single command, mockgen, it generates full type mocks which we can use in our tests by setting expectations about received calls and outputs.

Our typical use case is gRPC: refactoring our code to have simple, interface-based components would add too much noise, while with GoMock we get fully functional client and server mocks basically for free, which perfectly match the kind of tests you want to perform on an input/output system like an RPC framework.

GoMock is a powerful tool that simplifies Gophers’ lives a lot, and we encourage you to play around with it and to take a look at the examples in the project’s GitHub repository.

Integration tests

We don’t perform integration or acceptance tests with Go’s default library. Instead, we separate them from the rest of the codebase and place them in a dedicated repository with its own lifecycle and dedicated CI jobs. Doing so allows us to perform really careful feature tests, performance tuning, integration with the other moving parts involved, quality assurance tasks, and general end-to-end tests and benchmarks.

At first, we decided to use Ginkgo for these kinds of tests: Ginkgo is a great open source contribution that enables BDD for Go in a quite handy way. Go’s peculiarities make writing behavior-driven code somewhat more complicated than in other languages (like Python and Ruby), but in the end we found it a good trade-off between ease of use and effectiveness.

We tuned our setup and came up with the following structure: every feature/component that we want to test lives in a dedicated package inside a features/ folder, and we can easily run specific tests with some Makefile magic. This includes many options for targeting specific tests with regular expressions (using the --focus and --skip Ginkgo flags), triggering parallel execution, detecting races and so on.
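As a rough sketch of that setup (the target and variable names here are illustrative, not our actual Makefile), the "Makefile magic" can look something like:

```make
# Run feature tests recursively and in parallel; narrow the run with
# regular expressions, e.g.: make features FOCUS="pagination" SKIP="slow"
features:
	ginkgo -r -p --focus="$(FOCUS)" --skip="$(SKIP)" ./features/...
```

The -r flag recurses into the per-feature packages, -p runs suites in parallel, and --focus/--skip select tests by regular expression.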

But after some time we concluded that raw Ginkgo has some shortcomings for our purposes. Ultimately we wanted to write better “specs”, but we found that Ginkgo tests:

  1. Get noisy pretty quickly
  2. Don’t incentivize strong opinions when it comes to DSL and BDD best practices
  3. Don’t produce informative output by default (though this can be extended with a custom Reporter)
  4. Have output which is not really suitable for documentation
  5. Don’t have built-in reporting
  6. Have an “unfriendly” approach to fine-grained tests

So, we started experimenting and came up with a homemade library, partially built on top of Ginkgo and Gomega matchers. We simply call it bdd, and we use it everywhere. It’s going to be open sourced and publicly released as soon as possible, but it’s not there yet; consider this a sneak peek!

bdd has two goals: writing specs à la RSpec and writing feature tests à la Cucumber, with all the benefits of these two famous tools. It exposes the features we like about Ginkgo and hides the ones we don’t, while making some strong assumptions about how we want to write our “specs”:

  • A Spec function is the container of a whole set of logically-connected tests (like the ones related to a single broad feature).
  • A Spec contains any number of Describe blocks, defining “things to be tested”.
  • A Describe block contains any number of When blocks (or other Describe blocks).
  • A When block contains a scenario to be tested for the current “thing”. It can also contain sub-scenarios as Describe blocks.
  • All assertions are done inside a Should function.

Structuring tests this way is really important for formulating them as proper “sentences”, and allows us to write our tests faster while better understanding what’s going on.

Stream as a company is growing fast, which means onboarding new hires and adding more tests to our codebase. We want new teammates to feel comfortable writing tests without wasting time wondering how to write them consistently (especially during their first days at Stream). Our humble testing library is written with them in mind!

We now have a consistent, no-nonsense “testing grammar” that is familiar to every developer in the company. Having a library that ensures your tests are always aligned with your teammates’ results in better code reviews, easier report checking, and overall faster development cycles.

When writing specs, it all starts with a Spec function:

var _ = Spec("Adding activities", func() {
        When("adding to a feed group that is not configured", func() {
                var err error
                BeforeEach(func() { feedGroup = "user_bogus_group" })
                JustBeforeEach(func() {
                        _, err = feed.AddActivity(&gestream.Activity{
                                Actor:  "john",
                                Verb:   "like",
                                Object: "apples",
                        })
                })
                Should("return a clear error", func() {
                        Expect(err).To(HaveOccurred())
                        Expect(err).To(BeFeedConfigExceptionError())
                        Expect(err).To(BeAnErrorWithDetailMsg("user_bogus_group feed group does not exists"))
                })
        })
})

As mentioned before, we ensure that a Describe block always contains some number of When blocks, and that a When block always contains some Should blocks.

We also make sure that the string descriptions for every block don’t stutter with the verb they refer to, so that we always get nice output: we think that reading test output is really important down the road, so every single expectation should be formulated to read as proper English.

We want our tests to be familiar to read even months after writing them: you really don’t want to read awful things like «when if I am logged in I has there is a logout button», but rather

«when I am logged in, the logout button should be visible».

Speaking of output, we wrote a custom Ginkgo reporter inspired by RSpec’s documentation format, and it looks something like this in the console:

FlatFeeds Suite
  Flat Feeds
    ...
    adding activities:
      when verb field is missing
        should return an explicit error
      when verb field is too long
        should return an explicit error
      when no pagination is specified
        should return 25 activities by default
        when using the id_lt pagination parameter
          should have id_lt equal to the second activity from the top
        when using the id_gt pagination parameter
          should have id_gt equal to the second activity from the top
    ...
Finished in 3.8849 seconds
142 examples, 0 failures, 3 skipped
PASS

And this is combined with HTML reporting and integration with our CI, so that we always know what’s happening, when, where, and (rarely 😏) why it’s failing. On top of this, we’re busy working on the second purpose of bdd, which is Cucumber-like feature tests.

This part is completely homemade and doesn’t use Ginkgo (although it does use Gomega matchers), and it allows us to write feature files directly in Go, staying as close as possible to the neat feature files you’re probably familiar with.

Feature("sample feature", func() {
        Scenario("login",
                Given("i am not logged in", givenNotLoggedIn()).
                        And("i am on the home page", givenOnHomePage()).
                        And("i am using a mobile browser", givenOnMobileBrowser()),
                When("entering my email and password", enterEmailAndPassword()),
                Then("i should be logged in", func() {
                        Expect(me).To(BeLoggedIn())
                }),
        )
})

This is still in active development: you’ll hear more in the upcoming weeks...

Wrapping up

We believe that only great tests can lead to great software: writing tests is a demanding task, and there are no universal rules for it. Our workflow is in a constant process of refinement and tweaking, because as we write more code, we discover new ways of testing it.

Writing tests in Go is generally a fun experience, thanks to the simplicity of the language: the standard library offers a streamlined testing environment, and the vibrant Gopher community has come up with nice tools built on top of it. Nevertheless, Go is a young language, and when it comes to advanced testing techniques it’s far from offering the conveniences you find in other languages like Python or Ruby: there’s a lot of room for experimenting and trying out new concepts, which is why we’re investing a lot of effort in developing our own testing gear.

So now you know how we deal with testing in Go, and we hope you’re inspired to write better code and maybe adjust your programming routines. Good luck!