Go streams

Go 1.18 was just released, which means that Go now officially supports generics. Out of curiosity I decided to look into creating a library which implements something similar to Java streams. The goal of my simple implementation was to support processing the elements of a slice using two operations: map and filter.

If you just want to see the code then you can find the repository here. I don't show the implementation in this article, as to be frank it is not that interesting. We will only look at the code which results from using this library. That way we can assess if using similar libraries would make Go programs more readable.

Filter

In this section we will try to filter a slice of integers, leaving only even numbers in the resulting slice. First let's have a look at a piece of code which does this using the standard approach with a for loop:

func OnlyEven(slice []int) []int {
    var result []int

    for _, element := range slice {
        if element%2 == 0 {
            result = append(result, element)
        }
    }

    return result
}

I am sure most people have written hundreds of similar functions when programming in Go. Such functions are quite repetitive to write, but they are easily recognizable and usually just as simple as the one that I just presented.

Now let's have a look at the same function implemented using the library which I wrote:

func OnlyEven(slice []int) []int {
    return streams.New(slice).
        Filter(onlyEven).
        Collect()
}

func onlyEven(v int) bool {
    return v%2 == 0
}

For readability I replaced an anonymous function with a named function. I believe this code is relatively easy to read, but it is hard for me to tell if it is more readable than the previous function. It does however seem to be faster to write and contains less of the usual boilerplate which may harm readability.

Overall I believe that a chain of filter calls could be more readable than many statements inside of a for loop under certain circumstances. That being said I don't think that normal for loops are annoying to write or hard to read so it is hard to tell if this solves any real problem.
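
To give a rough idea of how a chain like this can be evaluated without building a new slice at every step, here is a minimal sketch. It is not the library's actual implementation: the Stream type, its next field and the bodies of New, Filter and Collect are all assumptions made for illustration, and only the call chain mirrors the example above.

```go
package main

import "fmt"

// Stream is a hypothetical lazy sequence of T (an assumption, not the
// library's real type). next returns the next element, or false once the
// sequence is exhausted.
type Stream[T any] struct {
    next func() (T, bool)
}

// New wraps a slice in a Stream without copying it.
func New[T any](slice []T) Stream[T] {
    i := 0
    return Stream[T]{next: func() (T, bool) {
        if i >= len(slice) {
            var zero T
            return zero, false
        }
        v := slice[i]
        i++
        return v, true
    }}
}

// Filter returns a Stream which yields only the elements for which keep
// returns true. Nothing is evaluated until the stream is consumed.
func (s Stream[T]) Filter(keep func(T) bool) Stream[T] {
    return Stream[T]{next: func() (T, bool) {
        for {
            v, ok := s.next()
            if !ok || keep(v) {
                return v, ok
            }
        }
    }}
}

// Collect drains the stream into a new slice.
func (s Stream[T]) Collect() []T {
    var result []T
    for v, ok := s.next(); ok; v, ok = s.next() {
        result = append(result, v)
    }
    return result
}

func onlyEven(v int) bool {
    return v%2 == 0
}

func main() {
    fmt.Println(New([]int{1, 2, 3, 4, 5, 6}).Filter(onlyEven).Collect()) // prints [2 4 6]
}
```

In this sketch each Filter call only wraps the previous next function, so the single result slice is allocated in Collect.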

Filter and map

Now let's try to make the example more complicated. First we will filter the values, then map them from int to string, and then filter out values which are longer than one digit. To keep things simple we will just use len instead of resorting to something like RuneCountInString.

Again, first let's try a conventional for loop:

func OnlyEvenAsStrings(slice []int) []string {
    var result []string

    for _, element := range slice {
        if element%2 != 0 {
            continue
        }

        s := strconv.Itoa(element)
        if len(s) > 1 {
            continue
        }

        result = append(result, s)
    }

    return result
}

As you can see, I tried to structure the code in a way which avoids nesting the conditional statements. While there are many ways in which this function can be written, I think this approach makes the flow of control easier to follow. All conditional statements are clearly identifiable as conditions which filter elements out.

Now let's try to do the same with my library:

func OnlyEvenAsStrings(slice []int) []string {
    return streams.Map(
        streams.New(slice).Filter(onlyEven),
        strconv.Itoa,
    ).
        Filter(onlyOneByte).
        Collect()
}

func onlyEven(v int) bool {
    return v%2 == 0
}

func onlyOneByte(v string) bool {
    return len(v) == 1
}

Unfortunately as you can see the code suddenly becomes much less readable. In my mind the function was supposed to look like this:

func OnlyEvenAsStrings(slice []int) []string {
    return streams.New(slice).
        Filter(onlyEven).
        Map(strconv.Itoa).
        Filter(onlyOneByte).
        Collect()
}

Unfortunately this is currently impossible with the way generics in Go work. The real example looks so bad because methods can't declare extra type parameters of their own: a Map method on a stream of int would need a second type parameter for the output type, string in this case. For this reason I was forced to use a top-level function. This is quite unfortunate because I think my idealized example would make much more sense. As it stands, the resulting code is a complete failure when it comes to readability.
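
The restriction can be sketched in a few lines. The Stream type below is a hypothetical stand-in rather than the library's real type, and the bodies of New, Map and Collect are assumptions; the point is only that the output type U has to be declared on a top-level function, because as of Go 1.18 the compiler rejects a method which introduces its own type parameter.

```go
package main

import (
    "fmt"
    "strconv"
)

// Stream is a hypothetical lazy sequence used only for illustration.
type Stream[T any] struct {
    next func() (T, bool)
}

// New wraps a slice in a Stream.
func New[T any](slice []T) Stream[T] {
    i := 0
    return Stream[T]{next: func() (T, bool) {
        if i >= len(slice) {
            var zero T
            return zero, false
        }
        v := slice[i]
        i++
        return v, true
    }}
}

// The idealized chain would need a method like the one below, but Go 1.18
// does not compile it because a method cannot declare the type parameter U:
//
//    func (s Stream[T]) Map[U any](f func(T) U) Stream[U]

// Map is therefore a top-level function with both T and U declared on it.
func Map[T, U any](s Stream[T], f func(T) U) Stream[U] {
    return Stream[U]{next: func() (U, bool) {
        v, ok := s.next()
        if !ok {
            var zero U
            return zero, false
        }
        return f(v), true
    }}
}

// Collect drains the stream into a new slice.
func (s Stream[T]) Collect() []T {
    var result []T
    for v, ok := s.next(); ok; v, ok = s.next() {
        result = append(result, v)
    }
    return result
}

func main() {
    // T and U are inferred as int and string from the arguments.
    fmt.Println(Map(New([]int{1, 2, 3}), strconv.Itoa).Collect()) // prints [1 2 3]
}
```

Because Map takes the stream as its first argument, every step that changes the element type wraps the chain built so far, which is what breaks the left-to-right reading order.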

A quick note on performance

Out of curiosity I benchmarked the filter functions from the first section. The simple loop is faster than my library: roughly 193 ns/op against 312 ns/op in the results below. While my implementation may be too naive, I don't think the gap is necessarily caused by my code. My assumption is that the compiler can apply more optimizations when a simple loop is used, so a more complicated implementation will most likely always lose.

BenchmarkFilterStream
BenchmarkFilterStream-16         4389160           312.2 ns/op

BenchmarkFilterLoop
BenchmarkFilterLoop-16       6928596           193.4 ns/op

Conclusions

My implementation creates something similar to Java streams to avoid creating too many intermediate slices and therefore avoid unnecessary allocations. A simpler exercise could be to simply write map and filter functions which accept and return a slice. This would however make chaining filter and map statements less efficient.
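
For comparison, the simpler slice-based variant mentioned above could look roughly like this. The names Filter and Map and their signatures are my own choice for this sketch and are not taken from the library.

```go
package main

import (
    "fmt"
    "strconv"
)

// Filter returns a new slice holding the elements for which keep returns true.
func Filter[T any](slice []T, keep func(T) bool) []T {
    var result []T
    for _, v := range slice {
        if keep(v) {
            result = append(result, v)
        }
    }
    return result
}

// Map returns a new slice holding f applied to every element.
func Map[T, U any](slice []T, f func(T) U) []U {
    result := make([]U, 0, len(slice))
    for _, v := range slice {
        result = append(result, f(v))
    }
    return result
}

func main() {
    // Every step below allocates its own intermediate slice.
    even := Filter([]int{2, 3, 10, 4}, func(v int) bool { return v%2 == 0 })
    strs := Map(even, strconv.Itoa)
    short := Filter(strs, func(s string) bool { return len(s) == 1 })
    fmt.Println(short) // prints [2 4]
}
```

The three intermediate slices in main are exactly the allocations that a lazy stream tries to avoid.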

Overall I think a library like this makes little sense given that the map function can't be properly implemented. It is however possible that there is a way to do it that I missed. While the idealized version of the code that I presented could make chaining functions that alter the stream very readable, the real version is a complete write-off. That being said, the library-driven approach has the potential to make the code more composable.

I would not use libraries such as this one even if the idealized version becomes possible to write, as they don't seem to be in the spirit of Go. I will certainly not use this library that I just made and, to quote Rob Pike:

You shouldn't use it either.

2022-03-17