
Benchmarking SQLite Performance in Go

SQLite has a reputation for being a toy database, useful only for hobby projects and integration tests. I’ll show you why this is not true, with data to back it up.

Last updated December 2022 for Go 1.19.

Why this benchmark?

I’ve been writing about using SQLite and Go in the cloud recently. It’s a great combination of technologies that you can get really, really far with. Invariably, the same objections come up from some readers:

SQLite doesn’t scale.

We only use SQLite in our integration tests, in production we use a real database.

SQLite isn’t good for any serious production app.

I’m here to show you that those objections are plain wrong. Yes, different workloads require different solutions, and SQLite isn’t a one-size-fits-all solution. Nothing is. But raw query performance probably shouldn’t be your first concern when ruling out SQLite as your database of choice.

There are some code snippets in this article, and the full source with all the benchmark code is available on GitHub.

Have you spotted an error? Got an idea for an improvement? Open an issue on the GitHub repo and help improve this benchmark.

Benchmark design

I’m going to be looking at a few different benchmarks:

  • Raw query performance for a trivial select 1 query, to set a performance baseline.
  • Separate read and write performance for reading and creating posts on a blog.
  • Simultaneous read and write performance for reading blog posts and commenting.
  • A comparison to equivalent benchmarks in Postgres.

I’ll run all of these in parallel with various degrees of parallelism.

I could have chosen a more established database benchmark suite, but I like how easy this setup is to understand and reason about, so that’s what we’ll use.

Why Go benchmarks?

Go has benchmarking built right into the standard test tools. It does warmup for you, you can decide what part of the benchmark is setup and where to measure from, and it can run benchmarks in parallel. In other words, it’s perfect for this.

A Go benchmark is simply a specially-named function in our testing code. In its simplest form, it looks something like this:

package example_test

import "testing"

func BenchmarkSimple(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// 🏃
	}
}

You can run this benchmark with the go command:

$ go test -bench BenchmarkSimple
goos: darwin
goarch: arm64
pkg: example
BenchmarkSimple-8   	1000000000	         0.3209 ns/op
PASS
ok  	example	0.780s
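For context on that number: converting Go’s ns/op output into operations per second is just a reciprocal. A quick sketch (the helper function is mine, not part of the Go tooling):

```go
package main

import "fmt"

// opsPerSecond converts a Go benchmark's ns/op figure into operations per
// second: one second is 1e9 nanoseconds.
func opsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

func main() {
	// The BenchmarkSimple result above: 0.3209 ns/op is roughly 3.1 billion ops/s.
	fmt.Printf("%.0f ops/s\n", opsPerSecond(0.3209))
}
```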

Here, you can see we can do absolutely nothing at approximately 3,000,000,000 Hz. Cool, cool. Let’s do nothing, but in parallel:

package example_test

import "testing"

func BenchmarkParallel(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			// 🏃🏃🏃
		}
	})
}

$ go test -bench BenchmarkParallel
goos: darwin
goarch: arm64
pkg: example
BenchmarkParallel-8   	1000000000	         0.1544 ns/op
PASS
ok  	example	0.549s

That’s quite an improvement, doing nothing at about twice the speed! But let’s move on to doing something with SQLite instead.

Bonus: Go benchmark output in CSV

The benchmarks

The base setup is this:

  • We’re using the SQLite CGo driver github.com/mattn/go-sqlite3.
  • The SQLite database is accessed in WAL mode, so we can have concurrent readers (but not concurrent writers). We also use a busy timeout (so writes queue up instead of failing with "database is locked" errors when trying to take a concurrent write lock) and enable foreign key checking. The synchronous pragma is set to normal. We’re using an on-disk database, not an in-memory one.
  • For the Postgres setup, we’re using a Docker container with Postgres 15 accessed through the network on localhost.
  • The results in this article are from benchmarks run on a newly restarted Macbook Air from 2020, with M1 CPU with 8 cores, 16GB RAM, and an SSD disk.
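With the mattn/go-sqlite3 driver, these settings can be passed as DSN query parameters. A minimal sketch of building such a DSN, assuming the driver’s underscore-prefixed parameter names (the file name is made up):

```go
package main

import "fmt"

// buildDSN assembles a mattn/go-sqlite3 DSN with the settings described
// above: WAL journal mode, a busy timeout (in milliseconds), foreign key
// checking, and synchronous=NORMAL.
func buildDSN(path string) string {
	return fmt.Sprintf(
		"file:%s?_journal_mode=WAL&_busy_timeout=5000&_foreign_keys=on&_synchronous=NORMAL",
		path,
	)
}

func main() {
	fmt.Println(buildDSN("bench.db"))
}
```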

Enough talk, let’s look at some graphs.

select 1

To have a trivial baseline performance, just to see what it takes to run some Go code and have SQLite parse and run the query, we have a benchmark for running select 1 in parallel:

sqlite_test.go
func BenchmarkSelect1(b *testing.B) {
	db := setupSQLite(b, false)

	b.ResetTimer()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			_, err := db.DB.Exec("select 1")
			noErr(b, err)
		}
	})
}

This is how it looks in a bar graph, with the x-axis being the degree of parallelism, and the y-axis the frequency of query runs (in kHz):

So we get the maximum select 1 performance with a parallelism of 4, running around 1,100,000 queries per second. If you have a web app just returning ones to your readers, running a million queries per second is pretty great performance. 😬

Reading a post

Now, a more interesting example. Our database schema is just two tables, one for storing blog posts, and one for storing blog post comments:

sqlite.sql
create table posts (
	id integer primary key,
	title text not null,
	content text not null,
	created text not null default (strftime('%Y-%m-%dT%H:%M:%fZ'))
);

create table comments (
	id integer primary key,
	post_id int not null references posts (id),
	name text not null,
	content text not null,
	created text not null default (strftime('%Y-%m-%dT%H:%M:%fZ'))
);

create index comment_created_idx on comments (created);

We’ll write a single blog post to the database during setup, and read it back (and throw away the result) using this benchmark:

func BenchmarkDB_ReadPost(b *testing.B) {
	for _, withMutex := range []bool{false, true} {
		b.Run("mutex "+strconv.FormatBool(withMutex), func(b *testing.B) {
			db := setupSQLite(b, withMutex)

			b.ResetTimer()

			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					_, _, err := db.ReadPost(1)
					noErr(b, err)
				}
			})
		})
	}
}

You see that withMutex variable? It’s a flag for an optional sync.RWMutex which the sqlite.Database struct methods can use. Why? Because I wanted to see whether using a Go mutex would be faster than the built-in synchronization mechanisms in SQLite. For reads, it’s not:

It’s actually pretty much the same with and without a mutex: over 150,000 reads per second of a single blog post (which is probably cached) with parallelism 4. That’s an order of magnitude fewer queries per second than the trivial example above, but this benchmark actually does something useful!

Writing a post

So what about writing blog posts?

We’ll test pure write performance (no reads at all) by writing blog posts as fast as we can. Note that this involves a non-trivial body text, namely five paragraphs of lorem ipsum, and setting a creation timestamp with strftime:

sqlite_test.go
func BenchmarkDB_WritePost(b *testing.B) {
	for _, withMutex := range []bool{false, true} {
		b.Run("mutex "+strconv.FormatBool(withMutex), func(b *testing.B) {
			db := setupSQLite(b, withMutex)

			b.ResetTimer()

			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					err := db.WritePost("Lorem Ipsum, don't you think?", loremIpsum)
					noErr(b, err)
				}
			})
		})
	}
}

We’re using that mutex again. Because SQLite’s internals for handling attempted concurrent writes involve some polling with backoff, this time it actually makes a big difference. The results speak for themselves:

This graph shows write query performance with and without the mutex, again with various degrees of parallelism and with the frequency on the y-axis. Write concurrency without a mutex goes down a cliff from parallelism 64, but stays pretty consistently good above 20,000 writes per second with the mutex.

This is a really cool result. For pure writes, using a mutex seems to be a really good idea! And 20,000 writes per second, on a laptop from two years ago. Nice. 😎

Reading and commenting on a post

Let’s try a more real-world example. This time, we’ll introduce a chance of commenting on the blog post for every read, and call this the comment rate. When reading the blog post, we’ll get its content, along with all the comments ordered by creation date.

func BenchmarkDB_ReadPostAndMaybeWriteComment(b *testing.B) {
	for _, withMutex := range []bool{false, true} {
		b.Run("mutex "+strconv.FormatBool(withMutex), func(b *testing.B) {
			for _, commentRate := range []float64{0.01, 0.1, 1} {
				b.Run(fmt.Sprintf("comment rate %v", commentRate), func(b *testing.B) {
					db := setupSQLite(b, withMutex)

					b.ResetTimer()

					b.RunParallel(func(pb *testing.PB) {
						for pb.Next() {
							_, _, err := db.ReadPost(1)
							noErr(b, err)

							if rand.Float64() < commentRate {
								err = db.WriteComment(1, "Love it!", "Great post. :D")
								noErr(b, err)
							}
						}
					})
				})
			}
		})
	}
}

Once again, we’ll do this with and without the mutex. The result is actually surprising!

Here’s the graph for the query performance with the mutex:

Here’s the graph for the query performance without the mutex (note the same y-axis range):

Without the mutex, the peak performance is generally not quite as high as with it. But the worst-case performance isn’t as bad without the mutex, either.

I suspect this has something to do with how the RWMutex blocks readers when writing, which SQLite itself doesn’t do when writing. But on the other hand, as we saw with the pure write benchmark above, using the mutex for writes alone is generally better for high parallelism. There’s also an open issue in Go with the performance of the sync.RWMutex itself. (Thank you Komu Wairagu for the pointer on that last one!)

So what if we don’t use a RWMutex, but instead two database connection pools? One for reads (with unlimited connections) and one for writes (with just one allowed connection, so effectively a mutex)?

Reading and commenting with separate pools

The benchmark for reading and commenting on a blog post with separate database connection pools for reads and writes looks like this:

func BenchmarkDB_ReadPostAndMaybeWriteCommentWithPool(b *testing.B) {
	for _, commentRate := range []float64{0.01, 0.1, 1} {
		b.Run(fmt.Sprintf("comment rate %v", commentRate), func(b *testing.B) {
			writeDB, readDB := setupSQLitePool(b)

			b.ResetTimer()

			b.RunParallel(func(pb *testing.PB) {
				for pb.Next() {
					_, _, err := readDB.ReadPost(1)
					noErr(b, err)

					if rand.Float64() < commentRate {
						err = writeDB.WriteComment(1, "Love it!", "Great post. :D")
						noErr(b, err)
					}
				}
			})
		})
	}
}
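The setupSQLitePool helper isn’t shown here; a minimal sketch of the idea, with an assumed function shape (with github.com/mattn/go-sqlite3 blank-imported, driverName would be "sqlite3"):

```go
package main

import (
	"database/sql"
	"fmt"
)

// setupSQLitePool sketches the two-pool setup: one pool capped at a single
// connection for writes (effectively a mutex) and one uncapped pool for
// reads. The function shape is illustrative, not the article's actual code.
func setupSQLitePool(driverName, dsn string) (writeDB, readDB *sql.DB, err error) {
	writeDB, err = sql.Open(driverName, dsn)
	if err != nil {
		return nil, nil, err
	}
	// At most one writer at a time: concurrent writes queue up in Go
	// instead of polling SQLite's busy handler.
	writeDB.SetMaxOpenConns(1)

	readDB, err = sql.Open(driverName, dsn)
	if err != nil {
		return nil, nil, err
	}
	return writeDB, readDB, nil
}

func main() {
	// Without the driver blank-imported, this prints an "unknown driver" error.
	if _, _, err := setupSQLitePool("sqlite3", "file:bench.db?_journal_mode=WAL"); err != nil {
		fmt.Println(err)
	}
}
```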

The results are quite similar to the no-mutex version above, except the worst-case performance seems more consistent across different degrees of parallelism. I don’t know whether this is always the case or just a coincidence.

Reading and commenting in Postgres

So far, we’ve only seen results for SQLite. Let’s compare it to Postgres, which uses a client/server architecture over the network.

As mentioned in the introduction, for this benchmark, I started Postgres in a Docker container on the same machine where the benchmarks were running. There are some factors that could impact performance one way or another:

  • Overhead for the Docker container runtime environment.
  • I didn’t do any performance tuning of the Postgres installation; it’s all standard configuration, except for raising the max connection limit from the default 100 to 2048. Otherwise, our parallelism would make the server reject connections.

Normally, you would use Postgres over a network connection to a different machine anyway, so if anything, these numbers are probably too optimistic for this particular setup: real network latency would lower them.

As you can see, the performance is overall a bit lower than the SQLite benchmarks, but not much. I’m surprised the writes don’t scale better than in SQLite, given that Postgres happily works with concurrent writes. From parallelism 512, performance goes down the drain across comment rates, but that might have something to do with how I’ve set the max connection count. (Maybe there’s another parameter I should be adjusting?)

Just to make it really clear: Postgres is a great database, and a great choice for many applications. If you make Postgres your default choice, you will be in a really good position. Postgres can be tuned and scaled to sizes that SQLite cannot. But this is an article about SQLite benchmarks, and it’s a fun and useful comparison. 😊

Conclusion

In this article, I hope I’ve shown you that SQLite should absolutely be considered for serious production apps when it fits the overall architecture. At the very least, it shouldn’t be dismissed on the basis of performance alone.

My key takeaways are these:

  • Especially for read-heavy sites, SQLite has great performance. And remember, these benchmarks are just for a single machine! These days, you can scale SQLite horizontally in addition to vertically.
  • Even with only a single writer allowed at a time, write performance is nothing to disregard. As always, it depends. Investigate your workload and decide.
  • Even compared to Postgres, SQLite does really well. It’s about twice as fast in my artificial benchmarks, and that’s with Postgres running on the same machine. Over a network, Postgres will be slower (but obviously has many other benefits, again depending on your architecture).
  • Using mutexes and different connection pools has its benefits and drawbacks. If in doubt, I would probably use neither and just stick with one big connection pool for all connections.

This was a really fun article to research for and write, and I hope you’ve learned something from it.

I will definitely be linking to it the next time someone tells me that SQLite doesn’t scale.

About

I’m Markus 🤓✨. I’m passionate about simple & boring but useful software. So we can build the things that actually matter.

Want to learn Go web and cloud development? I’ve got online Go courses! 😎

Need someone to do it for you? I do software consulting.

