Sandro Turriate

Coder, cook, explorer

Making SQLite faster in Go

Dec 31, 2020

The most popular way of using SQLite in Go also happens to be the slowest when used in a concurrent application like a web app. Roll your own connection pool to speed things up.

TLDR

  1. Use a connection pool. Even when using the database/sql interface.
  2. Use shared cache mode.
  3. Use WAL. (Though this article only focuses on db reads.)
  4. Use Prepared Statements! And cache them (like crawshaw).

“SQLite Slow…?😕” How can that be? The standard client/server network overhead is absent, and there are no extra security precautions or complex locking; sqlite simply loads a single file in the same memory space as the executable. So why am I seeing such slow req/s reading text from the database while using Fiber, one of Go's fastest web servers?

Well, let's run some benchmarks and investigate.

So what am I doing? I'm using SQLite 3.34.0 and the excellent mattn/go-sqlite3 library. It's the de facto standard, with over 4,500 stars on GitHub, and it conforms to Go's database/sql interface. I have a single table named "pages" that contains an integer primary key (the default rowid pk) and a "body" text column. There is one record in the "pages" table, containing ~5400 bytes of text. I have a handler in fiber which runs a simple query to return the body and sends the response back to the browser. I'm using hey to run the benchmark. It sends 50 concurrent requests to my local webserver over a 5-second period:
hey -z 5s -c 50 http://localhost:3000/

I'm comparing the speed of returning the payload from SQLite against simply serving the file from disk. I hoped that the overhead of opening and closing the file on every request would make it slower than a simple SQL SELECT, but unfortunately, that wasn't the case in testing. Still, I'll be happy if SQLite approaches the speed of streaming the file back directly from the filesystem, which is around 58,000–60,000 req/s.

app := fiber.New()

app.Get("/", func(c *fiber.Ctx) error {
  file, err := os.Open("test.html")
  check(err)
  stat, err := file.Stat()
  check(err)
  ln := int(stat.Size())
  c.Type(".html")
  c.Response().SetBodyStream(file, ln)
  return nil
})
The control: reading the file from the file system returns ~60,000 req/s.

Let's see how the standard mattn/go-sqlite3 library compares.

db, err := sql.Open("sqlite3", "file:db.sqlite3?cache=shared&mode=rwc")
check(err)
defer db.Close()

//...

app.Get("/2", func(c *fiber.Ctx) error {
    row := db.QueryRow("select body from pages where id=1")
    var body []byte
    err := row.Scan(&body)
    check(err)
    c.Type(".html")
    c.Response().SetBody(body)
    return nil
})
A standard database query. We'll optimize this.

The above returns around 14,500 req/s. Not bad, but nowhere close to 60,000.
Let's try a prepared statement.

bodyGet, err := db.Prepare("select body from pages where id=?")
check(err)
defer bodyGet.Close()
//...
app.Get("/2", func(c *fiber.Ctx) error {
    row := bodyGet.QueryRow("1")
    var body []byte
    err := row.Scan(&body)
    check(err)
    c.Type(".html")
    c.Response().SetBody(body)
    return nil
})
Using a prepared statement.

By adding a cached prepared statement, I'm seeing around 16,500 req/s. A nice improvement, but not substantial.

Now let's try adding a simple connection pool.

maxConns := 5
conns := make(chan *sql.DB, maxConns)

for i := 0; i < maxConns; i++ {
  conn, err := sql.Open("sqlite3", "file:db.sqlite3?cache=shared&mode=ro")
  check(err)
  defer conn.Close()
  conns <- conn
}

checkout := func() *sql.DB {
  return <-conns
}

checkin := func(c *sql.DB) {
  conns <- c
}

//...

app.Get("/2", func(c *fiber.Ctx) error {
  db := checkout()
  defer checkin(db)
  row := db.QueryRow("select body from pages where id=1")

  var body []byte
  err := row.Scan(&body)
  check(err)
  c.Type(".html")
  c.Response().SetBody(body)
  return nil
})
Using a connection pool.

With a connection pool, I'm seeing around 23,400 req/s. Great, a steady increase! But still a ways off. As you may have noticed, the connection pool returns a DB struct, which means we've lost our cached prepared statement. Let's add it back.

maxConns := 5
conns := make(chan *sql.Stmt, maxConns)

for i := 0; i < maxConns; i++ {
  conn, err := sql.Open("sqlite3", "file:db.sqlite3?cache=shared&mode=ro")
  check(err)
  stmt, err := conn.Prepare("select body from pages where id=?")
  check(err)

  defer func() {
    stmt.Close()
    conn.Close()
  }()
  conns <- stmt
}

checkout := func() *sql.Stmt {
  return <-conns
}

checkin := func(c *sql.Stmt) {
  conns <- c
}

//...

app.Get("/2", func(c *fiber.Ctx) error {
  stmt := checkout()
  defer checkin(stmt)
  row := stmt.QueryRow(1)

  var body []byte
  err := row.Scan(&body)
  check(err)
  c.Type(".html")
  c.Response().SetBody(body)
  return nil
})
Using a pool of cached prepared statements.

43,600 req/s! Adding cached prepared statements almost doubled the previous result.
And increasing the pool from 5 to 20 connections gives us 48,000 req/s. Going from 20 to 100 squeaks out another 1,000 req/s, and opening the database as immutable gives us another 1,000.

Great, so we went from 14,000 req/s to 48,000 req/s. Not quite the 60,000 mark, but close enough for me.

If you're looking for that extra bit of performance, and a driver written specifically for sqlite3 that doesn't pigeon-hole itself into the database/sql interface, check out bvinc/go-sqlite-lite and crawshaw/sqlite.

Here's an example using go-sqlite-lite with the same cached prepared statement pool that we used above.

maxConns := 5
conns := make(chan *sqlite3.Stmt, maxConns)

for i := 0; i < maxConns; i++ {
  conn, err := sqlite3.Open("file:db.sqlite3?cache=shared&mode=ro")
  check(err)
  stmt, err := conn.Prepare("select body from pages where id=?")
  check(err)

  defer func() {
    stmt.Close()
    conn.Close()
  }()
  conns <- stmt
}

checkout := func() *sqlite3.Stmt {
  return <-conns
}

checkin := func(c *sqlite3.Stmt) {
  conns <- c
}

//...

app.Get("/2", func(c *fiber.Ctx) error {
  stmt := checkout()
  defer checkin(stmt)
  err := stmt.Bind(1)
  check(err)
  _, err = stmt.Step()
  check(err)
  var body []byte
  err = stmt.Scan(&body)
  check(err)
  err = stmt.Reset()
  check(err)
  c.Type(".html")
  c.Response().SetBody(body)
  return nil
})
go-sqlite-lite with a prepared statement pool returning 56,600 req/s.

It returns 56,600 req/s! Super fast! I'd happily switch to this driver, but since most of my code is already written against jmoiron/sqlx, which depends on database/sql, I'm not motivated to rewrite it all; I'll just use my simple connection pool.

FYI, crawshaw/sqlite returns around 53,000 req/s, and bvinc/go-sqlite-lite around 56,000 req/s.

Why prepared statements?

A prepared statement is a compiled SQL statement that can be used more than once. When we send SQL to the database engine, it needs to parse the SQL, which takes time. If we're going to send the same SQL statement over and over, we should be polite and have the database only parse our gnarly SQL syntax once. Caching the prepared statement saves significant time.

Why a connection pool?

Well, I'm not certain. The performance improvement from using a connection pool is especially puzzling because database/sql already maintains a connection pool internally. But there are numerous issues (#274, #209, #632) on the mattn/go-sqlite3 project reporting problems encountered when using the driver in concurrent applications. Bvinc sums up the recommended solution in a biting comment…

If you start using their driver normally, you'll immediately start getting problems with locked databases. They suggest that you use their custom connection string to enable shared cache mode, and then reduce the connection pool to only 1 connection. That's the suggested way of using it apparently.

I honestly don't understand how anyone is using sqlite through database/sql.

Ouch. So, your mileage may vary using mattn's driver, but in our simple series of tests, we've managed to increase our read speed without limiting the database connection manager to a maximum of one connection per process.

Crawshaw wrote an excellent article describing his sqlite driver and the opinions built into it (on by default).
It all makes sense to me, yet replicating his approach (a connection pool and cached prepared statements) on top of Bvinc's less opinionated driver outpaces his own. I'm not sure why, but possibly because Crawshaw adds debug tracing to every executed query. Sidenote: he has another great article about building apps with Go and SQLite as a solo dev.

So if you're looking for better performance out of your sqlite database, consider the following:

  1. Adding a connection pool.
  2. Caching prepared statements.
  3. Using shared cache mode.
  4. Using the WAL journal mode.
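As a concrete starting point, the checklist above might translate into a single DSN for mattn/go-sqlite3. This is a sketch: the `cache`, `mode`, and `_journal_mode` keys are that driver's connection-string conventions, so verify them against its README for your version.

```go
// Hypothetical DSN combining the checklist: shared cache,
// read-only connections, and the WAL journal mode.
db, err := sql.Open("sqlite3", "file:db.sqlite3?cache=shared&mode=ro&_journal_mode=WAL")
```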