How Go's net.DialContext() stops things when the context is cancelled

January 15, 2020

These days, a number of core Go standard packages support functions that take a context.Context argument and abort their operation if the context is cancelled. This is an interesting trick in Go, because normally you can't gracefully interrupt a goroutine doing network IO (which leads to problems in practice). When I started looking into the relevant standard library code I expected to find that things like net.Dialer.DialContext() had special hooks into the runtime's network poller (netpoller) to do this. This turns out to not be the case; instead dialing uses an interesting and elegant approach that's open to everyone doing network IO.

To abort an outstanding dial operation when the context is cancelled, the net package simply sets an already-expired (write) deadline on the connection. To do this asynchronously, it starts a background goroutine that waits for the context to be cancelled. (There's then some complexity involved in cleaning everything up properly and handling potential races; these races caused a number of issues, eg issue 16523.) Setting read and write deadlines is already explicitly documented as affecting currently pending reads and writes, not just future ones, so dialing is reusing a general mechanism that already needs to exist.

(This reuse is a little bit tricky for dialing, which is taking advantage of a customary and useful property where the underlying OS only reports a network socket as writeable once it's connected. This means that you generally check for a connection having completed by seeing if it's now writeable, and in turn this means you can sensibly limit or abort this check by setting a write deadline.)

Now that I've discovered this use of deadlines in DialContext, it's clear that I can do the same thing to abort outstanding network reads or writes in my own code. I can set the expired deadline directly, which as a bonus will probably return a fairly distinctive error, or I can wrap this up in something that implements 'read with context' or 'write with context', probably with some of the race precautions seen in the net package's code.

PS: I was going to say that this is also how net.ListenConfig.Listen handles its context being cancelled, but then I went to look at the code and now I have no idea how that actually works.

PPS: If the context you pass to DialContext() already has a deadline, DialContext() immediately sets a write deadline on the underlying network connection, in addition to its handling of cancellation. There's also some complexity in the code to stop as soon as possible if the context is cancelled immediately, before it starts up the whole extra goroutine infrastructure to wait.

Written on 15 January 2020.