Stateful locks in Go

A lock lets several threads or goroutines write to the same data structure safely, avoiding concurrency issues. Imagine every goroutine reading a value, adding something to it and then writing the result back to the shared data structure. Without coordination, two goroutines can read the same old value and one update overwrites the other, so at best we lose what some goroutine added. We have to use a lock to avoid this.
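To make the read, add, write-back cycle concrete, here is a minimal sketch of a counter guarded by a mutex. The Counter type and the addConcurrently helper are hypothetical names used only for illustration; they are not part of the crontab replacement discussed below.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical shared structure guarded by a mutex.
type Counter struct {
	mu    sync.Mutex
	value int
}

// Add performs the read-modify-write cycle while holding the lock,
// so concurrent callers cannot overwrite each other's updates.
func (c *Counter) Add(n int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value += n
}

// Value returns the current total.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.value
}

// addConcurrently starts the given number of goroutines, each adding 1,
// and returns the total once all of them have finished.
func addConcurrently(goroutines int) int {
	var wg sync.WaitGroup
	c := &Counter{}
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Add(1)
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	// With the lock in place every addition is counted.
	fmt.Println(addConcurrently(1000)) // 1000
}
```

If the Lock and Unlock calls in Add were removed, the plain c.value += n would be a data race and some additions could be lost.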

The Mutex Lock

In Go we have sync.Mutex, with the methods Lock and Unlock. When you call Lock(), your goroutine blocks until it acquires the lock. After that you can do whatever read/write operations you need, and then call Unlock() to release it.

I am writing a crontab replacement, where I want to use locking to ensure that only one instance of a script/program runs at a time. I created a simple test with a Job struct that starts three runs of a job at the same time:

wg := &sync.WaitGroup{}
job := Job{}
for i := 1; i <= 3; i++ {
	wg.Add(1)
	go func() {
		job.Run()
		wg.Done()
	}()
}
wg.Wait()

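It’s worth noting that wg.Add is called outside the goroutines. If it were called inside them, wg.Wait could run before any goroutine had registered itself, which is exactly the kind of concurrency issue I’m talking about here.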

We declare the Job struct to embed a sync.Mutex to provide the locking mechanism which we will use for our job runs. I added some output to see what’s going on and a time.Sleep call to simulate some work being done.

type Job struct {
	sync.Mutex
}

func (r *Job) Run() {
	log.Println("Started run()")
	r.Lock()
	defer r.Unlock()
	log.Println("Doing work")
	time.Sleep(3 * time.Second)
	log.Println("Done work")
}

As I said before, the call to Lock waits until the lock is acquired. When running the above program, we get the expected results:

08:33:13 Started run()
08:33:13 Doing work
08:33:13 Started run()
08:33:13 Started run()
08:33:16 Done work
08:33:16 Doing work
08:33:19 Done work
08:33:19 Doing work
08:33:22 Done work

What we see from the output is this:

  1. All three goroutines start at the same time
  2. Goroutines run in sequence, in the order in which they get the lock

For my use case this unfortunately has a few implications. Consider a script that is scheduled to run once every five seconds. Depending on a few external factors, a single run may take up to five minutes. In that case I would end up with 60 goroutines, each waiting for the lock. This situation has the potential for disaster: say you changed the frequency from five seconds to five minutes and reloaded the configuration. You would have no way to ‘reap’ the goroutines that were created on the old schedule.
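The pile-up can be sketched without real timers. In the hypothetical pileUp helper below (not part of the original program), the lock is held up front to stand in for the long run, and every trigger that arrives in the meantime becomes a goroutine waiting on Lock.

```go
package main

import (
	"fmt"
	"sync"
)

// pileUp simulates `triggers` scheduled runs arriving while one long run
// still holds the lock. It returns how many queued runs eventually executed.
func pileUp(triggers int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	executed := 0

	mu.Lock() // stands in for a long-running job that is still working

	for i := 0; i < triggers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // waits here for as long as the lock is held
			defer mu.Unlock()
			executed++ // guarded by mu
		}()
	}

	mu.Unlock() // the long run finishes; the queued runs execute one by one
	wg.Wait()
	return executed
}

func main() {
	// Five minutes of runtime at one trigger every five seconds is 60 triggers.
	fmt.Println(pileUp(60)) // 60
}
```

None of the queued runs is skipped: all 60 still execute after the lock is released, which is the behaviour I want to avoid.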

So, a better pattern is needed. We need to answer the question “Is this job already running?” and, if it is, not run it again. A call to Lock can’t answer this question, since it only blocks, but we can use the mutex to guard a flag that does. We’ll create a StatefulLock struct that contains a bool and the methods Take and Release.

Getting lock state

type StatefulLock struct {
	sync.Mutex
	locked bool
}

func (l *StatefulLock) Take() bool {
	l.Lock()
	defer l.Unlock()
	if l.locked {
		return false
	}
	l.locked = true
	return true
}

func (l *StatefulLock) Release() {
	l.Lock()
	defer l.Unlock()
	l.locked = false
}

This should be all the structure we need to provide a conditional locking mechanism, which will enable us to skip job runs if a job is already running.
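Before wiring it into the job, the behaviour of Take and Release can be checked on its own. This sketch repeats the StatefulLock definition from above so that it runs standalone; only the main function is new.

```go
package main

import (
	"fmt"
	"sync"
)

// StatefulLock is the type from the article: a mutex guarding a flag.
type StatefulLock struct {
	sync.Mutex
	locked bool
}

// Take sets the flag and reports true, or reports false if it was already set.
func (l *StatefulLock) Take() bool {
	l.Lock()
	defer l.Unlock()
	if l.locked {
		return false
	}
	l.locked = true
	return true
}

// Release clears the flag.
func (l *StatefulLock) Release() {
	l.Lock()
	defer l.Unlock()
	l.locked = false
}

func main() {
	var l StatefulLock
	fmt.Println(l.Take()) // true: the lock was free
	fmt.Println(l.Take()) // false: already taken, and the call does not block
	l.Release()
	fmt.Println(l.Take()) // true: free again after Release
}
```

The important difference from a bare mutex is the second call: it returns immediately instead of waiting.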

type Job struct {
	StatefulLock
}

func (r *Job) Run() {
	log.Println("Started run()")
	if !r.Take() {
		log.Println("Already running")
		return
	}
	defer r.Release()

	log.Println("Doing work")
	time.Sleep(3 * time.Second)
	log.Println("Done work")
}

This is our modified Job struct, where we replace the sync.Mutex with our own structure for locking. The Run() function now returns early if the lock can’t be taken (a job is already running). Running our test now produces this:

08:51:17 Started run()
08:51:17 Doing work
08:51:17 Started run()
08:51:17 Already running
08:51:17 Started run()
08:51:17 Already running
08:51:20 Done work

What we see from the output is this:

  1. All three goroutines start at the same time
  2. The first goroutine gets the lock and starts work
  3. The rest can’t get the lock and exit

I think that should be good enough for my use case. It’s about as simple as it can get, and sync.Mutex is now used closer to its intended purpose: it only guards access to a shared data structure.

Update

Following the conversation on Reddit and the very constructive comments by @daveddev, @cafxx1985 and @joushou, I have to add some information to the article.

Using a Mutex for our case is sub-optimal. The main cost of a Mutex is performance when there’s a lot of contention between goroutines that want to lock the same structure. One of the primitives underlying Mutex is the atomic CompareAndSwap family (for example CompareAndSwapInt32), which sets a value only if it currently holds the expected old value and reports whether the swap succeeded. That lets us take the lock and learn whether we got it in a single atomic step. The updated code for a “Semaphore” (a better name than “StatefulLock”) is this:

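type Semaphore struct {
	semaphore int32 // 0 = free, 1 = taken
}

// CanRun atomically flips the flag from 0 to 1 and reports whether
// this caller was the one that flipped it.
func (l *Semaphore) CanRun() bool {
	return atomic.CompareAndSwapInt32(&l.semaphore, 0, 1)
}

// Done flips the flag back from 1 to 0 so the next run can start.
func (l *Semaphore) Done() {
	atomic.CompareAndSwapInt32(&l.semaphore, 1, 0)
}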

The struct now uses a named field instead of an embedded sync.Mutex, so Lock and Unlock no longer leak into the Job struct through embedding. There’s a fully working example on the Go playground.
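For readers who want to try it locally, here is a self-contained sketch of the atomic version under contention. The Semaphore type is the one shown above; the tryAll helper and its extra WaitGroup are my own assumptions, added only so that the winner holds the semaphore until every goroutine has made its attempt, which keeps the result deterministic without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Semaphore is the type from the article.
type Semaphore struct {
	semaphore int32 // 0 = free, 1 = taken
}

func (l *Semaphore) CanRun() bool {
	return atomic.CompareAndSwapInt32(&l.semaphore, 0, 1)
}

func (l *Semaphore) Done() {
	atomic.CompareAndSwapInt32(&l.semaphore, 1, 0)
}

// tryAll (hypothetical helper) starts n goroutines that all try to run
// at once and reports how many ran and how many were skipped.
func tryAll(n int) (ran, skipped int32) {
	var sem Semaphore
	var attempts, wg sync.WaitGroup
	attempts.Add(n)
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			ok := sem.CanRun()
			attempts.Done() // record that this goroutine has made its attempt
			if !ok {
				atomic.AddInt32(&skipped, 1)
				return
			}
			defer sem.Done()
			attempts.Wait() // hold the semaphore until everyone has tried
			atomic.AddInt32(&ran, 1)
		}()
	}
	wg.Wait()
	return ran, skipped
}

func main() {
	ran, skipped := tryAll(3)
	fmt.Println(ran, skipped) // 1 2
}
```

Exactly one goroutine wins the CompareAndSwap and the other two are turned away, matching the log output from the mutex-based version.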
