Concurrency Control in Go: Inside rqlite’s Custom Synchronization Primitives

rqlite is a lightweight, open-source, distributed relational database written in Go, which uses SQLite as its storage engine.

Concurrency and synchronization are at the heart of any database — and distributed systems too. Today, I’d like to share the set of synchronization primitives and race-free data structures I’ve developed for rqlite, and explain how I use them to program key aspects of the database. They might even be useful to others confronted with similar requirements.

Check-And-Set Mechanism

Source

Because of the nature of the Raft consensus algorithm, rqlite periodically needs to perform a process called snapshotting. Snapshotting allows the Raft subsystem to truncate its log, preventing it from growing indefinitely. However, since snapshotting also requires exclusive write access to the database, we want to ensure it never blocks, as this could delay write operations in an unacceptable manner.

Ideally we don’t want snapshotting to start unless we can be confident it can run to completion — so grabbing a simple mutex wouldn’t work. What if the mutex was already held by some other part of the code? What I wanted was the ability to grab a lock, but have the code immediately return if the lock wasn’t available. From a system point-of-view this is OK as Raft will retry the snapshot operation at a later time.

To meet this need, I developed a simple Check-And-Set (CAS) mechanism. The CAS mechanism ensures that only one goroutine can enter a critical section at a time. If a second goroutine tries to enter while another owns the lock, it will receive an error instead of blocking. You can see this in operation during the snapshotting process, and the backup process — two processes which must not run concurrently, but both of which can always be retried later if needed.

Why not use TryLock()?

The Go sync library does expose TryLock(), but the the docs state:

Note that while correct uses of TryLock do exist, they are rare, and use of TryLock is often a sign of a deeper problem in a particular use of mutexes. 

So I decided to create my own data structure instead — it makes my intention clear, and won’t raise concerns when others review my code.

Multi-Reader Single-Writer (MultiRSW)

Source

In scenarios where multiple readers can safely access a resource simultaneously but writers require exclusive access, I introduced MultiRSW. This mechanism is similar to a read-write mutex, but with a crucial difference: if the mutex cannot be acquired, it returns an error instead of blocking. It’s implemented in a manner similar to the Check-And-Set mechanism.

How does rqlite use this data structure? One key application is in snapshot management. Many processes in rqlite need read-only access to a previously taken snapshot, and I want these reads to take place concurrently whenever possible. However, when a new snapshot is being created, snapshot reads must be prevented due to the design of rqlite. Simply blocking the reads during snapshot creation would cause problems in other parts of the system, as it could lead to unnecessary delays or deadlocks. Therefore, it’s better if the reads of the Snapshot Store are retried later. By returning an error to the readers when access cannot be granted, these read operations can catch the error and retry after some time, maximising the system’s responsiveness.

ReadyTarget: Waiting for Specific Conditions

Source

Monotonically incrementing indexes are common in a Raft-based system — the Commit Index and the Applied Index are important examples.

rqlite often needs to wait until a particular index reaches (or exceeds) a target value, or a timeout expires. This is particularly useful in test scenarios where you might need to wait until a certain log index has been applied to the database before executing a query.

To that end I introduced ReadyTarget. This synchronization structure allows a goroutine to block on a channel, which will close when a particular index reaches a certain target value. And by combining the blocking operation with a select statement, rqlite also ensures that a timeout will fire if the index doesn’t reach the value within a certain time.

ReadyChannels: Synchronizing Readiness

Source

In distributed systems, components often need to wait for multiple conditions to be true before proceeding. ReadyChannels help synchronize these readiness signals.

Code that uses this data structure registers a channel when it wants to block readiness — and when it wants to unblock readiness the client code closes the channel. Meanwhile ReadyChannels exposes a simple boolean method, which indicates readiness and is determined by the number of open registered channels at the time of the call.

Why not use Condition variables?

The Go standard library does offer condition variables, but again the documentation discourages their use in favour of channels – so that’s exactly what I did.

PollTrue: Simple Polling Utility

Source

While not always the most elegant solution, polling can be effective in certain situations. PollTrue encapsulates this pattern, removing substantial amounts of boilerplate code from the code base.

PollTrue repeatedly checks a condition until it becomes true, or a timeout occurs. This can be handy for waiting on asynchronous operations, and is often very useful for testing when it comes to writing robust unit tests.

Atomic Data Types: Time, Bool, and String

Source

Go’s sync/atomic package provides atomic operations for certain primitive types, but sometimes you need atomicity for types like time.Time, bool, or string. To address this, I created custom atomic wrappers for these types. Using these types makes code much cleaner and easier to read.

Lessons Learned

Developing these primitives taught me several valuable lessons:

  • Simplicity Over Complexity: Custom solutions don’t need to be complicated. Simple structures with clear responsibilities are more maintainable and less error-prone.
  • Understand the standard library: Go provides powerful concurrency primitives, but understanding their limitations and appropriate use cases is important.
  • Testing is Crucial: Concurrency issues can be subtle. Rigorous testing, including race condition detection, is essential. All of these data structures are unit tested, and the entire rqlite test suite is run with race detection enabled.
  • Iterate and Refine: These primitives evolved over time as rqlite’s needs changed. Don’t be afraid to refactor and improve your synchronization mechanisms.

Conclusion

Concurrency is a complex aspect of software development, especially in distributed systems like rqlite. By building these synchronization primitives and race-free data structures, I’ve been able make the source code clearer and easier to work with.

I hope sharing these tools and my experiences can help others facing similar challenges. The source code for these primitives is available in the rqlite repository, and I welcome feedback, suggestions, and contributions.

 

Leave a Reply

Your email address will not be published. Required fields are marked *