Dqlite - A Distributed SQLite

in #sql · 5 years ago

Canonical just did a very interesting thing to the venerable SQLite, one of the most used databases in the world. SQLite is written in C and is small, tight and fast, so applications commonly use it for structured storage on limited devices such as mobile phones and IoT hardware. Canonical created a distributed version of it under the name Dqlite, and that just made my brain explode with possibilities. So what does Canonical say about Dqlite?

Dqlite (“distributed SQLite”) extends SQLite across a cluster of machines, with automatic failover and high-availability to keep your application running. It uses C-Raft, an optimised Raft implementation in C, to gain high-performance transactional consensus and fault tolerance while preserving SQLite’s outstanding efficiency and tiny footprint.

First, what is C-Raft? It is a hand-tuned, fully asynchronous C implementation of the Raft consensus protocol. Raft offers a generic way to distribute a state machine across a cluster of computing systems, ensuring that each node in the cluster agrees upon the same series of state transitions. Consensus is reached through an elected leader: in a Raft cluster, each server is either a leader or a follower, and a follower becomes a candidate when no leader is available. The leader sends a heartbeat message to the followers at regular intervals. If a follower receives no heartbeat within its timeout, it becomes a candidate and an election is held for a new leader. It’s beyond the scope of this article to go much deeper than that, but the leader is responsible for replicating the log to the followers. Because consensus only requires a majority of the servers to agree on each value, some servers can go offline and the cluster continues to function, similar to how a RAID array survives a failed disk.
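To make the role changes concrete, here is a minimal sketch of the follower, candidate and leader transitions described above, plus the majority arithmetic behind the fault tolerance. It is a toy model, not C-Raft's code or API: the `Server` class, its method names and the timeout values are all made up for illustration, and real Raft also handles vote requests, log comparison and randomized timeouts.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


def quorum(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers can be offline while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


@dataclass
class Server:
    """Hypothetical toy server; not part of any real Raft library."""
    name: str
    election_timeout: float       # seconds of silence before standing for election
    role: Role = Role.FOLLOWER
    term: int = 0
    since_heartbeat: float = 0.0

    def on_heartbeat(self, leader_term: int) -> None:
        """A heartbeat from a current leader resets the timer and keeps us a follower."""
        if leader_term >= self.term:
            self.term = leader_term
            self.role = Role.FOLLOWER
            self.since_heartbeat = 0.0

    def tick(self, elapsed: float) -> None:
        """Advance the clock; too long without a heartbeat starts an election."""
        if self.role is Role.LEADER:
            return
        self.since_heartbeat += elapsed
        if self.since_heartbeat >= self.election_timeout:
            self.role = Role.CANDIDATE
            self.term += 1        # each election starts a new term
            self.since_heartbeat = 0.0

    def on_votes(self, votes: int, cluster_size: int) -> None:
        """A candidate with a majority of votes (its own included) becomes leader."""
        if self.role is Role.CANDIDATE and votes >= quorum(cluster_size):
            self.role = Role.LEADER
```

With these definitions, a three-node cluster needs two votes to elect a leader and can lose one node, while a five-node cluster needs three votes and can lose two. That is why Raft clusters are usually sized with an odd number of voting members.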

C-Raft and Dqlite are both written in C for maximum cross-platform portability and are published under the Apache 2.0 license for maximum compatibility. The system includes a common CLI pattern for initializing databases and for handling voting members joining and leaving the cluster. The failover delay is tunable, which allows for extremely fast failover with automatic leader election.

So, enough preamble. SQLite is one of my favorite databases for keeping local data for apps: it’s small, fast and easy to include. More and more companies are using it in edge computing and IoT devices for the same reasons, but that means those databases tend to be isolated, and synchronizing data out of those lonely islands adds to the complexity of the systems. This is where I think Dqlite’s distributed consensus protocol starts to get really interesting.

As mentioned, Dqlite uses C-Raft, which is asynchronous, so Dqlite itself is an asynchronous, single-threaded implementation that uses libuv for its event loop. It implements a custom wire protocol that is optimized for SQLite primitives and data types. When I first heard about Dqlite last year, it was being written in Go, but apparently the developers ran into issues with the way Go interoperates with C: function calls across that boundary added latency and could spawn a goroutine and cause a context switch, which degraded performance. While the system currently runs on ARM, x86, POWER and IBM Z architectures, there is no support for Windows or macOS at this time.
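For readers who haven’t worked with an event loop, the following is a rough analogy of the single-threaded asynchronous model, written with Python’s standard `asyncio` rather than libuv. It is not how Dqlite is implemented; the `handle_request` and `serve` names and the delays are invented for illustration. The point is only that several requests can be in flight at once while a single thread does all the work.

```python
import asyncio
import threading


async def handle_request(name: str, delay: float, log: list) -> None:
    """Simulate one client request that waits on I/O without blocking the thread."""
    log.append((name, "start", threading.get_ident()))
    await asyncio.sleep(delay)    # stands in for a network or disk wait
    log.append((name, "done", threading.get_ident()))


async def serve() -> list:
    """Run two overlapping requests on one event loop and return the event log."""
    log: list = []
    # Both requests are in flight together, yet only one thread ever runs them.
    await asyncio.gather(
        handle_request("slow", 0.05, log),
        handle_request("fast", 0.01, log),
    )
    return log


events = asyncio.run(serve())
```

In `events`, the fast request finishes before the slow one even though it started second, and every entry carries the same thread id. There are no locks and no context switches between threads, which is the property the Dqlite developers were after.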

I’ve been doing a lot of blockchain work over the last couple of years, and some of those projects used various Hyperledger projects to store large amounts of data on a permissioned blockchain, where I could control costs and performance, but there is a lot of overhead involved in that as well. Looking at Dqlite, I’m thinking it might be useful in some of these implementations instead of Hyperledger. Another thought I had was using it to store the ETL data dictionaries for a tool I’m working on, so that I can easily get to them from any of the nodes in our system and don’t need n copies of the dictionary following the code around. There are a lot of interesting possibilities here; you just have to think outside the box.
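As a rough idea of what the data dictionary case could look like, here is a sketch using plain local SQLite through Python’s standard `sqlite3` module. The table layout, field names and the `lookup` helper are hypothetical, and nothing here talks to Dqlite. The assumption is that the same SQL would be sent through a Dqlite client instead, with the cluster taking care of replication.

```python
import sqlite3

# Plain in-memory SQLite stands in for a Dqlite database here.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_dictionary (
        source_field TEXT PRIMARY KEY,
        target_field TEXT NOT NULL,
        data_type    TEXT NOT NULL
    )
    """
)

entries = [
    ("cust_nm", "customer_name", "TEXT"),
    ("ord_dt", "order_date", "DATE"),
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO data_dictionary VALUES (?, ?, ?)", entries)


def lookup(connection: sqlite3.Connection, source_field: str):
    """Return (target_field, data_type) for a source field, or None if unmapped."""
    return connection.execute(
        "SELECT target_field, data_type FROM data_dictionary WHERE source_field = ?",
        (source_field,),
    ).fetchone()
```

Each ETL worker would call something like `lookup(conn, "cust_nm")` against its nearest node rather than shipping its own copy of the dictionary with the code.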