Learning data engineering
Some time ago I was asked where to begin to learn data engineering. It was a broad question, and it took some time to understand what exactly I was being asked.
Software engineering, distributed systems, databases, and the teams that build them
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.4.0 is now out and allows the Raft election timeout to be set. You can download the release from GitHub.
The batching of data or computation, amortizing a fixed cost over multiple units, is a very common pattern in many computer systems. It’s particularly prevalent in networking and CPU memory accesses. But the implementation of batching includes many subtleties…
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.3.1 is now out and includes some cluster management fixes. You can download the release from GitHub.
I recently had a chance to speak about rqlite, the distributed, lightweight database built on SQLite, at the University of Pittsburgh Computer Science Club. It was a good evening as I spoke about distributed systems, the problems they solve, and how rqlite…
Go remains one of the languages I’m most productive in. It combines the rigour of static typing with the fluidity of Python, which makes it both robust and easy to code in. It’s also got some innovative features that help you…
I’ve moved continuous testing of rqlite to CircleCI 2.0. The initial work I did with hraftd was helpful, though rqlite was definitely more involved. Testing is significantly quicker with the new, container-based version of CircleCI, which should help noticeably with…
CircleCI, which I used for much of my open-source integration testing, has released version 2.0. Support for 1.0 is finishing in August 2018, so it’s time to migrate my projects. I’ve started with hraftd. It was pretty easy, but I…
Monitoring — the measurement of your system, the gathering of telemetry, and alerting when it behaves anomalously — is key to running large-scale, modern computer systems. But what many developers today don’t realise is that monitoring can be a key part of…
Hashicorp recently released version 1.0 of their Raft consensus package. The Hashicorp implementation, along with SQLite, forms the core of rqlite. rqlite has now been ported to release 1.0, and this will be a key change in the upcoming release of…
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.3.0 is now out and includes some new security functionality. You can download the release from GitHub.
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.2.2 is now out. This release includes some minor bug fixes. You can download the release from GitHub.
This is the third in a series about core data structures and algorithms. The outstanding characteristic of binary search is that it’s intuitive. Many algorithms are not, but binary search is what people — anyone, not just programmers — naturally…
This is the second in a series about core data structures and algorithms. Considering how important sorting is to computer science and programming, it’s actually a mystery why more programmers don’t appreciate it.
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.2.1 is now out. This release includes some minor fixes to the CLI, as well as a simple benchmarking tool. You can download the release from…
This is the first in a series about core data structures and algorithms. Many explanations of data structures focus on the implementation — and that is very important — but I’ve always found some context makes it so much easier…
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.2.0 is now out. This release moves the build to Go 1.9. You can download the release from GitHub.
I’ve been programming for many years, and have spent most of the last few years managing development teams. I’ve written plenty of closed source software, and for a time made my living writing open source software too. One thing stands…
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.1.0 is now out. This release includes some minor enhancements and bug fixes. You can download the release from GitHub.
I’ve started a new Java client for rqlite, the lightweight distributed relational database, built on SQLite. I’m using Eclipse as my IDE and it’s working well so far. You can check out the source on GitHub here.
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v4.0.1 is now out. This release includes some minor bug fixes. You can download the release from GitHub.
I’ve started experimenting with Go and gRPC. To that end I’ve written a simple service that accepts connections from gRPC clients, allowing those clients to send queries to PostgreSQL. So far it’s been pretty straightforward. You can check out the…
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. It’s been over a year since the last major release and v4.0.0 is now available.
rqlite is a lightweight, open-source distributed relational database, with SQLite as its storage engine. v3.14.0 is now out. This release is the first built with Go 1.8. You can download the release from GitHub.
The new Analytics system, built by my team at Percolate, allows our end-users to program their own custom calculations, offering them the ability to precisely customize the product for their needs. At the center of that feature is a Pratt…