Philip O'Toole – Page 11 – Software engineering, distributed systems, databases, and the teams that build them

InfluxDB and the Raft consensus protocol

I recently presented at the InfluxDB San Francisco Meetup, on InfluxDB and the Raft consensus protocol. My talk was about the fundamental problems of distributed systems, and how InfluxDB uses Raft to solve these issues.

Continue reading “InfluxDB and the Raft consensus protocol”

Philip O'Toole
December 16, 2015

Designing a search system for log data — part 3

This is the last part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1 and part 2.

In the last post we examined the design and implementation of Ekanite, a system for indexing log data and making that data available for search in near-real-time. In this final post let’s see Ekanite in action.

Continue reading “Designing a search system for log data — part 3”

Philip O'Toole
December 7, 2015

Designing a search system for log data — part 2

This is the second part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1. Part 3 follows this post.

In the previous post I outlined some of the high-level requirements for a system that indexes log data and makes that data available for search, all in near-real-time. Satisfying these requirements involves making trade-offs, and sometimes there are no easy answers.

Continue reading “Designing a search system for log data — part 2”

Philip O'Toole
December 1, 2015

Designing a search system for log data — part 1

This is the first part of a 3-part series “Designing and building a search system for log data”. Part 2 is here, and part 3 is here.

For the past few years, I’ve been building indexing and search systems, for various types of data, and often at scale. It’s fascinating work — only at scale does O(n) really come alive. Developing embedded systems teaches you how computers really work, but working on search systems and databases teaches you that algorithms really do matter.

Continue reading “Designing a search system for log data — part 1”

Philip O'Toole
November 22, 2015

Contributing to InfluxDB

When you’d like to contribute to an open-source project it can be difficult to know where to start. Check out my latest post for the InfluxDB blog, explaining how we on the Core team have curated a set of issues, hopefully making it easy for potential contributors to start.

Philip O'Toole
November 5, 2015

Testing InfluxDB Storage Engines

Another post for the InfluxDB blog — on testing the storage engines within InfluxDB.

You can check it out here.

Philip O'Toole
October 20, 2015

Building a distributed key-value store using Raft

Hashicorp provide a nice implementation of the Raft consensus protocol, and it’s at the heart of InfluxDB (amongst other systems). I wanted to experiment with a simple system built using this particular Raft implementation, so I was inspired by raftd to build hraftd.

Continue reading “Building a distributed key-value store using Raft”

Philip O'Toole
October 12, 2015

Coding like it’s 1999

“Run into an obstacle in what you’re working on? Hmm, I wonder what’s new online. Better check.”

If you haven’t already, you should start reading Paul Graham’s essays. In one on philosophy, Graham believes that many of the answers provided by philosophy are useless because “…of how little effect they have”. By that standard another of his essays is of high utility because it has affected the way I program. John Stuart Mill would be pleased.

Continue reading “Coding like it’s 1999”

Philip O'Toole
October 5, 2015

The strange economics of open-source software

I always use the names of economists for my machines’ hostnames. keynes, friedman, marx, fisher, ricardo.

So every so often the strange economics of open-source software hits me.

Continue reading “The strange economics of open-source software”

Philip O'Toole
September 23, 2015

Who watches the watchers?

I’ve written my first post for the InfluxDB blog. In it I discuss the new statistics and monitoring system built into InfluxDB, starting with the 0.9.4 release. Functionality like this is critical when it comes to running a distributed database like InfluxDB.

You can check it out here.

Philip O'Toole
September 22, 2015

400 days of Go

It’s been 418 days since my first GitHub commit of Go code. In that time I’ve written a Syslog-to-Kafka producer, a Raft-based distributed SQLite database, a near real-time log search system, and become a core developer of InfluxDB.

Continue reading “400 days of Go”

Philip O'Toole
September 1, 2015

Leadership without Management

I came across another great video about software engineering management, this time by Bryan Cantrill. It’s a really great talk that discusses in depth, with plenty of humour thrown in, the importance of Mission to high-performing software developers.

Continue reading “Leadership without Management”

Philip O'Toole
August 20, 2015

Running services is hard

I’ve recently been thinking about why running Services is particularly hard. By Services I mean Software-as-a-Service platforms. Over the years, I’ve written software for many different systems (embedded software, web services, databases, and distributed systems), but being involved with designing and running a SaaS platform was difficult in a whole new way: running Services is hard work.

Continue reading “Running services is hard”

Philip O'Toole
August 12, 2015

Gophercon 2015

This past week I attended Gophercon 2015, in Denver, CO. It was also a chance to get together with the rest of the InfluxDB team. And because the Go community is still relatively young and small, it was a great chance to meet, in person, some of the best people working with Go today.

Continue reading “Gophercon 2015”

Philip O'Toole
July 11, 2015

InfluxDB 0.9.0 released

The first version of the 0.9.0 series of InfluxDB has been released. It’s alpha-quality software but all of us on the InfluxDB team are very excited to see the software reach this stage.

You can read more about the release in this blog post.

Philip O'Toole
June 11, 2015

Software development: it’s got nothing to do with computers

Well, almost nothing.

Obviously it’s got something to do with computers since developers spend so much of their time in front of one. But software development is actually all about people. And successful software development even more so.

Continue reading “Software development: it’s got nothing to do with computers”

Philip O'Toole
May 6, 2015

Reviewing Implementing Cloud Design Patterns for AWS

Packt Publishing have released a new book, Implementing Cloud Design Patterns for AWS, for which I acted as an official technical reviewer.

Continue reading “Reviewing Implementing Cloud Design Patterns for AWS”

Increasing bleve indexing performance with sharding

Search is everywhere. Once you’ve built search systems, you see its potential application in many places. So when I came across bleve, an open-source search library written in Go, I was interested in learning more about its feature set and its indexing performance. And I could see immediately one might be able to shard it to improve performance.

Continue reading “Increasing bleve indexing performance with sharding”

Philip O'Toole
April 27, 2015

Reviewing Elasticsearch Cookbook

I recently acted as one of the official technical reviewers for ElasticSearch Cookbook – Second Edition by Alberto Paro. Published by Packt Publishing, the book contains a large number of “recipes” for elasticsearch.

Continue reading “Reviewing Elasticsearch Cookbook”

Philip O'Toole
March 13, 2015

Code reviews still rule

Recently at InfluxDB we discussed how code reviews fit in during the various stages of development. It’s great to see the team reach consensus about how we should develop software. It made me think more deeply about why I remain a big believer in the code review process.

Continue reading “Code reviews still rule”

Philip O'Toole
February 20, 2015

History of Software Engineering

I recently came across a talk on YouTube titled History of Software Engineering, given by Paolo Perrotta. Normally I find online videos to have a low information-to-time ratio, but this one was excellent. It’s not too long, with plenty of humour, and makes many serious points that resonated with me.

Continue reading “History of Software Engineering”

Philip O'Toole
February 17, 2015

Book Review: Cassandra High Availability

Packt recently asked me to review their new publication Cassandra High Availability, written by Robbie Strickland.

I’ve worked with Cassandra in the past: early designs of Loggly’s 2nd generation Log analytics platform used Cassandra as its authoritative store for log data, but we ended up pulling it and using elasticsearch as both the store and search engine.

Continue reading “Book Review: Cassandra High Availability”

Philip O'Toole
February 3, 2015

Software Development for Infrastructure

Bjarne Stroustrup has another very interesting paper on his website. Titled Software Development for Infrastructure, it discusses some key ideas for building software that has “…more stringent correctness, reliability, efficiency, and maintainability requirements than non-essential applications.” It is not a long paper, but offers useful observations and guidelines for building such software systems.

Continue reading “Software Development for Infrastructure”

Philip O'Toole
January 10, 2015

Eudyptula Challenge

The Eudyptula Challenge is a series of programming tasks, with the goal of getting one up-to-speed on Linux kernel programming. When I first heard about it, it immediately intrigued me. I’ve written a few production Linux kernel modules in my time — mostly device drivers — so I started the challenge today.

Philip O'Toole
December 23, 2014

Drop, Throttle, or Buffer

Real-time — or near real-time — data pipelines are all the rage these days. I’ve built one myself, and they are becoming key components of many SaaS platforms. SaaS Analytics, Operations, and Business Intelligence systems often involve moving large amounts of data, received over the public Internet, into complex backend systems. And managing the incoming flow of data to these pipelines is key.

Continue reading “Drop, Throttle, or Buffer”

Philip O'Toole
December 10, 2014