Philip O'Toole – Page 12 – Software engineering, distributed systems, databases, and the teams that build them

Measure Everything

Tomorrow I join the team at InfluxDB, something I’m really excited about. I’m really looking forward to coding in Go full-time — it’s a language with real promise, a nice clean tool chain, and a very active community.

Philip O'Toole
November 16, 2014

Is node.js just a stopgap?

Something just doesn’t feel right about node.js.

After coding in it for almost a year, it’s been fun, but I’ve decided it’s just a waypoint to somewhere better.


Philip O'Toole
November 1, 2014

Evolving a language in and for the real world: C++ 1991-2006

Bjarne Stroustrup has a great paper on his website titled Evolving a language in and for the real world: C++ 1991-2006. It provides fascinating insights into the development of the language and the challenges involved, and discusses interesting design ideas. If you have even a basic understanding of C++, it’s such a worthwhile read.


Philip O'Toole
September 20, 2014

Replicating SQLite using Raft Consensus

SQLite is a “self-contained, serverless, zero-configuration, transactional SQL database engine”. However, it doesn’t come with replication built in, so if you want to store mission-critical data in it, you had better back it up. The usual approach is to continually copy the SQLite file on every change.

I wanted SQLite, I wanted it distributed, and I really wanted a more elegant solution for replication. So rqlite was born.
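
For context, the copy-on-every-change approach mentioned above might look something like the following. This is only a minimal sketch, assuming Python's standard `sqlite3` module and its `Connection.backup` method (Python 3.7+); the function names are hypothetical, and it illustrates the "usual approach", not how rqlite itself works.

```python
import sqlite3


def backup_database(source_path: str, backup_path: str) -> None:
    """Copy the whole source database into the backup file."""
    source = sqlite3.connect(source_path)
    backup = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies every page of the source database.
        source.backup(backup)
    finally:
        backup.close()
        source.close()


def write_and_backup(source_path: str, backup_path: str, sql: str, params=()) -> None:
    """Apply one change to the source database, then refresh the backup copy."""
    conn = sqlite3.connect(source_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    finally:
        conn.close()
    # Hypothetical policy: re-copy the full database after every change.
    backup_database(source_path, backup_path)
```

Re-copying the entire file after each write is simple, but the cost grows with the size of the database, which is part of why a more elegant replication scheme is attractive.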


Philip O'Toole
September 2, 2014

Wow, the Go Memory Model really threw me

So far coding in Go has been fun. It comes with nice functionality that lets you know that the Go team really have been writing system software (useful stuff like this, and this). And then I read about the Go Memory Model, and had my consciousness raised.


Philip O'Toole
August 7, 2014

Writing a Syslog Collector in Go

I’ve started coding in Go (golang), and I received some advice recently from Robert Griesemer, whom I was fortunate enough to sit beside at a recent Go Meetup. To learn Go, Robert suggested that I code a solution in Go for a problem I had previously solved in a different language.


Call me Definitely

The creator of the network monitoring system Riemann, Kyle Kingsbury, has put together a comprehensive series of blog posts on the fault-tolerance, high-availability, and general correctness of a number of database and storage technologies. For the technologies discussed that I am most familiar with, elasticsearch and Apache Kafka, I found the posts to be a great read.

If you haven’t read them yet, you should check them out on his site.

Philip O'Toole
July 14, 2014

InfluxDB and Grafana HOWTO

This blog describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.

I recently came across InfluxDB — it’s a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s not written in Java — it’s written in Go. I was intrigued to say the least.


What I wish I’d been told about the JVM

Java is the predominant language of Big Data technologies. HBase, Lucene, elasticsearch, Cassandra – all are written in Java and, of course, run inside a Java Virtual Machine (JVM). Some other important Big Data technologies, while not written in Java, also run inside a JVM.

Examples include Apache Storm, which is written in Clojure, and Apache Kafka, which is written in Scala. This makes basic knowledge of the JVM quite important when it comes to deploying and operating Big Data technologies.


Philip O'Toole
April 16, 2014

How you should write software design documents

In my last blog post I explained why writing design documents is such a powerful approach to building well-engineered systems. But what should one document?


Philip O'Toole
April 2, 2014

Why you should write software design documents

Many software engineers never write design documents. Design documentation takes time, and implementations often proceed so far without any documentation that, if it is written at all, it becomes an act of recording what has been done, a tedious task at the best of times.

Many software engineers argue “the code exists, it’s running, it’s working, let’s move on and build the next thing.”


Always thinking of the next guy

My father worked for many years in Quality Assurance at Beckman, an American medical instruments firm. His job was to ensure that newly-manufactured centrifuge rotors would hold up when spun at thousands of RPMs. He used to tell me that the Beckman philosophy could be summarised in one sentence — “There is no substitute for quality”.


Philip O'Toole
March 30, 2014

Welcome to your data

After 2 years at Loggly, tomorrow I start a new role at Jut. While I will miss the team at Loggly very much, and the wonderful product we built during my time there, I’m looking forward very much to working again with some old colleagues from Riverbed Technology.

Philip O'Toole
February 17, 2014

Distributed Systems for Fun and Profit

I came across a very readable paper on distributed systems — Distributed systems for fun and profit. I recommend it for anyone interested in learning more about distributed systems, and the challenges involved with designing, building, and operating distributed systems.

Philip O'Toole
February 6, 2014

Book Review: Mastering ElasticSearch

Packt recently asked me to review their new publication Mastering ElasticSearch by Rafał Kuć and Marek Rogoziński. Since most of my experience with elasticsearch has been from a systems point of view (index management, cluster maintenance, indexing performance), I paid most attention to the chapters about those parts of elasticsearch.


Philip O'Toole
February 1, 2014

Infrastructure at Scale: Apache Kafka, Twitter Storm and elasticsearch

AWS have posted the video online of Jim Nisbet’s and my talk at AWS re:Invent 2013. In it, Jim and I describe the system we built at Loggly, which uses Apache Kafka, Twitter Storm, and elasticsearch to build a high-performance log aggregation and analytics SaaS solution, running on AWS EC2.


Philip O'Toole
December 25, 2013

Speaking at AWS re:Invent 2013

This past week I had the opportunity to speak, with my colleague Jim Nisbet, at AWS re:Invent 2013. In our talk, titled “Unmeltable Infrastructure at Scale: Using Apache Kafka, Twitter Storm, and Elastic Search on AWS”, Jim and I described the architecture of Loggly’s next-generation log aggregation and analytics infrastructure, which went live 3 months ago, and runs on AWS EC2.


Philip O'Toole
November 16, 2013

Loggly Generation 2 Released!

After 14 months of hard work, the next generation of Loggly has been released. It’s been a great time to be part of the Software Infrastructure team at Loggly and we have put together a superb log aggregation & real-time analytics platform.

We used a combination of custom log Collectors, Apache Kafka, Twitter Storm, ElasticSearch, and lots of secret sauce. You can find more details about the technology stack from my Loggly blog post.

Philip O'Toole
September 7, 2013

Technical Leadership through Testing

As technical lead at Loggly, responsibility for a well-engineered infrastructure ends with me. And one way to ensure the system is designed and implemented well is to stay as close as possible to the code, ensuring that the team and I write quality software.

But it can be difficult to complete the design and implementation of the features I am responsible for, ensure that what the team produces is well-implemented, and understand every line of code — there is only so much time in the day.


Philip O'Toole
June 30, 2013

Using the Source

I have written another post for the Loggly blog — all about our guidelines for choosing and integrating open-source software and technology in your next project.

Check it out here.

Philip O'Toole
April 2, 2013

If you love your logs, set them free

I recently wrote my first post for the Loggly blog. It illustrates why host machines are often the worst place to store the logs those machines are generating.

You can check it out here.

Philip O'Toole
March 21, 2013

Monitoring Storm Kafka Spouts using Python

When running a large real-time processing system, monitoring is critical. But it does more than allow you to keep an eye on your system. During development it allows you to test hypotheses about how the system works and how it performs when certain parameters are changed, and it takes the guessing out of working with dynamic systems.

Storm, a real-time computational framework open-sourced by Twitter, is such a system and comes with a Spout, allowing messages to be streamed from a Kafka Broker.


Philip O'Toole
March 12, 2013