Notes on software, observability, and running teams.
Engineering leader at Canonical, focused on making observability accessible and valuable to teams of any size. This site is where I keep essays I've written, talks I've given, and things I'm currently thinking about.
Latest writing
browse the archive →
Why clever code is an organizational output, not an engineering one — and how agents turn a slow-moving problem into a fast one.
read the essay →
More essays
8 total
My design goals for building an autonomous AI agent, rather than limiting it to interactive prompting session-by-session.
A sizing tool for COS Lite deployments. It used to live here but got lost in one of my many blog migrations. Now it’s back!
Signal Studio explores a gap in the OpenTelemetry ecosystem: how to assess the impact of changes to your config.yaml before rolling them out to production.
This site has needed a facelift for years. Not because the technology was outdated, but because every previous version of this blog eventually died. Quietly.
Over the last couple of years, there has been quite a lot of development lowering the barrier to entry for observability. There are now quite a few reasonably mature options out there that let you set up…
At SLOConf 2021, I talked about how we can use error budgets to add pass/fail criteria to the reliability tests we run as part of our CI pipelines.