Error Economics - How to avoid breaking the budget
At SLOConf 2021 I talked about how we may use error budgets to add pass/fail criterias to reliability tests we run as part of our CI pipelines.
As Site Reliability Engineers, one of our primary goals is to reduce manual labor, or toil, to a minimum while at the same time keeping the systems we manage as reliable and available as possible. To be able to do this in a safe way, it’s really important that we’re able to easily inspect the state of the system.