I was quoting Conan The Ops Guy in my previous post because I wanted to start writing some stuff about root cause analysis, problem management, after-action reports, etc. Then John Allspaw wrote this incredibly fantastic blog post about blameless postmortems. It so eloquently and thoughtfully conveys a bunch of the things I was thinking about that now I am just going to sit here feeling inadequate and tell you to go read it.
Here are some of the things I really liked about what he had to say:
- A culture of blame leads to people not providing information, and the information is what you need to improve things
- A lack of information leads to larger disconnects in understanding between line and management
- Blame implies a strategy of deterrence, versus a strategy of prevention
- Just saying “Person X should have done Y instead of Z” does not help the next person, unless you also understand why Person X did Z in the first place and change those circumstances
- Post-mortem analyses should be about learning, not about blame
And one note from me that I put into a comment on the original post, but the comment is still awaiting moderation, so I'd better say it here too:
One thing I would add: it’s also critical for a successful process that the output of post-mortems be acted upon in a *timely* and *visible* manner. If someone spends their time doing a bunch of analysis and comes up with recommendations on how to avoid problems, but then feels like those recommendations are ignored or not appreciated, that is also a strong disincentive to future analysis. I’m sure Etsy doesn’t have this problem, but I’ve seen it happen in other organizations, especially as they get larger.
Basically, John points out that if there are negative consequences to providing information that is useful in preventing future incidents, then people won’t provide that information. The flip side, however, is that if there aren’t positive consequences for providing that information, people also won’t provide it. And the best positive consequence I can think of is seeing your information acted on to make your service more reliable, prevent future problems, and improve the experience of your customers, all of which ultimately makes your business stronger.