Monthly Archives: December 2013

The Freakonomics of Oncall Pay

Wearing the pager. It’s a fact of life for many of us ops folks. I’ve taken part in many a discussion from a management perspective about how oncall duty should be compensated. When the people doing the talking are pointy-haired-manager types who haven’t done oncall themselves, their starting position is often similar to the “per incident support” policies you get from a vendor – if you’re oncall and you get paged, you get paid either a set amount for that incident or an amount proportional to how long you’re working on it. So you might get say $50 per incident, or $25 per hour you’re engaged, for example. And if your oncall duty period goes quietly (no pages), you won’t get anything above your normal salary. Let’s call this the “per incident” model. The other option is a flat wage bump for the period you’re oncall – so for example every week you’re oncall, you get an extra $150 in your check. Let’s call this the “per week” model.


image provided by flickr user hades2k under a Creative Commons license.

I’ve always been a strong advocate for the “per week” model instead of the “per incident” model, because being oncall is an intrusion by your work into your personal life – it reduces your freedom during your time off work, and it does that whether or not you get paged. You can’t fly to Vegas for the weekend, you can’t get sloshed after work, and in some cases you can’t even take a shower or drop your kids at school without arranging for someone to cover you in case there’s an incident while you’re away from the keyboard. Simply being oncall affects your life, and I’ve always felt that people should be compensated for that – not just for the time where they are actively working an issue.

Then I saw _Freakonomics_ and realized there’s an even more powerful argument for the “per week” model: incentives. In the “per incident” model, your compensation goes up when the system has more problems that require oncall support. In theory, people might try to cause incidents so that they will get paged and therefore get more money. Personally I doubt that happens very often, if at all. However, there’s a more subtle influence on root cause analysis and problem management that I think does have real effects. When you’re paid in the “per week” model, you’re strongly incentivized to address the root causes of problems and improve your systems so that during your oncall week it’s more likely that you’ll be able to sleep through every night and live your life normally for that week. So when you do encounter an issue in the “per week” model, you’re going to really want to figure out what caused it, and to make meaningful changes to prevent it from happening the next time around. Fixing the problem completely does you nothing but good – you’re still going to get paid the same oncall pay for your next stint, and you’re going to have a much more pleasant experience when you’re on call. But in the “per incident” model, putting all that effort into root cause and remediation is actually going to cost you money. The next time you’re oncall, each incident you prevented from happening will mean you don’t get the money for working that incident. So consciously or not, it’s likely you’re not going to work quite as hard to make your systems better as you would under the “per week” model. I believe that this can have real effects and result in your systems being less reliable and less stable than they could be.

How does your company do oncall? Is it per week, per incident, or something else? And how do you think that is affecting incentives and outcomes? I’d love to hear your experiences!

PS: I did a little research into military hazard pay to see if there were instructive parallels there. I found Hostile Fire Pay which seems to follow the “per week” model. Anyone have information/thoughts on how incentives have affected this program?