*** Disclaimer: This is not about my current job. Carry on. ***
In part 1, we looked at emergencies in IT departments and discussed some ways to handle things. In this part, I would like to talk about what to do and what not to do. Also, if you see someone handling things poorly, I have some suggestions.
When a good manager is being bad
If you are a good manager, you know what to expect from your people. However, if you are a manager who does not have experience with IT people, then the behavior of your crew during a crisis might surprise you. A rookie manager (or one without IT experience) will not be sure that everyone is doing what they are supposed to do. When the team is working hard (a good thing), they will seem really quiet, and you might feel very alone and maybe even abandoned. Please train yourself to leave that quiet alone.
Here are three terrible things that you must not do during a crisis:
- Motivate people to work harder – It is great for sales people and athletes, but terrible for IT people. Getting the techies wound up or emotional is not going to help. It will prevent them from focusing. When you interrupt them, it sends them a very confusing message: it will seem to them as if you think they are doing things wrong and you want them to do something different. They will be inclined to stop what they are doing and hand over the thinking to you. Then, they will do whatever you tell them, no matter how bad your recommendations might be. It will not occur to them that they should simply reassure you, and then go back to doing the correct thing.
- Ask for frequent updates – Talking to them will simply impede the techies’ ability to focus on the problem and determine a good solution. Each time they lose focus, they will have to regain focus. If the interruptions are frequent enough, it will be impossible to make any reasonable progress.
- Push for immediate results – This is rarely the best path. Precision inherently takes time. Panicked fixes usually cause more harm than good. Although the best solution will take more time, it will usually take less time than undoing the damage from a panicked fix. It is akin to driving like a maniac when you are late for work. It is a bad plan and can yield disastrous results.
A better, more sensible process goes like this:
- Pre: Give each person a few minutes to assess/triage the situation (without fixing it) and gather information about the problem. Each person prepares to explain the problem and their process to the team. Afterwards, the manager will be able to speak (on their behalf) in an informed manner, to anyone who asks.
- Have a quick meeting (5 min), away from the computers, etc. Ask the team to organize the plan-of-attack.
a. What could be causing this?
b. Has anything changed recently, that could cause this?
c. How long will it take to eliminate half of the possible causes listed in (a)?
d. Is there anything that could be done to help, or to divide and conquer?
- Now that they have a solid sense of direction, give them space and fend off distractions.
- After 30 more minutes, have another quick meeting to assess progress. Are they working the list, or focusing on one problem? Is that a reasonable gamble? Determine whether the problem is likely to be a) solved any minute, b) going to take a while, or c) impossible to call yet.
- After an hour, if the problem is still not solved, have another brief meeting to talk about what can be done for an interim solution. Even if the full solution might be minutes away, it is best to start working on a plan B.
- If two hours get past you, then it is best to act as if this outage will be “all-day” or longer. Begin working on a contingency plan.
This process is pretty standard in the IT world. Many teams start with something like this as a basic template for an emergency plan and tweak the steps to fit their organization and its IT needs.
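If you like to keep the checkpoints somewhere other than your head, here is a minimal sketch of the timeline as a lookup. The function name and the wording of each action are mine, and it assumes you count minutes from the start of the outage, which the steps above do not pin down exactly. Tweak the numbers to fit your shop.

```python
# Checkpoints from the process above, newest first.
# ASSUMPTION: thresholds are minutes since the outage began.
CHECKPOINTS = [
    (120, "Treat the outage as all-day or longer; begin a contingency plan"),
    (60, "Brief meeting: start working on an interim solution (plan B)"),
    (30, "Quick meeting: assess progress and estimate time to fix"),
    (0, "Triage, hold the 5-minute planning meeting, then give the team space"),
]


def next_action(elapsed_minutes: float) -> str:
    """Return the step the manager should be on for the given elapsed time."""
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must not be negative")
    # Walk from the latest checkpoint down and take the first one reached.
    for threshold, action in CHECKPOINTS:
        if elapsed_minutes >= threshold:
            return action
    raise AssertionError("unreachable: the 0-minute checkpoint always matches")


if __name__ == "__main__":
    for minutes in (5, 45, 90, 150):
        print(f"{minutes:>3} min: {next_action(minutes)}")
```

The point of writing it down is the same as the point of the process: the manager watches the clock so the techies do not have to.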
Handling outside vectors
Keep in mind that there are managers from other departments who don’t know how to act. I once faced this type of situation at work. During an emergency server outage, a manager from another team was melting down and demanded that my team and I do the same. I actually had to lock her out of our area, to keep her away from a sys-admin who was deep in thought. Ten minutes later, things started recovering. Within half an hour, everything was back to normal. Everything except for that other manager. She took it personally and demanded a promise from me that I would never attempt to contain her, ever again.
For a moment, I considered appeasing her with a white lie and promising that this would never happen again. Unfortunately, we were upgrading all of the servers on an aggressive schedule. This scenario was likely to happen a few more times. We all needed to be on our toes. I apologized for any bruised feelings, but I had to be clear that her behavior was out of line and she needed to stay away when we were handling a crisis. If not, I would take it up with her boss.
So, if you are in an environment that undergoes outages occasionally, or if you are facing an upcoming project with known risks, talk with your manager and explain what he/she should expect from you (namely: concentration and silence). Also, if you are an IT person who gets attacked by fly-by panic-mongers, then get a commitment from your manager to handle these banshees and maintain a safe perimeter for you while you fight the fires.
It is one more form of planning that you should prepare, before it comes up. Or just send them my way. I’m glad to share.