Steps for supporting a system (part 2)

In the previous article, I talked about the fundamental resources that I need to be able to support a system. In this part, I will break down the individual steps of creating and applying a fix.

Summary – Part 1: get permissions to everything. Part 2 TLDR – once I have the source code and some sort of dev environment, I need to compile the code, deploy it to dev, and confirm that it all works. This is a prerequisite for supporting a system. Without all of this, my hands are tied.
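To make that loop concrete, here is a minimal Python sketch of "compile, deploy to dev, confirm". Every name in it (the build command, artifact path, drop folder, and health-check URL) is a hypothetical placeholder; your system will have its own.

```python
# A minimal sketch of the "compile, deploy to dev, confirm" loop.
# All names below (build command, paths, URL) are hypothetical
# placeholders -- substitute whatever your system actually uses.
import shutil
import subprocess
import urllib.request

BUILD_CMD = ["msbuild", "MySystem.sln", "/p:Configuration=Debug"]  # hypothetical
ARTIFACT = "bin/Debug/MySystem.dll"                                # hypothetical
DEV_DROP = r"\\devserver\apps\MySystem"                            # hypothetical
HEALTH_URL = "http://devserver/MySystem/health"                    # hypothetical

def build_deploy_confirm() -> None:
    # 1. Compile the code; a non-zero exit code stops us right here.
    subprocess.run(BUILD_CMD, check=True)
    # 2. Deploy the build output to the dev environment.
    shutil.copy(ARTIFACT, DEV_DROP)
    # 3. Confirm that it all works -- here, just a health-check request.
    with urllib.request.urlopen(HEALTH_URL, timeout=10) as resp:
        assert resp.status == 200, "dev deployment is not healthy"

if __name__ == "__main__":
    build_deploy_confirm()
```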

Now that I have the permissions necessary to support this system (including all servers, databases, files, etc.), I'm ready to continue.

Setting up

In part 1, we talked about dev (and QA, etc.) environments. Assuming those are set up and working, we are ready to make some fixes and test them before proceeding.

If there are no bugs waiting to be fixed, I can step-debug through the code and see how things work, to gain a better understanding of the system before the work starts rolling in. Be prepared, right?
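In that blissful scenario, the poking-around is easy. A minimal Python sketch of step-debugging with the built-in debugger (pdb); calculate_invoice_total is an invented stand-in for real system code:

```python
# Step-debugging sketch: drop into Python's built-in debugger (pdb)
# at a point of interest, then use n/s/p to walk through the logic.
# calculate_invoice_total is a made-up stand-in for real system code.
def calculate_invoice_total(line_items):
    total = 0.0
    for item in line_items:
        breakpoint()  # pauses here; inspect `item` and `total` with p
        total += item["qty"] * item["price"]
    return total

print(calculate_invoice_total([{"qty": 2, "price": 9.99}]))
```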

Ha. I had you going there. Imagine a system without bugs or a backlog! Haha! I think we all expect that every system will have a list of bugs and a list of features you wish were in there. So I know I won't have time to poke around and learn. I will probably be smoke-jumping from day one. Geronimo! Let's get this party started.

Picking your work

The backlog of work will usually consist of a few problems observed by users, some features that behave incorrectly, and a few quirks that folks can't seem to pin down. There might also be a few features that never got completed, because of time or budget. Management must prioritize this work. Just don't make everything a #1, or resort to fractions (like "this is a 0.9. Oh wait, this one is a 0.8…"). We aren't going to start with a plan to lose our minds immediately. Right?
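If you want to keep priorities honest, make them integers on purpose. A small sketch; the backlog items here are invented examples:

```python
# A sketch of keeping priorities honest: integers only, ties allowed,
# no fractional "0.9 vs 0.8" games. The items are invented examples.
from dataclasses import dataclass

@dataclass
class BacklogItem:
    title: str
    priority: int  # 1 = most urgent; integers only, by design

backlog = [
    BacklogItem("Nightly import crashes on empty file", 1),
    BacklogItem("Report totals off by a penny", 2),
    BacklogItem("Export-to-PDF never got finished", 3),
]

for item in sorted(backlog, key=lambda i: i.priority):
    print(item.priority, item.title)
```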

The other challenge is scope: how often (and when) do you apply (deploy) fixes? "As each fix gets done" is probably too often, unless the server/system is already broken or security-breached. Most agile teams prefer to do releases monthly, but some can be bi-monthly or even weekly. Keep in mind that each change may introduce new problems and new priorities, or just move them around a little. So don't start out overly optimistic.
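A small illustration of cadence, assuming (purely for the example) that you deploy on the first of each month:

```python
# Sketch of batching fixes into a fixed release cadence instead of
# deploying each one as it lands. The first-of-the-month cadence is
# an assumption for illustration.
import datetime

RELEASE_DAY = 1  # deploy on the first of each month (assumption)

def next_release_date(today: datetime.date) -> datetime.date:
    # Roll forward to the next occurrence of RELEASE_DAY.
    if today.day < RELEASE_DAY:
        return today.replace(day=RELEASE_DAY)
    year = today.year + (today.month // 12)
    month = today.month % 12 + 1
    return datetime.date(year, month, RELEASE_DAY)

print(next_release_date(datetime.date(2024, 3, 15)))  # -> 2024-04-01
```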

Fixing – How this is going to go

Step 1 – Diagnose. I can't really explain the hocus-pocus that goes into this step, but I can tell you that it involves basic skills (e.g. isolating the cause, "divide and conquer"), some higher skills, and some instinct. Most of it comes with experience. I've discussed this in previous blog posts. Once you isolate the problem(s), you can determine possible remedies or courses of action.
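The "divide and conquer" part, at least, can be shown. Here is a sketch of binary-searching a range of builds for the first bad one, the same idea behind git bisect; is_good() is a placeholder for however you actually test a given build:

```python
# "Divide and conquer" sketch: binary-search a range of builds for the
# first bad one (the idea behind git bisect). is_good() is a placeholder
# for however you actually test a given build.
def is_good(build_id: int) -> bool:
    return build_id < 57  # pretend the bug arrived in build 57

def first_bad_build(lo: int, hi: int) -> int:
    # Invariant: build `lo` is known good, build `hi` is known bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi

print(first_bad_build(1, 100))  # -> 57
```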

Step 2 – Make the fix. Use dev tools and a dev environment (probably on my own dev PC) to compile the code and run it. Once I get it working for me, there is around a 75% chance of success from here. Don't start feeling lucky just yet, cowboy!
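"Working for me" deserves at least a quick test before I say it out loud. A sketch, with an invented function standing in for the code I just fixed:

```python
# Sketch: pin the fix down with a quick test on my own dev PC before
# declaring victory. parse_amount is a made-up example of a fixed function.
def parse_amount(text: str) -> float:
    # The (pretend) bug: blank input used to raise ValueError.
    return 0.0 if not text.strip() else float(text)

def test_parse_amount():
    assert parse_amount("12.50") == 12.50
    assert parse_amount("") == 0.0      # the case that used to blow up
    assert parse_amount("  ") == 0.0

test_parse_amount()
print("works for me -- now for the other 25%")
```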

Step 3 – Confirm the fix. Eliminating a problem is not the same as not breaking anything else. I need help from folks who can tell the difference. If we are wrong, we will just be back here in a few hours, starting over, but with new problems. It is better to get it right now and have closure.
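The mechanical half of confirmation is running the whole regression suite, not just the test for my one bug. A sketch assuming a pytest-style test folder; swap in whatever your project actually runs:

```python
# Sketch: confirming a fix means running the *whole* regression suite,
# not just the test for the bug I fixed. Assumes a pytest-style test
# tree at tests/ -- an assumption, not a requirement.
import subprocess
import sys

def confirm_fix() -> bool:
    # Exit code 0 means nothing else broke (as far as the suite knows).
    result = subprocess.run([sys.executable, "-m", "pytest", "tests/"])
    return result.returncode == 0

if __name__ == "__main__":
    print("clear to deploy" if confirm_fix() else "back to Step 1")
```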

Step 4 – Apply the fix. This depends on where the fix goes: maybe on a server, in the cloud, in a database, or on a PC. We already talked about the fact that I will need access (permission) to put the fix (or fixes) where they go. I also need the ability to undo my fix, in case Step 3 was botched or I can't get past Step 3.
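Here is a sketch of an undo-able deploy: keep the previous version next to the new one, so a botched Step 3 can be rolled back in seconds. The paths are hypothetical:

```python
# Sketch of an undo-able deploy: save the current version before
# overwriting it, so rollback is one copy away. Paths are placeholders
# for wherever your fix actually lives.
import shutil
from pathlib import Path

TARGET = Path("/srv/mysystem/app.jar")            # hypothetical deploy target
BACKUP = TARGET.parent / (TARGET.name + ".prev")  # previous version, kept around

def deploy(new_build: Path) -> None:
    if TARGET.exists():
        shutil.copy2(TARGET, BACKUP)  # save the undo point first
    shutil.copy2(new_build, TARGET)

def rollback() -> None:
    shutil.copy2(BACKUP, TARGET)      # put the old version back
```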

Step 5 – Wait for "the other shoe to drop". Like I said in Step 2, there is a 75% chance of success. There are way more users than testers. Time will tell.
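While waiting, it helps to actually listen. A sketch that compares the error count in the application log before and after the release; the log path and the "ERROR" marker are assumptions:

```python
# Sketch: listening for "the other shoe" -- compare the error count in
# the app log before and after the release. The log path and "ERROR"
# marker are assumptions about your logging setup.
from pathlib import Path

LOG = Path("/var/log/mysystem/app.log")  # hypothetical log location

def error_count() -> int:
    return sum("ERROR" in line for line in LOG.read_text().splitlines())

baseline = error_count()  # record this right after deploying
# ... time passes, users use the system ...
if error_count() > baseline:
    print("new errors since the release -- back to Step 1")
```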

Finally – Figure out what caused this, and how to prevent it. Let’s stop meeting like this.
