Steps for supporting a system – Part 1

As a dev (technical) lead/manager, I am occasionally given a system to support. Often, I was not involved in creating that system and I don’t know the people who built it. It doesn’t matter much anyway. The system is now mine, and management wants a working system, not complaints or excuses.

Through much trial and error, I’ve found a process that has worked well for me, consistently, throughout my career. Let’s go over it.

Responsibilities:

Before I get started, let me set the record straight. There are some things that are my responsibility and there are some things that are beyond my control. Since I am a programmer, I am gladly responsible for the following tasks:

  • If something has gone wrong, I must help determine the root cause of the problem
  • Advise or assist other team members with diagnostics, configuration or repair
  • If a problem/complaint is the result of a programming flaw, I must change the program (code) and deploy the changes
  • Identify alternative approaches for solving problems (e.g. I could change the program, but in some cases that might not be the most efficient choice)
  • Preventative maintenance: assess a system and recommend preventative measures, or predict failure points and monitor the system(s)

Resources:

Now that we know what to expect from me, let’s talk about the resources I need to do my job. To meet these responsibilities, I must gather my knowledge resources:

  • Know all possible points of failure, how likely each one is to fail, the symptoms of failure for each, and the diagnostic procedures
  • Which means I must:
    • Have a thorough inventory of all components (elements) of the system
    • Understand what each element/part of the system does
    • Understand how the elements/parts interact and are connected
    • Understand the technologies that comprise the system
    • Know how to isolate and test each element of a system, independently
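One way to keep such an inventory honest is to write it down as data. The following is only a minimal sketch: the component names, the 1-to-5 likelihood scale, and the field names are all made up for illustration. It records what each element does, what it depends on, and how it tends to fail, then flags any dependency that nobody has inventoried yet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One element of the system. All field names here are illustrative."""
    name: str
    purpose: str
    technology: str
    depends_on: List[str] = field(default_factory=list)
    failure_likelihood: int = 1  # subjective estimate: 1 (rare) to 5 (frequent)
    symptoms: str = ""
    diagnostic: str = ""


def unknown_dependencies(inventory):
    """Return names that are depended on but missing from the inventory."""
    known = {c.name for c in inventory}
    missing = {d for c in inventory for d in c.depends_on if d not in known}
    return sorted(missing)


def by_risk(inventory):
    """Return components ordered from most to least likely to fail."""
    return sorted(inventory, key=lambda c: c.failure_likelihood, reverse=True)


# Hypothetical example system.
inventory = [
    Component("web-app", "Serves the user interface", "Python",
              ["orders-db", "print-service"], 2,
              "HTTP 500 responses", "Check the application log"),
    Component("orders-db", "Stores orders", "SQL database",
              [], 1, "Query timeouts", "Run a test query"),
    Component("nightly-export", "Exports orders to a file", "cron job",
              ["orders-db", "sftp-server"], 4,
              "Missing export file", "Check the job's exit code"),
]

print("Not yet inventoried:", unknown_dependencies(inventory))
print("Riskiest first:", [c.name for c in by_risk(inventory)])
```

Anything that shows up in the "not yet inventoried" list is exactly the kind of flaw-in-waiting that I am not aware of yet.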

That sounds pretty extensive. Okay, maybe I don’t need to know a-a-a-l-l-l-l of that stuff to fix/support a system, but it really does help. Face it: if there is a flaw in something that I’m not aware of, then how can I support it? Right?

As you can imagine, some of this takes time. If you have lots of career experience and frequently get exposed to a variety of technologies, then you might already have a grasp of these concepts and will recognize them immediately. So, it doesn’t always have to entail a big commitment. You just need to determine which ones you need to learn about and how deep your knowledge needs to become.

Perhaps that still seems a little vague. Okay, bottom line: to support a system, you need:

  • Source code
  • Access to the database
  • Access to any servers

(... only if the system has those things. If it has other things, such as printing, crypto, devices, authentication, etc., then clearly you need to know about them too.)
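Rather than assuming I have access, I like to prove it. Here is a rough sketch of an access checklist using only the Python standard library. The check names, paths, hosts, and ports in the commented example are placeholders, not anything from a real system; a real checklist would also log in to the database and the servers rather than just reaching them.

```python
import os
import socket


def can_read_source(path):
    """True if the source checkout exists and is readable."""
    return os.path.isdir(path) and os.access(path, os.R_OK)


def can_reach(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks):
    """Run each named check and return the names of the ones that failed.

    `checks` maps a label to a zero-argument callable returning True/False.
    A check that raises an exception counts as a failure.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        print(f"{'OK  ' if ok else 'FAIL'} {name}")
        if not ok:
            failed.append(name)
    return failed


# Example usage (hypothetical path, hosts, and ports):
# failed = run_checks({
#     "source code": lambda: can_read_source("/home/me/src/the-system"),
#     "database":    lambda: can_reach("db.example.internal", 5432),
#     "app server":  lambda: can_reach("app.example.internal", 22),
# })
```

An empty list back from `run_checks` means every item on the list is reachable. Anything else is a conversation to have with whoever grants access, before the first support call comes in.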

Prepare for Changes

Support = changes. If something isn’t working, you need to change it to make it work. That’s where things can get dicey. The changes must improve stuff without breaking other stuff and making it worse. That is the goal.

Nobody’s perfect. You need to count on some of your changes/improvements not working out. If you have a dev/test environment (or two or three) that is similar to your production environment, you can test your changes there before promoting fixes to prod. Wait. Let’s not seem wishy-washy about that. Let’s say we MUST have those. In fact, from day one, this is the best way to confirm that you have all of the materials, that they all work, and that you understand them: make a test environment from scratch. Understand how to test and confirm your work.
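The "test there before promoting to prod" rule can be written down so nobody has to remember it under pressure. This is a small sketch, assuming three environments named dev, test, and prod (your names and count may differ): a change may only move one step past the last environment where its tests passed.

```python
# Hypothetical environment names, in promotion order.
PIPELINE = ["dev", "test", "prod"]


def furthest_allowed(results):
    """Return the furthest environment a change may be deployed to.

    `results` maps an environment name to True/False: did the change
    pass its tests there? A missing entry counts as "not tested yet".
    """
    allowed = PIPELINE[0]  # a change can always go to the first environment
    for env, next_env in zip(PIPELINE, PIPELINE[1:]):
        if not results.get(env, False):
            break  # stop at the first environment that has not passed
        allowed = next_env
    return allowed


print(furthest_allowed({}))                             # untested change
print(furthest_allowed({"dev": True}))                  # passed in dev only
print(furthest_allowed({"dev": True, "test": True}))    # passed in dev and test
```

A change that passed in dev but failed in test stays in test until it is fixed; it never reaches prod.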

To summarize:
Step 1: Get permissions to everything. Confirm that you actually have them (nothing missing, nothing unknown).
Step 2: Prepare for change, ensure quality, have a process to guide you, go for it, brace for impact.

(Continued in part 2)

About Tim Golisch

I'm a geek. I do geeky things.