A few months ago, in my post “Why isn’t there more documentation?”, I talked about the different audiences for documentation and how the needs of each group can vary quite a bit. I can confidently tell you that there is one type of documentation that yields high ROI. That is maintenance/support documentation.
When people think about docs for support, two main concepts come to mind:
- Knowledge base
- Technical documentation
Knowledge bases are time-consuming (expensive) to build because users typically expect them to contain every single possible scenario that could ever EVER happen (which is impossible). Your ROI will diminish when you add content that people never use. Consequently, the simplest solution is to drop a search engine over an incident tracking system and call it “good enough”. A very successful public example of this is StackOverflow.com. Even Microsoft (MSDN) points to them.
Technical documentation is narrower because its audience is expected to consist of technical people (ITIL support line 2-3+). In May 2010, I led a group that implemented a system like this and it worked awesomely because it was designed using proper engineering principles. Namely:
- Identify who will use it (so it meets their needs without getting out of hand)
- Narrowly define what type of information these people need
- Define what information is out of scope (so it doesn’t get bloated, costly, or wasteful)
- Pick a platform that facilitates easy maintenance
- Define templates to make it easy to add the right information, consistently
- Perform reviews to enforce consistency
- Assess the system (after 3-6 months) to determine whether it meets our needs and what could be improved
We used a wiki, with versioning, because it made it easy to maintain the information but kept a history, in case somebody made a mistake, went rogue, etc. Since the company already had SharePoint and wiki libraries are built-in, it was very easy to set up and launch.
Perhaps one of the biggest ingredients in the success of our tech docs was defining a content template/example. Because of it, the information was consistent and easy to skim (for reading or reviewing). Our search feature found the information we needed because our templates required it to be there. In case you are interested, our template looked something like this:
[Title: System Name, for example: User Synch]
Brief description of what the system does (purpose), who uses it, how critical it is.

History
Brief history including the last revision (description/any significant reasons for revising it) and date

Servers/hosts/machines
- Bulleted list of servers that this runs-on and their purpose (think: deployment targets)
  * Services (on this host) that are required
- Typical list members: Host, database, authentication, firewall, router (anything that could cause this system to fail, be unavailable or get errors)
- Example: MainSql01 - database - 10.0.22.19 - houses the origin data (get data from) for this service
- Don’t forget VM hosts or SAN dependencies (if appropriate)

Dependent Systems
- [in-bound] Other systems that depend on this system and a brief description of the dependency
- Example (for a web service): SharePoint - calls this service for the External user report and the deactivate user utility

Technologies
- List of technologies that are used for this system, so anyone who touches it will know what they are getting into before they touch it, or who to look for when asking for help (internal or external)
- Example: ASP.NET, Web Services (WS classic), ESRI, SQL Server, Telerik AJAX, Visual Studio 2010

Security
- List of permissions you will need to administrate, update, deploy, get-into any of these systems
- Any account names or service accounts that any services are running-under (no PWs)
- Other technologies, like HTTPS or PKI or Kerberos

Developer resources
- Everything you need to edit and compile the app
- Source code location (server: \branch)
- Dev tools such as: Visual Studio, SSMS, BIDS, RedGate, Oracle client
- Any custom controls or 3rd party components, such as Telerik or NHibernate

Point of Contact
- Primary point-of-contact (developer)
- Secondary (developer or ITIL line2 support)
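If you want to enforce a template like this without relying only on human reviewers, a small script can flag pages that are missing a section. Here is a minimal sketch in Python. It is not what we ran on SharePoint; it assumes each page can be exported as plain text with every section heading on its own line, and the names `REQUIRED_SECTIONS` and `missing_sections` are made up for illustration.

```python
# Section headings taken from the template above.
# Assumption: each heading sits on its own line in the exported page text.
REQUIRED_SECTIONS = [
    "History",
    "Servers/hosts/machines",
    "Dependent Systems",
    "Technologies",
    "Security",
    "Developer resources",
    "Point of Contact",
]


def missing_sections(page_text, required=REQUIRED_SECTIONS):
    """Return the required headings that never appear as a line of their own."""
    # Compare case-insensitively and ignore surrounding whitespace.
    headings = {line.strip().lower() for line in page_text.splitlines()}
    return [name for name in required if name.lower() not in headings]


if __name__ == "__main__":
    # Hypothetical, incomplete page used only to demonstrate the check.
    sample_page = "\n".join([
        "User Synch",
        "History",
        "Technologies",
        "Point of Contact",
    ])
    for section in missing_sections(sample_page):
        print("Missing section:", section)
```

For the sample page, the check reports Servers/hosts/machines, Dependent Systems, Security, and Developer resources as missing. It only confirms that a heading exists, so a reviewer still has to judge whether the content under it is any good.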
This gave us a quick pocket of knowledge to search whenever we were doing a server upgrade or someone went on vacation, or was sick, etc. Also, when the service-desk got a call about something not working, it helped them route traffic to the right IT and dev support personnel. Likewise, it gave the IT people enough information to know if it was something they could handle or if they needed to get a developer involved. Finally, each time we added people to our team (buy or rent), it was a quick way for them to familiarize themselves with our environments and technologies.
To keep things organized and on the straight-and-narrow, we set aside one day per month when team leads (me) were required to review this doc repository and confirm that everything was still up-to-date. Eventually, updating these docs also became part of our release process, giving us a two-step process (check/balance).
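The monthly review is easier if something tells you which pages are overdue. A rough sketch in Python, assuming you can get a last-reviewed date for each page (from wiki version history or a field on the page); the function name `pages_due_for_review`, the 31-day threshold, and the sample dates are all hypothetical.

```python
from datetime import date, timedelta


def pages_due_for_review(last_reviewed, today, max_age_days=31):
    """Return titles of pages whose last review is older than max_age_days.

    last_reviewed maps a page title to the date it was last confirmed current.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        title for title, reviewed in last_reviewed.items() if reviewed < cutoff
    )


if __name__ == "__main__":
    # Made-up review dates, for illustration only.
    reviews = {
        "User Synch": date(2010, 5, 3),
        "External user report": date(2010, 6, 20),
    }
    for title in pages_due_for_review(reviews, today=date(2010, 7, 1)):
        print("Overdue for review:", title)
```

With these sample dates, only User Synch is reported as overdue. A check like this backs up the review day; it does not replace it.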
So, when your stuff is broken and you need it up 5 minutes ago, these docs can make a big difference (= big ROI). The key is to not overdo it. You don’t need a mountain. This molehill was everything to us, and yet, so simple.