Automated Testing (aka. the hole in your wallet)

When I was in college, I did an internship as a software tester.  I developed and maintained a library of automated tests for two semesters.  I would like to share a few things I learned from that experience.

Background

Initially, the test group was doing all of the testing by hand.  As the list of tests grew, they were grouped into test plans, test suites, etc.  As the company’s products matured, the test team recognized that some parts of the apps were very stable and rarely changed.  However, the team did not want to simply skip the tests for those areas, because changes to an app sometimes had cascading effects and unintended consequences in other parts of the app.  So, we tested the whole thing (by hand), every time.

This became very monotonous and time-consuming.  It became very obvious (to a group of trained developers, doing testing) that the repetitive, monotonous tasks should be automated.  Thus began our automated-testing endeavor.

Automated monkey testing

Right off the bat, we thought about making automated tests that simulated the behavior of an untrained tester.  This vagabond script wandered randomly around the app, clicking arbitrary buttons and entering random information.  I would turn it loose overnight or over a weekend and see how long it ran until it crashed.  It only took a week or two to realize that it was valueless, because we could not reproduce the steps that led to any of the errors we got.  We had no idea which parts of the app were hit.  We only knew that it ran for some period of time until something unexpected occurred.
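
Looking back, the fatal flaw was the lack of reproducibility, not the randomness itself.  A monkey test can be made repeatable if it is driven by a seeded random number generator and logs every action it takes.  Here is a rough sketch of that idea in C#; the IUiDriver interface and its methods are hypothetical stand-ins for whatever UI-automation tool you happen to use.

    using System;
    using System.Collections.Generic;

    // Hypothetical stand-in for a real UI-automation driver (the names are illustrative).
    public interface IUiDriver
    {
        IList<string> GetClickableControls();   // e.g., buttons/menus currently on screen
        void Click(string controlName);
    }

    public class MonkeyTester
    {
        private readonly IUiDriver _driver;
        private readonly Random _random;
        private readonly List<string> _actionLog = new List<string>();

        // A fixed seed makes the "random" walk repeatable once a crash is found.
        public MonkeyTester(IUiDriver driver, int seed)
        {
            _driver = driver;
            _random = new Random(seed);
        }

        public IReadOnlyList<string> ActionLog => _actionLog;

        public void Run(int steps)
        {
            for (int i = 0; i < steps; i++)
            {
                var controls = _driver.GetClickableControls();
                if (controls.Count == 0) break;

                string target = controls[_random.Next(controls.Count)];
                _actionLog.Add($"Step {i}: click '{target}'");   // record before acting,
                _driver.Click(target);                           // so the log survives a crash
            }
        }
    }

With the seed and the action log saved somewhere durable, a crash at 2 a.m. on a Saturday can be replayed on Monday morning, which is exactly what we could not do.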

We changed our approach, and I focused on being thorough: covering every square inch of a screen or feature to methodically squeeze any hidden bugs out of the app.  It worked great initially.  Such thorough work took 10x to 20x longer than manual testing, but it yielded obscure bugs that would have been missed in a quick once-over.  However, once we squeezed the bugs out, the automated tests were pretty lame.  They ran and ended with a yawn.  My boss felt that we weren’t getting value for all of the time invested.  I tried to point out that we could run those tests hundreds or thousands of times over the lifetime of the app and fully regain the investment.  He countered that, once the app was stable, it had a low likelihood of ever breaking, so the ROI would get lower each time I ran the tests without incident.

Freshest tests

Again, my strategy changed.  The new plan was to write scripts that only tested new features.  It worked great for 3 days, until we got the next build and several of my test scripts were broken.  They had worked yesterday but not today.  It turned out that the developers had changed the names of a few fields and reworded a few menus.  It had no impact on human testers, because nothing looked or behaved differently, but my test scripts sure noticed the difference.  I spent two days fixing them, and then two days after that, we got a new build and my scripts were broken again.  I started to notice a pattern.  To top it all off, this was also happening to the old scripts (the ones with the shrinking ROI), and it was hard to justify the maintenance effort for them.  Eventually, the policy became: once an old test broke, we would not fix it.  So, all my old test scripts became rusty cars, up on concrete blocks, in the back yard.  They were just eyesores, and they made it seem like none of them had any value (or ever did).  A part of me died each time we abandoned a broken test script.
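
In hindsight, one layer of indirection between the test scripts and the screen would have softened the blow: look every control up by a logical name in one place, so a renamed field means editing one dictionary entry instead of dozens of scripts.  Here is a rough sketch of the idea; the control names below are made up for illustration.

    using System.Collections.Generic;

    // Maps the logical names the test scripts use to whatever the developers
    // are calling the controls this week.  All names here are hypothetical.
    public static class ScreenMap
    {
        private static readonly Dictionary<string, string> CustomerScreen =
            new Dictionary<string, string>
            {
                { "FirstName", "txtCustFirstName" },   // was "txtFirstName" in the last build
                { "LastName",  "txtCustLastName"  },
                { "Save",      "btnSaveCustomer"  },
            };

        public static string Resolve(string logicalName) => CustomerScreen[logicalName];
    }

    // A test script refers only to logical names:
    //   driver.TypeText(ScreenMap.Resolve("FirstName"), "Tim");
    //   driver.Click(ScreenMap.Resolve("Save"));

It does not eliminate the maintenance, but it concentrates it in one file instead of scattering it across every script.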

Measuring the success of automated test scripts

The one area where my test scripts seemed to excel was in detecting changes.  Finding bugs was much easier and quicker by hand.  In seconds, I could click around a screen, try three dozen simple tests, and reveal some common mistakes made by programmers.  If I wrote an automated test script to do the same thing, it would take hours or days, and it would only tell me what I already knew (because I usually tried things by hand before writing a program to automate them).  So, automating was a way of repeating everything I had already done.

Occasionally, one of my automated tests actually found a bug that manual testing had overlooked.  It was so unexpected that I assumed it was a flaw in my automated test.  It hadn’t even dawned on me that this was a real bug and my script was the first to find it.  I had already started to “fix” my script to compensate for the unexpected behavior.  So, I had to consider which was the more reliable program: mine, or the one being tested.  Every time my scripts crashed, I had to investigate whether it was for the right reasons or not.  Was the flaw actually in the program under test, or was it a mistake on my part?  Ugh.  How awful is that?

Yes, it seemed that every time my automated script found a bug, it responded by crashing and ruining my tests.  That felt so unnatural to me.  I really wanted to write programs that just worked really well.  This must be how a missile repairman feels: if he fixes the missile well, then all of his hard work will fly away and explode.

It goes like this: if my program is bad, it will crash for no reason; if it works well, it will crash for a good reason.  If it never crashes, then I didn’t get good value out of it.  It is a very unnatural programming paradigm.

The ROI of automated testing

Out of this, I was able to derive the true value of automated testing.  Automated testing will provide good value if you use it to:

  • Detect changes in parts of a program that should not have changed (see the sketch after this list)
  • Perform repetitive tasks that are expected to succeed (or fail) consistently
  • Stress-test an app
  • Perform a test very quickly
  • Perform a test at a time or location where it might be impractical to use a human tester
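
That first item is worth making concrete.  The cheapest way I know to detect unwanted changes is a “golden master” style check: capture the output of the stable part once, save it as an approved baseline, and have an automated test compare against it on every build.  The sketch below uses NUnit; the ReportGenerator class and the baseline file are hypothetical placeholders.

    using System.IO;
    using NUnit.Framework;

    // Placeholder for the real code under test; in practice this lives in the app itself.
    public class ReportGenerator
    {
        public string Render(int month, int year) => $"Monthly report for {month}/{year}";
    }

    [TestFixture]
    public class ReportRegressionTests
    {
        [Test]
        public void MonthlyReport_MatchesApprovedBaseline()
        {
            // The baseline was captured once, reviewed by a human, and checked in.
            string expected = File.ReadAllText(@"Baselines\MonthlyReport.approved.txt");
            string actual = new ReportGenerator().Render(6, 2010);

            // Any difference means the "stable" part changed, which is exactly
            // the signal this kind of test exists to give.
            Assert.That(actual, Is.EqualTo(expected));
        }
    }

The test passes silently as long as nothing changes, and it screams the moment something does.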

Beyond these uses, if you choose to do automated testing, you will find yourself spending large amounts of time rewriting your programs (test scripts) to chase a moving target.  It will be as difficult as writing a program with a changing scope.  If you have any experience writing software, you know what I’m talking about, so you will know how bad that can get for tests too.

Machines for machines

Automated testing is, essentially, writing a program to test a program.  Take a moment to let the irony sink in.  You should recognize that your test script (a program) is going to have the same life cycle as any other program.  It requires { scope, requirements, development, testing, maintenance } to go smoothly.  If you choose to skip or ignore these steps, then you will have a hackish mess that won’t work very well.  Also, just like other programming, if the scope gets changed on you, then you need to rewrite parts of the program (repeat as necessary).  When will it be “done”?  Who knows?

So, automated testing really only works well if you are using it to test parts of another program that are stable and whose requirements aren’t changing (i.e., the program isn’t changing).  Usually, for a new program, you will have a very narrow window at the end of a dev cycle, when everyone is screaming to release the product.  During that period, do you really believe it is a good idea to ask a developer (or tester) to rock out a quick little app to test the other app (the one that took much longer to write)?  Do you think that will go well?  Seriously?

Getting maximum ROI

Automated testing works well for an app that is done and that, in the future, you will add onto without changing the whole thing.  Then, you can run some quick (automated) regression tests to confirm that nothing has changed in the original parts.  On upgrade day (v2.0), this type of test will be invaluable.  You make a copy of production, apply the upgrade, and turn loose the automated tests.  Bim-bam-boom, you get your results in minutes instead of days.  Maybe you even create a fake customer account on your production server, so that after you apply the upgrade for real, you can run a thorough test to make sure everything is golden (or not) and you know whether to roll back or go home.
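
A post-upgrade smoke test like that does not have to be elaborate.  Here is a minimal sketch, assuming the app exposes a web API; the URL, endpoints, and the fake account name are hypothetical placeholders for whatever your app actually has.

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    // Minimal post-upgrade smoke check, run against the fake customer account
    // described above.  Everything named here is a hypothetical placeholder.
    public static class UpgradeSmokeTest
    {
        public static async Task<bool> RunAsync()
        {
            using (var http = new HttpClient { BaseAddress = new Uri("https://example.com/") })
            {
                // 1. Can the fake customer still log in after the upgrade?
                var login = await http.PostAsync("api/login",
                    new StringContent("{\"user\":\"smoke-test-customer\",\"password\":\"...\"}"));
                if (!login.IsSuccessStatusCode) return false;

                // 2. Can they still read their own data?
                var orders = await http.GetAsync("api/orders?customer=smoke-test-customer");
                return orders.IsSuccessStatusCode;
            }
        }
    }

If it returns false, you roll back; if it returns true, you go home.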

The last place where automated testing can work is in development scenarios where you are developing in layers and one layer needs to be solid before you build on top of it.  This is where products like NUnit or VStudio test projects work.  You build tests against your data layer and run them to ensure that the layer remains stable in spite of any changes that occur.  The only caveat is that your development process has to be built around this concept.  If you are frequently changing the code that is being tested, then you will have to frequently change your tests, and you will burn up hours at an alarming rate.  Automated testing could become your worst foe.  So, be realistic about your process.  Likewise, you could keep an eye on changes to your tests as a measure of stability.  If your tests are changing more often than you expect, then something deeper is wrong and you had better fix your team (and adjust your timeline).
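
To make the data-layer idea concrete, here is a minimal sketch of what such an NUnit test might look like.  The Customer and CustomerRepository classes below are simplified placeholders for whatever your data layer really exposes; the point is that the test pins down behavior the higher layers depend on.

    using System.Collections.Generic;
    using NUnit.Framework;

    // --- Placeholder data layer, for illustration only. ---
    // In a real project this would be the repository that talks to the database.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Email { get; set; }
    }

    public class CustomerRepository
    {
        private readonly Dictionary<int, Customer> _store = new Dictionary<int, Customer>();
        private int _nextId = 1;

        public int Save(Customer customer)
        {
            customer.Id = _nextId++;
            _store[customer.Id] = customer;
            return customer.Id;
        }

        public Customer Load(int id) => _store[id];
    }

    // --- The NUnit tests that pin the layer down. ---
    [TestFixture]
    public class CustomerRepositoryTests
    {
        [Test]
        public void SaveThenLoad_ReturnsSameCustomer()
        {
            var repository = new CustomerRepository();
            var customer = new Customer { Name = "Test Customer", Email = "test@example.com" };

            int id = repository.Save(customer);
            var loaded = repository.Load(id);

            // If a later change breaks this round trip, every layer above it breaks too.
            Assert.That(loaded.Name, Is.EqualTo(customer.Name));
            Assert.That(loaded.Email, Is.EqualTo(customer.Email));
        }
    }

If the layer underneath is genuinely stable, a test like this rarely changes and keeps paying you back on every build.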

Things to avoid

Automated testing is not good for everything.  The investment can be 10x to 20x the cost of manual testing.  So, to get good ROI, you want to avoid the following:

  • Changes – Automated tests excel at detecting changes.  If you are making changes, the automated tests will only tell you what you already know: it changed.  So don’t write automated tests until you expect the code NOT to change.
  • New stuff – Most new programs take several months of actual usage before all of the bugs are ironed out and the project stabilizes.  Be realistic about this (see Changes, above).
  • Stuff with a brief shelf life – Think of automated testing as a big bank loan to a poor man.  Each test run is a small payment against the up-front investment, so you will only pay it off if you run the tests frequently or stretch the payments out over a long period of time.  Otherwise, you are better off hiring a human to do the work for $6.75/hr.
  • Not having a plan before you start writing automated tests – This one should be obvious.  The cost of automated testing, in both development and maintenance, is high, so you really want to use it sparingly.  Analyze your project and plan where the tests will go.  Measure this kind of testing more closely than any other.

Any other experiences or lessons learned?  Please share.
