Tuesday, May 31, 2011

The Tests are Broken, Now What?

Despite our best intentions to write durable tests, it seems inevitable that tests will break at some point. In some cases, breaking a test can be seen as a very positive thing: we've introduced a side-effect that has broken existing functionality, and our tests have helped prevent a bug. However, in some cases, tests break because of external factors -- maybe the tests weren't isolated enough, maybe a developer went rogue and forgot to run the tests, or some other poorly justified or lousy excuse. While identifying what went wrong in your development process is important, the immediate concern is that these broken tests represent a failed build and developer productivity is at risk -- what should you do with the broken tests?

If the developer looking at the broken tests knows a bit about the code, the tests themselves should contain enough information to sort things out. The developer should drop what they're doing, slow down, research the code and the tests, and fix them. If caught early, it's very manageable. Unfortunately, this isn't the common case -- often the developer who introduced the breaking change doesn't understand the tests and is ultimately left with one of two approaches:

  1. Comment out or remove the test
  2. Mark the test as Ignored

Neither of these approaches "fixes" anything.  The first approach, commenting the test out, is equivalent to pissing on the requirements. The test supposedly represents a piece of functional behaviour of the application; the fact that it's broken suggests a feature is also broken. The only reason to comment out a test is if that functionality has ceased to exist or the test was invalid in the first place. (Take a look at my checklist for unit test code reviews to help qualify invalid tests.)

The second alternative, marking the tests as Ignored, can be dangerous depending on whether you're able to track why a test has been disabled and monitor when it enters this state.  This is one part developer process, one part testing framework, one part reporting.

Within NUnit, the Ignore attribute fits the bill nicely.  The attribute accepts a string message which is used to report why the test has been disabled. It should be standard developer process to only allow use of Ignore if a message is provided. (Come to think of it, I should put in a feature request so that it is required.)  When NUnit runs, the Ignored tests are not executed, but the number of ignored tests is included in the xml output, meaning that your build server can track ignored tests over time. You can also bypass the Ignore attribute within the NUnit user-interface by simply selecting the Ignored test and running it manually.

[Test]
[Ignore("The xml for this test isn't serializing correctly. Please fix!")]
public void SuperImportantTest()
{
    // snip
}
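As a rough sketch of what the build server has to work with (attribute names follow the NUnit 2.x results schema; the counts shown here are made up), the root element of NUnit's results XML carries a separate count for ignored tests:

<!-- NUnit 2.x TestResult.xml root element; counts are hypothetical -->
<test-results total="42" errors="0" failures="0" not-run="1"
              ignored="1" skipped="0" invalid="0">
    ...
</test-results>

Because ignored tests get their own attribute rather than disappearing into the not-run bucket, a build report can graph that number over time and flag when it creeps up.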

Within MSTest, however, Ignore is essentially the same as commenting out the tests.  There's no way to record why the test has been excluded (developers must leave comments), and when MSTest runs the test suite the Ignored tests are simply excluded from the run.  Looking at the MSTest output, the TRX file does not define a status for ignored tests, meaning they can't be tracked in your build reports.
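For comparison, the best you can do with MSTest's Ignore looks roughly like this (in this version of MSTest the attribute takes no arguments, so the reason has to live in a code comment where no report will ever see it):

[TestMethod]
[Ignore] // The xml for this test isn't serializing correctly. Please fix!
public void SuperImportantTest()
{
    // snip
}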

<ResultSummary outcome="Completed">
    <Counters ... pending="0" />
</ResultSummary>

Though I'm not a big fan of this, I'll spare you the complaining and move on to working solutions.

The best approach within MSTest isn't to disable the test but to indicate that it isn't reliable and needs attention. In addition, we need these tests to stand out without breaking the build. This is exactly what Assert.Inconclusive is for.

[TestMethod]
public void SuperImportantTest()
{
    Assert.Inconclusive("The xml for this test isn't serializing correctly. Please fix!");
    // snip
}

The key to using Assert.Inconclusive is to put the statement as the first code-expression in your test. When MSTest encounters the Assert.Inconclusive statement, execution of the test is immediately halted, and the test is recorded as "Inconclusive". The TRX reports this status separately, and the message appears in the xml output.

Next Steps

With the tests marked as problematic, it’s possible to check in the tests and unblock developers from a failed build while you and the original test author figure out how to fix the problem. It’s a temporary patch, not intended for all breaking tests -- just the small fires that happen during critical developer crunches.  Once the fire has been put out and the tests corrected, you really should go back and investigate why the test broke in the first place:

  • Should the developer’s changes have broken this test?  If not, could the code/test have been written differently to be more durable?
  • What could the developer have done differently? Is this a one-time occurrence or a disconnect in process?

That’s all for now.
