Monday, November 28, 2011

Fixing Parallel Test Execution in Visual Studio 2010

As the number of tests in my project grow, so does the length of my continuous integration build. Fortunately, the new parallel test execution of Visual Studio 2010 allow us to trim down the amount of time consumed by our unit tests. If your unit tests meet the criteria for thread-safety you can configure your unit tests to run in parallel simply by adding the following to your test run configuration:

<?xml version="1.0" encoding="UTF-8"?>
<TestSettings name="Local" id="5082845d-c149-4ade-a9f5-5ff568d7ae62" xmlns="">
  <Description>These are default test settings for a local test run.</Description>
  <Deployment enabled="false" />
  <Execution parallelTestCount="0">
    <TestTypeSpecific />
    <AgentRule name="Execution Agents">

The ideal setting of “0” implies that the test runner will automatically figure out the number of concurrent tests to execute based on the number of processors on the local machine. Based on this, a single-core CPU will run 1 test simultaneously, a dual-core CPU can run 2 and a quad-core CPU can 4. Technically, a quad-core hyper-threaded machine has 8 processors but when parallelTestCount is set to zero the test run on that machine fails instantly:

Test run is aborting on '<machine-name>', number of hung tests exceeds maximum allowable '5'.

So what gives?

Well, routing through the disassembled source code for the test runner we learn that the number of tests that can be executed simultaneously interferes with the maximum number of tests that can hang before the entire test run is considered to be in a failed state. Unfortunately the maximum number of tests that can hang has been hardcoded to 5. Effectively, when the 6th test begins to execute the test runner believes that the other 5 executing tests are in a failed state so it aborts everything. Maybe the team writing this feature picked “5” as an arbitrary number, or legitimately believed there wouldn’t be more than 4 CPUs before the product shipped, or simply didn’t make the connection between the setting and the possible hardware. I do sympathize for the mistake: the developers wanted the number to be low because a higher number could add several minutes to a build if the tests were actually in an non-responsive state.

The Connect issue lists this feature as being fixed, although their are no posted workarounds and a there’s a lot of feedback that feature doesn’t work on high-end machines even with the latest service pack. But it is fixed, no-one knows about it.

Simply add the following to your registry (you will likely have to create the key) and configure the maximum amount based on your CPU. I’m showing the default value of 5, but I figure number of CPUs + 1 is probably right.

Windows 32 bit:
Windows 64 bit:

Note: although you should be able to set the parallelTestCount setting to anything you want, overall performance is constrained by the raw computing power of the CPU, so anything more than 1 test per CPU creates contention between threads which degrades performance. Sometimes I set the parallelTestCount to 4 on my dual-core CPU to check for possible concurrency issues with the code or tests.


So what’s with the Connect issue? Having worked on enterprise software my guess is this: the defect was logged and subsequently fixed, the instructions were given to the tester and verified, but these instructions never tracked forward with the release notes or correlated back to the Connect issue. Ultimately there’s probably a small handful of people at Microsoft that actually know this registry setting exists, fewer that understand why, and those that do either work on a different team or no longer work for the company. Software is hard: one small fissure and the whole thing seems to fall apart.

Something within the process is clearly missing. However, as a software craftsman and TDD advocate I’m less concerned that the process didn’t capture the workaround as I am that the code randomly pulls settings from the registry – this is a magic string hack that’s destined to get lost in the weeds. Why isn’t this number calculated based on the number of processors? Or better, why not make MaximumAllowedHangs configurable from the test settings file so that it can be unit tested without tampering with the environment? How much more effort would it really take, assuming both solutions would need proper documentation and tests?

Hope this helps.