Monday, February 10, 2020

Challenges with Parallel Tests on Azure DevOps

As I wrote about last week, Adventures in Code Spelunking, relentlessly digging into problems can be a time-consuming but rewarding task.

That post centers around a tweet I made while I was struggling with an issue with VSTest on my Azure DevOps Pipeline. I'm feel I'm doing something interesting here: I've associated my automated tests to my test cases and I'm asking the VSTest task to run all the tests in the Plan; this is considerably different than just running the tests that are contained in the test assemblies. The challenge at the time was that the test runner wasn't finding any of my tests. My spelunking exercise revealed that the runner required an array of test suites despite the fact that the user interface restricts you to pick only one. I modified my yaml pipeline to contain a comma-delimited list of suites. Done!

Next challenge, unlocked!

Unfortunately, this would turn out to be a short victory, as I quickly discovered that although the VSTest task was able to find the test cases, the test run would simply hang with no meaningful insight as to why.

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests
[xUnit.net 00:00:00.57]   Discovered: MyTests
[xUnit.net 00:00:00.57]   Starting: MyTests
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

So, on a wild hunch I changed my test plan so that only a single test case was automated, and it worked. What gives?

Is it me, or you? (it’s probably you)

The tests work great on my local machine, so it’s easy to fall into a trap that the problem isn’t me. But to truly understand the problem is to be able to recreate it locally. And to do that, I’d need to strip away all the unique elements until I had the most basic setup.

My first assumption was that it might actually be the VSTest runner -- a possible issue with the “Run Test Plan” option I was using. So I modified my build pipeline to just run my unit tests like normal regression tests. And surprisingly, the results were the same. So, maybe it’s my tests.

Under a hunch that I might have a threading deadlock somewhere in my tests, I hunted through my solution looking for rogue asynchronous methods and notorious deadlock maker Task.Result. There were none that I could see. So, maybe there’s a mismatch in the environment setup somehow?

Sure enough, I had some mismatches. My test runner from the command-prompt was an old version. The server build agent was using a different version of the test framework than what I had referenced in my project. After upgrading nuget packages, Visual Studio versions and fixing the pipeline to exactly match my environment – I still was unable to reproduce the problem locally.

I have a fever, and the only prescription is more logging

Well, if it’s a deadlock in my code, maybe I can introduce some logging into my tests to put a spotlight on the issue. After some initial futzing around (I’m amazing futzing wasn’t caught by spellcheck, btw), I was unable to get any of these log messages to appear in my output. Maybe xUnit has a setting for this?

Turns out, xUnit has a great logging capability but requires a the magical presence of the xunit.runner.json file in the working directory.

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true
}

The presence of this file reveals this simple truth:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 8)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

And when compared to the server:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 2)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

Yes, Virginia, there is a thread contention problem

The build agent on the server has only 2 virtual CPUs allocated and both executing tests are likely trying to spawn additional threads to perform the asynchronous operations. By setting the maxParallelThreads to “2” I am able to completely reproduce the problem from the server.

I can disable parallel execution in the tests by adding the following to the assembly:

[assembly: CollectionBehavior(DisableTestParallelization = true)]

…or by disabling parallel execution in the xunit.runner.json:

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true,
  "parallelizeTestCollections": false
}

submit to reddit

0 comments: