Monday, February 10, 2020

Challenges with Parallel Tests on Azure DevOps

As I wrote about last week, Adventures in Code Spelunking, relentlessly digging into problems can be a time-consuming but rewarding task.

That post centers around a tweet I made while I was struggling with an issue with VSTest on my Azure DevOps Pipeline. I'm feel I'm doing something interesting here: I've associated my automated tests to my test cases and I'm asking the VSTest task to run all the tests in the Plan; this is considerably different than just running the tests that are contained in the test assemblies. The challenge at the time was that the test runner wasn't finding any of my tests. My spelunking exercise revealed that the runner required an array of test suites despite the fact that the user interface restricts you to pick only one. I modified my yaml pipeline to contain a comma-delimited list of suites. Done!

Next challenge, unlocked!

Unfortunately, this would turn out to be a short victory, as I quickly discovered that although the VSTest task was able to find the test cases, the test run would simply hang with no meaningful insight as to why.

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests
[xUnit.net 00:00:00.57]   Discovered: MyTests
[xUnit.net 00:00:00.57]   Starting: MyTests
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

So, on a wild hunch I changed my test plan so that only a single test case was automated, and it worked. What gives?

Is it me, or you? (it’s probably you)

The tests work great on my local machine, so it’s easy to fall into a trap that the problem isn’t me. But to truly understand the problem is to be able to recreate it locally. And to do that, I’d need to strip away all the unique elements until I had the most basic setup.

My first assumption was that it might actually be the VSTest runner -- a possible issue with the “Run Test Plan” option I was using. So I modified my build pipeline to just run my unit tests like normal regression tests. And surprisingly, the results were the same. So, maybe it’s my tests.

Under a hunch that I might have a threading deadlock somewhere in my tests, I hunted through my solution looking for rogue asynchronous methods and notorious deadlock maker Task.Result. There were none that I could see. So, maybe there’s a mismatch in the environment setup somehow?

Sure enough, I had some mismatches. My test runner from the command-prompt was an old version. The server build agent was using a different version of the test framework than what I had referenced in my project. After upgrading nuget packages, Visual Studio versions and fixing the pipeline to exactly match my environment – I still was unable to reproduce the problem locally.

I have a fever, and the only prescription is more logging

Well, if it’s a deadlock in my code, maybe I can introduce some logging into my tests to put a spotlight on the issue. After some initial futzing around (I’m amazing futzing wasn’t caught by spellcheck, btw), I was unable to get any of these log messages to appear in my output. Maybe xUnit has a setting for this?

Turns out, xUnit has a great logging capability but requires a the magical presence of the xunit.runner.json file in the working directory.

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true
}

The presence of this file reveals this simple truth:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 8)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

And when compared to the server:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 2)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

Yes, Virginia, there is a thread contention problem

The build agent on the server has only 2 virtual CPUs allocated and both executing tests are likely trying to spawn additional threads to perform the asynchronous operations. By setting the maxParallelThreads to “2” I am able to completely reproduce the problem from the server.

I can disable parallel execution in the tests by adding the following to the assembly:

[assembly: CollectionBehavior(DisableTestParallelization = true)]

…or by disabling parallel execution in the xunit.runner.json:

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true,
  "parallelizeTestCollections": false
}

submit to reddit

Friday, February 07, 2020

Adventures in Code Spelunking

image

It started innocently enough. I had an Azure DevOps Test Plan that I wanted to associate some automation to. I’d wager that there are only a handful of people on the planet who’d be interested by this, and I’m one of them, but the online walk-throughs from Microsoft’s online documentation seemed compatible with my setup – so why not? So, with some time in my Saturday afternoon and some horrible weather outside, I decided to try it out. And after going through all the motions, my first attempt failed spectacularly with no meaningful errors.

I re-read the documentation, verified my setup and it failed a dozen more times. Google and StackOverflow yielded no helpful suggestions. None.

It’s the sort of problem that would drive most developers crazy. We’ve grown accustomed to having all the answers a simple search away. Surely others have already had this problem and solved it. But when the oracle of all human knowledge comes back with a fat goose egg you start to worry that we’ve all become a group of truly lazy developers that can only find ready-made code snippets from StackOverflow.

When you are faced with this challenge, don’t give up. Don’t throw up your hands and walk away. Surely there’s an answer, and if there isn’t, you can make one. I want to walk you through my process.

Read the logs

If the devil is in the details, surely he’ll be found in the log file. You’ve probably already scanned the logs for obvious errors, it’s okay to go back and look again. If it seems the log file is gibberish at first glance, it often is. But sometimes the log contains some gems that give clues as to what’s missing. Maybe the log warns that a default value is missing, maybe you’ll discover a typo in a parameter.

Read the logs, again

Amp up the verbosity on the logs if possible and try again. Often developers use the verbose logging to diagnose problems that happen in the field, so maybe the hidden detail in the verbose log may reveal further gems.

Now’s a good moment for some developer insight. Are these log messages helpful? Would someone reading the logs from your program be as delighted or frustrated with the quality of these output messages?

Keep an eye out for references to class names or methods that appear in the log or stack traces. These could lead to further clues or give you a starting point for the next stage.

Find the source

Microsoft is the largest contributor to open-source projects on Github than anyone else, so it makes sense that they bought them. Just watching the culture shift within Microsoft in the last decade has been astounding and now it seems that almost all of their properties have their source code freely available for public viewing. Some sleuthing may be required to find the right repository. Sometimes it’s as easy as Googling “<name-of-class> github” or following the link on a nuget or maven repository.

But once you’ve found the source, you enter a world of magic. Best case scenario, you immediately find the control logic in the code that relates to your problem. Worse case scenario, you learn more about this component than anyone you know. Maybe you’ll discover they parse inputs as case sensitive strings, or some conditional logic requires the presence of a parameter you’re not using.

Within Github, your secret weapon is the ability to search within the repository, as you can find the implementation and usages in a single search. Recent changes within Github’s web-interface allows you to navigate through the code by clicking on class and method names – support is limited to specific programming languages but I’ll be in heaven when this capability expands. The point is to find a place to start and keep digging. It’ll seem weird not being able to set a breakpoint and simply run the app, but the ability to mentally trace through the code is invaluable. Practice makes perfect.

If you’re lucky, the output from the log file will help guide you. Go back and read it again.

As another developer insight – this code might be beautiful or make you want to vomit. Exposure to other approaches can validate and grow your opinions on what makes good software. I encourage all developers to read as much code that isn’t theirs.

After spending some time looking at the source, check out their issues list. You might discover your problem is known by a different name that is only familiar to those that wrote it. Alternative suitable workarounds might appear from other problems.

Roadblocks are just obstacles you haven’t overcome

If you hit a roadblock, it helps to step back and think of other ways of looking at the problem. What alternative approaches could you explore? And above all else, never start from a position where you assume everything on your end is correct. Years ago when I worked part-time at the local computer repair shop, I learnt the hard way that the easiest and most blatantly obvious step, checking to see if it was plugged in, was the most important step to not skip. When you keep an open-mind, you will never run out of options.

As evidenced by the tweet above, the error message I was experiencing was something that had no corresponding source-code online and all of my problems were baked into a black-box that only exists on the build server when the build runs. When the build runs… on the build server. When the build runs on the build agent… that I can install on my machine. Within minutes of installing a local build agent, I had the mysterious black-box gift wrapped on my machine.

No source code? No problem. JetBrain’s dotPeek is a free utility that allows you to decompile and review any .net executable.

Just dig until you hit the next obstacle. Step back, reflect. Dig differently. As I sit in a coffee shop looking out at the harsh cold of our Canadian winter, I reflect that we have it so easy compared to the original pioneers who forged their path here. That’s who you are, a pioneer cutting a path that no one has tread before. It isn’t easy, but the payoff is worth it.

Happy coding.

Thursday, February 06, 2020

GoodReads 2019 Recap

Hey Folks, like all posts that start in January, I’m starting my posts with the traditional …it’s been a while opener. Last year marks a first for this blog where I simply did not blog at all, which feels really strange. The usual suspects apply: busy at work, busy with kids, etc. However, in July of 2018 I started a new habit of taking a break from writing and focusing on reading more. I had planned to read 12 books in 2018 but read 20. Then I planned to read 24 in 2019 but read 41. I’ll probably have finished reading another book while I was writing this.

Maybe your New Year’s Resolution is to read more books. So, here are some highlights of book I read last year that you might enjoy:


The Murderbot Diaries

The Murderbot Diaries

Love, love, love Murderbot! By far, my favourite new literary character. The Murderbot diaries is set in the future where mankind has begun to explore planets beyond our solar system. If you were planning on exploring a planet, you’d hire a company to provide you with the assets to get there and as part of that contract, they’d provide you with a security detail to keep their assets you safe. Among that security detail is our protagonist, a security android that who has hacked his own governor module so it no longer needs to follow orders. What does a highly dangerous artificial intelligence with computer hacking skills and weapons embedded in its arms do with it’s own free will? Watch downloaded media and pretend to follow your orders. So. freaking. good.


The Broken Earth Series

The Broken Earth

The Fifth Season is strange mix of fantasy meets apocalypse survival, this series is so brilliantly written that I got emotional when it ended. The world-building is vast and revealed appropriately as the story progresses but this attention to creativity does not overwhelm the characters’ depth or story arcs. The world, perhaps our own, is a distant future where history is lost. Artifacts of dead-civilizations, like the crystal obelisks that float aimlessly in the sky have no explanation and every few hundred years, the earth undergoes a geological disaster known as a Season. Seasons may last for years. This one, may last for centuries.

Magic exists, but its source is a connection to the earth – an ability to delve, harness and channel the earth’s energy as a destructive force. For obvious reasons, those that are born with this ability are feared and thus rounded up and controlled by a ruling class. Our story involves a woman who secretly hides her ability and her kidnapped daughter who might be more powerful.

Recursion

Recursion: A Novel by [Crouch, Blake]

Blake Crouch blew me away in 2018 with Dark Matter, Recursion follows the story of a detective who investigates the suicide of a woman who suffers from a disease that creates a disconnect between their memories and reality. Is it an epidemic or a conspiracy?


The Southern Reach Trilogy (Annihilation, Authority, Acceptance)

Area X Three-Book Bundle: Annihilation; Authority; Acceptance (Southern Reach Trilogy) by [VanderMeer, Jeff]

I first heard of the book Annihilation from a CBC review of the bizarre and stunning visuals of the Annihilation movie starring Natalie Portman. The CBC review of the movie suggested that the director (Alex Garland) started the production of the movie before the 2nd and 3rd book of the series was written. Garland had support from the author, but it’s not surprising that the movie’s ending is radically different than the source material. I loved the movie, but needed to understand. The movie is a Kubrik mind-altering attempt to bring an unfilmable novel to the big screen, but the novel is so much more. The plot of the entire movie happens within the first few chapters, so if you liked the film the novel goes much further off the deep end. For example, the psychologist on the exhibition uses hypnosis and suggestive triggers on the rest of the exhibition to force compliance.  It’s not until our protagonist, the biologist, is infected by the effects of Area X does she become immune to the illusion.

The insidious aspect is the villain is a mysterious environment with no face, presence or motive. How do you defeat an environment? (spoiler: you can’t.  The invasive species wins (annihilation), the people in charge that are hiding the conspiracy have no idea how to stop it (authority), and the sooner you come to terms with it the better you’ll be (acceptance) – and maybe, given what we’ve done to the environment in the past, we deserve the outcome)