Showing posts with label Azure DevOps. Show all posts
Showing posts with label Azure DevOps. Show all posts

Monday, June 08, 2020

Keeping your Secrets Safe in Azure Pipelines

These days, it’s critical that everyone in the delivery team has a security mindset and is vigilant about keeping secrets away from prying eyes. Fortunately, Azure Pipelines have some great features to ensure that your application secrets are not exposed during pipeline execution, but it’s important to adopt some best practices early on to keep things moving smoothly.

Defining Variables

Before we get too far, let’s take a moment to step back and talk about the motivations for variables in Azure Pipelines. We want to use variables for things that might change in the future, but more importantly we want to use variables to prevent secrets like passwords and API Keys from being entered into source control.

Variables can be defined in several different places. They can be placed as meta-data for the pipeline, in variable groups, or dynamically in scripts.

Define Variables in Pipelines

Variables can be scoped to a Pipeline. These values, which are defined through the “Variables” button when editing a Pipeline, live as meta-data outside of the YAML file.

image

Define Variables in Variable Groups

Variable Groups are perhaps the most common mechanism to define variables as they can be reused across multiple pipelines within the same project. Variable Groups also support pulling their values from an Azure KeyVault which makes them an ideal mechanism for sharing secrets across projects.

Variable Groups are defined in the “Library” section of Azure Pipelines. Variables are simply key/value pairs.

image

image

Variables are made available to the Pipeline when it runs, and although there are a few different syntaxes I’m going to focus on using what’s referred to as macro-syntax, which looks like $(VariableName)

variables:
- group: MyVariableGroup

steps:
- bash: |
     echo $(USERNAME)
     printenv | sort

All variables are provided to scripts as Environment Variables. Using printenv dumps the list of environment variables. Both USERNAME and PASSWORD variables are present in the output.

image

Define Variables Dynamically in Scripts

Variables can also be declared using scripts using a special logging syntax.

- script: |
     $token = curl ....
     echo "##vso[task.setvariable variable=accesstoken]$token

Defining Secrets

Clearly, putting a clear text password variable in your pipeline is dangerous because any script in the pipeline has access to it. Fortunately, it’s very easy to lock this down by converting your variable into a secret.

secrets

Just use the lock icon to set it as a secret and then save the variable group to make it effectively irretrievable. Gandalf would be pleased.

Why doesn't JWfan have a secure connection? - Other Topics - JOHN ...

Now, when we run the pipeline we can see that the PASSWORD variable is no longer an Environment variable.

image

Securing Dynamic Variables in Scripts

Secrets can also be declared at runtime using scripts. You should always be mindful as to whether these dynamic variables could be used maliciously if not secured.

$token = curl ...
echo "##vso[task.setvariable variable=accesstoken;isSecret=true]$token"

Using Secrets in Scripts

Now that we know that secrets aren’t made available as Environment variables, we have to explicitly provide the value to the script – effectively “opting in” – by mapping the secret to variable that can be used during script execution:

- script : |
    echo The password is: $password
  env:
    password: $(Password)

The above is a wonderful example of heresy, as you should never output secrets to logs. Thankfully, we don't need to worry too much about this because Azure DevOps automatically masks these values before they make it to the log.

image

Takeaways

We should all do our part to take security concerns seriously. While it’s important to enable secrets early in your pipeline development to prevent leaking information, doing so will also prevent costly troubleshooting efforts when when variables are converted to secrets.

Happy coding.

Saturday, June 06, 2020

Downloading Artifacts from YAML Pipelines

Azure DevOps multi-stage YAML pipelines are pretty darn cool. You can describe a complex continuous integration pipeline that produces an artifact and then describe the continuous delivery workflow to push that artifact through multiple environments in the same YAML file.

In today’s scenario, we’re going to suppose that our quality engineering team is using their own dedicated repository for their automated regression tests. What’s the best way to bring their automated tests into our pipeline? Let’s assume that our test automation team has their own pipeline that compiles their tests and produces an artifact so that we can run these tests with different runtime parameters in different environments.

There are several approaches we can use. I’ll describe them from most-generic to most-awesome.

Download from Azure Artifacts

A common DevOps approach that is evangelized in Jez Humble’s Continuous Delivery book, is pushing binaries to an artifact repository and using those artifacts in ad-hoc manner in your pipelines. Azure DevOps has Azure Artifacts, which can be used for this purpose, but in my opinion it’s not a great fit. Azure Artifacts are better suited for maven, npm and nuget packages that are consumed as part of the build process.

Don’t get me wrong, I’m not calling out a problem with Azure Artifacts that will you require you to find an alternative like JFrog’s Artifactory, my point is that it’s perhaps too generic. If we dumped our compiled assets into the artifactory, how would our pipeline know which version we should use? And how long should we keep these artifacts around? In my opinion, you’d want better metadata about this artifact, like source commits and build that produced it, and you’d want these artifacts to stick-around only if they’re in use. Although decoupling is advantageous, when you strip something of all semantic meaning you put the onus on something else to remember, and that often leads to manual processes that breakdown…

If your artifacts have a predictable version number and you only ever need the latest version, there are tasks for downloading these types of artifacts. Azure Artifacts refers to these loose files as “Universal Packages”:

- task: UniversalPackages@0
  displayName: 'Universal download'
  inputs:
    command: download
    vstsFeed: '<projectName>/<feedName>'
    vstsFeedPackage: '<packageName>'
    vstsPackageVersion: 1.0.0
    downloadDirectory: '$(Build.SourcesDirectory)\someFolder'

Download from Pipeline

Next up: the DownloadPipelineArtifact task is full featured built-in Task that can download artifacts from different sources, such as an artifact produced in an earlier stage, a different pipeline within the project, or other projects within your ADO Organization. You can even download artifacts from projects in other ADO Organizations if you provide the appropriate Service Connection.

- task: DownloadPipelineArtifact@2
  inputs:
    source: 'specific'
    project: 'c7233341-a9ff-4e76-9367-909816bcd16g'
    pipeline: 1
    runVersion: 'latest'
    targetPath: '$(Pipeline.Workspace)'

Note that if you’re downloading an artifact from a different project, you’ll need to adjust the authorization scope of the build agent. This is found in the Project Settings –> Pipelines : Settings. If this setting is disabled, you’ll need to adjust it at the Organization level first.

image

This works exactly as you’d expect it to, and the artifacts are downloaded to $(Pipeline.Workspace). Note in the above I’m using the project guid and pipeline id, which are populated by the Pipeline Editor, but you can specify them by their name as well.

My only concern is there isn’t anything that indicates our pipeline is dependent on another project. The pipeline dependency is silently being consumed… which feels sneaky.

build_download_without_dependencies

Declared as a Resource

The technique I’ve recently been using is declaring the pipeline artifact as a resource in the YAML. This makes the pipeline reference much more obvious in the pipeline code and surfaces the dependency in the build summary.

Although this supports the ability to trigger our pipeline when new builds are available, we’ll skip that for now and only download the latest version of the artifact at runtime.

resources:
 pipelines:
   - pipeline: my_dependent_project
     project: 'ProjectName'
     source: PipelineName
     branch: master

To download artifacts from that pipeline we can use the download alias for DownloadPipelineArtifact. The syntax is more terse and easier to read. This example downloads the published artifact 'myartifact' from the declared pipeline reference. The download alias doesn’t seem to specify the download location. In this example, the artifact is downloaded to $(Pipeline.Workspace)\my_dependent_project\myartifact

- download: my_dependent_project
  artifact: myartifact

With this in place, the artifact shows up and change history appears in the build summary.

build_download_with_pipeline_resource

Update: 2020/06/18! Pipelines now appear as Runtime Resources

At the time this article was written, there was an outstanding defect for referencing pipelines as resources. With this defect resolved, you can now specify the version of the pipeline resource to consume when manually kicking-off a pipeline run.

  1. Start a new pipeline run

    manually-trigger-build-select-resource
  2. Open the list of resources and select the pipeline resource

    manually-trigger-build-select-resource-pipeline
  3. From the list of available versions, pick the version of the pipeline to use:

    manually-trigger-build-select-resource-pipeline-version

With this capability, we now have full traceability and flexibility to specify which pipeline resource we want!

Conclusion

So there you go. Three different ways to consume artifacts.

Happy coding!

Monday, February 10, 2020

Challenges with Parallel Tests on Azure DevOps

As I wrote about last week, Adventures in Code Spelunking, relentlessly digging into problems can be a time-consuming but rewarding task.

That post centers around a tweet I made while I was struggling with an issue with VSTest on my Azure DevOps Pipeline. I'm feel I'm doing something interesting here: I've associated my automated tests to my test cases and I'm asking the VSTest task to run all the tests in the Plan; this is considerably different than just running the tests that are contained in the test assemblies. The challenge at the time was that the test runner wasn't finding any of my tests. My spelunking exercise revealed that the runner required an array of test suites despite the fact that the user interface restricts you to pick only one. I modified my yaml pipeline to contain a comma-delimited list of suites. Done!

Next challenge, unlocked!

Unfortunately, this would turn out to be a short victory, as I quickly discovered that although the VSTest task was able to find the test cases, the test run would simply hang with no meaningful insight as to why.

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests
[xUnit.net 00:00:00.57]   Discovered: MyTests
[xUnit.net 00:00:00.57]   Starting: MyTests
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

So, on a wild hunch I changed my test plan so that only a single test case was automated, and it worked. What gives?

Is it me, or you? (it’s probably you)

The tests work great on my local machine, so it’s easy to fall into a trap that the problem isn’t me. But to truly understand the problem is to be able to recreate it locally. And to do that, I’d need to strip away all the unique elements until I had the most basic setup.

My first assumption was that it might actually be the VSTest runner -- a possible issue with the “Run Test Plan” option I was using. So I modified my build pipeline to just run my unit tests like normal regression tests. And surprisingly, the results were the same. So, maybe it’s my tests.

Under a hunch that I might have a threading deadlock somewhere in my tests, I hunted through my solution looking for rogue asynchronous methods and notorious deadlock maker Task.Result. There were none that I could see. So, maybe there’s a mismatch in the environment setup somehow?

Sure enough, I had some mismatches. My test runner from the command-prompt was an old version. The server build agent was using a different version of the test framework than what I had referenced in my project. After upgrading nuget packages, Visual Studio versions and fixing the pipeline to exactly match my environment – I still was unable to reproduce the problem locally.

I have a fever, and the only prescription is more logging

Well, if it’s a deadlock in my code, maybe I can introduce some logging into my tests to put a spotlight on the issue. After some initial futzing around (I’m amazing futzing wasn’t caught by spellcheck, btw), I was unable to get any of these log messages to appear in my output. Maybe xUnit has a setting for this?

Turns out, xUnit has a great logging capability but requires a the magical presence of the xunit.runner.json file in the working directory.

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true
}

The presence of this file reveals this simple truth:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 8)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

And when compared to the server:

[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.1 (64-bit .NET Core 3.1.1)
[xUnit.net 00:00:00.52]   Discovering: MyTests (method display = ClassAndMethod, method display options = None)
[xUnit.net 00:00:00.57]   Discovered: MyTests (found 10 test cases)
[xUnit.net 00:00:00.57]   Starting: MyTests (parallel test collection = on, max threads = 2)
-> Loading plugin D:\a\1\a\SpecFlow.Console.FunctionalTests\TechTalk.SpecFlow.xUnit.SpecFlowPlugin.dll
-> Using default config

Yes, Virginia, there is a thread contention problem

The build agent on the server has only 2 virtual CPUs allocated and both executing tests are likely trying to spawn additional threads to perform the asynchronous operations. By setting the maxParallelThreads to “2” I am able to completely reproduce the problem from the server.

I can disable parallel execution in the tests by adding the following to the assembly:

[assembly: CollectionBehavior(DisableTestParallelization = true)]

…or by disabling parallel execution in the xunit.runner.json:

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true,
  "parallelizeTestCollections": false
}

submit to reddit

Friday, February 07, 2020

Adventures in Code Spelunking

image

It started innocently enough. I had an Azure DevOps Test Plan that I wanted to associate some automation to. I’d wager that there are only a handful of people on the planet who’d be interested by this, and I’m one of them, but the online walk-throughs from Microsoft’s online documentation seemed compatible with my setup – so why not? So, with some time in my Saturday afternoon and some horrible weather outside, I decided to try it out. And after going through all the motions, my first attempt failed spectacularly with no meaningful errors.

I re-read the documentation, verified my setup and it failed a dozen more times. Google and StackOverflow yielded no helpful suggestions. None.

It’s the sort of problem that would drive most developers crazy. We’ve grown accustomed to having all the answers a simple search away. Surely others have already had this problem and solved it. But when the oracle of all human knowledge comes back with a fat goose egg you start to worry that we’ve all become a group of truly lazy developers that can only find ready-made code snippets from StackOverflow.

When you are faced with this challenge, don’t give up. Don’t throw up your hands and walk away. Surely there’s an answer, and if there isn’t, you can make one. I want to walk you through my process.

Read the logs

If the devil is in the details, surely he’ll be found in the log file. You’ve probably already scanned the logs for obvious errors, it’s okay to go back and look again. If it seems the log file is gibberish at first glance, it often is. But sometimes the log contains some gems that give clues as to what’s missing. Maybe the log warns that a default value is missing, maybe you’ll discover a typo in a parameter.

Read the logs, again

Amp up the verbosity on the logs if possible and try again. Often developers use the verbose logging to diagnose problems that happen in the field, so maybe the hidden detail in the verbose log may reveal further gems.

Now’s a good moment for some developer insight. Are these log messages helpful? Would someone reading the logs from your program be as delighted or frustrated with the quality of these output messages?

Keep an eye out for references to class names or methods that appear in the log or stack traces. These could lead to further clues or give you a starting point for the next stage.

Find the source

Microsoft is the largest contributor to open-source projects on Github than anyone else, so it makes sense that they bought them. Just watching the culture shift within Microsoft in the last decade has been astounding and now it seems that almost all of their properties have their source code freely available for public viewing. Some sleuthing may be required to find the right repository. Sometimes it’s as easy as Googling “<name-of-class> github” or following the link on a nuget or maven repository.

But once you’ve found the source, you enter a world of magic. Best case scenario, you immediately find the control logic in the code that relates to your problem. Worse case scenario, you learn more about this component than anyone you know. Maybe you’ll discover they parse inputs as case sensitive strings, or some conditional logic requires the presence of a parameter you’re not using.

Within Github, your secret weapon is the ability to search within the repository, as you can find the implementation and usages in a single search. Recent changes within Github’s web-interface allows you to navigate through the code by clicking on class and method names – support is limited to specific programming languages but I’ll be in heaven when this capability expands. The point is to find a place to start and keep digging. It’ll seem weird not being able to set a breakpoint and simply run the app, but the ability to mentally trace through the code is invaluable. Practice makes perfect.

If you’re lucky, the output from the log file will help guide you. Go back and read it again.

As another developer insight – this code might be beautiful or make you want to vomit. Exposure to other approaches can validate and grow your opinions on what makes good software. I encourage all developers to read as much code that isn’t theirs.

After spending some time looking at the source, check out their issues list. You might discover your problem is known by a different name that is only familiar to those that wrote it. Alternative suitable workarounds might appear from other problems.

Roadblocks are just obstacles you haven’t overcome

If you hit a roadblock, it helps to step back and think of other ways of looking at the problem. What alternative approaches could you explore? And above all else, never start from a position where you assume everything on your end is correct. Years ago when I worked part-time at the local computer repair shop, I learnt the hard way that the easiest and most blatantly obvious step, checking to see if it was plugged in, was the most important step to not skip. When you keep an open-mind, you will never run out of options.

As evidenced by the tweet above, the error message I was experiencing was something that had no corresponding source-code online and all of my problems were baked into a black-box that only exists on the build server when the build runs. When the build runs… on the build server. When the build runs on the build agent… that I can install on my machine. Within minutes of installing a local build agent, I had the mysterious black-box gift wrapped on my machine.

No source code? No problem. JetBrain’s dotPeek is a free utility that allows you to decompile and review any .net executable.

Just dig until you hit the next obstacle. Step back, reflect. Dig differently. As I sit in a coffee shop looking out at the harsh cold of our Canadian winter, I reflect that we have it so easy compared to the original pioneers who forged their path here. That’s who you are, a pioneer cutting a path that no one has tread before. It isn’t easy, but the payoff is worth it.

Happy coding.