Showing posts with label .net. Show all posts

Tuesday, July 01, 2008

Legacy Projects: Get Statistics from your Build Server

As I mentioned in my post, Working with Legacy .NET Projects, my latest project is a legacy application with no tests. We're migrating from .NET 1.1 to .NET 2.0, and this is the first entry in the series of dealing with legacy projects. Click here to see the starting point.

On the majority of legacy projects that I've worked on, there is often a common thread within the development team that believes the entire code base is outdated, filled with bugs and should be thrown away and rewritten from scratch. Such a proposal is a very tough sell for management, who will no doubt see zero value in spending a staggering amount only to receive exactly what they currently have, plus a handful of fresh bugs. Rewrites might make sense when accompanied by new features or platform shifts, but by and large they are a very long and costly endeavour. Refactoring the code using small steps in order to get out of Design Debt is a much more suitable approach, but cannot be done without a plan that management can get behind. Typically, management will support projects that can quantify results, such as improving server performance or page load times. However, in the context of a sprawling application without separation of concerns, estimating effort for these types of projects can be extremely difficult, and further compounded when there is no automated testing in place. It's a difficult stalemate between simple requirements and a full rewrite.

Assuming that your legacy project at least has source control, the next logical step to improve your landscape is to introduce a continuous integration server or build server. And as there are countless other posts out there describing how to set up a continuous integration server, I'm not going to repeat those good fellows.

While the benefits of a build server are immediately visible for developers, who are all too familiar with dumb-dumb errors like compilation issues due to missing files in source control, the build server can also be an important reporting tool that can be used to sell management on the state of the application. As a technology consultant who has played the part between the development team and management, I think it's fair to say that most management teams would love to claim that they understand what their development teams do, but they'd rather be spared the finer details. So if you could provide management a summary of all your application's problems graphed against a timeline, you'd be able to demonstrate the effectiveness of their investment over time. That's a pretty easy sell.

The great news is, very little is required on your part to produce the graphs: CruiseControl.NET 1.3 has a built-in Statistics Feature that uses XPath statements to extract values from your build log. Statistics are written to an xml file and a csv file for easy exporting, and third-party graphing tools can be plugged into the CruiseControl dashboard to produce slick-looking graphs. The challenge lies in mapping the key pain points in your application to a set of quantifiable metrics and then establishing a plan that will help you improve those metrics.

Here's a common set of pain points and metrics that I want to improve/measure for my legacy project:

Pain | Metrics | Toolset
Tight Coupling (Poor Testability) | Code Coverage, Number of Tests | NCover, NUnit
Complexity / Duplication (Code Size) | Cyclomatic complexity, number of lines of code, classes and members | NCover, NDepend, SourceMonitor or VIL
Standards Compliance | FxCop warnings and violations, compilation warnings | FxCop, MSBuild

Ideally, before I start any refactoring or code clean-up, I want my reports to reflect the current state of the application (flawed, tightly coupled and un-testable). To do this, I need to start capturing this data as soon as possible by adding the appropriate tools to my build script. While it's possible to add new metrics to your build configuration at any time, there is no way to go back and generate log data for previous builds. (You could manually check out previous builds and run the tools directly, but that would take an insane amount of time.) The CruiseControl.NET extension CCStatistics also has a tool that can reprocess your log files, which is handy if you add new metrics for data sources that have already been added to your build output.

Since adding all these tools into your build script requires some tinkering, I'll be adding them gradually. To minimize changes to my CruiseControl configuration, I can use a wildcard filter to match all files that follow a set naming convention. I'm using a "*-Results.xml" naming convention.

<!-- from ccnet.config -->
<publishers>
  <merge>
    <files>
      <file>c:\buildpath\build-output\*-Results.xml</file>
    </files>
  </merge>
</publishers>

Configuring the Statistics Publisher is really quite easy, and the great news is that the default configuration captures most of the metrics above. The out-of-the-box configuration captures the following:

  • CCNET: Build Label
  • CCNET: Error Type
  • CCNET: Error Message
  • CCNET: Build Status
  • CCNET: Build Start Time
  • CCNET: Build Duration
  • CCNET: Project Name
  • NUNIT: Test Count
  • NUNIT: Test Failures
  • NUNIT: Tests Ignored
  • FXCOP: FxCop Warnings
  • FXCOP: FxCop Errors

Here's a snippet from my ccnet.config file that shows NCover lines of code, files, classes and members. Note that I'm also using Grant Drake's NCoverExplorer extras to generate an xml summary instead of the full coverage xml output for performance reasons.

<publishers>
  <merge>
    <files>
      <file>c:\buildpath\build-output\*-Results.xml</file>
    </files>
  </merge>

  <statistics>
    <statisticList>
      <firstMatch name='NCLOC' xpath='//coverageReport/project/@nonCommentLines' include='true' />
      <firstMatch name='files' xpath='//coverageReport/project/@files' include='true' />
      <firstMatch name='classes' xpath='//coverageReport/project/@classes' include='true' />
      <firstMatch name='members' xpath='//coverageReport/project/@members' include='true' />
    </statisticList>
  </statistics>

  <!-- email, etc -->
</publishers>
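
Before wiring a new XPath into ccnet.config, it can save a few build cycles to try it against a saved copy of the merged output. The following is a minimal sketch, not part of CruiseControl.NET: the StatisticXPathCheck class and the sample XML are made up for illustration, and the sample only assumes the element and attribute names implied by the XPath expressions above.

```csharp
using System;
using System.Xml;

// Hypothetical helper for trying out statistics XPath expressions by hand.
public static class StatisticXPathCheck
{
    // Made-up fragment of a merged build log; only the element and attribute
    // names are taken from the XPath expressions in the configuration above.
    public const string SampleLog =
        "<cruisecontrol>" +
        "<coverageReport>" +
        "<project nonCommentLines='1200' files='40' classes='55' members='610' />" +
        "</coverageReport>" +
        "</cruisecontrol>";

    // Returns the text of the first node matched by the XPath,
    // or null when nothing matches.
    public static string FirstMatch(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

Calling StatisticXPathCheck.FirstMatch(StatisticXPathCheck.SampleLog, "//coverageReport/project/@nonCommentLines") returns "1200" for the sample above. A null result is a hint that the expression would not find anything in a log of that shape either.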

I've omitted the metrics for NDepend/SourceMonitor/VIL, as I haven't fully integrated these tools into my build reports. I may revisit this later.

If you've found this useful or have other cool tools or metrics you want to share, please leave a note.

Happy Canada Day!


Monday, June 23, 2008

Working with Legacy .NET Projects

My current project at work is a legacy application, written using .NET 1.1. The application is at least 5 years old and has had a wide range of developers. It's complex, has many third-party elements and constraints, and lots and lots of code. Like all legacy applications, it set out with the best of intentions but ended up somewhere else when new requirements started to deviate from the original design. It's safe to say that it's got challenges, it works despite its bugs and all hope is not yet lost.

Oh, and no unit tests, which in my world is a pretty big thing. Hope you like Spaghetti!!

Fortunately, the client has agreed to a .NET 2.0 migration, which is a great starting place. All in all, I see this as a great refactoring exercise to slowly introduce tests and proper design. Along the way, we'll be fixing bugs, improving performance and reducing friction to change. I'll be writing some posts over the next while that talk about the strategies we're using to change our landscape. Maybe you'll find them useful for your project.



Tuesday, June 17, 2008

TDD Tips: Create Custom NUnit Categories

In my recent post about test naming conventions and guidelines, I mentioned that you should annotate tests for third-party and external dependencies with category attributes and limit the number of categories that you create. This post shows basic usage of categories and explains some of the reasoning behind limiting their number. I'll also show how you can create your own categories with NUnit 2.4.x.

Although it's possible to annotate all of your tests with categories, they're really only useful for marking sensitive tests, typically around logical boundaries in your application. Some of the typical categories that I mark my tests with:

  • Database: Tests that require a database to execute. You might want to exclude these tests when you're working remotely or isolate these tests if you need to validate a database deployment for an environment.
  • Integration: Tests that interact with external components you don't have much control over, such as web-services or other infrastructure.
  • Web: Tests that perform regression tests on the visual aspect of a web-site. These tests tend to be very time consuming or require special configuration, so being able to exclude them until they're required can be a big help. Often I run these types of tests when the build server kicks off a build.

Usage

Using categories is very straightforward. Here's an example of a test that is marked with a "Database" category:


using NUnit.Framework;

namespace Code
{
    [TestFixture]
    public class AdoOrderManagementProvider
    {
        // The category name is a plain string literal.
        [Test, Category("Database")]
        public void CanRetrieveOrderById()
        {
            // database code here
        }
    }
}

Challenges with Categories

One problem I've found with using categories is that category names can be difficult to keep consistent in large teams, mainly because the category name is a literal value that is passed to the attribute constructor. In large teams, you either end up with several categories with different spellings, or the unclear intent of the categories becomes an obstacle which prevents developer adoption.

Fortunately, since NUnit 2.4.x, it's possible to create your own custom categories by deriving from the CategoryAttribute class. (In previous releases, the CategoryAttribute class was sealed.) Creating your own custom categories as concrete classes allows the solution architect to clearly express the intent of the testing strategy, and relieves the developer of spelling mistakes. As an added bonus, you get Intellisense support (through Xml Documentation syntax), ability to identify usages and the ability to refactor the category much more effectively than a literal value.

Here's the code for a custom database category, and the above example modified to take advantage of it:


using System;
using NUnit.Framework;

namespace NUnit.Framework
{
    /// <summary>
    /// This test, fixture or assembly has a direct dependency to a database.
    /// </summary>
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Assembly, AllowMultiple = false)]
    public class RequiresDatabaseAttribute : CategoryAttribute
    {
        // The category name is spelled once, here, instead of at every usage.
        public RequiresDatabaseAttribute() : base("Database")
        {}
    }
}

namespace Code
{
    [TestFixture]
    public class AdoOrderManagementProvider
    {
        [Test, RequiresDatabase]
        public void CanRetrieveOrderById()
        {
            // etc...
        }
    }
}

It's important to point out that categories can be applied per Test, per Fixture or even for the entire Assembly, so you have lots of options in terms of the level of granularity.

Filtering Tests using Categories

The real advantage to using categories is that you can filter which tests should be included or excluded when the tests are run.

Filtering Categories within Nunit-Gui.exe

To actively include/exclude tests by category in the GUI:

  1. Click on the Categories tab in the top left
  2. Select the categories you wish to include/exclude, then click the Add button.
  3. If excluding tests, check the "exclude these categories" checkbox.

Filtering Categories in NUnit 2.4.7.

Filtering Categories with Nunit-Console.exe

To include/exclude tests by category from the command line use either the /include:<category-name> or /exclude:<category-name> parameters. It's possible to provide a list of categories by using a comma delimiter.

Example of running all tests within assemblyName.dll except for tests marked as Database or Web:

nunit-console assemblyName.dll /exclude:Database,Web

Example of running only tests marked with the Database category:

nunit-console assemblyName.dll /include:Database
Note: The name of the category is case-sensitive.

Code Available

I'm pleased to announce that I've set up a repository using Google Project hosting, where I'll be posting downloadable code samples. I've created a few simple NUnit categories based on the examples above that you can download and use for your projects.

Happy testing!


Wednesday, June 04, 2008

TDD Tips: Test Naming Conventions & Guidelines

The idea behind test driven development is that you are writing the test first. Since all code must reside in a method, the very first step before you can write any code is to name the test. If you're new to TDD, you'll find this to be a very difficult thing to do. Don't let this discourage you; I'd go so far as to say that out of all the tasks a developer must accomplish, finding names for things is perhaps the most difficult. W.H. Auden's statement shows that this "meta" thing transcends development:

Proper names are poetry in the raw. Like all poetry they are untranslatable. ~W.H. Auden

This begs a question that comes up frequently for new TDD developers starting out as well as experienced developers during code review: "Is there a naming convention or guidelines for unit tests?" Some believe it to be a black art, but I think it's more like acquiring a rhythm and following along. Once you've got the rhythm it gets easier.

Prior to diving into the guidelines, let's clear up some basic vocabulary:

  • Target / Subject: I often use the term "Target" or "Subject" to refer to the piece of functionality that I'm testing.
  • Fixture: Synonymous with "TestFixture", a fixture is a class that contains a set of related tests. Fixtures are classes that have been decorated with the [TestFixture] attribute.
  • Suite: Test Suites are an older style of organizing tests. They're specialized fixtures that programmatically define which Fixtures or Tests to run. NUnit supports them for backward compatibility by using the [Suite] attribute. Since NUnit dynamically finds all tests with the [TestFixture] attribute, they're not as popular these days.
  • Test: You may have noticed that I capitalize Test in all my entries. Tests are methods within the Fixture that are decorated with the [Test] attribute and contain code that validates the functionality of our target.
  • Setup/TearDown: Test Fixtures can designate a special piece of code to run before every Test within that Fixture. That method is decorated with the [SetUp] attribute. Likewise, a method with the [TearDown] attribute is called at the end of every test within a fixture.
  • Fixture Setup/Fixture TearDown: Similar to constructors and finalizers, methods with the [TestFixtureSetUp] or [TestFixtureTearDown] attributes execute once before and once after all the Tests in the Fixture. The fixture setup runs before the first [SetUp], and the fixture teardown runs after the last [TearDown].
  • Category: The [Category] attribute, when applied to a method, associates the Test with a user-defined category.
  • Ignore: Tests with the [Ignore] attribute are skipped over when the Tests are run.
  • Explicit: Tests with the [Explicit] attribute won't run unless you manually run them.
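
To make the ordering of the setup and teardown attributes concrete, here is a small sketch. It assumes NUnit 2.4.x (the fixture-level attributes were renamed in later major versions), and the LifecycleSketchFixture class and its Calls list are made up for illustration.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical fixture that records the order in which its methods are called.
[TestFixture]
public class LifecycleSketchFixture
{
    public static readonly List<string> Calls = new List<string>();

    [TestFixtureSetUp]
    public void FixtureSetup() { Calls.Add("FixtureSetup"); }      // once, before any Test

    [SetUp]
    public void Setup() { Calls.Add("Setup"); }                    // before every Test

    [Test]
    public void CanRecordCall()
    {
        Calls.Add("Test");
        // The entry just before this Test's own entry is its Setup call.
        Assert.AreEqual("Setup", Calls[Calls.Count - 2]);
    }

    [TearDown]
    public void TearDown() { Calls.Add("TearDown"); }              // after every Test

    [TestFixtureTearDown]
    public void FixtureTearDown() { Calls.Add("FixtureTearDown"); } // once, after all Tests
}
```

With a single Test in the fixture, the runner should call the methods in the order FixtureSetup, Setup, Test, TearDown, FixtureTearDown.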

The following are some suggestions I've adopted or recommended to others from past projects. Feel free to take 'em at face value, or leave a comment if you have some to add:

Fixtures

DO: Name Fixtures consistently
TestFixtures should follow a consistent naming convention to make tests easier to find. Choose a naming convention such as <TargetType>Fixture or <TargetType>Test and stick to it.

DO: Mimic namespaces of Target Code
To help keep your tests organized, use the same folders and namespace structures as your target assembly. This will help you locate tests for target types and vice versa. Since most Test runners group Tests by their namespace, it's really easy to run all tests for a specific namespace by selecting the containing folder -- which is great for regression testing an area of code. I've got another post which talks about how to structure your Test namespaces.

DO: Name Setup/TearDown methods consistently
When naming your fixture setup and teardown methods, you really should pick a style for these methods and stick with it. Personally, I can't find any reason why you would deviate from naming these methods FixtureSetup, FixtureTearDown, Setup, and TearDown as these provide clear names. By following a standard TestFixtures structure you can cut down some of the visual noise, make tests easier to read and produce more maintainable tests across multiple developers.

CONSIDER: Separating your Tests from your Production Code
As a general rule, you should try to separate your tests from your production code. If you have a requirement where you want to test in production or verify at the client's side, you can accomplish this simply by bundling the test library with your release. Still, every project is different, and tests won't necessarily impede production other than bloating up your assembly. Separate when needed, and use your gut to tell you when you should.

CONSIDER: Deriving common Fixtures from a base Fixture
In scenarios where you are testing sets of common classes or when tests share a great deal of duplication, consider creating a base TestFixture that your Fixtures can inherit.
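
As a rough sketch of what a base Fixture might look like: the class names and the ConnectionName field below are hypothetical, and the sketch assumes NUnit 2.4.x, where (as far as I know) the runner picks up a [SetUp] method inherited from a base class.

```csharp
using NUnit.Framework;

// Hypothetical base Fixture: the shared setup is written once here.
public abstract class ProviderFixtureBase
{
    protected string ConnectionName;

    // Virtual so a derived Fixture can extend it; an override should call base.Setup().
    [SetUp]
    public virtual void Setup()
    {
        ConnectionName = "TestConnection";
    }
}

// A derived Fixture inherits the setup and only contains its own Tests.
[TestFixture]
public class OrderProviderFixture : ProviderFixtureBase
{
    [Test]
    public void CanSeeSharedSetup()
    {
        Assert.AreEqual("TestConnection", ConnectionName);
    }
}
```

Keeping the base class abstract stops the runner from treating it as a Fixture in its own right.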

CONSIDER: Using Categories instead of Suites or Specialized Tests
Although Suites can be used to organize Tests of similar functionality together, Categories are the preferred method. Suites represent significant developer overhead and maintenance. Likewise, creating specialized folders to contain tests (i.e. "Database Tests") also creates additional effort as tests for a particular Type become spread over the test library. Categories offer a unique advantage in the UI and at the command-line that allows you to specify which categories should be included or excluded from execution. For example, you could execute only "Stateful" tests against an environment to validate a database deployment.

CONSIDER: Splitting Test Libraries into Multiple Assemblies
From past experience, projects go to lengths to separate tests from code but don't place a lot of emphasis on how to structure Test assemblies. Often, a single Test library is created, which is suitable for most projects. However, for large scale projects that can have hundreds of tests this approach can get difficult to manage. I'm not suggesting that you should religiously enforce test structure, but there may be logical motivators to divide your test assemblies into smaller units, such as grouping tests with third-party dependencies or as an alternative for using Categories. Again, separate when needed, and use your gut to tell you when you should. (You can always go back)

AVOID: Empty Setup methods
As a best practice, you should only write the methods that you need today. Adding methods for future purposes only adds visual noise for maintenance purposes. The exception to this is when you are creating a base Fixture that contains empty methods that will be overridden by derived classes.

Tests

DO: Name Tests after Functionality
The test name should match a specific unit of functionality for the target type being tested. Some key questions you may want to ask yourself: "what is the responsibility of this class?" "What does this class need to do?" Think in terms of action words. Well written test names should provide guidance when the test fails. For example, a test with the name CanDetermineAuthenticatedState provides more direction about how authentication states are examined than Login.

DO: Document your Tests
You can't assume that all of your tests will be intuitive for everyone who reviews them. Most tests require special knowledge about the functionality you're testing, so a little documentation to explain what the test is doing is helpful. Using XML Documentation syntax might be overkill, but a few comments here and there are often just the right amount to help the next person understand what you need to test and how your test demonstrates that functionality.

CONSIDER: Use "Cannot" Prefix for Expected Exceptions
Since Exceptions are typically thrown when your application is performing something it wasn't designed to do, prefix "Cannot" to tests that are decorated with the [ExpectedException] attribute. Some examples: CannotAcceptNullArguments, CannotRetrieveInvalidRecord.

I would consider this a "DO" recommendation, but this is a personal preference. I can't think of scenarios where this isn't the case, so this one is up for debate.
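
Here's a minimal sketch of the "Cannot" convention, assuming NUnit 2.4.x and its [ExpectedException] attribute. The OrderRepository type is hypothetical and exists only so the example compiles.

```csharp
using System;
using NUnit.Framework;

// Hypothetical target type, included only to give the Test something to call.
public class OrderRepository
{
    public string GetOrderName(string orderId)
    {
        if (orderId == null) throw new ArgumentNullException("orderId");
        return "Order " + orderId;
    }
}

[TestFixture]
public class OrderRepositoryFixture
{
    // The "Cannot" prefix signals that the Test passes only when the call throws.
    [Test, ExpectedException(typeof(ArgumentNullException))]
    public void CannotAcceptNullArguments()
    {
        new OrderRepository().GetOrderName(null);
    }
}
```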

CONSIDER: Using prefixes for Different Scenarios
If your application has features that differ slightly for application roles, it's likely that your test names will overlap. Some teams have adopted a For<Scenario> suffix (CanGetPreferencesForAnonymousUser). Other teams have adopted a <Scenario>_ prefix separated by an underscore (AnonymousUser_CanGetPreferences).

AVOID: Ignore Attributes with no explanation
Tests that are marked with the Ignore attribute should include a reason for why this test has been disabled. Eventually, you'll want to circle back on these tests and either fix them or alter them so that they can be used. But without an explanation, the next person will have to do a lot of investigative work to figure out that reason. In my experience, most tests with the Ignore attribute are never fixed.
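
The Ignore attribute accepts a reason string, so recording the explanation costs one extra argument. A small sketch, with a hypothetical fixture and a made-up reason:

```csharp
using NUnit.Framework;

// Hypothetical fixture; the reason text is only an example.
[TestFixture]
public class OrderExportFixture
{
    // The reason travels with the Test, so the next person doesn't have to dig for it.
    [Test, Ignore("Disabled until the export folder can be configured per environment.")]
    public void CanExportOrders()
    {
        // test body omitted
    }
}
```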

AVOID: Naming Tests after Implementation
If you find that your tests are named after the methods within your classes, that's a code smell that you're testing your implementation instead of your functionality. If you changed your method name, would the test name still make sense?

AVOID: Using underscores as word-separators
I've seen tests that use_underscores_as_word_separators_for_readability, which is so-o-o 1960. PascalCase should suffice. Imagine all the time you save not holding down the shift key.

AVOID: Unclear Test Names
Sometimes we create tests for bugs that are caught late in the development cycle, or tests to demonstrate requirements based on lengthy requirements documentation. As these are usually pretty important tests (especially for bugs that creep back in), it's important to avoid giving them vague test names that represent some external requirement like FixForBug133 or TestCase21.

Categories

DO: Limit the number of Categories
Using Categories is a powerful way to dynamically separate your tests at runtime; however, their effectiveness is diminished when developers are unsure which Category to use.

CONSIDER: Defining Custom Category Attributes
As Categories are sensitive to case and spelling, you might want to consider creating your own Category attributes by deriving from CategoryAttribute. UPDATE: Read more about custom NUnit Categories.

Well, that's all for now. Are you doing things differently, or did I miss something? Feel free to leave a comment.

Updates:

  • 6/18/08 - Added links to custom NUnit Categories


Wednesday, May 21, 2008

log4net Configuration made simple through Attributes

I'm sure this is well documented, but for my own reference and your convenience, here's one from my list of favorite log4net tips and tricks: how to instrument your code so that log4net automatically picks up your configuration.

On average, I've been so happy with how well log4net has fit my application logging needs that most of my projects end up using it: console apps, web applications, class libraries. Needless to say I use it a lot, and I get tired of writing the same configuration code over and over:

private static void Main()
{
    string basePath = AppDomain.CurrentDomain.BaseDirectory;
    string filePath = Path.Combine(basePath, "FileName.log4net");
    XmlConfigurator.ConfigureAndWatch(new FileInfo(filePath));
}

log4net documentation refers to a Configuration Attribute (XmlConfiguratorAttribute), but it can be frustrating to use if you're not sure how to set it up. The trick is how you name your configuration file and where you put it. I'll walk through how I set it up...

log4net using XmlConfiguratorAttribute Walkthrough

  1. Add an Assembly Configuration Attribute: log4net will look for this configuration attribute the first time you make a call to a logger. I typically give my configuration file a "log4net" extension. Place the following configuration attribute in the AssemblyInfo.cs file in the assembly that contains the main entry point for the application.

    [assembly: log4net.Config.XmlConfigurator(ConfigFileExtension = "log4net",Watch = true)]

  2. Create your configuration file: As mentioned previously, the name of the configuration file is important, as is where you put it. In general, the name of the configuration file should follow the convention: full-assembly-name.extension.log4net. The file needs to be at the base folder of the application, so for WinForms and Console applications it resides in the same folder as the main executable; for ASP.NET applications it's the root of the web-site alongside the web.config file.

    Project Type | Project Output | log4net file name | Location
    WinForm App | Program.exe | Program.exe.log4net | with exe
    Console App | Console.exe | Console.exe.log4net | with exe
    Class Library | Library.dll | N/A |
    ASP.NET | /bin/Web.dll | /Web.dll.log4net | Web root (/)

  3. Define your Configuration Settings: Copy and paste the following sample into a new file. I'm using the Rolling Appender as this creates a new log file every time the app is restarted.

    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>

      <configSections>
        <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />
      </configSections>

      <log4net>
        <!-- Define output appenders -->
        <appender name="RollingLogFileAppender" type="log4net.Appender.RollingFileAppender">
          <file value="log.txt" />
          <appendToFile value="true" />
          <rollingStyle value="Once" /> <!-- new log file on restart -->
          <maxSizeRollBackups value="10"/> <!-- renames rolled files on startup 1-10, no more than 10 -->
          <datePattern value="yyyyMMdd" />
          <layout type="log4net.Layout.PatternLayout">
            <param name="Header" value="[START LOG]&#13;&#10;" />
            <param name="Footer" value="[END LOG]&#13;&#10;" />
            <conversionPattern value="%d [%t] %-5p %c [%x] - %m%n" />
          </layout>
        </appender>

        <!-- Setup the root category, add the appenders and set the default level -->
        <root>
          <level value="DEBUG" />
          <appender-ref ref="RollingLogFileAppender" />
        </root>

      </log4net>
    </configuration>

  4. Make a logging call as early as possible: In order for the configuration attribute to be invoked, you need to make a logging call in the assembly that contains that attribute. Note I declare the logger as static readonly as a JIT optimization.

    using System;
    using log4net;

    namespace example
    {
        public class Global : System.Web.HttpApplication
        {
            private static readonly ILog log = LogManager.GetLogger(typeof(Global));

            protected void Application_Start(object sender, EventArgs e)
            {
                // First logging call in this assembly; this is what triggers the configuration attribute.
                log.Info("Web Application Start.");
            }
        }
    }

Cheers.


Thursday, May 15, 2008

TDD Tips: Getting value out of Code Coverage

If you're following true test driven development, you should be writing tests before you write the code. By definition you only write the code that is required and you should always have 100% code coverage.

Unfortunately, this is not always the case. We have legacy projects without tests; we're forced to cut corners; we leave things to finish later that we forget about. For that reason, we look to tools to give us a sense of confidence in the quality of our code. Code coverage is often (dangerously) seen as a confidence gauge. So to follow up on a few of my other TDD posts, I want to talk about what value code coverage can provide and how you should and shouldn't use it...

Let's start by looking at what code coverage will tell us...

  • Code coverage shows which parts of our code have been tested. This metric is usually inferred as a total percentage of code that has been tested.
  • Most coverage tools keep track of how many times methods have been visited. This value shows us how much or how little testing is represented for a specific code block, but as far as I know, there's no overall valuable metric. You could infer "top most tested" or "top least tested" metrics.

In some cases, code coverage can be used to contribute to a confidence level. I feel better about a large code base that has an 80% coverage than little or no coverage. But coverage is just statistical data -- it can be misleading...

Good Coverage doesn't mean Good Code
Having a high coverage metric cannot be used as an overall code quality metric. Code coverage cannot reveal that your code or tests haven't accounted for unexpected scenarios, so it's possible that buggy code with "just enough" tests can have high coverage.

Good Coverage doesn't mean Good Tests
A widely held belief of TDD is that the confidence level of the code is proportional to the quality of the tests. Code coverage tools can be very useful to developers to identify areas of the code that are missing tests, but should not be used as a benchmark for test quality. Tests can become meaningless when developers write tests to satisfy coverage reports instead of writing tests to prove functionality of the application. See the example below.

How a few bad tests ruin coverage

Developers can unknowingly write a test that invalidates coverage. To demonstrate, let's assume we have a really simple Person class. For sake of argument, FirstName is always required so we make it available through the constructor.


[TestFixture]
public class PersonTest
{
    [Test]
    public void CanCreatePerson()
    {
        Person p = new Person("Bryan");
        Assert.AreEqual("Bryan", p.FirstName); // expected value first, then actual
    }
}

public class Person
{
    public Person(string firstName)
    {
        _first = firstName;
    }

    public virtual string FirstName
    {
        get { return _first; }
        set { _first = value; } // never called by the test above
    }

    private string _first;
}

This is all well and good. However, a code coverage report would reveal that the FirstName property setter (highlighted above) has no coverage.

Should we fix the code....


public Person(string firstName)
{
    _first = firstName;
    FirstName = firstName; // virtual method call in constructor is an FxCop violation
}

... or the test?


[Test]
public void CanCreatePerson()
{
    Person p = new Person("bryan");
    Assert.AreEqual("bryan", p.FirstName);
    p.FirstName = "Bryan";
    Assert.AreEqual("Bryan", p.FirstName);
}

Trick question. Neither!

There are two ways to improve code coverage -- write more tests, or get rid of code. In this case, I would argue that it is better to remove the setter than to write any code just to satisfy coverage. (Wow, less really IS more!) Leave the property as read-only until some calling code needs to write to it, at which point the tests for that call site will provide the coverage you need.
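
For illustration, this is roughly what the Person class looks like once the unused setter is gone. It's a sketch of the idea rather than the only way to do it; making the field readonly is my own addition.

```csharp
public class Person
{
    // Assigned once in the constructor; readonly is an extra safeguard, not required.
    private readonly string _first;

    public Person(string firstName)
    {
        _first = firstName;
    }

    // Read-only until some calling code actually needs to write to it.
    public virtual string FirstName
    {
        get { return _first; }
    }
}
```

With the setter removed, the existing CanCreatePerson test exercises every line of the class, and nothing was written purely to please the coverage report.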

"But putting the setter back in is a pain!" -- sure it is. Alternatively, you can leave it in, but make sure you do not write a test for it. If the coverage remains zero for extended periods of time, remove it later. (If you can't remove it because some calling code is writing to it, you missed something in one of your tests.)

Note: In general, plain old value objects like our Person class won't need standalone tests. The exception to this is when you need tests to demonstrate specialized logic in getter/setter methods.

Coverage Tips for Your Project

  • Set goals for coverage: Talk to your team about coverage and gather their feedback early in the project. Identify areas that will be difficult to test and develop strategies to make your code more testable. Agree upon a level of acceptable coverage based on your timelines and these constraints. For most projects that start with TDD in mind, 70-80% is a very realistic target. I don't have any concrete data to back this up, but I imagine that effort increases by orders of magnitude after a certain percentage.
  • Watch for changes in coverage: Rather than looking at overall code coverage percentage as a quality metric, integrate coverage into your build or continuous integration process and look at the change in coverage between builds. Coverage will fluctuate as a project matures, but eventually it should level out and remain relatively constant between changes. Applaud when it goes up, recognize the hard work of your team when it stays the same, and investigate when it takes a steep drop. As an added bonus, the integrated coverage logs on your build server can be analyzed over time: it's amazing how developer churn, ramp-up, changes in functionality/design/timelines can become evident in a graphed timeline of failed builds and drops in coverage.
  • Use Milestones: Whether you're in a waterfall or agile project, pick milestones in your project where you can look at coverage. I try to fit in at least one code review per iteration and kick them off with a look at code coverage reports ("Yikes! We don't have any tests for this entire namespace, maybe we should fix that.") When coverage is low, I use this time to evangelize the benefits of having tests. Set a goal for the next iteration and get buy-in from the team, management (and client) for well written tests that bump up your coverage. It can be a fun motivator for the team.
  • Don't Force it: If you obsess about coverage, you're probably doing it wrong. Deliberately reworking code so that it will light up in the coverage report, or writing coverage-serving tests, yields little benefit -- let it come naturally by writing concise tests. If your tests don't reflect the functionality of the application, fix your tests; if the tests serve only to satisfy coverage they likely don't serve anybody.


Monday, May 12, 2008

TDD Tips: Unit Test Namespace considerations

In my last post, I highlighted some of the test-driven benefits of using the InternalsVisibleTo attribute. In keeping with the trend of TDD posts, here's a recent change in direction I've made about how to separate your tests from your code.

There's a debate and poll going on about where you should put your tests. The poll shows that the majority of developers are putting their tests in separate projects (yeaaaa!). Bill Simser's suggestion to have tests reside within the code is a belief that balances dangerously between heresy and pragmatism. Although I'm opposed to releasing tests with production, one point I can identify with is the overhead of keeping namespaces between your code and your tests in sync. (Sorry Bill, if I wanted my end users to run my tests, I'd give them my Test library and tell them to download NUnit.) Along the same lines, at some point our organization picked up some advice that code namespaces should reflect their location in source control. This has proven effective for maintenance, as it makes it easier to track down Types when inspecting a stack-trace. Following this advice has led us to adopt a consistent naming strategy for assemblies and their namespaces:

Project      Namespace                 Assembly
Component    Company.Component         Company.Component.dll
Test         Company.Component.Test    Company.Component.Test.dll

This works well, but I have a few hang-ups on this design. This strategy pre-dates most of our TDD efforts, and frankly it gets in the way. Here are my issues:

  • Namespace Mismatch: We attempt to model the same folder structure between projects and although the folder structure is the same, the namespaces are different. The type Customer would reside in Company.Component.Customer while the CustomerTest would reside in Company.Component.Test.Customer.
  • Pure TDD is difficult: When the namespaces are different, it's a lot of extra clicking if you want to create your types as you write your Tests. You have to get out of the Test, create the Type in Library project, switch back to the test and then add the appropriate namespace using statement. If you create the type in the same file as the Test, you'll have to refactor the tests and the Type namespaces when you move it to the library. Most of these issues get caught at compile time, but it's a real nuisance.

However, there is some great advice in the Framework Design Guidelines book which states that assembly names and namespaces don't necessarily have to match. From Brad Abrams' site:

Please keep in mind that namespaces are distinct from DLL names. Namespaces represent logical groupings for developers whereas DLLs represent packaging and deployment boundaries. DLLs can contain multiple namespaces for product factoring and other reasons, and namespaces can span multiple DLLs. Since namespace factoring is different than assembly factoring, you should design them independently.

A great example is that there is no System.IO.dll in the .NET framework: System.IO.FileStream resides in MSCorLib.dll while System.IO.FileSystemWatcher resides in System.dll. So if we apply this concept to our solution and think of Tests as a subset of functionality with different packaging purposes, our code and test libraries look like this:

Project      Namespace            Assembly
Component    Company.Component    Company.Component.dll
Test         Company.Component    Company.Component.Test.dll

Here's a snap of my Test Library's project properties...

 

Now that the namespaces are identical between projects, I never have to worry about missed namespace declarations --- I can quickly create Types in the Test library and move them to the library when I'm done. As an added bonus, when I change the namespace using Resharper, it will change my Test library as well. Here's what the TDD flow looks like using Resharper:

  1. Write the test, refer to a new non-existent Type.
  2. Use Resharper to generate the missing class. The class is created in the same file as the test and is marked internal.
  3. Flesh out the class using additional tests.
  4. When the class is finished, right click the class and choose Refactor -> Move. Specify a new file, the name will automatically reflect the Type name.
  5. Drag the new file while holding the SHIFT key from the Test library to the code project. This will physically move the file between the projects and automatically update the project definitions.

Caveats:

  • Folder Issues: I should point out that this doesn't resolve the folder renaming issue. If you rename the folder in your code library, you'll have to do the same in the Test library. Mind you, Resharper doesn't automatically fix folders when you rename them anyway, so you're going to have to fix this yourself.
  • Maintenance Strategy: The maintenance model strategy that allows you to identify the location of a Type in source control based on a stack-trace is partially broken with this design. I say partially because a stack-trace should really only be a concern for production code, and stack-traces for unit-tests don't provide much in the context of a Test Runner. Still, to support troubleshooting, I encourage developers to follow a "Test" naming convention for their tests.
  • Intellisense Confusion: With your Test and Code library sharing the same namespaces, both TestFixtures and Types will show up in Intellisense when you write code in your Test library. Some might see this as noise when writing tests; others might use it as a good holistic view for classes and associated Tests. If this really bothers you, you could mark your tests with an attribute that would hide the tests from Intellisense.


Wednesday, May 07, 2008

Compiling .NET 1.1 using NAnt

Yesterday I met a cashier who needed to use a calculator when I gave him $20.35 for a $10.34 item. Experiences like this are terrifying, and rather than let myself become reliant on tools with rich user interfaces, I like to give my brain and fingers a workout every now and then and use some command line tools. Today, I needed to make some changes to a legacy .NET 1.1 application. Rather than going through the hassle of installing Visual Studio 2003, I figured I could get by with our great NAnt scripts and Notepad++ for a short while. Apart from having to download and install the .NET 1.1 SDK, I ran into a few snags:

Running NAnt in 1.1

Our NAnt scripts need to run under the .NET 1.1 framework and require a specific version of NAnt. Fortunately, when we put the project together, we assumed that not everyone would have NAnt installed on their machines, so we created a "tools" folder in our solution and included the appropriate version of NAnt. To simplify calling the local NAnt version, we created a really simple batch file:

tools\nant\bin\nant.exe -buildfile:main.build -targetframework:net-1.1 %*

Missing or Wrong References

The nant "solution" task gave me some trouble. Dependencies that were wired into the csproj file with a valid HintPath were not being found. In particular, I had problems with my version of NUnit. It was referencing a .NET 2.0 version somewhere else on the machine. While I could have treated the symptom by copying the command line out of the log file, I decided to go to the source using Reflector. The NAnt "solution" task uses the registry to identify well known assembly locations from the following locations:

HKCU\SOFTWARE\Microsoft\VisualStudio\<Version>\AssemblyFolders
HKLM\SOFTWARE\Microsoft\VisualStudio\<Version>\AssemblyFolders
HKCU\SOFTWARE\Microsoft\.NETFramework\AssemblyFolders
HKLM\SOFTWARE\Microsoft\.NETFramework\AssemblyFolders

I found the culprit here:

 

Deleting this registry key did the trick; now it compiles fine.


Tuesday, May 06, 2008

TDD Tips: InternalsVisibleTo - Keep your API clean

...or how to have all the great benefits of clean code and 100% code coverage too.

Although the .NET 2.0 Runtime has been out for quite some time, I'm still surprised that most people are not aware that the 2.0 framework supports a concept known as "Friend Assemblies", made possible using the InternalsVisibleToAttribute. For me, this handy (and dare I say awesome) attribute solves an age-old problem frequently encountered with Test Driven Development, and when I first stumbled upon it about two years ago, my jaw hit the floor and I was all nerdy giddy about it.

This has all been blogged about before, but I want to comment on some of the best practices this approach affords us. As a general rule of thumb, you should always try to keep your Unit Tests out of your production code. After all, the classes needed for testing will never be used by end-users, so to prevent bloating up your assembly you should put the tests in a different assembly and leave 'em at home when you release the code. Unfortunately, this produces a strange side-effect: Types and Methods that would normally be marked as internal or private must be made public so that the external Test assembly can access them. You're left with a difficult compromise... either choose to violate your API access rules to support testing, or forgo all unit testing and code coverage for clean code. While the practice of exposing types is relatively harmless, it can introduce some negative side-effects into your project, especially if you're producing a library that is shared with other applications or third parties. Specifically, it can hurt usability and performance:

  • Usability of your assembly will be reduced because users will have a full gamut of Types to choose from. A clean API with only a few public facing classes is easier to understand than dozens of utility and helper classes. If you only have a handful of classes, this doesn't apply to you -- but if you've ever inherited a project with hundreds of Types and piss-poor documentation, I know you know what I'm talking about.
  • Performance of your API will be compromised if you follow FxCop recommendations -- which btw, is good advice. With all these public facing types you'll need additional parameter-validation and error handling because you can't guarantee how third-parties will access your Types. If your app is for internal-use, you can shirk this responsibility, but be warned: the onus is on you to enforce proper use of your library and to ignore several dozen FxCop violations. If you have third-parties using your library, this extra plumbing is hard to avoid, so it's more likely the Types and Methods are kept private/internal and the tests are simply neglected. Which, IMHO, is where you really need the tests, since the bugs are more likely to be nested deep in your implementation rather than the public exposed API, and hey ...bugs are bad for business.

Here's a few links that refer to these best practices:

Fortunately, the InternalsVisibleTo attribute fixes these issues. By placing the attribute in your assembly, you can keep types as internal and still allow unit testing.

Attribute Usage Examples

Using the attribute is quite simple. The attribute is placed in the assembly that contains the internal classes and methods that you want to expose to other "friend" assemblies. The attribute lists the "friend" assembly.

using System.Runtime.CompilerServices;

[assembly:InternalsVisibleTo("assemblyName")]

MSDN documentation refers to strong names when referring to the friend assemblies, however, a strong-name is not required. This is extremely useful if you're just starting your project or not ready to strong-name the assembly. Note that if you are using a strong-name, it's the full public key and not just the public token.

[assembly:InternalsVisibleTo("assemblyName, PublicKey=fff....")]

To get the full public key of your assembly, you can use the strong name tool that ships with the .NET Framework to extract the public key:

sn -Tp Code.dll

Alternatively, David Kean has published a handy tool that can help you generate the InternalsVisibleTo attribute, so you can simply paste it into your assembly. However, his site is presently being reworked. I have the binary downloaded from his site, though I have nowhere to host the file. Give me a shout if you're interested... and David, let us know when your site is back up.

Note: Although the strong-name is optional, you should be using strong-names on your assemblies as a best practice to prevent this type of runtime injection. And if you go down this route, all referenced assemblies must also be signed (all the more reason why you should be using strong-naming in the first place).

A Code Example...

This rudimentary example shows how you can create a class that takes advantage of the InternalsVisibleTo attribute. There are two assemblies: "Code" is my main assembly, which has the InternalsVisibleTo attribute and public facing API; "Test" is my test library that references "Code". If these assemblies weren't friends, all Types within "Code" would have to be public.

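// within Code.dll
using System.Globalization;
using System.Runtime.CompilerServices;
using System.Threading;

[assembly:InternalsVisibleTo("Code.Test")]

namespace Code
{
    // internal: visible only inside Code.dll and to the friend assembly Code.Test
    internal class StringUtility
    {
        public static string ProperCase(string input)
        {
            CultureInfo culture = Thread.CurrentThread.CurrentCulture;
            return culture.TextInfo.ToTitleCase(input.ToLower(CultureInfo.InvariantCulture));
        }
    }
}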

// within Code.Test.dll
using NUnit.Framework;

namespace Code.Test
{
    [TestFixture]
    public class StringUtilityTest
    {
        [Test]
        public void CanGetProperCaseFromInternalClass()
        {
            Assert.AreEqual("Hello", StringUtility.ProperCase("HELLO"));
        }
    }
}

Kudos to Rick Strahl for the ProperCase string tip.

The Payoff...

So now that you've got your internal classes with test coverage goodness, treat yourself by opening up FxCop and viewing the reduced violations report.

FxCop before:

This screen capture of FxCop shows a few standard FxCop violations (my assembly isn't strong-named, yet) and a violating public arguments warning.

FxCop after:

Since most FxCop rules are centered around designing public APIs, classes that are marked as internal are exempt from certain rules. This snapshot shows how our internal class isn't subject to requiring additional validation logic.

 


Thursday, May 01, 2008

.NET Garbage Collection Behavior for Release Code

Every so often, I pick up my copy of Jeffrey Richter's CLR via C#, which provides a great low level look at the .NET Framework intrinsics. When I read this book two things are likely to happen: either I fall fast asleep, or I discover something that makes my head snap backward at break-neck speeds. Here's a great mind bender on garbage collection. Take this simple console program:

using System;
using System.Threading;

public class Program
{
    public static void Main()
    {
        // setup a call back for every two seconds
        Timer t = new Timer(Callback,null,0,2000);
        
        Console.ReadLine();
    }
    
    private static void Callback(object state)
    {
        Console.WriteLine("Callback called.");
        GC.Collect();
    }
}

This simple console program when compiled in Debug mode has different behavior than when it's compiled in Release mode.

Skeptic? Try it.

Debug Mode:

  1. Compile the solution in Debug mode.
  2. Open a command-prompt and execute the app
  3. The callback is called every two seconds until the Console reads a line.

Release Mode:

  1. Compile the solution in Release mode.
  2. Open a command-prompt and execute the app
  3. The callback is only called once.

...does your neck hurt? ;-)

In Release mode, the JIT compiler optimizes the code. At the first callback, where we force garbage collection, the Garbage Collector determines that our timer is not used in the remainder of the Main method, is therefore not "rooted", and can be safely garbage collected. As this behavior would wreak havoc on debugging sessions, the JIT compiler treats un-optimized code (Debug) differently: it artificially "roots" all variables within a method to prevent them from premature garbage collection. Note that release code running under a Visual Studio debugging session will have the same behavior as debug code, which is why you need to run both builds from the command-line. You can fix this code by adding another call to our timer object further on down the method. When the Garbage Collector runs, it will walk the stack and determine that our variable is "rooted", and our Release code will work just like its Debug counterpart.

Here's our Main method modified to prove that point:

public static void Main()
{
    Timer t = new Timer(Callback,null,0,2000);
    Console.ReadLine();
    
    // our object is now rooted and will survive garbage collection
    t.Dispose();
}

Jeff also points out that simply adding code like:

t = null;

...won't change anything since this line will be optimized out of the code during JIT compilation. In short, what this means is that objects don't have to fall out of scope (i.e., reach the end of the method) to be garbage collected. The garbage collector operates under the assumption that all objects are garbage until proven useful, regardless of where the object appears on the stack. So if you're not using it, the garbage collector is going to throw it out.
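
Another way to express the same fix is GC.KeepAlive, a framework method whose only job is to count as a use of the object at the point of the call. The sketch below is an illustration rather than anything from the book: the KeepAliveSketch class and its CountCallbacks helper are made-up names, and Thread.Sleep stands in for Console.ReadLine so the behavior can be observed without keyboard input.

```csharp
using System;
using System.Threading;

public static class KeepAliveSketch
{
    private static int _calls;

    // Hypothetical helper: starts a timer, waits roughly durationMs,
    // and reports how many callbacks fired in that window.
    public static int CountCallbacks(int periodMs, int durationMs)
    {
        Interlocked.Exchange(ref _calls, 0);
        Timer t = new Timer(Callback, null, 0, periodMs);

        Thread.Sleep(durationMs);

        // This call is a use of t, so the timer is treated as reachable
        // up to this line even in optimized (Release) code.
        GC.KeepAlive(t);

        return Volatile.Read(ref _calls);
    }

    private static void Callback(object state)
    {
        Interlocked.Increment(ref _calls);
        GC.Collect(); // force a collection on every tick, as in the original program
    }
}
```

In real code the Dispose call shown earlier is usually the better choice, since the timer needs cleaning up anyway; GC.KeepAlive fits the cases where no natural use of the object remains.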


Tuesday, April 29, 2008

Running Multiple .NET Services within a Single Process

I love the fact that .NET makes it profoundly easy to write Windows Services. Most of the low level details have been abstracted away, and while this makes it easier to write and deploy services, sometimes it doesn't work the way you'd expect. For example, I noticed something odd when I tried to write a Service that hosted multiple services. According to the API, it is possible to provide multiple ServiceBase objects to the static Run method as an array:

public static void Main()
{
 ServiceBase[] ServicesToRun = new ServiceBase[] { new Service1(), new Service2() };
 ServiceBase.Run(ServicesToRun);
}

However, when my code executes, only the first ServiceBase object runs, which seems suspicious. The culprit is that the API is somewhat misleading -- the ServiceBase.Run method is not actually responsible for running your services. Instead, it loads them into memory and then hands them off to the Service Control Manager for activation. The service that gets activated is the service you requested when you activated it from the Services Applet or command line:

NET START Service1

This error has appeared in many different forums, but no one seems to post a working example, so maybe I'm not entirely alone on this one. I think part of the confusion stems from the fact that I can give the first ServiceBase object in the array any ServiceName that I wish and it will execute.

public static void Main()
{
 ServiceBase myService = new Service1();
 myService.ServiceName = "ServiceA";
 ServiceBase.Run(myService);
}

How to make it work:

The correct way to allow multiple services to run within a single process requires the following:

  1. An installer class with the RunInstaller attribute set to True. The class is instantiated and invoked when you run InstallUtil.exe
  2. The installer class must contain one ServiceProcessInstaller instance. This object is responsible for defining the operating conditions (Start-up mode and User) that your service application will run under.
  3. The installer class must contain one ServiceInstaller instance per ServiceBase in your application. If you plan on running multiple services, each one must (sadly) be installed prior to use.
  4. For the service that you anticipate being started from the Services Applet, list the other services in the ServicesDependedOn property so that they will be started when your service starts:
using System.ComponentModel;
using System.Configuration.Install;
using System.ServiceProcess;

[RunInstaller(true)]
public class MyServiceInstaller : Installer
{
 public MyServiceInstaller()
 {
     // one process installer: the account the shared process runs under
     ServiceProcessInstaller processInstaller = new ServiceProcessInstaller();
     processInstaller.Account = ServiceAccount.LocalSystem;
  
     // the service started from the Services Applet; depends on Service2
     ServiceInstaller mainServiceInstaller = new ServiceInstaller();
     mainServiceInstaller.ServiceName = "Service1";
     mainServiceInstaller.Description = "Service One";
     mainServiceInstaller.ServicesDependedOn = new string [] { "Service2" };
  
     ServiceInstaller secondServiceInstaller = new ServiceInstaller();
     secondServiceInstaller.ServiceName = "Service2";
     secondServiceInstaller.Description = "Service Two";
  
     Installers.Add(processInstaller);
     Installers.Add(mainServiceInstaller);
     Installers.Add(secondServiceInstaller);
 }
}

Now when Service1 starts, Service2 is also started. Happily, both services log to the same log4net file and the number of Processes in the Task Manager increments only by one.

Note that when Service2 is stopped, Service1 will also be stopped, because Service1 depends on it. However, shutting down Service1 will not stop Service2. If you want tighter coupling between the two services, you might consider adding ServiceController logic to Service1 to start and stop Service2 during the Service1 OnStart and OnStop methods... maybe something I'll follow up with in a later post.


Tuesday, April 22, 2008

ASP.NET Uri Fragment is not available

Recently, a question came my way about filtering URLs that contain fragment-identifiers. A fragment-identifier goes by many different names (bookmark, pound, hash, named-anchor, etc) and is represented as a pound symbol (#) at the end of a querystring:

http://server/path?query#fragment-identifier.

As it happens, I had looked into something similar only a few months previously, so my response came immediately: "this cannot be done." At the time, I was surprised to learn that the fragment of the URL is a client-side only construct: most modern browsers use it primarily to scroll the named element into view, and they do not transmit it to the web server. A simple test shows this value is NEVER populated.

public partial class PageTest : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.Write("Fragment = " + Request.Url.Fragment + "<br />");
    }
}

Sadly, this is not ASP.NET specific. It's part of the Uri specification. A Wikipedia article on this topic suggests that you can pass "#" to the server if it is encoded as %23, although this value is treated as part of the querystring instead of being interpreted as the Uri fragment.

If you need these values in the URL, put them in the query-string.
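
To see how the two forms are parsed, here is a small sketch that uses System.Uri directly rather than an ASP.NET page. The FragmentSketch class and its method names are made up for illustration, and the URLs are the placeholder ones from this post.

```csharp
using System;

public static class FragmentSketch
{
    // Fragment portion of the URL, including the leading '#', or "" if there is none.
    public static string FragmentOf(string url)
    {
        return new Uri(url).Fragment;
    }

    // Query portion of the URL, including the leading '?', or "" if there is none.
    public static string QueryOf(string url)
    {
        return new Uri(url).Query;
    }
}
```

Given the full string "http://server/path?query#fragment-identifier", FragmentOf returns the fragment and QueryOf returns only "?query". On the server the fragment is empty simply because the browser never sends that part. With the pound sign encoded, as in "http://server/path?value=a%23b", the whole value stays in the query and the fragment is empty.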


Friday, April 18, 2008

Visual Studio "Format the whole document" for Notepad++ using Tidy

I've started playing with Notepad++ over the last year, and I'm really liking it. If you've been living under a rock, it's an open source replacement for the boring Windows notepad.exe and has appeared on top-ten lists, including Scott Hanselman's Ultimate Developer and Power User Tool List. While I haven't completely replaced Visual Studio, I have found a few neat tricks that have saved me a lot of grief.

My recent favorite is Tidy, an open source tool that can format HTML output and is included as a TextFX plugin for Notepad++. By default, it doesn't do much, but the magic starts when you drop a configuration file into the TextFX plugin folder. Here's how I've configured mine:

  1. Navigate to C:\Program Files\Notepad++\plugins\NPPTextFX
  2. Create a text file named htmltidy.cfg and place the following contents inside:
    indent: auto
    indent-spaces: 2
    wrap: 72
    markup: yes
    input-xml: yes
  3. Enjoy!

The configuration above is a basic format, which automatically wraps and indents XML/XHTML files nicely. To use it, just load up your XML file, choose "TextFX -> TextFX HTML Tidy -> Tidy" and your document should automatically indent properly. If you need more options, check out the Tidy quick reference guide. If you format a lot of XML documents, you can speed things up by assigning a ShortCut key:

  1. Choose "Settings -> Shortcut Mapper"
  2. Click on the "Plugin Commands" and scroll down to entry "D:Tidy" (entry 241 on my system).
  3. Double click the item and assign a ShortCut key.

A quick aside on the Shortcut keys: I had to try a few different options until Tidy formatted my document. I suspect that Notepad++ doesn't detect duplicate Shortcuts. I settled on CTRL+ALT+K, which seems to work without issue.

Lastly, if you want to completely replace "notepad.exe" with "Notepad++", there's a neat replacement utility referenced on the Notepad++ site; download it and follow its basic instructions. Note that this utility is not the same as renaming notepad++.exe to notepad.exe and dropping it in your Windows directory; it's a utility that looks up the location of notepad++.exe from the registry and forwards requests to it. Also note, if your machine shipped with a copy of the operating system (typically an i386 folder), you need to replace the original notepad.exe there as well.


Thursday, April 17, 2008

Selenium 0.92 doesn't work in IE over VPN or Dialup

I've been writing user interface tests for my current web project using Selenium. I really dig the fact that it uses JavaScript to manipulate the browser. I'm working on building a Language Pattern, where my unit tests read like a simple domain language -- it involves distilling Selenese output into a set of reusable classes.

I ran into a really frustrating snag during a late-night coding session and I started to freak out a bit -- my Selenium tests just magically stopped working! Instead of getting the Selenium framed window, my site was serving 404 messages. At first I thought the plumbing code that I had written was somehow serving the wrong URL.

I quickly switched my tests to FireFox and was relieved to see them working fine -- and under the same URL. Since my client uses IE 6, dropping Internet Explorer support for UI tests would be a deal breaker. I was surprised to see the tests work when I switched the URL from localhost:80 to localhost:4444, which is the port Selenium's proxy server runs on. The light in my head started to glow...

The aha moment came when I switched back to FireFox: I noticed that none of my FireFox plugins were loaded and that the Proxy server setting had been enabled to route localhost:80 through localhost:4444. Selenium is controlling registry settings for the browser, meaning that some setting was missing in IE. Although Internet Explorer had been configured to use my Selenium proxy-server settings, it ignores these values when on dial-up and VPN connections. You need to specify a different proxy server through an Advanced settings menu.

Both FireFox and Internet Explorer use PAC files, which are used to automatically detect the configuration settings for your proxy server. Selenium generates a new PAC file between executions, so you'll quickly find that manually fixing it becomes a pain. To fix this, create your own PAC file and wire the setting in yourself.

Here's a snap of my connections dialog:

And the contents of my selenium-proxy.pac file:

function FindProxyForURL(url, host) {
        return 'PROXY localhost:4444; DIRECT';
}


Tuesday, April 15, 2008

Danger: Visitor Design Pattern can be useful

It seems that in my circles, out of all the design patterns in the Gang of Four book, the Visitor pattern is often seen as confusing and impractical. I'd agree with that assessment: patterns like the Command, Strategy, Composite, and Factory are commonly used because it's easy to think of examples that work, whereas the Visitor Pattern has a confusing relationship between objects and requires a lot of upfront code to make it work. It's easily filed under the i-don't-think-i'll-ever-use-this category.

I recently found a great code example on Haibo Luo's blog that involved using reflection to read IL (using MethodBody.GetILAsByteArray()). In it he posts two very different approaches to parsing the IL: the first post shows a Reflector-like example of an IL reader; the second post is focused on a related side-project but outlines how he was able to use the Visitor pattern to allow different interpretations of the IL. The Visitor pattern is perfect here, because IL is based on a fixed specification that will never change. (A side note: the entire ILReader class is attached as a zip at the bottom of the post and is worth checking out if you're interested in parsing IL using Reflection.)

After showing this example to a few colleagues (with some heated debates), I found new appreciation for the Visitor pattern. The Visitor Pattern can be really useful anywhere you have a fixed set of data, which surprisingly happens more frequently than you might think.

Take "Application Configuration" as an example. Normally, I'd write a simple Parser to read through the configuration to construct application state. Since the configuration elements are a fixed object model, they can be easily modified to accept a visit from a visitor:

using System;
using System.Configuration;
using System.Text;

public interface IConfigVisitor
{
void Visit(MyConfig configSection);
void Visit(Type1 dataElement);
void Visit(Type2 dataElement);
}

public interface IConfigVisitorAcceptor
{
void Accept(IConfigVisitor visitor);
}

public class MyConfig : ConfigurationSection, IConfigVisitorAcceptor
{
// config stuff here, omitted

public void Accept(IConfigVisitor visitor)
{
visitor.Visit(this);
}
}

public class Type1 : ConfigurationElement, IConfigVisitorAcceptor
{
// config stuff here, omitted

// example fields for Type1
public string Field1;

public void Accept(IConfigVisitor visitor)
{
visitor.Visit(this);
}
}

public class Type2 : ConfigurationElement, IConfigVisitorAcceptor
{
// config stuff here, omitted

// example fields for Type2
public string Field1;
public string Field2;

public void Accept(IConfigVisitor visitor)
{
visitor.Visit(this);
}
}

Little modification needs to be done to the parser to act as a Visitor. The parser is simply a visitor that collects state as it travels to each configuration element. This example is a bit trivial:


public class ConfigurationParserVisitor : IConfigVisitor
{
// example internal state for visitor
StringBuilder example = new StringBuilder();

public void Visit(MyConfig configSection)
{
// a custom iterator could be used here to simplify this
foreach(Type1 item in configSection.Type1Collection)
{
item.Accept(this);
}
foreach(Type2 item in configSection.Type2Collection)
{
item.Accept(this);
}
}

public void Visit(Type1 data)
{
example.AppendLine(data.Field1);
}
public void Visit(Type2 data)
{
example.AppendLine(data.Field1 + " " + data.Field2);
}

public string GetOutput()
{
return example.ToString();
}
}

public class Example
{
public static void Main()
{
MyConfig config = (MyConfig)ConfigurationManager.GetSection("myconfig");

ConfigurationParserVisitor parser = new ConfigurationParserVisitor();
config.Accept(parser);

Console.WriteLine(parser.GetOutput());
}
}

Here's usually where the argument gets heated: Why would anyone do this? Wouldn't you be better off writing a parser that accepts your configuration element as a parameter? It's a very valid question, and this does seem an obtuse direction to follow if you only need to read your configuration file. Where the Visitor pattern becomes useful, however, is that new functionality can be added to the configuration elements without having to modify the object model in any way. Perhaps you want to auto-upgrade your settings to a new version, produce a report, display your configuration in a UI, etc.
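To make that concrete, here is a minimal sketch of a second visitor that produces a small report without touching the element classes. To keep it runnable on its own, the element types below are simplified stand-ins that use plain List fields instead of deriving from ConfigurationSection and ConfigurationElement; ElementCountVisitor and ReportExample are hypothetical names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public interface IConfigVisitor
{
    void Visit(MyConfig configSection);
    void Visit(Type1 dataElement);
    void Visit(Type2 dataElement);
}

// Simplified stand-ins for the configuration classes (assumption:
// the real ones derive from the System.Configuration base types).
public class Type1
{
    public string Field1;
    public void Accept(IConfigVisitor visitor) { visitor.Visit(this); }
}

public class Type2
{
    public string Field1;
    public string Field2;
    public void Accept(IConfigVisitor visitor) { visitor.Visit(this); }
}

public class MyConfig
{
    public List<Type1> Type1Collection = new List<Type1>();
    public List<Type2> Type2Collection = new List<Type2>();
    public void Accept(IConfigVisitor visitor) { visitor.Visit(this); }
}

// New behaviour lives entirely in this class: it counts elements
// instead of building a string, and no element class had to change.
public class ElementCountVisitor : IConfigVisitor
{
    public int Type1Count;
    public int Type2Count;

    public void Visit(MyConfig configSection)
    {
        foreach (Type1 item in configSection.Type1Collection) item.Accept(this);
        foreach (Type2 item in configSection.Type2Collection) item.Accept(this);
    }

    public void Visit(Type1 dataElement) { Type1Count++; }
    public void Visit(Type2 dataElement) { Type2Count++; }
}

public static class ReportExample
{
    public static void Main()
    {
        MyConfig config = new MyConfig();
        config.Type1Collection.Add(new Type1());
        config.Type2Collection.Add(new Type2());

        ElementCountVisitor counter = new ElementCountVisitor();
        config.Accept(counter);

        Console.WriteLine("Type1: {0}, Type2: {1}", counter.Type1Count, counter.Type2Count);
    }
}
```

The trade-off is the usual one for this pattern: adding a visitor is cheap, but adding a new element type means touching the interface and every existing visitor.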

One of the subtle advantages of this pattern is that new functionality can be expressed in a single class rather than spread about the solution. This makes it a perfect fit for adding plugins to your application, or for building an application that is composed using a Command pattern.

While not all applications will require this level of flexibility, it can be a very useful pattern when you need it. The upfront cost is a one-time event, so it's a pretty easy refactoring exercise.


Tuesday, March 04, 2008

Redirect Standard Output of a Service to log4net

I recently wrote a simple Windows service that hosts batch files and other applications within a service process. I found some great stuff located here, which really helped me along.

Like many other developers, I quickly discovered that debugging and diagnosing issues wasn't particularly easy. On my machine, it was fairly simple to set a break point and manually attach to the service, but on other machines the Event Log didn't hold enough detail to diagnose problems. What I needed was a way to capture the output of my hosted application.

As I was already using log4net to trace through application flow, I used the following approach to redirect the output of my hosted application into my logger.


using System;
using System.Diagnostics;
using System.ServiceProcess;
using log4net;

public class MyService : ServiceBase
{
private static readonly ILog log = LogManager.GetLogger(typeof(MyService));

private Process process;

protected override void OnStart(string[] args)
{
process = new Process();

ProcessStartInfo info = new ProcessStartInfo();

// configure the command-line app.
info.FileName = "java.exe"; // must be resolvable via the PATH
info.WorkingDirectory = @"c:\program files\Selenium\RemoteControlServer";
info.Arguments = "-jar selenium-server.jar";

// configure runtime specifics
info.UseShellExecute = false; // needed to redirect output
info.RedirectStandardOutput = true;
info.RedirectStandardError = true;

process.StartInfo = info;

// setup event handlers
process.EnableRaisingEvents = true;
process.ErrorDataReceived += new DataReceivedEventHandler(process_ErrorDataReceived);
process.OutputDataReceived += new DataReceivedEventHandler(process_OutputDataReceived);


process.Start();

// notify process about asynchronous reads
process.BeginErrorReadLine();
process.BeginOutputReadLine();

}

// fires whenever errors output is produced
private static void process_ErrorDataReceived(object sender, DataReceivedEventArgs e)
{
try
{
if (!String.IsNullOrEmpty(e.Data))
{
log.Warn(e.Data);
}
}
catch(Exception ex)
{
log.Error("Error occurred while trying to log console errors.", ex);
}
}

// fires whenever standard output is produced
private static void process_OutputDataReceived(object sender, DataReceivedEventArgs e)
{
try
{
if (!String.IsNullOrEmpty(e.Data))
{
log.Debug(e.Data);
}
}
catch(Exception ex)
{
log.Error("Error occurred while trying to log console output.", ex);
}
}
}
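The service above only covers start-up. As a hedged sketch of the other half, the helper below shows one way an OnStop override might shut the hosted process down. HostedProcessCleanup and its Stop method are hypothetical names, and killing the process is an assumption about what is acceptable for the hosted application; a process that needs a graceful shutdown would want something gentler.

```csharp
using System;
using System.Diagnostics;

public static class HostedProcessCleanup
{
    // Hypothetical helper: stops a process that was started with Process.Start().
    // Returns true if the process is known to have exited when we return.
    public static bool Stop(Process process, int timeoutMilliseconds)
    {
        if (process == null)
        {
            return true; // nothing was ever started
        }

        try
        {
            if (!process.HasExited)
            {
                process.Kill();                           // hard stop, no graceful shutdown
                process.WaitForExit(timeoutMilliseconds); // give the OS time to finish
            }
            return process.HasExited;
        }
        catch (InvalidOperationException)
        {
            // No process is associated with this object, or it has already exited.
            return true;
        }
        finally
        {
            process.Close(); // release the handle and the redirected streams
        }
    }
}
```

From the service, an OnStop override could then call HostedProcessCleanup.Stop(process, 5000) and log the result through the same logger.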


Friday, December 21, 2007

Debugging WizardExtensions for Visual Studio Templates

As per my previous post, this exercise would probably be much easier if I used the Guidance Automation Toolkit, but in the spirit of Twelve Days of Code, I promised to boldly venture into areas I normally don't go. I decided that I wanted to try out a WizardExtension so that I could compare the experience with the Guidance Automation Toolkit. So I created a new project and added the following references:

  • EnvDTE
  • EnvDTE80
  • Microsoft.VisualStudio.TemplateWizardInterface
  • System.Windows.Forms

The Visual Studio Template documentation says you need to sign your assembly and install it into the GAC, but that's crazy. Rather than jump through hoops, I found a handy forum post explaining that Visual Studio follows the standard assembly probing sequence, so the assembly only needs to be somewhere the devenv.exe process can find it. Signing and installing into the GAC is simply a security measure. I didn't want to dump my custom assembly into Visual Studio's assemblies (where I would forget about it), so I created a custom folder under %ProgramFiles%\Microsoft Visual Studio 8\Common7\IDE and added it to the privatePath attribute of the probing element in devenv.exe.config. To enable debugging for my custom wizard extension, I use two Visual Studio instances: one for the wizard extension, the other for testing the template. Here are the steps involved:

  • Add the assembly and class name to your ProjectGroup.vstemplate file:
<WizardExtension>
 <Assembly>Experiments.TemplateWizard</Assembly>
 <FullClassName>Experiments.TemplateWizard.CustomizeProjectNameWizard</FullClassName>
</WizardExtension>
  • Zip up the updated template and copy it into the appropriate Visual Studio Templates folder.
  • Compile the wizard-extension assembly and copy it and its pdb to a path where Visual Studio can find it
  • Launch a new instance of Visual Studio
  • Switch back to the other Visual Studio instance, attach to the "devenv" process (the one that says it's at the start page) and set your break-points
  • Switch back to the new instance of Visual Studio and start the template that contains your wizard extension
  • debugging goodness!!

Well, at least I saved myself the effort of signing, etc. This exercise showed that very little is actually done at the ProjectGroup level of a Multi-Project template: RunStarted is called, followed by ProjectFinishedGenerating. The biggest disappointment is that the project parameter passed to ProjectFinishedGenerating is null. This is probably because the item being created is a Solution, not a project.

The last ditch (seriously, ditch!) is to cast the automationObject passed into RunStarted to _DTE, and then work through COM interop to manage the Solution. That sounds romantic.
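For reference, a wizard extension along those lines might be shaped like the sketch below. It assumes the references listed earlier (EnvDTE and Microsoft.VisualStudio.TemplateWizardInterface) and reuses the class name from the vstemplate snippet. What happens inside RunFinished is purely illustrative: it only lists the generated projects and does not attempt any renaming.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using EnvDTE;
using Microsoft.VisualStudio.TemplateWizard;

namespace Experiments.TemplateWizard
{
    public class CustomizeProjectNameWizard : IWizard
    {
        // Held so later callbacks can reach the solution through automation.
        private _DTE dte;

        public void RunStarted(object automationObject,
            Dictionary<string, string> replacementsDictionary,
            WizardRunKind runKind, object[] customParams)
        {
            // The cast described above; 'as' leaves dte null if it fails.
            dte = automationObject as _DTE;
        }

        public void ProjectFinishedGenerating(Project project)
        {
            // For a multi-project template this arrives as null (see above),
            // so there is nothing useful to do with it here.
        }

        public void RunFinished()
        {
            if (dte == null || dte.Solution == null)
            {
                return;
            }

            // Illustrative only: list the projects the template generated.
            foreach (Project project in dte.Solution.Projects)
            {
                Debug.WriteLine("Generated project: " + project.Name);
            }
        }

        // The remaining IWizard members are not needed for this experiment.
        public void ProjectItemFinishedGenerating(ProjectItem projectItem) { }
        public void BeforeOpeningFile(ProjectItem projectItem) { }
        public bool ShouldAddProjectItem(string filePath) { return true; }
    }
}
```

A breakpoint inside RunStarted is a convenient place to confirm that the two-instance debugging setup is wired up correctly.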


Thursday, December 20, 2007

Bundling Visual Studio templates for distribution

Microsoft's done a fairly good job of packaging Visual Studio templates for distribution. Simply:

  1. Create an xml file that adheres to the Visual Studio Content Installer Schema Reference
  2. Rename the xml with a "vscontent" extension
  3. Place the vscontent file and your template zip into another zip file
  4. Rename that zip file with a "VSI" extension.

Now, when you double click the VSI file it runs a wizard that installs your template into the appropriate Visual Studio Template folder.


Wednesday, December 19, 2007

Visual Studio 2005 Multi-Project Templates - a waste of time?

As part of my twelve-days-of-code, I'm tackling a set of small projects geared towards simple project automation. I've discovered in recent projects that although the road is always paved with good intentions, other tasks, emergencies and distractions always prevent you from accomplishing what seem to be the most minor tasks. So when starting out on a new task, we always cut corners with the intention of returning to these trivial tasks whenever we find the time, or when they become justified in our client's eyes. However, if we started out with these things done for us, no one would question their existence or worry about a catch-up penalty; we would just accept these things as best-practice.

Visual Studio Project templates are interesting, though my first encounters with them suggest they miss the mark. For my projects, I find the effort isn't about creating the project, it's about creating the overall solution: project libraries, web sites, test harnesses, references to third-party libraries and tools, build-scripts, etc. Visual Studio supports the concept of "Multi-Project Templates", which are limited (see below), but I suspect that the Guidance Automation Extensions might fill in the gaps.

Visual Studio supports two types of templates within the IDE, and a third type which must be stitched together using XML. The first type, "Item Templates", covers single files which can be included in any project. I'm focusing more on Project templates and Multi-Project templates.

Within Visual Studio, creating a project template is extremely easy: you simply create the project the way you like and then choose the "Export Template..." option from the File menu. The project and its contents are published as a ZIP file in "My Documents\Visual Studio 2005\My Exported Templates". A big plus of the template structure is that all the files support parameterization, which means you can decorate the exported files with keywords that are dynamically replaced when the user creates a project from the template. The export wizard takes care of most of the keyword substitution for you, such that root namespaces in all files will match the name of the user's solution. With this in mind, a Project Template is "sanitized" and waiting for your client to adopt your structure with their name.
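To illustrate what parameterization amounts to, here is a small sketch that mimics the effect of keyword replacement on one exported file. It is not how Visual Studio implements it; TemplateParameterDemo and Expand are hypothetical names, and the sample file text and the "AcmeBilling" project name are made up. $safeprojectname$ is one of the standard template parameters.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateParameterDemo
{
    // Replaces every $keyword$ present in the dictionary with its value.
    public static string Expand(string templateText, IDictionary<string, string> replacements)
    {
        string result = templateText;
        foreach (KeyValuePair<string, string> pair in replacements)
        {
            result = result.Replace(pair.Key, pair.Value);
        }
        return result;
    }

    public static void Main()
    {
        // What an exported source file looks like inside the template zip.
        string templateFile =
            "namespace $safeprojectname$\n" +
            "{\n" +
            "    public class Class1 { }\n" +
            "}";

        Dictionary<string, string> replacements = new Dictionary<string, string>();
        replacements["$safeprojectname$"] = "AcmeBilling";

        Console.WriteLine(Expand(templateFile, replacements));
    }
}
```

A wizard extension receives a dictionary of this shape in RunStarted, which is where custom keywords can be added.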

Multi-Project Templates stitch multiple Project Templates together by using a master xml-based template file. These templates can't be created using the IDE, but you can create a solution, export each project as a Project Template, and then, following this handy MSDN article and the Template Schema reference, quickly parcel together a master template.

However, there are a few really nasty limitations with Multi-Project Templates. The biggest issue is that the project name cannot be parameterized, so each project adopts the name defined in your master template file. As a result, the only thing you can really customize is the name of the solution. I was completely baffled by this: I thought I must be doing something wrong. However, after a few minutes of googling, I found that others had come to the exact same conclusion.

Fortunately, the template system supports a Wizard framework, which would allow you to write some code to dynamically modify the solution. Unfortunately, the code for this would have to be strong-named and installed in the GAC. I'm tempted to wade into this, but I fear that I might be better off looking at the Guidance Automation Toolkit.
