Thursday, July 23, 2009

Using RuntimeTypeHandles to improve Memory of .NET Apps

It's been a few months since I wrote my article Creating a better wrapper using AOP, and I haven't thought much about it since, but a recent comment left by Krzysztof has motivated me to re-open the project. Krzysztof has an interesting optimization that I want to try, and perhaps I will delve into that in another post, but today I want to follow up on my intention to improve both performance and memory management, as mentioned in the caveats of my article. In truth, most of this article is inspired by Jeffrey Richter's CLR via C# and Vance Morrison's blog, and I've been itching to test out the concept of using runtime type handles for a while.

Some Background

Most developers are led to believe that Reflection is a performance hit, but few understand why. Here's my attempt to explain it in a nutshell. I’m neither a Microsoft employee nor an expert in this matter; I’m simply paraphrasing, and I encourage you to pick up Jeff’s book. If I’m paraphrasing incorrectly, please let me know.

When your code is executed for the first time, the CLR finds all references to Types in the MSIL in order to JIT compile them.  As part of this process, Assemblies are loaded and Type information is read from the meta-data and put into the AppDomain’s heap using internal structures – mainly pointers and method tables.  The first time a Type’s method is called, it is JIT compiled into native instructions and a pointer refers to these compiled bits.  Nice, light and fast.

Reflection, however, is different. When we call an object’s built-in GetType method and then ask the resulting Type for its members, we force the runtime to read the meta-data and potentially construct dozens of managed objects (MethodInfo, FieldInfo, PropertyInfo, etc.), which collectively can be fairly large data structures. Because most reflection methods identify members by string, each lookup is a string-based scan of the meta-data, often a case-insensitive one. When we invoke those methods, our parameters have to be packaged into an array of objects, boxing any value types along the way. Compared to compiled native code that simply follows a pointer, you can see why this is slower (roughly 200x slower).
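To get a feel for the difference on your own machine, here is a minimal timing sketch. It assumes a trivial Foo class with a Do method (a stand-in invented for illustration, not code from the wrapper project) and compares a direct call against MethodInfo.Invoke using Stopwatch. The numbers will vary with runtime version and hardware, so treat the ratio as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical type used only for this comparison.
public class Foo
{
    public int Do(int x) { return x + 1; }
}

public static class ReflectionCost
{
    // Written to so the loops have an observable result.
    public static long Sink;

    // Calls Foo.Do directly; returns elapsed Stopwatch ticks.
    public static long TimeDirect(Foo foo, int iterations)
    {
        long sum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sum += foo.Do(i);
        }
        sw.Stop();
        Sink = sum;
        return sw.ElapsedTicks;
    }

    // Calls Foo.Do through MethodInfo.Invoke; every call allocates an
    // object[] and boxes the int argument and the int return value.
    public static long TimeReflected(Foo foo, int iterations)
    {
        MethodInfo method = typeof(Foo).GetMethod("Do");
        long sum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sum += (int)method.Invoke(foo, new object[] { i });
        }
        sw.Stop();
        Sink = sum;
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        Foo foo = new Foo();
        const int iterations = 1000000;

        // Warm up both paths so JIT compilation is not part of the measurement.
        TimeDirect(foo, 10);
        TimeReflected(foo, 10);

        Console.WriteLine("Direct:    {0} ticks", TimeDirect(foo, iterations));
        Console.WriteLine("Reflected: {0} ticks", TimeReflected(foo, iterations));
    }
}
```

Note that the MethodInfo is looked up once, outside the loop. Calling GetMethod on every iteration would add the string-based metadata search to each call and widen the gap further.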

Clearly, the .NET runtime doesn’t need these large data structures to execute our code. Instead, we can leverage the same pointers that the CLR uses through their managed equivalents: RuntimeTypeHandle, RuntimeMethodHandle and RuntimeFieldHandle. At their most basic level, these are thin wrappers around the pointers (IntPtr values) to the CLR’s internal structures for our types, methods and fields, without all the additional reflection overhead. Since an IntPtr is basically just a numerical value (and a value type at that), a handle uses considerably less memory than the corresponding reflection object.

An interesting point: an example from Vance’s blog demonstrates that typeof(Foo) is considerably slower than typeof(Foo).TypeHandle because the JIT recognizes you need the pointer and not the Type.

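Even more surprising, anObj.GetType() == typeof(string) is faster than (anObj is string), thanks to the same kind of JIT optimization. Amazing. Keep in mind that the two checks are only interchangeable for sealed types such as string: GetType() compares the exact runtime type and throws on null, whereas "is" also matches derived types and simply returns false for null.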
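Before swapping one check for the other, it helps to see where they differ. The sketch below is illustrative only (the helper names are made up): it compares an exact-type test with the "is" operator on a sealed type (string) and on a type with subclasses (Exception).

```csharp
using System;

public static class TypeChecks
{
    // Exact-type test: true only when the runtime type is exactly System.String.
    // The null check is needed because GetType() throws on a null reference.
    public static bool IsExactlyString(object obj)
    {
        return obj != null && obj.GetType() == typeof(string);
    }

    // Exact-type test against a type that can be subclassed.
    public static bool IsExactlyException(object obj)
    {
        return obj != null && obj.GetType() == typeof(Exception);
    }

    // Assignability test: true for Exception and anything derived from it.
    public static bool IsException(object obj)
    {
        return obj is Exception;
    }

    public static void Main()
    {
        object text = "hello";
        object derived = new ArgumentException("oops");

        Console.WriteLine(IsExactlyString(text));        // True
        Console.WriteLine(text is string);               // True
        Console.WriteLine(IsException(derived));         // True
        Console.WriteLine(IsExactlyException(derived));  // False
    }
}
```

For string the two forms agree because the type is sealed; for Exception the exact-type comparison rejects the ArgumentException that "is" accepts.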

Using Handles

If your application holds onto Type data for extended periods, it may be worth your while to use handles. Here are a few examples of using them.

Get the handle for a Type
[Test]
public void CanGetHandleFromType()
{
    RuntimeTypeHandle handle = typeof(string).TypeHandle;
    Type type = Type.GetTypeFromHandle(handle);

    Assert.AreEqual("A String".GetType(), type);
}

Get the handle for a Method
[Test]
public void CanGetHandleFromMethod()
{
    MethodInfo m1 = typeof(Foo).GetMethod("Do");
    RuntimeMethodHandle handle = m1.MethodHandle;

    MethodInfo m2 = MethodBase.GetMethodFromHandle(handle) as MethodInfo;

    Assert.IsNotNull(m2);
    Assert.AreEqual(m1, m2);
}
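
Get the handle for a Field

For completeness, fields follow the same pattern through FieldInfo.FieldHandle and FieldInfo.GetFieldFromHandle. This is a small console sketch rather than a unit test, and the Foo class with its Count field is assumed purely for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical type with a public field, used only for this example.
public class Foo
{
    public int Count;
}

public static class FieldHandleExample
{
    // Converts a FieldInfo to its lightweight handle and back again.
    public static FieldInfo RoundTrip(FieldInfo field)
    {
        RuntimeFieldHandle handle = field.FieldHandle;
        return FieldInfo.GetFieldFromHandle(handle);
    }

    public static void Main()
    {
        FieldInfo f1 = typeof(Foo).GetField("Count");
        FieldInfo f2 = RoundTrip(f1);

        Console.WriteLine(f2.Name);      // Count
        Console.WriteLine(f2.FieldType); // System.Int32
    }
}
```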

Applying Handles

Getting back to my AOP example, I can now create a really simple mapping table for my AOP wrapper that’s lean(er) on memory and will cut down on needless calls to the Reflection API, thus speeding things up considerably.  Note that I’m doing all the Reflection work upfront and then storing only pointers.

Update: After some initial profiling, I wasn’t surprised to discover that loading the MethodInfo by the handle is as expensive as the original GetMethod call. As such, I’ve opted to store my mapping table as Dictionary<RuntimeMethodHandle,MethodInfo> instead of Dictionary<RuntimeMethodHandle,RuntimeMethodHandle>.

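// Namespaces required by this class. MappedFieldAttribute is defined elsewhere (not shown here).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class InvocationMapping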
{
    public InvocationMapping(object instance)
    {
        _table = new Dictionary<RuntimeMethodHandle, MethodInfo>();
        _instance = instance;
    }

    public static InvocationMapping CreateMapping<TWrapper>(object target)
    {
        InvocationMapping mapping = new InvocationMapping(target);

        Type wrapperType = typeof(TWrapper);
        var wrapperProperties = wrapperType.GetProperties();

        Type targetType = target.GetType();
        foreach (var property in wrapperProperties)
        {
            MappedFieldAttribute attribute =
                                    property.GetCustomAttributes(typeof(MappedFieldAttribute),true)
                                    .OfType<MappedFieldAttribute>()
                                    .SingleOrDefault(mf => mf.TargetType == targetType);

            if (attribute != null)
            {
                PropertyInfo targetProperty = targetType.GetProperty(attribute.FieldName);

                mapping.Add(property, targetProperty);
            }
        }

        return mapping;
    }

    public Dictionary<RuntimeMethodHandle, MethodInfo> MappingTable
    {
        get { return _table; }
    }

    public void Add(PropertyInfo property, PropertyInfo targetProperty)
    {
        if (property.CanRead && targetProperty.CanRead)
        {
            _table.Add(property.GetGetMethod().MethodHandle, targetProperty.GetGetMethod());
        }
        if (property.CanWrite && targetProperty.CanWrite)
        {
            _table.Add(property.GetSetMethod().MethodHandle, targetProperty.GetSetMethod());
        }
    }

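    // Returns true when the method identified by the handle has a mapped target.
    public bool Supports(RuntimeMethodHandle handle)
    {
        return _table.ContainsKey(handle);
    }

    // Looks up the target method for the handle and invokes it on the wrapped instance.
    public object Invoke(RuntimeMethodHandle handle, params object[] args)
    {
        MethodInfo method = _table[handle];
        return method.Invoke(_instance, args);
    }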

    private Dictionary<RuntimeMethodHandle, MethodInfo> _table;
    private object _instance;
}
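
To show the lookup in isolation, here is a stripped-down sketch of the same idea: a dictionary keyed by RuntimeMethodHandle that resolves a wrapper's property accessor to the matching accessor on the target. The Customer and CustomerWrapper types are hypothetical stand-ins, and the mapping is hard-coded instead of being driven by MappedFieldAttribute.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical target type; not part of the original project.
public class Customer
{
    public string Title { get; set; }
}

// Hypothetical wrapper type whose accessors would be intercepted.
public class CustomerWrapper
{
    public string Name { get; set; }
}

public static class HandleTableSketch
{
    // Does all the reflection work once and keeps only handles as keys.
    public static Dictionary<RuntimeMethodHandle, MethodInfo> BuildTable()
    {
        var table = new Dictionary<RuntimeMethodHandle, MethodInfo>();

        PropertyInfo wrapperProperty = typeof(CustomerWrapper).GetProperty("Name");
        PropertyInfo targetProperty = typeof(Customer).GetProperty("Title");

        table.Add(wrapperProperty.GetGetMethod().MethodHandle, targetProperty.GetGetMethod());
        table.Add(wrapperProperty.GetSetMethod().MethodHandle, targetProperty.GetSetMethod());
        return table;
    }

    public static void Main()
    {
        var table = BuildTable();
        var target = new Customer { Title = "Ada" };

        // An interceptor would be handed the intercepted method; here we fetch it by hand.
        RuntimeMethodHandle getter =
            typeof(CustomerWrapper).GetProperty("Name").GetGetMethod().MethodHandle;

        object value = table[getter].Invoke(target, null);
        Console.WriteLine(value); // Ada
    }
}
```

RuntimeMethodHandle is a struct that overrides Equals and GetHashCode, which is what makes it usable as a dictionary key.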

My next post will look at combining this approach with an IProxyBuilderHook and/or ISelectorType.


Tuesday, July 14, 2009

64-bit Apps / 32-bit Assemblies

Our team was working late the other night and got tripped up on how "Any CPU" works with 64-bit operating systems. I had developed a simple command-line utility that used some third-party assemblies that were installed in the GAC. Everything worked great on my local machine until I deployed it into the build environment, where the app simply blew up and reported an awkward "FileNotFoundException".

At a quick glance, it seemed obvious: the 3rd party assemblies weren't in the GAC. My head snapped back a bit when C:\Windows\Assembly showed that the exact versions I needed were in fact installed. I double-checked my project references and painfully trudged through security policies, FusionLog files and web.config settings, and with each attempt I became increasingly aggravated and annoyed. When I copied the 3rd party assemblies out of the GAC and into my utility's folder and got a BadImageFormatException, I realized that the only difference between my machine and the build environment was the 64-bit operating system.

On a hunch, I recompiled my utility to target "x86" and suddenly everything worked. The team was baffled -- why didn't "Any CPU" work?

What does “Any CPU” mean?

The .NET Framework inspects the metadata of your dll or exe to determine how to load it. By default, "Any CPU" creates a process that is native to the OS. In other words, a 32-bit process on a 32-bit OS and a 64-bit process on a 64-bit OS. When a 32-bit application (x86) runs on a 64-bit OS, it runs as a WoW64 emulated 32-bit process (Windows on Windows).
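
A quick way to confirm which kind of process you actually ended up in is to check the pointer size at runtime: IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit one. The class and method names below are just for illustration.

```csharp
using System;

public static class ProcessBitness
{
    // Returns 32 or 64 depending on the process the code is running in.
    public static int GetBits()
    {
        return IntPtr.Size * 8;
    }

    public static void Main()
    {
        Console.WriteLine("Running as a {0}-bit process", GetBits());
    }
}
```

Compiled as "Any CPU", this reports 64 on a 64-bit OS and 32 on a 32-bit OS; compiled as "x86", it reports 32 on both.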

The following table outlines how the Visual Studio Build Configuration is interpreted between 32bit and 64bit OS.

          | Any CPU         | x86                             | x64
----------|-----------------|---------------------------------|---------------
32 bit OS | Native (32 bit) | 32 bit process                  | N/A
64 bit OS | Native (64 bit) | WoW64 process (emulated 32 bit) | 64 bit process

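The real gotcha is that your referenced assemblies must be loadable into the kind of process (32-bit or 64-bit) that your application runs as. Any assembly that is compiled as "Any CPU" is considered "neutral" and can be loaded into either a 32-bit or 64-bit process, but an assembly built for x86 can only be loaded into a 32-bit process. That is evidently what bit us: the third-party assemblies were 32-bit only, so my "Any CPU" utility, running as a 64-bit process on the build machine, could not load them.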

How do I determine what platform my assembly requires?

There are a few ways to determine if your assembly is Any CPU, 32bit or 64bit:

Global Assembly Cache

If your assembly is in the GAC, you can find this information in the Processor Architecture column.


Corflags

The corflags.exe tool that ships with the .NET SDK is a very powerful and dangerous tool. Just supplying the name of your exe or assembly to corflags shows some vital PE meta information. More great information can be found here: http://blogs.msdn.com/joshwil/archive/2005/05/06/415191.aspx
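
The same PE information can also be read from managed code through Module.GetPEKind. The sketch below is a rough illustration rather than a replacement for corflags; it inspects an assembly that is already loaded (here, the one containing the example itself), and the class and method names are made up.

```csharp
using System;
using System.Reflection;

public static class AssemblyPlatform
{
    // Returns the raw PE kind flags and target machine for a loaded assembly.
    public static string Describe(Assembly assembly)
    {
        PortableExecutableKinds peKind;
        ImageFileMachine machine;
        assembly.ManifestModule.GetPEKind(out peKind, out machine);

        return peKind + " / " + machine;
    }

    public static void Main()
    {
        Assembly current = typeof(AssemblyPlatform).Assembly;
        Console.WriteLine(Describe(current));
    }
}
```

As a rough guide, ILOnly without Required32Bit or PE32Plus corresponds to "Any CPU", Required32Bit corresponds to "x86", and PE32Plus indicates a 64-bit-only image.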


Incidentally, instead of recompiling the application with the “x86” target, I could simply have hacked the header using corflags:

corflags myapp.exe /32bit+

However, hacking the header of the assembly using corflags will invalidate the digital signature. Although you can always re-sign the assembly using the strong-name (sn) utility, it’s easier to just recompile and deploy.

Cheers.
