How do Extension Methods work and why was a new CLR not required?

Someone said recently that they thought extensions methods required a new CLR.

Extension methods are a new feature in .NET 3.5 (C#3/VB9) that let you appear to "spot weld" new methods on to existing classes. If you think that the "string" object needs a new method, you can just add it and call it on instance variables.

Here's an example. Note that the IntHelper35 class below defines a new method for integers called DoubleThenAdd. Now, I can do things like 2.DoubleThenAdd(2). See how the method directly "hangs off" of the integer 2? It's the "this" keyword appearing before the first parameter that makes this magic work. But is it really magic? Did it require a change to the CLR, or just a really smart compiler?

Let's do some experimenting and see if we can figure it out for ourselves.

using System;
namespace Foo
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(2.DoubleThenAdd(3));
Console.WriteLine(IntHelper20.DoubleThenAdd(2, 3));
Console.ReadLine();
}
}

public static class IntHelper20
{
public static int DoubleThenAdd(int myInt, int x)
{
return myInt + (2 * x);
}
}

public static class IntHelper35
{
public static int DoubleThenAdd(this int myInt, int x)
{
return myInt + (2 * x);
}
}
}

I've also added an IntHelper20 class with an identical method but WITHOUT the "this" keyboard. It's a standard static method, and I call it in the standard way. Now, let's compile it, then disassemble it with Reflector and take a look at the IL (Intermediate Language).

.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
.maxstack 8
L_0000: nop
L_0001: ldc.i4.2
L_0002: ldc.i4.3
L_0003: call int32 ConsoleApplication8.IntHelper35::DoubleThenAdd(int32, int32)
L_0008: call void [mscorlib]System.Console::WriteLine(int32)
L_000d: nop
L_000e: ldc.i4.2
L_000f: ldc.i4.3
L_0010: call int32 ConsoleApplication8.IntHelper20::DoubleThenAdd(int32, int32)
L_0015: call void [mscorlib]System.Console::WriteLine(int32)
L_001a: nop
L_001b: call string [mscorlib]System.Console::ReadLine()
L_0020: pop
L_0021: ret
}

Interestingly, both method calls look the same. They look like static method calls with two integer parameters. From looking at this part of the IL, you can't actually tell which one is an extension method. We know the first one, IntHelper35, is, but from this snippet of IL, we can't tell.

Can Reflector tell the difference if we ask it to decompile to C# or VB (rather than IL)?

private static void Main(string[] args)
{
Console.WriteLine(2.DoubleThenAdd(3));
Console.WriteLine(IntHelper20.DoubleThenAdd(2, 3));
Console.ReadLine();
}

Interestingly, it knows the difference. How? Here's the decompilation of the IntHelper35 class itself:

.method public hidebysig static int32 DoubleThenAdd(int32 myInt, int32 x) cil managed
{
.custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor()
.maxstack 3
.locals init (
[0] int32 CS$1$0000)
L_0000: nop
L_0001: ldarg.0
L_0002: ldc.i4.2
L_0003: ldarg.1
L_0004: mul
L_0005: add
L_0006: stloc.0
L_0007: br.s L_0009
L_0009: ldloc.0
L_000a: ret
}

The only difference between the two methods is the CompilerServices.ExtensionAttribute. This attribute is added because of the "this" keyword, and it looks like it's what Reflector is using to correctly identify the extension method.

Extension methods are a really nice syntactic sugar. They're not really added to the class, as we can see, but the compiler makes it feel like they are.

Slightly Related Aside about Object Oriented "C"

This reminded me of what we called "object oriented C" in college. I found a great example on Phil Bolthole's site.

Basically you make a struct to represent your member variables, and then you create a number of methods where the first parameter is the struct. For example:

#include "FooOBJ.h" 
void diddle(){
FooOBJ fobj;

fobj=newFooOBJ(); /* create a new object of type "FooOBJ" */

/* Perform member functions on FooOBJ.
* If you try these functions on a different type of object,
* you will get a compile-time error
*/
setFooNumber(fobj, 1);
setFooString(fobj, "somestring");
dumpFooState(fobj);

deleteFooOBJ(fobj);
}

int main(){
diddle();
return 0;
}

In this C example, if you mentally move the first parameter to the left side and add a ".", like fobj.dumpFooState() it's almost like C++. Then, you ask yourself, "gosh, wouldn't it be nice if a compiler did this for me?"

Technorati Tags: ,,

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook bluesky subscribe
About   Newsletter

Hosting By
Hosted on Linux using .NET in an Azure App Service