Tuesday, September 22, 2009

New Job, New Challenges

I haven’t been blogging lately, but hopefully I will start getting the ball rolling once again. Let’s start with a personal post.

Yesterday I started on a new job at Intermedia, and I can’t stress enough how happy this makes me. I can’t wait to face all the new challenges that will arise through time, and my co-workers managed to make me feel very comfortable in only two days. I’m sure I’ll get to know really great people and very talented developers from which I will learn a lot. Also, this company has very interesting projects which most involve state of the art technology.

One of my main tasks will be research, which hopefully will have a positive impact on this blog. I hope you’ll enjoy what’s coming.

The board is set, let’s get it on!

Saturday, June 27, 2009

Template Method Pattern & IEnumerable

An Analysis Through C#'s Evolution

The template method is a very useful pattern commonly used within Data Access Layers. It's one of the simplest patterns within the gang of four, very similar to Command and Strategy and also belongs to the behavioral patterns. It makes good use of polymorphism in a simple class hierarchy.

Template Method's UML

Diagram copied from data & object factory.

A template method defines the skeleton of an algorithm delegating specific operations in it to subclasses. Whenever a process within an object is very similar between different siblings, then a Template Method is a candidate. I don't want to bore you with the details so I'll dig right into some code.

Consider the following abstract class for games:

protected abstract void InitializeGame();
protected abstract ArrayList GetPlayers();
protected abstract void MakeMove(Player player);
protected abstract bool GameEnded { get; }
protected abstract void DisplayResults();

public void Play()

    InitializeGame();
    while (!GameEnded)
    {
        foreach (Player player in GetPlayers())
        {
            MakeMove(player);
        }
    }
    DisplayResults();
}

And a simple Player class:

public class Player
{
    private readonly string name;

    public string Name
    {
        get { return this.name; }
    }

    public Player(string name)
    {
        this.name = name;
    }

    public override bool Equals(object obj)
    {
        Player player = obj as Player;
        return (player != null) && object.Equals(this.Name, player.Name);
    }

    public override int GetHashCode()
    {
        return this.Name == null ? 0 : this.Name.GetHashCode() * 53;
    }

}

The Play method is a Template Method, wherein InitializeGame, GetPlayers, MakeMove, GameEnded and DisplayResult are Primitive Operations.

Here's a basic implementation of a High Card game:

public class HighCard : GameBase
{

    private int rounds;
    private ArrayList players;
    private int currentRound;
    private Random random;
    private Card[] currentHand;

    public HighCard()
    {
        players = new ArrayList();
        random = new Random();
    }

    protected override void InitializeGame()
    {
        players.Clear();
        currentRound = 0;
        Hashtable names = new Hashtable();
        Console.WriteLine("-------------------------------------------------------------------------");
        Console.WriteLine(" Welcome to the Increbile High Card game. Please buckle your seat belts.");
        Console.WriteLine("-------------------------------------------------------------------------");
        Console.WriteLine();
        bool keepAsking = true;
        while (keepAsking)
        {
            if (players.Count < 2)
            {
                Console.Write("Enter player's name: ");
            }
            else
            {
                Console.Write("Enter player's name (leave blank for no more players): ");
            }
            string playerName = Console.ReadLine();
            if (playerName.Length > 0)
            {
                if (names.ContainsKey(playerName))
                {
                    Console.WriteLine("I'm sorry, but we already have a contestant named {0}.", playerName);
                }
                else
                {
                    Player player = new HighCardPlayer(playerName);
                    players.Add(player);
                    names.Add(playerName, null);
                }
            }
            else
            {
                if (players.Count >= 2)
                {
                    keepAsking = false;
                }
            }
        }
        currentHand = new Card[players.Count];
        Console.WriteLine();
        Console.Write("How many rounds would you like to play? ");
        while (!int.TryParse(Console.ReadLine(), out rounds))
        {
            Console.Write("I'm sorry, I don't understand what you mean by that. How many rounds? ");
        }
        Console.WriteLine();
    }

    protected override ArrayList GetPlayers()
    {
        return players;
    }

    protected override void MakeMove(Player player)
    {
        int playerIndex = players.IndexOf(player);

        Console.Write("{0} gets the... ", player.Name);
        Thread.Sleep(random.Next(500, 1000));

        Card card = GetNewCard();

        Console.WriteLine("{0} of {1}", card.Value, card.Suit);
        currentHand[playerIndex] = card;

        if (playerIndex == players.Count - 1)
        {
            EvaluateRound();
        }
    }

    protected override bool GameEnded
    {
        get 
        { 
            return rounds == currentRound; 
        }
    }

    protected override void DisplayResults()
    {
        Console.WriteLine("GAME OVER");
        Console.WriteLine();
        Console.WriteLine("Final Scores:");

        ArrayList winners = new ArrayList();
        int winnerScore = int.MinValue;
        for (int i = 0; i < players.Count; i++)
        {
            HighCardPlayer player = (HighCardPlayer)players[i];
            Console.WriteLine("* {0}: {1} points", player.Name, player.Score);
            if (player.Score > winnerScore)
            {
                winnerScore = player.Score;
                winners.Clear();
                winners.Add (player);
            }
            else if (player.Score == winnerScore)
            {
                winners.Add (player);
            }
        }
        Console.WriteLine();
        if (winners.Count != players.Count)
        {
            if (winners.Count == 1)
            {
                Console.Write("And the winner is: ");
            }
            else
            {
                Console.Write("And the winners are: ");
            }
            Console.ForegroundColor = ConsoleColor.White;
            Console.Write(((Player)winners[0]).Name.ToUpper());
            for (int i = 1; i < winners.Count; i++)
            {
                Console.Write(" - {0}", ((Player)winners[i]).Name.ToUpper());
            }
            Console.WriteLine();
            Console.ForegroundColor = ConsoleColor.Gray;
        }
        else
        {
            Console.WriteLine("There are NO winners! It's an all tie!");
        }
    }

    private Card GetNewCard()
    {
        Card card = null;
        do
        {
            card = new Card(random.Next(1, 14), random.Next(1, 5));
        } while (CardIsInHand(card));

        return card;
    }

    private bool CardIsInHand(Card card)
    {
        for (int i = 0; i < currentHand.Length; i++)
        {
            if (card.Equals(currentHand[i]))
            {
                return true;
            }
        }
        return false;
    }

    private void EvaluateRound()
    {
        currentRound++;
        HighCardPlayer player = GetRoundsWinner();
        Console.WriteLine();
        Console.WriteLine("Congratulations {0}! You won this round!", player.Name);
        Console.WriteLine();
        player.Score++;
        ClearHand();
    }

    private HighCardPlayer GetRoundsWinner()
    {
        int winnerIndex = 0;
        for (int i = 1; i < currentHand.Length; i++)
        {
            if (currentHand[i].CompareTo(currentHand[winnerIndex]) > 0)
            {
                winnerIndex = i;
            }
        }
        return (HighCardPlayer)players[winnerIndex];
    }

    private void ClearHand()
    {
        for (int i = 0; i < currentHand.Length; i++)
        {
            currentHand[i] = null;
        }
    }

}

public class Card: IComparable
{

    int valueIndex;
    int suitIndex;

    private static string[] suits = new String[] { "Clubs", "Diamonds", "Hearts", "Spades" };
    private string[] values = new string[] { "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A" };

    public Card(int value, int suit)
    {
        if ((value < 1) || (value > 13)) throw new ArgumentOutOfRangeException("value");
        if ((suit < 1) || (suit > 4)) throw new ArgumentOutOfRangeException("suit");

        this.suitIndex = suit - 1;
        if (value == 1)
        {
            this.valueIndex = 12;
        }
        else
        {
            this.valueIndex = value - 2;
        }
    }

    public string Value
    {
        get { return values[valueIndex]; }
    }

    public string Suit
    {
        get { return suits[suitIndex]; }
    }

    public override bool Equals(object obj)
    {
        Card card = obj as Card;
        return (card != null) && object.Equals(card.Value, this.Value) && object.Equals(card.Suit, this.Suit);
    }

    public override int GetHashCode()
    {
        int code = 13;
        code = (code * 23) + (this.Value != null ? this.Value.GetHashCode() : 0);
        code = (code * 23) + (this.Suit != null ? this.Suit.GetHashCode() : 0);

        return code;
    }

    public int CompareTo(object obj)
    {
        Card card = obj as Card;
        if (obj == null) throw new ArgumentNullException("obj");
        if (card == null) throw new ArgumentException("The object has to be a card.", "obj");

        if (this.valueIndex.Equals(card.valueIndex))
        {
            return this.suitIndex.CompareTo(card.suitIndex);
        }
        else
        {
            return this.valueIndex.CompareTo(card.valueIndex);
        }
    }
}

class HighCardPlayer: Player
{
    private int score;

    public int Score
    {
        get { return this.score; }
        set { this.score = value; }
    }

    public HighCardPlayer(string name): base(name)
    { }

}

A couple of things to notice:

  • To really get the benefits of the Template Method Pattern, the method itself would have to be complex, relying on simple Primitive Operations implemented on derived classes. This example does a very bad job illustrating this. A good example for this is the Insert method of a data access object that would rely on derived classes operations like GetTableName and GetFieldMappings.
  • This implementation is done to work with the first .NET Framework. This is on purpose. The reason for this lies right ahead.
  • This game's implementation is not a very good one. It has a few flaws, but I didn't want the code to be more complex than it needed to. A Deck class and refactoring Card would be recommended, plus as it is, there is a better chance of having repeated value cards in the same hand than it should. This is beyond the scope of this post, so I'll leave it at that. (Originally planned on implementing Rock, Paper & Scissors, but it seemed too cliché).

From here onwards, the focus will be on the GetPlayers primitive operation.

C# 1.0 - Where it all Started

The GetPlayers method returns an ArrayList. All we do with the returned list is enumerating through it so changing it to return an IEnumerable makes perfect sense. This adds an extra drop of flexibility to game implementers. They can use an array, an ArrayList, or even their own IEnumerable implementation. To be fair, this isn't an extremely compelling reason, and I admit that on my C# 1.0 days, I would use ArrayLists for pretty much everything. Not one of the smartest practices, I reckon, but did it anyways. Anyhow, this is the modified abstract method:

protected abstract IEnumerable GetPlayers();

This adds no extra complexity to our previous implementation, although it does break it. We need to change our return type. New game implementations will thank us.

C# 2.0 - If I Wanted to do all this Castings, I'd be in the Acting Industry

I will absolutely not engage in a discussion about all the benefits of strong typed collections introduced at this point. I'll just say the world is now a better place thanks to generics and its very welcome nasty breaking changes. Unless there were an awful lot of game implementations out there, and little more to come, changing the method to return IEnumerable<Player> is a must!

protected abstract IEnumerable<Player> GetPlayers();

Admittedly, this does add extra complexity to our HighCard class. I chose to make the players field a List<HighCardPlayer>. This does avoid a whole lot of extra casts that had to be done before along the entire class, but the GetPlayers method did become longer:

protected override IEnumerable<Player> GetPlayers()
{
    return players.ConvertAll<Player>(delegate(HighCardPlayer player)
    {
        return player;
    });
}

Still, let analyze what the advantages are:

For one, if our players list was of the Player type, no extra code would've been required. Also, as for now, on our implementations we can take advantage on the newly incorporated yield keyword. If you don't like anonymous methods, the GetPlayers method can also be written this way:

protected override IEnumerable<Player> GetPlayers()
{
    foreach (Player player in players)
    {
        yield return player;
    }
}

Yes, it is that simple. But it gets even better; consider that the game you are implementing has a constant amount of players. You could be creating a backgammon game with player1 and player2. This would look just like this:

protected override IEnumerable<Player> GetPlayers()
{
    yield return player1;
    yield return player2;
}

Although this is two lines of code, it is way more readable than what we would've done before:

protected override IEnumerable<Player> GetPlayers()
{
    return new Player[] { player1, player2 };
}

Even for a solitaire game, return a 1 item enumeration is simpler, and in the probably not applicable case of a no playered game:

protected override IEnumerable<Player> GetPlayers()
{
    yield break;
}

So the yield keyword will offer our implementers a whole new universe of possibilities.

C# 3.0 – Linqing the OOP World with the Functional World

As you've already guessed, now C# 3.0 came along, and even more flexibility is added to our GetPlayers method. Thanks to the functional programming fellows, extension methods, lambda expressions and linq is introduced. Now more complex games, with say eliminations between rounds are pretty simple to implement.

protected override IEnumerable<Player> GetPlayers()
{
    return players
        .Where(player => !player.IsEliminated)
        .OrderByDescending(player => player.Score)
        .Cast<Player>();
}

Or even in query language:

protected override IEnumerable<Player> GetPlayers()
{
    return from highCardplayer in players
           let player = (Player)highCardplayer
           where !highCardplayer.IsEliminated
           orderby highCardplayer.Score descending
           select player;
}

Even our version in HighCard gets slightly simplified due to extension methods:

protected override IEnumerable<Player> GetPlayers()
{
    return players.Cast<Player>();
}

The beauty in this is that we have introduced no breaking changes what so ever, in fact, our GameBase class hasn't even changed.

C# 4.0 – A Peek into the Future

With C# 4.0 we will be able to take advantage of the covariant aspect of IEnumerable<T>. Basically, an IEnumerable<HighCardPlayer> can be implicitly converted to IEnumerable<Player>. This takes us back to our method being as simple as it started out being:

protected override IEnumerable<Player> GetPlayers()
{
    return players;
}

Notice how players is List<HighCardPlayer> which in turn is an IEnumerable<HighCardPlayer>, and thanks to covariance, it is also an IEnumerable<Player>.

Our GameBaseclass didn't change and no breaking changes were introduced. Still, interesting enough, there is a chance that switching to C# 4.0 will introduce a breaking change. If we had something like this:

IEnumerable playerEnum = players;

if (playerEnum is IEnumerable<Player>)
{
    // Do something
}
else if (playerEnum is IEnumerable<HighCardPlayer>)
{
    // Do something else
}

Now, the second block will be unreachable since every IEnumerable<HighCardPlayer> is in fact also an IEnumerable<Player>.

The Reason behind this Post

I've been working on a code generation project. For example, the class generator makes full use of Template Methods that use operations like GetFields, GetMethods, GetProperties and even GetDecorators. These operations have changed during time taking advantage of everything noted above, and I found it very interesting. I thought I'd share.

Random thoughts: The template method pattern isn't really the reason of all the things that were shown here, the responsible is the IEnumerable<T> interface. I might as well have used an interface with IEnumerable<T> methods or properties, but took this opportunity to talk a little about this pattern. I usually find it extremely useful, and the simplicity of it makes it yet more desirable. Pretty much like everything else, the value is benefit/complexity. Having such low complexity makes it very valuable, and I love the KISS principle.

On a personal note, I wish my very good friend Gonzalo a very happy birthday. I hope you enjoy this post and good luck on your future endeavors!

Thursday, June 11, 2009

MSDN Forums Browser

Yesterday was an exciting day; David Morton's MSDN Forums Browser was released at Codeplex. It's on a very alpha state, but it seems clear to me that has the potential to become an excellent and very valuable tool.

Yet today it got even better, he made me a part of his team! It's a wonderful opportunity to work side by side with some extremely talented developers. I'm thrilled about it, and I will embrace this opportunity to learn as much as I can, while hopefully becoming a valuable asset for the team.

Needless to say I encourage you to download it, try it out and give as much feedback as you can.

MSDN Forums Browser's Screenshot

Link to project: http://forumsbrowser.codeplex.com/

Saturday, May 30, 2009

Who's on First

This is a follow-up to my previous post: Swap Ints, Two Variables, No Waiting.

While analyzing my method's CIL, I thought that maybe I could implement a faster swap. Maybe there is a way to implement a method that doesn't need a temp variable but without all the overhead we've seen with xor. Well, the following came up:

.method public hidebysig static void  SwapIntsCil(int32& first,
                                                  int32& second) cil managed
{
  // Code size       9 (0x9)
  .maxstack  4
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldind.i4
  IL_0003:  ldarg.1
  IL_0004:  ldarg.0
  IL_0005:  ldind.i4
  IL_0006:  stind.i4
  IL_0007:  stind.i4
  IL_0008:  ret
} // end of method IntSwapper::SwapIntsCil

Let's see what it does:

  • IL_0000 Pushes first's address into the stack.
  • IL_0001 Pushes second's address into the stack.
  • IL_0002 Pops the address to fetch and push the actual Int32 value.
  • IL_0003 Pushes second's address into the stack.
  • IL_0004 Pushes first's address into the stack.
  • IL_0005 Pops the address to fetch and push the actual Int32 value.
  • IL_0006 Pops first's value and second's address to store the value at the address.
  • IL_0007 Pops second's value and first's address to store the value at the address.
  • IL_0008 Gets the hell outta here.

Click to display diagram.

To sum up, this method would use 4 levels of stack, declare no local variables, load 4 addresses, load 2 values through these addresses, store 2 values to these addresses and perform no operations.

In order to implement this, I moved previous swaps into a separate Assembly called KeepItSharp (yes, I'm that original) and to a class called IntSwapper. Built it in release mode, opened it with ildasm and dumped the CIL code. Added my new method and compiled it with ilasm. Time to test it!

static void Main(string[] args)
{
    int a = 15;
    int b = 320;
    Console.WriteLine("a: {0}\tb: {1}", a, b);

    KeepItSharp.IntSwapper.SwapIntsCil(ref a, ref b);

    Console.WriteLine("a: {0}\tb: {1}", a, b);
}

And the result is... <drumroll>

a: 15   b: 320
a: 320  b: 15
Press any key to continue . . .

Cool… my first CIL code and it works! Well, can't wait to see if it actually is faster than the conventional swap. Here it goes:

Benchmarking code is omitted because it's analogue to the one on my previous post.

SwapIntsXor: 00.559 - SwapIntsCil: 00.190 - SwapInts: 00.200
SwapIntsXor: 00.543 - SwapIntsCil: 00.189 - SwapInts: 00.194
SwapIntsXor: 00.544 - SwapIntsCil: 00.191 - SwapInts: 00.194
SwapIntsXor: 00.545 - SwapIntsCil: 00.190 - SwapInts: 00.190
SwapIntsXor: 00.549 - SwapIntsCil: 00.193 - SwapInts: 00.192
Press any key to continue . . .

Darn it! It seems to be a close one. OK, let's call it a tie.

Now, it seemed like fun to see what Red Gate's .NET Reflector had to say about this method, and the result is pretty peculiar. Check it out:

public static void SwapIntsCil(ref int first, ref int second)
{
    second = first;
    first = second;
}

And this works? Actually I wish that reflector would have thrown an error instead of showing this mysterious piece of coding that should've never worked.

The whole time while writing these last two posts, I kept thinking about Abbott and Costello's "Who's on First". Who's on first, what's on second, and on third? I don't know, I'm not using a temp variable!

Random thoughts: Aside from being cool to code some CIL, what left me thinking most was Reflector's behavior. For one, I will not stop using Reflector since the benefits are huge and these scenarios very scarce. On another note, this means that using CIL code directly allows us to write code that can't be translated to C# with reflecting tools. This seems pretty useful for licensing code, for example, although it would be easier to just code it in C and p/invoke it.

You can download KeepItSharp.dll here.

Thursday, May 28, 2009

Swap Ints, Two Variables, No Waiting

Who hasn't had to swap two ints? Everyone who implemented a sort algorithm, for example, had to swap two values. What does this function look like? Probably something like this:

static void SwapInts(ref int first, ref int second)
{
    int aux = first;
    first = second;
    second = aux;
}

And it works too!

static void Main(string[] args)
{
    int a = 15;
    int b = 320;
    Console.WriteLine("a: {0}\tb: {1}", a, b);

    SwapInts(ref a, ref b);

    Console.WriteLine("a: {0}\tb: {1}", a, b);
}
a: 15   b: 320
a: 320  b: 15
Press any key to continue . . .

My parents are so proud!

Anyways, a co-worker challenged me to implement the SwapInts method without using an auxiliary variable. We all know that patience is a virtue, and I know that it is something I should work on. I got frustrated within minutes and demanded the solution to this. It turned out not so complex after all. Check it out:

static void SwapIntsXor(ref int first, ref int second)
{
    first ^= second;
    second ^= first;
    first ^= second;
}

This works because xoring first and second produces a number that when xored with first we get second and when xored with second we get first. Pretty slick! Let's take a deeper look. Remember the xor table?

XOR01
001
110

Now, a pretty cool characteristic about xor is that when you xor twice you get back where you started:

0011
0101
0110

And once again…

0110
0101
0011

Bingo!

Now lets breakdown our new method. Consider A to be the first value, B the second and C the xored value.

first ^= second; -> first is C / second is B
second ^= first; -> first is C / second is A
first ^= second; -> first is B / second is A

This got me thinking a little. Of course this has absolutely no use in real life code, but it seemed as in terms of performance we could have a little improvement. So, no time to waste, I benchmarked them:

static void Main(string[] args)
{
    int a = 15;
    int b = 320;

    // Warm up
    SwapIntsXor (ref a,ref b);
    SwapInts (ref a, ref b);

    Benchmark(100000000, 5);
}

static void Benchmark(int innerIterations, int outterIterations)
{
    int a = 15;
    int b = 320;
    Stopwatch sw = new Stopwatch();

    for (int i = 0; i < outterIterations; i++)
    {
        sw.Start();
        for (int j = 0; j < innerIterations; j++)
        {
            SwapInts(ref a, ref b);
        }
        sw.Stop();
        Console.Write("SwapInts: {0:ss.fff} - ", new DateTime(sw.Elapsed.Ticks));
        sw.Reset();

        sw.Start();
        for (int j = 0; j < innerIterations; j++)
        {
            SwapIntsXor(ref a, ref b);
        }
        sw.Stop();
        Console.WriteLine("SwapIntsXor: {0:ss.fff}", new DateTime(sw.Elapsed.Ticks));
        sw.Reset();
    }
}

Output:

SwapInts: 00.190 - SwapXor: 00.558
SwapInts: 00.185 - SwapXor: 00.548
SwapInts: 00.189 - SwapXor: 00.547
SwapInts: 00.187 - SwapXor: 00.554
SwapInts: 00.187 - SwapXor: 00.546
Press any key to continue . . .

The conventional swap is faster than the new one. I can't stress enough that this was just for experimental purposes and to satisfy my own curiosity. By any means, even if with xor was significantly faster, would I recommend it since the conventional one is well known, straight forward and much, much easier to understand.

Anyways, since I've never really tried to understand any CIL, this seemed like a sweet opportunity to take the first steps and understand what makes it slower. Let's analyze the SwapInts method:

.method private hidebysig static void  SwapInts(int32& first,
                                                int32& second) cil managed
{
  // Code size       11 (0xb)
  .maxstack  2
  .locals init ([0] int32 aux)
  IL_0000:  ldarg.0
  IL_0001:  ldind.i4
  IL_0002:  stloc.0
  IL_0003:  ldarg.0
  IL_0004:  ldarg.1
  IL_0005:  ldind.i4
  IL_0006:  stind.i4
  IL_0007:  ldarg.1
  IL_0008:  ldloc.0
  IL_0009:  stind.i4
  IL_000a:  ret
} // end of method Program::SwapInts
  • IL_0000 Pushes first's address into the stack.
  • IL_0001 Pops the address to fetch and push the actual Int32 value.
  • IL_0002 Pops first's value and stores it into local variable aux.
  • IL_0003 Pushes first's address into the stack.
  • IL_0004 Pushes second's address into the stack.
  • IL_0005 Pops the address to fetch and push the actual Int32 value.
  • IL_0006 Pops second's value and first's address to store the value at the address.
  • IL_0007 Pushes second's address into the stack.
  • IL_0008 Pushes the value at local variable aux.
  • IL_0009 Pops aux's value and second's address to store the value at the address.
  • IL_000a Gets the hell outta here.

Click to display diagram.

To sum up, this method uses only 2 levels of the stack, declares 1 local variable, loads 4 addresses, loads 2 values through these addresses, loads 1 value through a local variable, stores 2 values to addresses, stores 1 value to a local variable and performs no operations.

Note: The stack mentioned above is a conceptual stack called the Virtual Execution System. This is not "the stack" everyone talks about when referring to where the local and parameter value types are stored. That being said lets move on...

Now we should check out the SwapIntsXor method:

.method private hidebysig static void  SwapIntsXor(int32& first,
                                                   int32& second) cil managed
{
  // Code size       22 (0x16)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  dup
  IL_0002:  ldind.i4
  IL_0003:  ldarg.1
  IL_0004:  ldind.i4
  IL_0005:  xor
  IL_0006:  stind.i4
  IL_0007:  ldarg.1
  IL_0008:  dup
  IL_0009:  ldind.i4
  IL_000a:  ldarg.0
  IL_000b:  ldind.i4
  IL_000c:  xor
  IL_000d:  stind.i4
  IL_000e:  ldarg.0
  IL_000f:  dup
  IL_0010:  ldind.i4
  IL_0011:  ldarg.1
  IL_0012:  ldind.i4
  IL_0013:  xor
  IL_0014:  stind.i4
  IL_0015:  ret
} // end of method Program::SwapIntsXor

It's seems heavier already! Let's start:

  • IL_0000 Pushes first's address into the stack.
  • IL_0001 Duplicates the top value of the stack.
  • IL_0002 Pops the address to fetch and push the actual Int32 value.
  • IL_0003 Pushes second's address into the stack.
  • IL_0004 Pops the address to fetch and push the actual Int32 value.
  • IL_0005 Pops second's value and first's value to perform a xor and push the result.
  • IL_0006 Pops the xor's result and first's address to store the value at the address.
  • IL_0007 Pushes second's address into the stack.
  • IL_0008 Duplicates the top value of the stack.
  • IL_0009 Pops the address to fetch and push the actual Int32 value.
  • IL_000a Pushes first's address into the stack.
  • IL_000b Pops the address to fetch and push the actual Int32 value.
  • IL_000c Pops first's value and second's value to perform a xor and push the result.
  • IL_000d Pops the xor's result and second's address to store the value at the address.
  • IL_000e Pushes first's address into the stack.
  • IL_000f Duplicates the top value of the stack.
  • IL_0010 Pops the address to fetch and push the actual Int32 value.
  • IL_0011 Pushes second's address into the stack.
  • IL_0012 Pops the address to fetch and push the actual Int32 value.
  • IL_0013 Pops second's value and first's value to perform a xor and push the result.
  • IL_0014 Pops the xor's result and first's address to store the value at the address.
  • IL_0015 Gets the hell outta here.

Click to display diagram.

To sum up, this method uses 3 levels of stack, declares no local variables, loads 6 addresses, duplicates 3 of these, loads 6 values through these addresses, stores 3 values to addresses and performs 3 operations. This one is definitely heavier than the previous.

I believe is worth noting that CIL doesn't necessarily translate directly or proportionally to machine code, it's just one step closer. Analyzing the machine code is a whole other story; a story I don't plan getting close to anywhere in the near future.

Random thoughts: I wonder why .maxstack is set on 8 for SwapIntXor when 3 would suffice, but for what I've seen this is pretty common. Anyhow, I have no idea if anyone will find all this information useful at all, but it was an extremely helpful exercise for me and a lot of fun.

Edit: Switched the benchmark. It was previously done in debug mode.

Follow up: Who's of First

Sunday, May 24, 2009

Evil Mutant Value Types

Setting a stucture's property through Reflection

Not so long ago I was presented with the following problem. Consider the following struct:

public struct UglyMutableStruct
{

    private int x;

    public UglyMutableStruct(int x)
    {
        this.x = x;
    }

    public int X
    {
        get { return this.x; }
        set { this.x = value; }
    }

    public override string ToString()
    {
        return string.Format("X: {0}", X);
    }

}

And the simple task was to set X's value through reflection:

UglyMutableStruct instance = new UglyMutableStruct(10);
Console.WriteLine(instance);

Type myType = typeof(UglyMutableStruct);

PropertyInfo propertyX = myType.GetProperty("X");
propertyX.SetValue(instance, 15, null);

Console.WriteLine(instance);

Output:

X: 10
X: 10
Press any key to continue . . .

Puzzled by this seemingly bizarre behavior, I searched the web to find a solution. I found a couple of people saying that it just can't be done, live with it. What I didn't find and wanted to know is why it can't be done. So, since I saw little analysis to be done to this particular bit of code I checked out the CIL through ildasm to see what could be happening. I came to my attention that instance was being boxed at the SetValue method, but never unboxed. OK, so let's try doing the boxing/unboxing ourselves.

UglyMutableStruct instance = new UglyMutableStruct(10);
Console.WriteLine(instance);

Object objectInstance = instance;
Type myType = typeof(UglyMutableStruct);

PropertyInfo propertyX = myType.GetProperty("X");
propertyX.SetValue(objectInstance, 15, null);
instance = (UglyMutableStruct)objectInstance;

Console.WriteLine(instance);

Output:

X: 10
X: 15
Press any key to continue . . .

It works. That was the problem! I left it at that, but something still felt not quite right.

What was really going on is that looking at the tree, I missed the forest. I saw reflection and thought that the underlying problem had to be reflection. Never stopped to think that when you pass a value type as a parameter the CLR sends a copy, and not the actual instance. The value of it was being set alright, but to the copy and not to my instance. It was just the expected behavior, just like this:

static void Main(string[] args)
{
    UglyMutableStruct instance = new UglyMutableStruct(10);
    Console.WriteLine(instance);
    SetValueToStruct(instance);
    Console.WriteLine(instance);
}

static void SetValueToStruct(UglyMutableStruct aCopy)
{
    Console.WriteLine("   --- Start method's scope ---");
    Console.WriteLine("   {0}", aCopy);
    aCopy.X = 15;
    Console.WriteLine("   {0}", aCopy);
    Console.WriteLine("   --- End method's scope ---");
}

Output:

X: 10
   --- Start method's scope --
   X: 10
   X: 15
   --- End method's scope --
X: 10
Press any key to continue . . .

The output is just as one would expect and it is exactly what was happening earlier.

This is yet another reason why mutable structs are plain evil. Only use them when performance is an issue.

Random thoughts: Well, my first thought is why on earth would someone be using reflection on a known type? Someone may want to keep things as abstract as they can to lower coupling. This is if you know that sometime in the future the actual property setting will be on an unknown type, but still if that comes to happen, you will be handling an object anyways, wouldn't you? Another scenario would be setting a large amount of data (many properties) and you'd like to keep the code clean. Still having a large amount of data kind of defeats the whole purpose of using a struct. So my conclusion is that this scenario is not seen very often, and if it happens it probably means that you need to rethink your design rather than make it work. But what's more important is that using mutable structs sooner or later means facing some unexpected behavior. If you have to use one, keep always in mind that you are.

Saturday, May 16, 2009

Co-variance / Contra-variance - Part II

This is a follow-up to my previous post: Co-variance / Contra-variance - Part I.

On my previous post I used the unfortunate expression:

All this suffering will be finally put to an end once C# 4.0 comes out with the lovely new feature of safe Co- and Contra-variance.

As it turns out, this isn't entirely true. Sure, this new feature will definitely make like a lot easier but it will not give us a straight forward solution for the problems I stated. The main reason is that safe co- and contra-variance will be only applicable to interfaces and delegates due to a restriction in the CLR. I don't know what this restriction is but I'll promptly post it once I find it out. Another restriction is that safe variance can only be applied when there is a reference conversion between the generic arguments. This means that no variance will be possible between int and object and I suspect that it will not be possible between two types where an implicit cast operator is declared.

Even though we cannot directly fix the methods on my previous post to work with safe variance, it can definitely be refactored into a safely variant solution. I will discuss how it would be achieved although for educational purposes I chose to implement something entirely new that is easier to understand and it's a better example of safe variance. Also I chose something different to what Mads Torgersen wrote for the “New Features in C# 4.0” document in C# Future, to both train myself and to put a brand new example out there.

So let's get our hands dirty:

public interface IConverter<TSource, TDestination>
{
    TDestination Convert(TSource source);
}

This interface can come in handy in lots of scenarios. Now we would like to implement a generic UI, so we may declare a form like this:

public partial class MyGenericForm<TBusinessObject> : Form
{

    private UserControl DisplayControl { get; set; }

    public MyGenericForm()
    {
        FormBorderStyle = FormBorderStyle.SizableToolWindow;

        InitializeComponent();
    }

    public MyGenericForm(IConverter<TBusinessObject, UserControl> converter, TBusinessObject sourceObject)
        :this()
    {
        AsignSourceObject(converter, sourceObject);
    }

    public void AsignSourceObject(IConverter<TBusinessObject, UserControl> converter, TBusinessObject sourceObject)
    {
        this.SuspendLayout();

        if (DisplayControl != null)
        {
            Controls.Remove(DisplayControl);
            DisplayControl.Dispose();
        }

        DisplayControl = converter.Convert(sourceObject);
        this.Text = DisplayControl.Text;
        this.ClientSize = DisplayControl.Size;
        this.Controls.Add(DisplayControl);

        this.ResumeLayout();
    }

    private void CloseButton_Click(object sender, EventArgs e)
    {
        this.Close();
    }

}

This form displays any kind of TBusinessObject given that you implement an IConverter that converts it to a UserControl. So we implement the class Person, UserControlPerson and a converter between them:

public class Person
{
    public string UserName { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class PersonToUserControlConverter : IConverter<Person, UserControl>
{
    public UserControl Convert(Person source)
    {
        PersonUserControl ctrl = new PersonUserControl();
        ctrl.User = source;
        return ctrl;
    }
}

Now we use the form like this:

Person john = new Person() { UserName = "Johnny", FirstName = "John", LastName = "Doe" };
using (MyGenericForm<Person> frm = new MyGenericForm<Person>(new PersonToUserControlConverter(), john))
{
    frm.ShowDialog();
}

Cool, it works as expected.

Further on the development of my project, I found myself needing a new form with some other features and not as restrictive as this one. I make it.

public partial class AnotherGenericForm<TBusinessObject> : Form
{
    public AnotherGenericForm()
    {
        InitializeComponent();
    }

    // Form's code

    public void AsignSourceObject(IConverter<TBusinessObject, Control> converter, TBusinessObject sourceObject)
    {
        // Some code
    }

    // More code

}

Unfortunately I can't use PersonToUserControl converter since it doesn't implement the IConverter<TBusinessObject, Control> interface. No problem, I'll just declare it and explicitly implement it:

public class PersonToUserControlConverter
    : IConverter<Person, UserControl>, IConverter<Person, Control>
{
    public UserControl Convert(Person source)
    {
        PersonUserControl ctrl = new PersonUserControl();
        ctrl.User = source;
        return ctrl;
    }

    Control IConverter<Person, Control>.Convert(Person source)
    {
        return this.Convert(source);
    }
}

OK, suddenly I realize the following: Eventually I may need to implement IConverter<TBusinessObject, ScrollableControl> and who knows what else. It seems that this will come back to haunt me. This isn't the best course of action.

The solution: Co-Variance.
With C# 4.0 I can make the IConverter interface co-variant. How? Using the out keyword.

public interface IConverter<TSource, out TDestination>
{
    TDestination Convert(TSource source);
}

Declaring the interface in this manner makes it safely co-variant. The out keyword indicates that TDestination can only be in an output position. Now IConverter<Person, UserControl> is also considered as IConverter<Person, ContainerControl>and all the way up the inheritance chain up to IConverter<Person, Object>. Notice that no breaking changes are introduced and we can now safely use our converter in the new form. Now we can safely declare:

IConverter<Person, Control> converter = new PersonToUserControlConverter();

In a similar fashion, we might end up having a form that has the following method:

public void AsignSourceObject(IConverter<ForumUser, Control> converter, TBusinessObject sourceObject)
{
    // Some code
}

ForumUser inherits from User. We can't use the same approach because in our interface TSource is an input. Also covariance goes up the inheritance chain, not downwards.

The solution: Contra-variance.
As you have already guessed, contra-variance and the in keyword will help us out here. Our (hopefully) final declaration of our IConverter is:

public interface IConverter<in TSource, out TDestination>
{
    TDestination Convert(TSource source);
}

The in keyword indicates that TSource can only be in an input position. Now IConverter<Person, UserControl> is also considered as IConverter<ForumUser, UserControl> and any other inheritance ramifications of Person.

Now, all of the following is completely legal:

PersonToUserControlConverter converter = new PersonToUserControlConverter();

IConverter<ForumUser, Control> forumUserToControl = converter;
IConverter<ForumUser, UserControl> forumUserToUserControl = converter;
IConverter<ForumAdministrator, ContainerControl> forumUserToControl = converter;
IConverter<Person, object> forumUserToControl = converter;
IConverter<Person, IComponent> forumUserToControl = converter;

Notice the last declaration. Even though IComponent isn't exactly in the inheritance chain, UserControl implements it, so it's legal too.

Why the in/out keywords?

Well, the obvious answer is out for output position and in for input position. Well… duh! But why is co-variance allowed only through arguments in output position? Well, if you think about it, it makes sense. Consider the following classes:

public class A { }
public class B : A { }
public class C : A { }
public class D : C { }
public class E : C { }

Now, if we have a method that returns a C we are certain that we can treat it as an A. We don't know if we can treat him as D since it might be a C, D or E. Hence, it is co-variant. Now if the object was to be used as a method's parameter, it means the method needs a C. So we can feed that method anything that can be considered a C, like D or E. We can't feed it with an A or a B since they cannot be used as a C. Hence, it is contra-variant. My head hurts.

Co- and Contra-variant Types in .NET Framework 4.0

public interface IEnumerable<out T> : IEnumerable
{
    new IEnumerator<T> GetEnumerator();
}

public interface IEnumerator<out T> : IEnumerator, IDisposable
{
    new T Current { get; }
}

public interface IComparer<in T>
{
    int Compare(T x, T y);
}

public delegate TResult Func<out TResult>();
public delegate TResult Func<in T, out TResult>(T arg);
public delegate TResult Func<in T1, in T2, out TResult>(T1 arg1, T2 arg2);
public delegate TResult Func<in T1, in T2, in T3, out TResult>(T1 arg1, T2 arg2, T3 arg3);
public delegate TResult Func<in T1, in T2, in T3, in T4, out TResult>(T1 arg1, T2 arg2, T3 arg3, T4 arg4);
public delegate void Action<in T>(T arg);
public delegate void Action<in T1, in T2>(T1 arg1, T2 arg2);
public delegate void Action<in T1, in T2, in T3>(T1 arg1, T2 arg2, T3 arg3);
public delegate void Action<in T1, in T2, in T3, in T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4);

A couple of educated guesses:

public interface IEqualityComparer<in T>
{
    // Members
}

public interface IGrouping<out TKey, out TElement>
{
    // Members
}

public interface IQueryable<out TElement>
{
    // Members
}

public delegate void EventHandler<in TEventArgs>(object sender, TEventArgs e);
public delegate bool Predicate<in T>(T obj);
public delegate int Comparison<in T>(T x, T y);

Solution to Previous Post

I'm not going to go into details about the solution. To the co-variance problem, well the solution is pretty straight forward. We change the SafeMethod to have an IEnumerable<MyBase> as parameter. Just like that we could now feed it with a List<MyFirstDerived> without any problems.

The contra-variant problem is not that simple. One possible solution would involve defining our own interface and class:

public interface ICollectable<T> 
{

    int Count { get; }
    void Add(T item);
    void Insert(int index, T item);
    bool Remove(T item);
    void RemoveAt(int index);
    void Clear();
    bool Contains(T item);

}

public partial class Pack<TAnimal>: IList<TAnimal>, ICollectable<TAnimal>, IList
    where TAnimal : Animal
{

    private List<TAnimal> innerList;

    // A whole lot of delegations!

}

Actually, I'd take it a little further and remove the Animal restriction to create yet another generic collection. The ICollectable gives you an interface of a collection where you can blindly modify it. Pretty weird, huh? Anyways, now our AddTigerToPack method would receive an ICollectable<Tiger> and our Pack<Animals> will be considered as such thanks to safe contra-variance.

Random thoughts: I wonder if they would include a better named ICollectable<in T> interface… Seriously, it keeps marveling me how they manage to keep introducing major changes without breaking any existing code. They manage to add a wonderful new functionality which opens a lot of possibilities without restricting what we already had. All these while still keeping things strongly typed and relatively simple. Kudos!

Sunday, May 10, 2009

Co-variance / Contra-variance - Part I

Apparently, one common mistake made by C# programmers involves co-variance and contra-variance. Among the cross-thread operation not valid exception and inter-form communication, one of the most common questions seen on the Microsoft's Visual C# Forum is why a List<MyDerivedType> can't be cast to a List<MyBaseType>. I must confess that the first time encountered this problem I was both surprised and pissed. In time I learned that when the language doesn't allow you to perform some kind of action, there is a good reason behind it. Soon enough I realized that this wasn't the exception.

Take into consideration the following code:

public class MyBase
{
    public virtual void Operation()
    {
        // Some code.
    }
}

public class MyFirstDerived : MyBase
{
    public override void Operation()
    {
        // Some code.
    }
}

public class MySecondDerived : MyBase
{
    public override void Operation()
    {
        // Some code.
    }
}

class Program
{
    static void Main(string[] args)
    {
        IList<MyFirstDerived> myStronglyTypedList = new List<MyFirstDerived>();
        BlowUpMethod(myStronglyTypedList);
        SafeMethod(myStronglyTypedList);
    }

    static void BlowUpMethod(IList<MyBase> aNotSoStronglyTypedList)
    {
        aNotSoStronglyTypedList.Add(new MySecondDerived());
    }

    static void SafeMethod(IList<MyBase> aNotSoStronglyTypedList)
    {
        foreach (MyBase aBaseObject in aNotSoStronglyTypedList)
        {
            aBaseObject.Operation();
        }
    }
}

Since C#, up to version 3.0, is invariant this code does not compile. One of the errors states:

Argument '1': cannot convert from 'System.Collections.Generic.IList<CoVarianceContraVariance.MyFirstDerived>' to 'System.Collections.Generic.IList<CoVarianceContraVariance.MyBase>'

If it did work, in SafeMethod there would no issues at all. This would work without any undesired secondary effects. This behavior is called co-variance. On the other hand, after invoking the BlowUpMethod we would have a MySecondDerived instance in a list were only MyFirstDerived (and derivations) instances are allowed. This will defeat the whole purpose of having strongly typed lists, wouldn't it?

Fortunately, there is a workaround to allow some sort of co-variance behavior that I find extremely useful in some scenarios.

static void SafeMethod<T>(IList<T> aStronglyTypedList) where T : MyBase
{
    foreach (MyBase aBaseObject in aStronglyTypedList)
    {
        aBaseObject.Operation();
    }
}

Works like a charm!

As you would expect a similar problems occurs when trying to apply contra-variance, but it is in fact a little trickier:

public abstract class Animal {}
public abstract class Carnivore : Animal
{
    public void Hunt (Animal prey)
    {
        // Some code.
    }
}
public class Tiger : Carnivore { }
public abstract class Herbivore : Animal { }
public class Antelope : Herbivore { }

public class Program
{
    static void Main(string[] args)
    {
        List<Animal> animals = new List<Animal>()
        {
            new Antelope(),
            new Tiger()
        };
        HuntingOverkill(animals, new Antelope());
        AddTigerToPack(animals);
    }

    static void HuntingOverkill(List<Carnivore> packOfCarnivores, Animal prey)
    {
        foreach (Carnivore carnivore in packOfCarnivores)
        {
            carnivore.Hunt(prey);
        }
    }

    static void AddTigerToPack(List<Tiger> registeredTigers)
    {
        registeredTigers.Add(new Tiger());
    }
}

Again, this code will not compile since a list of animals cannot be converted to a list of tigers. If it did, the method HuntingOverkill would be making a defenseless antelope hunt, and that is why this isn't allowed, although the method AddTigerToPack could work without any problems.

A similar workaround could be tried here, but it wouldn't work either, since if we made registeredTigers a list of T with T inheriting from Animal, we could end up with a Tiger inside a pack of herbivores. Those poor little creatures, can you imagine the chaos? So I tried to work up some workaround for this too, but although I tried banging my head with the keyboard, I couldn't come up with one that looked cool. I only ended up with ugly methods with generic arguments with lots of constraints and that couldn't be inferred by the compiler when calling them.

All this suffering will be finally put to an end once C# 4.0 comes out with the lovely new feature of safe Co- and Contra-variance. I will elaborate on how it will work on my next post.

Random thoughts: One could possible ask why the C# compiler isn't smart enough to allow cases like SafeMethod or AddTigerToPack. Well, first of all, I wouldn't like the compiler looking so deep into my code before compiling. A line has to be drawn at some point and I think this goes far beyond that line. Also this example is pretty simple and this cases of variance can (and will) end up in much more complex code. It will end up making the compiler having to do an awful lot of work and it would take huge amount of time in big projects. Finally and most importantly, changing something that is completely legal inside the method's scope would introduce breaking changes! It can even cause an application using our class library to stop working. So no, this is completely out of the question.

Follow-up: Co-variance / Contra-variance - Part II.