Attempted Dictionary Access

2013-11-09 7:33 PM

Dictionaries in C# are super useful for any sort of key/value-related dataset, but what is the best way to access that data?

If you're confident that your source data is there, the clearest way is obviously to use the [] indexing operator:

var englishToSpanishWords = new Dictionary<string, string>
{
    { "ball", "pelota" },
    { "cup", "taza" },
    { "man", "hombre" }
};
var ballInSpanish = englishToSpanishWords["ball"];

But what if I'm not confident that my requested key will be there? If I attempt

var womanInSpanish = englishToSpanishWords["woman"];

I'll get a KeyNotFoundException, since my attempted key wasn't in the dictionary.
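To make that failure concrete, here is a minimal sketch that indexes the same dictionary with a present key and a missing one. The MissingKeyDemo class and its IndexerThrows helper are names made up for this illustration, not anything from the framework.

```csharp
using System;
using System.Collections.Generic;

public static class MissingKeyDemo
{
    // Hypothetical helper: returns true if indexing with the given key
    // throws KeyNotFoundException, false if the lookup succeeds.
    public static bool IndexerThrows(IDictionary<string, string> dictionary, string key)
    {
        try
        {
            string ignored = dictionary[key];
            return false;
        }
        catch (KeyNotFoundException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var englishToSpanishWords = new Dictionary<string, string>
        {
            { "ball", "pelota" },
            { "cup", "taza" },
            { "man", "hombre" }
        };

        Console.WriteLine(IndexerThrows(englishToSpanishWords, "ball"));  // False
        Console.WriteLine(IndexerThrows(englishToSpanishWords, "woman")); // True
    }
}
```

Catching the exception like this works, but exceptions are a heavy tool for something as routine as a missing key.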

There are two relatively straightforward ways to guard against a missing key:

Simple Null Check

This method is completely straightforward, and is a fairly simple bit of null-armoring: check that the key exists before indexing, and return null if it doesn't.

static string TryToGetValue_NullCheck(IDictionary<string, string> dictionary, string key)
{
    if (dictionary.ContainsKey(key))
    {
        return dictionary[key];
    }
    return null;
}

However, .NET provides another method:

TryGetValue

To be completely blunt, I do not like the TryGetValue() method that .NET provides for dictionaries. The out coding paradigm is completely outdated since we now have nullable types. However, it's a method provided by the framework, so it's worth looking at:

static string TryToGetValue_Framework(IDictionary<string, string> dictionary, string key)
{
    string output;
    if (dictionary.TryGetValue(key, out output))
    {
        return output;
    }
    return null;
}

Again, this is pretty straightforward, but the workflow is not as immediately obvious as the regular null armoring: set up a variable to house the result, attempt to set that variable using the dictionary and key, then conditionally return a result depending on whether the lookup succeeded.
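For readers on C# 7 or later, the out variable can be declared inline, which collapses the three steps into a single expression. This is only a sketch of the same workflow in newer syntax; the InlineOutDemo class and the TryToGetValue_InlineOut name are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class InlineOutDemo
{
    // Same behavior as TryToGetValue_Framework, with the out variable
    // declared inline (requires C# 7 or later).
    public static string TryToGetValue_InlineOut(IDictionary<string, string> dictionary, string key)
    {
        return dictionary.TryGetValue(key, out var output) ? output : null;
    }

    public static void Main()
    {
        var englishToSpanishWords = new Dictionary<string, string>
        {
            { "ball", "pelota" }
        };

        Console.WriteLine(TryToGetValue_InlineOut(englishToSpanishWords, "ball"));              // pelota
        Console.WriteLine(TryToGetValue_InlineOut(englishToSpanishWords, "woman") ?? "(null)"); // (null)
    }
}
```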

But Which One is Better?

As far as coding practice goes, it's a toss-up. The first method is a little more straightforward and readable. The second is provided by the framework and, while it's not unreadable, it uses a coding paradigm which I consider an antipattern.

How do we solve this? Let's do some profiling and see which one is faster!

Here's a small app I whipped up to profile the performance of these methods:

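using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Place this method inside a class (for example, Program) to compile it.
public static void Main(string[] args)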
{
    var size = 10000000;
    var target = Enumerable.Range(0, size)
                            .ToDictionary(a => a, a => Guid.NewGuid());

    var resultList = new List<Guid>();

    var sourceData = Enumerable.Range(size / 2, size);

    var timer = Stopwatch.StartNew();
    // Half will hit the dictionary, half will not
    foreach(var i in sourceData)
    {
        if (target.ContainsKey(i))
        {
            resultList.Add(target[i]);
        }
    }
    timer.Stop();
    Console.WriteLine("Contains/Index access: {0} milliseconds for {1} total requests",    
                        timer.ElapsedMilliseconds, size);

    resultList.Clear();
    timer.Restart();

    // Half will hit the dictionary, half will not
    foreach (var i in sourceData)
    {
        Guid result;
        if(target.TryGetValue(i, out result))
        {
            resultList.Add(result);
        }
    }
    timer.Stop();
    Console.WriteLine("TryGet access: {0} milliseconds for {1} total requests",
                        timer.ElapsedMilliseconds, size);

    Console.Read();
}

Execute!

Contains/Index access: 717 milliseconds for 10000000 total requests
TryGet access: 439 milliseconds for 10000000 total requests

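Wow - the framework's method takes about 39% less time than the ContainsKey check (439 ms versus 717 ms). This isn't necessarily shocking, since the .NET guys built the method and it only has to look the key up once, whereas ContainsKey followed by the indexer looks it up twice on every hit. It's enough of a difference to warrant use of that method.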
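The number of lookups each approach performs can be checked directly rather than inferred from timings. The sketch below passes the dictionary a custom comparer that counts how many times a key gets hashed. CountingComparer and LookupCountDemo are hypothetical names made up for this illustration, and the counts assume the standard Dictionary<TKey, TValue> implementation, which hashes the key once per lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: an int comparer that counts hash code requests.
public sealed class CountingComparer : IEqualityComparer<int>
{
    public int HashCalls { get; private set; }

    public bool Equals(int x, int y) { return x == y; }

    public int GetHashCode(int obj)
    {
        HashCalls++;
        return obj;
    }
}

public static class LookupCountDemo
{
    // Counts key hashes for the ContainsKey-then-indexer pattern.
    public static int HashCallsForContainsThenIndex(int key)
    {
        var comparer = new CountingComparer();
        var target = new Dictionary<int, string>(comparer) { { 1, "one" } };
        int before = comparer.HashCalls; // ignore the hash done while adding

        string value = null;
        if (target.ContainsKey(key))
        {
            value = target[key];
        }
        return comparer.HashCalls - before;
    }

    // Counts key hashes for a single TryGetValue call.
    public static int HashCallsForTryGetValue(int key)
    {
        var comparer = new CountingComparer();
        var target = new Dictionary<int, string>(comparer) { { 1, "one" } };
        int before = comparer.HashCalls;

        string value;
        target.TryGetValue(key, out value);
        return comparer.HashCalls - before;
    }

    public static void Main()
    {
        Console.WriteLine(HashCallsForContainsThenIndex(1));  // 2 (hit: two lookups)
        Console.WriteLine(HashCallsForTryGetValue(1));        // 1 (hit: one lookup)
        Console.WriteLine(HashCallsForContainsThenIndex(99)); // 1 (miss: ContainsKey only)
        Console.WriteLine(HashCallsForTryGetValue(99));       // 1 (miss: one lookup)
    }
}
```

Note that the extra lookup only happens on hits, so in the benchmark above only half of the requests pay for it.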

Wrapping the (ugh) out Method

Since I can't stand calling methods with out parameters, let's wrap this in a helpful extension method that attempts to get a result and returns null if none is found.

public static TValue ValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> source, 
    TKey key) where TValue : class
{
    TValue result;
    return source.TryGetValue(key, out result) ? result : null;
}

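The only problem here is the class constraint: an IDictionary<> whose value type is a struct (int, Guid, and so on) will not compile against this method, because a struct can't be null. We need a different method for that, one that returns a nullable version of the value type.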

public static TValue? ValueOrNull<TKey, TValue>(this IDictionary<TKey, TValue> source, 
    TKey key) where TValue : struct
{
    TValue result;
    return source.TryGetValue(key, out result) ? result : (TValue?)null;
}
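Here is a minimal usage sketch of the two wrappers side by side. The extension methods are repeated so the example compiles on its own. The DictionaryExtensions and ExtensionUsageDemo class names and the wordLengths dictionary are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryExtensions
{
    // For reference-type values: returns null when the key is missing.
    public static TValue ValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> source,
        TKey key) where TValue : class
    {
        TValue result;
        return source.TryGetValue(key, out result) ? result : null;
    }

    // For struct values: returns an empty Nullable<TValue> when the key is missing.
    public static TValue? ValueOrNull<TKey, TValue>(this IDictionary<TKey, TValue> source,
        TKey key) where TValue : struct
    {
        TValue result;
        return source.TryGetValue(key, out result) ? result : (TValue?)null;
    }
}

public static class ExtensionUsageDemo
{
    public static void Main()
    {
        var englishToSpanishWords = new Dictionary<string, string> { { "ball", "pelota" } };
        var wordLengths = new Dictionary<string, int> { { "ball", 4 } };

        // Reference-type values: fall back with the ?? operator.
        Console.WriteLine(englishToSpanishWords.ValueOrDefault("ball"));                         // pelota
        Console.WriteLine(englishToSpanishWords.ValueOrDefault("woman") ?? "(no translation)");  // (no translation)

        // Struct values: check HasValue, or fall back with ?? as well.
        Console.WriteLine(wordLengths.ValueOrNull("ball"));            // 4
        Console.WriteLine(wordLengths.ValueOrNull("woman").HasValue);  // False
        Console.WriteLine(wordLengths.ValueOrNull("woman") ?? -1);     // -1
    }
}
```

Both wrappers turn a missing key into a null, so the call site can use ?? for a fallback instead of declaring an out variable.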