Friday, 7 October 2016

Performance: Metadata calls vs Exceptions

Today I was on a mission to check the existence of a field on an entity. If it exists, do something; if not... well, do nothing! My first thought was "metadata... urgh!". As all of us in the CRM development world know, the metadata service is notoriously slow. Reading Stack Overflow, the recommended answer was always to use the metadata. I thought to myself there has to be a better way to do this, which gave me an idea: which performs faster, exceptions or the metadata? To hit the metadata you need to run something like this:
// Retrieve the full attribute metadata for the account entity
RetrieveEntityRequest request = new RetrieveEntityRequest
{
    EntityFilters = EntityFilters.Attributes,
    LogicalName = "account"
};
var response = (RetrieveEntityResponse)service.Execute(request);

// Scan the returned attributes for the field we care about
AttributeMetadata first = null;
foreach (var element in response.EntityMetadata.Attributes)
{
    if (element.LogicalName == "xyz_fieldname")
    {
        first = element;
        break;
    }
}
var fieldExists = first != null;
I chose not to use LINQ/expressions when checking for the field, just to try to keep it as performant as possible. This code is fine in general, but boy is it slow. So I came up with this as an alternative instead:
try
{
    // Query on the primary key, so the result set is empty (or one record at most)
    var query = new QueryExpression("account");
    query.Criteria.AddCondition("accountid", ConditionOperator.Equal, "294450db-46c9-447e-a642-3babf913d800");
    query.NoLock = true;
    query.ColumnSet = new ColumnSet("xyz_fieldname");
    service.RetrieveMultiple(query);
}
catch
{
    // ignored - an exception here means the column doesn't exist
}
Using a query expression has two advantages. First, you're running the query against the primary key (accountid). You don't care about the id itself; the code will either throw an exception if the field doesn't exist, or return no records if it succeeds (one record if you are the unluckiest guy on the planet and hit a matching guid... but even then it would be faster). Second, you can run it with NoLock. You really don't care about the result set; the purpose isn't to find a record, it's to see whether including the column forces an exception. So how do you test the execution of this? I wrote a console app that used a stopwatch to wrap each call. The first time I ran the tests I ran each console app independently, so as not to skew the results through code optimisation on whichever call ran first. For a single call, the exception approach generally executed about 1.5 times faster. Sample code is this:
private static void RunExceptionTests(IOrganizationService service, int steps)
{
    Console.WriteLine("Testing exception with {0} steps", steps);
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < steps; i++)
    {
        try
        {
            var query = new QueryExpression("account");
            query.Criteria.AddCondition("accountid", ConditionOperator.Equal, "294450db-46c9-447e-a642-3babf913d800");
            query.NoLock = true;
            query.ColumnSet = new ColumnSet("xyz_fieldname");
            service.RetrieveMultiple(query);
        }
        catch
        {
            // ignored
        }
    }
    stopwatch.Stop();
    Console.WriteLine("Milliseconds taken: {0}", stopwatch.ElapsedMilliseconds);
}

private static void RunMetadataTest(IOrganizationService service, int steps)
{
    Console.WriteLine("Testing metadata with {0} steps", steps);
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < steps; i++)
    {
        RetrieveEntityRequest request = new RetrieveEntityRequest
        {
            EntityFilters = EntityFilters.Attributes,
            LogicalName = "account"
        };
        var response = (RetrieveEntityResponse)service.Execute(request);
        AttributeMetadata first = null;
        foreach (var element in response.EntityMetadata.Attributes)
        {
            if (element.LogicalName == "xyz_fieldname")
            {
                first = element;
                break;
            }
        }
        var fieldExists = first != null;
    }
    stopwatch.Stop();
    Console.WriteLine("Milliseconds taken: {0}", stopwatch.ElapsedMilliseconds);
}
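
For completeness, the console app driver is nothing more than a Main that builds the service and runs both tests. A minimal sketch, assuming CrmServiceClient from Microsoft.Xrm.Tooling.Connector for the connection (any IOrganizationService works, and the connection string and step count here are illustrative rather than my exact code):

private static void Main(string[] args)
{
    // Illustrative connection only - swap in however you normally build an IOrganizationService
    var client = new CrmServiceClient("AuthType=Office365;Url=https://yourorg.crm.dynamics.com;Username=user@yourorg.onmicrosoft.com;Password=...");
    IOrganizationService service = client;

    const int steps = 100;
    RunExceptionTests(service, steps);
    RunMetadataTest(service, steps);

    Console.ReadKey();
}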
I ran several tests on CRM Online, varying between multiple/single calls to the metadata vs multiple/single RetrieveMultiple calls throwing an exception. Exceptions always outperformed the metadata. If you're performing multiple calls within the same code, it jumps to about 3 times faster (I'm guessing optimisations come into play). Because of how plugins are loaded into memory for faster execution, I would expect it to regularly perform 2 - 3 times faster than a metadata call. Either way, the bottleneck is the call to the metadata service, which cannot be optimised unless you introduce caching and more code complexity. Also, if the field exists you won't get an exception at all, which is yet another performance bonus. The only scenario I haven't tested is running it against a massively heavily used entity... but if you're hitting your database so hard that a NoLock retrieve cannot return in an acceptable time frame, you probably have bigger problems to worry about! To conclude, can I just say the following: Microsoft, will you fix your metadata service already! It's been slow for donkey's years and is a real annoyance when you have to use it.
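
As an aside, if you do have to stay on the metadata route, caching is the obvious mitigation: pay for the metadata call once per entity/field pair and reuse the answer. A rough sketch of that idea (my own illustration, not part of the test harness above; it needs System.Collections.Concurrent and System.Linq):

private static readonly ConcurrentDictionary<string, bool> FieldExistsCache =
    new ConcurrentDictionary<string, bool>();

private static bool FieldExists(IOrganizationService service, string entityName, string fieldName)
{
    // Only the first call per entity/field pair pays the metadata cost
    return FieldExistsCache.GetOrAdd(entityName + "|" + fieldName, key =>
    {
        var request = new RetrieveEntityRequest
        {
            EntityFilters = EntityFilters.Attributes,
            LogicalName = entityName
        };
        var response = (RetrieveEntityResponse)service.Execute(request);
        return response.EntityMetadata.Attributes.Any(a => a.LogicalName == fieldName);
    });
}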

Thursday, 6 October 2016

Extended CrmSvcUtil - Exporting an attribute list

A little feature of the Extended CrmSvcUtil I neglected to mention in my previous post (it was a late feature!) is the ability to export a list of strongly typed attribute names. This helps remove "magic strings" from your source and introduce some strongly typed attribute checking. This is useful, even when using Early Bound, as you often need to check the existence of an attribute.
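
The output is essentially one static class of attribute name constants per entity. The exact shape depends on the generation options, but it looks roughly like this (the members shown here are just illustrative):

public static class ContactAttributes
{
    public const string FirstName = "firstname";
    public const string LastName = "lastname";
    public const string ParentCustomer = "parentcustomerid";
}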

For example, let's say you are writing a plugin that fires on update of contact and you wish to include a pre image containing the parent account. What you will often see is code like this:

if (preImage.Contains("parentcustomerid") == false)
{
    //trace / throw an exception stating the parent count hasn't been provided...
}

Checking for null is not the same as checking for existence, because some contacts might not have a parent account set. So the check for existence is often quite important. Using an attribute list allows you to strongly type this instead as follows:

if (preImage.Contains(ContactAttributes.ParentCustomer) == false)
{
    //trace / throw an exception stating the parent account hasn't been provided...
}
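
To make the distinction concrete: the image can contain the attribute with a null value (the contact genuinely has no parent account), or it can be missing entirely (the image wasn't configured to include it). A quick illustration of handling both cases:

if (preImage.Contains(ContactAttributes.ParentCustomer) == false)
{
    // attribute not supplied at all - most likely the image registration is missing the column
}
else if (preImage.GetAttributeValue<EntityReference>(ContactAttributes.ParentCustomer) == null)
{
    // attribute supplied, but this contact simply has no parent account
}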

Another area where this is incredibly useful is when building queries using Query Expressions or Fetch Expressions. If you want to include a set of columns, or set a condition on an attribute, you will end up with this type of code:

QueryExpression qe = new QueryExpression();
qe.EntityName = "contact";
qe.ColumnSet = new ColumnSet();
qe.ColumnSet.Columns.Add("parentcustomerid");

Being able to specify the attribute strongly, like the following, looks much better:

qe.ColumnSet.Columns.Add(ContactAttributes.ParentCustomer);
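
The same applies to conditions: where you would otherwise pass the attribute name to AddCondition as a magic string, the generated constant slots straight in:

qe.Criteria.AddCondition(ContactAttributes.ParentCustomer, ConditionOperator.NotNull);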

This helps work around many issues, like typo bugs or name changes, such as when somebody accidentally creates a field called "new_ProjjectType" and wishes to fix the name of the field. If 5 or 6 plugins already reference this field and perform some logic based on its value, you might end up with multiple "magic strings" to fix across your code. Using an attribute list is a one-fix solution to the problem.

The source for the Extended CrmSvcUtil can be downloaded from GitHub, with the latest release available to download from here