

LINQ Macros

A colleague was asking how to construct a particular LINQ “operator macro” today. Basically, he was finding it inconvenient to repeat boilerplate for particular operator patterns in his code, but was struggling to inline expression snippets into LINQ query expressions. Thought I’d share a sample macro because the pattern is generally useful. I’ll illustrate a simple Left Anti Semi Join (LASJ) operator macro, but you can hopefully recognize other possibilities as well.

The LASJ operator takes as arguments a “left” collection, a “right” collection and a join predicate. It returns all elements on the left that have no matching elements on the right. My first attempt uses the LINQ “Any” operator:

public static IEnumerable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TRight, bool> predicate)
{
    return left.Where(l => !right.Where(r => predicate(l, r)).Any());
}
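
As a quick sanity check, here is a minimal sketch of calling this first version on in-memory data. The class name LasjSample and the integer ids are made up for illustration; they stand in for categories and the category id of each product.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LasjSample
{
    // Same operator as above, repeated so the sketch is self-contained.
    public static IEnumerable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TRight, bool> predicate)
    {
        return left.Where(l => !right.Where(r => predicate(l, r)).Any());
    }

    // Hypothetical sample data: category ids, and the category id of each product.
    public static int[] CategoriesWithoutProducts()
    {
        var categoryIds = new[] { 1, 2, 3, 4 };
        var productCategoryIds = new[] { 1, 1, 3 };

        // Keep the categories for which no product has a matching id.
        return categoryIds
            .LeftAntiSemiJoin(productCategoryIds, (c, p) => c == p)
            .ToArray(); // 2 and 4 have no matching product
    }
}
```

Every left element triggers a scan of the right collection here, which is acceptable for small in-memory sets and nothing more.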

Works fine, but what if I want to issue a database query? In the above example, I’m potentially treating IQueryable collections as in-memory IEnumerable collections, which is horribly inefficient. LINQ to Objects will scan the contents of the left collection and, for every element, scan the contents of the right collection until it finds an element matching the predicate. Let’s change the operator slightly to address this concern:

public static IQueryable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
    this IQueryable<TLeft> left,
    IQueryable<TRight> right,
    Func<TLeft, TRight, bool> predicate)
{
    return left.Where(l => !right.Where(r => predicate(l, r)).Any());
}

Unfortunately, I now get a NotSupportedException at runtime from LINQ to SQL: “Method 'System.Object DynamicInvoke(System.Object[])' has no supported translation to SQL.” The problem is the opaque predicate argument: it is a compiled delegate, which cannot be evaluated remotely on SQL Server. So let’s turn the predicate into a lambda expression tree:

public static IQueryable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
    this IQueryable<TLeft> left,
    IQueryable<TRight> right,
    Expression<Func<TLeft, TRight, bool>> predicate)
{
    return left.Where(l => !right.Where(r => predicate(l, r)).Any());
}

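Still no luck: this time the C# compiler complains that “'predicate' is a 'variable' but is used like a 'method'”. An Expression<Func<TLeft, TRight, bool>> is data describing the lambda rather than something that can be called, so we’ll need to manually inline the predicate expression instead: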

public static IQueryable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
    IQueryable<TLeft> left,
    IQueryable<TRight> right,
    Expression<Func<TLeft, TRight, bool>> predicate)
{
    var leftPrm = predicate.Parameters[0];
    var rightPrm = predicate.Parameters[1];

    // retrieve methods
    Func<IQueryable<TRight>, bool> anyDelegate = Queryable.Any;
    var anyMethod = anyDelegate.Method;
    Func<IQueryable<TRight>, Expression<Func<TRight, bool>>, IQueryable<TRight>> whereDelegate = Queryable.Where;
    var whereMethod = whereDelegate.Method;

    // l => !right.Where(r => predicate(l, r)).Any()
    var leftPredicate = Expression.Lambda<Func<TLeft, bool>>(
        Expression.Not(
            Expression.Call(anyMethod,
                Expression.Call(whereMethod,
                    Expression.Constant(right),
                    Expression.Lambda<Func<TRight, bool>>(predicate.Body, rightPrm)))),
        leftPrm);

    return left.Where(leftPredicate);
}

A working solution! A couple of observations:

· Notice that this approach does not rely on Expression.Invoke. Some LINQ providers (LINQ to Entities and LINQ to StreamInsight among them) do not support invocation expressions. By reusing the existing parameter expressions (leftPrm and rightPrm) from the predicate argument within the constructed expression, I avoid the need to “rebind” any parameters. See my earlier post for a more general workaround to the invocation expression limitation.

· One of my favorite tricks: rather than relying on Type.GetMethod and MethodInfo.MakeGenericMethod, I’m retrieving MethodInfos from typed delegates. This is more robust than the conventional approach because the C# compiler statically binds to the appropriate method signature.
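
To make the second observation concrete, the sketch below places the typed-delegate lookup next to a reflection-based one. The class name MethodInfoSample and the choice of int as the element type are assumptions for illustration; the reflection variant is just one plausible way to write the conventional lookup, not the only one.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MethodInfoSample
{
    // Typed delegate: the compiler resolves the overload and the type argument,
    // so a wrong signature is a compile-time error.
    public static MethodInfo AnyViaDelegate()
    {
        Func<IQueryable<int>, bool> anyDelegate = Queryable.Any;
        return anyDelegate.Method;
    }

    // Reflection: the overload is picked by name and parameter count at runtime,
    // then closed over int by hand.
    public static MethodInfo AnyViaReflection()
    {
        return typeof(Queryable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Single(m => m.Name == "Any" && m.GetParameters().Length == 1)
            .MakeGenericMethod(typeof(int));
    }
}
```

Both lookups should describe the same closed method, but the reflection version depends on a string and a parameter count that nothing checks until the code runs.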

A gotcha: if you attempt to use this (or other) operator macros within a lambda body, some LINQ providers will balk at the unrecognized method. For instance, LINQ to SQL is fine with the following usage of the macro because it doesn’t see the LeftAntiSemiJoin method call in the resulting expression tree (the macro is expanded before reaching LINQ to SQL):

var query1 = LeftAntiSemiJoin(Categories, Products, (c, p) => c.Cid == p.Cid);

but complains about an unsupported method when the same macro appears within an expression:

var query2 = from c1 in Categories
             from c2 in LeftAntiSemiJoin(Categories, Products, (c, p) => c.Cid == p.Cid)
             select c1;

You can work around this (intentional) limitation by assigning the expanded query to a local variable (we’ve already assigned the expanded version to query1 so we can reuse it here):

var query3 = from c1 in Categories
             from c2 in query1
             select c1;
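
One way to convince yourself that the macro really is expanded before a provider sees it is to print the expression tree of the resulting query. The following is a minimal sketch under stated assumptions: it runs against in-memory arrays wrapped with AsQueryable rather than LINQ to SQL, the class name MacroExpansionSample is made up, and plain integers stand in for the Categories and Products tables.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class MacroExpansionSample
{
    // The macro from the post, condensed and repeated so the sketch compiles on its own.
    public static IQueryable<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
        IQueryable<TLeft> left,
        IQueryable<TRight> right,
        Expression<Func<TLeft, TRight, bool>> predicate)
    {
        Func<IQueryable<TRight>, bool> anyDelegate = Queryable.Any;
        Func<IQueryable<TRight>, Expression<Func<TRight, bool>>, IQueryable<TRight>> whereDelegate = Queryable.Where;

        // l => !right.Where(r => predicate(l, r)).Any()
        var leftPredicate = Expression.Lambda<Func<TLeft, bool>>(
            Expression.Not(
                Expression.Call(anyDelegate.Method,
                    Expression.Call(whereDelegate.Method,
                        Expression.Constant(right),
                        Expression.Lambda<Func<TRight, bool>>(predicate.Body, predicate.Parameters[1])))),
            predicate.Parameters[0]);

        return left.Where(leftPredicate);
    }

    // Hypothetical in-memory stand-ins for the Categories and Products tables.
    static readonly IQueryable<int> Categories = new[] { 1, 2, 3, 4 }.AsQueryable();
    static readonly IQueryable<int> Products = new[] { 1, 1, 3 }.AsQueryable();

    // Text form of the expanded tree: only Where, Not and Any calls remain.
    public static string ExpandedQueryText()
    {
        var query1 = LeftAntiSemiJoin(Categories, Products, (c, p) => c == p);
        return query1.Expression.ToString();
    }

    // The query3 pattern: compose with the already expanded query held in a local.
    public static int ComposedCount()
    {
        var query1 = LeftAntiSemiJoin(Categories, Products, (c, p) => c == p);
        var query3 = from c1 in Categories
                     from c2 in query1
                     select c1;
        return query3.Count();
    }
}
```

The string returned by ExpandedQueryText contains no mention of LeftAntiSemiJoin, which is why a provider has nothing unfamiliar to translate. The in-memory provider is more forgiving than LINQ to SQL, so this only illustrates the shape of the tree, not provider behavior.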

StreamInsight implements a temporal version of LASJ as well. It’s similar to the familiar SQL Server operator but returns left-hand events for time intervals during which no corresponding right-hand event exists. The corresponding macro operator is shown below. The only significant difference from the IQueryable version is that the IsEmpty stream operator is used in place of the Any sequence operator:

public static CepStream<TLeft> LeftAntiSemiJoin<TLeft, TRight>(
    CepStream<TLeft> left, CepStream<TRight> right, Expression<Func<TLeft, TRight, bool>> predicate)
{
    var leftPrm = predicate.Parameters[0];
    var rightPrm = predicate.Parameters[1];

    // retrieve methods
    Func<CepStream<TRight>, bool> isEmptyDelegate = CepStream.IsEmpty;
    var isEmptyMethod = isEmptyDelegate.Method;
    Func<CepStream<TRight>, Expression<Func<TRight, bool>>, CepStream<TRight>> whereDelegate = CepStream.Where;
    var whereMethod = whereDelegate.Method;

    // l => right.Where(r => predicate(l, r)).IsEmpty()
    var leftPredicate = Expression.Lambda<Func<TLeft, bool>>(
        Expression.Call(isEmptyMethod,
            Expression.Call(whereMethod,
                Expression.Constant(right),
                Expression.Lambda<Func<TRight, bool>>(predicate.Body, rightPrm))),
        leftPrm);

    return left.Where(leftPredicate);
}

Comments

  • Anonymous
    January 16, 2011
    You can do similar things with LinqKit (www.albahari.com/.../linqkit.aspx) without manually having to inline the methods. Especially when you reuse a certain part of a query in multiple other queries, it gets both tedious, and you're repeating yourself when you manually inline them.