LINQ: Building an IQueryable Provider - Part I

This is the first in a series of posts on how to build a LINQ IQueryable provider. Each post builds on the last one.

Complete list of posts in the Building an IQueryable Provider series

I’ve been meaning for a while to start a series of posts that covers building LINQ providers using IQueryable. People have been asking me for advice on doing this for quite some time now, whether through internal Microsoft email, questions on the forums, or by cracking the encryption and mailing me directly. Of course, I’ve mostly replied with “I’m working on a sample that will show you everything,” letting them know that soon all will be revealed. However, instead of just posting a full sample here, I felt it prudent to go step by step so I can actually dive deep and explain everything that is going on, instead of just dumping it all in your lap and letting you find your own way.

The first thing I ought to point out to you is that IQueryable has changed in Beta 2. It’s no longer just one interface, having been factored into two: IQueryable and IQueryProvider. Let’s just walk through these before we get to actually implementing them.

If you use Visual Studio to ‘go to definition’ you get something that looks like this:

public interface IQueryable : IEnumerable {
    Type ElementType { get; }
    Expression Expression { get; }
    IQueryProvider Provider { get; }
}

public interface IQueryable<T> : IEnumerable<T>, IQueryable, IEnumerable {
}

Of course, IQueryable no longer looks all that interesting; the good stuff has been pushed off into the new interface IQueryProvider. Yet before I get into that, IQueryable is still worth looking at. As you can see, the only things IQueryable has are three read-only properties. The first one gives you the element type (or the ‘T’ in IQueryable<T>). It’s important to note that all classes that implement IQueryable must also implement IQueryable<T> for some T, and vice versa. The generic IQueryable<T> is the one you use most often in method signatures and the like. The non-generic IQueryable exists primarily to give you a weakly typed entry point for dynamic query building scenarios.

The second property gives you the expression that corresponds to the query. This is the essence of IQueryable’s being. The actual ‘query’ underneath the hood of an IQueryable is an expression that represents the query as a tree of LINQ query operators/method calls. This is the part of the IQueryable that your provider must comprehend in order to do anything useful. If you look deeper you will see that the whole IQueryable infrastructure (including the System.Linq.Queryable version of the LINQ standard query operators) is just a mechanism to auto-construct expression tree nodes for you. When you use the Queryable.Where method to apply a filter to an IQueryable, it simply builds you a new IQueryable, adding a method-call expression node on top of the tree to represent the call you just made to Queryable.Where. Don’t believe me? Try it yourself and see what it does.
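If you want to try it, here is a minimal sketch. It assumes the in-memory provider you get from AsQueryable() (my choice for illustration; the class and method names are made up for this example). It applies Queryable.Where and then looks at the root of the resulting expression tree.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class WhereDemo {
    // Applies Queryable.Where to an in-memory source and reports
    // which method call sits at the root of the new expression tree.
    public static string RootOperator() {
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();
        IQueryable<int> filtered = source.Where(x => x > 2);

        // The new query's expression is a method-call node...
        var call = (MethodCallExpression)filtered.Expression;
        return call.Method.DeclaringType.Name + "." + call.Method.Name;
    }

    // ...and its first argument is the expression of the query it was built from.
    public static bool WrapsSourceExpression() {
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();
        var call = (MethodCallExpression)source.Where(x => x > 2).Expression;
        return ReferenceEquals(call.Arguments[0], source.Expression);
    }

    public static void Main() {
        Console.WriteLine(RootOperator());          // Queryable.Where
        Console.WriteLine(WrapsSourceExpression()); // True
    }
}
```

Nothing has been filtered at this point; the call to Where only grew the tree.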

Now that just leaves us with the last property that gives us an instance of this new interface IQueryProvider. What we’ve done is move all the methods that implement constructing new IQueryables and executing them off into a separate interface that more logically represents your true provider.

public interface IQueryProvider {
    IQueryable CreateQuery(Expression expression);
    IQueryable<TElement> CreateQuery<TElement>(Expression expression);
    object Execute(Expression expression);
    TResult Execute<TResult>(Expression expression);
}

Looking at the IQueryProvider interface you might be thinking, “why all these methods?” The truth is that there are really only two operations, CreateQuery and Execute; we just have both a generic and a non-generic form of each. The generic forms are used most often when you write queries directly in the programming language, and they perform better since we can avoid using reflection to construct instances.

The CreateQuery method does exactly what it sounds like it does. It creates a new instance of an IQueryable query based on the specified expression tree. When someone calls this method they are basically asking your provider to build a new instance of an IQueryable that, when enumerated, will invoke your query provider and process this specific query expression. The Queryable form of the standard query operators uses this method to construct new IQueryables that stay associated with your provider. Note that the caller can pass any expression tree possible to this API. It may not even be a legal query for your provider. The only thing that must be true is that the expression itself must be typed to return/produce a correctly typed IQueryable. You see, the IQueryable contains an expression that represents a snippet of code that, if turned into actual code and executed, would reconstruct that very same IQueryable (or its equivalent).
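To make that concrete, here is a rough sketch of doing by hand what Queryable.Where does for you: build the method-call node yourself and hand it to the provider’s CreateQuery. It assumes the in-memory provider behind AsQueryable(), and the class and method names are invented for the example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class CreateQueryDemo {
    public static int[] FilterByHand() {
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();
        Expression<Func<int, bool>> predicate = x => x > 2;

        // Build the same kind of node Queryable.Where would build:
        // a call to Queryable.Where<int>(source, predicate).
        MethodCallExpression call = Expression.Call(
            typeof(Queryable), "Where", new[] { typeof(int) },
            source.Expression, Expression.Quote(predicate));

        // The node is typed IQueryable<int>, so the provider can wrap it
        // in a new query that stays associated with that provider.
        IQueryable<int> filtered = source.Provider.CreateQuery<int>(call);

        // Nothing runs until the query is enumerated.
        return filtered.ToArray();
    }

    public static void Main() {
        Console.WriteLine(string.Join(",", FilterByHand())); // 3,4
    }
}
```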

The Execute method is the entry point into your provider for actually executing query expressions. Having an explicit Execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences. For example, the query “myquery.Count()” returns a single integer. The expression tree for this query is a method call to the Count method that returns the integer. The Queryable.Count method (as well as the other aggregates and the like) uses this method to execute the query ‘right now’.
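Here is a small sketch of that Count case, again assuming the in-memory provider from AsQueryable() and made-up names. It builds the call to Queryable.Count as an expression and passes it to Execute, which is roughly what the real Queryable.Count does on your behalf.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ExecuteDemo {
    public static int CountByHand() {
        IQueryable<int> source = new[] { 10, 20, 30 }.AsQueryable();

        // An expression for Queryable.Count<int>(source). Its type is int,
        // not a sequence, so there is nothing to call GetEnumerator() on.
        MethodCallExpression countCall = Expression.Call(
            typeof(Queryable), "Count", new[] { typeof(int) },
            source.Expression);

        // Execute evaluates the expression right away and returns the scalar.
        return source.Provider.Execute<int>(countCall);
    }

    public static void Main() {
        Console.WriteLine(CountByHand()); // 3
    }
}
```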

There, that doesn’t seem so frightening, does it? You could implement all those methods easily, right? Sure you could, but why bother? I’ll do it for you. Well, all except for the Execute method. I’ll show you how to do that in a later post.

First let’s start with the IQueryable. Since this interface has been split into two, it’s now possible to implement the IQueryable part just once and re-use it for any provider. I’ll implement a class called Query<T> that implements IQueryable<T> and all the rest.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Query<T> : IQueryable<T>, IQueryable, IEnumerable<T>, IEnumerable, IOrderedQueryable<T>, IOrderedQueryable {
    QueryProvider provider;
    Expression expression;

    // Root query: the expression is a constant that refers back to this instance.
    public Query(QueryProvider provider) {
        if (provider == null) {
            throw new ArgumentNullException("provider");
        }
        this.provider = provider;
        this.expression = Expression.Constant(this);
    }

    // Derived query: wraps an expression tree handed over by the provider.
    public Query(QueryProvider provider, Expression expression) {
        if (provider == null) {
            throw new ArgumentNullException("provider");
        }
        if (expression == null) {
            throw new ArgumentNullException("expression");
        }
        // The expression must be typed to produce an IQueryable<T> for this T.
        if (!typeof(IQueryable<T>).IsAssignableFrom(expression.Type)) {
            throw new ArgumentOutOfRangeException("expression");
        }
        this.provider = provider;
        this.expression = expression;
    }

    Expression IQueryable.Expression {
        get { return this.expression; }
    }

    Type IQueryable.ElementType {
        get { return typeof(T); }
    }

    IQueryProvider IQueryable.Provider {
        get { return this.provider; }
    }

    // Enumerating the query is what asks the provider to execute it.
    public IEnumerator<T> GetEnumerator() {
        return ((IEnumerable<T>)this.provider.Execute(this.expression)).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() {
        return ((IEnumerable)this.provider.Execute(this.expression)).GetEnumerator();
    }

    public override string ToString() {
        return this.provider.GetQueryText(this.expression);
    }
}

As you can see now, the IQueryable implementation is straightforward. This little object really does just hold onto an expression tree and a provider instance. The provider is where it really gets juicy.

Okay, now I need some provider to show you. I’ve implemented an abstract base class called QueryProvider that Query<T> referred to above. A real provider can just derive from this class and implement the Execute method.

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public abstract class QueryProvider : IQueryProvider {
    protected QueryProvider() {
    }

    IQueryable<S> IQueryProvider.CreateQuery<S>(Expression expression) {
        return new Query<S>(this, expression);
    }

    // Non-generic form: the element type is only known at runtime,
    // so Query<T> has to be constructed through reflection.
    IQueryable IQueryProvider.CreateQuery(Expression expression) {
        Type elementType = TypeSystem.GetElementType(expression.Type);
        try {
            return (IQueryable)Activator.CreateInstance(typeof(Query<>).MakeGenericType(elementType), new object[] { this, expression });
        }
        catch (TargetInvocationException tie) {
            // Unwrap so the caller sees the constructor's own exception.
            // Note that rethrowing it this way resets its stack trace.
            throw tie.InnerException;
        }
    }

    S IQueryProvider.Execute<S>(Expression expression) {
        return (S)this.Execute(expression);
    }

    object IQueryProvider.Execute(Expression expression) {
        return this.Execute(expression);
    }

    // The two members a concrete provider has to supply.
    public abstract string GetQueryText(Expression expression);
    public abstract object Execute(Expression expression);
}

I’ve implemented the IQueryProvider interface on my base class QueryProvider. The CreateQuery methods create new instances of Query<T> and the Execute methods forward execution to this great new and not-yet-implemented Execute method.
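The non-generic CreateQuery is the one place where reflection comes in, because the element type is only known at runtime. As a standalone illustration of that technique, here is a sketch that closes an open generic type and instantiates it. It uses List<> instead of Query<> so that it compiles on its own; the helper name is hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ReflectionDemo {
    // Builds a List<elementType> when the element type is only known at runtime,
    // the same way the non-generic CreateQuery builds a Query<elementType>.
    public static IList CreateListOf(Type elementType) {
        Type closedType = typeof(List<>).MakeGenericType(elementType);
        return (IList)Activator.CreateInstance(closedType);
    }

    public static void Main() {
        IList list = CreateListOf(typeof(int));
        list.Add(42);
        Console.WriteLine(list.GetType() == typeof(List<int>)); // True
    }
}
```

The generic CreateQuery<S> skips all of this and simply calls the constructor, which is why the generic forms are the cheaper path.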

I suppose you can think of this as boilerplate code you have to write just to get started building a LINQ IQueryable provider. The real action happens inside the Execute method. That’s where your provider has the opportunity to make sense of the query by examining the expression tree.
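To give a feel for what ‘examining the expression tree’ can look like, here is a deliberately tiny sketch. It is not the Execute implementation from the later posts; it only walks down the chain of Queryable method calls and collects their names. The helper is hypothetical, and the sample query uses the in-memory provider from AsQueryable().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class TreeWalkDemo {
    // Lists the Queryable operators in a query, outermost call first.
    public static List<string> OperatorNames(Expression expression) {
        var names = new List<string>();
        var call = expression as MethodCallExpression;
        while (call != null && call.Method.DeclaringType == typeof(Queryable)) {
            names.Add(call.Method.Name);
            // For the Queryable operators, argument 0 is the source query.
            call = call.Arguments[0] as MethodCallExpression;
        }
        return names;
    }

    public static void Main() {
        IQueryable<string> query = new[] { "a", "bb", "ccc" }.AsQueryable()
            .Where(s => s.Length > 1)
            .Select(s => s.ToUpper());

        Console.WriteLine(string.Join(" <- ", OperatorNames(query.Expression)));
        // Select <- Where
    }
}
```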

And that’s what I’ll start showing next time.

UPDATE:

It looks like I forgot to define a little helper class my implementation was using, so here it is:

using System;
using System.Collections.Generic;

internal static class TypeSystem {
    // Returns T if seqType is (or implements) IEnumerable<T>; otherwise returns seqType itself.
    internal static Type GetElementType(Type seqType) {
        Type ienum = FindIEnumerable(seqType);
        if (ienum == null) return seqType;
        return ienum.GetGenericArguments()[0];
    }

    private static Type FindIEnumerable(Type seqType) {
        // string implements IEnumerable<char> but is not treated as a sequence here.
        if (seqType == null || seqType == typeof(string))
            return null;
        if (seqType.IsArray)
            return typeof(IEnumerable<>).MakeGenericType(seqType.GetElementType());
        if (seqType.IsGenericType) {
            foreach (Type arg in seqType.GetGenericArguments()) {
                Type ienum = typeof(IEnumerable<>).MakeGenericType(arg);
                if (ienum.IsAssignableFrom(seqType)) {
                    return ienum;
                }
            }
        }
        // Otherwise search the implemented interfaces, then the base type.
        Type[] ifaces = seqType.GetInterfaces();
        if (ifaces != null && ifaces.Length > 0) {
            foreach (Type iface in ifaces) {
                Type ienum = FindIEnumerable(iface);
                if (ienum != null) return ienum;
            }
        }
        if (seqType.BaseType != null && seqType.BaseType != typeof(object)) {
            return FindIEnumerable(seqType.BaseType);
        }
        return null;
    }
}

Yah, I know. There’s more ‘code’ in this helper than in all the rest. Sigh. :-)

Comments

  • Anonymous
    July 30, 2007
    Here's an anthology of VS 2008 Beta 2's changes to LINQ and its domain-specific implementations, which includes a brief description and link to this post: http://oakleafblog.blogspot.com/2007/07/linq-changes-from-orcas-beta-1-to-vs.html --rj

  • Anonymous
    July 31, 2007
    Well, the reason I asked about why the provider is part of the queryable is that I now can't have general code which works on a general set of entities and write a query there and use it on any of the providers I have available, i.e. one per database: I now have to provide this info to the datasource I'm using in the code which formulates the query. Sure, if I have just one provider, no problem. If I have generic code which can target sqlserver and oracle and db2 at the same time, I have a problem, as the code in my application should be generic (I now can do that) without knowledge of db's but when I have different providers, I can't because I have to pass these on in the code which formulates the query. So this then requires a query provider which is actually a placeholder which gets the real provider plugged in when the actual db to target is selected. or I'm missing something obvious :)

  • Anonymous
    July 31, 2007
    Frans, you may be confusing the concept of a LINQ provider with an ADO database provider.  You can certainly have your 'provider' target a variety of different databases, etc. Also, tying the IQueryable provider to the IQueryable only influences the default translation of the query. You can also get the Expression from any IQueryable and attempt to process it using another provider.

  • Anonymous
    July 31, 2007
    Thanks Matt for clearing that up. I indeed am confusing the two, so if I can have a normal provider which can later on be tied to an ado.net db provider, I'm OK :)

  • Anonymous
    July 31, 2007
    Additionally, I'm really happy you're writing these articles. I was a little disappointed when I saw that the docs to write a linq provider weren't included in orcas beta 2's docs but luckily these articles will help me get started :)
