LINQ: Building an IQueryable Provider - Part I
This is the first in a series of posts on how to build a LINQ IQueryable provider. Each post builds on the last one.
Complete list of posts in the Building an IQueryable Provider series
I’ve been meaning for a while to start up a series of posts that covers building LINQ providers using IQueryable. People have been asking me advice on doing this for quite some time now, whether through internal Microsoft email or questions on the forums or by cracking the encryption and mailing me directly. Of course, I’ve mostly replied with “I’m working on a sample that will show you everything” letting them know that soon all will be revealed. However, instead of just posting a full sample here I felt it prudent to go step by step so I can actual dive deep and explain everything that is going on instead of just dumping it all in your lap and letting you find your own way.
The first thing I ought to point out to you is that IQueryable has changed in Beta 2. It’s no longer just one interface, having been factored into two: IQueryable and IQueryProvider. Let’s just walk through these before we get to actually implementing them.
If you use Visual Studio to ‘go to definition’ you get something that looks like this:
public interface IQueryable : IEnumerable {
Type ElementType { get; }
Expression Expression { get; }
IQueryProvider Provider { get; }
}
public interface IQueryable<T> : IEnumerable<T>, IQueryable, IEnumerable {
}
Of course, IQueryable no longer looks all that interesting; the good stuff has been pushed off into the new interface IQueryProvider. Yet before I get into that, IQueryable is still worth looking at. As you can see the only things IQueryable has are three read-only properties. The first one gives you the element type (or the ‘T’ in IQueryable<T>). It’s important to note that all classes that implement IQueryable must also implement IQueryable<T> for some T and vice versa. The generic IQueryable<T> is the one you use most often in method signatures and the like. The non-generic IQueryable exist primarily to give you a weakly typed entry point primarily for dynamic query building scenarios.
The second property gives you the expression that corresponds to the query. This is quintessential essence of IQueryable’s being. The actual ‘query’ underneath the hood of an IQueryable is an expression that represents the query as a tree of LINQ query operators/method calls. This is the part of the IQueryable that your provider must comprehend in order to do anything useful. If you look deeper you will see that the whole IQueryable infrastructure (including the System.Linq.Queryable version of LINQ standard query operators) is just a mechanism to auto-construct expression tree nodes for you. When you use the Queryable.Where method to apply a filter to an IQueryable, it simply builds you a new IQueryable adding a method-call expression node on top of the tree representing the call you just made to Queryable.Where. Don’t believe me? Try it yourself and see what it does.
Now that just leaves us with the last property that gives us an instance of this new interface IQueryProvider. What we’ve done is move all the methods that implement constructing new IQueryables and executing them off into a separate interface that more logically represents your true provider.
public interface IQueryProvider {
IQueryable CreateQuery(Expression expression);
IQueryable<TElement> CreateQuery<TElement>(Expression expression);
object Execute(Expression expression);
TResult Execute<TResult>(Expression expression);
}
Looking at the IQueryProvider interface you might be thinking, “why all these methods?” The truth is that there are really only two operations, CreateQuery and Execute, we just have both a generic and a non-generic form of each. The generic forms are used most often when you write queries directly in the programming language and perform better since we can avoid using reflection to construct instances.
The CreateQuery method does exactly what it sounds like it does. It creates a new instance of an IQueryable query based on the specified expression tree. When someone calls this method they are basically asking your provider to build a new instance of an IQueryable that when enumerated will invoke your query provider and process this specific query expression. The Queryable form of the standard query operators use this method to construct new IQueryable’s that stay associated with your provider. Note the caller can pass any expression tree possible to this API. It may not even be a legal query for your provider. However, the only thing that must be true is that expression itself must be typed to return/produce a correctly typed IQueryable. You see the IQueryable contains an expression that represents a snippet of code that if turned into actual code and executed would reconstruct that very same IQueryable (or its equivalent).
The Execute method is the entry point into your provider for actually executing query expressions. Having an explicit execute instead of just relying on IEnumerable.GetEnumerator() is important because it allows execution of expressions that do not necessarily yield sequences. For example, the query “myquery.Count()” returns a single integer. The expression tree for this query is a method call to the Count method that returns the integer. The Queryable.Count method (as well as the other aggregates and the like) use this method to execute the query ‘right now’.
There, that doesn’t seem so frightening does it? You could implement all those methods easily, right? Sure you could, but why bother. I’ll do it for you. Well all except for the execute method. I’ll show you how to do that in a later post.
First let’s start with the IQuerayble. Since this interface has been split into two, it’s now possible to implement the IQueryable part just once and re-use it for any provider. I’ll implement a class called Query<T> that implements IQueryable<T> and all the rest.
public class Query<T> : IQueryable<T>, IQueryable, IEnumerable<T>, IEnumerable, IOrderedQueryable<T>, IOrderedQueryable {
QueryProvider provider;
Expression expression;
public Query(QueryProvider provider) {
if (provider == null) {
throw new ArgumentNullException("provider");
}
this.provider = provider;
this.expression = Expression.Constant(this);
}
public Query(QueryProvider provider, Expression expression) {
if (provider == null) {
throw new ArgumentNullException("provider");
}
if (expression == null) {
throw new ArgumentNullException("expression");
}
if (!typeof(IQueryable<T>).IsAssignableFrom(expression.Type)) {
throw new ArgumentOutOfRangeException("expression");
}
this.provider = provider;
this.expression = expression;
}
Expression IQueryable.Expression {
get { return this.expression; }
}
Type IQueryable.ElementType {
get { return typeof(T); }
}
IQueryProvider IQueryable.Provider {
get { return this.provider; }
}
public IEnumerator<T> GetEnumerator() {
return ((IEnumerable<T>)this.provider.Execute(this.expression)).GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator() {
return ((IEnumerable)this.provider.Execute(this.expression)).GetEnumerator();
}
public override string ToString() {
return this.provider.GetQueryText(this.expression);
}
}
As you can see now, the IQueryable implementation is straightforward. This little object really does just hold onto an expression tree and a provider instance. The provider is where it really gets juicy.
Okay, now I need some provider to show you. I’ve implemented an abstract base class called QueryProvider that Query<T> referred to above. A real provider can just derive from this class and implement the Execute method.
public abstract class QueryProvider : IQueryProvider {
protected QueryProvider() {
}
IQueryable<S> IQueryProvider.CreateQuery<S>(Expression expression) {
return new Query<S>(this, expression);
}
IQueryable IQueryProvider.CreateQuery(Expression expression) {
Type elementType = TypeSystem.GetElementType(expression.Type);
try {
return (IQueryable)Activator.CreateInstance(typeof(Query<>).MakeGenericType(elementType), new object[] { this, expression });
}
catch (TargetInvocationException tie) {
throw tie.InnerException;
}
}
S IQueryProvider.Execute<S>(Expression expression) {
return (S)this.Execute(expression);
}
object IQueryProvider.Execute(Expression expression) {
return this.Execute(expression);
}
public abstract string GetQueryText(Expression expression);
public abstract object Execute(Expression expression);
}
I’ve implemented the IQueryProvider interface on my base class QueryProvider. The CreateQuery methods create new instances of Query<T> and the Execute methods forward execution to this great new and not-yet-implemented Execute method.
I suppose you can think of this as boilerplate code you have to write just to get started building a LINQ IQueryable provider. The real action happens inside the Execute method. That’s where your provider has the opportunity to make sense of the query by examining the expression tree.
And that’s what I’ll start showing next time.
UPDATE:
It looks like I’ve forget to define a little helper class my implementation was using, so here it is:
internal static class TypeSystem {
internal static Type GetElementType(Type seqType) {
Type ienum = FindIEnumerable(seqType);
if (ienum == null) return seqType;
return ienum.GetGenericArguments()[0];
}
private static Type FindIEnumerable(Type seqType) {
if (seqType == null || seqType == typeof(string))
return null;
if (seqType.IsArray)
return typeof(IEnumerable<>).MakeGenericType(seqType.GetElementType());
if (seqType.IsGenericType) {
foreach (Type arg in seqType.GetGenericArguments()) {
Type ienum = typeof(IEnumerable<>).MakeGenericType(arg);
if (ienum.IsAssignableFrom(seqType)) {
return ienum;
}
}
}
Type[] ifaces = seqType.GetInterfaces();
if (ifaces != null && ifaces.Length > 0) {
foreach (Type iface in ifaces) {
Type ienum = FindIEnumerable(iface);
if (ienum != null) return ienum;
}
}
if (seqType.BaseType != null && seqType.BaseType != typeof(object)) {
return FindIEnumerable(seqType.BaseType);
}
return null;
}
}
Yah, I know. There’s more ‘code’ in this helper than in all the rest. Sigh. J
Comments
Anonymous
July 30, 2007
I’ve been meaning for a while to start up a series of posts that covers building LINQ providers usingAnonymous
July 30, 2007
Here's an anthology of VS 2008 Beta 2's changes to LINQ and its domain-specific implementations, which includes a brief description and link to this post: http://oakleafblog.blogspot.com/2007/07/linq-changes-from-orcas-beta-1-to-vs.html --rjAnonymous
July 30, 2007
I’ve been meaning for a while to start up a series of posts that covers building LINQ providers usingAnonymous
July 30, 2007
The comment has been removedAnonymous
July 31, 2007
The comment has been removedAnonymous
July 31, 2007
Well, the reason I asked about why the provider is part of the queryable is that I now can't have general code which works on a general set of entities and write a query there and use it on any of the providers I have available, i.e. one per database: I now have to provide this info to the datasource I'm using in the code which formulates the query. Sure, if I have just one provider, no problem. If I have generic code which can target sqlserver and oracle and db2 at the same time, I have a problem, as the code in my application should be generic (I now can do that) without knowledge of db's but when I have different providers, I can't because I have to pass these on in the code which formulates the query. So this then requires a query provider which is actually a placeholder which gets the real provider plugged in when the actual db to target is selected. or I'm missing something obvious :)Anonymous
July 31, 2007
Frans, you may be confusing the concept of a LINQ provider with an ADO database provider. You can certainly have your 'provider' target a variety of different databases, etc. Also, tying the IQueryable provider to the IQueryable only influences the default translation of the query. You can also get the Expression from any IQueryable and attempt to process it using another provider.Anonymous
July 31, 2007
The comment has been removedAnonymous
July 31, 2007
Thanks Matt for clearing that up. I indeed am confusing the two, so if I can have a normal provider which can later on be tied to an ado.net db provider, I'm OK :)Anonymous
July 31, 2007
Additionally, I'm really happy you're writing these articles. I was a little disappointed when I saw that the docs to write a linq provider weren't included in orcas beta 2's docs but luckily these articles will help me get started :)Anonymous
August 01, 2007
Part III? Wasn’t I done in the last post? Didn’t I have the provider actually working, translating, executing and returning a sequence of objects? Sure, that’s true, but only just so. The provider I built was really fragile. It only understood one majorAnonymous
August 02, 2007
The comment has been removedAnonymous
August 02, 2007
Links to articles detailing how to create IQueryable providers: Matt Warren: http://blogs.msdn.com/mattwar/archive/2007/07/30/linq-building-an-iqueryable-provider-part-i.aspxAnonymous
August 03, 2007
Over the past four parts of this series I have constructed a working LINQ IQueryable provider that targets ADO and SQL and has so far been able to translate both Queryable.Where and Queryable.Select standard query operators. Yet, as big of an accomplishmentAnonymous
August 07, 2007
Matt Warren présente sur un blog une implémentation d'un provider Linq vers SQL en plusieurs étapes.Anonymous
August 09, 2007
Después de que muchos (un servidor incluido) se hayan roto literalmente la cabeza durante meses investigandoAnonymous
August 09, 2007
This is the sixth in a series of posts on how to build a LINQ IQueryable provider. If you have not readAnonymous
August 26, 2007
Risorse su Linq to SQLAnonymous
August 28, 2007
As you can probably tell from the title of my last few posts I've been doing some work with LINQ overAnonymous
August 28, 2007
As you can probably tell from the title of my last few posts I've been doing some work with LINQ overAnonymous
August 28, 2007
I recently spend a few (many) hours doing some research into the workings of LINQ providers for an internalAnonymous
September 03, 2007
As you can probably tell from the title of my last few posts I've been doing some work with LINQ overAnonymous
September 04, 2007
This is the seventh in a series of posts on how to build a LINQ IQueryable provider. If you have notAnonymous
September 04, 2007
The comment has been removedAnonymous
September 25, 2007
Welcome to the thirty-first edition of Community Convergence. This issue features links to seven veryAnonymous
October 01, 2007
At the most abstract level, LINQ (Language Integrated Query) can query against two types of providerAnonymous
October 09, 2007
This is the eighth in a series of posts on how to build a LINQ IQueryable provider. If you have not readAnonymous
October 31, 2007
The Banshee-to-Windows porting has been more or less done for a while and the code is about to be integratedAnonymous
November 29, 2007
What is LINQ? LINQ stands for Language Integrated Query and is a DSL within C# for querying data. ItAnonymous
December 05, 2007
What is LINQ? LINQ stands for Language Integrated Query and is a DSL within C# for querying data. ItAnonymous
January 01, 2008
An Updated LINQ to WMI ImplementationAnonymous
January 06, 2008
Over the holidays Alex Turner, Mary Deyo and I added a new sample to the downloadable version of theAnonymous
January 06, 2008
Over the holidays Alex Turner, Mary Deyo and I added a new sample to the downloadable version of theAnonymous
January 16, 2008
This is the nineth in a series of posts on how to build a LINQ IQueryable provider. If you have not readAnonymous
January 31, 2008
Just storing a couple of links for a rainy day: Mehfuz Hossain's LINQ provider basics article on DotnetslackersAnonymous
January 31, 2008
Just storing a couple of links for a rainy day: Mehfuz Hossain's LINQ provider basics article on DotnetslackersAnonymous
February 06, 2008
Check out the following from Matt Warrens blog posts, if you are interested on how to implement IQueryable...Anonymous
February 06, 2008
Check out the following from Matt Warrens blog posts, if you are interested on how to implement IQueryableAnonymous
February 11, 2008
Last year, when I was working at db4objects on db4o , I insisted on the need for db4o to act as a LINQAnonymous
March 20, 2008
Linq query providers appear all over the place. Some say "Linq to Everything" to refer to allAnonymous
May 02, 2008
Someone asked a great question on the ADO.NET Entity Framework forums yesterday: how do I compose predicatesAnonymous
May 27, 2008
It seems that everyone else is chiming in on Danny Simmons' recent comparisons of the Entity FrameworkAnonymous
July 08, 2008
This is the tenth in a series of posts on how to build a LINQ IQueryable provider. If you have not read the previous posts you'll want to find a nice shady tree, relax and mediate on why your world is so confused and full of meaningless tasks that itAnonymous
July 10, 2008
Dużo się mówi i pisze o tym, że LINQ jest elastyczne i rozszerzalne. Sam powtarzam, że aby podpiąćAnonymous
July 14, 2008
This is the eleventh in a series of posts on how to build a LINQ IQueryable provider. If you have not read the previous posts you’ll want to do so before proceeding, or at least before proceeding to copy the code into your own project and telling yourAnonymous
July 25, 2008
This week I am coming to you from the Microsoft Campus. So as you would expect I have a lot of energyAnonymous
November 17, 2008
This is the twelfth in a series of posts on how to build a LINQ IQueryable provider. If you have notAnonymous
November 18, 2008
Part I - Reusable IQueryable base classes Part II - Where and reusable Expression tree visitor Part IIAnonymous
December 16, 2008
At the most abstract level, LINQ (Language Integrated Query) can query against two types of providerAnonymous
January 05, 2009
At the most abstract level, LINQ (Language Integrated Query) can query against two types of providerAnonymous
March 19, 2009
http://blogs.msdn.com/mattwar/pages/linq-links.aspx Here's a list of all the posts in the building