Another Model for Query Interpretation

An imperative model for interpreting Linq to Objects queries has already been discussed, but are there any other models or method for interpreting queries?  It turns out that there are other models.  Linq to SQL queries operate in a completely different manner.  In the discussion about Linq to Object queries, it was pointed out that these two queries are fundamentally different even though they produce the same results.

var q1 = from x in foo

         from y in bar

         where p(x)

         select f(x,y);

var q2 = from x in foo

         where p(x)

         from y in bar

         select f(x, y);

They are different because each clause is executed one after the other so there are performance implications based on the order of the clauses.  In this case, the second query is more efficient.  But this does not apply to Linq to SQL queries.  Take a look at the following two queries:

var q1 = from c1 in db.Customers
where c1.City == "London"
from c2 in db.Customers
select new { name1 = c1.ContactName, name2 = c2.ContactName };

var q2 = from c1 in db.Customers
from c2 in db.Customers
where c1.City == "London"
select new { name1 = c1.ContactName, name2 = c2.ContactName };

Linq to SQL queries do not execute in the program but rather they are converted to SQL and then executed on the database.  It is easy to inspect the SQL that is generated from a given C# query.  Here is the generated SQL for each query:

SELECT [t0].[ContactName], [t1].[ContactName] AS [ContactName2]
FROM [Customers] AS [t0], [Customers] AS [t1]
WHERE [t0].[City] = @p0

SELECT [t0].[ContactName], [t1].[ContactName] AS [ContactName2]
FROM [Customers] AS [t0], [Customers] AS [t1]
WHERE [t0].[City] = @p0

Notice that they are exactly the same.  So it doesn't matter which order was used to specify the query.  We already examined why the order of the clauses affects the performance of Linq to Objects queries, but why doesn't the order affect the performance in Linq to SQL queries?  The reason is that Linq to SQL queries do not use delegates but rather they use expression trees (quoted lambdas).  The Linq to SQL provider takes the expression trees and the sequence of method calls and builds up a structure that encodes what the query was describing.  A translator then transforms this structure into SQL; however, the translator is free to analyze the whole query at once and even discard or optimize parts of the query.  So Linq to SQL takes the two queries, analyzing them, and then transforming them into the same SQL.  The expression trees enable Linq to SQL to support a declarative model of queries.

Given this, it is easy to imagine that an in-memory query provider similar to Linq to Objects could be constructed that did not take delegates but takes expression trees.  It could act as a query planner by optimizing the query.  This would enable an in-memory query provider that was more declarative in nature.  Of course there would be trade-off in the time to analyze vs the time to execute.  So it seems it would only work well for large data sets.  If this was coupled with some good concurrency techniques, it would be possible to develop a rather robust in-memory query provider.

Comments

  • Anonymous
    January 09, 2007
    This concludes my series of posts about queries. I will still discuss them occassionally and if anyone

  • Anonymous
    January 09, 2007
    This concludes my series of posts about queries. I will still discuss them occassionally and if anyone

  • Anonymous
    January 11, 2007
    Welcome to the eighteenth issue of Community Convergence. I'm Charlie Calvert, the C# Community PM, and

  • Anonymous
    January 11, 2007
    Welcome to the eighteenth issue of Community Convergence. I'm Charlie Calvert, the C# Community PM, and