So what's the deal with this whole C# 3.0 / Linq thingy?

I've been mulling over the best way to talk about the new C# 3.0 stuff we've been working on.  I presented the post on how you could use the new C# 3.0 features to go beyond the basic query functionality we've been targeting them at.  That was to help give an appreciation of how we've added strong query support through the addition of several new smaller features that can be used for more than query (although that's the foremost area that we're trying to attack).  However, i then realized that it was somewhat odd that i would present the post on "what *else* you can do with C# 3.0" before anyone even had an idea of what you "could" do with C# 3.0 first.

I could do a fairly detailed drill down of the new C# features, but i actually thought a more holistic approach would be better in this case.  So i'm actually going to talk about the general problem space we're confronting, and i'll try to provide some running examples to help carry me through this.

So what is Linq?  Well, Linq is the culmination of a number of techniques we're producing to help deal with the large disconnect between data programming and general purpose programming languages. Linq stands for Language INtegrated Query, and simply put, it's about taking query, set operations and transforms and making them first class concepts in the .Net world.  This means making them available in the CLR, in .Net programming languages, and in the APIs that you're going to be using to program against data in the future.  Through all this you can get a completely unified query experience against objects, XML, and relational data, i.e. the most common forms of data that will appear in your application.  And, what's best, if you happen to have your own form of data that doesn't fit into those different models, then you can use our extensible system to target that model as well.  After all, our XML and relational data access models (called XLinq and DLinq respectively) are just APIs built on top of the core Linq infrastructure.  As such, i'm not going to dive too deeply into those specific models.  I'm going to let the individual teams who are responsible for them (and who know those APIs far more intimately) give you all the information at their disposal.

So, let's first talk about data access today and how our new approach most likely differs from what you've been used to.  If you're accessing a database somewhere in your application, then there's a good chance that you've embedded some bit of SQL somewhere.  Maybe you've kept it fairly clean and abstracted away, or maybe you have SqlCommands left, right and center, all with their own "select *"s or other raw SQL commands scattered about.  Of course, when writing this code you had no compile time checking that your SQL strings were well formed, no IntelliSense, etc.  Because, effectively, you are using two completely different languages in an environment that only understands one.  This is pretty bad, but really only begins to scratch the surface of the deep mismatch between this relational data domain and the object domain.

Through and through you have mismatches between objects and relational data and XML in your system.  Different types.  Different operations.  Different programming models.  Your code which works on XML won't work on relational data.  Your code which works on relational data won't work on objects, etc.  But there's a better way.  Now we can allow you to work with all these different data systems right within C# (or VB).  This means using the same syntax, the same types, and the same programming models to query and manipulate all these different forms of data in a unified manner.  And, because support for these models has been built on top of an extensible system, it means that if necessary you can do the same as what we've done and bring this strong query support anywhere you need it to go where we don't currently have an offering.

To ground this discussion a little, let's start looking at a simple example of C# 3.0/Linq in action.  (Note: this example might look very familiar.  That's because many demos and examples are made to run against the Northwind DB.  This allows us to all talk about the same thing and have consistent and clear names for entities).   You start with a simple list of Customers:

        Customer[] customers = GetCustomers();

Nothing magic going on here.  Nothing up my sleeves.   Just a regular .Net array initialized from some source.  Now, to make things a little simpler (especially for later examples) we can then write that as:

        Customer[] customers = GetCustomers();
        var custs = customers;

What's going on in that second line? Well, "var" is our way of introducing "local variable type inference".  It's a new C# 3.0 feature that allows you to save space by not writing the type of a local variable, while still having the type inferred from the expression that initializes the variable.  So, in the above code, "custs" is known at compile time to be a "Customer[]".  If you were to write:

        var i = 10;
        var b = true;
        var s = "hello";

then it would be the *exact* same as writing:

        int    i = 10;
        bool   b = true;
        string s = "hello";
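
One way to convince yourself that "var" is purely compile time inference, and not some kind of loose or dynamic typing, is to ask each variable for its runtime type.  Here's a minimal sketch of that; the VarDemo class and its method name are made up for illustration and aren't part of any API:

```csharp
using System;

public static class VarDemo
{
    // Returns the runtime type names of three 'var' locals.
    public static string[] InferredTypeNames()
    {
        var i = 10;        // inferred as int
        var b = true;      // inferred as bool
        var s = "hello";   // inferred as string

        // i = "oops";     // would not compile: 'i' is statically an int

        return new[] { i.GetType().Name, b.GetType().Name, s.GetType().Name };
    }

    public static void Main()
    {
        // Prints: Int32, Boolean, String
        Console.WriteLine(string.Join(", ", InferredTypeNames()));
    }
}
```

The commented-out assignment is the interesting part: once the type has been inferred, the variable behaves exactly like one declared with the explicit type.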

We'll see later on why this can be quite a handy thing.  Now, let's extend our code a bit further to start querying that array of customers:

        Customer[] customers = GetCustomers();
        var custs = customers.Where(c => c.City == "Seattle");

Here we're simply querying all our customers for the set of customers that are from Seattle.  And "custs" will be an IEnumerable<Customer>.  We can even carry that a little further into the following query:

        Customer[] customers = GetCustomers();
        var custs = customers.Where(c => c.City == "Seattle").Select(c => c.Name);

Here we're projecting out the name of all our customers from Seattle.   So custs will be an IEnumerable<string>.  Now, what the heck is this code?  This isn't your daddy's C# anymore.  What are those funky arrows?  And where did the "Where" and "Select" methods come from??  They certainly don't seem to be defined on the array type when i look at it in ILDasm!  Well, to answer the first question, the funky => arrow is the new C# 3.0 syntax that allows you to create a lambda expression. You can think of a lambda expression as a natural evolution of the anonymous methods introduced in C# 2.0.  Lambda expressions benefit from simpler syntax and the ability to use inference.  So now you can write:

        c => c.City == "Seattle"  // instead of
        delegate (Customer c) { return c.City == "Seattle"; }

As you can see, the C# 2.0 approach just drowns you in syntax, which makes it a rather poor choice to use in queries (heck! there's a 2x increase in query size between the two).  However, the new C# lambda expression succinctly encapsulates the test we want to perform, with only about 5 characters of overhead.
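
To see that the two forms really are interchangeable, here's a small sketch that stores each one in the same delegate type and runs both against the same data.  The Customer class is a hypothetical stand-in with just the two fields the examples use, and LambdaDemo is an illustrative name:

```csharp
using System;

// Hypothetical stand-in for the Customer type used in the examples.
public class Customer
{
    public readonly string Name;
    public readonly string City;

    public Customer(string name, string city)
    {
        Name = name;
        City = city;
    }
}

public static class LambdaDemo
{
    // C# 2.0 anonymous method: parameter type, braces, and return are all spelled out.
    public static readonly Predicate<Customer> OldStyle =
        delegate (Customer c) { return c.City == "Seattle"; };

    // C# 3.0 lambda: the type of 'c' is inferred from Predicate<Customer>.
    public static readonly Predicate<Customer> NewStyle =
        c => c.City == "Seattle";

    public static void Main()
    {
        Customer ann = new Customer("Ann", "Seattle");
        Customer bob = new Customer("Bob", "London");

        // Both delegates give the same answer for each customer.
        Console.WriteLine(OldStyle(ann) == NewStyle(ann));
        Console.WriteLine(OldStyle(bob) == NewStyle(bob));
    }
}
```

Both fields have the same delegate type, so anything that accepts one will accept the other; the lambda is just less to type and less to read.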

That answers the first question, but what about the second?  Where, oh where did "Where" come from?  This is an example of another new C# 3.0 feature we call "extension methods".  Extensions are a way to allow you to add operations to existing types that aren't under your control.  While that may give you the heebie-jeebies, rest assured, you're not actually modifying the actual type.  Rather, you're being allowed to use succinct syntax to, in effect, execute a method as if it existed on this type.  Specifically, extension methods are static methods that look like so:

        namespace System.Query {
            public static class Sequence {
                public static IEnumerable<T> Where<T>(this IEnumerable<T> e, Predicate<T> p) {
                    foreach (T t in e) {
                        if (p(t)) {
                            yield return t;
                        }
                    }
                }
            }
        }

This declares an "extension method" on the IEnumerable<T> type.  When you import the namespace by writing "using System.Query", you now gain the ability to call the "Where" method on anything that implements IEnumerable<T> (like arrays).  With these extension methods we can now compose powerful query functions together to manipulate data easily.
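
You can write extension methods of your own in exactly the same way.  The sketch below assumes a made-up method called WhereNot (the opposite of the filter above) in a made-up MyExtensions class; neither is part of the framework.  Everything sits in the global namespace here, so no "using" line is needed to bring the extension into scope:

```csharp
using System;
using System.Collections.Generic;

public static class MyExtensions
{
    // Hypothetical extension method: yields the elements that fail the test.
    public static IEnumerable<T> WhereNot<T>(this IEnumerable<T> source, Predicate<T> test)
    {
        foreach (T item in source)
        {
            if (!test(item))
            {
                yield return item;
            }
        }
    }
}

public static class ExtensionDemo
{
    public static string Run()
    {
        string[] cities = { "Seattle", "London", "Seattle", "Paris" };

        // Called as if it were an instance method on the array...
        IEnumerable<string> viaExtension = cities.WhereNot(c => c == "Seattle");

        // ...and as the plain static method it really is.
        IEnumerable<string> viaStatic = MyExtensions.WhereNot(cities, c => c == "Seattle");

        return string.Join(",", viaExtension) + " | " + string.Join(",", viaStatic);
    }

    public static void Main()
    {
        // Prints: London,Paris | London,Paris
        Console.WriteLine(Run());
    }
}
```

The second call is worth remembering: the extension syntax is only a convenience, and the ordinary static call is always available if two extension methods ever collide.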

So at this point we've seen three new C# 3.0 features that can be used together to build a powerful base for querying objects.  In future posts i'll include information about the rest of the new language features, and i'll give a more comprehensive view of how sophisticated our query support is.
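
For anyone who wants to run the query from this post end to end, here's a self-contained sketch.  A few assumptions to call out: the Customer class and the contents of GetCustomers() are invented sample data, and the code uses the System.Linq namespace, which is where these operators ended up in the released framework (the preview bits described above use System.Query, and the released Where takes a Func<T, bool> rather than a Predicate<T>):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Customer type used in the examples.
public class Customer
{
    public readonly string Name;
    public readonly string City;

    public Customer(string name, string city)
    {
        Name = name;
        City = city;
    }
}

public static class QueryDemo
{
    // Hypothetical stand-in for GetCustomers(): a small in-memory sample.
    public static Customer[] GetCustomers()
    {
        return new[]
        {
            new Customer("Ann", "Seattle"),
            new Customer("Bob", "London"),
            new Customer("Cid", "Seattle")
        };
    }

    public static List<string> SeattleNames()
    {
        Customer[] customers = GetCustomers();

        // 'var' infers IEnumerable<string>; Where and Select are extension
        // methods; both arguments are lambda expressions.
        var custs = customers.Where(c => c.City == "Seattle").Select(c => c.Name);

        return custs.ToList();
    }

    public static void Main()
    {
        foreach (var name in SeattleNames())
        {
            Console.WriteLine(name);
        }
    }
}
```

With the sample data above, this prints Ann and then Cid, in the order they appear in the array.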

Comments

  • Anonymous
    September 13, 2005
    I think the general idea is fantastic ... I think as a general mechanism to replace OR systems, particularly when digesting the problem domain of business objects and logic, it is missing some key elements ... and unfortunately a design change is required to fix (kinda why the XmlSerializer class is effectively useless).

    I'll put it up on my blog and post back.

  • Anonymous
    September 13, 2005
    sniff I was so looking forward to c# 2.0, and now you just made it look stale in comparison.

    You big ol' meanie, you!

  • Anonymous
    September 13, 2005
    I'm not a Ruby specialist, but I have been working with it for about a week, and this whole thing seems very similar to how things work in the Ruby world.
    Try this:
    http://www.rubyonrails.com
    especially the ActiveRecord stuff.
    Bye

  • Anonymous
    September 13, 2005
    I must confess I am very impressed by this. It's about time something like this was introduced, it'll make life a lot easier on those of us who do a lot of work with databases.

  • Anonymous
    September 13, 2005
    namespace System.Query {
        public static class Sequence {
            public static void ForEach<T>(this IEnumerable<T> e, Func<T> f) {
                foreach (T t in e) {
                    f(t);
                }
            }
        }
    }

  • Anonymous
    September 13, 2005
    So will Linq also work with SQL Server?

    Will GetCustomers() return my whole customer table or will it 'magically' select only those I specify in my where clause?

  • Anonymous
    September 13, 2005
    Ok, all the developers (myself included) out there are totally going bananas over the revelations from...

  • Anonymous
    September 13, 2005
    class System.Query.Sequence is not mentioned in the code

    Customer[] customers = GetCustomers();
    var custs = customers.Where(c => c.City == "Seattle");

    What if there was another class with an extension method

    public static IEnumerable<T> Where<T>(this IEnumerable<T> e, Predicate<T> p);

    How would we specify which extension method is applicable?

    // Ryan

  • Anonymous
    September 13, 2005
    Will LINQ support querying something besides IEnumerable in the future? How would I tell LINQ to query a Binary Space Partition? (requiring LINQ to know about the data container)

    Also, in the NorthWind example (Channel 9 Anders' video @ http://channel9.msdn.com/Showpost.aspx?postid=114680), is LINQ doing a full table scan every time? Or does it take advantage of existing indexes? But you're not duplicating the SQL Server engine inside the CLR ... are you?

    Also, will LINQ warn me of malformed cross joins? Where I'm about to return 200 billion rows? Who of us hasn't done that, right? Am I right? Hello?

  • Anonymous
    September 14, 2005

    As you've probably already heard, at long last we've announced the new features that we're planning...

  • Anonymous
    September 14, 2005
    chris: "So will Linq also work with SQL Server? "

    Absolutely. That's what Dlinq is all about.

    "Will GetCustomers() return my whole customer table or will it 'magically' select only those I specify in my where clause?"

    Read more about DLinq. There's no magic involved... but rather a cool system where Expression<T> trees are remoted to your DB and executed there. So customers will only pull down the results of the query that's executed server side. And it will only materialize them when you foreach. So you can build up your query to something huge, and have no cost client side for it. No need to pull down anything extra... you get the idea :)

  • Anonymous
    September 14, 2005
    Reread my original comment -- I already created that same extension method. My feedback was to actually bake it in.

  • Anonymous
    September 14, 2005
    Ryan: "class System.Query.Sequence is not mentioned in the code "

    My bad. I'll update the code accordingly.

    "Customer[] customers = GetCustomers();
    var custs = customers.Where(c => c.City == "Seattle");

    What if there was another class with an extension method

    public static IEnumerable<T> Where<T>(this IEnumerable<T> e, Predicate<T> p);

    How would we specify which extension method is applicable? "

    Then there would be a conflict if you called .Where (just like a regular overload conflict). However, you could always specify each method with its full name, i.e.: Sequence.Where(customers, c => c.City == "Seattle")

  • Anonymous
    September 14, 2005
    Ryan:

    You must explicitly import the namespace the extension method is defined in. Even if it's defined in the current namespace.

    Yeah, that last part is a bit silly, but these are just the preview bits.

    Cyrus:

    Is there a real feedback place?

  • Anonymous
    September 14, 2005
    Minh: "Will LINQ support querying something besides IEnumerable in the future? How would I tell LINQ to query a Binary Space Partition? (requiring LINQ to know about the data container) "

    You need to look into all the DLinq stuff that's happening :)

    And send feedback if it's not going to meet your needs. I'm less savvy about DLinq as it just builds on top of the basic query work that we're doing.

    "Also, in the NorthWind example (Channel 9 Anders' video @ http://channel9.msdn.com/Showpost.aspx?postid=114680 is LINQ doing a full table scan every time?"

    No. DLinq is doing nothing but providing the infrastructure to remote queries over to the server. Then only when iterated by the client are results materialized into the CLR world. Then when changes are made, they're local changes that can be sent back to your DB with commits.

    " Or does it take advantage of existing indexes? But you're not duplicating the SQL Server engine inside the CLR ... are you? "

    Not at all (duplicating SQL server that is). All queries run server side and naturally take advantage of whatever indices you have.

    "Also, will LINQ warns me of malform cross joins? Where I'm about to return 200 billion rows? Who of us haven't done that, right? Am I right? Hello? "

    Not sure about that. Hopefully you'll test your stuff first and realize that you've done something really bad like that :)

  • Anonymous
    September 14, 2005
    If I download the C# LINQ preview, will that give me Extension Methods in Visual Studio 2005?


    What time frame are we talking before Orcas is ready?

    Go!

  • Anonymous
    September 14, 2005
    Hey Cyrus, a great post in the future would be some details on using lambdas in C#. I get your example, but not having much of a LISP background, I end up having to re-read the code a few times to get my arms around it. :)

    Cool stuff!

  • Anonymous
    September 14, 2005
    Sean M: Yup, the linq preview comes with a little installer for VS 2005 to give it alpha support for this stuff! Cheers!

    No idea on Orcas though :) (check the net though, i'm sure someone must have said something already).

  • Anonymous
    September 14, 2005
    Sean Chase: "Hey Cyrus, a great post in the future would be some details on using lambdas in C#. I get your example, but not having much of a LISP background, I end up having to re-read the code a few times to get my arms around it. :)

    Cool stuff! "

    Absolutely! I plan on blogging a lot about this. :)

  • Anonymous
    September 14, 2005
    Wow! Great stuff! I was waiting for something similar, knowing Comega and Xen..
    I think your solution is very elegant, and I always liked very much the lambda expression, back at the university..
    I also like the extensibility of this solution.

    Now the only thing missing from C# is concurrency.. do you plan to add it in a Comega way? With another message based solution? Or with another shared memory solution? If you want to, I have a little project based on Rotor to share.. =)

    Also, a last question: how will this cope with third party languages and compilers (especially dynamic languages)?

    Great work!

  • Anonymous
    September 15, 2005
    There's a discussion of some C#3 features (http://blog.lab49.com/?p=93) over at Lab49 Blog (http://blog.lab49.com) talking (and arguing) about extension methods, functional programming, etc. Interesting if you like that sort of thing.

  • Anonymous
    September 15, 2005
    After the declaration of C# 3.0 I went ahead and installed the PDC bits. After reading through the language...

  • Anonymous
    September 16, 2005
    I think the => operator is not only funky, but also clunky. It's so C++ish.

    I guess the combination is easy to type on an English keyboard, but on a Norwegian one those two characters are almost as far from each other as they can possibly get.

    I would like a colon better. The colon is already used as a separator in the switch statement:

    var custs = customers.Where(c : c.City == "Seattle");

  • Anonymous
    September 16, 2005
    Thomas: I'll take your feedback to the Language design team. Thanks!

  • Anonymous
    September 16, 2005
    In the last post i discussed a little bit of background on why we wanted to introduce Linq, as...

  • Anonymous
    September 17, 2005
    I'm with Thomas about not liking the => operator so much. It's not clear that a sense of directionality is useful when reading lambda expressions.

    I'd also rather see enclosing brackets be mandatory - it makes the functional nature of the resulting expression more clear.

    Maybe something like this:

    var sum = (x,y,z)(x+y+z)
    var sqr = (x)(x*x)

    c.f

    var sum = x,y,z => x+y+z
    var sqr = x => x*x

    Note how the overuse of the equals sign tends to blur the distinction between = and =>. To my eye, I tend to see this associativity (if only because => is 'bigger' than = ):

    (var sqr = x) => x*x

  • Anonymous
    September 18, 2005
    Blog link of the week 37

  • Anonymous
    September 20, 2005
    Damien: "var sum = (x,y,z)(x+y+z)
    var sqr = (x)(x*x) "

    Bleagh :)

  • Anonymous
    September 20, 2005
    Anyone know how you would write something like this???

    select o.customerID, c.CompanyName, count(o.OrderID) NumOrders
    from Customers c,
    Orders o
    where c.CompanyName LIKE 'A%'
    and o.CustomerID = c.CustomerID
    group by o.CustomerID, c.CompanyName

  • Anonymous
    September 20, 2005
    I'd also like to know how joins are meant to be handled.

  • Anonymous
    September 21, 2005
    Damien: "var sum = (x,y,z)(x+y+z)"

    Cyrus: "Bleagh :) "

    Would curly-braces make it more comfy?

    var sum = (x,y,z){x+y+z}

    I personally liked Thomas's ':' the most. And Damien, you know why '=>' implies associativity to you, you just don't want to admit you've soiled yourself with Perl ;-P

  • Anonymous
    September 21, 2005
    Confused:

    "Anyone know how you would write something like this???

    select o.customerID, c.CompanyName, count(o.OrderID) NumOrders
    from Customers c,
    Orders o
    where c.CompanyName LIKE 'A%'
    and o.CustomerID = c.CustomerID
    group by o.CustomerID, c.CompanyName"


    Like this:

    var q = from c in Customers, o in Orders
            where c.CompanyName.StartsWith("A") &&
                  c.CustomerId == o.CustomerId
            group o by new { o.CustomerId,
                             c.CompanyName } into g
            select new { g.Key.CustomerId,
                         g.Key.CompanyName,
                         NumOrders = g.Group.Count() }

  • Anonymous
    September 21, 2005
    Ok, so now we know how to write nested loop joins.

    Is it the case that anything more sophisticated than a nested-loops join requires more than is delivered in System.Query.Sequence?

  • Anonymous
    September 21, 2005
    Damien: "var sum = (x,y,z)(x+y+z)"

    Cyrus: "Bleagh :) "

    Ken : "Would curly-braces make it more comfy?"

    curly braces would be reserved for lambda blocks

    var f = (x,y,z) {
        while (x<y) { z += x; x += 1;}
        return x;
    }

    Ken: "... you've soiled yourself with Perl"

    Actually no, I always managed to somehow avoid doing any Perl. I even got Python installed on a few systems way back when Python was in versions 1.x.

  • Anonymous
    September 28, 2005
    I just posted on Lab49 blog ( http://blog.lab49.com/?p=132 ) asking about the overall rationale for this, as well as some specific shortcomings in the implementation.

    The main rationale for Linq seems to be "Anders thinks it's a good idea", which frankly is enough for me :-) But yet I'm still not quite seeing it as the amazing new breakthrough that others seem to think it is.

    Perhaps you (or someone) could post more on the (real-world) use cases/scenarios/patterns/architectures that would incorporate Linq?

    Thanks - Daniel
