Astoria data sources and system layering

The question is: what data sources should Astoria support as inputs? Should we allow ADO.NET Entity Framework only or open up more? If we open up more, how does the system need to be layered to enable the use of those alternate data sources?

Let me step back for a second first and establish why certain data sources are interesting.

Why does Astoria build on top of the EDM and the Entity Framework?

Putting an HTTP head on top of some set of data is easy enough that it’s tempting to do it against any data source out there. The problem is that if the way data is modeled and surfaced has not been thought through, inconsistencies and awkward aspects start to show up in the HTTP interface.

The Entity Data Model (EDM) was a very nice candidate. EDM explicitly models all data as entities, which can be nicely mapped to resources. So the definition of a resource is crisp and it comes from the underlying data model. Furthermore, the EDM defines associations between those entities, which can be surfaced quite naturally as hyperlinks. Finally, semantics around query and update for EDM data are clearly defined and can be mapped easily to HTTP verbs.

Entity Framework also provides an important contribution: mapping. You rarely want to surface your database data as-is to the web, either to an application or as a service for external applications. You’ll want to adjust names, reshape data, merge tables, etc. The Entity Framework mapping layer can help with that, and with the VS integration we’ll include in the Entity Framework for EDM design and mapping, you’ll be able to do this in a nice visual experience.

Clearly, the world of data sources does not end with the Entity Framework…

Strictly from the data model perspective we clearly need to pick a single one for consistency, and EDM has the right characteristics for our needs. Now, from the perspective of the actual data sources that we can handle, it would not be reasonable to assume that every data source out there will be plugged into the Entity Framework. There are many scenarios where you want to surface data coming from other sources and you still want to use the Astoria URI and payload formats, along with the client libraries and tools that will be available for it.

So we want to support Entity Framework *and* other data sources. Now the trick is how to do this without basically writing Astoria twice :-), and maintain a sound, consistent data- and interaction-model.

What do we need from a data source?

Astoria has a few specific needs when interacting with a data source:

  • Surface data as reasonable units we can expose as resources. For example, we could say that each CLR object is a resource
  • Addressability: each resource (say CLR object) needs to be addressable. For us to create an address we need to be able to figure out what members are the “keys” on that resource
  • Query composition: Astoria URIs are a simple form of queries. We need to be able to formulate and execute queries against the data source without knowing the specifics of the target data source. We also need to be able to “compose” queries; e.g. to take a given query and say add sort order to it
  • Update: we need to be able to pull a bunch of resources (say, again, CLR objects), make some modifications, and then push the changes back into the data source.
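To make the query composition requirement a bit more concrete, here is a minimal sketch of taking an existing query and adding a sort order and a limit to it without knowing what sits behind it. The Thing class and the SortAndLimit and SampleSource helpers are made-up names for illustration only, and an in-memory list stands in for a real data source; this is not Astoria code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical resource type; "ID" plays the role of the key member.
public class Thing
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class CompositionSketch
{
    // Composes a sort order and a limit on top of any query over Thing,
    // without knowing which provider sits behind the IQueryable.
    public static IQueryable<Thing> SortAndLimit(IQueryable<Thing> source, int top)
    {
        return source.OrderBy(t => t.Name).Take(top);
    }

    // In-memory stand-in for a real data source (an assumption of this sketch).
    public static IQueryable<Thing> SampleSource()
    {
        return new List<Thing>
        {
            new Thing { ID = 1, Name = "pear" },
            new Thing { ID = 2, Name = "apple" },
            new Thing { ID = 3, Name = "fig" },
        }.AsQueryable();
    }
}
```

Calling CompositionSketch.SortAndLimit(CompositionSketch.SampleSource(), 2) returns a new query; nothing is executed until the result is enumerated, which is what gives a real provider the chance to run the sort and limit itself.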

Proposed approach

Trying to avoid inventing new stuff, we could try to tackle this problem using the technology that was introduced as part of LINQ. The IQueryable interface provides a mechanism by which you can build a query by applying operators to input IQueryable objects, obtaining a new IQueryable object that includes the applied operator (there are plenty of blog entries and such on IQueryable on the web, so I won’t elaborate on it here; Matt Warren’s series on IQueryable provides a great detailed reference for this).
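As a small illustration of what "includes the applied operator" means, the sketch below applies two standard LINQ operators to an in-memory source and then walks the resulting expression tree to list the operators it recorded. The OperatorRecordingSketch class and its RecordedOperators method are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OperatorRecordingSketch
{
    // Applies Where and then OrderBy to an in-memory source and returns the
    // names of the operators found in the expression tree, outermost first.
    public static List<string> RecordedOperators()
    {
        IQueryable<int> source = new[] { 3, 1, 2 }.AsQueryable();
        IQueryable<int> query = source.Where(n => n > 1).OrderBy(n => n);

        var names = new List<string>();
        Expression current = query.Expression;
        // Each applied operator shows up as a method call whose first
        // argument is the expression of the query it was applied to.
        while (current is MethodCallExpression call)
        {
            names.Add(call.Method.Name);
            current = call.Arguments[0];
        }
        return names;
    }
}
```

The operators are kept as data rather than executed right away, so a provider can inspect them and translate them into whatever its data source understands.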

In this approach we can say that your data service takes as input any class that has a set of public properties of type IQueryable<T>. For example:

public class MyDataService : WebDataService<MyDataSource>
{
}

Where MyDataSource could be something like:

public class MyDataSource
{
  public IQueryable<SomeType> SomeThings {…}
  public IQueryable<OtherType> OtherThings {…}
}

public class SomeType {
  public int ID { get; set; }
  public string SomeData { get; set; }
  public ICollection<OtherType> RelatedThings { get; }
}

public class OtherType { ... }

That would result in a service with two top-level entity containers, /SomeThings and /OtherThings.

Translation to LINQ expression trees is for the most part fairly straightforward. Following the example, if you now request the URI /SomeThings!1, we would translate it roughly to:

myDataSourceInstance.SomeThings.Where(i => i.ID == 1)

(at least conceptually, we would build the expression trees directly and not go through the language form, of course).

A more elaborate example would be association traversal, for example /SomeThings!1/RelatedThings; the translation for that would be:

myDataSourceInstance.SomeThings.Where(i => i.ID == 1).SelectMany(i => i.RelatedThings)
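For readers curious what building these expression trees directly might look like, here is a rough sketch using the standard System.Linq.Expressions API. It is not the actual Astoria translation code: the UriTranslationSketch class and its ByKey and Related methods are hypothetical, the key is assumed to be an int property named ID, and RelatedThings is given a setter here only so that sample data is easy to construct.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class OtherType
{
    public int ID { get; set; }
}

public class SomeType
{
    public int ID { get; set; }
    public string SomeData { get; set; }
    public ICollection<OtherType> RelatedThings { get; set; }
}

public static class UriTranslationSketch
{
    // Builds source.Where(i => i.ID == key) node by node instead of
    // going through the C# lambda syntax.
    public static IQueryable<SomeType> ByKey(IQueryable<SomeType> source, int key)
    {
        ParameterExpression i = Expression.Parameter(typeof(SomeType), "i");
        Expression body = Expression.Equal(
            Expression.Property(i, "ID"),
            Expression.Constant(key));
        Expression<Func<SomeType, bool>> predicate =
            Expression.Lambda<Func<SomeType, bool>>(body, i);
        return source.Where(predicate);
    }

    // Association traversal, roughly /SomeThings!key/RelatedThings.
    public static IQueryable<OtherType> Related(IQueryable<SomeType> source, int key)
    {
        return ByKey(source, key).SelectMany(s => s.RelatedThings);
    }
}
```

Both methods return IQueryable objects, so further operators (sorting, paging and so on) can still be composed on top before anything is executed.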

So we are basically saying that the Astoria server runtime has two halves. The top half is the Astoria runtime itself; this part is “fixed”, and it implements URI translation, the XML/JSON/etc. wire formats, the interaction protocol, etc. It’s basically what makes an Astoria service look like an Astoria service. The bottom half is the data-access layer and it’s pluggable. Communication between layers happens in terms of the IQueryable interface plus a set of conventions to map CLR graphs into the URI/payload patterns of Astoria.

Other options?

An alternate approach would be to come up with a new interface. Astoria needs only a few operations to be supported by data sources, and the IQueryable interface is certainly overkill for the job. However, the fact that IQueryable already exists and there will be various implementations for it has a lot of value itself…

Net result

The net result that we want, through IQueryable or some other alternative, is that we can surface various data sources through Astoria, so that clients can interact with data services across the way using a single, uniform mechanism. The exposed HTTP interface is still EDM-ish even when the source is not the Entity Framework, and the metadata is still expressed in EDM terms for consistency. That helps maintain the simple, straightforward semantics of the Astoria HTTP interface.

Now you can bring any data source you want and expose it through Astoria, provided that you have or can write an IQueryable implementation for it. Not only will we automatically hook it up in the Astoria pipeline, but we would also push down all the filters, sorting, and other operators, so the data source can implement them efficiently.

Querying over the Entity Framework

It turns out that we are building first-class LINQ support into the Entity Framework; we call it LINQ to Entities. That means that while we have a different code path for metadata, the query composition and execution code paths, as well as all of the serialization and other details, are common code across Entity Framework and any other LINQ implementation plugged into Astoria.

What’s missing

Metadata: one thing I did not discuss here is how we turn CLR object graphs obtained by executing IQueryable objects into something that works well with the Astoria HTTP interface. There is a set of conventions that we’ll use, and a few attributes to override conventions when they don’t work for you. In a future post we’ll discuss those in detail.

Update: while this model enables us to support querying over arbitrary .NET classes that expose containers of instances as IQueryable objects, it does not say how to update stuff. Discussing the update model will take a whole post (or several, most likely), but the short story is that we’ll define an interface, something like IUpdatable or whatever name works, that has the basic operations we need to perform in order to handle updates. The interface would have primitive operations for adding a new resource, removing an existing resource, applying modifications to resources, and linking/unlinking resources.

Pablo Castro
Technical Lead
Microsoft Corporation
https://blogs.msdn.com/pablo

 

This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.

Comments

  • Anonymous
    September 27, 2007
    It is time for another weekly roundup of news that focuses on .NET, agile and general development related
  • Anonymous
    September 28, 2007
    When you first announced Astoria, it was an exciting idea.  A quick snap-in framework to expose RESTful services has SO much potential.  But when you originally mentioned it would be written solely for EDM, that was depressing news.  I went so far as helping develop an open source implementation of Astoria to support other OR/M mappers.  This post is very encouraging that you are considering opening up the framework.  At the very least, please provide read-only support for anything implementing IQueryable.  What's better?  Create an IAstoriaProvider interface (feel free to change the name) one level up so that we can plug in our own data providers.  This interface would be responsible for fetching data, determining all of the metadata information, as well as persisting object changes.  One flaw many people see in Astoria is that updates to objects appear to bypass any sort of business tier.  If I post a new Customer object to the Astoria service, I want a business tier to be called that will send out a new welcome email for instance.  The standard EDM provider is great for simple projects but Astoria should set its primary focus as a presentation layer framework and allow us to feed it whatever information we desire and save objects through our own home grown code.  If we choose to do this, let us be responsible for working out the mapping inconsistencies.
  • Anonymous
    September 29, 2007
    We're trying to keep up posting regularly on the design aspects of Astoria we have on the table week
  • Anonymous
    September 29, 2007
    I have been talking about Astoria on this blog for some time now. For those who don't know what Astoria
  • Anonymous
    September 30, 2007
    IQueryable is OK. For most data sources there are already compelling reasons to write IQueryable implementations, so this will just add more value to them. The scenarios where you would be interested in having support for Astoria and not for LINQ are probably very limited (you probably know them better).
  • Anonymous
    September 30, 2007
    @Derrick: I'm happy to hear that this design approach sounds reasonable and the post is encouraging. Yes, I simplified the description a bit because some of the pieces are still cooking, but the real deal is that you can just do IQueryable or you can go for a bigger interface that allows you to control more things, including updates.
    @Andres: thanks for the feedback. Note that even if you don't have an IQueryable implementation for your data source, writing one that's just enough for Astoria is substantially simpler than writing a full-on LINQ provider, so this should enable a broad set of sources to be plugged in. Now, given that you have more continued contact with the real world, if you run into something that wouldn't work with this strategy I'd love to hear about it :)
  • Anonymous
    December 07, 2007
    A new name, but goals stay the same As per our last blog post, we just finished a round of presentations
  • Anonymous
    January 15, 2008
    While the syntax described for filter in the previous post allows you to do some nifty things, there