Mocks Nix - An Extensible LINQ to SQL DataContext

I often get asked how LINQ to SQL is supposed to be used with Test-Driven Development (TDD). Okay, not really. People aren’t knocking on my door or calling me at 3:00 am. I do, however, occasionally read developers’ angst on their personal blogs. It seems they are trying to actually do this, but are often confounded by the DataContext and its dearth of appropriate interfaces. Of course, my original knee-jerk reaction is to question why anyone would want or need to do this in the first place. Certainly, abstraction at a higher level of the application would be more appropriate, yada yada yada. Eventually, my internal ranting ebbs and my practical side takes over. I start thinking like an engineer. How would I go about it? If only I’d added fundamental interfaces such as IDataContext and ITable<T> before hitting RTM, all would be so much easier. Yet, TDD was not a priority. It wasn’t even on the list of features that didn’t make the cut. Still, how would I do it? Then I start wishing I could override the DataContext’s methods and substitute my own logic. Yet these methods are not virtual and cannot be overridden. Then with fitting irony I recall reading the other developer blogs that pointed this out too.

Of course, this only makes the problem that much more interesting and worthy of a good hack. I consider wrapping the DataContext in some other layer that looks exactly like it and abstracting it that way, but then realize it would certainly trip the system up, especially deep in the query translation engine where it expects to find references to specific types. Instead, the ideal solution would keep the DataContext the same, yet allow me to do something other than hitting the database when a query is executed. If only LINQ to SQL had a public provider model, I could simply plug a new one in and use it to intercept all interaction with the database. Oh, double irony, as there is no such provider model, at least not a public one. Grin.

LINQ to SQL was actually designed to be host to more types of back-ends than just SQL Server. It had a provider model targeted for RTM, but it was disabled before the release. Don’t ask me why. Be satisfied to know that it was not a technical reason. Internally, it still behaves that way. The trick is to find out how to swap in a new provider when everything from the language to the runtime wants to keep you from doing it.

Fortunately, the DataContext has a nice little ‘provider’ instance variable just waiting to be overwritten. A little bit of reflection can make quick work of that. The trouble is how to specify a new provider. The DataContext only talks to it through an interface (as it should), and yet that interface is internal to the LINQ to SQL assembly. The programming language won’t let you define your own implementation. How do you go about implementing an interface that you can’t even say the name of in your source code?
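The field-overwriting half of the job really is the easy part. Here is a minimal sketch of it on a stand-in class; FakeContext, its Describe method, and FieldSwapDemo are hypothetical names made up for illustration, and only the private field name 'provider' is borrowed from the text.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a class that hides its collaborator in a private field.
public class FakeContext
{
    private object provider = "original";

    public string Describe()
    {
        return "provider = " + provider;
    }
}

public static class FieldSwapDemo
{
    public static string Run()
    {
        FakeContext context = new FakeContext();

        // Private instance fields are only visible to reflection when NonPublic is requested.
        FieldInfo field = typeof(FakeContext).GetField(
            "provider", BindingFlags.Instance | BindingFlags.NonPublic);

        // Overwrite the private field from the outside.
        field.SetValue(context, "replacement");

        return context.Describe();
    }
}
```

Calling FieldSwapDemo.Run() shows the object reporting the replacement value. Note that this kind of private reflection generally needs full trust, so it is a trick for test code rather than for locked-down hosting.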

Actually, I can think of two ways: 1) write a bunch of reflection emit code that generates an implementation at runtime or 2) trick the runtime into thinking some existing object implements the interface. You can probably guess where I am going from here, as every good hack needs a good trick. Besides, a bunch of reflection emit code would be a lot more work. Onward to the fun solution!

This is where CLR grand-interception-theory comes in; in the CLR you can intercept any interaction with any object, really, as long as it’s a method call and the object derives from MarshalByRefObject. Actually, that’s not really true; you can intercept more than method calls, or at least they don’t start out being method calls, and they don’t necessarily need to be on only MarshalByRefObject objects. Still, not only do I want to intercept calls on an object, I want to make the object appear to implement an interface and intercept the calls on that interface. That’s a tall order, to be sure. But it can be done.

The interception capability is the underpinning of remoting (aka DCOM) support in the runtime. I can use it to make an object masquerade as another object. The original intention was to enable client-side proxy objects to appear to implement the API of an object that only really exists on a server. The term ‘MarshalByRef’ refers to the DCOM behavior of marshalling a reference to the object from the server back to the client, such that calls on the client-side proxy are marshalled back to the server. It works by the JITter injecting specialized thunks into the code that identify and handle calls to these special doppelgangers. The really interesting thing to note is that interfaces in the runtime work nearly the same. They also have thunks that are capable of recognizing these proxies and acting accordingly, quite possibly because COM is so dependent on multitudes of interfaces. However, regardless of the reason they exist, I can use this mechanism to wedge my own provider implementation into the mix.

What I first need to do is define a proxy object that will intercept these calls. The remoting mechanism actually uses two different proxies, one that masquerades as the type (the transparent proxy) and one that receives the interception (the ‘real’ proxy.) Both of these guys are intended to exist on the client. The real proxy is supposed to be the object that actually implements the marshalling behavior. My guess is that the only reason that I’m even allowed to implement my own real proxy is to enable marshalling over newer communication layers. Fortunately, I can use this proxy to simply act as an interceptor to do my bidding.
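To make the two-proxy arrangement concrete, here is a minimal sketch, assuming the .NET Framework (System.Runtime.Remoting is not available on .NET Core and later). IGreeter, InterceptingProxy and ProxyDemo are hypothetical names invented for this illustration; RealProxy, IRemotingTypeInfo, IMethodCallMessage and ReturnMessage are the framework's own remoting types.

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Messaging;
using System.Runtime.Remoting.Proxies;

// Hypothetical interface that no real object in this sketch implements.
public interface IGreeter
{
    string Greet(string name);
}

// The 'real' proxy: it receives every call made on the transparent proxy.
public class InterceptingProxy : RealProxy, IRemotingTypeInfo
{
    public InterceptingProxy()
        : base(typeof(ContextBoundObject))
    {
    }

    public override IMessage Invoke(IMessage msg)
    {
        IMethodCallMessage call = (IMethodCallMessage)msg;

        // Fabricate a return value instead of forwarding the call anywhere.
        string result = call.MethodName + "(" + call.Args[0] + ") intercepted";
        return new ReturnMessage(result, null, 0, call.LogicalCallContext, call);
    }

    // Answering true to every cast check lets the transparent proxy pose as any type.
    public bool CanCastTo(Type fromType, object o)
    {
        return true;
    }

    public string TypeName
    {
        get { return GetType().Name; }
        set { }
    }
}

public static class ProxyDemo
{
    public static string Run()
    {
        // The transparent proxy is the object that callers actually hold.
        object transparent = new InterceptingProxy().GetTransparentProxy();

        // The cast is permitted because CanCastTo said yes.
        IGreeter greeter = (IGreeter)transparent;

        // This interface call is rerouted to InterceptingProxy.Invoke.
        return greeter.Greet("world");
    }
}
```

The sketch returns a canned string from Invoke, which is enough to show that both the cast and the call are decided by the real proxy rather than by any actual implementation of the interface.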

The next question I faced was what to do when I actually intercepted the calls. Should I forward them on to some new grand public provider model? That just seemed a bit over the top. Instead, I chose to redirect the calls back to methods on the DataContext that can be overridden. It was a quicker hack and introduces far fewer concepts to those already familiar with the DataContext. And that’s really what you wanted all along, anyway, wasn’t it?

So I reveal to you, the new and shiny ExtensibleDataContext, one with a few new poorly named methods that you can actually override and implement yourself.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using System.Reflection;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Activation;
using System.Runtime.Remoting.Proxies;
using System.Runtime.Remoting.Messaging;
using System.Runtime.Remoting.Services;
using System.Data;
using System.Data.Common;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Data.Linq.Provider;

namespace System.Data.Linq
{
    public class ExtensibleDataContext : DataContext
    {
        public ExtensibleDataContext(object connection, MappingSource mapping)
            : base("", mapping)
        {
            // Swap the private 'provider' field for a transparent proxy that
            // routes every provider call back to the *Impl members below.
            FieldInfo providerField = typeof(DataContext).GetField("provider", BindingFlags.Instance | BindingFlags.NonPublic);
            object proxy = new ProviderProxy(this).GetTransparentProxy();
            providerField.SetValue(this, proxy);
            this.Initialize(connection);
        }

        protected virtual void Initialize(object connection)
        {
        }

        // State behind the provider's properties. The proxy reaches these
        // accessors by name: a call to get_Log lands on get_LogImpl, and so on.
        private TextWriter LogImpl { get; set; }
        private DbConnection ConnectionImpl { get; set; }
        private DbTransaction TransactionImpl { get; set; }
        private int CommandTimeoutImpl { get; set; }

        protected internal virtual void ClearConnectionImpl()
        {
        }

        protected internal virtual void CreateDatabaseImpl()
        {
        }

        protected internal virtual void DeleteDatabaseImpl()
        {
        }

        protected internal virtual bool DatabaseExistsImpl()
        {
            return false;
        }

        protected internal virtual IExecuteResult ExecuteImpl(Expression query)
        {
            return new ExecuteResult(null);
        }

        // Simple IExecuteResult that just hands back a pre-built value.
        protected class ExecuteResult : IExecuteResult
        {
            object value;

            public ExecuteResult(object value)
            {
                this.value = value;
            }

            public object GetParameterValue(int parameterIndex)
            {
                return null;
            }

            public object ReturnValue
            {
                get { return this.value; }
            }

            public void Dispose()
            {
                IDisposable d = this.value as IDisposable;
                if (d != null)
                    d.Dispose();
            }
        }

        protected internal virtual object CompileImpl(Expression query)
        {
            return null;
        }

        protected internal virtual IEnumerable TranslateImpl(Type elementType, DbDataReader reader)
        {
            return null;
        }

        protected internal virtual IMultipleResults TranslateImpl(DbDataReader reader)
        {
            return null;
        }

        protected internal virtual string GetQueryTextImpl(Expression query)
        {
            return null;
        }

        protected internal virtual DbCommand GetCommandImpl(Expression query)
        {
            return null;
        }

        public class ProviderProxy : RealProxy, IRemotingTypeInfo
        {
            ExtensibleDataContext dc;

            internal ProviderProxy(ExtensibleDataContext dc)
                : base(typeof(ContextBoundObject))
            {
                this.dc = dc;
            }

            public override IMessage Invoke(IMessage msg)
            {
                if (msg is IMethodCallMessage)
                {
                    IMethodCallMessage call = (IMethodCallMessage)msg;
                    if (call.MethodBase.DeclaringType.Name == "IProvider" && call.MethodBase.DeclaringType.IsInterface)
                    {
                        // Match on parameter types as well as name; a name-only lookup
                        // throws AmbiguousMatchException for the two TranslateImpl overloads.
                        Type[] parameterTypes = Array.ConvertAll(call.MethodBase.GetParameters(), p => p.ParameterType);
                        MethodInfo mi = typeof(ExtensibleDataContext).GetMethod(
                            call.MethodBase.Name + "Impl",
                            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.DeclaredOnly,
                            null, parameterTypes, null);
                        if (mi != null)
                        {
                            try
                            {
                                return new ReturnMessage(mi.Invoke(this.dc, call.Args), null, 0, null, call);
                            }
                            catch (TargetInvocationException e)
                            {
                                // Surface the real exception rather than the reflection wrapper.
                                return new ReturnMessage(e.InnerException, call);
                            }
                        }
                    }
                }
                throw new NotImplementedException();
            }

            // Claim to be castable to anything, including the internal IProvider.
            public bool CanCastTo(Type fromType, object o)
            {
                return true;
            }

            public string TypeName
            {
                get { return this.GetType().Name; }
                set { }
            }
        }
    }
}

The ExtensibleDataContext’s constructor has the job of overwriting the DataContext’s private ‘provider’ variable. It creates a new ProviderProxy instance and assigns it to the private field using FieldInfo.SetValue(). The implementation of SetValue attempts to cast the object to the LINQ to SQL private interface IProvider. This succeeds because the function CanCastTo on the ProviderProxy returns true, allowing the proxy to be cast to any type. After that, all interface calls on this object are rerouted to the Invoke method. The implementation of Invoke simply calls the DataContext back, invoking methods with similar names. These are left empty for you to override in your own derivation of ExtensibleDataContext.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Text;

namespace MocksNix
{
    public class MyDataContext : ExtensibleDataContext
    {
        static MappingSource mapping = new AttributeMappingSource();

        public MyDataContext()
            : base("", mapping)
        {
        }

        public Table<Customer> Customers
        {
            get { return this.GetTable<Customer>(); }
        }

        // Called instead of hitting the database whenever a query is executed.
        protected internal override IExecuteResult ExecuteImpl(System.Linq.Expressions.Expression query)
        {
            this.Log.WriteLine("executing query: {0}", query);
            return new ExecuteResult(new Customer[] { });
        }
    }

    // AttributeMappingSource only maps classes that are marked with [Table].
    [Table]
    public class Customer
    {
        [Column(IsPrimaryKey = true)]
        public string CustomerId;

        [Column]
        public string ContactName;
    }

    class Program
    {
        static void Main(string[] args)
        {
            MyDataContext dc = new MyDataContext();

            // The log now lives in the replaced provider and starts out null,
            // so give it a writer before ExecuteImpl uses it.
            dc.Log = Console.Out;

            var query = from c in dc.Customers where c.CustomerId == "X" select c;
            var list = query.ToList();
        }
    }
}

Now, I can use the ExtensibleDataContext in a small test program. I create my own MyDataContext that implements ExecuteImpl(). This method gets called whenever a query needs to be executed. Instead of executing the query, I write out a simple message and return an empty collection.

That’s it. Now take this bit of code and go forth and prosper.

DISCLAIMER: Overriding internal implementation details is not a practice recommended or supported by Microsoft. Implementation details are subject to change without warning.

But who cares!

Go on, mock LINQ to SQL all you want.

Comments

  • Anonymous
    May 04, 2008
    How does this work with generated code? Or are you suggesting that you would generate the code from the dbml and then replace DataContext with this ExtensibleDataContext?

  • Anonymous
    May 04, 2008
    I often get asked how LINQ to SQL is supposed to be used with Test Driven Design (TDD). Okay, not really.

  • Anonymous
    May 05, 2008
    This post is a confluence of two distinct sets of comments I got: The above-mentioned feature is a well-hidden

  • Anonymous
    May 06, 2008
    Matt.. I'll say it:  You craaazy, man. By the way, what are the security principle requirements for this trick to work?  Should we expect it to work in a web hosting scenario?

  • Anonymous
    May 07, 2008
    Keith, I'm using reflection to overwrite a private field.  You do the math.

  • Anonymous
    May 11, 2008
    @Matt:  Yeah, that's what I thought.  Just wanted to be sure about whether it was Cthulhu or Leviathan you were invoking :)

  • Anonymous
    May 21, 2008
    "LINQ to SQL was actually designed to be host to more types of back-ends than just SQL server. It had a provider model targeted for RTM, but was disabled before the release.  Don’t ask me why.  Be satisfied to know that is was not a technical reason. Internally, it still behaves that way.  The trick is to find out how to swap in a new one when everything from the language to the runtime wants to keep you from doing it." That sounds scary. As a customer, I think I want to know why. How can I find out? Who should I ask, and should I expect an honest answer if the reason is not technical?

  • Anonymous
    May 22, 2008
    Product decisions are made all the time that are not based on technical reasons.  Often times it has to do with the balance of resources that can be put on the problem.

  • Anonymous
    May 24, 2008
    What are the services we get for free from Linq to Sql by using this approach? I'm guessing I have to translate Expression(s) to DbCommand(s) and DbDatareader(s) to Object(s), but what about Object Tracking?

  • Anonymous
    May 24, 2008
    If you are 'mocking' the DataContext you are probably going to want to have it produce specific results depending on the cases you are testing and so you probably won't need to interpret the queries.  I can imagine more advanced scenarios, however, writing code to handle these quickly becomes as complicated as writing a provider with its own query translator.

  • Anonymous
    June 05, 2008
    I think you're going about this the wrong way. Instead of trying to mock the DataContext directly, why not hide it behind an abstraction? Then you can easily mock it, use it for dependency injection, etc.: http://www.iridescence.no/Posts/Linq-to-Sql-Programming-Against-an-Interface-and-the-Repository-Pattern.aspx

  • Anonymous
    June 13, 2008
    I made an attempt at implementing IUpdatable on the current (i.e. VS 2008 Sp1 B1) bits of ADO.NET Data...

  • Anonymous
    July 02, 2008
    While I have finished my series on LINQ to SQL I wanted to talk about some of the reaction. In his summary

  • Anonymous
    July 10, 2008
    Navigate back to Part 2 of this series of entries. Ok, ok, ok, the other two parts where lean on the...

  • Anonymous
    August 05, 2008
    The Automated Testing Continuum - Part 2 (Unit Testing LinQ)

  • Anonymous
    October 31, 2008
    Interesting blog post about it . And some related information on Stackoverflow posts . The basic gist appears to be comments made on the ado.net blog that state the Entity Framework is the only thing getting major developer time for Visual Studio 2010