Entity Framework for DBAs

There are a host of new technologies coming out, and among them are some ORM type of systems. I want to spend some time exploring how a DBA will work with these, and if they are good or bad. Given that I am on the Entity Framework team, that is the place that I want to start.

 

 

Entity Framework for DBAs

There are two new data access technologies coming out of Microsoft in the coming year that have particular interest for DBAs: the Entity Framework and LINQ to SQL. Both of these technologies are going to radically change the access patterns that you see as a DBA, and the way that you work. What is the impact of these technologies on a DBA? Are they a step in the positive direction?

 

For this post, I am going to focus on the Entity Framework, and look at some of its features and capabilities as they relate to being a DBA. If you would like an introduction to the Entity Framework as a whole, this post is a great place to start.

 

Fear and Loathing

When most DBAs first hear about LINQ and the Entity Framework, the first reaction is revulsion. In a previous life, I wore a DBA hat, so my initial reaction was not too different. Why?

 

Take a look at a LINQ to Entities statement like the following.

 

var query = from Cust in Customers

            where Cust.LastName == “Smith”

           select Cust.ID;

 

Ok, that is not so bad, and not too different from something you might see in a stored procedure. Now, take a look at this.

var query = Customers.GroupBy(c => c.Region)

    .Select(g => new

                 {

                     Region = g.Key, FreightTotal = g

                    .SelectMany(c2 => c2.Orders)

                    .Sum(o => o.Freight)

                 });

What exactly is this going to do to your database? How many table scans are going to occur? It might be a great query, or it might bring your server to its knees. What happens when you need to normalize the Customers table for performance? What is it that we gain for this uncertainty?

The Journey Forward

The journey from embedded SQL to where we are today has been long and arduous. We have passed a lot of mileposts: T-SQL, views, stored procedures, etc. With each step, DBAs have been able to exercise more control over the database, both in terms of administration and in usage, while allowing the application to develop in its own way. This has allowed productivity to increase, and systems to become less fragile.

 

The introduction of the Entity Framework furthers this tradition, and has the benefit of giving the DBA more power and control over the system.

Control

In my mind, the key attribute of a DBA is that the power of control rests with them. The DBA is the guardian of performance, security and reliability. None of these can be achieved with the ability to control what goes on; what SQL is executed, what the database structure looks like, who can run what, etc.

The above LINQ query looks a lot like something you would write in T-SQL (odd syntax notwithstanding), but actually, you are writing to the conceptual model. What does this mean? This means that you now have more control over the database!

How does this work? Now, you do not need to care how ugly that LINQ query looks, as it all works against the logical model. So, you have more control over the database, because as long as you make sure that the logical model remains consistent, you can do whatever you need to do in order to make the system work!

Model

Database models are nothing new. In fact, they are very old. In the beginning, there was EF Codd, who brought to bear the power of relational algebra. There was also Chen, who created the Entity Relational Model (ERM). From these, the basic database model was born. But like most things in computer science, a little seasoning was in order. And nothing seasons like a well chosen abstraction.

The evolution of views

First there were joins, which allowed for normalization to occur. However, as databases became larger and more complex, queries started to look like the tax code. To help, another tasty abstract layer arrives: views. Views allowed DBAs to consolidate a set of joins into a logical “table”, and, with later versions of SQL Server, even allowed for some level of updating (if you were willing to write instead-of triggers).

By abstracting the object on which you query data from the way in which you store data, views gave DBAs more control and flexibility over the structure of the database. Tables could be changed, normalized, partitioned, and as long the compensating changes were made in the view, life was good, if a little boring with all the CRUD SQL that needed to be written.

Stored Procedures – an alternative evolution

Views are not the only way to abstract the structure of a database; another common approach is to use stored procedures. This gives the DBA the ultimate in flexibility. The entire API for the database can be explicitly defined, but it is flat, and you cannot compose them, so all that powerful SQL is no longer usable by the developer.

In addition, you still need to maintain all those nasty novel sized SQL statements, but at least the developer cannot see them!

The Entity Framework

With the Entity Framework (EF) and the Entity Data Model, the DBA/Architect now has a system that has, and expands on, the benefits of both views and stored procedures. This framework currently resides outside of the database, but it can be thought of as a part of the data layer.

Now, instead of having to hand manage views (including those oh-so-fun instead of triggers), the Entity Framework will do the heavy lifting for you. The Entity Framework contains some very intelligent view mapping technology, so if you can declare the mapping, you can update it!

Now, instead of having to write thousands of stored procedures to control the API of a database, you can create a logical model that becomes the API. And when you need the benefit of stored procedures for performance or logic, you can still plug them in.

Does this mean that the Entity Framework will replace views and stored procedures? NO!

Stored procedures and views are still very powerful and useful technologies. Performance and security are two key benefits of stored procedures and views, and there is no reason to throw the baby out with the bathwater! Myself, I would not give up indexed views for the world; those things are very cool.

Before the Entity Framework, the API between the database and the application was like a pond. Any time you touched anything, the mud would rise in “clouds”. Likewise, any change to the database might affect the API, and potentially cause things to break.

Muddy API

After the Entity Framework, the API is now explicit. It can still use the same technologies and techniques on the database side, but it is not cloudy any more. It gives you the flexibility to change things, plus it allows for some cool benefits for the application developer, like ORM, LINQ, and Entity SQL.

Clean API

So the Entity Framework lets you continue to use the technologies that you love (stored procedures, views, etc) and even those you have to tolerate (dynamic SQL). Except that now you don’t need to jump through hoops to support an implicit API for accessing the database; you can now worry about more important things.

We now live in the land of milk and honey, right? Remember, there is no such thing as a free lunch. So what do you give up? Some measure of query control. But the gain in database control and flexibility far outweighs the loss of query control, in my opinion.

 

The best use of your time

We now have a way to cleanly define a logical API for developers, one that does not require weeks and weeks of CRUD development, and a long involved process with each change. That alone is worth the price of admission!

 

The Entity Framework is going to you as a DBA the ability to make better use of your time, and to spend some on making the system better, instead of changing random SQL queries all day. Like any version 1 of a product, I am sure that the Entity Framework will cause some problems for DBAs, but the upside is huge, and that is why my DBA side has come around to seeing the benefit of the Entity Framework.

 

 

What do you think? Am I full of hot air? Something else? What are you concerns as a DBA about the Entity Framework? I'd like to use this blog as a way to have a conversation with you, to help make a better product.

 

Thanks,

 

Erick Thompson

Program Manager

Microsoft

Comments

  • Anonymous
    July 31, 2007
    PingBack from http://linq.blogstogo.com/2007/07/31/entity-framework-for-dbas/

  • Anonymous
    September 21, 2007
    In most shops given the rapid application development cycles, the logical model changes quite a bit thru the project life cycle and even during support phases. In such an environment - instead of having to change the LINQ/Entity queries within the code any time a change occurs, it is better to leave the db code to a seperate database layer (encapsulated by stored procs/views/functions). The LINQ/EF model creates an unnecessary dependency and would make application development more complicated and inflexible. Extremely performant and flexible applications can and have been written using the current model using the "Muddy API"s that you mention. Hope MS is not creating a new complication/problem just in the name of coming up with innovative technologies/APIs. We have seen that already with technologies with COM+/MSMQ and buzzwords from Redmond like Windows DNA/WinFS etc.

  • Anonymous
    September 28, 2007
    Jason, The primary idea of having the conceptual model is that it shouldn't change - and when it does, it should be due to a business need. If the business need doesn't change, then the conceptial (and logical) model should remain the same. It is usually the operations on the Entities that changes quite a bit, which is expected, and not impacted by the EF. I think that the idea of the EF as a dependency isn't the correct idea, any more than the concept of an Interface is a dependency. It's a way to allow two systems to agree on a common way to communicate with each other. We have this today, it's just not centralized and nor formalized, which makes it too easy to break. And when it does break, you don't know until you get a runtime error. The issues with performance are valid - it's almost always going to be faster to write applications that lie closer to the database. This is true in all programming - machine language is faster than C, which is faster than C++, which is faster than C#, etc. However, the question you need to ask yourself is how often do you want to write the same data access code? Thanks, Erick

  • Anonymous
    November 28, 2007
    The article asked some very important questions about the LinkQ query presented : What exactly is this going to do to your database? How many table scans are going to occur? It might be a great query, or it might bring your server to its knees. What happens when you need to normalize the Customers table for performance? What is it that we gain for this uncertainty? But the article NEVER ANSWERED THE QUESTION or showed how many steps the DBA would have to go through in order to answer it.  We are just given some vague suggestion that afterward we will have more control.  My suspicion is that it if it were as easy as you suggest, the "how to" would be inlcluded in the article.  

  • Anonymous
    November 28, 2007
    Gregg, Instead of showing the exact steps, I wanted to give pointers to the different parts of the technology that could be used. A how-to guide would be very long, and would have to go into great detail. The reason why it would need to be long is that it isn't simple. Very few things are simply when dealing with large databases - they are complex beasts. A slow database query can be solved with an index, but a how-to for indexes would be a long post indeed! But the first step is knowing that an index could help - which is the intention of this post. Thanks, Erick

  • Anonymous
    December 05, 2007
    Presentations given in Irvine and Riverside, CA - here's the deck and links for more information about

  • Anonymous
    March 09, 2008
    After using Linq to SQL on a small project I was impressed.  At least on small projects it would save a lot of time by just dragging and dropping your tables into the designer and you have a data access layer; impressive.  Unfortunately, until then I did not know that Linq is very much a SQL Server product.   I had read in a few places that support for Linq to Entities would be available for other databases.  I have researched and actually tried what was developed for Oracle and it is practically useless at this time. It is now clear that, Oracle is not going to support Microsoft in this endeavor.  Oracle considers Linq as a threat (Linq depends on database intelligence rather than tuning for QEP), and Microsoft as a competitor that is having an effect on its sales.  Therefore My questions are: Can Microsoft fund the development of a driver for Oracle that supports Linq to Entities? Is Microsoft interested to see a well functioning driver for Oracle that supports Linq to Entities? Slower response time can be accepted in many of our applications; however we can not justify replacing a good RDBMS in order to use Linq.