EF Feature CTP4: DbContext & Databases

Article
09/02/2010

The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.

For Code First to a New Database see https://msdn.com/data/jj193542

For Code First to an Existing Database see https://msdn.com/data/jj200620

We recently announced the release of Feature CTP4 for the ADO.Net Entity Framework (EF). CTP4 contains a preview of new features that we are considering adding to the core framework and would like community feedback on. CTP4 builds on top of the existing Entity Framework 4 (EF4) which shipped as part of .NET Framework 4.0 and Visual Studio 2010.

CTP4 contains the first preview of a set of Productivity Improvements for EF that provide a cleaner and simpler API surface designed to allow developers to accomplish the same tasks with less code and fewer concepts. The Productivity Improvement effort is described in more detail in a recent Design Blog post.

In addition to a simplified API surface the Productivity Improvements also offer a streamlined Code First experience that takes care of common boiler plate code and uses convention over configuration to reduce the amount of code needed to start performing data access. You can then gradually start to override these conventions with your own configuration as you progress through the development lifecycle and ultimately deploy to a production environment. One example of these conventions is database location and provisioning which we will cover in this post.

This post is largely targeted at using the Code First approach with DbContext to generate a database, if you are mapping to an existing database then this is covered at the end of the post.

Default Conventions

First let’s look at the default behavior of DbContext and how it uses convention rather than configuration to reduce the amount of code we need to write to get an application up and running. Below is a complete application that uses DbContext to persist and query data using DbContext. No additional code or configuration is required; DbContext will automatically create a database for us based on our domain model. The database will be created on our localhost\SQLEXPRESS instance and will be named after the fully qualified type name of our derived context (in the following example this would be PI.DbDemo.ProductCatalog).

using System;
using System.Collections.Generic;
using System.Data.Entity;

namespace PI.DbDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var context = new ProductCatalog())
            {
                // Persist Data
                var food = new Category { Name = "Food" };
                context.Categories.Add(food);
                context.SaveChanges();

                // Query Data
                foreach (var cat in context.Categories)
                {
                    System.Console.WriteLine(cat.Name);
                }
            }

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }

    public class ProductCatalog : DbContext
    {
        public DbSet<Category> Categories { get; set; }
        public DbSet<Product> Products { get; set; }
    }

    public class Category
    {
        public int CategoryId { get; set; }
        public string Name { get; set; }

        public ICollection Products { get; set; }
    }

    public class Product
    {
        public int ProductId { get; set; }
        public string Name { get; set; }
        public decimal UnitPrice { get; set; }

        public Category Category { get; set; }
    }
}

Of course this convention is useful to get up and running but it’s not going to get us all the way to deploying in a production environment, you probably aren’t going to be using a local SQL Express instance in production and if you’re an enterprise developer your DBA probably isn’t going to overjoyed at the idea of you application having permissions to create databases (with good reason). In the next sections we’ll look at how you can start to take control over the database as your requirements progress.

Connection Factories

Under the covers there is a convention that is taking the name of your context and turning it into a database connection, this is an AppDomain wide setting that can be changed via a static property; System.Data.Entity.Infrastructure.Database.DefaultConnectionFactory. Connection factories implement the System.Data.Entity.Infrastructure.IDbConnectionFactory interface which defines a single CreateConnection method. When you use the default constructor on DbContext the fully qualified name of your context is passed to the default connection factory to obtain a database connection.

public interface IDbConnectionFactory
{
DbConnection CreateConnection(string nameOrConnectionString);
}

Changing the Database Name

If you just want to change the name of the database that is generated then you can control the string that is passed to the default connection factory by using the DbContext constructor that specifies the nameOrConnectionString parameter. Here is our derived context updated to specify a database name:

public class ProductCatalog : DbContext
{
    public ProductCatalog()
        :base("DemoProductStore")
    { }

    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }
}

Changing the Database Server

If you want to have your database on another Microsoft SQL Server Instance then you can tweak the settings on the SQL Client connection factory that is included in CTP4; System.Data.Entity.Infrastructure.SqlConnectionFactory. This connection factory includes a constructor that allows us to override pieces of the final connection sting, such as username, password and server. We need to make changes to the default convention before any contexts are created in our AppDomain, in the case of our console application we can just do this at the start of the Main method:

static void Main(string[] args)
{
    Database.DefaultConnectionFactory =
         new SqlConnectionFactory("Server=MyDatabaseServer");

   ...

}

Changing to SQL Compact

Along with the SQL Client connection factory we also include the System.Data.Entity.Infrastructure.SqlCeConnectionFactory which will generate connections to SQL Compact databases. Because the SQL Compact providers aren’t backwards compatible you will need to specify the invariant name of the provider version you want to use. Currently the SQL Compact 4.0 provider is the only one that supports Code First database creation and it is available for download as a separate CTP.

static void Main(string[] args)
{
    Database.DefaultConnectionFactory =
         new SqlCeConnectionFactory("System.Data.SqlServerCe.4.0");

    ...

}

By default SQL Compact files will be created in the |DataDirectory| directory, for executables this is the same directory as the executable, for web applications this is the ‘App_Data’ directory. The SQL Compact factory also includes constructors to override the directory that databases are created in, or any other part of the connection string.

App.config/Web.config

All these conventions are great but if our database server changes between dev, test and production then we really want to be able to change it easily without having to recompile code. No problem, just add a connection string to your applications config file with a name that matches the name of your context (either fully-qualified or not). Because the shape of the model comes from your code rather than xml files this is just a plain connection string rather than an EntityClient connection string used in other areas of EF.

<configuration>
<connectionStrings>
    <add name ="ProductCatalog"
         providerName="System.Data.SqlClient"
         connectionString="Server=.\SQLEXPRESS;Database=ProductCatalog;Integrated Security=True" />
</connectionStrings>
</configuration>

Note that if you pass a string to a DbContext constructor (for the nameOrConnectionString parameter) then that string should match the name of the connection string added to your config file.

Database Initializers

So far we’ve looked at how a database is located, now let’s look at how that database is initialized. By default if the database does not exist then DbContext will automatically create a database for you without any data in it, but of course this can be changed. Database initialization is handled via a setting at the AppDomain level which can be tweaked or replaced for your context via the static System.Data.Entity.Infrastructure.Database.SetInitializer method. Initializers need to implement the System.Data.Entity.Infrastructure.IDatabaseInitializer interface, which contains a single method to initialize the database for a given context:

public interface IDatabaseInitializer<in TContext>
where TContext : DbContext
{
void InitializeDatabase(TContext context);
}

Having this very-generalized hook gives developers and third parties the opportunity to build custom solutions to help with initialization, in addition to the options provided in the box.

It is worth calling out that this is a very new feature that only just made it into CTP4 and we know it has some rough edges, but we really want your feedback so we decided to include it. As you’ll see when we look at evolving models there are some missing pieces in the CTP4 story but we feel like the database initializer hook is the right way to approach the problem and we just need to expand the abilities of the initializers and tooling that we provide.

Evolving Models

Of course it is unlikely that our domain model will remain the same forevermore after the first time we run our application. This in turn means our database schema is going to need to evolve along with our domain model, this scenario is going to highlight one of the big gaps in CTP4 and something we are working to improve. Ideally we would have a set of tools that help you migrate the existing database, either automatically or by generating migration scripts that can be tweaked and then saved for later use against test and production databases. We’re not there yet though, in CTP4 you need to drop and re-create the database with the new schema to match your domain model.

So let’s go ahead and change our domain model, I’m just adding a Description field to our Product class:

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
    public decimal UnitPrice { get; set; }
    public string Description { get; set; }

    public Category Category { get; set; }
}

Now if we run our application we’ll get an InvalidOperationException because our schema doesn’t have a column for the new Description field:

“The model backing the 'ProductCatalog' context has changed since the database was created. Either manually delete/update the database, or call Database.SetInitializer with an IDatabaseInitializer instance. For example, the RecreateDatabaseIfModelChanges strategy will automatically delete and recreate the database, and optionally seed it with new data.”

As the exception suggests there is an initialization strategy in CTP4 that will drop and re-create the database for us when the model changes. We need to register the initializer before any contexts are created in our AppDomain, in the case of our console application we can just do this at the start of the Main method. Running the application again will cause the old database to be dropped and the new database with the Description column to be created, obviously this isn’t ideal for all situations but it’s where we got to for CTP4.

static void Main(string[] args)
{
    Database.SetInitializer<ProductCatalog>(
        new RecreateDatabaseIfModelChanges<ProductCatalog>());

    ....
}

Other Strategies

In addition to the RecreateDatabaseIfModelChanges strategy CTP4 also includes two other strategies, CreateDatabaseOnlyIfNotExists and AlwaysRecreateDatabase, the names are pretty self-explanatory. CreateDatabaseOnlyIfNotExists is the default strategy. AlwaysRecreateDatabase can be useful in testing when combined with seed data, allowing you to reset the database to a clean state before each test run, this is shown in the next section.

Seed Data

All three of the strategies included in CTP4 can be derived from and have a virtual Seed method that is called after the database has been created. This allows us to insert some data into the database once it has been created, an instance of the context type is passed into the method so you can specify seed data in terms of your domain objects and use the context to persist them. Of course the underlying database connection is available from context.Database.Connection so you can always interact directly with the database if you want.

Let’s define an initializer for use in a test project that will recreate the database before each test run with a known set of test data. I’m sure some of you are cringing at the thought of running tests against an actual database all the time. It’s definitely not for everyone and you can certainly use interfaces and in-memory fakes rather than hitting the database (we created the IDbSet<T> interface for exactly this reason). This is simply another option for folks who don’t need/want to create a layer of abstraction between their application and data access layer.

public class ProductTestInitializer : AlwaysRecreateDatabase<ProductCatalog>
{
    protected override void Seed(ProductCatalog context)
    {
        var food = new Category { Name = "Food" };
        context.Categories.Add(food);

        context.Products.Add(new Product
        {
            Name = "Bread",
            UnitPrice = 2,
            Category = food
        });

        context.Products.Add(new Product
        {
            Name = "Butter",
            UnitPrice = 5,
            Category = food
        });

        context.Products.Add(new Product
        {
            Name = "Vegemite",
            UnitPrice = 4,
            Category = food
        });
    }
}

If we are using MSTest (included with Visual Studio) for our testing then we use a ‘Class Initialize’ method to register the initializer, this will run before we construct a context in any of our tests:

[TestClass]
public class MyTests
{
    [ClassInitialize]
    public void ClassInitialize(TestContext c)
    {
        Database.SetInitializer<ProductCatalog>(new ProductTestInitializer());
    }

    [TestMethod]
    public void MyTest1()
    {
    }
}

Schema Metadata

One of the challenges in database initialization is in knowing whether an existing database is compatible with the current domain model. In CTP4 we include an additional “EdmMetadata” table in generated schemas to help answer this question. The table includes some information about the model that was used to generate the database and is used internally to support the DbContext.Database.ModelMatchesDatabase method.

Of course not everyone is going to be happy with this table being included in their database so you can switch it off using the Code First ModelBuilder API, if you do turn it off then some of the initialization strategies will not be able to function correctly.

public class ProductCatalog : DbContext
{
    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.IncludeMetadataInDatabase = false;
    }
}

Customizing the Generated Schema

Not everyone is going to be happy with the database schema that is generated by default. It’s not really in the scope of this post but we’ll take a quick look at customizing the schema. Below is an example of using the Fluent API to control the table names for our two classes. The Fluent API provides many more knobs for controlling the schema including column names, splitting entities between multiple tables, inheritance patterns and many more. This is one area of the Fluent API that still has some rough edges and we are working to make the code required for mapping more succinct and intuitive for the next release.

public class ProductCatalog : DbContext
{
    public DbSet<Category> Categories { get; set; }
    public DbSet<Product> Products { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Category>()
            .MapSingleType()
            .ToTable("purchasing.CategoryLookup");

        modelBuilder.Entity<Product>()
            .MapSingleType()
            .ToTable("purchasing.ProductCatalog");
    }
}

Working with Existing Databases

Most of the content in this post has dealt with scenarios where Code First is responsible for generating the database. If you have an existing database then you don’t need database location or initialization. Here is the process to follow for using Code First against an existing database:

Define the classes that make up you domain model
Define a derived DbContext and use the Fluent API and/or DataAnnotations to tell Code First how your classes map to your database
Add a connection string to you app/web.config file that has the same name as your derived context and points to your database
Switch off the initialization functionality for your context by registering null:
Database.SetInitializer<ProductCatalog>(null);

Summary

In this post we looked at how DbContext locates a database and initializes it for Code First developers. We covered the default behavior and how to tweak it or replace it with your own behavior, we also saw how to turn off the conventions when mapping to an existing database. There are some missing pieces in the database initialization space but we would like your feedback on the hooks we have made available and the general approach we are taking.

If you need assistance with EF Feature CTP4 we have an Entity Framework Pre-Release Forum.

Rowan Miller
Program Manager
ADO.Net Entity Framework

Comments

Anonymous
September 02, 2010
Great job! Thank you. One suggestion. Identity columns are usually keys and vica versa, right? So, I would like to wright configurations something like: this.HasKey(e => e.Key).IsIdentity(); or: this.HasIdentityKey(e => e.Key); instead of: this.HasKey(e => e.Key); this.Property(e => e.Key).IsIdentity(); It is not true with complex keys, so it could be ignored or exception thrown.
Anonymous
September 02, 2010
Are there any plans to at least have a one off code generation of the model from an existing database? This would write 90% of the code for you and then let you manually tweak the remaining code. Keeping it up to date with schema changes would also be fairly trivial and so would not require further code generation from an "Update model from database" type function. Are you in a position where you could give a road map to delivery - even if its as loose as quarters? Thanks, Jason
Anonymous
September 02, 2010
I wish working with existing databases were more than an afterthought. Many serious applications will have one, or will want to create the database independently, making it equivalent. These also (I think) contain much harder problems (e.g. schema is not exactly how the EF team imagined), so I hate to see so much energy given to DB generation, and so little to difficult problems like different schemas, not to mention features like stored procedures and UDFs (in queries, please??).
Anonymous
September 04, 2010
Great work! It looks like EF evolves in a right direction.
Anonymous
September 04, 2010
Hi, Its a very nice article. Very good hoooks in the EF framework. This allows the developers to easily customize the coding. Thanks, Thani
Anonymous
September 05, 2010
Its a very nice article. Very good hoooks in the EF framework. This allows the developers to easily customize the coding.
Anonymous
September 06, 2010
I love the Seed method. Regarding your comment that "I’m sure some of you are cringing at the thought of running tests against an actual database all the time.", well, I certainly prefer to mock things that can be mocked. But there is a place and time for actually testing real operations against the database. And when that time arrives, I prefer to have a small, controlled test database against which those database tests run. This allows efficient unit testing of database classes.
Anonymous
September 06, 2010
The comment has been removed
Anonymous
September 07, 2010
The comment has been removed
Anonymous
September 08, 2010
Rowan Are you going to offer a way to make custom conventions when using MapSingleType? For example AutoMapper (automapper.codeplex.com) has a PascalCaseNamingConvention to LowerUnderscoreNamingConvention. This would be handy when mapping to an existing database whose field's naming convention is lowercase underscore.
Anonymous
September 09, 2010
Why is Database.DefaultConnectionFactory a static property? There is no reason why I shouldn't be able to create different contexts with completely different connection types. There should be a constructor which takes an IDbConnection. And why are you creating your own factory classes instead of using DbProviderFactory? It seems like every new data access technology wants to create its own set of data access abstractions instead of using the ones that are already in the BCL.
Anonymous
September 09, 2010
@Ryan We are looking at providing customizable conventions, it’s not definite that we will get them into the first RTM but they are definitely something we are working on. @David Nelson DbContext does expose a constructor that accepts a DbConnection and you can also use the app/web.config to specify a connection, using either of these approaches will bypass the default connection factory. The default connection factory functionality basically builds on DbProviderFactory, as well as creating connections it’s also a convention that fills out default values (such as server and login mechanism) and also decides on a database name based on the name of each context.
Anonymous
September 10, 2010
I got a runtime error "The provider did not return a ProviderManifestToken string" when running a simple demo just like what you do, it told me "ProviderIncompatibleException". How to resolve this issue? Sql server 2005 is the only database server on my computer.
Anonymous
September 13, 2010
We have an existing database, so for us to be able to use EF4/Codefirst, which is our preference, we need to be able to map the model to the existing database - and all that entails. Of course some of our database conventions don't always map so well to classes and property names. And of course we want to write the minimum amount of code required! Having EF generate databases or upgade scripts are of zero interest to us. We actually keep schema version information in the database, so there is limited need to know the schema is different.
Anonymous
September 16, 2010
Is there any reason why there cannot be a overload on DbContext that could take a IDbConnection instead of just DbConnection. The same really goes for the other methods - what about support for IDbModel, IModelBuilder and even a IDbContext? This would make unit testing of code that uses the EF4 code first approach much easier.
Anonymous
September 17, 2010
The comment has been removed
Anonymous
October 02, 2010
Why do you call this data access class "ProductCatalog" and not "ProductRepository" ?
Anonymous
October 11, 2010
The comment has been removed
Anonymous
December 04, 2010
@Koistya `Navin A repository is typically responsible managing a single type of entity rather than all types of entities in the model and generally restrict the queries that can be executed rather than exposing an IQueryable. Folks usually build a set of repositories over the top of a context. @Stacey Thank you for the feedback, most folks are telling us they prefer the short names for things they type regularly. For less common types that need a bit more explanation we are using more verbose names (like the database initializers etc.).
Anonymous
December 24, 2010
With normal DataContext objects there is a .log property to enable logging of the SQL queries. What is the equivalent with DbContext? Many thanks
Anonymous
January 26, 2011
Is it possible to alter the AlwaysRecreateDatabase attribute to apply option "Close existing connections"? Would make testing with seed a lot easier if you didn't have to manually drop table in SSMS clicking "Close existing connections" everytime
Anonymous
July 06, 2011
So confused... Everything is working except the Seed method is never called! I see the database get created, I am using my class which inherits from AlwaysRecreateDatabase. I have overridden the seed method but the breakpoint is never hit.
Anonymous
September 13, 2011
Thank you for the post about creating a code first sql factory this is helpful

Share via