Jaa


Foreign Keys are our friends...

I often find customers who question the value of database-enforced integrity constraints.   While people often create primary keys, the foreign ones are perhaps a bit more .... well, foreign. 

Let's start at the beginning. In traditional, old-school database design, you take your business object and you divide it up into tables. A foreign key (FK) lets you tie the rows (even in different tables) associated with one business object together and describe them as one set that should be preserved.  So, if we create an FK from Orders to Customers and from OrderDetails to Orders, we can now make sure that nobody can delete the customer row.

Foreign keys are enforced on top of the indexing mechanism in SQL Server.  So, this implies that you need indexes in order to use this mechanism.  Especially in larger data warehouses, people scrutinize each index due to the space requirements, often leaving off the definition of foreign keys in their warehouse databases.  In other cases, people might just be lazy and not bother creating them.

 Today, I'll show you a case when the foreign key definition matters to the database engine.

 The SQL Server Query Optimizer contains logic to detect that some joins are unnecessary and to remove them.  I will show you an example where this happens only when the Foreign Key is defined.  If you think about it, a foreign key enforces that there must be a parent row for any child in the relationship.  So, an equijoin join across this condition is unnecessary if you don't need any columns from the parent side of the relationship.

 There are two example queries and plans at the end that show that the resulting plans have no joins at all - they have been removed.  So, the next time someone says that foreign keys don't matter, think about that for a minute before you agree.  While I don't believe that they are appropriate everywhere, the SQL Server Optimizer does benefit from knowing about constraints in the data like foreign keys.

 

Thanks,

Conor

drop

table t1

drop

table f1

create

table t1(id int primary key not null, data binary(2000))

create

table f1(f_id int primary key identity, t_id int not null, otherdata binary(1000))

alter

table f1 add constraint fk1 FOREIGN KEY (t_id) REFERENCES t1(id)

--alter table f1 drop constraint fk1

declare

@i int=0

set

nocount on

begin

transaction

while

@i < 2000

begin

insert

into t1 (id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

set

@i+= 1

end

commit

transaction

select

f1.* from t1 inner join f1 on t1.id = f1.t_id

select

f1.* from f1 where exists (select 1 from t1 where t1.id = f1.t_id)

StmtText

-------------------------------------------------------

select f1.* from t1 inner join f1 on t1.id = f1.t_id

StmtText

----------------------------------------------------------------------------------

|--Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))

StmtText

------------------------------------------------------------------------------

select f1.* from f1 where exists (select 1 from t1 where t1.id = f1.t_id)

StmtText

----------------------------------------------------------------------------------

|--Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))

Comments

  • Anonymous
    August 07, 2008
    Molto spesso mi imbatto in strutture dati che non implementano foreign key demandando l&#39;integrità

  • Anonymous
    August 15, 2008
    Hi Connor, nice to see your blog back :) Some suggestions with all my respect, and maybe I'm wrong due to set options: select t1.id, f1.* from t1 inner join f1 on t1.id = f1.t_id table t1 is accessed, while optimizer should know that t1.id is equal to f1.t_id if you create an index for the FK column reference: create nonclustered index nci_f1_t_id on f1 (t_id) and you run a query like this (OLTP Traditional) select f1.* from t1 inner join f1 on t1.id = f1.t_id where f1.t_id = 1 the query uses properly the new NCI, but needs to access the parent table. Do you see this recommendations usefull at connect? Regards, Eladio

  • Anonymous
    August 25, 2008
    Frequently, as a trainer and consultant I have to face in our clients the benefits that customers can