SQL Formatting - Why is this so difficult? (posted by Aaron)

Have you ever gone to modify what you thought was a simple stored procedure (one that you didn't write) needing to make a few small changes only to open up the stored procedure and find yourself scared to even touch the thing because it makes no sense whatsoever? There's nothing that discourages me more than opening up a stored procedure that someone else has written and seeing something like this:

SELECT * from emp q join address x ON q.id = x.addressid join children y on q.id = y.eid where STATUS = '3' and q.name = @name and cname = @Child

Now obviously this is a very simple example. And sure, the syntax is probably correct, and sure, it probably works. But boy is it hard to read and difficult to understand. Even though this is a very, very simply query, anyone reading it for the first time would be left with a bunch of questions with not so obvious answers. Example:
Which tables do the fields status and cname belong to?
Is the field status a foreign key to some lookup table?
Do the employee and child tables have different name fields? If so, why?

And this is just the beginning. Not only are you left with these questions, but there are a ton of other questions which might jump into your head. For example:
* Why are some tables abbreviated, some spelled out, and some plural? Is the dev trying to tell me something here?
* Why are tables aliased with non-meaningful characters?
* Does that fact that some id fields have "_" in them mean anything? 

I recently took a class taught by Brad Abrams on the importance of good API design. Brad was constantly was reminding us throughout the course that developers consuming our APIs should "fall into the pit of success". His analogy was that people usually equate success with "scaling a mountain" or something similar - but the reality is that scaling a mountain is really, really hard. You don't want devs programming against your APIs to find them cumbersome to use and/or difficult understand - rather, you want those developers to "fall into the pit of success". Meaning, it's so clear and straightforward that your API makes it easy for them to succeed.

It really hit home to me, and I think the same analogy applies to SQL programming. Let's do some simple formatting on the query above, apply a few naming conventions, and see if it doesn't make more sense.

SELECT emp.EmployeeID, emp.EmName, ch.ChName
FROM tbl_Employee emp
JOIN tbl_Address addr
ON emp.EmployeeID = addr.EmployeeID
JOIN tbl_Child ch
ON emp.EmployeeID = ch.EmployeeID
WHERE emp.StatusID = 3 \\ Active employees = 3
AND emp.EmName = @EmployeeName
AND ch.ChName = @ChildName

By adding some basic formatting and applying a few common naming conventions the query immediately becomes more readable and communicates to the user a bunch of additional information. Example:
* By prefixing all the tables with readable names (and the actual fields with table specific prefixes) I can easily see which fields go with which tables - EmName and ChName.
* By adding a simple comment on the StatusID line in the where clause I can see that StatusID is mostly likely a lookup value.
* I can easily see the relationships between the tables because the keys are all named in the same fashion and are consistent across tables.

I’ve tried a few different styles when writing SQL queries. My current preference is similar to what you see above. Anyone else out there have a SQL formatting style that they prefer?

So, obviously this is nothing new and may seem like trivial stuff. But following rules like this is so easy to do and I think really creates great programming habits. Like Brad said, it's harder to make mistakes when you're "falling into the pit of success".

Aaron

Comments