Distilling some ideas for keeping that data access code (and test code) tidy...
I just spent a while doing some refactoring. It turned out to be a very long while, probably much more than I would have originally estimated. (Part of the reason for that is builds and tests that take a long time...) The changes I made were somewhat haphazard at first, and I wasn't sure why I thought they were good ideas, but now I'm starting to feel like some of them are definitely good ideas, in fact, maybe there are patterns or lessons to distil from the changes.
Principle #1 - Keep serialization knowledge where it belongs - in your data access layer!
Problem manifestation #1: we were using Entity Framework with string fields and effectively storing enumeration values as strings. What this tends to lead to is little bits of smelly code everywhere like the following:
deployment.Status = DeploymentStatus.Active.ToString();
if (String.Equals(deployment.Status, DeploymentStatus.Suspended.ToString());
DeploymentStatus status = (DeploymentStatus)Enum.Parse(typeof(DeploymentStatus), deployment.Status, true);
switch (status) {...}
Problem manifestation #2: we were also using Entity Framework to store json serialized dictionaries of settings....
Solution:
Fix the model. Make the code model really have an enum DeploymentStatus property, and a Dictionary<string,string> Settings property. Change the data abstraction layer to be what knows how to serialize that to and from a string in the database column, and then hurray! You can remove those ToStrings() and Parses() from code everywhere.
Principle #2 - Query objects can be testability goodness!
Problem manifestation: (Basically we were half-way there, but realizing none of the benefits, leading to brittle mocking code.)
Again with the Entity Framework, we had a bunch of smell test code which would be (Test code before:)
var resourceSpec = new ResourceSpecification(resourceName, otherQueryParameters);
var fakeStore = A.Fake<DeploymentStore>();
A.CallTo(() => fakeStore.GetDeployment(resourceSpec)).Returns(new Deployment { ... });
For all our methods which needed to do some action on an existing deployment, they would take either a ResourceSpecification or a bunch of string parameters, then look up the deployment, then do some stuff with it. Testing them was a PITA because you are constantly jumping through the hoop of 'how do I mock out the retrieval of the actual Deployment resource from the database'?
Code before:
var deployment = _context.GetDeployment(resourceSpec);
Code after:
var deployment = resourceSpec.GetDeployment(_context);
Test code after:
var resourceSpec = new MockResourceSpecification(new Deployment { });
Principle #3 - Resolve Queries for objects up-front at the input validation layer, then the actual work can be done
This may remind you of the 'query command separation' principle...
Code pattern to avoid:
function DoSomethingWithDeployment(ResourceSpecification r, blah, blah, blah)
{
...
CallOtherFunction(r, blah, blah, blah)
...
}
function CallOtherFunction(ResourceSpecification r, blah, blah, blah)
{
Deployment d = r.TryGetDeployment();
if (d == null) throw new DeploymentDoesn'tExistException();
}
Code pattern to use:
function DoSomethingWithDeployment(ResourceSpecification r, blah, blah, blah)
{
Deployment d = r.GetDeploymentOrBust(); //throws an exception if deployment doesn't exist
...
CallOtherFunction(d, blah, blah, blah);
...
}
Why is the second code so much better than the first? Because it's now much easier reasoning about and testing the code path for deployment doesn't exist.
-validation code is all right there in the function taking user input, which is where you expect that you have to write validation oriented test cases.
-contract between your functions got clearer and simpler
-the code path it is now one less function deep, so now less test cases for the deeper function!