Abstraction levels and dependencies

As a follow-up to yesterday's post about not lying, I want to share some thoughts on abstraction levels.

At first, it would appear that high-level methods lie a lot. When you ask a SqlConnection object to Open, it actually does a bunch of work: looking at the connection string, possibly using the web.config information to resolve it, parsing it, connecting to a server (with its potentially multi-step dance to do things like authentication), maybe looking at a local connection pool to reuse an existing connection, logging the connection attempt, and so on. Does "Open" really capture all of that?

Well, "Open" certainly captures the intent - open up a SQL connection. I don't think we could accuse it of lying, although perhaps some of the things it does 'under the covers' should be made available to be called explicitly. There's really no right answer, because it's a matter of communication: what context will the API user have that you can rely on? Does the user know that opening will probably involve server connectivity? Maintaining a local pool of connections? Being redirected to a configuration file?

You can think of this as a tension between encapsulation, where the component hides many of these details and presumably promises you don't need to worry about them, and transparency, where the component is clearly stating what it does and allows you to reason about its behavior in detail and have fine-grained control over its work.
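As a rough sketch of how that tension can play out in code, here is a small class that offers both styles: an Open method that captures the intent, and the individual steps exposed for callers who want finer control. Everything here is hypothetical (SketchConnection, ResolveServer, Connect and the "server-for-" naming are made up for illustration); it doesn't talk to a real server and says nothing about how SqlConnection is actually built.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical connection type. It only records the steps it performs;
// it is not a model of how SqlConnection works internally.
public class SketchConnection
{
    private readonly string _configuredName;

    public SketchConnection(string configuredName)
    {
        _configuredName = configuredName;
    }

    // Record of the steps taken, so callers can see what happened.
    public List<string> Log { get; } = new List<string>();

    public bool IsOpen { get; private set; }

    // Transparent step: resolve the configured name to a server name.
    public string ResolveServer()
    {
        Log.Add("resolve");
        return "server-for-" + _configuredName;
    }

    // Transparent step: connect to a server the caller names explicitly.
    public void Connect(string server)
    {
        Log.Add("connect:" + server);
        IsOpen = true;
    }

    // Encapsulated entry point: states the intent and hides the steps.
    public void Open()
    {
        Connect(ResolveServer());
    }
}

public static class Program
{
    public static void Main()
    {
        // Caller who only cares about the intent.
        var simple = new SketchConnection("orders");
        simple.Open();

        // Caller who wants control over each step.
        var stepByStep = new SketchConnection("orders");
        string server = stepByStep.ResolveServer();
        stepByStep.Connect(server);

        Console.WriteLine(string.Join(", ", simple.Log));
        Console.WriteLine(string.Join(", ", stepByStep.Log));
    }
}
```

Both callers end up performing the same steps; the difference is only in how much of the work each one has to know about.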

In a sense, I lied about lying. It's good to be transparent, but there's value in encapsulation as well. When you choose to encapsulate, you're providing an interface that, if done right, will be more stable than the underlying implementation. This lets the system evolve in a more loosely coupled manner, without changes rippling through.

In fact, there are cases where components cannot help but be opaque about some aspects of what they do, because their behavior is set up at runtime through dependent components. For example, let's say I have a method called WriteCustomer with the following implementation:

public static void WriteCustomer(Customer customer, TextWriter writer)
{
  // I promise not to throw any exceptions.
  writer.WriteLine("Customer name: " + customer.Name);
}

Here, the method would like to promise that exceptions won't be thrown, and that it's writing customer information to some durable medium. However, the TextWriter is given to the method, so it can't control these aspects of execution: the writer may throw an exception because it won't support encoding some specific character, or because the disk is full, or it may write out the name to a memory buffer instead of a disk.

The method and the TextWriter class have different contracts (TextWriter doesn't promise to be exception-free), and for them to keep their promises, they have to be put together in the right way. They don't lie about what they do, but they depend on being put together correctly to work.
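To make that concrete, here is a minimal sketch that calls the same WriteCustomer method with two different writers. The Customer class and DiskFullWriter are hypothetical stand-ins I made up for illustration (the failure is simulated, not a real full disk); StringWriter, TextWriter and IOException are the standard .NET types.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for the Customer type used above.
public class Customer
{
    public string Name { get; set; }
}

// Hypothetical writer that simulates a full disk by failing on every write.
public class DiskFullWriter : TextWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }

    public override void Write(char value)
    {
        throw new IOException("Simulated failure: the disk is full.");
    }
}

public static class Demo
{
    // Same body as the method above; what happens depends on the writer.
    public static void WriteCustomer(Customer customer, TextWriter writer)
    {
        writer.WriteLine("Customer name: " + customer.Name);
    }

    public static void Main()
    {
        var customer = new Customer { Name = "Ada" };

        // In-memory writer: no exception, but nothing durable either.
        var buffer = new StringWriter();
        WriteCustomer(customer, buffer);
        Console.Write(buffer.ToString());

        // Failing writer: the very same call now throws.
        try
        {
            WriteCustomer(customer, new DiskFullWriter());
        }
        catch (IOException e)
        {
            Console.WriteLine("Caught: " + e.Message);
        }
    }
}
```

Neither call changes WriteCustomer itself; whether its promises hold is decided by whoever picks the writer.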

To recap:

  • Design to the level of understanding and desired control of your users. Otherwise you'll overwhelm them with details and a cumbersome API, or provide insufficient control over what your library does. Code scenarios are your friends - what code do you expect your users to write, and how will they know to write it that way?
  • Transparency is your friend. It helps your API to be more usable by describing clearly what it does.
  • Encapsulation is your friend. It helps your API to be more usable by focusing on intent rather than implementation details, and it helps control changes through information hiding.
  • Encapsulation and transparency need not be at odds. But if they are, see the first point above to make tradeoffs.

Enjoy!

PS: Note that the examples I gave yesterday were not encapsulation, but just outright misleading the developer. Don't justify poor naming with an absurd argument about 'encapsulating the behavior behind a method called DoWork'!

PPS: I'm using transparency to mean clearly communicating the behavior of a component. The term is commonly used in a different sense, to mean that a system works the same way regardless of some detail, so the user doesn't need to be concerned about it. For example, a method call can be location transparent if it runs locally or on a remote computer without the caller having to know which case it is.