Larry's Rules of software engineering, Part 4 - Writing servers is easy, writing clients is HARD.

Over the past 20 years or so, I've written both (I wrote the first NT networking client and I wrote the IMAP and POP3 servers for Microsoft Exchange), so I think I can state this with some authority.  I want to be clear - it's NOT easy to write a server - especially a high performance server.  But it's a heck of a lot easier to write a server than it is to write a client.

 

Way back when I joined the NT project (back in 1989ish), my job was to write the network file system (redirector) for NT 3.1.

Before that work item was assigned to me, it was on the plate of one of the senior developers on the team.  The server was assigned to another senior developer.

When I first looked at the schedules, I was surprised.  The development schedule for both the server AND the client was estimated to be about 6 months of work.

Now I've got the utmost respect for the senior developers involved.  I truly do.  And the schedule for the server was probably pretty close to being correct.

But the client numbers were off.  Way off.  Not quite an order of magnitude off, but close.

You see, the senior developer who had done the scheduling had (IMHO) forgotten one of the cardinal rules of software engineering:

Writing servers is easy, writing clients is hard.

If you think about it for a while, it actually makes sense.  When you're writing a server, the work involved is just to ensure that you implement the semantics in the specification - that you issue correct responses for the correct inputs.

But when you write a client, you need to interoperate with a whole host of servers.  Each of which was implemented to ensure that it implements the semantics in the specification.

But the thing is, the vast majority of protocol specifications out there don't fully describe the semantics of the protocol.  There are almost always implementation specifics that leak through the protocol abstraction.  And that's what makes the life of a client author so much fun. 

These leaks can be things like the UW IMAP server not allowing more than one connection to SELECT a mailbox at a time when the mailbox was in the MBOX format.  This is a totally reasonable architectural restriction (the MBOX file format doesn't allow the server to support multiple clients simultaneously connecting to the mailbox), and the IMAP protocol is silent on this (this is not quite true: there are several follow-on RFCs that clarify this behavior).  So when you're dealing with an IMAP server, you need to be careful to only ever use a single TCP connection (or to ensure that you never SELECT the same mailbox on more than one TCP connection).
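One way a client might enforce that last rule is a small registry that records which connection currently has each mailbox selected.  This is only a sketch: SelectRegistry, MailboxBusy, and the integer connection ids are hypothetical names, not part of any IMAP library, and a real client would also have to release a mailbox when its connection drops.

```python
import threading


class MailboxBusy(Exception):
    """Raised when another connection already has the mailbox selected."""


class SelectRegistry:
    """Hypothetical client-side guard: at most one connection per mailbox."""

    def __init__(self):
        self._owner = {}               # mailbox name -> connection id
        self._lock = threading.Lock()  # connections may live on different threads

    def acquire(self, mailbox, conn_id):
        """Call before sending SELECT; re-selecting on the same connection is fine."""
        with self._lock:
            owner = self._owner.get(mailbox)
            if owner is not None and owner != conn_id:
                raise MailboxBusy(f"{mailbox} is selected on connection {owner}")
            self._owner[mailbox] = conn_id

    def release(self, mailbox, conn_id):
        """Call once the mailbox is closed; only the owning connection can release."""
        with self._lock:
            if self._owner.get(mailbox) == conn_id:
                del self._owner[mailbox]
```

The price of this guard is that a second view of the same mailbox has to share the first connection or wait for it, which is exactly the kind of design compromise a server quirk forces on the client.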

They can be more subtle.  For example, the base HTML specification doesn't really allow for accurate placement of elements.  But web site authors often really want to be able to place their visual elements exactly.  Some author figured out that by inserting certain elements in a particular order, they could get their web site laid out the way they wanted.  Unfortunately, they were depending on an ambiguity in the HTML protocol (and yes, HTML is a protocol), and that ambiguity was resolved one way by one particular browser.

But every other browser had to deal with that ambiguity in the same way as the first browser if it wanted to render the web site properly.  It's all well and good to say to the web site author "Fix your darned code", but the reality is that it doesn't work.  The web site author might not give a hoot about whether the site looks good in your browser.  As long as it looks good in the browser that's listed on the site, they're happy campers.

The server (in this case the web site author) simply pushes the problem onto the client.  It's easier - if the client wants to render the site correctly, it needs to be ambiguity-for-ambiguity compatible with the existing browser.

Ambiguity is a huge part of what makes writing clients so much fun.  In fact, I'm willing to bet that every single client for every single network protocol implemented by more than one vendor has had to make compromises in design forced by ambiguities in the design of the protocol (this may not be true for protocols like DCE RPC where the specification is so carefully written, but it's certainly true for most other protocols).  Even a well-specified protocol like IMAP has had 114 clarifications made to the protocol between RFC 2060 and RFC 3501 (the two most recent versions of the protocol).  Not all the clarifications were to resolve ambiguities (some resolved spelling errors and typos), but the majority of them were to deal with ambiguities.

Clients also have to deal with multiple versions of a protocol.  For CIFS clients, the client needs to be able to understand how to talk to at least 7 different versions of the protocol, and it needs to be able to implement its host OS semantics on every one of those versions.  For the original NT 3.1 redirector, more than 3/4ths of the specification for the redirector was taken up with how each and every single Win32 API would be implemented against various versions of the server.  And each and every one of those needed specific code paths (and test cases) in the client.  For the server, each of the protocol dialects was essentially the same - you needed to know how to implement the semantics of the protocol on the server's OS.
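To make the per-dialect code paths concrete, here is a minimal sketch of a client that negotiates a dialect and then has to shape a single host-OS read to fit it.  Everything in it is an assumption for illustration: the dialect names, the size limits, and the negotiate and plan_read helpers are made up and are not the real CIFS dialects or the NT redirector's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialect:
    name: str
    max_read: int  # largest single read the dialect allows (made-up numbers)


# Hypothetical dialect table, oldest first.
DIALECTS = [
    Dialect("OLD-1", 512),
    Dialect("MID-2", 4096),
    Dialect("NEW-3", 65536),
]


def negotiate(server_supports):
    """Pick the newest dialect that both sides understand."""
    for dialect in reversed(DIALECTS):
        if dialect.name in server_supports:
            return dialect
    raise ConnectionError("no common dialect")


def plan_read(dialect, offset, length):
    """Split one host-OS read into (offset, size) requests the dialect permits."""
    requests = []
    end = offset + length
    while offset < end:
        chunk = min(dialect.max_read, end - offset)
        requests.append((offset, chunk))
        offset += chunk
    return requests
```

With these made-up limits, a 1200-byte read is one request against "NEW-3" and three against "OLD-1".  Multiply that by every API and every dialect and the client's test matrix grows quickly, while the server only has to answer each request it is sent.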

For the client, on the other hand, you had to pick and choose which of the protocol elements was most appropriate given the circumstances.  As a simple example, for the IMAP protocol, clients have two different access mechanisms - you can access the messages in a mailbox by UID or by sequence number.  UIDs have some interesting semantics (especially if the client's going to access the mailbox offline), but sequence numbers have different semantics.  The design of the client heavily depends on this choice - there are things you can't do if you use UIDs but there's a different set of things you can't do if you use sequence numbers.  It's a really tough design decision that will quite literally reflect the quality of your client - is your IMAP client nothing more than a POP3 client on steroids, or does it fully take advantage of the protocol?  Another decision made by clients: Do they fetch the full RFC 2822 header from the server and parse it on the client, or do they fetch only the elements of the header that they're going to display?
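A toy model shows why the UID versus sequence number choice shapes the client.  In IMAP, sequence numbers are positions in the mailbox and get renumbered when a message is expunged, while a UID stays attached to its message.  The ToyMailbox class below is a hypothetical in-memory stand-in, not a real IMAP client or server, and it ignores details such as UIDVALIDITY.

```python
class ToyMailbox:
    """Toy in-memory model of a mailbox; not a real IMAP client or server."""

    def __init__(self, subjects):
        # UIDs are assigned once, in ascending order, and never reused.
        self._messages = [(uid, subject) for uid, subject in enumerate(subjects, start=1)]

    def fetch_by_seq(self, seq):
        """Sequence numbers are just 1-based positions in the current mailbox."""
        return self._messages[seq - 1][1]

    def fetch_by_uid(self, uid):
        """A UID keeps pointing at the same message for as long as it exists."""
        for message_uid, subject in self._messages:
            if message_uid == uid:
                return subject
        raise KeyError(uid)

    def expunge_seq(self, seq):
        """Removing a message renumbers every later sequence number."""
        del self._messages[seq - 1]


mailbox = ToyMailbox(["invoice", "lunch?", "build broke"])
cached_seq, cached_uid = 2, 2               # both currently refer to "lunch?"
mailbox.expunge_seq(1)                      # "invoice" is deleted
stale = mailbox.fetch_by_seq(cached_seq)    # position 2 now holds a different message
stable = mailbox.fetch_by_uid(cached_uid)   # still the message that was cached
```

A client that caches sequence numbers has to track every expunge to keep its cache honest, while a client that caches UIDs can reconnect later and still find its messages, which is why the choice matters so much for offline use.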

So when you're thinking about writing networking software, just remember the rule:

Writing servers is easy, writing clients is hard.

You'll be happy you did.

Comments

  • Anonymous
    May 23, 2005
    I'd say "Writing something that doesn't have to interact with previous implementations of the other side is easy, writing something that does is hard". I'm sure writing IIS, OWA etc to be bug-compatible with old browsers (Netscape 4.x, that tiny non-upgradable browser in 5-year-old PalmPilots, etc) is just as tricky.
  • Anonymous
    May 23, 2005
    Like the previous commenter, I don't see where the client/server split is in this. In general, both clients and servers have to operate with a variety of interpretations of the protocols they nominally speak.

    Clients are only harder to write than servers if servers are generally less accurate in their interpretation of specifications than clients are. Which would, of itself, imply that servers must be harder to get right than clients, which is why they're more likely to get it wrong.

    QED, not.
  • Anonymous
    May 23, 2005
    Jonathan and Will,
    You're missing half the point - regardless of leaky abstraction issues, clients have to make decisions of what protocol elements to use against what servers. Servers don't. All the server has to do is to figure out how to implement the protocol elements on the OS on the server. Now this may not be a trivial problem, but it's containable.

    On the other hand, clients need to deal with differences in server implementations AND in protocol differences. A really simple example: A client needs to choose between using HTTP 1.0 and HTTP 1.1 when it interacts with an HTTP server. The semantics of each protocol are subtly different, and the client has to support both. IMAP clients need to know if they're going to support IMAP2bis servers or just IMAP4rev1 servers. And they need to have code to handle both of these cases. CIFS clients need to determine which of the seven different CIFS variants they're going to support, and how to implement platform-specific features against servers that don't necessarily support those platforms.

    You're right that servers have to deal with client variation - the Exchange IMAP server has code in it to deal with a buggy IMAP client from a 3rd party vendor. The client was in clear violation of the protocol, but we changed the server in the name of compatibility.

    But the SERVER was orders of magnitude easier to write than the client. Even though the server's semantics weren't quite those of IMAP, we were able to do that work in relatively little time. But the IMAP client (especially a quality IMAP client) took far longer to write (and get correct).
  • Anonymous
    May 23, 2005
    I'm afraid I'm still missing it. I don't doubt that you can find specific examples where the rules of a protocol make it easier at one end than the other, but I still don't see that this is a general fact of life.

    I don't even follow the HTTP example - surely a decent webserver needs to support HTTP1.1 for performance AND 1.0/0.9 for older clients. Meanwhile, a client could merely support 1.0 if it chose to be simpler? Of course, in real life they probably both need to support both to be considered 'good', but I still don't think we're proving an asymmetry here.
  • Anonymous
    May 23, 2005
    Larry,

    this, btw, also is true for windows clients, not only network clients... :-) [and yes, i did write my share of networking clients and servers.]

    WM_CHEERS
    thomas woelfer
  • Anonymous
    May 23, 2005
    I was explaining to someone the other day why I have such a great respect for shell devs. I wish I'd had this client/server metaphor in my repertoire. It makes a more compelling argument than my "they have to understand everyone else's stuff and still know all about UI".
  • Anonymous
    May 23, 2005
    Nice piece of advice :)

    I may be naive, but I have one question that begs to be asked:

    In my ideal world (no, I haven't implemented clients and servers for a living... yet :)), protocol implementations are simply deterministic finite automata. Just like parsers are, for instance. The natural question is: For describing programming languages we have de facto standards like the EBNF notation; why isn't there a standard for describing network protocols? This would simply throw uncertainty out of the window...
  • Anonymous
    May 23, 2005
    Yes. Finite state machines and Petri nets are often used to model protocols.

    An FSM can produce a highly detailed and unambiguous design for a protocol, if properly written and interpreted. Tanenbaum covers the topic quite nicely (Computer Networks, ISBN 0-13-394248-1).

    I've also read a little bit about languages that can be used to describe protocols, but I'm not familiar with any.
  • Anonymous
    May 23, 2005
    Addition:

    Formal specifications are excellent at describing syntax and deterministic operations, but terrible at describing semantics. BNF, for example, does not describe any semantics. Just take a look at the Algol68 specification - formally defined, and impossible to implement.

    It's pretty difficult to formally define the semantics of anything nontrivial.
  • Anonymous
    May 23, 2005
    I suppose, a server says "This is how things are, deal with it!" (Which is easy.)
    And the client has to deal with it, and all the other types of "it" that come from different servers. (Which is hard.)
  • Anonymous
    May 24, 2005
    Charlie, that's a good way of putting it.
  • Anonymous
    May 24, 2005
    Abstract State Machines can be used to model protocols. See [1] for an example.

    However, that doesn't help in the case Larry mentioned about choosing which Read-version to use when communicating with the file server.

    [1] U. Glässer, Y. Gurevich, and M. Veanes, “Universal Plug and Play Machine Models,” Foundations of Software Engineering, Technical Report MSR-TR-2001-59, Microsoft Research, Redmond, June 2001 [ONLINE] http://research.microsoft.com/research/pubs/view.aspx?type=Technical%20Report&id=465
  • Anonymous
    May 24, 2005
    Nope, I'm afraid I remain unconvinced. I don't dispute the specific examples (except HTTP), because I don't know any of their details, but I just don't agree that this is a general rule.

    A counter example might be something like NNTP, where there are at least two common ways of requesting new news with most clients only using one or the other while servers have to support both.

    Or is it that you're saying that the hard bit is not the implementation of the two features, but the act of deciding which one to implement?

    I don't think that's a general rule either, but I'll concede it's a different argument.
  • Anonymous
    May 24, 2005
    Will,
    That's exactly my point. Both clients AND servers have to deal with leaky abstraction issues.

    But clients have the additional burden of having to choose how to implement UI functionality within the context of the protocol. They also need to deal with things like users. You see, a computer couldn't care less if retrieving a packet from the network took 45 seconds. But a user would have rebooted their computer in disgust (I know, I certainly got enough complaints about the RPC timeouts when I owned the Exchange MAPI client code). So clients need to add in logic to abortively disconnect from the server (or at least to detect that the server's still processing the request).

    Servers don't care about that kind of thing - by their nature, if it takes 5 minutes to read the data from the disk, they assume that the client will happily wait for the data.
  • Anonymous
    May 24, 2005
    I think that Larry's principle is too focussed.

    V1.0 is easy, interoperating with everyone else's V1.0 is hard.

    Formal specifications sound attractive but it's usually only in hindsight that it's obvious that the product/protocol will succeed to such a great extent that it was worth the formal analysis and specification.
  • Anonymous
    May 24, 2005
    Larry,

    Like other posters I agree that there are some cases where a client is more difficult to write than the server. However I fail to see how we can generalize it.

    I would say many factors dictate if this principle holds. If the circumstances are such that the server is only required to interoperate with one client, whereas the client is required to interop with many servers, then yes, the clients will be difficult to write. Also, it becomes a strategic decision - when a company wants to get its implementation out in the market, it might choose to write a client that interops with as many servers as possible.

    I guess I might take it so far as to say that writing the receiving side of client/server transactions is difficult, while writing the send side is easier (relatively). One of the important principles of protocol engineering is "Be lenient in what you receive, be strict in what you send". So, ability to receive protocol from many client implementations drives complexity into the implementation to account for all bugs etc.

    It also depends on the protocol in question and the application. In some cases it might be easy to write client/servers whereas in others there might be an asymmetry. I have seen a case where the burden of accounting for multiple clients caused a team to make the wrong decision for the server, and thus drove complexity and buggy workarounds for a lot of interoperating client products.
  • Anonymous
    May 24, 2005
    Larry, this is going to be my last post on the subject, because we're not going to agree!

    I don't see that implementing two features in a server is easier than implementing one in a client (and especially when the client implementer can choose the easiest one).

    Perhaps if we rephrase this as 'writing software with a user-interface is harder than writing software without a user-interface', then we can find some common ground.

    Cheers,

    Will