Zen Faults

I've been talking about fault messages for a while now, specifically the kind that are sent as the body contents of a SOAP message. However, some of the most important faults are reported without sending a message at all. In that case, we have to intuit that a fault has occurred from whatever else we can observe. I'll cover a special case of these zen faults, faults over HTTP status codes, tomorrow. There are plenty of examples of such faults that don't involve HTTP though. How are faults transmitted without sending messages? And why do we do this?

Throughout this post I'm assuming that we're just talking about faults in the underlying system. There are plenty of ways for applications to zen fault the other side through ungraceful shutdowns or arbitrary behavior. This happens especially with Abort, which simply disposes of all network resources without bothering to clean up after itself (certain protocol channels, such as transactions, may attempt to salvage the situation, but that's another matter).

The zen fault that most everyone has probably seen is an EndpointNotFound. We sometimes have to intuit that fault based on a connection being closed or refused. Occasionally, the endpoint has legitimately gone missing. Frequently though, our intuitive guess about the fault is wrong. More likely problems are an error in configuration, the presence of firewalls, or the server simply being too busy to accept the connection in time. All of these situations can make it look like a service isn't running at all.
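To make the point concrete, here is a minimal sketch in Python of how little a client actually observes when a connection attempt fails. It is not WCF code; `probe_endpoint` and `CANDIDATE_CAUSES` are hypothetical names I made up for illustration. The function reports only the raw symptom, and the list of candidate causes is the same for every failure because the symptom alone can't tell them apart.

```python
import socket

# Causes named in the text above. Any failed probe could be any of these,
# so the sketch deliberately does not map a symptom to a single cause.
CANDIDATE_CAUSES = [
    "endpoint really is missing",
    "error in configuration (wrong address or port)",
    "firewall in the way",
    "server too busy to accept the connection in time",
]

def probe_endpoint(host, port, timeout=2.0):
    """Attempt a TCP connection and return the raw symptom, not a diagnosis."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timed out"
    except OSError:
        # Unreachable network, name resolution failure, and so on.
        return "other network error"

def possible_causes(symptom):
    """Every failure symptom leaves all candidate causes on the table."""
    return [] if symptom == "connected" else list(CANDIDATE_CAUSES)
```

A caller would treat anything other than "connected" as a prompt to check configuration, firewalls, and server load before concluding that the service isn't running.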

Another common zen fault is the result of a quota being exceeded. Quota faults often manifest themselves as "Remote connection aborted" type errors. The other side has decided that it has spent too many resources on this connection and doesn't want to spend any more on sending back a fault message. The client has little or no indication of what went wrong, as there's no way to determine why the server closed the connection. This type of fault is basically indistinguishable from an arbitrary call to Abort. To diagnose the fault successfully, you need to look at the trace logs on the server, where the quota exception is recorded. One hint that you're looking at a quota fault is the problem occurring reliably at "round" numbers, such as 60 seconds, 65536 bytes, or 10 connections.
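The "round number" hint can be turned into a quick check when you're staring at a failure threshold. This is only a sketch under my own definition of round (a power of two, or a single nonzero digit followed by zeros); `looks_like_quota_limit` is a hypothetical helper, and real quotas can of course be set to any value.

```python
def looks_like_quota_limit(value):
    """Heuristic: does this observed threshold look like a configured limit?

    Assumption for this sketch: "round" means a power of two (65536)
    or one significant digit followed by zeros (10, 60, 5000).
    """
    if value <= 0:
        return False
    # A power of two has exactly one bit set.
    if value & (value - 1) == 0:
        return True
    # One significant digit followed only by zeros, e.g. 60 or 10.
    return value >= 10 and len(str(value).rstrip("0")) == 1
```

For example, a connection that always dies at 65536 bytes or after 60 seconds passes the check, while one that dies at 61 seconds does not. A positive result is only a reason to go read the server's trace logs, not a diagnosis.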

The last type of common zen fault occurs when there's a fault prior to the caller's identity being verified (for instance, if the identity verification failed). We can return a fault if there's a back channel that will direct the message to the original sender. However, we don't want to initiate new connections, because that could send the fault to a location chosen from unverified data. There have been a lot of historical flooding attacks based on evil machine A sending a small payload to unsuspecting server B, which then sends a large error message to target machine C.
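The routing rule can be sketched as a small decision function. This is an illustration in Python rather than the actual channel code; the `Request` type, its field names, and `choose_fault_route` are all hypothetical. The branch that opens a new connection for a verified caller is my assumption about the non-zen case and isn't something the rule above spells out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Request:
    back_channel: Optional[object]   # connection the request arrived on, if still usable
    claimed_reply_to: Optional[str]  # reply address supplied by the sender
    identity_verified: bool          # has the caller's identity been verified yet?

def choose_fault_route(req: Request) -> str:
    """Decide how (or whether) to report a fault for this request."""
    # A back channel leads to whoever actually sent the message, so it is
    # safe to use even before the identity check has passed.
    if req.back_channel is not None:
        return "back_channel"
    # Assumption: once the caller is verified, the reply address can be trusted.
    if req.identity_verified and req.claimed_reply_to:
        return "new_connection"
    # Unverified data must not pick the destination: stay silent (zen fault).
    return "drop"
```

In the flooding scenario, machine A's request to server B names machine C as the reply address. With no back channel and no verified identity, the function returns "drop", so B never sends the large error message to C.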

Next time: Faults and HTTP
