Tripping over Missing Servers
A common complaint is that the first call on a client object takes a disproportionately long time, usually ten seconds or more, while subsequent calls are nearly instantaneous. There are many reasons why this might happen, so there's no generic resolution for this problem. Sometimes it is caused by a legitimate need to do a great deal more work than normal. For example, if the service you're talking with has been shut down, hibernated, and put away, restoring it to sufficient operation to process your request may take a noticeable amount of time.
On the other hand, the delay is sometimes caused by hard-to-discern factors that vary from machine to machine. Frequently, these hard-to-diagnose slowdowns are caused by a characteristic of distributed systems that Leslie Lamport noted a long time ago:
A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.
Several cases spring to mind, arising from differing machine configurations and sometimes just the vagaries of different pieces of software interacting. Three common ones go like this:
1. If your primary name server is down or responding very slowly, almost every connection attempt will end up failing or taking much longer than expected while waiting for a lookup response.
2. If you take a laptop between a corporate domain and home network while hibernating, it will from time to time try to chat with domain controllers that are not in your living room.
3. We use the default proxy settings for your web browser to make HTTP requests unless you tell us otherwise. This is probably what you want unless your web browser is configured to sit there trying to automatically detect the proxy server that you don't actually have.
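One way to narrow down which of these cases applies is to time each phase of a connection separately. The following is a minimal Python sketch of that idea, not the client library's own tooling: the helper names `timed`, `diagnose`, and `opener_without_proxy` are hypothetical, and the five-second timeout is an arbitrary assumption. It measures name lookup and TCP connect independently, and shows how an HTTP client built on `urllib` can be told to skip system proxy settings.

```python
import socket
import time
import urllib.request


def timed(func, *args, **kwargs):
    """Call func and return (elapsed_seconds, result, error)."""
    start = time.perf_counter()
    try:
        result, error = func(*args, **kwargs), None
    except OSError as exc:  # lookup failures and timeouts derive from OSError
        result, error = None, exc
    return time.perf_counter() - start, result, error


def diagnose(host, port, timeout=5.0):
    """Return {phase: (seconds, error)} for name lookup and TCP connect.

    Hypothetical helper: a slow "lookup" phase points at the name server,
    a slow "connect" phase points at the network path or the remote host.
    """
    report = {}
    seconds, addresses, error = timed(
        socket.getaddrinfo, host, port, type=socket.SOCK_STREAM
    )
    report["lookup"] = (seconds, error)
    if addresses:
        family, socktype, proto, _canonname, sockaddr = addresses[0]

        def connect():
            # Connect to the first resolved address only.
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)

        seconds, _result, error = timed(connect)
        report["connect"] = (seconds, error)
    return report


def opener_without_proxy():
    """Build a urllib opener that ignores system proxy settings.

    Passing an empty mapping to ProxyHandler disables proxy use.
    """
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))
```

Calling `diagnose("example.com", 80)` on an affected machine and on a healthy one, then comparing the two reports, can show whether the time is going to the lookup or to the connection itself. If requests made through `opener_without_proxy().open(url)` are fast while default requests are slow, proxy detection is a likely suspect.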
Comments
Anonymous
May 22, 2009
If you were doing queued messaging between your services, these technical problems would just disappear. Of course, you'd have to rethink your service contracts to move to a more one-way model (possibly with callback contracts), but that's actually a good thing - technical decoupling leading to greater logical decoupling. NServiceBus (http://www.nservicebus.com) is a messaging framework based on these two principles, technical and logical decoupling, and is designed to make building robust and scalable distributed systems easier by preventing you from making decisions that can get you in trouble later.
Anonymous
May 26, 2009
Hi Udi, I agree with you that more people should be thinking about queued topologies that provide decoupled messaging than do so today. I avoid calling queues a generic resolution to the problem though because replacing connected messaging with queued messaging is a semantic change to the application. Adding a queue to the system can change the boundaries for transactions and acknowledgments, as well as some of the properties of message delivery sessions. Application changes or additional protocols may be needed, which you might not always have the design freedom to introduce. That's why more people need to think about these problems up front because it is often hard to switch to a better approach after the problem is seen in deployment.