Riffing on Raymond - Network performance...
I keep on doing this; clearly it's evidence of a lack of imagination on my part...
Raymond's post a while ago discussed some of the problems with network latency (no, I'm not going to touch that particular can of worms).
It's amazing how many people don't understand how big a deal this problem is. When I joined the Exchange team back in the mid-1990s, the perf team was spending a HUGE amount of time analyzing the Exchange store RPC traces, trying to figure out ways of squeezing every single byte out of the RPC traffic.
They'd defined compressed forms of Exchange EntryIDs, and they were considering encoding Unicode strings using some neutral encoding (UTF8 hadn't been invented at that point, so they were trying to roll their own).
I came on the team and looked at what they were doing and was astounded. They were sweating bricks trying to figure out how to squeeze out individual bytes of data from each packet.
The reality was that in the vast majority of cases, all that work didn't actually make a difference.
The reason has to do with the basic nature of Ethernet-based networking. Token ring and ATM have different characteristics, but my comments here apply to them as well; it's just that the numbers and behavior are slightly different.
In general, on any LAN it takes essentially the same time to send one byte of data as it does to send 1K of data. When you start sending more than 1K of data, the numbers will start to grow (because you're sending more than one packet), but even then, the overhead of sending 10K of data isn't significantly higher than that of sending 1K.
On the other hand, round trips will KILL your performance. So if you've got a choice between sending 100 messages with 1K in each message and 1 message with a 100K payload, you want to send the single 100K message every time.
Needless to say, I'm MASSIVELY glossing over the issues associated with sending data across a network; the above is simply a reasonable rule of thumb: the round trips are what matter, not the bytes being sent.
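As a back-of-the-envelope sketch of that rule of thumb, the Python below models total transfer time as one round trip per message plus the time the bytes spend on the wire. The function name, the 1 ms round trip, and the 100 Mbit/s link speed are illustrative assumptions, not measurements from the Exchange traces, and the model ignores packetization, windowing, and server processing time.

```python
# Toy cost model: every message waits for its reply before the next one is sent.
# ASSUMPTIONS (hypothetical numbers chosen for illustration only):
#   rtt_ms                 = 1.0     -> one round trip costs 1 ms
#   bandwidth_bytes_per_ms = 12500.0 -> a 100 Mbit/s link

def transfer_time_ms(messages, bytes_per_message,
                     rtt_ms=1.0, bandwidth_bytes_per_ms=12500.0):
    """Estimate total time in milliseconds for a chatty request/reply exchange."""
    wire_time = messages * bytes_per_message / bandwidth_bytes_per_ms
    return messages * rtt_ms + wire_time

chatty = transfer_time_ms(100, 1024)        # 100 messages of 1K each
batched = transfer_time_ms(1, 100 * 1024)   # 1 message of 100K
squeezed = transfer_time_ms(100, 922)       # same 100 messages, ~10% fewer bytes

print(f"100 x 1K messages:        {chatty:.1f} ms")
print(f"1 x 100K message:         {batched:.1f} ms")
print(f"100 x 1K, bytes squeezed: {squeezed:.1f} ms")
```

Under these assumed numbers, trimming about 10% of the bytes barely changes the total, while collapsing 100 round trips into one cuts it by more than a factor of ten.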
Now, having said all that, when you're dealing with dial-up networks, the rules are completely different. On a 9600 baud connection, it takes roughly one millisecond to send one byte, which means that every single byte counts. In the Exchange case, since Exchange was designed for corporations with wired networks, it made sense to design the client/server protocol for the LAN environment. But when you're designing a feature that's intended to be used over dial-up, the byte count matters again.
Among other things to consider, on a dial-up network the modems themselves do compression, so compressing the data before transmission isn't always a benefit: compressing already-compressed data tends to increase its size (assuming the compression algorithm's worth its salt).
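Both dial-up claims can be checked with a short Python sketch. Two assumptions are mine rather than the post's: that each byte costs 10 bits on the wire (8 data bits plus a start and a stop bit), and that zlib is a fair stand-in for the compression a modem would apply (real modems use different algorithms). The sample text and helper names are made up for the illustration.

```python
import random
import zlib

# ASSUMPTION: asynchronous serial framing, 8 data bits + 1 start bit + 1 stop bit.
BITS_PER_BYTE_ON_WIRE = 10

def ms_per_byte(bits_per_second, bits_per_byte=BITS_PER_BYTE_ON_WIRE):
    """Milliseconds needed to push one byte through a serial link."""
    return bits_per_byte / bits_per_second * 1000.0

def make_sample_text(word_count=20000, seed=0):
    """Build repeatable, compressible sample data (hypothetical payload)."""
    rng = random.Random(seed)
    words = ["store", "folder", "message", "mailbox", "entry", "server",
             "client", "packet", "reply", "request", "header", "body"]
    return " ".join(rng.choice(words) for _ in range(word_count)).encode("ascii")

text = make_sample_text()
once = zlib.compress(text)    # first pass removes the redundancy
twice = zlib.compress(once)   # second pass has nothing left to remove

print(f"9600 bps link: {ms_per_byte(9600):.2f} ms per byte")
print(f"original: {len(text)} bytes")
print(f"compressed once: {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The second pass comes out slightly larger than the first because the compressor can only add its own framing overhead to data that already looks random.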
Comments
Anonymous
May 08, 2006
[Why is this under "Programmer Hubris"? Because it's about developers who find "an easy fix" and apply...
Anonymous
May 08, 2006
The comment has been removed
Anonymous
May 08, 2006
Maybe the two of you could work out a topic to both blog about ahead of time and you could do "trading fours with Raymond" instead.
Anonymous
May 08, 2006
The comment has been removed
Anonymous
May 08, 2006
No, about 3 years before that point.
http://en.wikipedia.org/wiki/UTF-8#History
Anonymous
May 08, 2006
Also - interesting essay on latency: "It's the latency, stupid!"
http://www.stuartcheshire.org/rants/Latency.html
Anonymous
May 09, 2006
Larry,
I agree that it's amazing how many developers don't understand this.
I also think it's amazing that I've never seen this information written down in such a drop-dead, straightforward manner until now.
Thanks for taking the time to do it.
Anonymous
May 09, 2006
So yesterday, I commented on how I was glossing over lots of details about how to make a client/server...
Anonymous
May 09, 2006
Two people have written that UTF8 was invented in 1992 rather than in the mid-1980's. Probably I stand corrected.
In the mid-1980's I read a published paper about an encoding scheme. Years later when I read the description of an encoding scheme called UTF8 the scheme looked very familiar but the name was not. I assumed that it was just the same scheme and that it had been given a name. Now I must assume it wasn't the same scheme. I must have jumped to a conclusion based on inadequate recollections of the 1980's paper. I apologize.
Anonymous
July 20, 2006
> assuming the compression algorithm's worth its salt
I thought only encryption algorithm's worth was based on its salt ;).