Larry's rules of software engineering, part 1: Every software engineer should know roughly what assembly language their code generates.
The first in an ongoing series (in other words, as soon as I figure out what the other rules are, I'll write more articles in the series).
This post was inspired by a comment in Raymond’s blog where a person asked “You mean you think I’m expected to know assembly language to do my job? Yech”.
My answer to that poster was basically “Well, yes, I do expect that everyone know assembly language”. If you don’t, you don’t really understand what your code is doing.
Here’s a simple quiz: How many string objects are created in the following code?
int __cdecl main(int argc, char *argv[])
{
    std::string foo, bar, baz;
    foo = bar + baz + "abc";
}
The answer? 5. Three of the strings are obvious – foo, bar, and baz. The other two are hidden in the expression foo = bar + baz + "abc".
The first of the hidden two is the temporary string object created to hold the intermediate result of bar + baz. The second is the one created to wrap the "abc" literal so it can be concatenated onto that intermediate result, producing the value assigned to foo. That one line of code generated 188 bytes of code. Now that's not a whole lot of code today, but it can add up.
I ran into this rule a long, long time ago, back in the DOS 4 days. I was working on the DOS 4 BIOS, and one of the developers who was working on the BIOS before me had defined a couple of REALLY useful macros to manage critical sections. You could say ENTER_CRITICAL_SECTION(criticalsectionvariable) and LEAVE_CRITICAL_SECTION(criticalsectionvariable) and it would do just what you wanted.
At one point, Gordon Letwin became concerned about the size of the BIOS; it was something like 20K, and he didn't understand why it would be so large. So he started looking, and he noticed these two macros. What wasn't obvious from the macro usage was that each of those macros generated about 20 or 30 bytes of code. He changed the macros to call out-of-line functions instead of expanding the code inline and saved something like 4K of code. When you're running on DOS, that's a HUGE savings. (Full disclosure: the DOS 4 BIOS was written in assembly language, so clearly I knew what assembly language I was generating. But I didn't know what assembly language the macros generated.)
Nowadays, memory pressures aren’t as critical, but it’s STILL critical that you know what your code is going to generate. This is especially true if you’re using C++, since it’s entirely possible to hide huge amounts of object code in a very small amount of source. For instance, if you have:
CComPtr<IXMLDOMDocument> document;
CComPtr<IXMLDOMNode> node;
CComPtr<IXMLDOMElement> element;
CComPtr<IXMLDOMValue> value;
How many discrete implementations of CComPtr do you have in your application? Well, the answer is that you've got 4 different implementations – and all the code associated with CComPtr gets duplicated FOUR times in your application. Now it turns out that the linker has some tricks that it can use to collapse identical implementations of methods (and it uses them starting with VC.Net), but if your code is targeting VC6, or if it's using some other C++ compiler, you can't guarantee that you won't be staring at <n> different implementations of CComPtr in your object code. CComPtr is especially horrible in this respect, since you typically need to use a LOT of interfaces in your application. As I said, from VC.Net onwards this isn't a problem – the compiler/linker collapses all those implementations into a single instance in your binary – but for many templates, this doesn't work. Consider, for example, std::vector:
std::vector<short> document;
std::vector<int> node;
std::vector<float> element;
std::vector<bool> value;
This requires that there be four separate implementations of std::vector compiled into your application, since there's no way of sharing the implementation between them: the element types differ, and thus the assembly language generated for each implementation differs. If you don't know this is going to happen, you're going to be really upset when your boss starts complaining about the working set of your application.
The other time that not knowing what’s going on under the covers hits you is when a class author accidentally hides performance problems in their class.
This kind of problem happens a LOT. I recently inherited a class that used operator overloading extensively. I started using the code, and as I usually do, I started stepping through it (to make sure that my code worked) and realized that the class implementation was calling its copy constructor extensively. Basically, it wasn't possible to use the class at all without half a dozen trips through the heap allocator. But I (as the consumer of the class) didn't realize that – I didn't realize that a simple assignment statement involved two trips through the heap manager, several calls to printf, and a string parse. The author of the class didn't know this either; it was a total surprise when I pointed it out to him, since the calls were side effects of other calls he made. But if that class had been used in a performance-critical situation, we'd have been sunk. In this case, the class worked as designed; it was just much less efficient than it had to be.
As it was, because I stepped through the assembly and looked at ALL the code that was generated, we were able to fix the class ahead of time to make it much more implementation friendly. But if we'd blindly assumed that because the code functioned correctly (and it did) everything was fine, we'd never have noticed this potential performance problem.
If the developer involved had realized what was happening with his class, he’d have never written it that way, but because he didn’t follow Larry’s rule #1, he got burned.
Comments
Anonymous
April 06, 2004
The problem with "knowing" what the assembly looks like is that your knowledge can get out of date.
Lots of Java programmers "know" that they shouldn't do String concatenation by writing x = a + b + c + d + e;, for instance -- that allocates all sorts of extra strings (as in your example above) and is just terribly inefficient.
Only, it doesn't: the compiler actually translates those concatenations into StringBuffer calls invisibly, and it all performs just fine.
So for the past umpty-ump years, I've been dealing with code that's needlessly full of StringBuffer.append() operations, when the programmers could have used simple straightforward concatenation, because they "knew" the assembly that was generated, based on their experience with, I dunno, Java 1.1 or something.
Anonymous
April 07, 2004
The problem with saying that hardware is cheap compared to programmers is that hardware costs scale linearly with your user base, while programmer costs don't. If you've got a sufficiently large user base, you'd have to have some pretty expensive programmers for this to be true.
Anonymous
April 22, 2004
Why did you make the rule exclusive to assembly language? I would extend it to: every engineer should understand at least the first layer below the one he's using. For example, .NET programmers should understand the internals of the CLR.
I dare say that the quality of an engineer may be measured by the number of layers, below the one he's working in, that he truly understands.
Anonymous
April 25, 2004
Your example serves to demonstrate that developers should, under normal circumstances, not care about low-level code or performance issues... They should just do the "right thing," where that is usually the simplest and most straightforward thing. If developers had followed this rule in Java all along, then they wouldn't have been bitten by all these stupid optimization rules that ended up being anti-optimizations when the VMs got smarter.
Trust the compiler people... trust the VM people, and then when your code is working, trust the profilers to tell you what is actually going on. People's intuition about performance issues is wrong more often than it's right. You don't know anything unless you profile it - zip, nada, nothing. You may think you're a hotshot Java programmer (we all do) but you're wrong about most of what you think about optimization. That's the real rule.
Pat Niemeyer
Author of Learning Java, O'Reilly & Associates
Anonymous
August 02, 2004
Pat Niemeyer: perhaps you've just proven Larry's point even better than he thought. You're stressing the importance of understanding optimizations, which is conceptually similar to understanding the assembly language (which itself is conceptually saying "understand how your code works").
So many times I see code written without regard to "how it works" or even "how does this get optimized." To be honest, I see a lot of cut and paste, which is a prime reason why samples should illustrate good practices.