Syntax, Semantics, Micronesian cults and Novice Programmers

I've had this idea in me for a long time now that I've been struggling with getting out into the blog space. It has to do with the future of programming, declarative languages, Microsoft's language and tools strategy, pedagogic factors for novice and experienced programmers, and a bunch of other stuff. All these things are interrelated in some fairly complex ways. I've come to the realization that I simply do not have time to organize these thoughts into one enormous essay that all hangs together and makes sense. I'm going to do what blogs do best -- write a bunch of (comparatively!) short articles each exploring one aspect of this idea. If I'm redundant and prolix, so be it.

Today I want to blog a bit about novice programmers. In future essays, I'll try to tie that into some ideas about the future of pedagogic languages and languages in general.

Novice programmers reading this: I'd appreciate your feedback on whether this makes sense or it's a bunch of useless theoretical posturing.

Experienced programmers reading this: I'd appreciate your feedback on what you think are the vital concepts that you had to grasp when you were learning to program, and what you stress when you mentor new programmers.

An intern at another company wrote me recently to say "I am working on a project for an internship that has lead me to some scripting in vbscript. Basically I don't know what I am doing and I was hoping you could help." The writer then included a chunk of script and a feature request. I've gotten requests like this many times over the years; there are a lot of novice programmers who use script, for the obvious reason that we designed it to be appealing to novices.

Well, as I wrote last Thursday, there are times when you want to teach an intern to fish, and times when you want to give them a fish. I could give you the line of code that implements the feature you want. And then I could become the feature request server for every intern who doesn't know what they're doing… nope. Not going to happen. Sorry. Down that road lies cargo cult programming, and believe me, you want to avoid that road.

What's cargo cult programming? Let me digress for a moment. The idea comes from a true story, which I will briefly summarize:

During the Second World War, the Americans set up airstrips on various tiny islands in the Pacific, and planes full of supplies landed on them, guided in by ground crew waving signal sticks. After the war was over and the Americans went home, the natives did a perfectly sensible thing -- they dressed themselves up as ground traffic controllers and waved those sticks around. They confused cause and effect -- they assumed that the guys waving the sticks were the ones making the planes full of supplies appear, and that if only they could get it right, they could pull the same trick. From our perspective, we know that it's the other way around -- the guys with the sticks are there because the planes need them to land. No planes, no guys.

The cargo cultists had the unimportant surface elements right, but did not see enough of the whole picture to succeed. They understood the form but not the content. There are lots of cargo cult programmers -- programmers who understand what the code does, but not how it does it. Therefore, they cannot make meaningful changes to the program. They tend to proceed by making random changes, testing, and changing again until they manage to come up with something that works.

(Incidentally, Richard Feynman wrote a great essay on cargo cult science. Do a web search, you'll find it.)

Beginner programmers: do not go there! Programming courses for beginners often concentrate heavily on getting the syntax right. By "syntax" I mean the actual letters and numbers that make up the program, as opposed to "semantics", which is the meaning of the program. As an analogy, "syntax" is the set of grammar and spelling rules of English, "semantics" is what the sentences mean. Now, obviously, you have to learn the syntax of the language -- unsyntactic programs simply do not run. But what they don't stress in these courses is that the syntax is the easy part. The cargo cultists had the syntax -- the formal outward appearance -- of an airstrip down cold, but they sure got the semantics wrong.
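Here is a minimal sketch of that distinction. Both functions below are perfectly syntactic VBScript, but only one of them means "the average of two numbers". The names Average and BrokenAverage are made up for illustration, and WScript.Echo assumes you run the script under Windows Script Host (cscript.exe); in another host you would use MsgBox instead.

```vbscript
Option Explicit

' Intended meaning: the average of a and b.
Function Average(a, b)
    Average = (a + b) / 2
End Function

' Also valid syntax, but division binds tighter than addition,
' so this actually means "a plus half of b".
Function BrokenAverage(a, b)
    BrokenAverage = a + b / 2
End Function

WScript.Echo Average(4, 8)        ' prints 6
WScript.Echo BrokenAverage(4, 8)  ' prints 8
```

The script engine is equally happy with both versions. Only someone who knows what the program is supposed to mean can tell which one is wrong.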

To make some more analogies, it's like playing chess. Anyone can learn how the pieces legally move. Playing a game where the strategy makes sense is the hard (and interesting) part. You need to have a very clear idea of the semantics of the problem you're trying to solve, then carefully implement those semantics.

Every VBScript statement has a meaning. Understand what the meaning is. Passing the right arguments in the right order will come with practice, but getting the meaning right requires thought. You will eventually find that some programming languages have nice syntax and some have irritating syntax, but that it is largely irrelevant. It doesn't matter whether I'm writing a program in VBScript, C, Modula3 or Algol68 -- all these languages have different syntaxes, but very similar semantics. The semantics are the program.

You also need to understand and use abstraction. High-level languages like VBScript already give you a huge amount of abstraction away from the underlying hardware and make it easy to do even more abstract things.

Beginner programmers often do not understand what abstraction is. Here's a silly example. Suppose you needed for some reason to compute 1 + 2 + 3 + ... + n for some integer n. You could write a program like this:

n = InputBox("Enter an integer")

Sum = 0
For i = 1 To n
      Sum = Sum + i
Next

MsgBox Sum

Now suppose you wanted to do this calculation many times. You could replicate the middle four lines over and over again in your program, or you could abstract the lines into a named routine:

Function Sum(n)
      Dim i   ' keep the loop counter local to the function
      Sum = 0
      For i = 1 To n
            Sum = Sum + i
      Next
End Function

n = InputBox("Enter an integer")
MsgBox Sum(n)

That is convenient -- you can write up routines that make your code look cleaner because you have less duplication. But convenience is not the real power of abstraction. The power of abstraction is that the implementation is now irrelevant to the caller. One day you realize that your sum function is inefficient, and you can use Gauss's formula instead. You throw away your old implementation and replace it with the much faster:

Function Sum(n)
      Sum = n * (n + 1) / 2
End Function

The code which calls the function doesn't need to be changed. If you had not abstracted this operation away, you'd have to change all the places in your code that used the old algorithm.
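As a rough sketch of why the swap is safe, you can put the two implementations side by side and check that a caller could not tell them apart. The names SumByLoop, SumByFormula and FirstMismatch are hypothetical, introduced only for this illustration, and WScript.Echo again assumes Windows Script Host (cscript.exe).

```vbscript
Option Explicit

' The original implementation: add the numbers up one at a time.
Function SumByLoop(n)
    Dim i, total
    total = 0
    For i = 1 To n
        total = total + i
    Next
    SumByLoop = total
End Function

' The replacement implementation: Gauss's formula.
Function SumByFormula(n)
    SumByFormula = n * (n + 1) / 2
End Function

' Returns the first n in 0..upTo where the two disagree, or -1 if none.
Function FirstMismatch(upTo)
    Dim n
    FirstMismatch = -1
    For n = 0 To upTo
        If SumByLoop(n) <> SumByFormula(n) Then
            FirstMismatch = n
            Exit Function
        End If
    Next
End Function

WScript.Echo FirstMismatch(100)   ' prints -1: no caller could see a difference
```

A check like this only covers the inputs it tries, of course, but it captures the point: the caller depends on what the function means, not on how it gets there.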

A study of the history of programming languages reveals that we've been moving steadily towards languages which support more and more powerful abstractions. Machine language abstracts the electrical signals in the machine, allowing you to program with numbers. Assembly language abstracts the numbers into instructions. C abstracts the instructions into higher concepts like variables, functions and loops. C++ abstracts even further by allowing variables to refer to classes which contain both data and functions that act on the data. XAML abstracts away the notion of a class by providing a declarative syntax for object relationships.

To sum up, Eric's advice for novice programmers is:

  • Don't be a cargo cultist -- understand the meaning and purpose of every line of code before you try to change it.
  • Understand abstraction, and use it appropriately.

The rest is just practice.

Comments

  • Anonymous
    March 01, 2004
    My next piece of advice would be: Learn to use your debugger.

    I see it so often on message boards where a novice's code isn't working right, and they have just run the code, looked at the output (which was wrong), and then been stuck on what to do next. Often it's something simple like a reversed boolean test, or an uninitialized variable, stuff that would be immediately evident if they stepped through the code.

    After that, I'd reiterate the rule: Just because your code compiles, doesn't mean the code is correct. If it compiles, but doesn't produce the right output, then don't just throw up your hands and say "what's wrong? it compiles!" Use the debugger. If it compiles, that just means you matched up all the {} or begin/end pairs. It says nothing about the semantics.

  • Anonymous
    March 01, 2004
    You are so cool! I love this story!

  • Anonymous
    March 01, 2004
    I agree with Mike about the importance of knowing your debugger, but after learning to use the debugger many people, even very experienced ones, remain closely tied to it. When they find a bug, they immediately begin to debug it, stepping through lots of code just to get to the suspected piece, only to find that the problem isn't there. They continue the routine until they finally find the bug, then they fix it and feel proud of themselves.

    I think that the very important thing for novices (and not only for them) is: Learn to NOT use your debugger until you absolutely have to. Reading logs (and putting log statements in the necessary places in the first place) and stepping through the code in your mind can give you enough information to fix the bug, and it will take a tenth of the time you would need to find the problem with the debugger.

  • Anonymous
    March 01, 2004
    I have been programming since 1997, when I took Introduction to Computer Graphics in art school. I thought it would be Macromedia Director type stuff, and the school's course catalog was a little confusing. It turned out that it was a semester of C followed by a semester of C with OpenGL. I ended up liking it more than art courses.

    I think I can still remember what that level of knowledge was like, and I wish it had been made clearer to me just how hard it is to be a good programmer. I think most novice courses lie about this difficulty, in order not to intimidate students.

    I wish someone had shown me this:

    Teach Yourself Programming in Ten Years
    http://norvig.com/21-days.html

  • Anonymous
    March 01, 2004
    My math teacher always said "It's not important to get the right answer. But it's important to get the right solution". Is it the same?

  • Anonymous
    March 01, 2004
    Getting the right answer is important. Very, very important!

    But I think I see what your teacher is getting at. If I could paraphrase, I'd say that what is important is that your logic be sound. If your logic is sound then you can't help but get the right answer. Just having the right answer is not enough -- you might have guessed, or cheated, or whatever. You must have a reliable method, because that's what mathematics is all about -- coming up with methods that work EVERY TIME.

    I think your teacher is making the same point that I'm making. It's not enough to have an arithmetic technique that does the right thing. The whole point of learning mathematics is to UNDERSTAND why those techniques work, and to PROVE that they are reliable.

  • Anonymous
    March 01, 2004
    Another fabulous adventure in blog-reading. :-) A couple of thoughts. Thought one: while no one can dispute that the philosophy of "understand how it works" is right, there is also the type of programmer (let's call them the programmer-by-accident, occasionally thought of as "Mort") who actually doesn't want to become an advanced programmer; he or she has some job that needs doing. Perhaps we can call them the second-job programmer, whose first job is managing or crunching numbers or whatever. This type of programmer is probably the classic cut-n-paste programmer -- "well, I found this on the Web, and it seems to work." (A nod here to Scott Hanselman's recent comment on this practice.) Such a programmer can in fact learn in a slightly different way from what you're describing. Ok, here's a piece of code that I got working. It's not quite what I need. What if I change this? Aha. What if I change this? Oops, better change that back. Etc. You can learn a lot by dismantling things. (And by pestering your colleagues and fellow listserv members to help you get it working.)

    Thought two, which I believe is complementary to your thesis: programming language, eh. It's object models that need to be mastered. I have a friend who's learning ASP.NET. He's competent enough in C#; when he calls me for help, it's because he's flummoxed by ASP.NET or ADO.NET. Is that semantics, too?

  • Anonymous
    March 01, 2004
    Perfect analogy. I'm neither novice nor expert, but your article really spoke to me.

    I was once learning Pascal (long time ago) and at the same time assisted Fortran students - I didn't know Fortran. For the most part, it's just a different syntax for accomplishing a task, though.

  • Anonymous
    March 01, 2004
    What you need to learn first is learn to learn. I sometimes say, “A programmer who does not learn is either sleeping or dead. And I’m not actually sure about sleeping”.

    Many people ask for my help in programming matters. By “help” they usually mean “do for me”. They don’t care about learning, they want a solution, to show it to their instructor and get a passing grade. I then try to see if they can solve it on their own with a little pointing in the right direction, and if they can’t and actively resist, well, too bad for them. “Sorry, can’t help you.”

  • Anonymous
    March 01, 2004
    Cargo cult programming seems to be what I call parroting. That is, people who can only answer back with what you have already told them. They can't use what they have learnt in one programming language with another programming language.

    A lot of understanding of programming comes from being taught with the right tools. Getting the right error messages. It also helps that the teaching language reads reasonably well. Verbosity is a plus.

    That is why C/C++/Java are not good starter languages. It is the reason Ada/Eiffel/Delphi are good starting points. The latter programming languages are more productive. If you don't believe that, it is because you are unaware of how you are spending your time: You forgot to factor in the time it took you to get the program to actually work.

    Contrary to popular belief, the effectiveness of a programming language is not the inverse of the number of characters a program consists of. That is probably why Cobol is alive and well.


    greetings,

  • Anonymous
    March 01, 2004
    How about starting education with the documentation ?
    As an experienced programmer who occasionally has to (heck, wants to!) use scripting languages, the (human) language in which the scripting documentation is written consistently gets in my way when I'm looking for information on the semantics of a scriptable API, by repeatedly addressing the syntax, probably in the assumption that a) scripting languages are for beginners and b) beginners tend to forget about the syntax.

    I'm probably not making myself clear at this point, so let's look at an example : the documentation for Microsoft's scripting runtime library. Just about any reference page will do, like for example http://msdn.microsoft.com/library/en-us/script56/html/jsmthCopy.asp
    (the Copy method of the FileSystemObject).

    The method is presented like this :
    object.Copy( destination[, overwrite] );

    Below, we find that "object" is an "argument" (in a very technical sense, I suppose it is) which is (must be ?) "Always the name of a File or Folder object."

    Now is that syntax or semantics ? Does that mean I can't write :

    (new FileSystemObject()).Copy(...)

    Or :

    functionReturningAnFSO().Copy(...)

    ?

    I hope my point is clear now. I could go on for hours - in just this method's description there are several more problems such as not specifying the type of the "destination" argument (is it OK if I pass a Folder object, then ? Why not ?), but there is also the Internet SDK that insists on segregating 'properties' from 'objects' and 'collections', hopelessly confusing the cargo cultist into thinking they are fundamentally different to the parent object.

    Don't you think improving the documentation should be the first priority ?

    Cheers,
    --Jonathan

  • Anonymous
    March 02, 2004
    Thomas and Jonathan, interesting points. Thomas first:

    >No, this is him being too lazy to read the
    >documentation. Especially ADO.NET is trivial
    >as object model. ASP.NET is not really
    >complex either, but it is always an advantage
    >to know what happens there - means: for
    >ADO.NET it really helps a lot to know what
    >HTML is and how the web works (it is
    >interesting how many people working with
    >ASP.NET have no clue about how http does
    >work).

    All true (well, the "trivial" part is up for discussion). But I think you are illustrating an interesting issue: in order to be effective with ASP.NET, you need to know HTML. But suppose I don't really know HTML that well, but nonetheless I need to get this one thing working right now. In theory, I could go away and study HTML, etc. until I felt I really understood what's going on. But I don't have time! I need to get this thing done right now!

    ASP.NET + HTML is probably not the best example, but perhaps XML + XSLT is. You could spend months studying XSLT before you felt like you "understood" it, but in fact, you can get stuff working pretty rapidly by copy-and-tweak. Which leads to the kind of situation I was describing. Note that the person who is doing this IS learning XSLT, just on an incremental and JIT basis. It simply takes time to absorb all of these things, and it's not always (in fact, rarely) realistic to spend a long time studying the background on something before you start working with it.

    Thomas: interesting point. I can tell you that different types of documentation are written in different ways. The example you cite is from a reference topic (description of a class or member). That type of documentation is traditionally written tersely because reference docs are to some extent written for the already-experienced programmer who (it is theorized) does not need a bunch of background information repeated when all they wanted was the syntax of method such-and-this. (You'll note that almost all reference material for all languages -- JScript, C#, Java, T-SQL -- follows this philosophy.) The idea is that if you don't already know what an object is or whatever, you should be able to read about that in a conceptual topic, which in previous years would have been the "programmer's guide." Good reference material will cross-reference to the conceptual documentation that provides the background required to understand the reference material. A weak analogy, I suppose, is a dictionary. It defines the words, but it doesn't tell you each time what a noun or adjective or verb is; it's assumed you already know that or can look it up elsewhere.

    More germane to this discussion, reference documentation is not meant to be tutorial in nature. It's not where a beginner should be learning the semantics of the language. (Again, that is the theory.) To be clear, I think it's possible to write good and bad reference documentation. I favor verbosity myself (obviously).

  • Anonymous
    March 02, 2004
    Programming and Personas

  • Anonymous
    March 02, 2004
    mike, I think you meant to reply to my post in the 2nd part of your post.

    Judging by your comment "[reference documentation] is traditionally written tersely" it seems I totally failed to illustrate my point with that example.
    What I wanted to say is that I find that sample topic to have, if anything, too much information. Too much syntactic garbage : what's the benefit of repeating on every reference page that to call a method you need to put a dot to the right of an object value ? Particularly when instead of well-defined words such as 'instance', they use 'object name' which is totally wrong and ill-defined.

    Once you get rid of that garbage, there is ample room to specify the details that really make the difference such as the data types of the arguments.

    And I disagree that reference material for all languages (I was actually talking about API, not language, references but I see what you mean) is written in the same way. Look at the CLR or PSDK docs, which essentially target non-scripting programmers; while they have their own share of problems, they have at least the advantage of actually being reference texts, in that they precisely and concisely define a specification for an API. The scripting runtime docs just leave too much to the imagination.

    Hoping this is clearer...

  • Anonymous
    March 04, 2004
    Might as well add "sesquipedalian" while you're at it.

    There's no doubt that poor understanding of the nature of software engineering is behind many poor management decisions -- but it is also the case that software engineering is not a mature enough discipline yet to consistently give good data to management!

    That's a huge topic in itself, which I know little about.

  • Anonymous
    March 12, 2004
    Excellent post, Eric! :)

    Just thought I'd mention that we've got a good thread about learning to program going on at the moment:
    http://www.sitepoint.com/forums/showthread.php?t=156802

  • Anonymous
    March 27, 2004
    Kevin, you are right. There are different types of people in the world. I've met super-intelligent guys (actually had them tested at interviews) who simply weren't able to think in abstract terms. This is all well and normal; they are not stupid for that, just differently wired.

    Furthermore, some people learn by induction, others by deduction -- some need to see 1,000 examples before realizing how to go about it, while others have to understand the big picture first.

    <provocation>
    That being said, Eric isn't your post a cargo cult type post: talking of mechanics of teaching the trade, but not understanding the big picture of how people learn?
    </provocation>

  • Anonymous
    March 29, 2004
    > Eric isn't your post a cargo cult type post: talking of mechanics of teaching the trade, but not understanding the big picture of how people learn?

    Why do you think that? The post isn't about how people learn at all, it's about a pitfall that people fall into: confusing the form of a program with its content.

    I could just as easily have written the piece in the context of writing sonnets, not programs. Yes, you have to learn what the structure of a sonnet is before you can write one, but just because you have the structure right doesn't mean you've written a GOOD sonnet, any more than just because you've moved the pieces legally means you can win at chess.

    That people have different learning styles is undoubtedly true, but I fail to see the relevance to this particular post.

  • Anonymous
    May 05, 2004
    Don't be a

  • Anonymous
    May 22, 2004
    Debugging is at least twice as hard as programming. If your code is as clever as you can possibly make it, then by definition you're not smart enough to debug it.

    - Brian Kernighan

  • Anonymous
    May 25, 2004
    I was reading this post by Eric Lippert when I stumbled on the statement that Software Engineering is an immature discipline. This triggered the stream of consciousness below: Software Engineering is an immature discipline. Is it? What does that mean? Does it mean software engineering is a new discipline? How long...

  • Anonymous
    November 01, 2007
    My colleague Mike , in a comment in yesterday's entry , mentions "Mort". Who is this Mort guy? At Microsoft,

  • Anonymous
    December 12, 2007
    I agree with your sentiments, with an anecdote to boot. My first exposure to programming was with Microsoft QBasic. I was about 8, and had trolled Prodigy for scripts, and came across many neat looking games. I dissected them with a fine-tooth comb for several years, and got pretty good at hacking them up, adding additional menus, moving graphics around, etc. (I was never a good artist, so I left the DATA instructions alone...). I couldn't figure out why the numbers went from 0-F. I had no exposure to hex. With some guesswork, I usually managed to get what I was looking for. I didn't find out until college what a hexadecimal number was. I ran into similar problems with non-square matrix division, among other things. Throughout, I understood the syntax quite easily (even polymorphism, function overriding, and streams came simply--possibly I had a good teacher). However, without understanding the external concepts I was dealing with, I was lost. I'm a pretty well-versed programmer. No genius, but not a "Mort", as your term seems to be. I don't run into problems with languages as much as various toolkits, particularly (and unfortunately) Microsoft's .Net Framework. Microsoft could well take your advice when it comes to MSDN, in my opinion. I've found tons of information on how .Net works, but not much on the concepts that implement it: The reasons, design decisions, and descriptions that come with a thorough understanding, presented in a clear manner. In effect, I run into problems pretty regularly on my projects. More than once, I've implemented something that .Net already provides, but I never knew about. My point in this, is that "cargo cult" programming, in my experience, is not merely about understanding the semantics of your language, but a cohesive picture of the tools available to you--the semantics of your toolkit, so to speak.

  • Anonymous
    January 16, 2008
    Two additional quick notes about books: I am also pleased to announce the availability of the C# 3.0

  • Anonymous
    January 23, 2008
    Hi, I have this vbscript but I want it to make me a sandwich. Can you write the code for me? It should know what I like on my sandwiches. Thanks in advance.

  • Anonymous
    June 27, 2008
    Hi, I have this javascript but I want it to make me a milkshake. Can you write the code for me? It should know what I like on my sandwiches. Thanks in advance.

  • Anonymous
    August 04, 2009
    Hey Eric, you've written (in the comment above) that people don't need to have formal background or degrees to get a programming job. Is this true in organizations like Microsoft?

  • Anonymous
    February 27, 2010
    Excellent post Eric. I tried to teach myself programming, and many of the resources I found trained me to be a cargo cult programmer. By now I've developed a better approach and it has been working much better so far. The three things that have helped me the most so far are the following:

  1. Try to understand how the code works and why, not just 'make it work'. That is, try not to be a cargo cult programmer.
  2. Read programming blogs and code that others have written.
  3. Try to develop my own applications to learn from, and get feedback from more experienced programmers. Then, refactor my code and make it better.

  • Anonymous
    July 16, 2012
    Thanks, Eric, for this blog. You posted it 8 years ago, but it's still true today.