Programming Proverbs 1: Define the problem completely

This is the first of a series of posts based on the book Programming Proverbs by Henry Ledgard. The index for the series is an earlier post and discussion of the list as a whole is taking place in the comments there. Comments on this "proverb" are of course very welcome here.

For a lot of people "define the problem completely" seems pretty obvious as the only right way to do things. To others, especially lately, there seems to be an idea that one should throw something together based on partial understanding, take it to the end user to see where things stand, and then go to the next step. This really seems like a waste of time to me. Oh, to be sure, there are often times when the end user doesn't know what they want when a project is started. Someone commented earlier that "Walking On Water and Writing a Software is easy, if the Water and Specifications are FROZEN." There is some truth to that.

Historically, software developers have lived with the idea that the specification will change and things will become, if not clearer, at least different. That often makes it difficult, if not impossible, to define the whole system before starting.

But actually all of that is both a digression and a missing of the point. For some people this proverb means understanding the whole system, and that is often, some would say always, a problem. I think looking at that large a picture misses the point. At some point one has to reach a granularity for which this proverb makes complete sense. That granularity is reached when the piece of work is what one person will code. Is that a method? A class? A module? A complete program or even a small system? At that point it is a mistake not to have the problem defined and understood completely. To do otherwise is to ensure that the code will break when it comes into contact with the expectations of the user or of other code. Code that is checked in with other code and does not work as expected means that the problem was not defined completely enough and someone started coding too early. There is no excuse for that in my opinion.

This idea of defining the problem completely has interesting ramifications for the teaching/learning environment. When a teacher defines a project for an assignment, or an exercise for a test, there is an obligation to spell out the requirements completely. If there is ambiguity, it should be there on purpose, to allow some leeway in problem solving. Students have an obligation to read and understand the definition completely before beginning to write code. Alas, they seldom do. They are a lot like some professionals I have worked with, I am afraid.

What I used to say to students was "I am much too lazy to write a lot of extra words so if they are there you'd better read them all." Assumptions are risky business, and no less so where graded work is involved. I encouraged questions, and often I re-wrote project assignments to make sure I answered those questions in the future. This was valuable both to the student, who learned to ask questions, and to me, as I learned to better specify what I wanted. We never stop learning.

What is the problem? What are your inputs? What are your outputs? What do you need to know and to do to get from the inputs to the outputs? Unless those things are defined it is much too early to write code.
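
As a rough sketch of what answering those questions might look like at the "one person codes this" level, here is a small, hypothetical assignment written in Python. The function name, the score range, and the rounding rule are all assumptions made up for illustration; the point is only that the inputs, the outputs, and the awkward cases are written down before the body is.

```python
from typing import Sequence


def average_score(scores: Sequence[float]) -> float:
    """Hypothetical assignment: compute the mean of a set of test scores.

    Problem: report a class average for one test.
    Inputs:  a non-empty sequence of numbers, each from 0 to 100 inclusive.
    Output:  the arithmetic mean, rounded to one decimal place.
    Decided up front (assumed rules, not guesses made while coding):
      - an empty sequence is an error, not an average of zero
      - a score outside 0..100 is an error, not silently clamped
    """
    if not scores:
        raise ValueError("scores must not be empty")
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range: {score}")
    # Getting from inputs to output: sum, divide, round as specified.
    return round(sum(scores) / len(scores), 1)
```

Calling `average_score([80, 90, 100])` returns `90.0`, while `average_score([])` raises `ValueError`. Whether an empty list should be an error is exactly the kind of question a student ought to ask before writing the first line.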

Comments

  • Anonymous
    February 09, 2007
    "...re-wrote project assignments to make sure I answered those questions in the future..." Here we find an admission that despite best efforts at a complete specification that refinement was necessary. This was true in your case for a problem that would be solved over and over again by students over time, a situation that rarely occurs "in the wild". The point is that even the simplest seeming specification will have points of refinement, ambiguity and room for clarification. Iterative development might seem to be a waste of time, but the fact that assignments needed to be re-written is proof positive that it is the only way things happen in the real world as assignments are the ideal instance where such iteration would be completely unnecessary if it was ever possible to be unnecessary. Of course the comment is very true at a different level. We need to understand the problem as completely as possible during implementation. In iterative development, this understanding is expressed by writing tests that prove our code does what it is supposed to do. If we don't know what the code should do, we can't write tests. Iterative development tells us we shouldn't develop code we can't test, so at this level, we must understand the problem fully.

  • Anonymous
    February 09, 2007
    The specs that needed fixing were generally rushed. If anything they taught me the need to take more time understanding what I wanted. I think the exercise was also valuable in that it taught students that they needed to be able to communicate with "customers." The important thing was to iterate during the understanding and specification phase before starting the coding phase. If, or more likely when, the design needs to change, the coding should stop until the redesign is finished.

  • Anonymous
    February 10, 2007
    I'm currently staring at a project that contains 6,831 source files (3,943 of them generated by a code generator) and has been in development for 7 years now. We still don't understand the problem completely: customers bring us fresh insights and the problem space continues to evolve. Test-driven iteration has allowed us to codify our "specification" in the form of executable tests. When the situation changes, we can change this specification, adapt our code and move forward. Traditional waterfall development would have us still discussing things with our clients, because the spec moves as fast as we can code at some points. Government regulations, industry standards, integration requirements: the real world simply won't stand still long enough to do otherwise and actually deliver a product that works.