What’s different about testing open source projects?

Anyone who has read my bio or knows me personally knows that open source is not a new thing for me. For about 25 years, I worked in the proprietary UNIX, FreeBSD, and Linux world. We were using open source software before it was called open source. If you’ve read Eric Raymond’s essay “A Brief History of Hackerdom”, you know my background even if you don’t know me! I started programming in 1967 on so-called mainframes and switched to DEC minicomputers in the mid-1970s. At the time, I worked at Western Electric’s Engineering Research Center. Bell Labs was nearby, so many of us learned C programming in lectures by Kernighan and Ritchie themselves. I started working on UNIX systems in 1978 or ’79 and made my first changes to the OS about six months later. Yes, even way back in the dark ages, you could get the source code for UNIX and tweak it if you needed to do so. From that point until 2003, I used UNIX systems exclusively. (I consider Linux a UNIX system since it is POSIX compliant.)

 

Recalling that early exposure to C reminds me that what Kernighan and Ritchie did was similar to what happened later in UNIX and the rest of the Open Source world. Like all “real programmers” of the era, they programmed in assembly language. The procedural languages of the day were fairly useless for writing infrastructure software. Kernighan and Ritchie saw the need for a language for infrastructure software and developed one. They were often asked “Why is it called C?” If I recall correctly, their answer was quite simple. They referred to their first attempt much as I just did: “A language for infrastructure software”. When that first prototype didn’t work out, they decided to call the second attempt the “B language”. C was their third attempt. Like the Open Source world today, they saw a need and did something about it. Their work became the basis for others to adopt, adapt, and expand the relevance of C.

 

Since my background is in the UNIX world, it should be obvious that I did not come to Microsoft because of my expertise with Microsoft products. I had never used any Microsoft product until I came to work here four years ago. I am part of a team that produces open source software, and my role at Microsoft is to help figure out how corporate teams can work in an open source world. That means evangelizing open source within Microsoft and exploring how our processes need to change. Lately, I’ve been thinking about what the differences are between testing closed source and open source software.

 

So, what is different about testing open source projects? I came to the conclusion that the answer is "Not much!"

 

The corporate models for software life cycle and for software development/testing are fairly well known. The idea for a project is conceived and an individual or a small team defines the functional requirements. Then, the software is designed by a system engineering team or the development team. The software test team defines and plans their work based on the functional requirements and the design documentation. Everybody schedules their work and things proceed more or less according to plan.

 

Although there are more similarities than differences, an open source project cannot function quite the same way. There are two open source models:

1. A corporate team that develops and maintains open source software but accepts community contributions.

2. A group of volunteers with a project owner and a small core group in charge of putting together software releases. This is the common structure of many open source projects that are not run by a corporate team. For lack of a better name, we will call it the Community Based Open Source model.

 

First, I thought about an open source project that has a corporate team maintaining it because that is our team’s situation. Such a project can be handled in much the same way as in the traditional corporate model. The primary difference is that the team has to commit time and effort to reviewing and testing community contributions to the project.

 

Because the level of community contributions varies, it can be difficult for the corporate team to schedule its work related to community contributions accurately if the team uses traditional development methods. However, our team has had some success limiting the volatility. We adopted the Scrum agile project management technique, which uses short work cycles called Sprints (typically 30 days each). The spirit of such techniques allows our team to adapt quickly when the requirements change or when the requirements are not fully known when we begin a project. Contributions received in one sprint are evaluated and, if a contribution is accepted, the team’s related work is either done immediately, if it takes priority over something we are currently developing, or scheduled for a future sprint if it is less urgent. With this approach, the workload that we cannot predict is reduced to receiving contributions and doing a superficial review to estimate how much work they cause for the team and to determine what value they add to the project.
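
As a rough illustration of the triage described above, here is a minimal Python sketch. It is not a tool our team uses; the names Contribution, SprintPlan, and triage, and the idea of a single numeric priority, are assumptions made up for this example. It only shows the decision: decline, do the work now, or schedule it for a later sprint.

```python
from dataclasses import dataclass, field


@dataclass
class Contribution:
    """A community contribution after the superficial review (hypothetical model)."""
    title: str
    accepted: bool         # did the review find enough value to accept it?
    priority: int          # higher means more urgent (assumed scale)
    estimated_days: float  # rough effort estimate produced by the review


@dataclass
class SprintPlan:
    """Hypothetical container for the team's planned work."""
    current_priority: int  # priority of what the team is developing right now
    current_sprint: list = field(default_factory=list)
    future_sprints: list = field(default_factory=list)
    declined: list = field(default_factory=list)


def triage(plan: SprintPlan, contribution: Contribution) -> str:
    """Place a reviewed contribution in the plan and report where it went."""
    if not contribution.accepted:
        plan.declined.append(contribution)
        return "declined"
    # Accepted work is done immediately only if it outranks the current work.
    if contribution.priority > plan.current_priority:
        plan.current_sprint.append(contribution)
        return "current sprint"
    # Everything else waits for a future sprint.
    plan.future_sprints.append(contribution)
    return "future sprint"


plan = SprintPlan(current_priority=5)
print(triage(plan, Contribution("Fix crash on load", True, 8, 2.0)))
print(triage(plan, Contribution("New color theme", True, 2, 4.0)))
print(triage(plan, Contribution("Unrelated rewrite", False, 1, 30.0)))
```

The only work that cannot be planned ahead in this sketch is the review that fills in accepted, priority, and estimated_days. Everything after that fits into the normal sprint schedule.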

 

Then, I thought about the Community Based Open Source model. Everyone is usually a volunteer who works on the project in their free time. Corporate managers often ask how such projects can succeed with primarily volunteer efforts. The answer is simple: The volunteers are interested in the project and they are usually highly skilled. These are people I have always called “real programmers”. Eric Raymond calls them “hackers” in his essays. When such people encounter a problem, they can report the bug quite accurately. If they choose to debug it, in addition to pointing out exactly what the problem is, they will also often provide a fix. The “many eyes” or “many hands” factor is one of the primary reasons many open source projects are of such high quality.

 

Occasionally, there may be people who work on an open source project as part of their job because their company uses the software. However, that does not change the overall dynamic of the project. Relying on volunteers to develop and test software differs in several ways from a corporate team’s situation.

 

In Community Based Open Source projects, the idea, requirements, and design often come from one person who becomes the project’s owner. The initial version of the software is usually developed solely by the project owner. In some cases, there may be a small group of developers involved in the initial version. The primary difference from the corporate model, so far, is that a corporate designer/developer does not usually manage the project.

 

When the initial version of a Community Based Open Source project is ready, the project goes into open source mode. I do not know of any project that began its development in open source mode. So, early stages of the Community Based Open Source model are very similar to what our team has been doing. We conceive and develop the initial version of a project and release it to a shared source site (www.CodePlex.com).

 

In the Community Based Open Source world, the original project owner will, sooner or later, stop participating in the project. Similarly, our very small team cannot stay involved in a large number of projects indefinitely. So, like the project owner in the Community Based Open Source world, we have to take steps to ensure that projects will not be “orphaned”. We have to make it possible for someone else to take over project ownership. Of course, the most important step has already been taken: releasing the code. If a project is orphaned in the open source world, the user community has the code base and can continue using and supporting the project themselves as long as they feel it is useful. When a closed source team stops supporting a project, its users are simply out of luck.

 

Although our team is primarily developing open source tools related to Visual Studio, we have taken some cues from the Community Based Open Source world. Community Based Open Source projects rely on freeware for development and testing tools. Volunteers are unlikely to spend their own money buying proprietary software solely because they want to contribute to a Community Based Open Source project. Using freeware is a “good thing” because it increases the number of eyes and hands going over the code base, which is the “many eyes” and “many hands” effect mentioned earlier. We use only freeware tools (whether they are open source or not) so that our projects do not require high-end editions of Visual Studio. Not all our projects can be used with Visual Studio Express editions, but our goal is to make our projects accessible to anyone, whether they are students, professionals, or hobbyists.

 

I think the main thing our corporate team may do differently from the Community Based Open Source world is that we think about testing during the project definition phase. We plan our testing effort early on, and every member of the team, whether they are involved with the development of the project or not, tests the project before it goes open. We set our sights pretty high in that we want all requirements tested and enough tests to get 80% code coverage. We do not set our sights as high as mission-critical projects like medical equipment or avionics software, or even as high as I would like if we were testing a commercial closed source product. But we want our projects to be useful and of reasonably good quality when we first release them. Nothing will kill an Open Source project faster than releasing bug-ridden software and frustrating your users! Next time, I’ll outline some recommendations for other teams that intend to release open source projects.
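
The release bar mentioned above (every requirement tested, plus at least 80% code coverage) can be written down as a simple check. The Python sketch below is only an illustration under stated assumptions: the function name ready_to_release, the requirement IDs, and the line-count inputs are hypothetical, and it assumes the numbers come from whatever coverage tool a team already uses.

```python
def ready_to_release(requirements_tested, lines_covered, lines_total,
                     min_coverage=0.80):
    """Return (ok, problems) for a hypothetical pre-release quality gate.

    requirements_tested maps a requirement ID to True if at least one
    test exercises that requirement.
    """
    problems = []

    # Every functional requirement needs at least one test.
    untested = sorted(name for name, tested in requirements_tested.items()
                      if not tested)
    if untested:
        problems.append("untested requirements: " + ", ".join(untested))

    # Code coverage must reach the threshold (80% by default).
    coverage = lines_covered / lines_total if lines_total else 0.0
    if coverage < min_coverage:
        problems.append(
            f"coverage {coverage:.0%} is below the {min_coverage:.0%} target")

    return (not problems, problems)


ok, problems = ready_to_release(
    {"REQ-1": True, "REQ-2": False},  # hypothetical requirement IDs
    lines_covered=750,
    lines_total=1000,
)
print("ready" if ok else "not ready")
for problem in problems:
    print(" -", problem)
```

Returning the list of problems rather than a bare yes or no keeps the check useful during planning, because it shows which requirement or how much coverage is still missing.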

Comments

  • Anonymous
    February 09, 2007
    As I said in my earlier post, our corporate team develops tools that we want to turn into Community

  • Anonymous
    February 09, 2007
    This is a great article. Not only do I remember open-source development from the first generation (late ’50s), but I realize that my current project fits exactly into your community picture. I'm just short of being ready for a public beta and the code is embargoed until that happens: when all potential show-stoppers are killed off and there is some orderliness for testing and deployment. I have thought about testing from the beginning, although mostly to confirm the interface contracts and demonstrate function: I spend a lot of up-front time in each stage fussing over how I can demonstrate that it is working and how I can troubleshoot when it isn't. Coverage and other kinds of inspections/tests come when I have all of the functions in place and have breathing room for refactoring and hardening. I think that is a natural condition while in solo-developer mode. That is also motivated by there being an external customer (putting a proprietary product on top of my open-sourced layer) with their own customer to satisfy. I'm clipping your article and going to check against it as I move from 0.50beta to 1.0 stable release.

  • Anonymous
    February 13, 2007
    John, our only SDET (tester) on the team, has written a couple of blog entries on what we've learned

  • Anonymous
    July 21, 2007
    On our one-year anniversary of our power toys release, I gave a Microsoft Engineering Excellence Talk
