Software Contracts, Part 3 - Sometimes implicit contracts are subtle
I was planning on discussing this later on in the series, but "Dave" asked a question that really should be answered in a complete post (I did say I was doing this ad-hoc, it shows).
Let's go back to the PlaySound API and ask two different questions that can be answered by looking at the API's software contract (the first one is Dave's question):
1. I am happy to fulfill my contractual obligations but I need to know what they are. If you don't tell them, how is the caller to know that you need their memory until the sound finishes playing?
2. If I call PlaySound with the SND_ASYNC flag set, how can I know if the sound's been played?
As I implied, both of these questions can be answered by carefully reading the API's contract (and by doing a bit of thinking about the implications of the contract).
Let's take question 2 first.
The explicit contract for the PlaySound API states that it returns TRUE if successful and FALSE otherwise. But if you specify the SND_ASYNC flag, what does that TRUE/FALSE return mean? That's not a part of the explicit contract, so it must be a part of the implicit contract.
Remember that the PlaySound API has only three parameters (the sound name, a module handle and a set of flags). All of these parameters are INPUT parameters, so there's no way to return the final status in the async case. The only way the return value could indicate whether the sound actually played would be for the call to wait until playback had finished, and that would mean the SND_ASYNC flag didn't actually do anything. That violates the principle of least surprise - if the SND_ASYNC flag were a NOP, it would be a surprise.
And in fact all the call to PlaySound does is to queue the request to a worker thread and return - the success/failure code refers to whether or not the request was successfully queued to the worker thread, not to whether or not the sound actually played.
Now for Dave's question...
First off, one critical part of interpreting software contracts is this: if you have a question about whether or not a function behaves in a specific manner, and the explicit contract doesn't say that it does, assume the answer is 'no'.
Since the contract for PlaySound is currently silent about the use of memory in combination with the SND_ASYNC flag, you should always make the most conservative assumptions about the behavior of PlaySound. Since the API documentation doesn't say explicitly that the memory can be freed while the sound is playing, you should assume that it can't. And that means that the memory handed to the PlaySound call must remain valid until PlaySound has finished playing the sound.
But even without that, with a bit of digging, you can come to the same answer.
Here's how my logic works. Both of the givens below are either explicit or implicit in the contract.
- You own the memory handed to PlaySound - you are responsible for allocating and freeing it. You know this because PlaySound is mute about what is done with the memory, thus it has no expectations about what happens to the memory it uses (this is an implicit part of the contract).
- The default behavior for PlaySound is synchronous (you know this because the documentation states that the SND_SYNC flag is the default behavior) (this is an explicit part of the contract).
You can also assume that the SND_ASYNC flag is implemented by dispatching some part of the PlaySound call to a background thread. This is pretty obvious given the fact that something has to execute the code to open the file, load it into memory, and play it. You can verify this trivially by using your favorite debugger and looking at the threads after calling PlaySound with the SND_ASYNC flag. In addition, there are no asynchronous playback calls in Windows, so again, it's highly unlikely the playback is done using some kind of interrupt time processing (it's possible, but highly unlikely - remember that PlaySound was written for Windows 3.1). I actually went back to the Windows 3.1 source code for PlaySound and checked how it did its work (there were no threads in Windows 3.1) - on Windows 3.1, if you specified the SND_ASYNC flag, it created a hidden window and played the sound from that window's wndproc.
But even given this, we're not done. After all, it's possible that the PlaySound code makes a private copy of the memory passed into PlaySound before returning from the original call. So the decision about whether or not the memory passed into the PlaySound API can be freed when specifying SND_ASYNC really boils down to this: if PlaySound makes a private copy of the memory, then the memory can be freed immediately on return; if it doesn't, it can't.
This is where you need to step back and make some assumptions. Up until now, pretty much everything that's been discussed has been a direct consequence of how the API must work - SND_ASYNC MUST be implemented on a background thread, you DO own the memory for the API, etc.
So let's consider the kind of data that appears in the memory handed to the PlaySound API.
Remember that most WAV files shipped with Windows (before Vista) were authored as 22kHz, 16 bit sample, mono files (for Vista, the samples are all stereo). That means that each second of audio takes up 44K of RAM, so all nontrivial WAV files are likely to be more than 64K in size (this is important). Again, consider that the PlaySound API was written for Windows 3.1, where memory was at a premium, especially huge blocks of memory (any block larger than 64K of RAM had to be kept in "huge" memory, allowing the blocks to be contiguous).
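The arithmetic behind the 44K figure can be checked with a few lines of C. This is only a sketch under stated assumptions: it takes "22kHz" to mean the common 22,050 Hz sample rate, it assumes uncompressed PCM data, and the helper names wav_bytes_per_second and wav_millis_in_block are made up for illustration (they are not part of any Windows API).

```c
#include <stdint.h>

/* Bytes of uncompressed PCM audio consumed per second of playback.
   Hypothetical helper for illustration, not a Windows API. */
static uint32_t wav_bytes_per_second(uint32_t sample_rate_hz,
                                     uint32_t bits_per_sample,
                                     uint32_t channels)
{
    return sample_rate_hz * (bits_per_sample / 8u) * channels;
}

/* Whole milliseconds of audio that fit into a block of the given size. */
static uint32_t wav_millis_in_block(uint32_t block_bytes,
                                    uint32_t bytes_per_second)
{
    return (uint32_t)(((uint64_t)block_bytes * 1000u) / bytes_per_second);
}
```

Under those assumptions, wav_bytes_per_second(22050, 16, 1) is 44,100 bytes, and a 64K (65,536 byte) block holds a little under a second and a half of mono audio, so anything longer than a short beep crosses the 64K line.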
If Windows were to take a copy of the memory, it would require allocating another block the size of the original block. And on a resource constrained OS like Windows 3.1 (or Windows 95) that would be a big deal.
Also remember my 2nd point above - the default behavior for PlaySound is synchronous. That means that the PlaySound call assumes that it's going to be called synchronously.
Given the fact that PlaySound was originally written for Windows 3.1 and given that the default for PlaySound is synchronous, and given the size of the WAV files involved, it thus makes sense that the PlaySound API would not allocate a new copy of the memory for the .WAV file and instead would use the samples that were already in memory - why take the time to allocate a new block and copy its contents over when it was already available?
Now this is a big assumption to make - it might not even be right. But it's likely to be a reasonable assumption.
So you should assume that PlaySound doesn't take a copy of the memory being rendered, and thus you need to ensure that the memory is valid across the life of the call.
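One way to honor that conclusion in code is to stop playback before releasing the buffer, relying on the behavior (mentioned in the comment thread below) that calling PlaySound with a NULL sound name stops any sound that is playing. The following is a sketch, not a pattern taken from the PlaySound documentation: the helper names start_sound and stop_and_free are hypothetical, and the non-Windows branch contains stand-in definitions (NOT the real API) that exist only so the sketch compiles elsewhere.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>              /* PlaySound; link with winmm.lib */
#else
/* Stand-ins so the sketch compiles off Windows. NOT the real API:
   they only record which buffer the fake "player" is still borrowing. */
typedef int BOOL;
typedef unsigned long DWORD;
typedef const char *LPCTSTR;
#define SND_ASYNC  0x0001
#define SND_MEMORY 0x0004
static const void *g_borrowed;     /* buffer the fake player still reads */
static BOOL PlaySound(LPCTSTR sound, void *module, DWORD flags)
{
    (void)module;
    (void)flags;
    g_borrowed = sound;            /* NULL means "stop": nothing borrowed */
    return 1;
}
#endif

/* Start asynchronous playback of an in-memory WAV image.
   A nonzero return only means the request was accepted; the caller
   still owns wav and must keep it valid while the sound plays. */
static BOOL start_sound(const void *wav)
{
    return PlaySound((LPCTSTR)wav, NULL, SND_MEMORY | SND_ASYNC);
}

/* Stop any sound that is still playing, then release the buffer. */
static void stop_and_free(void *wav)
{
    PlaySound(NULL, NULL, 0);      /* NULL sound name stops playback */
    free(wav);
}
```

The point of the design is the ordering inside stop_and_free: the buffer is released only after the call that cancels outstanding playback, so the caller never has to guess how long the sound takes to play.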
Btw, I just was told by the doc writers that they're planning on making this part of the contract explicit at some point in the future.
Tomorrow: Let's look at some explicit contracts.
Comments
Anonymous
January 08, 2007
Lonnie, there are two major cases for PlaySound. The first is calling PlaySound with an alias or filename - in that case, all you need to do is ensure that the memory containing the name of the alias or filename in question remains in memory. If you use the alias IDs, then you don't even need to do that. The second case is when you call PlaySound with a chunk of memory. It turns out that you can determine the length of the file from the contents of the FMT section and the contents of the DATA section. Even without that, if you call PlaySound(NULL, ...), you'll stop the playback. So just call PlaySound(NULL, ...) before freeing the memory and you'll be just fine.
Anonymous
January 08, 2007
Oh, and Lonnie, you're right - if you specify SND_RESOURCE, then PlaySound calls LoadResource etc so it guarantees that the memory is still valid during the course of playback. Similarly, for SND_FILENAME it allocates a block of memory and frees it when done (all behind the scenes). Again, this behavior can be deduced from the API and some common sense.
Anonymous
January 08, 2007
Whoa, did my RSS reader accidentally cross-link Raymond Chen's feed with yours??? (That's meant as a compliment, BTW. :p)
Anonymous
January 08, 2007
Adam, you're right - that's why I'm having the documentation fixed. But to answer your question: strlen doesn't also say that it works asynchronously. On the other hand, the ReadFile API's documentation doesn't say anything about the memory pointed to by lpbuffer being valid from the time the API is called to when it completes. It's another example of an implicit API contract - even though ReadFile doesn't say it, its contract is: you provide a buffer and ReadFile fills it in. If the read is asynchronous, the contents of the buffer aren't valid until the read completes, but until it completes, the buffer MUST remain valid (this falls out of the fact that the buffer is effectively an OUT parameter - out parameters must remain valid from the time an API is invoked until it completes).
The PlaySound API with the SND_ASYNC flag behaves the same as ReadFile behaves with an LPOVERLAPPED parameter - the buffer passed in must remain valid until the API completes. There is absolutely no difference in the contracts.
Having said that, there IS one significant difference: ReadFile provides a mechanism (the LPOVERLAPPED) that can be used to determine when the ReadFile API has completed, while the PlaySound API (as has been discussed earlier in this thread) doesn't. On the other hand, the PlaySound API DOES provide a mechanism for canceling any outstanding asynchronous call to PlaySound - calling PlaySound with a NULL filename is documented as terminating any and all outstanding calls to PlaySound, so there IS a safe way of ensuring that the API is completed.
Anonymous
January 08, 2007
Yes, you're clever that you can use the fact that it's written for Windows 3.1 to work out it must require the memory block not to be freed, but 10 years after Windows 95, who will think of such things? Also, was not copying the memory a good design decision given that it makes it much harder for the caller to free the memory after the sound has been played (meaning memory is less likely to be freed after being passed to PlaySound)?
Anonymous
January 08, 2007
Tim, as I pointed out earlier: The reason you can't free the memory is that the API doesn't say that it's ok to free the memory. And in general, memory passed to APIs must be valid until the API has completed (even when the API is asynchronous). But the same logic (not copying multi-hundred K blocks of memory) applies to Win9x as well. And yeah, 20/20 hindsight is wonderful, it would have been great if the API had been defined differently. But as Raymond Chen likes to say: Time machines haven't been invented yet. This behavior has been the behavior of PlaySound since 1991 when the API was originally added to Windows 3.x.
Anonymous
January 08, 2007
If you are interested in this subject, I suggest that you take a look at "Design by Contract" (http://en.wikipedia.org/wiki/Design_by_contract) and the Eiffel language which heavily relies on it.
Anonymous
January 08, 2007
jugger: If this API hadn't behaved in this manner for 15 years now, I agree - the async code should copy the memory. But time travel hasn't been invented yet.
Anonymous
January 08, 2007
But couldn't this behaviour be "fixed" in future Windows releases? Surely if Vista had just buffered ASYNC calls to PlaySound then the issue of applications releasing memory too early would eventually go away.
Anonymous
January 08, 2007
Larry, it's true that the time machine hasn't been invented yet. But what harm would be done if we created a new version that DOES copy the memory, and ran an old program that does not expect this on it? (Anyway, would someone write a program that modifies the buffer as it plays?) I'd say that it'd be a non-breaking change and your teams should consider fixing it this way.
Anonymous
January 08, 2007
How about a simple memlock inside the PlaySound API? Wouldn't that re-affirm the implicit contract?
Anonymous
January 08, 2007
Cheong: In that case, I can imagine a programmer writing a program, testing it in Vista and releasing it, and then it crashes on XP. The correct way to do this would be to introduce a PlaySoundEx function with proper async behavior - having a mechanism to know that the sound ended, maybe even freeing the memory automatically using an explicitly stated allocator, etc.
Anonymous
January 08, 2007
Ah - sorry. The way you'd worded the doc change sentence made me think that "the documentation might get fixed at some point, which was probably going to happen anyway", not "I've made them aware this is an error, so they're definitely fixing it".
As for ReadFile(), if what you say is the case then its documentation is also bad, and also needs fixing. Get the MSDN folks to read the documentation for some non-Microsoft asynchronous APIs. They could start with the POSIX aio stuff (e.g. http://www.opengroup.org/onlinepubs/009695399/basedefs/aio.h.html , http://www.opengroup.org/onlinepubs/009695399/functions/aio_read.html , etc...)
"And yeah, 20/20 hindsight is wonderful, it would have been great if the API had been defined differently."
It doesn't have to be designed differently, it just needs to be documented properly. That it's taken 15 years to do this is ... stunning.
Anonymous
January 08, 2007
You can effectively use SND_NOSTOP to poll to see whether the sound has finished playing, but that's not much fun and will of course use lots of CPU (depending on how you poll). I think the only other alternative to spinning up a thread is using the waveOut APIs directly, which is even worse.
Anonymous
January 08, 2007
Larry, the hard way: http://en.wikipedia.org/wiki/Tachyon :) Now seriously, we can see that it is not possible to change the code, so you change the information about the code (in an "it's not a bug, it's a feature!" kind of way). But a question remains: If the documentation states that you can free the allocation only after the sound has been played, how can you write code that checks that the sound has stopped playing?
Anonymous
January 09, 2007
One thing that I'm sometimes noticing in blog posts from both you and Raymond Chen is that you assume that everyone has your inside-Microsoft knowledge. You know that you, as an implementor of PlaySound, would use threads to get async behavior. A couple of years ago, before there were any MS blogs, I just had no clue at all how the OS calls were implemented. For all I knew you would be using something kernel-specific that was totally different than user-mode threads to implement async stuff, so I didn't assume anything about the behavior of a function other than what is written explicitly in the documentation. What you've written about PlaySound is easy for you to think of because you know how it works, and how other API calls are implemented behind the scenes. That doesn't mean that it's also easy to figure that out if you haven't worked on Windows internals for tens of years. :-)
Anonymous
January 09, 2007
Gordias > You honestly think that a function that accesses a caller-supplied memory buffer, after it has returned to the caller, even if it exhibits asynchronous behaviour, should not mention this in its documentation? You could make a similar point about localtime() - it should be obvious that the returned buffer is static and that the result will be overwritten by subsequent calls, but every single manual or book I've ever read that describes what localtime() does has still pointed this out.
There are a number of "implicit" rules when it comes to calling C functions. One is that pointer arguments should not be NULL unless the documentation says otherwise. One is that functions should be reentrant unless the documentation says otherwise. And one is that functions do not affect user-supplied memory after they've returned, unless the documentation says otherwise. No doubt there are more. But if your function breaks any of these rules, document it as such. It's not hard - a short sentence or two will suffice. This is not an onerous burden to place on library writers.
Anonymous
January 09, 2007
I don't know, Larry: usually I agree with you, but not this time. This way lie memory leaks and other hard-to-track-down problems. I appreciate that fixing the documentation, checking back later and calling with a NULL before freeing the memory, and such are necessary workarounds for a bad contract. But it is a bad contract. Any API that takes an input buffer and does something asynchronously with it has to also do one of the following things:
- Copy the contents to its own buffer synchronously, before it returns.
- Take ownership of the buffer, freeing it itself when it's done.
- Provide for some sort of callback so it can tell the caller when it's done.
An API that doesn't do one of those is, simply put, bad. ReadFile isn't quite the same, since, as you point out, the buffer is an output parameter. Even so, I'd say that the API should provide a callback so the caller can find out when it's done without having to poll.
Consider a parallel to PlaySound, where you want me to hear some favourite music that you have on CD. You can invite me over (Thank you!) and play it for me (synchronous). Or you can give me your CD (asynchronous). In the latter case, I can copy your CD (and fend off the DRM police), you can just give it to me and not want it back, or I can drop by and give it back to you when I'm done. Phoning me every 20 minutes to ask if I'm done with it yet... isn't really a viable option.
Anonymous
January 09, 2007
Jonathan: Totally agreed. Just like ReadFileEx() and WriteFileEx(), which were released exclusively to deal with async read/write. :)
Anonymous
January 10, 2007
One question that kept on coming up during my earlier post was "How long is it going to take to play
Anonymous
January 10, 2007
I suspect the first point Tim Lovell-Smith tried to make was not that you should have invented a time machine and fixed the design to copy the data (which I think would have been a mistake anyway) but that the method by which you arrived at your conclusion simply isn't available to those of us who came to the platform only recently. While I think your suggested method is sound in principle, the example may not be the best. (64K? Oh yeah, I guess I remember that from 20 years ago in my DOS days, but Windows programmers had to deal with that, too? Insane.) (To me, it seems obvious that when you tell a function to process some memory asynchronously that you can't free the memory until whatever logic is spawned by the function finishes. But then I'm used to old-school asynch Mac OS parameter blocks. And patching the functions that traffic in them. And far viler things you don't even want to know about. At least we had a flat address space. :-)
Anonymous
January 10, 2007
A fix to this API would be to provide a callback or wait mechanism for when the sound has finished.
Anonymous
January 10, 2007
The lack of a robust way to know when the sound has finished playing is what has led developers to assume things that aren't in this contract. You have two options to fix this:
- Make the documentation more robust.
- Make the API more robust (as suggested in a previous post, provide a way to know when the sound has finished; overlapped I/O is a good example of this).
Anonymous
January 15, 2007
I'm more discombobulated than usual on this series, I totally missed the third article in the series