What's wrong with this code, part 14
Keeping up with the "theme" of COM related bad code examples, here's another real-world example. To avoid any ATL confusion, it's 100% pure C++.
Our test team rolled out a new set of tests last week and immediately hit this one. The funny thing is that the code in question had worked without flaw for two years before this.
Obviously this has been abstracted to the point of ridiculousness, there's just enough here to demonstrate the bug...
struct FooConfig{ int _Value1; int _Value2;};[ object, uuid("0A0DDEDC-C422-4BB3-9869-4FED020B66C5"), pointer_default(unique) ] __interface IFoo : IUnknown{ HRESULT GetFooConfig([out] struct FooConfig **ReturnedFooConfig);};FooConfig _GlobalFooConfig = { 1, 2};class CFoo: public IFoo{ LONG _refCount;public: CFoo() : _refCount(1) {}; // IFoo HRESULT GetFooConfig(FooConfig **ReturnedFooConfig) { *ReturnedFooConfig = &_GlobalFooConfig; return S_OK;`` } // IUnknown virtual HRESULT STDMETHODCALLTYPE QueryInterface(const IID& iid, void** ppUnk) { HRESULT hr=S_OK; *ppUnk = NULL; if (iid == __uuidof(IFoo)) { AddRef(); *ppUnk = reinterpret_cast<void *>(static_cast<IFoo *>(this)); } else if (iid == IID_IUnknown) { AddRef(); *ppUnk = reinterpret_cast<void *>(static_cast<IUnknown *>(this)); } else { hr = E_NOINTERFACE; } return hr; } virtual ULONG STDMETHODCALLTYPE AddRef(void) { return InterlockedIncrement(&_refCount); } virtual ULONG STDMETHODCALLTYPE Release(void) { LONG refCount; refCount = InterlockedDecrement(&_refCount); if (refCount == 0) { delete this; } return refCount; }};
This one's pretty straighforward, and I expect that people will see the problem right away. So I'm going to raise the bar a bit - to get full credit, you not only have to explain not only what the problem is, but also why we'd never seen a problem with this code before.
And, as always, kudos and mea culpas on Monday.
Edit: Fixed return value from GetFooConfig.
Comments
- Anonymous
July 07, 2005
GetFooConfig is missing a 'return S_OK', and I'm guessing the reason it isn't noticed is that typically QueryInterface is called first then GetFooConfig, and since by convention return values are stored in the eax register on x86, QueryInterface's returned value of S_OK is retained as GetFooConfig's return value as well. Correct? - Anonymous
July 07, 2005
bao, no, that's not it (but thanks for picking it up). For some reason the C compiler didn't complain (although I'm not sure why). - Anonymous
July 07, 2005
Gee... I guess it has to do with GetFooConfig returning a pointer to that global variable. What happens if you try to marshal that call on a different process? Won't the stub assume the memory has been dynamically allocate and try a CoTaskMemFree on it? - Anonymous
July 07, 2005
ReturnedFooConfig should be allocated using CoTaskMemAlloc and returned. It should be a copy of the global variable. The error will be observed only when the calling application retains the value returned by the function beyond the time the inproc server is loaded in memory. It will also be observed when the call is marshaled in that case COM Runtime will try to free the returned pointer using CoTaskmemFree. If the call is within the same apartment and the caller doesnot reatin the pointer beyond the lifetime of the dll, and he doesnot free the returned memory using CoTaskMemFree (which legaly he should) the problem will not be observed. - Anonymous
July 07, 2005
ThalesC got the problem, but not the answer.
vrk also added in a reason. But why didn't we find the problem, and what's different about the test case? - Anonymous
July 07, 2005
In the case somebody call GetFooConfig with a NULL we will se a access violation.
The method should contain a check:
if (!ReturnedFooConfig)
return E_POINTER;
CoTaskMemFree can fail silent if you use a wrong pointer (not allocate with CoTaskMemAlloc). Maybe the new test case contains a custom memory allocator. - Anonymous
July 07, 2005
hey larry, post some C# code ;)
btw when do MS devs comment their code ..after or while they are coding???
why don't I see comments? - Anonymous
July 07, 2005
Is the test loading the class as a DLL? Once the DLL is unloaded globals defined in it will go away - thus the returned FooConfig pointer will no longer be valid.
So if they do something like:
h = LoadLibrary("MyTest");
p = new CFoo();
p->GetFooConfig(&r);
FreeLibrary(h);
r->_Value1 = 0;
then the test will asplode. - Anonymous
July 07, 2005
Larry, perhaps I'm being pedantic, but that's not 100% pure C++. Sure, it's 100% pure Microsoft C++ as of 7.0. - Anonymous
July 07, 2005
Orbit, I've done C# code in bad code examples in the past. But the problem doesn't show up easily with C# (since COM's implemented via interop).
And the reason you don't see comments is that there's a lot of code in the example already.
Aaron: This code is never accessed by any mechanism other than CoCreateInstance - so nobody's directly instantiating a CFoo - Anonymous
July 07, 2005
are you still on the media team? - Anonymous
July 07, 2005
Alex, you're right, but the actual code is clean as far as I know - I could have broken the interface and struct into their own .idl file instead of using the extensions but it was easier this way. - Anonymous
July 07, 2005
Oh, and Orbit, I'm still on windows core audio (which is part of the Media Technologies Group (mediatech), which is part of the Digital Media Division, which is a part of...) - Anonymous
July 07, 2005
I suspect that the old tests were using the COM object inproc. Hence they didnt hit this issue. THe new tests might be using this object out of proc? - Anonymous
July 07, 2005
The comment has been removed - Anonymous
July 07, 2005
IFoo isn't a proper COM interface because GetFooConfig uses the thiscall calling convention instead of stdcall. This will make the interface impossible to call from C code. - Anonymous
July 07, 2005
orbit:
F# is no extension to C#. It has its roots in Caml. Read Don Syme's blog about it here http://blogs.msdn.com/dsyme/. It's way different then C#.
There are plenty of languages which targets .NET, and you may consider some of them as "niche languages". See http://www.dotnetpowered.com/languages.aspx for a list. - Anonymous
July 07, 2005
I reread the F# blog and I stand corrected..
I meant a new .NET language that MS supports
even if MS supports it by coding a small program like Express Edition.. - Anonymous
July 07, 2005
Names that begin with an underscore then a capital or another underscore ( [A..Z] ) are reserved to the C++ implementation.
You can see discussion on comp.lang.c++.moderated if you search on "underscore variable C++".
By the way, I would also believe that the problem was revealed by changing from a non-marshalling scenario to a marshalling one. Maybe not in-proc to out-of-proc. Maybe apartments. Also a struct of int's is not automation-compatible if I recall correctly. - Anonymous
July 07, 2005
Ok, I admit I am foggy on my proxy generation rules, but...
If this call is being marshaled, as already stated, the proxy would try to ::CoTaskMemFree the pointer. However, this would be a silent failure. The real problem would be that the caller is now working on an allocated copy of the structure and not the original which could easily fail the validation tests. Also we now have a memory leak. - Anonymous
July 07, 2005
Mike, the "reserved" names is a style definition, not a language restriction - no compile on the earth will enforce that one. And it isn't a code defect.
But you're right - the original (working code) ran in-proc, so the problem wasn't found. It's when the tests started testing the failing method (the tests run out-of-proc) that we found the problem. - Anonymous
July 07, 2005
hey larry, you never fully reveal what exactly your working on...? :)
btw..Steve Ballmer is on channel9 - Anonymous
July 07, 2005
Orbit, you're right, I haven't.
And Steve was cool. - Anonymous
July 07, 2005
is it WMP 11? ;) - Anonymous
July 07, 2005
CFoo should derive from IUnknown. The cast to IUnknown happens to work because the QueryInterface, AddRef, and Release functions are layed out exactly as required for IUnknown so the vtable layout is the same at the moment. It would break if you inserted a new function between one of these. The GetFooConfig function doesn't matter because it isn't virtual. - Anonymous
July 07, 2005
... Of course, now I see that IFoo does indeed derive from IUnknown. Please ignore that last comment... - Anonymous
July 07, 2005
The comment has been removed - Anonymous
July 07, 2005
It seems that the expected answer was already posted (about marshalling and giving out a pointer to internal data). I would like to add a little:
* I’m worried by the initial value of _refCount = 1. If the factory code creates a new CFoo, then QueryInterface()s it for the IID asked by the client, then Release()s the newborn, that’s ok.
* Your _refCount is a signed LONG, while AddRef()’s and Release()’s return value is an unsigned ULONG. Ditto refCount local in Release(). This won’t usually cause any trouble, especially if Release() checks it for equality to zero, but someone debugging might see negative values. - Anonymous
July 07, 2005
Aside from the other problems mentioned, you're passing around a data structure that can presumably be modified by your caller without providing any mechanism to lock/"guard" the structure.
All of the lovely dangers of modifying the same bits in memory from different threads apply. - Anonymous
July 07, 2005
My guess the reason the bug has not been seen before is to do with the client code. If you had a COM object being used from VC 6 or VB, and the object is created apartment threaded and in process this will work fine so long as:
a) You only have one instance of CFoo
b) no one tries to change the contents of FooConfig on one of 2 instances.
However, with a .NET client a COM wrapper will be created, which will then cause it to operate with a marshalling layer and you'll see the bug. - Anonymous
July 08, 2005
Yesterday's post was a classic example of Joel Spolsky's Law of Leaky Abstractions.
Why?&nbsp; Well,... - Anonymous
June 16, 2009
PingBack from http://fixmycrediteasily.info/story.php?id=15753