Type.Missing, C#, and Word
Recently there was a bit of a ruckus about the correct way to talk to the Word object model from C# when it comes to missing arguments. If you've ever used the Word PIAs (Primary Interop Assemblies) with C#, you will be familiar with the coding practice below. This slightly modified example comes from the MSDN VSTO 1.0 documentation--it shows how to spell check a string using the Word object model in C#:
internal void SpellCheckString()
{
    string str = "Speling erors here.";
    object ignoreUpperCase = true;
    object missingType = Type.Missing;
    bool blnSpell = ThisApplication.CheckSpelling(str,
        ref missingType, ref ignoreUpperCase, ref missingType,
        ref missingType, ref missingType, ref missingType,
        ref missingType, ref missingType, ref missingType,
        ref missingType, ref missingType, ref missingType);
    MessageBox.Show(blnSpell.ToString(), "False if Errors, True if OK");
}
The first thing that probably comes to mind if you're a VB.NET programmer and you've never seen code written against Word in C# is “Why is this so verbose?”
VB.NET does some special things for you when there are optional arguments in a method, so the VB version of this looks like this:
Friend Sub SpellCheckString()
    Dim str As String = "Speling erors here."
    Dim blnSpell As Boolean = _
        ThisApplication.CheckSpelling(str, , True)
    MessageBox.Show(blnSpell.ToString, "False if Errors, True if OK")
End Sub
In VB.NET you don't have to worry about passing a value for each optional argument--the language handles this for you. You can even use commas as shown above to skip one particular argument you don't want to specify. In this case we didn't want to specify a custom dictionary, but we did want to set IgnoreUppercase, so we omitted the custom dictionary argument by leaving it out between the commas.
The first thing that probably comes to mind if you're a C# programmer and you've never seen code written against Word in C# is “Why is all that stuff passed by reference?”
The first thing to understand is that when you are talking to Word methods, you are talking to the Word object model through interop. The PIA (Primary Interop Assembly) is the vehicle through which you talk to the unmanaged Word object model from managed code.
If you were to examine the IDL definition for “CheckSpelling” generated from Word's COM Type Library, you would see something like this:
[id(0x00000144), helpcontext(0x09700144)]
HRESULT CheckSpelling(
    [in] BSTR Word,
    [in, optional] VARIANT* CustomDictionary,
    [in, optional] VARIANT* IgnoreUppercase,
    [in, optional] VARIANT* MainDictionary,
    [in, optional] VARIANT* CustomDictionary2,
    [in, optional] VARIANT* CustomDictionary3,
    [in, optional] VARIANT* CustomDictionary4,
    [in, optional] VARIANT* CustomDictionary5,
    [in, optional] VARIANT* CustomDictionary6,
    [in, optional] VARIANT* CustomDictionary7,
    [in, optional] VARIANT* CustomDictionary8,
    [in, optional] VARIANT* CustomDictionary9,
    [in, optional] VARIANT* CustomDictionary10,
    [out, retval] VARIANT_BOOL* prop);
Note that any parameter marked as optional--meaning you can omit the value and Word will pick a reasonable default or ignore that option--is marshaled as a pointer to a VARIANT in Word. (Excel doesn't typically use a pointer to a VARIANT for optional parameters, so you don't have this by ref issue for most of Excel.) When the PIA is generated, the resulting IL looks like this:
.method public hidebysig newslot abstract virtual
    instance bool CheckSpelling([in] string marshal( bstr) Word,
        [in][opt] object& marshal( struct) CustomDictionary,
        [in][opt] object& marshal( struct) IgnoreUppercase,
        [in][opt] object& marshal( struct) MainDictionary,
        [in][opt] object& marshal( struct) CustomDictionary2,
        [in][opt] object& marshal( struct) CustomDictionary3,
        [in][opt] object& marshal( struct) CustomDictionary4,
        [in][opt] object& marshal( struct) CustomDictionary5,
        [in][opt] object& marshal( struct) CustomDictionary6,
        [in][opt] object& marshal( struct) CustomDictionary7,
        [in][opt] object& marshal( struct) CustomDictionary8,
        [in][opt] object& marshal( struct) CustomDictionary9,
        [in][opt] object& marshal( struct) CustomDictionary10) runtime managed internalcall
{
    .custom instance void [mscorlib]System.Runtime.InteropServices.DispIdAttribute::.ctor(int32) = ( 01 00 44 01 00 00 00 00 )
} // end of method _Application::CheckSpelling
Or, as C# IntelliSense shows it:
bool _Application.CheckSpelling(string Word,
    ref object CustomDictionary,
    ref object IgnoreUppercase,
    ref object MainDictionary,
    ref object CustomDictionary2,
    ref object CustomDictionary3,
    ref object CustomDictionary4,
    ref object CustomDictionary5,
    ref object CustomDictionary6,
    ref object CustomDictionary7,
    ref object CustomDictionary8,
    ref object CustomDictionary9,
    ref object CustomDictionary10)
So the upshot of all this is that any optional argument in Word has to be passed by ref from C# and has to be declared as an object. Even though you'd like to strongly type IgnoreUppercase as a bool in the CheckSpelling example, you can't. You have to type it as an object or you'll get a compile error. This ends up being a little confusing because you can strongly type the first argument--the string you want to check. That's because the “Word” argument (the string you are spell checking) is not an optional argument to CheckSpelling. Therefore, it is strongly typed and not passed by reference.
So this all brings us back to Type.Missing.
The way you specify in C# that you want to omit an optional argument (after all, who really wants to specify 10 custom dictionaries?) is to pass by reference an object that you have set to Type.Missing. In our example, we declared a single variable called missingType and passed it in 11 times.
Now when you pass objects by reference to managed functions, you do that because the managed function is telling you that it might change the value of the object you passed in. So it might seem bad that we are passing one object set to Type.Missing to every parameter of CheckSpelling that we don't care about.
After all, imagine you have a function called DoStuff (shown below) that takes two parameters by ref. If you set the first parameter to true, it will do something happy. If you set the second parameter to true, it will delete an important file. But if you pass in Type.Missing to both parameters, it won't do anything--or so you thought.
Because you are passing by ref, what if the code evaluating the first parameter changes it from Type.Missing to true as a side-effect? Now, when the code executes later in the function to look at the second parameter, it will see the second parameter is now true because you passed the same instance to both parameters:
using System;

namespace ConsoleApplication1
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            object missingType = Type.Missing;
            // The same variable is passed for both by ref parameters.
            DoStuff(ref missingType, ref missingType);
        }

        static void DoStuff(ref object DoSomethingHappy, ref object DeleteImportantFile)
        {
            if (DoSomethingHappy == Type.Missing)
            {
                // Don't do something happy but set DoSomethingHappy to true
                DoSomethingHappy = true;
            }

            if (DeleteImportantFile == Type.Missing)
            {
                // Don't do anything
            }
            else if (((bool)DeleteImportantFile) == true)
            {
                // Do It
                System.Diagnostics.Debug.Assert(false, "About to delete an important file");
                // Verbatim string so the backslash is not read as an escape sequence
                System.IO.File.Delete(@"c:\veryimportantfile.txt");
            }
        }
    }
}
You could fix this by declaring an object for each by ref parameter, as shown below.
static void Main(string[] args)
{
    object missingType1 = Type.Missing;
    object missingType2 = Type.Missing;
    DoStuff(ref missingType1, ref missingType2);
}
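If you want to see the aliasing effect without a debugger assertion or a file deletion, here is a minimal sketch. The class RefAliasingDemo and the method SecondStillMissing are made up for this illustration; they are not part of Word or the PIAs. The method writes to its first by ref parameter and then reports whether its second parameter still holds Type.Missing.

```csharp
using System;

static class RefAliasingDemo
{
    // Hypothetical helper: writes to the first ref parameter, then reports
    // whether the second parameter still refers to Type.Missing.
    public static bool SecondStillMissing(ref object first, ref object second)
    {
        if (first == Type.Missing)
        {
            first = true; // side effect on the caller's variable
        }
        return second == Type.Missing;
    }

    static void Main()
    {
        object shared = Type.Missing;
        // Both parameters alias the same variable, so the write through
        // 'first' is visible through 'second'.
        Console.WriteLine(SecondStillMissing(ref shared, ref shared));

        object a = Type.Missing;
        object b = Type.Missing;
        // Separate variables: the write to 'a' cannot affect 'b'.
        Console.WriteLine(SecondStillMissing(ref a, ref b));
    }
}
```

The first call prints False because both parameters refer to the same variable; the second prints True because each parameter has its own variable.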
So you might guess that you need to rewrite the first method, CheckSpelling, to declare missingType1 through missingType11, because Word might go and change one of the by ref parameters on you, so that you are no longer passing Type.Missing but something else like “true” that may cause unintended side effects...
WRONG!
Remember that Word is an unmanaged object and you are talking to it through interop. The interop layer realizes that you are passing a Type.Missing to an optional argument on a COM object. Word expects a missing optional argument to be a VARIANT of type VT_ERROR set to DISP_E_PARAMNOTFOUND. So interop obliges: instead of passing a reference to your missingType object, the interop layer passes a VARIANT of type VT_ERROR set to DISP_E_PARAMNOTFOUND. The missingType object that you passed by reference is safe because it never really got passed directly into Word. Word cannot mess with your variable, even though the syntax of the call makes it look possible because the argument is passed by ref.
So the initial CheckSpelling code is completely correct. Your missingType variable is safe--it won't be changed on you by Word even though you pass it by ref.
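As a rough mental model of the behavior described above, here is a toy simulation in pure managed code. It is not the CLR's actual marshaling code, and every name in it (MarshalSketch, CallThroughInterop, UnmanagedCallee, Translate) is hypothetical. It only illustrates the idea that the callee works on translated copies of the arguments, so writes on the callee's side never reach the caller's variable.

```csharp
using System;

static class MarshalSketch
{
    // Hypothetical stand-in for an unmanaged callee that overwrites
    // every argument slot it is handed.
    static void UnmanagedCallee(object[] slots)
    {
        for (int i = 0; i < slots.Length; i++)
        {
            slots[i] = true;
        }
    }

    // Toy translation step: a missing argument becomes a placeholder that
    // stands for a VARIANT of type VT_ERROR set to DISP_E_PARAMNOTFOUND.
    static object Translate(object value)
    {
        if (value == Type.Missing)
        {
            return "VT_ERROR/DISP_E_PARAMNOTFOUND";
        }
        return value;
    }

    // Hypothetical stand-in for the interop layer: each argument gets its
    // own translated slot, and this model copies nothing back afterwards.
    public static void CallThroughInterop(ref object arg1, ref object arg2)
    {
        object[] slots = { Translate(arg1), Translate(arg2) };
        UnmanagedCallee(slots);
    }

    static void Main()
    {
        object missingType = Type.Missing;
        CallThroughInterop(ref missingType, ref missingType);
        // The caller's variable was never handed to the callee.
        Console.WriteLine(missingType == Type.Missing);
    }
}
```

In this model the program prints True: the callee scribbled over its own slots, but missingType still holds Type.Missing, which mirrors why one shared variable is enough for the CheckSpelling call.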
But remember this is sort of a special case that only applies when talking through interop to an unmanaged object model that has optional arguments. Don't let this Word special case make you sloppy with other managed methods that you pass values to by ref. When talking to managed methods, you have to be careful when passing by ref because the managed method can change the variable you pass in as shown in the DoStuff example.
Comments
- Anonymous
April 15, 2004
Please note--Eric Lippert (http://blogs.msdn.com/ericlippert) contributed greatly to the investigation of this issue. Many keystrokes will be saved thanks to his great work.
- Anonymous
April 15, 2004
So, this leads me to ask, why wasn't the PIA created more intelligently? As a software developer, I would certainly want the PIA to be created to make it relatively easy to call the unmanaged application, considering it is the PIA. It would have been nice if the kind folks at Microsoft in charge of this development work had gone in and tweaked the generated interop assembly to have method overloads that mimic the way the optional operators worked. Perhaps even using a ParamArray for all those custom dictionaries, and checking to make sure that no more than 10 were ever passed. Redefining most of Word's ref fields so that nearly everything wasn't declared as a ByRef argument when it didn't need to be.
Granted, the Office PIAs are huge, and this would take a lot of work, but it would make developing against Office from a .NET perspective 1000% easier. Using the Office PIAs is so complex compared to developing from a VB6 perspective that it almost seems easier to continue using the VB6 language right now.
- Anonymous
April 15, 2004
Another thing to note is that your implementation of DoStuff, though legal, is pathological. Anyone who would write a method which changes the value of a missing ref parameter is just asking for trouble. Don't do that!
- Anonymous
April 15, 2004
Ryan: We are looking at ways to create PIAs more intelligently. And of course the best long term thing that could happen is that eventually Office will just have a first class managed object model. But as you can imagine this takes time. In VSTO 2.0 there will be no improvements to the PIAs per se, other than the view control stuff that I've blogged about which will make certain objects more accessible and a little more .NET friendly.
- Anonymous
April 18, 2004
The comment has been removed
- Anonymous
April 18, 2004
Yes--I'm afraid our team gave out some bad information in this area initially before really doing our homework on this--sorry about that Ken.