The Woes of Excel's Optional Parameters and C# ... a (potential) Workaround
When coding in C# with Excel, it doesn't take long before you encounter the dreaded 'optional parameter' issue. If you have ever copied-and-pasted numerous System.Reflection.Missing.Value statements then you've probably wondered a few things.
Firstly, what is this and why is it happening? Secondly, how do I get round it?
In this post, I'll offer answers to those questions.
What's the Problem with Optional Parameters?
As you know, some of Excel's APIs (such as Application.Intersect or Range.Sort) use optional parameters . When you are coding in VBA you hardly notice this but it's a whole different story in C#.
As I briefly covered in my introduction to managed programming and Excel, when you use a managed language to code against Excel, Excel's API is exposed as a collection of Runtime Callable Wrappers (RCWs). These are managed classes that have the same methods and properties as the COM classes they wrap. When you call Application.Intersect from C# you are really calling Application.Intersect on the RCW. The RCW then internally marshals the call to the real, COM Application.Intersect.
In other words, the RCWs expose managed wrappers of the COM methods.
However, the methods and properties that have optional parameters in COM are not optional in C#. As a C# programmer, you have to supply a value for every single parameter, regardless of whether or not it is optional in COM.
At first glance, this seems like not such a big deal. That is, however, until you realise that some methods (like Application.Intersect) take 30 parameters; 28 of which are optional!
Suddenly, having to specify a value for every single parameter becomes tedious and messy and litters your code with statements such as:
_application.Intersect(
rng1,
rng2,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
Is this a .NET Problem?
No. The CLR supports optional parameters. This is a C# problem (Incidentally, VB.NET doesn't have this issue).
To underline this point, let's leave C# for a moment and jump into what managed code is compiled to: MSIL. We'll disassemble Excel's RCWs and take a look at the MSIL they contain. This gives us a language agnostic view of what the RCW is really defined as.
The following screenshot shows the MSIL of the Application.Intersect method.
What to notice here is that the first two parameters are of type Microsoft.Office.Interop.Excel.Range but that the rest are of type object and they are also decorated with the [opt] attribute.
When you call this method from C# you are forced to specify a value for every parameter because C# (not .NET) doesn't support optional parameters.
It's worth noting that this is also not an Excel problem. Any COM library that uses optional parameters will have the same problem when accessed from C#.
So, what can we do about this?
The obvious answer is: 'don't use C# use VB.NET instead' but this isn't really an option for those programmers who (for other reasons) may want to use C#.
Therefore, is there anything we can do in C# to make the story a little better?
Working Around the Problem
The most common workaround I know of is to create wrapper classes of the RCWs. For example, we could write a wrapper of the Microsoft.Office.Interop.Excel.ApplicationClass class with more friendly C# function signatures. Internally, our wrapper would still have to make the call to the RCW with a value for all parameters but our wrapper would expose a nicer programming interface which we would use throughout our program. This means that the messy call to the RCW happens in just one place in our code rather than in multiple.
The big problem with this workaround is that the Application class has quite a lot of methods and properties and duplicating them all is time consuming. Also, we'd have to do the same thing for the Range class and the Workbook class and every other (Excel) class that we use.
We could, instead, just wrap up the functions that we care about. So, instead of wrapping every function we only wrap the the Excel functions we use in our program. This is a much smaller task and the is the route most people (I think) have taken. But it's still not ideal and it's not very generic.
Therefore, is there another option?
Using Reflection to Dynamically Invoke Functions at Runtime
.NET assemblies have lots of metadata in them which can be accessed at runtime via a technology known as Reflection. Reflection, in .NET, is hugely powerful and is a massive subject in its own right. But one of the things it enables us to do is to invoke methods by name at runtime.
When our code is compiled, the compiler ensures that we have used the correct function signatures and types. If we haven't, the compiler emits an error. Reflection in .NET allows us to discover function signatures and parameter types at runtime. We can then go ahead and invoke these functions 'adhoc' at runtime; bypassing the type checking and validation normally done by the compiler. If we use the wrong function signature or the wrong data type we'll get a runtime error rather than a compile-time error.
Let's take our example of calling Application.Intersect. The C# compiler requires us to specify a value for all 30 parameters. If we don't do this, the compilation fails. There's no way round this ... unless we don't actually tell the compiler that we are invoking the Intersect method.
In the remainder of this post, I'll introduce a C# class I've written that uses .NET reflection. The class invokes methods and properties of Excel RCWs dynamically at runtime, thus, hiding from the compiler what we are really doing and removing the need for us to specify a value for all optional parameters.
The ExcelUtility Class
This class is very small and is mostly defined in about 30 lines of code. However, in these few lines I make use of reflection and generics and my explanation assumes you have some basic knowledge of these two topics.
The important method in the ExcelUtility class is the Invoke method. This method contains the code which discovers function signatures at runtime and handles the invocation of them.
public static TRet Invoke<TRet>(object Obj, Type ObjType, string MethodName, params object[] CallerParams) where TRet : class
The MethodName parameter is, well, the name of the method we want to call on the RCW. For example, if we want to call Application.Intersect then we set MethodName to the string "Intersect".
ObjType is the Type object of the class which we are invoking. The framework contains a class called Type which represents the metadata of data types. Every class has a Type which can be accessed at runtime. For the Excel Application class (well, the RCW) we get its Type object using typeof(Microsoft.Office.Interop.Excel.ApplicationClass).
Obj is the actual object which we want to invoke i.e. the Application instance.
Finally, the CallerParams array contains the parameters that the caller wants to pass into the function call. So, in the case of calling Application.Intersect, we would supply the range objects we want to intersect.
Notice that Invoke is a generic method. The generic parameter, TRet, specifies the type of the object to return. Invoke will attempt to cast the result of the dynamic function call to this type. So, if the method we are invoking returns a string then we specify string as the generic parameter i.e. string val = Invoke<string>( ... )
Let's now look at the method's code.
public static TRet Invoke<TRet>(object Obj, Type ObjType, string MethodName, params object[] CallerParams) where TRet : class{
MemberInfo[] memberInfo = ObjType.GetMember(MethodName, BindingFlags.Public | BindingFlags.IgnoreCase | BindingFlags.Instance);
BindingFlags memberTypeFlag;
int paramCount = 0;
switch (memberInfo[0].MemberType) {
default:
return null;
case MemberTypes.Method:
MethodInfo methodInfo = ObjType.GetMethod(MethodName, BindingFlags.Public | BindingFlags.IgnoreCase | BindingFlags.Instance);
paramCount = methodInfo.GetParameters().Length;
memberTypeFlag = BindingFlags.InvokeMethod;
break;
case MemberTypes.Property:
PropertyInfo propInfo = ObjType.GetProperty(MethodName, BindingFlags.Public | BindingFlags.IgnoreCase | BindingFlags.Instance);
paramCount = propInfo.GetIndexParameters().Length;
memberTypeFlag = BindingFlags.GetProperty;
break;
}
object[] paramArray = new object[paramCount];
for (int i = 0; i < paramCount; ++i) {
paramArray[i] = Missing.Value;
}
if (CallerParams != null)
CallerParams.CopyTo(paramArray, 0);
TRet ret = Obj.GetType().InvokeMember(MethodName, memberTypeFlag | BindingFlags.Instance | BindingFlags.Public, null, Obj, paramArray) as TRet;
return ret;
}
The basic idea is this. First, get the metadata for the member specified in MethodName and then store how many parameters the member takes. We also use a bitmap (memberTypeFlag) to store whether the member is a method or a property (this is used later).
Next, we create an array of objects. the size of the array is equal to the number of parameters the method/property needs. We also initialise every object in that array to Missing.Value. So if the method takes 30 parameters paramArray will have 30 objects in it, each set to Missing.Value.
We then copy the parameters in CallerParams into paramArray . So, if CallerParams had two objects in it, paramArray would have 30 objects with the first two being the same as those specified in CallerParams (i.e. the parameters we care about) and the remaining 28 being set to Missing.Value.
Finally, we invoke the method/property passing in paramArray as its parameters and casting the result to an object of type TRet and returning it to the caller.
So to summarise, Invoke takes an object reference and a name of a method. It then figures out how many parameters the method expects and ensures that it supplies any missing parameters not specified by the caller. That is, the caller only needs to supply the parameters it cares about; Invoke will fill-in-the-blanks.
Using ExcelUtility
The Invoke method will call methods on any class. But using it directly means that we have to constantly specify which class we want to use. So, to make our lives a little simpler, I have added wrapper functions that can be used with the Application and Range classes.
These functions are all one-liners and simply reduce the number of parameters we need to pass when using the ExcelUtility in our code.
Here's the wrapper function for invoking the Application class (the Range function wrapper can be obviously deduced).
public static TRet ApplicationInvoke<TRet>(object Obj, string MethodName, params object[] Params) where TRet : class {
return Invoke<TRet>(Obj, typeof(Excel.ApplicationClass), MethodName, Params);
}
So, let's take a look at how the ExcelUtility class changes the landscape.
Using the RCWs directly, to invoke Application.Intersect for two ranges (rng1 and rng2) we have to write:
Excel.Range rng = _application.Intersect(
rng1,
rng2,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
With the ExcelUtility class, this changes to:
Excel.Range rng = ExcelUtility.ApplicationInvoke<Excel.Range>(_application, "Intersect", rng1, rng2);
(where _application is our instance of Excel's application object).
Quite a difference...
Great ... but not so great ...
On the surface, it looks like a nice solution. But there are downsides to this workaround that become apparent once you start using it for real.
First of all ... things start looking not so nice again when you invoke methods/properties with a non-optional parameter in the middle of a list of optional ones. For example, the Range.Address property has 5 parameters. The first two are optional, as are the last two, but the third parameter is non-optional.
Directly invoking the RCW to get Range.Address means writing:
string addr = _columnRange.get_Address(
System.Reflection.Missing.Value,
System.Reflection.Missing.Value,
Microsoft.Office.Interop.Excel.XlReferenceStyle.xlA1,
System.Reflection.Missing.Value,
System.Reflection.Missing.Value
);
and using ExcelUtility we do:
addr = ExcelUtility.RangeInvoke<string>(_columnRange, "Address", null, null, Microsoft.Office.Interop.Excel.XlReferenceStyle.xlA1);
I'd say that wasn't much of an improvement. It might even be worse.
(Note that we specify the property name "Address" in the call to RangeInvoke where as the RCW uses a get_Address method. This is because C# does not support properties with parameters - but that's another topic).
Second problem ... we only find out about coding errors at runtime. Because we are hiding the call to the RCW from the compiler we won't get any compilation errors if we use the wrong data types or incorrectly spell a function name. If, for example, we try to invoke the "Addres s" property - our code will compile. It will only be at runtime that we get an error (and even then the error message might not be that useful in figuring out our mistake). This means that we need to already know (or have looked up) the signatures of all the Excel APIs we are calling.
Third problem ... when the method we are calling on the RCW does not have optional parameters, calling the RCW directly is nicer. So, ExcelUtility isn't always a good replacement for calling the RCW directly. It's more like something that should be used when there are optional parameters.
Summary
What I've presented here is an idea for how we can workaround the problem of optional parameters in Excel's API when coding in C#. I should point out that it's just my idea and isn't a Microsoft recommended solution. Also, I have not exhaustively tested this code. By that I mean I haven't tested whether this works with every single method and property in the Excel API so there may be (and likely are) cases where this fails.
These are the key points:
- ExcelUtility uses reflection to discover function signatures at runtime;
- The Invoke method fills-in-the-blanks; we only have to specify the parameters we care about (most of the time);
- Because we don't tell the compiler which methods we are invoking, we only get told of errors at runtime;
- To call a method or a property on the Application class, use the ApplicationInvoke method;
- To call a method or a property on the Range class, use the RangeInvoke method;
- The main method is Invoke which can be used (in theory) to call methods on any class;
Download the Source Code
I have published ExcelUtility.cs to the MSDN Code Gallery and it can be downloaded by clicking on the link below (go to the Releases tab to download).
Comments
Anonymous
February 07, 2008
Nice solution which I noew have implemented in my own C# managed add-in template. Thanks and keep up the good level, DennisAnonymous
February 07, 2008
It not just the Type.Missing. It's all the ref parameters as well. You can't just have true/false as a parameter, you have to have that as a variable and use "ref myVar". Yuck! It's too sad that a good fix for this isn't in the PIA or in some other way included in VSTO. The C# templates could have overloaded methods. I'm sure you could generate this from the typelibraries and included it in VSTO. Atleast for the common things like Workbook.Open, Save, Application.Intersect, Union, many range methods etc. If it's to bloated with 32 overloads on a single method, maybe you could have 2 overload, one with the most common scenario, like Workbook.Open(filename) and one with Workbook.Open(filename, paramarray[] params) I love C#, but for Office development I use VB.Net because it's impossible to write clean code with this ref Type.Missing madness. If the Office platform should be a professional developer platform I think you must target C# developers in a better way than today. As I see it, Office developments needs VB.Net today. But I like C# alot better than VB.Anonymous
February 07, 2008
Btw... The really bad thing with reflection is that you don't get IntelliSense and compile time error checking. And isn't relection alot slower since it's evaluated at runtime?Anonymous
February 29, 2008
Gabhan Berry (read his blog - he has lots of Excel programmability related information in there!) wroteAnonymous
March 11, 2008
When presented with a table of data in Excel, sometimes it is useful to be able to learn some quick factsAnonymous
April 10, 2008
Johan, yes ... reflection is slow so we should avoid (if we can) using reflection too often. Gauging what is too often and what is not is something I think we each need to decide ourselves, based on our own application's requirements.Anonymous
June 24, 2008
The comment has been removedAnonymous
June 30, 2008
Thank you! This will come in very handy! There are also situations where you might not want to have a reference to Microsoft.Office.Interop.Excel in your code. In these circumstances, the following code could be a useful addition (in which case, your "Address" example comes down to: string addr = ExcelUtility.RangeInvoke<string>(excelApp.Selection, "Address", null, null, ExcelUtility.FieldValue("XlReferenceStyle","xlA1")); ) //... public static Type ExcelType(String name) { Type t = System.Reflection.Assembly.Load("Microsoft.Office.Interop.Excel").GetType("Microsoft.Office.Interop.Excel." + name, true, true); return t; } public static object FieldValue(string typeName, string fieldName) { Type t = ExcelType(typeName); FieldInfo fi = t.GetField(fieldName); return fi.GetValue(t); } public static TRet Invoke<TRet>(object Obj, string TypeName, string MethodName, params object[] CallerParams) where TRet : class { return Invoke<TRet>(Obj, ExcelType(TypeName), MethodName, CallerParams); }