Creating a Collection from Singletons and Collections using LINQ

A key operation when doing pure functional transformations is the process of creating complex hierarchies of objects.  We see this when transforming some data source (such as an Open XML WordprocessingML document) to a LINQ to XML tree, and we see this when writing a recursive descent parser.  The recursive descent parser for Excel formulas is nothing more or less than a pure functional transform from the source string to a parse tree, which is a hierarchical object graph.  In both of these cases, objects contain ‘child’ or ‘constituent’ objects.  An XElement object can contain child XElement objects, and a symbol node in a parse tree is made up of ‘constituent’ or ‘child’ symbols.  When constructing an object there are times that you may have one or more singletons, and one or more collections that will comprise the child objects of the object you are constructing.

This post is one in a series on using LINQ to write a recursive-descent parser for SpreadsheetML formulas.  The blog TOC has the complete list of posts.

The question is: what is the best way to create a collection of some type T when you have one or more singletons of T and one or more collections of T?  You want the resulting collection to contain all of the singletons and all of the elements of the collections, in some particular order.

The simplest way to create a collection that contains one element from a singleton is to create an array with a single element:

int z = 5;
var z1 = new[] { z };

So if you have a couple of singletons and a collection, you can create a new collection by creating arrays for the singletons, and then using the Concat extension method to assemble the lot into a new collection:

int x1 = 5;
int x2 = 7;
int[] x3 = new[] { 1, 2, 3 };
var c = new[] { x1 }.Concat(new[] { x2 }).Concat(x3);

This isn’t bad, but there is a better way.  We can write a method that takes a params array of objects as an argument, and constructs a collection from those arguments.

It is necessary for the method to take a params array of object because in C#, there is no way to declare that a particular parameter can either be a singleton of T or a collection of T.  The type system doesn’t allow for this.  We have to rely on runtime checking of types passed into the method.

using System;
using System.Collections.Generic;
using System.Linq;

public class MyException : Exception
{
    public MyException(string message) : base(message) { }
}

class Program
{
    // Flattens a mix of int singletons and int collections into one sequence,
    // preserving the order of the arguments.
    static IEnumerable<int> AssembleCollection(params object[] args)
    {
        foreach (var arg in args)
        {
            // A boxed int unboxes to a non-null int?; anything else yields null.
            int? i = arg as int?;
            if (i != null)
            {
                yield return (int)i;
                continue;
            }
            // Otherwise the argument must be a collection of int.
            var collection = arg as IEnumerable<int>;
            if (collection != null)
            {
                foreach (var c in collection)
                    yield return c;
                continue;
            }
            throw new MyException("Internal error - argument is not int or collection of int");
        }
    }

    static void Main(string[] args)
    {
        int x1 = 5;
        int x2 = 7;
        int[] x3 = new[] { 1, 2, 3 };
        var c = AssembleCollection(x1, x2, x3);
        foreach (var i in c)
            Console.WriteLine(i);
    }
}
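The method above is tied to int.  As a minimal sketch of how the same idiom might be generalized, here is a generic variant.  The names CollectionAssembler, Assemble, and AssembleDemo are hypothetical and not part of the original code, and the sketch assumes a compiler that supports C# 7 pattern matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical generic variant of AssembleCollection; names are illustrative only.
public static class CollectionAssembler
{
    public static IEnumerable<T> Assemble<T>(params object[] args)
    {
        foreach (var arg in args)
        {
            if (arg is T item)
            {
                // The argument is a singleton of T.
                yield return item;
            }
            else if (arg is IEnumerable<T> collection)
            {
                // The argument is a collection of T; yield its elements in order.
                foreach (var element in collection)
                    yield return element;
            }
            else
            {
                throw new ArgumentException(
                    "Argument is neither a " + typeof(T).Name + " nor a collection of it.");
            }
        }
    }
}

public static class AssembleDemo
{
    public static void Main()
    {
        var names = CollectionAssembler.Assemble<string>(
            "a", new[] { "b", "c" }, new List<string> { "d" });
        Console.WriteLine(string.Join(", ", names)); // prints "a, b, c, d"
    }
}
```

Because this is an iterator method, the type check runs lazily: a bad argument throws only when the result is enumerated, not when Assemble is called.  The same is true of the AssembleCollection method above.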

When assembling a collection, it is easier to write code that calls AssembleCollection than to chain calls to the Concat extension method.  Compare the two approaches:

var c = AssembleCollection(x1, x2, x3);
var c = new[] { x1 }.Concat(new[] { x2 }).Concat(x3);

In my opinion, the first approach is easier to read.

This is, of course, the approach that LINQ to XML takes.  In the following statement, x1 and x2 are singletons, and x3 is a collection.

XElement c = new XElement("z", x1, x2, x3.Where(x => x.Value == "5"));

And if you look at the signature of the XElement constructor, you see the idiom that I’m discussing in this post.

public XElement(XName name, params Object[] content);
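To make that concrete, here is a small self-contained sketch of passing singletons and a collection to the XElement constructor.  The element names and values are invented for illustration, and XElementContentDemo and Build are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XElementContentDemo
{
    public static XElement Build()
    {
        // Two singletons (illustrative values).
        var x1 = new XElement("a", "1");
        var x2 = new XElement("b", "2");

        // A collection of elements (illustrative values).
        var x3 = new[]
        {
            new XElement("c", "5"),
            new XElement("c", "6"),
            new XElement("c", "5")
        };

        // The constructor accepts singletons and a collection in one params list
        // and adds the collection's elements as individual children.
        return new XElement("z", x1, x2, x3.Where(x => x.Value == "5"));
    }

    public static void Main()
    {
        Console.WriteLine(Build().ToString(SaveOptions.DisableFormatting));
        // prints <z><a>1</a><b>2</b><c>5</c><c>5</c></z>
    }
}
```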

The LINQ to XML object model is very, very good.  I have learned a huge amount from it.

Comments

  • Anonymous
    July 27, 2010
    Personally, I would rather add more overloads of the Concat extension method like this:

    public static class ConcatExtension
    {
        public static IEnumerable<TItem> Concat<TItem>(this TItem first, IEnumerable<TItem> second)
        {
            return Enumerable.Concat(new TItem[] { first }, second);
        }

        public static IEnumerable<TItem> Concat<TItem>(this IEnumerable<TItem> first, TItem second)
        {
            return Enumerable.Concat(first, new TItem[] { second });
        }
    }

    You can then do the following:

    var intList = new int[] { 1, 2 };
    var singletonAtStart = 9.Concat(intList);    // ==> 9, 1, 2
    var singleTonAtEnd = intList.Concat(9);      // ==> 1, 2, 9

    This is probably more efficient than your code but less flexible. Just my 2 cents. :-)