Aggregation

When composing queries, you often need to aggregate.  LINQ provides a number of built-in aggregators:

·         Count

·         LongCount

·         Sum

·         Min

·         Max

·         Average

·         Aggregate

Aggregators are extension methods that operate on IEnumerable<T> and return a value of some type T (or U) that is not a collection.  Put another way, aggregation is the process of taking a collection and reducing it to a single value.  Using the Sum aggregate extension method can be as simple as:

int[] ia = { 1, 5, 2, 6, 7 };
Console.WriteLine(ia.Sum());
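
The other built-in aggregators in the list are called the same way.  As a minimal sketch on the same array (the class name BuiltInAggregators is only for illustration, and the comments give the values returned, not the exact printed formatting, which for Average depends on the current culture):

```csharp
using System;
using System.Linq;

public static class BuiltInAggregators
{
    public static void Main()
    {
        int[] ia = { 1, 5, 2, 6, 7 };

        Console.WriteLine(ia.Count());      // number of elements as an int: 5
        Console.WriteLine(ia.LongCount());  // the same count as a long: 5
        Console.WriteLine(ia.Sum());        // 1 + 5 + 2 + 6 + 7 = 21
        Console.WriteLine(ia.Min());        // smallest element: 1
        Console.WriteLine(ia.Max());        // largest element: 7
        Console.WriteLine(ia.Average());    // a double: 21 / 5 = 4.2
    }
}
```

Note that Average returns a double even though the elements are ints, which is the "T (or U)" case mentioned above.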

The Aggregate operator is a general-purpose aggregator.  You can use it to sum an array, concatenate a bunch of strings, or do anything else where you need to take a collection and reduce it to a single value.  It has a couple of different forms.  If you want to use it to aggregate a collection of some type T into a single T, you can use it like this:

int[] ia = { 1, 5, 2, 6, 7 };
// a is the running total so far; i is the current element.
Console.WriteLine(ia.Aggregate((a, i) => a + i));

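For a non-empty array, this produces the same result as Sum.  (On an empty collection, this form of Aggregate throws an InvalidOperationException, whereas Sum returns zero.)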
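
It may not be obvious where the running value comes from when no seed is given: the first element is used as the starting value, and the lambda is then called once for each remaining element.  The following is only a sketch of that behavior, not the actual LINQ source; Fold is a hypothetical helper written for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Hypothetical helper (not part of LINQ) that mimics the
    // one-argument form of Aggregate.
    public static T Fold<T>(IEnumerable<T> source, Func<T, T, T> func)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            // With no seed and no elements there is nothing to return.
            if (!e.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements");

            T accumulator = e.Current;   // the first element is the starting value
            while (e.MoveNext())
                accumulator = func(accumulator, e.Current);
            return accumulator;
        }
    }

    public static void Main()
    {
        int[] ia = { 1, 5, 2, 6, 7 };

        // Steps: 1 + 5 = 6, 6 + 2 = 8, 8 + 6 = 14, 14 + 7 = 21
        Console.WriteLine(Fold(ia, (a, i) => a + i));       // 21
        Console.WriteLine(ia.Aggregate((a, i) => a + i));   // 21
    }
}
```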

There is another overload of the Aggregate extension method that allows you to specify a seed for the aggregation - in other words, to provide an initial value for aggregation.  The following code shows setting the seed to zero: 

int[] ia = { 1, 5, 2, 6, 7 };
// 0 is the seed: the value of a the first time the lambda is called.
Console.WriteLine(ia.Aggregate(0, (a, i) => a + i));

This will produce the same results as the previous example.  
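
The seed matters in two situations that the example above does not show: when the collection is empty, and when the result type differs from the element type.  A small sketch of both (the class and variable names are only for illustration):

```csharp
using System;
using System.Linq;

public static class SeededAggregate
{
    public static void Main()
    {
        int[] empty = new int[0];
        string[] words = { "aaa", "bbbb", "cc" };

        // With a seed, an empty collection simply yields the seed.
        Console.WriteLine(empty.Aggregate(0, (a, i) => a + i));   // 0

        // The seed's type (long) need not match the element type (string).
        long totalLength = words.Aggregate(0L, (a, w) => a + w.Length);
        Console.WriteLine(totalLength);                           // 3 + 4 + 2 = 9

        // Without a seed there is no starting value, so Aggregate throws.
        try
        {
            empty.Aggregate((a, i) => a + i);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("empty sequence, no seed");
        }
    }
}
```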

To use the Aggregate operator to concatenate strings, you could do this:

string[] ia = { "aaa", "bbb", "ccc" };
// Each call builds a new string from the accumulated text plus the next item.
Console.WriteLine(ia.Aggregate((a, i) => a + i));

The observant will notice that this use of Aggregate creates a new short-lived string object on the heap for every item in the collection.

You can use Aggregate in combination with the StringBuilder class, as follows:

string[] ia = { "aaa", "bbb", "ccc" };
Console.WriteLine(
    ia.Aggregate(
        new StringBuilder(),
        (sb, i) => sb.Append(i),
        sb => sb.ToString()
    )
);

This example uses an overload of the Aggregate extension method that allows you to set the seed (where it news up the StringBuilder), and to specify another lambda expression that projects the result of the aggregation.  In this case, the projection is to convert the StringBuilder to a string.

This eliminates the creation of so many strings on the heap, but it adds syntactic complexity.  It also isn't pure, as the side effect of the changing StringBuilder would be observable in the lambda expression, but I don't care.  This is a side effect that I can live with.
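
The three-argument overload is not specific to StringBuilder: the accumulator can be of any type, and the final lambda converts it to whatever result type you want.  As a hedged sketch, here is a mean computed by accumulating a sum and a count together and projecting at the end.  Mean is a hypothetical helper written for this illustration, and the tuple syntax assumes C# 7.0 or later:

```csharp
using System;
using System.Linq;

public static class ProjectedAggregate
{
    // Hypothetical helper: the mean of a non-empty array, via Aggregate.
    public static double Mean(int[] values)
    {
        return values.Aggregate(
            (Sum: 0L, Count: 0),                        // seed: nothing summed, nothing counted
            (acc, i) => (acc.Sum + i, acc.Count + 1),   // accumulate one element
            acc => (double)acc.Sum / acc.Count);        // project the accumulator to the result
    }

    public static void Main()
    {
        int[] ia = { 1, 5, 2, 6, 7 };

        // Sum is 21 and Count is 5, so the result is the double 4.2.
        Console.WriteLine(Mean(ia));
    }
}
```

Here the element type is int, the accumulator type is a tuple, and the result type is double, so all three type parameters of this overload differ.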

But the best method to concatenate strings is to write an extension method, StringConcatenate, as follows:

public static string StringConcatenate(
    this IEnumerable<string> source)
{
    return source.Aggregate(
        new StringBuilder(),
        (s, i) => s.Append(i),
        s => s.ToString());
}

Its use:

string[] ia = { "aaa", "bbb", "ccc" };
Console.WriteLine(ia.StringConcatenate());

Finally, it is useful to have another overload of StringConcatenate.  This one takes a delegate that does the projection from T to string:

public static string StringConcatenate<T>(
    this IEnumerable<T> source,
    Func<T, string> projectionFunc)
{
    return source.Aggregate(
        new StringBuilder(),
        (s, i) => s.Append(projectionFunc(i)),
        s => s.ToString());
}
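
Func<T, string> is a delegate type: it stands for any method or lambda expression that takes one argument of type T and returns a string.  The sketch below repeats the overload so that it compiles on its own, and passes the projection first as a variable and then as an inline lambda.  The names ProjectionSketch and toBracketed are only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ProjectionSketch
{
    // Same overload as in the text above, repeated so this sketch is self-contained.
    public static string StringConcatenate<T>(
        this IEnumerable<T> source,
        Func<T, string> projectionFunc)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(projectionFunc(i)),
            s => s.ToString());
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };

        // A Func<int, string> can be stored in a variable...
        Func<int, string> toBracketed = n => "[" + n + "]";
        Console.WriteLine(numbers.StringConcatenate(toBracketed));        // [1][2][3]

        // ...or written inline as a lambda; T is inferred as int.
        Console.WriteLine(numbers.StringConcatenate(n => n.ToString()));  // 123
    }
}
```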

Here is a small program that contains both extension methods, and code to exercise them.  The code is attached to this page:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class LocalExtensions
{
    public static string StringConcatenate(
        this IEnumerable<string> source)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(i),
            s => s.ToString());
    }

    public static string StringConcatenate<T>(
        this IEnumerable<T> source,
        Func<T, string> projectionFunc)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(projectionFunc(i)),
            s => s.ToString());
    }
}

class Program
{
    static void Main(string[] args)
    {
        string[] stringList = new[] { "aaa", "bbb", "ccc" };
        XElement xmlDoc = XElement.Parse(
            @"<Root>
                <Value>111</Value>
                <Value>222</Value>
                <Value>333</Value>
            </Root>");

        string s1 = stringList.StringConcatenate();
        string s2 = xmlDoc.Elements().StringConcatenate(el => (string)el);

        Console.WriteLine(s1);
        Console.WriteLine(s2);
    }
}

Aggregation.cs

Comments

  • Anonymous
    July 08, 2008
    This topic took some work to understand, mostly because it uses C# syntax and methods I've never used. It would help to explain Func<T, TResult>, and the 3 different forms of Aggregate that are used. An explanation of the Aggregate methods would have helped a lot since unlike Sum, the use of their arguments and their operation are not apparent from reading the code. For example, it was not obvious where seed is created or initialized in: Console.WriteLine(ia.Aggregate((seed, i) => seed += i)); and would have been easier to understand if Console.WriteLine(ia.Aggregate(0, (seed, i) => seed += i)); were introduced first with the explanation that 0 is the initial value for the aggregate, is can be any identifier and represents the current value of the aggregate, and that when the initial value is the default initial value for the aggregate type, the expression can be shortened to the first form.

  • Anonymous
    September 13, 2008
    You are absolutely right, Marv.  I've modified the topic accordingly. -Eric