Ad-Hoc String Concatenation using LINQ

I regularly use functional programming and LINQ in two contexts – when writing code that is part of an example or tool that will potentially execute millions of times, and when writing ad-hoc queries.  These days, I use C# and LINQ as my ‘scripting language’, to iterate through directory structures, open and process Open XML documents, and do whatever else is part of my task at hand.  I have different coding practices when I write these ad-hoc queries.  I basically do no query optimization.  So long as my little program does what I want, I don’t care if it is inefficient.  I use different idioms for string concatenation when writing these ad-hoc queries / projections.

This blog is inactive.
New blog: EricWhite.com/blog

When you are writing small ad-hoc queries, a very common task is to concatenate a list of strings into a single string.  Sometimes you don’t need to interject any additional text between items in the source list, and sometimes you do need to interject text, such as Environment.NewLine.  I use a different idiom for each of these cases.

When writing example code or tools, I use the StringConcatenate extension method that I introduced in this topic.  I keep this method in the PtUtil.cs module (available in HtmlConverter.zip under the downloads tab at www.codeplex.com/powertools).  But when I’m writing an ad-hoc query, sometimes it’s not worth the effort to put that source file in my project.  After all, I’m going to spend only about 3 minutes (or less) to write and execute the query.

When I don’t need to interject a new line between each string, I use the Aggregate extension method as follows.

string[] stringArray = new[] {
    "abc",
    "def",
    "ghi",
};
string str = stringArray.Aggregate((s, i) => s + i);
Console.WriteLine(str);

This outputs:

abcdefghi

When I need a newline after each string, I append Environment.NewLine in the lambda expression, and then it is necessary to supply an empty string as the seed value:

string[] stringArray = new[] {
    "abc",
    "def",
    "ghi",
};
string str = stringArray.Aggregate("", (s, i) => s + i + Environment.NewLine);
// seed value -------------------- ^
Console.WriteLine(str);

This outputs:

abc
def
ghi

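If you don’t supply the seed value when concatenating strings, Aggregate uses the first string in the source collection as the initial accumulator value, so the lambda never appends a newline after it.  As a result, the first two strings run together: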

string[] stringArray = new[] {
    "abc",
    "def",
    "ghi",
};
string str = stringArray.Aggregate((s, i) => s + i + Environment.NewLine);
// no seed value here ------------ ^
Console.WriteLine(str);

This outputs:

abcdef
ghi
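
To see why the first two strings run together, it may help to look at roughly what the seedless overload of Aggregate does.  The following is a simplified sketch, not the framework's actual source; the names AggregateSketch and AggregateNoSeed are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class AggregateSketch
{
    // Hypothetical stand-in for Enumerable.Aggregate(source, func) with no seed.
    public static string AggregateNoSeed(
        IEnumerable<string> source,
        Func<string, string, string> func)
    {
        using (IEnumerator<string> e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements");

            // The first element becomes the accumulator unchanged;
            // func is first called with the first and second elements.
            string acc = e.Current;
            while (e.MoveNext())
                acc = func(acc, e.Current);
            return acc;
        }
    }
}
```

With the lambda from the example above, the accumulator starts out as "abc", so the first newline is appended only after "def".  The sketch also shows a second difference from the seeded overload: with no seed, an empty source collection throws an InvalidOperationException, whereas the seeded overload simply returns the seed.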

Using the Aggregate extension method in this fashion is not as efficient as using a StringBuilder object (the StringConcatenate extension method uses a StringBuilder).  When using Aggregate in this fashion, it creates a short-lived string object for every element in the source collection, and each of those strings copies everything accumulated so far, so the total copying grows roughly quadratically with the length of the result.  When writing a tool or an example, it makes sense to spend the little bit of time to include the StringConcatenate extension method, but when writing an ad-hoc query, I don’t care.
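
For comparison, here is a minimal sketch of what a StringBuilder-based helper could look like.  This is not the StringConcatenate implementation from PtUtil.cs; the class name AdHocStringExtensions, the method name ConcatWithBuilder, and the terminator parameter are assumptions made for this illustration.  It appends the terminator after every item, which matches the behavior of the seeded Aggregate example.

```csharp
using System.Collections.Generic;
using System.Text;

public static class AdHocStringExtensions
{
    // Hypothetical helper, not the PtUtil.cs method.
    // Appends each item followed by the terminator into one StringBuilder,
    // so no intermediate string is allocated per element.
    public static string ConcatWithBuilder(
        this IEnumerable<string> source,
        string terminator)
    {
        var sb = new StringBuilder();
        foreach (string s in source)
            sb.Append(s).Append(terminator);
        return sb.ToString();
    }
}
```

Because it is an extension method on IEnumerable<string>, a call such as stringArray.ConcatWithBuilder(Environment.NewLine) can sit at the end of a query in the same position where Aggregate would go.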

Comments

  • Anonymous
    March 10, 2010
    You can do both..... string str2 =  stringArray.Aggregate(new StringBuilder(), (s, i) => s.AppendLine(i)).ToString();

  • Anonymous
    March 10, 2010
    The comment has been removed

  • Anonymous
    March 10, 2010
    @James and Henk, I've been accused of being lazy (as a compliment of sorts :-) I plead guilty. -Eric

  • Anonymous
    March 10, 2010
    Why not just use string.Join? string.Join("", stringarray); string.Join(Environment.NewLine, stringarray);

  • Anonymous
    March 10, 2010
    Hi Justin, The problem is that string.Join only works for arrays of strings.  In the examples here, I used an array of strings as an easy way to supply an IEnumerable<string>, so string.Join would work, but in most cases, you have an IEnumerable<string> that is the result of a query (using lazy evaluation), so in that case, string.Join doesn't work, unless you use the ToArray() method on the collection.  Also, this affects how you read the query, because using Aggregate in this fashion means that you can place the Aggregate extension method in the proper place in the query, whereas with string.Join, you would need to surround the expression that produces a collection of strings with parentheses, and call string.join before the parenthesized expression.  Does this make sense? -Eric

  • Anonymous
    June 30, 2010
    I use this technique a lot, too. In order to get a newline between the string, I usually do string str = stringArray.Aggregate((s, i) => s + Environment.NewLine + i); Console.WriteLine(str); without using a seed. Works fine. =)
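
Regarding the string.Join discussion in the comments above: .NET Framework 4 and later add an overload of string.Join that accepts an IEnumerable<string>, so on those versions a lazily evaluated query result can be passed directly, without calling ToArray().  A minimal sketch, assuming .NET 4 or later; JoinSketch and JoinLines are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;

public static class JoinSketch
{
    // Hypothetical wrapper around string.Join(string, IEnumerable<string>),
    // an overload available in .NET Framework 4 and later.
    // The separator goes between items only, so there is no trailing newline.
    public static string JoinLines(IEnumerable<string> source)
    {
        return string.Join(Environment.NewLine, source);
    }
}
```

Unlike the Aggregate idiom, the call still has to wrap the query expression rather than follow it, so the readability point made in the reply above continues to apply.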