Hex Dump using LINQ (in 7 Lines of Code)

At one point while debugging the HtmlConverter class, I ran into certain situations in the XML where I wanted to dump the XML in binary to see the actual hex values of the characters being used. I got tired of stopping and examining the values in the debugger. I did a couple of searches and found some sample C# code that implements a simple hex dump. Noticing that it was about 30 lines of code, I thought that I could rewrite it using LINQ and that the result would be cleaner and smaller.

This blog is inactive.
New blog: EricWhite.com/blog

Following is a sample that dumps a byte array in hex:

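// Requires: using System; using System.IO; using System.Linq;
byte[] ba = File.ReadAllBytes("test.xml");
int bytesPerLine = 16;
string hexDump = ba
    // Pair each byte with the number of the output line it belongs to.
    .Select((c, i) => new { Char = c, Chunk = i / bytesPerLine })
    .GroupBy(c => c.Chunk)
    // Format each group as a run of two-digit hex values.
    .Select(g => g.Select(c => String.Format("{0:X2} ", c.Char))
        .Aggregate((s, i) => s + i))
    // Prefix each line with its decimal byte offset.
    .Select((s, i) => String.Format("{0:d6}: {1}", i * bytesPerLine, s))
    .Aggregate("", (s, i) => s + i + Environment.NewLine);
Console.WriteLine(hexDump);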

To break up the binary data into groups of bytes for each line, this example uses the idiom that I discussed in Chunking a Collection into Groups of Three.  Because this is quick-and-dirty code that I didn’t plan on leaving in the delivered code, I used the idiom from Ad-Hoc String Concatenation using LINQ.
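
The chunking idiom can also be shown on its own. The following is a minimal sketch, not the code from the linked post: the names ChunkSketch and InGroupsOf are hypothetical, chosen only for illustration. It pairs each element with its index divided by the group size, then groups on that value, which is the same thing the hex dump does with bytesPerLine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketch
{
    // Hypothetical helper: splits a sequence into consecutive groups
    // of at most 'size' elements, preserving the original order.
    public static List<List<T>> InGroupsOf<T>(IEnumerable<T> source, int size)
    {
        return source
            // Integer division assigns the same group number to each run of 'size' items.
            .Select((item, index) => new { Item = item, Group = index / size })
            .GroupBy(x => x.Group)
            .Select(g => g.Select(x => x.Item).ToList())
            .ToList();
    }
}
```

For example, ChunkSketch.InGroupsOf(new[] { 1, 2, 3, 4, 5, 6, 7 }, 3) yields three groups: 1 2 3, then 4 5 6, then 7.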

The example converts the binary data into a string that you can then dump to the console or whatever.  The resulting string looks something like this:

000000: FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00
000016: 65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 22 00
000032: 31 00 2E 00 30 00 22 00 20 00 65 00 6E 00 63 00
000048: 6F 00 64 00 69 00 6E 00 67 00 3D 00 22 00 75 00
000064: 74 00 66 00 2D 00 31 00 36 00 22 00 20 00 73 00
000080: 74 00 61 00 6E 00 64 00 61 00 6C 00 6F 00 6E 00
000096: 65 00 3D 00 22 00 79 00 65 00 73 00 22 00 3F 00
000112: 3E 00 0D 00 0A 00 3C 00 52 00 6F 00 6F 00 74 00
000128: 3E 00 31 00 3C 00 2F 00 52 00 6F 00 6F 00 74 00
000144: 3E 00
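
If you want to keep something like this around rather than paste it in while debugging, the query can be wrapped in a method. This is only a sketch under assumptions: HexDumpSketch and Format are hypothetical names, and string.Join stands in for the Aggregate-based concatenation, so it is a variation on the sample rather than the same code. The layout matches the output shown, except that lines have no trailing space and the result has no trailing newline.

```csharp
using System;
using System.Linq;

public static class HexDumpSketch
{
    // Hypothetical wrapper around the hex dump query.
    // Each output line is a six-digit decimal offset followed by hex byte values.
    public static string Format(byte[] data, int bytesPerLine = 16)
    {
        var lines = data
            .Select((b, i) => new { Byte = b, Line = i / bytesPerLine })
            .GroupBy(x => x.Line)
            // g.Key is the line number, so the offset is g.Key * bytesPerLine.
            .Select(g => string.Format("{0:d6}: {1}",
                g.Key * bytesPerLine,
                string.Join(" ", g.Select(x => x.Byte.ToString("X2")))));
        return string.Join(Environment.NewLine, lines);
    }
}
```

Calling Console.WriteLine(HexDumpSketch.Format(File.ReadAllBytes("test.xml"))) would then print the dump; that call also needs using System.IO.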

This ratio of imperative code to declarative code (30 lines vs. 7 lines) is what I typically see when writing functional code using LINQ.  The declarative code is approximately 20% of the size of the imperative code.

Comments

  • Anonymous
    March 15, 2010
    It would probably be more efficient to replace:

        c => String.Format("{0:X2} ", c.Char)

    with:

        c => c.Char.ToString("X2") + " "

    Even better, with a suitable extension method:

        public static string JoinString(this IEnumerable<string> value, string separator)
        {
            if (null == value) return string.Empty;
            return string.Join(separator, value.ToArray());
        }

    you can replace:

        g => g.Select(c => String.Format("{0:X2} ", c.Char)).Aggregate((s, i) => s + i)

    with:

        g => g.Select(c => c.Char.ToString("X2")).JoinString(" ")

    and:

        .Aggregate("", (s, i) => s + i + Environment.NewLine)

    with:

        .JoinString(Environment.NewLine)

    Of course, with a simple implementation of IGrouping<TKey, TValue>, you could write your own ReadFile method, which would be much more efficient:

        public static IEnumerable<IGrouping<int, byte>> ReadFile(string path, int bytesPerLine)
        {
            ...
        }

  • Anonymous
    March 17, 2010
    Why aren't the offsets in hex? ;)

  • Anonymous
    March 25, 2010
    I also have to wonder about the utility of LINQ in this case. I can hardly imagine how you'd end up with 30 lines of code unless it actually did quite a bit more than what you have here. Using C as the quintessential procedural language, I wrote up what seemed like the most obvious implementation:

        #include <stdio.h>

        int main() {
            FILE *f = fopen("test.xml", "rb");
            int ch, offset = 0;
            while ((ch=getc(f)) != EOF) {
                if (offset %16 == 0)
                    printf("\n%6.6d: ", offset);
                ++offset;
                printf("%2.2x ", ch);
            }
        }

    We're left with a few choices: maybe you picked an example in a spectacularly verbose language. Maybe you're comparing apples to oranges, and the 30 lines of code really did a lot that yours doesn't. Maybe the 30 lines of code was just a whole lot longer than necessary.

    IMO, you owe all LINQ users an apology. Publishing a comparison that's so grossly and obviously distorted and misleading makes it easy for others to brand LINQ advocates in general as sloppy, negligent, ignorant, and quite possibly dishonest.

  • Anonymous
    March 25, 2010
    Ouch, dude! My intent here on my blog is to share what I learn as I learn it. I have certainly published posts where I subsequently discovered better ways to do something, and in those cases, I always go back to the original post and put a note pointing to my new approach about that subject. In those cases where I sent someone down the wrong path, I apologize.
    -Eric

  • Anonymous
    March 25, 2010
    My apologies -- I probably shouldn't have been posting anything that early in the morning, before I had any caffeine.
