Hex Dump using LINQ (in 7 Lines of Code)
At one point while debugging the HtmlConverter class, when I found certain situations in the XML, I wanted to dump the XML in binary to see the actual hex values of the characters being used. I got tired of stopping and examining the values in the debugger. I did a couple of searches and found some sample C# code that implements a simple hex dump. Noticing that it was about 30 lines of code, I thought that I could rewrite it using LINQ and that the result would be cleaner and smaller.
Following is a sample that dumps a byte array in hex:
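// Requires: using System; using System.IO; using System.Linq;
byte[] ba = File.ReadAllBytes("test.xml");
int bytesPerLine = 16;
// Tag each byte with the index of the output line (chunk) it belongs to.
string hexDump = ba.Select((c, i) => new { Char = c, Chunk = i / bytesPerLine })
    .GroupBy(c => c.Chunk)
    // Format each chunk as space-separated two-digit hex values.
    .Select(g => g.Select(c => String.Format("{0:X2} ", c.Char))
        .Aggregate((s, i) => s + i))
    // Prefix each line with its six-digit decimal byte offset.
    .Select((s, i) => String.Format("{0:d6}: {1}", i * bytesPerLine, s))
    .Aggregate("", (s, i) => s + i + Environment.NewLine);
Console.WriteLine(hexDump);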
To break up the binary data into groups of bytes for each line, this example uses the idiom that I discussed in Chunking a Collection into Groups of Three. Because this is quick-and-dirty code that I didn’t plan on leaving in the delivered code, I used the idiom from Ad-Hoc String Concatenation using LINQ.
The example converts the binary data into a string that you can then dump to the console or whatever. The resulting string looks something like this:
000000: FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00
000016: 65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 22 00
000032: 31 00 2E 00 30 00 22 00 20 00 65 00 6E 00 63 00
000048: 6F 00 64 00 69 00 6E 00 67 00 3D 00 22 00 75 00
000064: 74 00 66 00 2D 00 31 00 36 00 22 00 20 00 73 00
000080: 74 00 61 00 6E 00 64 00 61 00 6C 00 6F 00 6E 00
000096: 65 00 3D 00 22 00 79 00 65 00 73 00 22 00 3F 00
000112: 3E 00 0D 00 0A 00 3C 00 52 00 6F 00 6F 00 74 00
000128: 3E 00 31 00 3C 00 2F 00 52 00 6F 00 6F 00 74 00
000144: 3E 00
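The grouping step in the query above can also be looked at in isolation. The following is a minimal sketch, not code from the original project: the helper name SplitIntoChunks and the sample bytes are made up for illustration. It uses the same index-division idiom (item index divided by chunk size) to break a sequence into fixed-size groups, and relies on GroupBy preserving the original element order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ChunkingSketch
{
    // Hypothetical helper (not part of the original post): splits a sequence
    // into consecutive groups of at most `size` items by grouping on
    // index / size, the same idiom the hex dump query uses.
    public static IEnumerable<List<T>> SplitIntoChunks<T>(IEnumerable<T> source, int size)
    {
        return source
            .Select((item, index) => new { Item = item, Chunk = index / size })
            .GroupBy(x => x.Chunk)
            .Select(g => g.Select(x => x.Item).ToList());
    }

    static void Main()
    {
        // Sample data chosen for illustration only.
        byte[] data = { 0xFF, 0xFE, 0x3C, 0x00, 0x3F };

        // Print each chunk of two bytes as space-separated hex values.
        foreach (var line in SplitIntoChunks(data, 2))
            Console.WriteLine(string.Join(" ", line.Select(b => b.ToString("X2"))));
    }
}
```

With a chunk size of 2 and five input bytes, the last group holds a single byte, which is the same reason the final line of the dump above is shorter than the others.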
This ratio of imperative code to declarative code (30 lines vs. 7 lines) is what I typically see when writing functional code using LINQ. The declarative code is approximately 20% of the size of the imperative code.
Comments
Anonymous
March 15, 2010
It would probably be more efficient to replace:
c => String.Format("{0:X2} ", c.Char)
with:
c => c.Char.ToString("X2") + " "
Even better, with a suitable extension method:
public static string JoinString(this IEnumerable<string> value, string separator)
{
    if (null == value) return string.Empty;
    return string.Join(separator, value.ToArray());
}
you can replace:
g => g.Select(c => String.Format("{0:X2} ", c.Char)).Aggregate((s, i) => s + i)
with:
g => g.Select(c => c.Char.ToString("X2")).JoinString(" ")
and:
.Aggregate("", (s, i) => s + i + Environment.NewLine)
with:
.JoinString(Environment.NewLine)
Of course, with a simple implementation of IGrouping<TKey, TValue>, you could write your own ReadFile method, which would be much more efficient:
public static IEnumerable<IGrouping<int, byte>> ReadFile(string path, int bytesPerLine) { ... }
Anonymous
March 17, 2010
Why aren't the offsets in hex? ;)
Anonymous
March 25, 2010
I also have to wonder about the utility of LINQ in this case. I can hardly imagine how you'd end up with 30 lines of code unless it actually did quite a bit more than what you have here. Using C as the quintessential procedural language, I wrote up what seemed like the most obvious implementation:
#include <stdio.h>
int main() {
    FILE *f = fopen("test.xml", "rb");
    int ch, offset = 0;
    while ((ch=getc(f)) != EOF) {
        if (offset %16 == 0)
            printf("\n%6.6d: ", offset);
        ++offset;
        printf("%2.2x ", ch);
    }
}
We're left with a few choices: maybe you picked an example in a spectacularly verbose language. Maybe you're comparing apples to oranges, and the 30 lines of code really did a lot that yours doesn't. Maybe the 30 lines of code was just a whole lot longer than necessary. IMO, you owe all LINQ users an apology. Publishing a comparison that's so grossly and obviously distorted and misleading makes it easy for others to brand LINQ advocates in general as sloppy, negligent, ignorant, and quite possibly dishonest.
Anonymous
March 25, 2010
Ouch, dude! My intent here on my blog is to share what I learn as I learn it. I have certainly published posts where I subsequently discovered better ways to do something, and in those cases, I always go back to the original post and put a note pointing to my new approach about that subject. In those cases where I sent someone down the wrong path, I apologize.
-Eric
Anonymous
March 25, 2010
My apologies -- I probably shouldn't have been posting anything that early in the morning, before I had any caffeine.