Splitting a Hex-Encoded String into Pairs of Hex Characters (a.k.a. To Pull a Noah)
Simple enough task: I have a hex-encoded string and need to decode it. Now, we all know that to encode a string to hex is to cast each [char] to [int], shove it through the "{0:X2}" format specifier (the 2 pads every value to two digits, so characters below 0x10 still yield a full pair), then concatenate all the strings.
$string = "The quick brown dog";
[string]::Join($null, ([char[]]$string | % { "{0:X2}" -f [int]$_; }));
This returns:
54686520717569636B2062726F776E20646F67
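Zero-padding matters once the input contains characters below 0x10: "{0:X}" emits a single digit for a tab, while "{0:X2}" always emits two. A minimal sketch, assuming a hypothetical helper named ConvertTo-HexString (not part of the original post) and input whose characters all sit below U+0100:

```powershell
# Hypothetical helper (not from the original post): hex-encode a string,
# always emitting two digits per character.
# Assumes every character is below U+0100, so each one fits in two hex digits.
function ConvertTo-HexString {
    param([string]$Text)
    -join ([char[]]$Text | ForEach-Object { '{0:X2}' -f [int]$_ })
}

ConvertTo-HexString 'The quick brown dog'   # 54686520717569636B2062726F776E20646F67
'{0:X}'  -f 9    # 9   (one digit: a tab would break the pairing)
'{0:X2}' -f 9    # 09
```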
And we know to decode a hex-encoded character, we shove each pair of hex characters through the reverse process (more or less):
[char][Convert]::ToInt16('54', 16);
T
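For reuse, the single-pair conversion can be wrapped in a function. This is only a sketch: ConvertFrom-HexPair is a hypothetical name, not something from the original post, and the input check is an added assumption about what counts as a valid pair.

```powershell
# Hypothetical helper (not from the original post): decode one pair of hex digits.
function ConvertFrom-HexPair {
    param([string]$Pair)
    # Reject anything that is not exactly two hex digits.
    if ($Pair -notmatch '^[0-9A-Fa-f]{2}$') {
        throw "Expected exactly two hex digits, got '$Pair'."
    }
    # Base 16 parsing accepts upper- and lower-case digits alike.
    [char][Convert]::ToInt16($Pair, 16)
}

ConvertFrom-HexPair '54'   # T
ConvertFrom-HexPair '6b'   # k
```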
However, how do we split a string into pairs? After all, the above magic only works for pairs of hex characters. We can cast the string to a [char[]] array, then iterate over the array and glue every two characters back together into a bunch of two-character strings:
$hex = "54686520717569636B2062726F776E20646F67";
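[char[]]$hex | % -begin {
    $i = 0;
    [string]$s = $null;
} -process {
    if ($i % 2)
    {
        # Odd index: the pair is complete, so emit it and start over.
        $s + $_;
        [string]$s = $null;
    }
    else
    {
        # Even index: remember the first character of the pair.
        $s += $_;
    }
    $i += 1;
}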
Perfectly valid. However, PowerShell provides an obscure, yet incredibly succinct way to do this:
$hex = "54686520717569636B2062726F776E20646F67"; $hex -split '(..)' | ? { $_ }
Three notes:
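- The "| ? { $_ }" filter (effectively equivalent to a not-IsNullOrEmpty() check) is required. Otherwise, the split has the nasty habit of interleaving empty strings in the returned list of strings. The reason: -split treats every match as a delimiter and returns whatever lies between delimiters, and between two adjacent pairs lies nothing.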
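- The parentheses in the "'(..)'" split specifier are also required. The specifier is always a RegEx, so "'..'" matches any two characters (not two literal periods), and each pair is thrown away as a delimiter. Parenthesizing turns the pattern into a capturing group, and -split includes captured text in its output.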
- For those who like Perl’s readability, you can smash the loop into a single line so it’s as non-obvious as the "-split '(..)'" split specifier.
$hex = "54686520717569636B2062726F776E20646F67"; [char[]]$hex | % -begin { $i = 0; [string]$s = $null; } -process { if ($i % 2) { $s + $_; [string]$s = $null; } else { $s += $_; } $i +=1; }
Why would you want to do that? No idea.
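To see what the filter is removing, it helps to look at the raw output of -split. The sketch below relies on the operator treating every match as a delimiter and including captured text in its result. ConvertFrom-HexString is a hypothetical helper name used only for illustration, and it assumes an even-length string of valid hex digits.

```powershell
$hex = '5468'

# Without a capture group the delimiters are discarded; only the empty
# strings around them remain.
($hex -split '..').Count            # 3
# With a capture group the captured delimiters are kept as well.
($hex -split '(..)').Count          # 5: '', '54', '', '68', ''
# The filter drops the empty strings and leaves the pairs.
$hex -split '(..)' | ? { $_ }       # 54, 68

# Hypothetical helper (not from the original post): split into pairs,
# then decode each pair and join the characters back into a string.
function ConvertFrom-HexString {
    param([string]$Hex)
    -join ($Hex -split '(..)' | Where-Object { $_ } |
        ForEach-Object { [char][Convert]::ToInt16($_, 16) })
}

ConvertFrom-HexString '54686520717569636B2062726F776E20646F67'   # The quick brown dog
```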
Comments
- Anonymous
January 03, 2014
Cool. "...interleaving $null elements in the returned list of strings": No idea as well. "Why the split specifier becomes a RegEx when parenthesized?": It's always a RegEx. The parentheses make the 2 dots a Capturing Group. What you get back are the Values of the Captures. Knowing this, you can do the following:
[regex]::Matches($hex,'(..)') | % {$_.Captures} | % {$_.Value}
It's even shorter, if you retrieve the Values of the Matches:
[regex]::Matches($hex,'..') | % {$_.Value}
Since you need no Captures, you can omit the parentheses for the Capturing Group. No idea, why the -split operator returns the Captures but not the Matches. By the way: For RegEx testing, especially with .Net, I'm very happy with the free "Rad Software Regular Expression Designer".