Splitting a Hex-Encoded String into Pairs of Hex Characters (a.k.a. To Pull a Noah)

Simple enough task: I have a hex-encoded string and need to decode it.  Now, we all know that to encode a string to hex, you cast each [char] to [int], shove it through the "{0:X2}" format specifier (X2 rather than plain X, so that characters below 0x10 still come out as two digits), then concatenate all the strings.

 $string = "The quick brown dog"; 
[string]::Join($null, ([char[]]$string | % { "{0:X2}" -f [int]$_; }));

This returns:

 54686520717569636B2062726F776E20646F67

And we know to decode a hex-encoded character, we shove each pair of hex characters through the reverse process (more or less):

 [char][Convert]::ToInt16('54', 16);
 T

However, how do we split a string into pairs?  After all, the above magic only works for pairs of hex characters.  We can cast it to a [char[]] array, then iterate over the array, gluing every two characters back together into two-character strings:

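 $hex = "54686520717569636B2062726F776E20646F67";
[char[]]$hex | % -begin { 
        $i = 0;                     # index of the current character
        [string]$s = $null;         # first half of the pair being built
    } -process { 
        if ($i % 2) 
        { 
            $s + $_;                # odd index: emit the completed pair
            [string]$s = $null; 
        } 
        else 
        { 
            $s += $_;               # even index: remember the first character
        } 
        $i += 1; 
    }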

Perfectly valid.  However, PowerShell provides an obscure, yet incredibly succinct way to do this:

 $hex = "54686520717569636B2062726F776E20646F67"; $hex -split '(..)' | ? { $_ }

Three notes:

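- The "| ? { $_ }" filter (effectively a not-null-or-empty check) is required. Otherwise, the split has the nasty habit of interleaving empty strings (not $null, strictly speaking) in the returned list of strings. I had no idea why at first, but it makes sense in hindsight: every pair is a delimiter as far as -split is concerned, and the empty strings are the (nonexistent) text between one delimiter and the next.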

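- The parentheses in the "'(..)'" split specifier are also required. The specifier is always a RegEx, so a bare "'..'" still matches any two characters (not two literal periods), but -split throws its delimiters away and all that comes back is a handful of empty strings. The parentheses turn the two dots into a capturing group, and -split includes whatever a capturing group matched in its output.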

- For those who like Perl’s readability, you can smash the loop into a single line so it’s as unobvious as the "-split '(..)'" split specifier.

$hex = "54686520717569636B2062726F776E20646F67"; [char[]]$hex | % -begin { $i = 0; [string]$s = $null; } -process { if ($i % 2) { $s + $_; [string]$s = $null; } else { $s += $_; } $i +=1; }

Why would you want to do that? You guessed it: no idea.
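
For what it's worth, here is a minimal sketch that puts the pieces together: it shows what -split hands back with and without the capturing group, then decodes the whole string. The variable names ($bare, $raw, $pairs, $decoded) and the short "5468" sample are mine, purely for illustration.

```powershell
# Sketch only: variable names and the "5468" sample are illustrative.
$hex = "54686520717569636B2062726F776E20646F67"

# Without a capturing group, each pair is consumed as a delimiter,
# so only the empty strings around the delimiters remain.
$bare = @("5468" -split '..')      # three empty strings

# With a capturing group, -split also emits each captured delimiter,
# interleaved with those empty strings.
$raw = @("5468" -split '(..)')     # '', '54', '', '68', ''

# Filtering out the empty strings leaves just the pairs.
$pairs = @($hex -split '(..)' | Where-Object { $_ })

# Decode each pair to a character and join the characters into a string.
$decoded = -join ($pairs | ForEach-Object { [char][Convert]::ToInt16($_, 16) })
$decoded                           # The quick brown dog
```

As far as I can tell from the -split documentation, the delimiter is always treated as a regular expression and is dropped from the output unless it is wrapped in parentheses, which would account for both quirks noted above.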

Comments

  • Anonymous
    January 03, 2014
    Cool. "...interleaving $null elements in the returned list of strings": No idea as well. "Why the split specifier becomes a RegEx when parenthesized?": It's always a RegEx. The parentheses make the 2 dots a Capturing Group. What you get back are the Values of the Captures. Knowing this, you can do the following:

     [regex]::Matches($hex,'(..)') | % {$_.Captures} | % {$_.Value}

    It's even shorter if you retrieve the Values of the Matches:

     [regex]::Matches($hex,'..') | % {$_.Value}

    Since you need no Captures, you can omit the parentheses for the Capturing Group. No idea why the -split operator returns the Captures but not the Matches. By the way: for RegEx testing, especially with .NET, I'm very happy with the free "Rad Software Regular Expression Designer".