You Can't Convert Data Structures To Strings In VBScript Without Breaking A Few Eggs

Here's a question I get every now and then:

I've written a VBScript program which calls a method on an object that returns an array of bytes containing a GUID. VBScript only supports arrays of variants. How can I turn this into a human-readable string?

Good question. It is doable without writing an object in C++, but it's a little tricky. The first thing to know is that even though VBScript does not support arrays of anything other than variant, the underlying OLE Automation library supports turning byte arrays into strings. Therefore you can use CStr to turn the thing into a string, right?

Function GuidToString(ByteArray)
  GuidToString = CStr(ByteArray)
End Function

Print GuidToString(MyObject.GetTheGuid)

Which prints out ~å ??ATErU%'èÅp±

Oops. We've taken those bytes and interpreted them as Unicode characters in a UTF-16 encoding. That's not right. We want to convert the bytes to text, preferably in hex format. Fortunately we have it in a string now, so we can extract the bytes with the byte-manipulating versions of the string library functions. Let's try that again.
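
Before formatting anything, it can help to look at the raw bytes sitting inside the string. The following is a minimal sketch, not part of the original program: MakeBinary and DumpBytes are hypothetical helper names, MakeBinary merely stands in for CStr(ByteArray) so that the sketch runs without a COM object, and WScript.Echo assumes the script is run under Windows Script Host.

```vbscript
Option Explicit

' Hypothetical helper: builds a binary string from a list of byte values.
' It stands in for CStr(ByteArray) so no COM object is needed.
Function MakeBinary(Values)
  Dim i, Result
  Result = ""
  For i = 0 To UBound(Values)
    Result = Result & ChrB(Values(i))
  Next
  MakeBinary = Result
End Function

' Hypothetical helper: returns the bytes of a binary string as
' space-separated hex values, in the order they appear in memory.
Function DumpBytes(Binary)
  Dim i, Result
  Result = ""
  For i = 1 To LenB(Binary)
    If i > 1 Then Result = Result & " "
    Result = Result & Hex(AscB(MidB(Binary, i, 1)))
  Next
  DumpBytes = Result
End Function

' Prints: 7E 0 E5 0
WScript.Echo DumpBytes(MakeBinary(Array(&H7E, 0, &HE5, 0)))
```

LenB, MidB and AscB count and read bytes rather than characters, which is what lets the string be treated as a plain byte buffer.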

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & Hex(AscB(MidB(Binary, 1, 1)))
  S = S & Hex(AscB(MidB(Binary, 2, 1)))
  S = S & Hex(AscB(MidB(Binary, 3, 1)))
  S = S & Hex(AscB(MidB(Binary, 4, 1)))
  S = S & "-"
  S = S & Hex(AscB(MidB(Binary, 5, 1)))
  S = S & Hex(AscB(MidB(Binary, 6, 1)))
  S = S & "-"
  S = S & Hex(AscB(MidB(Binary, 7, 1)))
  S = S & Hex(AscB(MidB(Binary, 8, 1)))
  S = S & "-"
  S = S & Hex(AscB(MidB(Binary, 9, 1)))
  S = S & Hex(AscB(MidB(Binary, 10, 1)))
  S = S & "-"
  S = S & Hex(AscB(MidB(Binary, 11, 1)))
  S = S & Hex(AscB(MidB(Binary, 12, 1)))
  S = S & Hex(AscB(MidB(Binary, 13, 1)))
  S = S & Hex(AscB(MidB(Binary, 14, 1)))
  S = S & Hex(AscB(MidB(Binary, 15, 1)))
  S = S & Hex(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {7E0E50-200-BA25-4026-410540450}

Uh, shouldn't the character counts of the sections be 8-4-4-4-12, instead of 6-3-4-4-9?

Oops. We need the single digit bytes like 0 to go to "00", not "0". That's easy enough to fix up:

Function HexByte(b)
  HexByte = Right("0" & Hex(b), 2)
End Function

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & HexByte(AscB(MidB(Binary, 1, 1)))
  S = S & HexByte(AscB(MidB(Binary, 2, 1)))
  S = S & HexByte(AscB(MidB(Binary, 3, 1)))
  S = S & HexByte(AscB(MidB(Binary, 4, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 5, 1)))
  S = S & HexByte(AscB(MidB(Binary, 6, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 7, 1)))
  S = S & HexByte(AscB(MidB(Binary, 8, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 9, 1)))
  S = S & HexByte(AscB(MidB(Binary, 10, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 11, 1)))
  S = S & HexByte(AscB(MidB(Binary, 12, 1)))
  S = S & HexByte(AscB(MidB(Binary, 13, 1)))
  S = S & HexByte(AscB(MidB(Binary, 14, 1)))
  S = S & HexByte(AscB(MidB(Binary, 15, 1)))
  S = S & HexByte(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {7E00E500-2000-BA25-4026-410054004500}

Which is also wrong. What's wrong this time?

The logical format of a GUID in memory is not in the same order as the bytes are in the string. A GUID stored in binary format in memory is a sixteen byte structure in the following format:

DWORD-WORD-WORD-BYTE BYTE-BYTE BYTE BYTE BYTE BYTE BYTE

So what? Why does that matter?

It matters because a WORD consists of two bytes, but they are stored in memory in order from the least to the most significant on my Intel machine. Same with the four-byte DWORD. Intel boxes are "little endian" machines. Motorolas are "big endian" -- on Macs, the big byte comes first in memory. Which is the better scheme is one of the great holy wars of information technology. Apparently some poor deluded people still fail to realize that little-endian architecture is much more sensible than big-endian, or that vi is a much better editor than emacs. :-)

(ASIDE: These whimsical terms were borrowed from Gulliver's Travels, in which Swift satirizes the political and religious quarrels of his day. In Lilliput, the Protestant rulers of England are represented by the Little Endians, the oppressed Catholics by the Big Endians. They disagree on which is the correct way to break an egg. See the last half of part one, chapter four for details.)
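
To make the byte-order point concrete, here is a small sketch of reading one little-endian field back into the order a human would write it. LittleEndianFieldToHex is a hypothetical helper introduced only for illustration, the sample bytes are the first four bytes implied by the output shown earlier, and WScript.Echo assumes Windows Script Host.

```vbscript
Option Explicit

' Pads a byte value to two hex digits, as HexByte does in the article.
Function HexByte(b)
  HexByte = Right("0" & Hex(b), 2)
End Function

' Hypothetical helper: reads Count bytes starting at the 1-based byte
' position Start and returns them as hex, most significant byte first,
' assuming the field is stored least significant byte first.
Function LittleEndianFieldToHex(Binary, Start, Count)
  Dim i, Result
  Result = ""
  For i = Start + Count - 1 To Start Step -1
    Result = Result & HexByte(AscB(MidB(Binary, i, 1)))
  Next
  LittleEndianFieldToHex = Result
End Function

Dim Field
' The DWORD &H00E5007E as it sits in memory on a little-endian machine.
Field = ChrB(&H7E) & ChrB(&H00) & ChrB(&HE5) & ChrB(&H00)

' Prints: 00E5007E
WScript.Echo LittleEndianFieldToHex(Field, 1, 4)
```

The same helper with a Count of 2 would handle each WORD; the trailing eight BYTE fields need no reversal because a single byte has no byte order.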

We need to decode that thing into the correct order:

Function GuidToString(ByteArray)
  Dim Binary, S
  Binary = CStr(ByteArray)
  S = "{"
  S = S & HexByte(AscB(MidB(Binary, 4, 1)))
  S = S & HexByte(AscB(MidB(Binary, 3, 1)))
  S = S & HexByte(AscB(MidB(Binary, 2, 1)))
  S = S & HexByte(AscB(MidB(Binary, 1, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 6, 1)))
  S = S & HexByte(AscB(MidB(Binary, 5, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 8, 1)))
  S = S & HexByte(AscB(MidB(Binary, 7, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 9, 1)))
  S = S & HexByte(AscB(MidB(Binary, 10, 1)))
  S = S & "-"
  S = S & HexByte(AscB(MidB(Binary, 11, 1)))
  S = S & HexByte(AscB(MidB(Binary, 12, 1)))
  S = S & HexByte(AscB(MidB(Binary, 13, 1)))
  S = S & HexByte(AscB(MidB(Binary, 14, 1)))
  S = S & HexByte(AscB(MidB(Binary, 15, 1)))
  S = S & HexByte(AscB(MidB(Binary, 16, 1)))
  S = S & "}"
  GuidToString = S
End Function

Which prints out {00E5007E-0020-25BA-4026-410054004500}, the correct string.
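
The long run of near-identical lines can also be folded into a loop over a table of byte positions. This is only a sketch of an alternative layout, not a replacement for the function above: GuidBinaryToString and the Order table are hypothetical names, the function takes the binary string (that is, the result of CStr(ByteArray)) rather than the byte array itself, the sample bytes are reconstructed from the outputs printed earlier, and WScript.Echo assumes Windows Script Host.

```vbscript
Option Explicit

' Pads a byte value to two hex digits, as HexByte does in the article.
Function HexByte(b)
  HexByte = Right("0" & Hex(b), 2)
End Function

' Hypothetical table-driven variant. Each entry in Order is a 1-based
' byte position in the binary string; 0 marks where a hyphen goes.
Function GuidBinaryToString(Binary)
  Dim Order, i, S
  Order = Array(4, 3, 2, 1, 0, 6, 5, 0, 8, 7, 0, 9, 10, 0, _
                11, 12, 13, 14, 15, 16)
  S = "{"
  For i = 0 To UBound(Order)
    If Order(i) = 0 Then
      S = S & "-"
    Else
      S = S & HexByte(AscB(MidB(Binary, Order(i), 1)))
    End If
  Next
  GuidBinaryToString = S & "}"
End Function

' Build the sixteen sample bytes in memory order.
Dim Bytes, Sample, n
Bytes = Array(&H7E, &H00, &HE5, &H00, &H20, &H00, &HBA, &H25, _
              &H40, &H26, &H41, &H00, &H54, &H00, &H45, &H00)
Sample = ""
For n = 0 To UBound(Bytes)
  Sample = Sample & ChrB(Bytes(n))
Next

' Prints: {00E5007E-0020-25BA-4026-410054004500}
WScript.Echo GuidBinaryToString(Sample)
```

Keeping the byte order in one table makes a slip such as swapping positions 9 and 10 easier to spot than it is in twenty separate statements.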

The whole point of script programming languages is to abstract away from the underlying details of how the machine works. Occasionally though these abstractions prove to be leaky. This is one of those times when in order to make sense of something, you need to understand some pretty low-level trivia about how computers work.

Comments

  • Anonymous
    May 25, 2004
    Did you know that in The Matrix Reloaded's freeway chase there is a truck that says "Big Endian Eggs" on the side? There are pictures here:

    http://whatisthematrix.warnerbros.com/rl_cmp/onset_page08.html

  • Anonymous
    May 25, 2004
    I did not know that. That is quite amusing! I'll have to look for that next time I see it.

  • Anonymous
    May 25, 2004
    Why do we care that "The logical format of a GUID in memory is not in the same order as the bytes are in the string"? GUIDs only need an equality relation defined on them, therefore we should not care how we serialize GUIDs, just as long as we have uniqueness (and possibly the process is reversible).

  • Anonymous
    May 25, 2004
    Lahnakoski: Sometimes we need to convert a GUID stored in memory into a human-readable string just to show it to the user; then it matters how we are displaying it and how it is stored.

  • Anonymous
    May 26, 2004
    Guids need more than an equality relation, they also need a consistent guid-to/from-string operation. Otherwise you can't take a class id and look it up in the registry.

  • Anonymous
    May 26, 2004
    Eric: The third and fourth definitions of GuidToString() both produce strings that can be recomposed into guids. So both are equally good.

  • Anonymous
    May 26, 2004
    You sure?

    Then write me a method which takes as its input a byte array containing a guid, and outputs True if that guid is registered under HKEY_CLASSES_ROOT\CLSID, False if it is not.

  • Anonymous
    May 27, 2004
    Eric: Good example, that is the type of example I was fishing for. I was not aware of all your requirements. Specifically, I was not aware you had to compare your serialized GUIDs to other systems' serialized GUIDs (like the registry system).

  • Anonymous
    July 26, 2004
    Doing something similar - thought I'd use a timestamp (actually nothing to do with dates; it can be considered as an array of 8 bytes or possibly as a bigint).

    It gets returned to VBScript from ADO as an array of bytes (8 elements, zero to seven).

    Having real problems dealing with it in ASP, so I am doing it in SQL, i.e. CAST(CAST(stamp AS BIGINT) AS VARCHAR) AS converted_stamp

    Kludgy, but time is tight! All I want to do is to store it in a hidden field so that the update is dependent on there being no other edits of the record - overwrite or reload.

  • Anonymous
    November 24, 2005
    I noticed that:

    S = S & HexByte(AscB(MidB(Binary, 10, 1)))
    S = S & HexByte(AscB(MidB(Binary, 9, 1)))

    should be swapped.

    I built a small script to collect the msExchMailboxGUID from AD, but it only came out right after I swapped the two lines mentioned above.

    Any thoughts on that?

  • Anonymous
    November 25, 2005
    Whoops -- yep, that's a typo. Thanks for pointing that out. I've corrected the text.

  • Anonymous
    October 21, 2009
    It's worth noting for those still dealing with VBScript that this routine is not needed for GUIDs returned from SQL Server as a "uniqueidentifier" via ADO. (Yes, the article starts with a mention of "array of bytes", but some people might say that a uniqueidentifier IS an "array of bytes", too.) Anyway, if the argument to the function is in a form that VBScript / ADO / OLE recognizes as a uniqueidentifier, the statement "Binary = CStr(ByteArray)" will simply do the conversion to string format, complete with braces. In that case, GuidToString() will be trying to format the text string, not the original binary string. To make the function more "universal", it could check whether Binary simply contains a GUID-formatted string and skip the HexByte calls. Simplistically:

    Binary = CStr(ByteArray)
    If Len(Binary) = 38 Then
      If Mid(Binary, 1, 1) = "{" And Mid(Binary, 38, 1) = "}" Then
        GuidToString = Binary
        Exit Function
      End If
    End If