How to generate a GUID in 250K easy steps

A friend on my previous team wrote a tool to generate GUID values where the first two bytes are zero. (And I'll say right here that I don't think there is a legitimate need for such a tool, but that's not the point.) I'm working from memory but the C# code was something like:

public static Guid Generate()
{
   Guid guid = Guid.NewGuid();
   while (!guid.ToString().StartsWith("0000"))
   {
      guid = Guid.NewGuid();
   }
   return guid;
}

 

As I said, it was for a tool, and it would only get called once so the perf doesn't really matter; it gets the job done.  I just thought it was a pretty funny approach.  The friend in question was an excellent programmer, so I am going to blame it on too many limoncellos.  All I know is that now he's in law school.

 

Generating a GUID and just zeroing out the first two bytes (using ToByteArray and the appropriate Guid ctor) appears to be about 250,000 times faster.  That seems about right, if you think about it.

Comments

  • Anonymous
    December 04, 2006
    Yes, it is faster, but doesn't that increase the chance of a collision with another generated guid?

  • Anonymous
    December 04, 2006
    John Doe:  Good point, but I don't think so.  The original algorithm throws out Guids until it gets one with the first 16 bits equal to zero, so there will be 2^112 possible values (but see the next paragraph).  The second one generates a guid (2^128) and zeroes out the first 16 bits, but we're still left with 2^112 possible values.  If you consider generating a random string of four bits and zeroing out the first two then it is easier to think about. I am pretty sure (I haven't looked it up) that Guid.NewGuid uses information about the environment to generate values so I doubt that all 2^128 values are equally likely.  In that case there may be differences in the chance for collision between the two algorithms.  (e.g. suppose Guid.NewGuid() uses information about the machine to fill the first 16 bits.  Then running Guid.NewGuid() on some hypothetical machine may turn up 0's for the first 16 bits and the first function would run forever.) Even so, the domain is so large that the chance of collision is essentially zero for both algorithms.