The Imprecise Nature of Geometry

Hi Folks,

Let's kick this off with what may be a surprising result.  What should this code return?

 declare @g geometry = 'LINESTRING (0 0, 1 3)'
 declare @h geometry = 'LINESTRING (0 3, 2 0)'
 declare @i geometry = @g.STIntersection(@h).ToString()
 select @i.STIntersects(@g), @i.STIntersects(@h)  -- returns 0, 0

It would seem that @i should be the point where @g and @h intersect, and so a test whether @i intersects @g and @h should clearly return true (or 1).  However, for some reason @i does not appear to intersect either @g or @h.  How can that be true?

The reason is that geometric calculations are by nature imprecise, and this is an unavoidable result of the imprecise nature of the data types upon which these calculations are built.  To illustrate, let's take a look at a non-geometric example.  First, some values to work with:

 declare @u float = 1000
 declare @v float = -1
 declare @w float = 1.0001

Note that none of these values are particularly large: they all fit comfortably within our 64-bit floating point variables.  Next let's look at a pair of equations:

 @u * (@v + @w)
 (@u * @v) + (@u * @w)

If we think back to basic algebra, we know that these two equations should be the same: in the second one we've simply distributed @u over (@v + @w).  But, if we test this...

 if  (@u * (@v + @w)) = ((@u * @v) + (@u * @w))
     print 'equal'
 else 
     print 'not equal'

...we find that SQL Server doesn't think they're equal.  We can look at each of these values out to see what's going on:

 select @u * (@v + @w)        -- 0.099999999999989
 select (@u * @v) + (@u * @w) -- 0.100000000000023

Ack!  If we were working with true rational numbers, these values should be exactly the same (and should be exactly .1).  Our floating point calculations don't return the correct result, and don't even return the same results: they will never compare as equal.

There is, in fact, no way to avoid this type of problem completely with any bounded numerical representation, so we are stuck dealing with imprecise calculations no matter how we represent values.  One result of this is the maxim that you should never test floating point numbers for equality, and this general rule extends to geometries (and geographies) as well.

Let's look again at our geometry example.  If we again do a little algebra, we find that the actual intersection point should be (2/3, 2).  If we as SQL Server, however, our results are a little off:

 declare @g geometry = 'LINESTRING (0 0, 1 3)'
 declare @h geometry = 'LINESTRING (0 3, 2 0)'
 declare @i geometry = @g.STIntersection(@h).ToString()
 select @i.ToString()  -- result: POINT (0.66666666666666674 2)

 

Errors like these are unavoidable in principle.  One may be inclined to simply increase the precision of one's calculations, but no (finite) amount of tightening can truly eliminate this problem.

Cheers, and enjoy  SQL Server 2008.

-Isaac

Comments

  • Anonymous
    August 07, 2008
    Wouldn't internally using a highly precise fixed-point (rather than 8-btye double) type address some of this? Grant decimal(38,n) isn't storage appealing, but...

  • Anonymous
    August 07, 2008
    Hi Kent, Unfortunately, fixed point---binary or decimal---is no panacea here.  For example, I think we'd agree that a * (x / a) should always equal x, so long as a is not zero. But... declare @a decimal(38,20) = 3 declare @x decimal(38,20) = 1 select  @a * (@x / @a) ...comes back with the result 0.999999, not 1.0. The only true way to handle all computations like the ones in this post without error would be to actually store rational numbers, but the storage size of a rational is not bounded.  Rationals also don't account for what happens when we take a sine, or even a square root. Cheers, -Isaac

  • Anonymous
    August 09, 2008
    The comment has been removed

  • Anonymous
    August 09, 2008
    The comment has been removed

  • Anonymous
    August 09, 2008
    Hi Regina, This could probably use another post or two---there's quite a bit of meat here.  I would be very surprised (shocked, even!) if SQL Server actually matched PostGIS in all such cases. SQL Server can make use of the index to answer the STDistance predicate you suggest, though. Cheers, -Isaac

  • Anonymous
    September 12, 2008
    Using the STBuffer function is another way to specify the desired intersection within a desired range of error (conceptually the same as Regina's response): declare @g geometry = 'LINESTRING (0 0, 1 3)' declare @h geometry = 'LINESTRING (0 3, 2 0)' declare @i geometry = @g.STIntersection(@h).STBuffer(.0001) select @i.STIntersects(@g), @i.STIntersects(@h)  -- Returns 1, 1 Luke