Filter: One Odd Duck

Hi Folks,

Over at his blog, Nicklas Avéns questions the behavior of Filter. He's right: Filter is a little odd. It is also very much by design.

Before we look at filter, let's review the data flow for a typical spatial query. A spatial selection generally has two internal components: a primary filter, which is designed to be fast, but which may produce extra results; and a secondary filter, which is slower, but only works to remove the false positives passed by the primary filter.

The primary filter is accomplished by using the spatial index. The secondary filter requires calls to actual geometry routines, and will be much more expensive. But the secondary filter also guarantees index-independent—and geometrically correct—results.

Without an index, most operations will just skip the primary filter. The result is slow, but returns the same results as the full primary–secondary pathway.

Filter is designed for a pretty common, but somewhat odd use case. Imagine that you want to display a portion of a map: you want to retrieve from the database those objects that would show on the map, but if you happen to retrieve a few extras you don't care—they'll just fall off the display.

In other words, false positives are okay. In fact, they're often preferred, because the overall performance is can be much better if you can skip the expensive geometry operations and instead render some additional objects.

So we added Filter, which simply skips the secondary filter and gives you the quick index-only results.

But what if you don't have an index? We had two choices, either (a) return an error, or (b) revert to STIntersects. Note that both of these options have index-dependent behavior: that ship sailed when we decided to skip the secondary filter.

We chose to allow the operation, but revert to STIntersects in the non-indexed case. This gives us clear semantics: the correct results are always a subset of what we return. It's true that for SQL Server, these are somewhat odd semantics, since they are index-dependent.

But this is the only place in spatial where we have such behavior. If you want consistent results regardless of index, use STIntersects or any of the other methods we provide. If your use case fits the one I described above, you may very well want Filter.

Call it useful, call it odd. Call it Filter.

Cheers,
-Isaac

Comments

  • Anonymous
    March 04, 2010
    Hi Isaac Thank for the reply! I understand the idea and I see the use. What made me call it ugly (sorry for that :-) ) is that since the planner handles this function differently in the select-part and the where-part even if you have an index on it you get different answers in different parts of the same query. I guess it is just a function that shouldn’t be used in the select part… ever :-) Cheers Nicklas

  • Anonymous
    March 04, 2010
    Hi Nicklas, No worries: I wasn't offended by the "ugly" comment. It is a little ugly. The SELECT comment is a reasonable point: I don't see much of a use for it in outside the WHERE clause. But it's uglier to make contextual distinctions about where the method could be used and, e.g., prohibit it in a SELECT clause. Preemptive snarky comment: Yes, SQL does things like that all over. But that doesn't mean it's good. :) Cheers, -Isaac