Subject: Spatial data posting in HBase


Michael, don't you think Geohashes are good enough and well suited for
many cases anyway? Searching in a bounding box or an arbitrary polygon is
not that expensive with Geohash, even in edge cases. The biggest risk IMHO
is having to deal with tons of irrelevant extra points: if the geohash query
is not precise enough and your point distribution is very sparse, many
points will be returned from a geohash cell even though they don't match
your query criteria.
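
To make the "extra points" risk concrete, here is a rough sketch in Python (not tied to any HBase API). It assumes the standard base-32 geohash encoding; the helper names encode, cover and bbox_query are made up for illustration. The idea is to cover the bounding box with geohash cells at a chosen precision, collect every point that falls in one of those cells, then filter out the false positives with an exact test.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def encode(lat, lon, precision):
    """Standard geohash: interleave bits, longitude first, 5 bits per char."""
    rng = [[-180.0, 180.0], [-90.0, 90.0]]  # [longitude range, latitude range]
    val = (lon, lat)
    out, bits = [], 0
    for i in range(5 * precision):
        axis = i % 2                      # even bits refine longitude
        mid = (rng[axis][0] + rng[axis][1]) / 2
        bit = 1 if val[axis] >= mid else 0
        rng[axis][0 if bit else 1] = mid  # keep the half containing the value
        bits = (bits << 1) | bit
        if i % 5 == 4:
            out.append(BASE32[bits])
            bits = 0
    return "".join(out)

def cover(min_lat, min_lon, max_lat, max_lon, precision):
    """Set of geohash cells (hypothetical helper) overlapping the box."""
    lon_bits = (5 * precision + 1) // 2
    lat_bits = (5 * precision) // 2
    w = 360.0 / (1 << lon_bits)           # cell width in degrees
    h = 180.0 / (1 << lat_bits)           # cell height in degrees
    x0 = int((min_lon + 180) // w)
    x1 = min(int((max_lon + 180) // w), (1 << lon_bits) - 1)
    y0 = int((min_lat + 90) // h)
    y1 = min(int((max_lat + 90) // h), (1 << lat_bits) - 1)
    # Encode each cell centre to get the cell's geohash.
    return {encode(-90 + (y + 0.5) * h, -180 + (x + 0.5) * w, precision)
            for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}

def bbox_query(points, box, precision):
    """Return (candidates fetched via geohash cells, exact hits)."""
    min_lat, min_lon, max_lat, max_lon = box
    cells = cover(min_lat, min_lon, max_lat, max_lon, precision)
    candidates = [p for p in points if encode(p[0], p[1], precision) in cells]
    hits = [p for p in candidates
            if min_lat <= p[0] <= max_lat and min_lon <= p[1] <= max_lon]
    return candidates, hits
```

The gap between len(candidates) and len(hits) is the wasted work. A coarser precision means fewer cells to scan but more candidates to throw away; a finer one means the opposite.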

However, if your query embeds enough bits of precision, Geohashes offer
some nice guarantees for distributed databases and your queries should
remain efficient enough.
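
By "nice guarantees" I mostly mean that a geohash prefix maps to one contiguous range of lexicographically sorted row keys, which is how HBase orders rows. A small sketch of that idea, using a sorted Python list as a stand-in for the table (prefix_range and scan are hypothetical helpers, not HBase client calls, and the row keys are assumed to start with the geohash):

```python
import bisect

def prefix_range(prefix: bytes):
    """Return (start, stop) so that start <= key < stop matches the prefix."""
    stop = bytearray(prefix)
    # Drop trailing 0xFF bytes, then increment the last remaining byte.
    while stop and stop[-1] == 0xFF:
        stop.pop()
    if not stop:
        return prefix, None               # open-ended range
    stop[-1] += 1
    return prefix, bytes(stop)

def scan(sorted_keys, prefix: bytes):
    """Simulate a range scan over sorted row keys (stand-in for the table)."""
    start, stop = prefix_range(prefix)
    lo = bisect.bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if stop is None else bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]

# Made-up row keys for illustration only.
keys = sorted([b"u09tv1", b"u09tvw", b"u09tw0", b"u09wj2", b"u4pruy"])
in_cell = scan(keys, b"u09tv")
```

One cell is one sequential scan, so a query that needs a handful of cells turns into a handful of range scans rather than a full table pass.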

Another worst case of course is K-NN search, since Geohash does not truly
preserve locality through the longest common prefix (two close points can
have very different hashes). But once again, if your point distribution is
reasonably well balanced, it works fairly well without requiring lots of
recursive queries or fetching tons of useless data (though I do agree that
looking into your tiles would probably be more appropriate in that case).
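
To illustrate the K-NN caveat, here is a small Python sketch, again assuming the standard base-32 geohash encoding (encode and common_prefix_len are illustrative names). Two points a few dozen meters apart on either side of the Greenwich meridian share no prefix at all, while two points on the same side share a long one.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def encode(lat, lon, precision=8):
    """Standard geohash: interleave bits, longitude first, 5 bits per char."""
    rng = [[-180.0, 180.0], [-90.0, 90.0]]  # [longitude range, latitude range]
    val = (lon, lat)
    out, bits = [], 0
    for i in range(5 * precision):
        axis = i % 2                      # even bits refine longitude
        mid = (rng[axis][0] + rng[axis][1]) / 2
        bit = 1 if val[axis] >= mid else 0
        rng[axis][0 if bit else 1] = mid  # keep the half containing the value
        bits = (bits << 1) | bit
        if i % 5 == 4:
            out.append(BASE32[bits])
            bits = 0
    return "".join(out)

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Same latitude, 0.001 degree of longitude apart, straddling lon = 0.
west = encode(51.4779, -0.0005)
east = encode(51.4779, 0.0005)
# Same latitude, both just east of the meridian.
east2 = encode(51.4779, 0.0006)

across = common_prefix_len(west, east)    # neighbours across the meridian
same_side = common_prefix_len(east, east2)
```

This is why a K-NN lookup cannot rely on the prefix alone: it has to also look at the cells adjacent to the query cell, and widen the search if too few points come back.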

I'm planning to write an article on these points, so further technical
arguments are welcome :-}

On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
--
Adrien Mogenet
http://www.borntosegfault.com