HBase Digest, June 2010

HBase 0.20.5 is out! It fixes 24 issues since the 0.20.4 release. HBase developers “recommend that all users, particularly those running 0.20.4, upgrade to this release”.

Community trends:

  • There’s a clear need in “sanity check DNS across my cluster” tool as a lot of questions/help requests related to the name/address resolution in the cluster are submitted over time. Any volunteers?
  • Bulk incremental load into an existing table feature (HBASE-1923) is commited to trunk. No multi-family support still.
  • Good number of advice about increasing the write performance/speed in this thread, including shared numbers/techniques from a large production cluster.
  • A set of ORM tools to consider for HBase are suggested here.

Notable efforts:

FAQ:

  • Common issue: tables/data disappears after system restart. Usually people face it when playing with HBase for the first time and even on the single node set-up. The problem is that by default HDFS is configured to store its data in the /tmp dir which might get cleaned up by OS. Configure “dfs.name.dir” and “dfs.data.dir” properties in hdfs-site.xml to aviod these problems.

3 thoughts on “HBase Digest, June 2010

    1. Yes, it’s relevant. I have it in one of my browser tabs, waiting to be read (big page that requires lots of reading means I keep postponing reading it).

Leave a Reply