Hadoop Digest, July 2010

Strong moves towards the Hadoop 0.21 release “detected”: 0.21 Release Candidate 0 came out and was tested. A number of issues were identified, and with them the roadmap to the next candidate is set. Tom White has been hard at work […]

Hadoop Digest, June 2010

The Hadoop 0.21 release is getting close: a few blocking issues remain in the Common, HDFS and MapReduce modules. Big announcement from Cloudera: CDHv3 and Cloudera Enterprise were released. In CDHv3 beta 2 the following was added: HBase: the popular distributed columnar […]

Hadoop Digest, May 2010

Big news: HBase and Avro have become Apache Top-Level Projects (TLPs)! The initial discussion happened when our previous Hadoop Digest was published, so you can find links to the threads there. The question of whether to become a TLP […]

HBase Digest, March 2010

We were waiting until the end of the month hoping to include coverage of the new HBase 0.20.4 version, but HBase developers are still working on it. This release will contain a lot of critical fixes and enhancements, so stay […]

Hadoop Digest, March 2010

Main news first: Hadoop 0.20.2 was released! The list of changes may be found in the release notes here. Related news: Maven artifacts have been pushed to repository.apache.org. This version has entered the Debian unstable repository. Cloudera officially announced the CDH2 release […]

Nutch Digest, March 2010

This is the first post in the Nutch Digest series, so a little introduction to Nutch seems in order. Nutch is a multi-threaded and, more importantly, distributed Web crawler with distributed content processing (parsing, filtering), a full-text indexer and […]