We are facing problems with really slow HBase region server recoveries, around
20 minutes. The version is HBase 0.94.3, compiled with hadoop.profile=2.0.

The Hadoop version is CDH 4.2 with HDFS-3703 and HDFS-3912 patched and stale
node timeouts configured correctly. Dead node detection still takes
10 minutes.
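
For reference, a sketch of the kind of stale-node settings meant here. The
property names are the upstream Hadoop ones (after HDFS-3912) and the values
are illustrative assumptions, not the actual cluster configuration; a patched
CDH build may name them differently.

```xml
<!-- hdfs-site.xml: illustrative values only, assumed for this sketch -->
<property>
  <!-- skip stale datanodes when ordering replicas for reads -->
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <!-- skip stale datanodes when choosing targets for writes -->
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <!-- milliseconds without a heartbeat before a datanode is marked stale -->
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>20000</value>
</property>
```

These settings only control when a datanode is marked stale. Dead node
detection is governed separately: in upstream HDFS it is
2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval,
which with the defaults (5 minutes and 3 seconds) comes to about 10.5
minutes, consistent with the 10 minutes mentioned above.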

We see that our region server is trying to read an HLog and is stuck there for
a long time. Logs here:

2013-04-12 21:14:30,248 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
connect to / for file
for block
15000 millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/]

I would think that HDFS-3703 would make the server fail fast and go to the
third datanode. Currently, the recovery seems way too slow for production