Our application has been waiting on this piece of code for more than 37 hours.
The following thread is stuck:

This is the Reducer's GetMapCompletionEvent thread. All the map tasks have
completed successfully.
"IPC Client (906199566) connection to /xxx.xxx.xxx.xxx:xxxxx from root" daemon prio=10 tid=0x00007f1dc0055000 nid=0x3b18 runnable
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0x00007f1dcc77bdf0> (a sun.nio.ch.Util$1)
- locked <0x00007f1dcc77bdd8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00007f1dcc77ba48> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)

We checked the CPU usage during this period and did not find anything
unusual.
The select call is configured with a timeout of 60000 ms, so we would not
expect a single call to block for this long.
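
To make the expectation about the timeout concrete, here is a minimal sketch of a read guarded by a timed select. It is not Hadoop's actual implementation; the class name TimedSelectSketch and the helper readWithTimeout are hypothetical and only illustrate the behavior we would expect: when select returns without a ready key and the deadline has passed, the read gives up with a SocketTimeoutException instead of waiting indefinitely.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class TimedSelectSketch {

    // Hypothetical helper (not a Hadoop API): read from a channel,
    // giving up once timeoutMs has elapsed without any readable data.
    public static int readWithTimeout(SocketChannel ch, ByteBuffer buf, long timeoutMs)
            throws IOException {
        ch.configureBlocking(false); // selectors only accept non-blocking channels
        Selector selector = Selector.open();
        try {
            ch.register(selector, SelectionKey.OP_READ);
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (true) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new SocketTimeoutException(timeoutMs + " ms elapsed with no data");
                }
                // select(0) would block indefinitely, so only positive values are passed.
                if (selector.select(remaining) > 0) {
                    return ch.read(buf);
                }
                // select returned 0 (timeout or wakeup): loop and re-check the deadline.
            }
        } finally {
            selector.close(); // also cancels the key registered above
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        // A local server that accepts the connection in its backlog but never sends data.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(loopback, 0));
        SocketChannel client = SocketChannel.open(
                new InetSocketAddress(loopback, server.socket().getLocalPort()));
        try {
            // 500 ms here stands in for the 60000 ms used in our configuration.
            readWithTimeout(client, ByteBuffer.allocate(64), 500);
        } catch (SocketTimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        } finally {
            client.close();
            server.close();
        }
    }
}
```

In this sketch every pass through the loop re-checks the deadline, so the total wait is bounded by the timeout no matter how often select returns early. A caller that simply retries after each timeout would look the same in a thread dump, which is one thing worth ruling out here.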

Has anyone run into the same problem?
Any clues or suggestions would be appreciated.

The JDK version used is:

java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)

Linux kernel version:
Linux HOST-xxxx #1 SMP 2009-02-28 04:40:21 +0100 x86_64 x86_64 x86_64 GNU/Linux




Subroto Sanyal