Matt has the right combination of expertise in both databases and MapReduce. Hive also offers Hive-specific tools, such as the map-side joins Matt suggested, to help out.

The short answer on MapReduce algorithms is that the individual computational units can't communicate with each other: no mapper (in fact, no individual map() call) can communicate with any other, and the same goes for reducers. Data moves between the two phases only through the framework's shuffle of map output to the reducers. That's one of the major distinctions between MapReduce and more general parallel processing frameworks like MPI. This is the wrong mailing list to go much deeper than that, however.
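If it helps to see that constraint concretely, here is a toy, single-process sketch in Python. It is not Hadoop or Hive code: map_fn, reduce_fn, and run_job are names I made up for illustration, and the word-count job is just an assumed example. The point is only that each function sees its own input and nothing else, and the grouping step in between is the sole channel.

```python
from collections import defaultdict


def map_fn(record):
    # Hypothetical mapper: sees exactly one input record and has no
    # access to any other record or to any other map_fn call's state.
    for word in record.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    # Hypothetical reducer: sees one key and the values emitted for it,
    # and nothing about any other key.
    return key, sum(values)


def run_job(records):
    # Stand-in for the framework: group map output by key (the "shuffle"),
    # then hand each group to a reducer. This grouping is the only way
    # data moves from the map phase to the reduce phase.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


counts = run_job(["Hive on Hadoop", "hadoop and hive"])
print(counts)
```

An algorithm that needs one mapper to peek at what another mapper has seen so far doesn't fit this shape directly; it has to be recast as a key grouping or as several chained jobs.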

