Tue 1 Dec 2009 | Hadoop on Sun Grid Engine
Dan Templeton, who I used to work with at the Grid Forum
sessions, has written up Sun Grid Engine's Hadoop integration.
A key feature is that they've added location awareness to the
scheduler: something runs on each datanode to log and report the
blocks stored there, so when work is scheduled against that data,
it can be run on or near that node.
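To make the idea concrete, here is a rough Python sketch of locality-aware placement. It is not how Sun Grid Engine actually does it; the function `pick_node`, the `block_locations` and `rack_of` tables, and the node and block names are all made up for illustration. It assumes the per-datanode reports have already been gathered into a block-to-nodes map, and it prefers a free node that holds the block, then a free node on the same rack as a holder, then anything free.

```python
from typing import Dict, List, Optional, Set


def pick_node(block: str,
              block_locations: Dict[str, Set[str]],
              rack_of: Dict[str, str],
              free_nodes: List[str]) -> Optional[str]:
    """Pick a free node for a task that reads `block` (hypothetical helper)."""
    # Nodes that reported holding a copy of this block.
    holders = block_locations.get(block, set())

    # 1. Data-local: a free node that already stores the block.
    for node in free_nodes:
        if node in holders:
            return node

    # 2. Rack-local: a free node on the same rack as some holder.
    holder_racks = {rack_of[h] for h in holders if h in rack_of}
    for node in free_nodes:
        if rack_of.get(node) in holder_racks:
            return node

    # 3. No locality available: take any free node, or None if there is none.
    return free_nodes[0] if free_nodes else None


# Hypothetical cluster state, as it might look after the datanode reports.
block_locations = {"blk_1": {"node-a", "node-b"}}
rack_of = {"node-a": "rack-1", "node-b": "rack-1",
           "node-c": "rack-1", "node-d": "rack-2"}

print(pick_node("blk_1", block_locations, rack_of, ["node-d", "node-b"]))
print(pick_node("blk_1", block_locations, rack_of, ["node-d", "node-c"]))
```

The first call picks `node-b` because it holds the block; the second falls back to `node-c`, which holds nothing but shares a rack with the holders. Nothing in the function cares whether the task is a map, a reduce or something else entirely, which is the appeal of putting locality in a general-purpose scheduler.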
There are some devious things going on there, I suspect. What's
interesting here is that you can do other kinds of work than just
MapReduce, and when you do want to work against data, be it for
MapReduce or other tasks, the scheduler can place the work near it.
This makes it an alternative to Hadoop's own JobTracker, which only
likes a limited set of jobs: Map, Shuffle, Reduce.