I am learning the MapReduce framework and have the following questions about the same:
The MapReduce paradigm essentially has
map()andreduce()(and a few others as well). Can all the programming logic be effectively represented as either amap()or areduce()?For e.g., suppose I want to do an in-order traversal of a tree. Can this task be effectively divided only into a
map()andreduce()task? If yes, how? If no, then how do I leverage theMapReduceframework for this task?
Generic code for in-order traversal:
// Iterative solution
public void inOrderIter(TreeNode root) {
if(root == null)
return;
Stack<TreeNode> s = new Stack<TreeNode>();
TreeNode currentNode=root;
while(!s.empty() || currentNode!=null){
if(currentNode!=null)
{
s.push(currentNode);
currentNode=currentNode.left;
}
else
{
TreeNode n=s.pop();
System.out.printf("%d ",n.data);
currentNode=n.right;
}
}
}
Can we have only a
map()without a correspondingreduce()and vice versa?As per this and this, the
reduce()function generates the final output - is it necessary that it should generate only a single value?How do you decide if a certain task should be a part of
map()orreduce()?Any general pointers about how to
map-ify andreduc-ify a given task?