MapReduce program other than WordCount
Understanding the fundamentals of MapReduce
MapReduce is a framework for writing programs that process large volumes of structured and unstructured data in parallel across a cluster, in a reliable and fault-tolerant manner. The MapReduce concept is simple to understand for those who are familiar with distributed processing frameworks.
MapReduce is all about key-value pairs. I will try to explain key/value pairs by covering some similar concepts in the Java standard library. In Java, the java.util.Map interface represents key-value mappings.
For any Java Map object, its contents are a set of mappings from a given key of a specified type to a related value of a potentially different type.
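As a small illustration of the Map idea described above, here is a minimal sketch in plain Java. The class name KeyValueDemo, the method maxReadings, and the city/temperature data are made up for this example; nothing here uses Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

class KeyValueDemo {
    // Hypothetical helper: maps each city (key, a String) to the highest
    // temperature seen for it (value, an Integer). Key and value types differ.
    static Map<String, Integer> maxReadings(String[] cities, int[] temps) {
        Map<String, Integer> maxByCity = new HashMap<>();
        for (int i = 0; i < cities.length; i++) {
            // merge() stores the value if the key is new, otherwise keeps the larger one.
            maxByCity.merge(cities[i], temps[i], Math::max);
        }
        return maxByCity;
    }

    public static void main(String[] args) {
        Map<String, Integer> result = maxReadings(
                new String[] {"Pune", "Delhi", "Pune"},
                new int[] {31, 40, 35});
        // Each entry is one key-value pair.
        for (Map.Entry<String, Integer> entry : result.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Each key appears once in the Map, and looking up a key returns its associated value, which is the same key-to-value relationship MapReduce builds on.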
In the context of Hadoop, we are referring to keys that are associated with values. This data in MapReduce is stored in such a way that the values can be sorted and rearranged (shuffle and sort, in MapReduce terms) across a…
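To make the shuffle and sort idea concrete, the sketch below imitates it inside a single JVM: it takes a list of key-value pairs, groups the values by key, and orders the keys. This is only an illustration in plain Java, not Hadoop's API or its actual implementation; the class ShuffleSortSketch, the method shuffleAndSort, and the sample data are assumptions made for this example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleSortSketch {
    // Groups all values that share a key and returns the groups in sorted key order.
    // In a real cluster this regrouping happens across machines; here it is one loop.
    static Map<String, List<Integer>> shuffleAndSort(List<Map.Entry<String, Integer>> pairs) {
        // TreeMap keeps its keys sorted, standing in for the "sort" part.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical map-phase output: (city, temperature) pairs in arbitrary order.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>("Pune", 31));
        pairs.add(new AbstractMap.SimpleEntry<>("Delhi", 40));
        pairs.add(new AbstractMap.SimpleEntry<>("Pune", 35));

        System.out.println(shuffleAndSort(pairs)); // prints {Delhi=[40], Pune=[31, 35]}
    }
}
```

In this sketch the values for a key keep their input order; that is a property of this example only and should not be read as a guarantee about Hadoop.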