MapReduce: Simplified Data Processing on Large Clusters (2004)

In order to be robust and efficient, programs that process huge amounts of data have to take into account how to parallelize work, distribute work, handle failures, and load balance work. The MapReduce framework implements these complicated pieces of boilerplate, allowing programmers to process huge amounts of data only having to write simple map and reduce functions.

Programming Model

Logically, users implement map and reduce functions of the following type:

though in reality, the MapReduce framework deals only with strings. In addition to providing map and reduce functions, the user also provides the name of inputs, the name of outputs, tuning parameters, etc. The MapReduce framework is expressive enough to implement distributed grep, URL access counts, reverse web-link graph, inverted index, and distributed sort.

Implementation

The MapReduce interface can be implemented in many different ways. For example, a simple single-threaded implementation could be used for debugging, or a NUMA multi-processor implementation could be used for datasets that fit in memory. Here, we discuss an implementation for a cluster of commodity machines. Execution proceeds in a number of steps:

  1. The input data is split into M 16-64 MB partitions. A master and a collection of workers are spawned.
  2. The single master assigns the M map tasks and R reduce tasks to workers.
  3. A mapper reads in its assigned partitions and applies the user provided map function to them. The intermediate key-value pairs are buffered in memory.
  4. Periodically, a mapper writes intermediate data to a local disk and sends the location of the files to the master who forwards them to the reducers.
  5. Reduces read the data written by the mappers using RPC when prompted to do so by the master. The reducer then (externally) sorts the data by intermediate key.
  6. The reducer applies the user provided reduce function creating one output file for each of the R output partitions.
  7. Finally, the user program is awoken.

Refinements