An Introduction to Apache Hadoop MapReduce


Presentation Description

An introduction to Apache Hadoop MapReduce: what is it and how does it work? What is the MapReduce cycle and how are jobs managed? Why should it be used, and who are the big users and providers?

Presentation Transcript

Apache Hadoop MapReduce:

What is it?
Why use it?
How does it work?
Some examples
Big users

MapReduce – What is it ?:

Processing engine of Hadoop
Developers create Map and Reduce jobs
Used for big data batch processing
Parallel processing of huge data volumes
Fault tolerant
Scalable

MapReduce – Why use it ?:

Your data is in the terabyte / petabyte range
You have huge I/O
The Hadoop framework takes care of:
  Job and task management
  Failures
  Storage
  Replication
You just write Map and Reduce jobs

MapReduce – How does it work ?:

Take word counting as an example, something that Google does all of the time.

MapReduce – How does it work ?:

Input data is split into shards
Split data is mapped to key,value pairs, e.g. Bear,1
Mapped data is shuffled/sorted by key, e.g. Bear
Sorted data is reduced, e.g. Bear,2
Final data is stored on HDFS
There might be an extra map layer before the shuffle
The JobTracker controls all tasks in a job
The TaskTracker controls map and reduce tasks
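The word-count cycle described on this slide can be sketched in a few lines of plain Python. This is only an in-memory illustration of Split -> Map -> Shuffle -> Reduce, not Hadoop code: the function names (split_input, map_shard, shuffle, reduce_groups, word_count), the round-robin sharding and the sample sentences are all assumptions made for the example, and there is no HDFS, JobTracker or TaskTracker involved.

```python
from collections import defaultdict
from itertools import chain


def split_input(lines, num_shards):
    """Split step: deal the input lines into shards (round-robin, an assumption)."""
    return [lines[i::num_shards] for i in range(num_shards)]


def map_shard(shard):
    """Map step: emit a (word, 1) pair for every word in the shard."""
    return [(word, 1) for line in shard for word in line.split()]


def shuffle(mapped_pairs):
    """Shuffle/sort step: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_groups(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, num_shards=3):
    """Run the whole cycle in one process: split, map, shuffle, reduce."""
    shards = split_input(lines, num_shards)
    mapped = chain.from_iterable(map_shard(shard) for shard in shards)
    return reduce_groups(shuffle(mapped))


if __name__ == "__main__":
    # Hypothetical sample input, chosen so that "Bear" reduces to 2.
    sample = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
    print(word_count(sample))
    # prints {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

In a real Hadoop job each shard would be mapped on a different node and the shuffle would move data across the network. Here every step is an ordinary function call, so the data flow is easy to follow.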

MapReduce - Some examples:

A visual example with colours to show you the cycle: Split -> Map -> Shuffle -> Reduce

MapReduce - Some examples:

A visual example of MapReduce with job and task trackers added to individual map and reduce jobs.

Hadoop MapReduce – Big users:

Users
  Facebook
  Yahoo
  Amazon
  eBay
Providers
  Amazon
  Cloudera
  Hortonworks
  MapR

Contact Us:

Feel free to contact us at
  www.semtech-solutions.co.nz
  info@semtech-solutions.co.nz
We offer IT project consultancy
We are happy to hear about your problems
You pay only for the hours that you need to solve your problems
