Introduction to Apache Hadoop

Presentation Description

A short presentation introducing Apache Hadoop: what it is, what it can do, and which other products are associated with it.

Presentation Transcript

Apache Hadoop:

- What is it?
- Architecture
- Related projects
- Large users

Hadoop – What is it?:

- An open source system developed using Java
- Supports very large data sets
- Supports large clusters of servers
- Designed to run on pre-existing, low-cost hardware
- Allows for fragmentation of work over the cluster
- Allows for fragmentation of storage over the cluster
- Provides resilience via automatic failure handling
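
The storage fragmentation and failure handling listed above can be illustrated with a minimal sketch. It is not Hadoop code: the function names, node names, block size and replica count are all assumptions chosen for illustration, and the round-robin placement is a simplification rather than the placement policy Hadoop actually uses.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replicas: int) -> dict[int, list[str]]:
    """Assign each block to `replicas` distinct nodes, round-robin (illustrative only)."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replicas)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed_node: str) -> bool:
    """True if every block still has at least one copy on a surviving node."""
    return all(
        any(node != failed_node for node in holders)
        for holders in placement.values()
    )


# Hypothetical data and cluster, small enough to inspect by hand.
data = b"x" * 1000
blocks = split_into_blocks(data, block_size=256)
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_replicas(len(blocks), nodes, replicas=3)

for block_id, holders in placement.items():
    print(block_id, len(blocks[block_id]), holders)
print("survives loss of node-a:", readable_after_failure(placement, "node-a"))
```

Because each block lives on more than one node, losing a single node leaves every block readable. That is the idea behind the automatic failure handling mentioned above.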

Hadoop - Architecture:

Hadoop consists of:

- Hadoop Common: common utilities for Hadoop module support
- Hadoop MapReduce: parallel processing of Hadoop data
- Hadoop YARN: scheduler and resource manager
- Hadoop Distributed File System (HDFS): a master/slave file system which spreads the Hadoop data over a very large cluster of slave data nodes controlled by a single name node
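
To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It does not use the Hadoop API. The function names are hypothetical, and in a real cluster the map and reduce calls would run in parallel on many nodes rather than in one Python process.

```python
from collections import defaultdict


def map_phase(line: str) -> list[tuple[str, int]]:
    """Mapper: emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Group values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: list[str]) -> dict[str, int]:
    """Run map over every line, shuffle the pairs, then reduce each group."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


counts = word_count(["Hadoop stores data", "Hadoop processes data"])
print(counts)
```

Each mapper only needs its own line and each reducer only needs the values for its own key, which is what allows the work to be spread over a cluster.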

Hadoop – Related Projects:

- Pig: for analysing large data sets
- Hive: data warehouse system for Hadoop
- Mahout: machine learning and data mining
- Avro: a data serialization system
- ZooKeeper: helps build distributed applications
- Chukwa: data collection and analysis

Hadoop – Related Projects:

- Hue: Hadoop user interface
- Oozie: workflow scheduler
- Hama: bulk synchronous parallel framework for massive scientific computations
- Nutch: web crawler
- HBase: non-relational database

Hadoop – Large Users:

- Yahoo: 10,000 core Linux cluster
- Facebook: 100 petabytes, growing at 0.5 petabytes a day
- Amazon: it is possible to run Hadoop on Amazon's EC2 and S3
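
The Facebook figures above can be put in perspective with a little arithmetic. The sketch below assumes, purely for illustration, that the growth stays constant at 0.5 petabytes a day. The slide does not say whether this is the case.

```python
# Figures taken from the slide.
total_pb = 100.0
growth_pb_per_day = 0.5

# Assumption: growth stays linear at the quoted daily rate.
days_to_double = total_pb / growth_pb_per_day
growth_pb_per_year = growth_pb_per_day * 365

print("days until the data set doubles:", days_to_double)
print("petabytes added per year:", growth_pb_per_year)
```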

Contact Us:

Feel free to contact us at:

www.semtech-solutions.co.nz
info@semtech-solutions.co.nz

- We offer IT project consultancy
- We are happy to hear about your problems
- You can pay for just the hours that you need to solve your problems
