An introduction to Apache Falcon


Presentation Description

A short introduction to Apache Falcon: what is it, and what is it used for? How can it help with Hadoop-based data life cycle management? What is its architecture, and what are the benefits of using it?


Presentation Transcript

Apache Falcon:

- What is it?
- Benefits
- Architecture
- Example

Apache Falcon – What is it ?:

- A data life cycle management framework
- Created for Hadoop
- Data management logic lives in Falcon rather than in individual applications
- Simplifies data management
- Developed by InMobi and Hortonworks
- Falcon can manage
  - Workflows
  - Replication
- Provides data abstraction

Apache Falcon – What is it ?:

- Falcon provides services
  - Data import / replication
  - Scheduling / coordination
  - Lifecycle policies
  - Cluster management
  - SLA management
- An enterprise solution for data lifecycle management
- Currently an Apache Incubator project
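
To make "lifecycle policies" more concrete, a retention rule can be pictured as a filter over the dated instances of a data set: anything older than the retention limit becomes a candidate for deletion. The sketch below is purely illustrative and is not Falcon code; the function name, the three-day limit and the dates are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

def expired_instances(instances, retention, now):
    """Return the instance timestamps that fall outside the retention window.

    Hypothetical helper for illustration only; Falcon applies retention
    itself, based on the policy declared for a data set.
    """
    cutoff = now - retention
    return [ts for ts in instances if ts < cutoff]

# Three daily instances of a made-up data set, evaluated on 10 June 2014.
now = datetime(2014, 6, 10)
instances = [datetime(2014, 6, day) for day in (1, 5, 9)]

# With a three-day retention limit, only the 9 June instance is kept.
to_delete = expired_instances(instances, timedelta(days=3), now)
print(to_delete)
```

The point of declaring such a rule once, centrally, is that no individual application has to carry its own clean-up logic.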

Apache Falcon – Benefits:

- Reduce workflow / ETL development time
- Reduce costs
- No need to re-implement functionality
  - Already in Falcon
  - Already tested
- Use a single Falcon configuration file to
  - Define replication points
  - Define data processing pipeline
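
As a rough idea of what such a declarative definition might look like, the sketch below uses Python's standard library to build a feed definition with one source cluster and one replication target. The element and attribute names follow the Falcon feed entity format as best understood here and should be checked against the Falcon documentation for your version; the feed name, cluster names, owner, dates and path are all hypothetical.

```python
import xml.etree.ElementTree as ET

def build_feed(name, source, target, path, retention="days(90)"):
    """Build a feed definition held on `source` and replicated to `target`.

    Illustrative only: all values passed in below are made-up examples.
    """
    feed = ET.Element("feed", {"name": name, "xmlns": "uri:falcon:feed:0.1"})
    ET.SubElement(feed, "frequency").text = "hours(1)"

    # Listing a second cluster of type "target" is what declares replication.
    clusters = ET.SubElement(feed, "clusters")
    for cluster_name, role in ((source, "source"), (target, "target")):
        cluster = ET.SubElement(clusters, "cluster",
                                {"name": cluster_name, "type": role})
        ET.SubElement(cluster, "validity",
                      {"start": "2014-01-01T00:00Z", "end": "2016-01-01T00:00Z"})
        ET.SubElement(cluster, "retention",
                      {"limit": retention, "action": "delete"})

    locations = ET.SubElement(feed, "locations")
    ET.SubElement(locations, "location", {"type": "data", "path": path})
    ET.SubElement(feed, "ACL",
                  {"owner": "etl", "group": "users", "permission": "0755"})
    ET.SubElement(feed, "schema", {"location": "/none", "provider": "none"})
    return feed

feed = build_feed("presentedData", "primaryCluster", "backupCluster",
                  "/data/presented/${YEAR}-${MONTH}-${DAY}")
feed_xml = ET.tostring(feed, encoding="unicode")
print(feed_xml)
```

The design point is that replication and retention are stated as data in the definition, not coded into each ETL job.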

Apache Falcon – Architecture:

Apache Falcon – Architecture

Apache Falcon – BI Example:

- Falcon used to manage workflow
- Falcon used to manage cluster data replication
- BI example
  - Staged and presented data replicated
  - Presented data visible for
    - Reporting
    - Analytics
- See next slide

Apache Falcon – BI Example:

Apache Falcon – BI Example

Contact Us:

- Feel free to contact us at
- We offer IT project consultancy
- We are happy to hear about your problems
- You can just pay for those hours that you need to solve your problems
