Apache Flink


Presentation Description

This presentation gives an overview of the Apache Flink project. It explains Flink in terms of its architecture, use cases and the manner in which it works.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/


Presentation Transcript

slide 1:

What Is Apache Flink
● A stream processing framework
● Open source / Apache 2.0 license
● Written in Java and Scala
● For batch and stream processing
● For high volume, low latency
● Develop in Java, Scala, Python, SQL
● Automatic compilation/optimization into data flows

slide 2:

How Does Flink Work
● Processes Unbounded and Bounded Data
● Uses file systems to consume/persistently store data, e.g.
  – local, Hadoop-compatible, Amazon S3, MapR FS, OpenStack Swift FS, Aliyun OSS and Azure Blob Storage
● Leverages In-Memory Performance
● Provides a rich function set for handling
  – Streams, state and time
  – When building applications
● Provides layered APIs that balance
  – Conciseness and expressiveness
  – See next slide
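The idea of handling bounded and unbounded data with the same state-and-time logic can be sketched in plain Python. This is not the Flink API. The function name `tumbling_window_counts`, the `(timestamp, key)` event shape and the assumption that events arrive in timestamp order are all hypothetical choices made for illustration. Flink itself also handles out-of-order events, which this sketch does not.

```python
import itertools
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[int, str]], window_size: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, counts_per_key) each time a window closes.

    Assumption: events arrive in non-decreasing timestamp order.
    Windows that receive no events are not emitted.
    """
    current_start = None
    counts: Counter = Counter()  # the "state" held for the open window
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_size)
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # An event for a later window closes the current one.
            yield current_start, dict(counts)
            current_start, counts = window_start, Counter()
        counts[key] += 1
    if current_start is not None:
        # Only reached for bounded input: flush the last open window.
        yield current_start, dict(counts)


# Bounded input: a finite list is processed to completion.
bounded = [(0, "a"), (3, "b"), (4, "a"), (10, "a"), (12, "a"), (25, "b")]
for window in tumbling_window_counts(bounded, window_size=10):
    print(window)

# Unbounded input: an endless generator; results are consumed lazily.
unbounded = ((t, "k") for t in itertools.count())
for window in itertools.islice(tumbling_window_counts(unbounded, 5), 2):
    print(window)
```

The same function serves both cases because it emits results incrementally instead of waiting for the input to end. A bounded data set is simply a stream that happens to finish.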

slide 3:

How Does Flink Work
[Figure: Flink layered APIs]

slide 4:

Flink APIs
● SQL / Table API
● DataStream API
● ProcessFunctions – event processing
● Flink also has libraries for common data processing
  – Complex Event Processing (CEP)
  – DataSet API
  – Gelly – library for scalable graph processing/analysis

slide 5:

Flink Used By

slide 6:

Flink Deployment
● Deploy Flink to use the following cluster managers
  – YARN
  – Mesos
  – Kubernetes
  – Standalone
● All application control communications via REST calls
● Deploy at any scale
  – multiple trillions of events per day
  – multiple terabytes of state
  – thousands of cores

slide 7:

Flink Architecture

slide 8:

Flink Stateful Functions
● Simplifies building distributed stateful applications
● Provides a runtime built for serverless architectures
● Key Benefits
  – Dynamic Messaging
  – Consistent State
  – Multi-language Support
  – No Database Required
  – Cloud Native
  – "Stateless" Operation
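Two of the benefits above, dynamic messaging and per-function state, can be illustrated with a toy single-process model in plain Python. This is not the Stateful Functions SDK. `ToyRuntime`, `greeter`, `printer` and the `(function type, instance id)` address shape are hypothetical names invented for this sketch. A real runtime would also distribute the functions and make their state fault tolerant.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Address = Tuple[str, str]  # (function type, instance id), hypothetical shape


class ToyRuntime:
    """Routes messages to functions and keeps one state dict per address."""

    def __init__(self) -> None:
        self.functions: Dict[str, Callable] = {}
        self.state: Dict[Address, dict] = {}
        self.mailbox: Deque[Tuple[Address, object]] = deque()
        self.egress: List[str] = []  # stand-in for an external output

    def register(self, function_type: str, fn: Callable) -> None:
        self.functions[function_type] = fn

    def send(self, target: Address, message: object) -> None:
        # Dynamic messaging: any function may address any other at runtime.
        self.mailbox.append((target, message))

    def run(self) -> None:
        while self.mailbox:
            target, message = self.mailbox.popleft()
            fn = self.functions[target[0]]
            # The runtime owns the state, so the function code holds none.
            storage = self.state.setdefault(target, {})
            fn(self, target, storage, message)


def greeter(runtime: ToyRuntime, address: Address, storage: dict, name: str) -> None:
    storage["seen"] = storage.get("seen", 0) + 1
    runtime.send(("printer", "out"), f"Hello {name}, visit #{storage['seen']}")


def printer(runtime: ToyRuntime, address: Address, storage: dict, text: str) -> None:
    runtime.egress.append(text)


runtime = ToyRuntime()
runtime.register("greeter", greeter)
runtime.register("printer", printer)
for user in ["alice", "alice", "bob"]:
    runtime.send(("greeter", user), user)
runtime.run()
print(runtime.egress)
```

Because the state lives in the runtime and is handed to the function on each invocation, the function code itself stays stateless. That is one reading of the quoted "Stateless" in the list above.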

slide 9:

Flink Stateful Functions

slide 10:

Flink Use Cases
● Event-driven Applications, e.g.
  – Fraud detection
  – Anomaly detection
● Data Analytics Applications
  – Quality monitoring of Telco networks
  – Analysis of product updates and experiment evaluation in mobile applications
● Data Pipeline Applications
  – Real-time search index building in e-commerce
  – Continuous ETL in e-commerce
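As a rough illustration of the fraud detection use case, the following plain-Python sketch keeps a small piece of state per account and raises an alert when a very small transaction is immediately followed by a large one. It is not Flink code. The rule, the thresholds, and the names `Transaction` and `detect_fraud` are assumptions chosen for this example, and transactions for each account are assumed to arrive in time order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator

# Illustrative thresholds (assumptions, not taken from the slides).
SMALL_AMOUNT = 1.00
LARGE_AMOUNT = 500.00
MAX_GAP_SECONDS = 60


@dataclass(frozen=True)
class Transaction:
    account_id: str
    timestamp: int  # seconds
    amount: float


def detect_fraud(transactions: Iterable[Transaction]) -> Iterator[Transaction]:
    """Yield each large transaction that directly follows a small one
    on the same account within MAX_GAP_SECONDS."""
    # Keyed state: per account, the time of the previous transaction
    # if (and only if) that transaction was small.
    last_small_at: Dict[str, int] = {}
    for tx in transactions:
        # Read and clear the flag so it applies to the next transaction only.
        small_at = last_small_at.pop(tx.account_id, None)
        if (
            small_at is not None
            and tx.amount > LARGE_AMOUNT
            and tx.timestamp - small_at <= MAX_GAP_SECONDS
        ):
            yield tx
        if tx.amount < SMALL_AMOUNT:
            last_small_at[tx.account_id] = tx.timestamp


stream = [
    Transaction("a", 0, 0.50),
    Transaction("b", 5, 700.00),
    Transaction("a", 10, 900.00),
    Transaction("a", 20, 0.20),
    Transaction("a", 200, 800.00),
]
for alert in detect_fraud(stream):
    print("ALERT:", alert)
```

The design point is that each event is handled as it arrives, using only the state stored for its own key. This is the pattern that event-driven applications rely on.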

slide 11:

Flink Use Cases

slide 12:

Flink Use Cases

slide 13:

Available Books
● See “Big Data Made Easy” – Apress Jan 2015
● See “Mastering Apache Spark” – Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
● Find the author on Amazon
  – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020

slide 14:

Connect
● Feel free to connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
  – open-source-systems.blogspot.com/
● I am always interested in
  – New technology
  – Opportunities
  – Technology based issues
  – Big data integration
