An introduction to Databricks


Presentation Description

An introduction to Databricks: what is it, how does it work, and what can it do?


Presentation Transcript


Databricks
- What is Databricks?
- Cloud services used
- Functionality
- Languages
- Spark usage
- 3rd party apps
- Architecture
- Books

Databricks – What is it?:

- A cloud-based Apache Spark cluster service
- Offers scalable Spark clusters based on AWS
- Developed by the same people who created Spark
- Multiple cluster management
- Job scheduling and library import
- Offers access to all Spark modules

Databricks – Cloud Services:

- Currently uses Amazon AWS
- Uses EC2 and has access to S3 buckets
- Uses a minimum of 2 EC2 instances
- Attempts to optimise EC2 usage
- Plans to extend to other cloud providers

Databricks – Functionality:

- Architecture based on Notebooks and folders
- Has a cluster manager for:
  - Defined (min 54 GB) clusters
  - Spot clusters
  - On Demand clusters
- Has a job manager and scheduler
- Has user management
- Has full Spark functionality
- Has strong data visualisation capability
- Can export reports and dashboards

Databricks – Languages:

- Can have Notebooks in:
  - Scala
  - Python
  - SQL
- SQL can be executed in non-SQL Notebooks
- Markdown comments can be placed in Notebooks
- Notebooks can be shared by multiple sessions
- Libraries can be imported and called in Notebooks

Databricks – Spark Usage:

- Latest Spark version available, e.g. DB 1.3.4 uses Spark 1.3.1 as of June 2015
- All Spark modules available: SQL, GraphX, MLlib, Streaming
- Strong integration between modules and visualisation
- Extensive use of tables to import data
- Tables available via SQL
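The last two points, importing data into tables and then querying those tables with SQL, can be illustrated with a small sketch. This is not Databricks or Spark code: it uses Python's built-in sqlite3 module as a local stand-in so that it runs anywhere, and the table name, column names and sample rows are all hypothetical. In Spark 1.3 the analogous pattern would typically be to register a DataFrame with registerTempTable and query it with sqlContext.sql.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a file imported into a table.
SAMPLE_CSV = """city,year,sales
London,2014,120
London,2015,150
Paris,2014,90
Paris,2015,110
"""


def load_sales_table(conn, csv_text):
    """Create the (hypothetical) 'sales' table from CSV text; return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["city"], int(r["year"]), int(r["sales"])) for r in reader]
    conn.execute("CREATE TABLE sales (city TEXT, year INTEGER, sales INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return len(rows)


def total_sales_by_city(conn):
    """Run a plain SQL query against the table from Python code."""
    cursor = conn.execute(
        "SELECT city, SUM(sales) FROM sales GROUP BY city ORDER BY city"
    )
    return cursor.fetchall()


# sqlite3 in-memory database: a local stand-in for a cluster-side table store.
conn = sqlite3.connect(":memory:")
loaded = load_sales_table(conn, SAMPLE_CSV)
totals = total_sales_by_city(conn)
print(loaded, totals)
```

The design point is the same one the slide makes: once data sits in a named table, any notebook language can reach it through SQL rather than through language-specific file parsing.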

Databricks – 3rd Party Apps:

Currently available, with more to come:
- Pentaho
- Qlik
- Tableau
- TIBCO Jaspersoft
- PanTera
- Zoomdata

Databricks – Architecture:

[Figure: Databricks architecture]

Available Books:

- See our Hadoop book from Apress / Springer, “Big Data Made Easy”
- Look out for our Apache Spark based book from Packt in 2015

Contact Us:

- Feel free to contact us at
- We offer IT project consultancy
- We are happy to hear about your problems
- You pay only for the hours you need to solve your problems
