An introduction to Pentaho


Presentation Description

A short introduction to Pentaho and how it can be used for business intelligence. How can it help us process big data on Hadoop?


Presentation Transcript


Pentaho
  What is it?
  Pentaho Server
  Pentaho Client
  Plugins
  Big Data / Hadoop
  Architecture
  Screen Shots

Pentaho – What is it ?:

It is a business intelligence system
It offers
  Analytics
  Visual data integration
  OLAP
  Reports
  Dashboards
  Data mining
  ETL
Written in Java

Pentaho – What is it ?:

Offered as
  Free community edition
  Purchased enterprise edition
Available for
  Windows
  Linux
  Mac OS X
Community supported
Open source plugins available
Uses the Apache Tomcat Java application server

Pentaho – Server:

Pentaho server components
  BI / BA Platform – the core platform, hosts content and offers services
  Analysis (Mondrian) – an OLAP server, queried via MDX or XML
    In-memory data aggregation
  Dashboard Designer (PDD) – supports monitoring / decision making
  Analysis (PAZ) – drag and drop OLAP viewer, creates MDX queries
  Reporting (PIR) – plugin for ad hoc reports
  Data Access – create data sources / define data models
  Mobile – extends Pentaho to the mobile world
    Dashboards and reports on a small screen
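To make "in-memory data aggregation" concrete, here is a minimal Python sketch of an OLAP-style roll-up over a tiny fact table. It does not use Mondrian or MDX; the fact rows, the dimension names and the roll_up helper are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, year, sales). Not Pentaho data.
FACTS = [
    ("EU", "widget", 2012, 100.0),
    ("EU", "gadget", 2012, 50.0),
    ("US", "widget", 2012, 70.0),
    ("US", "widget", 2013, 30.0),
]
DIMENSIONS = ("region", "product", "year")


def roll_up(facts, group_by):
    """Sum the sales measure, keeping only the dimensions named in group_by."""
    idx = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]  # row[3] is the sales measure
    return dict(totals)


by_region = roll_up(FACTS, ["region"])
by_region_year = roll_up(FACTS, ["region", "year"])
grand_total = roll_up(FACTS, [])

print(by_region)
print(by_region_year)
print(grand_total)
```

An OLAP server answers this kind of question for many dimension combinations, and a tool such as the Aggregate Designer exists so that the more expensive totals can be calculated ahead of time rather than on every query.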

Pentaho – Client:

Pentaho client components
  Data Integration (PDI) – Kettle ETL engine, single node / cloud
  Big Data – ETL for Hadoop / NoSQL databases
  Report Designer – codeless report creation
  Data Mining – uses Weka for pattern searching and trend prediction
  Metadata (PME) – create business models of data sources
  Aggregate Designer (PAD) – pre-calculation for aggregation
  Schema Workbench (PSW) – OLAP cube analysis
  Design Studio (PDS) – automation / business logic task support
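For readers new to ETL, the following is a rough Python sketch of the extract, transform and load stages that a tool like PDI lets you assemble visually. It uses only the Python standard library, not Kettle; the sample CSV, the sales table and the three function names are assumptions for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical input data; the second row has an amount that cannot be parsed.
RAW_CSV = """id,name,amount
1, alice ,10.5
2,bob,not-a-number
3,carol,4
"""


def extract(text):
    """Extract step: read CSV text into a stream of dict rows."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transform step: tidy names, parse amounts, drop rows that fail to parse."""
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows with an invalid amount
        yield (int(row["id"]), row["name"].strip().title(), amount)


def load(rows, conn):
    """Load step: write the cleaned rows into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
print(loaded)
```

Each function plays the role of one step, and chaining the calls plays the role of connecting steps on a canvas. The point of a drag and drop ETL tool is that this wiring is done without writing such code by hand.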

Pentaho – Plugins:

Pentaho plugins
  CTools – free / open source tools by Webdetails
  Charting (CCC) – create dashboard charts
  Build Framework (CBF) – release management for Pentaho apps
  Data Access (CDA) – common layer for data access
  Data Browser (CDB) – visual OLAP data browser
  Distributed Cache (CDC) – high performance, scalable shared cache
  Data Generator (CDG) – for dashboard creation
    Creates tables, data and Mondrian schema

Pentaho – Plugins:

Pentaho plugins
  Data Validator (CDV) – data integrity via validation tests
  Graphics Gen (CGG) – server side rendering of charts as images
  Dashboard Editor (CDE) – sophisticated community dashboard editor
  Dashboard Framework (CDF) – extendable, AJAX-based dashboard framework
  Startup Tabs (CST) – define start up tabs by user
  Saiku – analysis suite / open source
  Saiku Reporting – drag & drop report design tool
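As an illustration of what "data integrity via validation tests" can mean in practice, here is a small Python sketch that runs a set of named checks over some rows and reports which rows fail each one. This is not the CDV API; the sample rows, the check functions and run_validations are hypothetical.

```python
# Hypothetical order rows with deliberate problems in rows 1 and 2.
ROWS = [
    {"order_id": 1, "quantity": 3, "country": "DE"},
    {"order_id": 2, "quantity": -1, "country": "FR"},
    {"order_id": 2, "quantity": 5, "country": ""},
]


def no_negative_quantity(rows):
    """Return the indexes of rows whose quantity is below zero."""
    return [i for i, r in enumerate(rows) if r["quantity"] < 0]


def country_present(rows):
    """Return the indexes of rows with an empty country."""
    return [i for i, r in enumerate(rows) if not r["country"]]


def unique_order_id(rows):
    """Return the indexes of rows that repeat an earlier order_id."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        if r["order_id"] in seen:
            dupes.append(i)
        seen.add(r["order_id"])
    return dupes


def run_validations(rows, tests):
    """Run every test and map its name to the list of failing row indexes."""
    return {test.__name__: test(rows) for test in tests}


report = run_validations(
    ROWS, [no_negative_quantity, country_present, unique_order_id]
)
failed = sorted(name for name, bad_rows in report.items() if bad_rows)
print(report)
print(failed)
```

Keeping each rule as a separate named test means a failure report can say which rule broke and on which rows, which is the kind of information a dashboard owner needs before trusting the numbers.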

Pentaho – Big Data:

What are Hadoop's characteristics?
  Mainly command line driven
  Minimal built-in processing / filtering / validation options
  No drag & drop, ETL-like job creation
What can Pentaho bring to Hadoop?
  Visual drag & drop job creation
  Visual reports
  Ad hoc querying
  Dashboards
  And more …
It would be a good extension to Hadoop's core functionality

Pentaho – Architecture:

[Slide image: Pentaho architecture]

Pentaho – Screen Shots – Drag & Drop ETL:

[Slide image: screen shot of drag & drop ETL]

Pentaho – Screen Shots – Dashboards:

[Slide image: screen shot of dashboards]

Pentaho – Screen Shots – Reports:

[Slide image: screen shot of reports]

Contact Us:

Feel free to contact us at
We offer IT project consultancy
We are happy to hear about your problems
You can just pay for those hours that you need to solve your problems
