JovianDATA Tech Presentation AWS event

Presentation Transcript

Slide 1: 

JovianDATA, 2460 North First Street, Suite 170, San Jose, CA 95131, 408-433-9383, www.joviandata.com
Analytics at the Speed of Thought
Satya Ramachandran, Vice President of Engineering
Anupam Singh, Chief Technology Officer
April 14, 2010

JovianDATA Mission : 

A technology platform to optimize your conversion funnel at the lowest cost.

Why move to the cloud? : 

Considering AWS actively, but not sure about CapEx benefits, the current stack's cloud readiness, or application provisioning challenges.

Introducing JovianDATA : 


Transforming Data to Actionable Insights : 


Agenda : 

JovianDATA Company Overview
JovianInsights – The Power of Analytics
Analytics Lifecycle Management
Innovations in Cloud Infrastructure Management
JovianDATA Cube Storage
Innovations in Advanced Analytics using commodity clusters

Avoiding Expensive Data Processing : 

Reduce disk I/O by materializing expensive groups: usage-based automatic view materialization.
Avoid network I/O: multi-dimensional partitioning.
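The slide does not say how usage-based view materialization is implemented. As a minimal sketch of the general idea only, the Python below counts how often each group-by column set appears in a query log and pre-aggregates the frequently used ones. The function names, the `min_uses` threshold, the query log, and the sample rows are all hypothetical and are not taken from JovianDATA's product.

```python
from collections import Counter


def groups_to_materialize(query_log, min_uses=2):
    """Pick group-by column sets that the query log uses at least min_uses times."""
    usage = Counter(frozenset(q["group_by"]) for q in query_log)
    return {cols for cols, uses in usage.items() if uses >= min_uses}


def materialize(rows, group_cols, measure):
    """Pre-aggregate one measure by the given columns so later queries skip the raw scan."""
    cols = sorted(group_cols)  # fixed column order for the view's keys
    view = {}
    for row in rows:
        key = tuple(row[c] for c in cols)
        view[key] = view.get(key, 0) + row[measure]
    return view


# Hypothetical query log and fact rows, for illustration only.
query_log = [
    {"group_by": ["site", "section"]},
    {"group_by": ["section", "site"]},
    {"group_by": ["campaign"]},
]
rows = [
    {"site": "a", "section": "news", "campaign": "c1", "clicks": 3},
    {"site": "a", "section": "news", "campaign": "c2", "clicks": 4},
    {"site": "b", "section": "sports", "campaign": "c1", "clicks": 5},
]

hot_groups = groups_to_materialize(query_log)
views = {cols: materialize(rows, cols, "clicks") for cols in hot_groups}
```

Here only the (section, site) grouping is used twice, so it is the only one materialized; the single-use campaign grouping is left to be computed on demand.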

Why move to the cloud? : 


Agenda : 

Reducing CapEx
Application Isolation
Dynamic Provisioning

Managing CapEx with Role Based Clusters : 

Single cluster for data cleansing, load, and query
15 TB on 100 nodes
Monthly cost = $28,800

Managing CapEx with Role Based Clusters : 

Load: 2 hours daily on 10 nodes
Query: 8 hours daily on 5 nodes
Monthly cost = $2,052
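The arithmetic behind the two cluster layouts can be sketched as follows. This is a rough check, not the presenters' cost model: it assumes a 30-day month and infers a per-node-hour rate from the single-cluster figure ($28,800 for 100 nodes running around the clock). Under those assumptions the role-based compute hours alone do not add up to the quoted $2,052, so that figure presumably includes costs the slide does not itemize.

```python
DAYS_PER_MONTH = 30  # assumption: the slides do not state the billing period


def monthly_node_hours(nodes, hours_per_day, days=DAYS_PER_MONTH):
    """Node-hours consumed in a month by a cluster that runs hours_per_day each day."""
    return nodes * hours_per_day * days


# Single always-on cluster: 100 nodes, 24 hours a day, quoted at $28,800 per month.
always_on_hours = monthly_node_hours(100, 24)
implied_rate = 28_800 / always_on_hours  # dollars per node-hour, inferred rather than quoted

# Role-based clusters: 10 nodes for a 2-hour daily load, 5 nodes for 8 hours of daily query.
role_based_hours = monthly_node_hours(10, 2) + monthly_node_hours(5, 8)
role_based_compute = role_based_hours * implied_rate

print(f"implied rate: ${implied_rate:.2f} per node-hour")
print(f"role-based node-hours: {role_based_hours}")
print(f"role-based compute at that rate: ${role_based_compute:,.2f}")
```

The point of the comparison is the drop in node-hours, from 72,000 to 1,800 per month, when each role runs only for as long as it is needed.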

Agenda : 

Reducing CapEx
Application Isolation
Dynamic Provisioning

Selective Replication for on demand perf : 

A power analyst needs to run a complex, heavy number-crunching query that typically takes 8-10 hours.
Solution: FlexRestore™
Adds two new temporary nodes (Temp1, Temp2).
Creates new replicas for hot partitions and redistributes them across nodes.
With replication factor = 1: Site Section Analytics = 10 minutes
With replication factor = 10: Site Section Analytics = 30 seconds
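The slide gives no implementation detail for FlexRestore. As an illustration of the general idea of selective replication, the sketch below copies a hot partition onto newly added temporary nodes and reports the resulting replication factor. The function names and the starting placement are hypothetical; only the node and partition labels are borrowed from the slides.

```python
def add_hot_replicas(placement, hot_partitions, temp_nodes):
    """Return a new placement in which every temp node holds a copy of each hot partition."""
    new_placement = {node: set(parts) for node, parts in placement.items()}
    for node in temp_nodes:
        new_placement.setdefault(node, set()).update(hot_partitions)
    return new_placement


def replication_factor(placement, partition):
    """Number of nodes that currently hold a copy of the partition."""
    return sum(partition in parts for parts in placement.values())


# Hypothetical starting layout: each partition lives on exactly one node.
placement = {
    "Node1": {"P1"},
    "Node2": {"P3", "P12"},
    "Node3": {"P22"},
    "Node4": {"P34"},
}

scaled = add_hot_replicas(placement, ["P1"], ["Temp1", "Temp2"])
print(replication_factor(placement, "P1"), "->", replication_factor(scaled, "P1"))
```

With more copies of a hot partition, a scan over it can be split across more nodes, which is the effect the replication-factor timings on the slide describe.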

Reduce replication to maintain cost : 

When the analysis is done and the extra performance is not needed, the SLA Controller brings down the two temporary nodes (and the extra replicas).
Benefits:
High-performance computing power when you need it
But only when you need it, to hold down operating costs
Diagram: partitions P1, P3, P12, P22, and P34 placed on Node1, Node2, Node3, and Node4 (Nodeset1) and on the temporary nodes Temp1 and Temp2.
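The scale-down step can be sketched in the same spirit. This is not the SLA Controller's actual logic, which the slides do not describe; it is a hypothetical helper that drops the temporary nodes and refuses to do so if any partition would be left without a copy. The placement shown is invented for illustration.

```python
def remove_temp_nodes(placement, temp_nodes):
    """Drop the temp nodes, refusing if that would remove the last copy of a partition."""
    remaining = {node: set(parts) for node, parts in placement.items()
                 if node not in temp_nodes}
    before = set().union(*placement.values())
    after = set().union(*remaining.values())
    lost = before - after
    if lost:
        raise ValueError(f"partitions would lose their last replica: {sorted(lost)}")
    return remaining


# Hypothetical layout while the extra replicas of P1 are in place.
scaled = {
    "Node1": {"P1"},
    "Node2": {"P3", "P12"},
    "Node3": {"P22"},
    "Node4": {"P34"},
    "Temp1": {"P1"},
    "Temp2": {"P1"},
}

steady_state = remove_temp_nodes(scaled, {"Temp1", "Temp2"})
print(sorted(steady_state))
```

The safety check matters because temporary nodes should only ever hold extra replicas; if one held the sole copy of a partition, tearing it down would lose data.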

Agenda : 

Reducing CapEx
Application Isolation
Dynamic Provisioning

Provision Tera Scale Applications in Minutes : 

Funnel analysis for a client: a campaign manager needs to run heavy-duty reports for a big advertiser.
Without application isolation, data for all advertisers is kept 'live' on 50 nodes.
50 live nodes per month = $14,400

Provision Tera Scale Applications in Minutes : 

The application is provisioned in parallel from S3/EBS into EC2.
50 nodes for fortnightly analysis = $320
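The two provisioning figures can be related with a short calculation. As a sketch under stated assumptions (a 30-day month, and a node-hour rate inferred from the always-live figure of $14,400 for 50 nodes), it estimates how many hours per node the $320 on-demand figure would correspond to. The slides do not state the rate or the run length, so both results are inferences.

```python
DAYS_PER_MONTH = 30  # assumption: not stated on the slides


def implied_rate(monthly_cost, nodes, days=DAYS_PER_MONTH):
    """Dollars per node-hour implied by a cluster that stays live all month."""
    return monthly_cost / (nodes * 24 * days)


def implied_hours_per_node(cost, nodes, rate):
    """Hours each node could run for the given cost at the given node-hour rate."""
    return cost / (nodes * rate)


rate = implied_rate(14_400, 50)                        # inferred from the always-live case
hours_per_node = implied_hours_per_node(320, 50, rate)
savings_factor = 14_400 / 320

print(f"implied rate: ${rate:.2f} per node-hour")
print(f"on-demand usage: about {hours_per_node:.0f} hours per node per month")
print(f"cost ratio, always-live to on-demand: {savings_factor:.0f}x")
```

Under these assumptions the on-demand figure corresponds to roughly two 8-hour runs a month, which is consistent with a fortnightly analysis.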

Summary : 


Thank You : 
