An introduction to Apache Gora

Presentation Description

A short introduction to Apache Gora: what is it and how does it work? How can it provide data store abstraction and persistence for big data?

Presentation Transcript

Apache Gora:

- What is it?
- Gora – Nutch
- Supports
- Data Access
- APIs

www.semtech-solutions.co.nz info@semtech-solutions.co.nz

Apache Gora – What is it?:

- Provides for Big Data
  - In-memory data model
  - Persistence
  - Data store abstraction
- Supports persisting to
  - Column stores
  - Key/value stores
  - Document stores
  - RDBMSs
- Supports use of Hadoop

Apache Gora – What is it?:

- Released via Apache 2 license
- Written in Java
- Offers a persistence framework
- Designed for big data applications
- Used by Nutch 2.x for web crawl data storage
- Used for
  - Persistence
  - Indexing
  - Analytics

Apache Gora – Nutch:

- Nutch 2.x now uses Gora
- Abstracted storage
- Data store independence
- Handles object to persistent mappings
- Can use various NoSQL solutions

Apache Gora – Supports:

Gora supports the following:

- Apache Accumulo
- Apache Cassandra
- Apache HBase
- Amazon DynamoDB
- Pig
- Hive
- Cascading
- MapReduce

Apache Gora – Data Access:

- Java API for data access
- Independent of location
- Core Gora APIs
  - Store
  - Persistency
  - Query
  - MapReduce

Apache Gora – Store API:

- Java API – org.apache.gora.store.*
- DataStore handles object persistence
- DataStore methods process objects
  - Persist
  - Fetch
  - Query
  - Delete
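The four operations listed on this slide (persist, fetch, query, delete) can be illustrated with a small stand-alone sketch. This is not Gora's DataStore class: SimpleStore, InMemoryStore and Page are hypothetical names invented for the example, and the backend is just a sorted map in memory. It only shows the idea of application code talking to a store interface while the backend stays hidden.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class StoreSketch {

    /** Hypothetical store abstraction; callers never see the backend. */
    interface SimpleStore<K, T> {
        void put(K key, T value);            // persist
        T get(K key);                        // fetch, null when the key is absent
        List<T> query(Predicate<T> filter);  // query by predicate
        boolean delete(K key);               // delete, true when something was removed
    }

    /** Stand-in backend that keeps everything in a sorted in-memory map. */
    static class InMemoryStore<K extends Comparable<K>, T> implements SimpleStore<K, T> {
        private final Map<K, T> rows = new TreeMap<>();

        @Override public void put(K key, T value) { rows.put(key, value); }

        @Override public T get(K key) { return rows.get(key); }

        @Override public List<T> query(Predicate<T> filter) {
            List<T> hits = new ArrayList<>();
            for (T value : rows.values()) {
                if (filter.test(value)) {
                    hits.add(value);
                }
            }
            return hits;
        }

        @Override public boolean delete(K key) { return rows.remove(key) != null; }
    }

    /** Hypothetical record type, loosely modelled on a crawled web page. */
    static class Page {
        final String url;
        final int status;

        Page(String url, int status) {
            this.url = url;
            this.status = status;
        }
    }

    public static void main(String[] args) {
        // Application code depends only on the interface, not on the backend.
        SimpleStore<String, Page> store = new InMemoryStore<>();
        store.put("a.example/", new Page("a.example/", 200));
        store.put("b.example/", new Page("b.example/", 404));

        System.out.println(store.get("a.example/").status);              // fetch
        System.out.println(store.query(p -> p.status == 404).size());    // query
        System.out.println(store.delete("b.example/"));                  // delete
        System.out.println(store.get("b.example/"));                     // absent after delete
    }
}
```

Swapping InMemoryStore for another implementation of the same interface would leave main unchanged, which is the data store independence the earlier slides describe.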

Apache Gora – Persistency API:

- Java API – org.apache.gora.persistency.*
- Core classes
  - BeanFactory
    - Construct keys
  - Persistent
    - Persist objects
  - State
    - State managed through StateManager
    - NEW, CLEAN (UNMODIFIED), DIRTY (MODIFIED), DELETED
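The slide names four object states but not the transitions between them. The sketch below shows one plausible reading, as an assumption rather than Gora's actual StateManager behaviour: a record starts NEW, becomes CLEAN once saved, turns DIRTY when a field changes after a save, and ends as DELETED. TrackedRecord and its methods are hypothetical names for illustration only.

```java
public class StateSketch {

    /** The four state names come from the slide; the transitions are assumed. */
    enum State { NEW, CLEAN, DIRTY, DELETED }

    /** Hypothetical record that tracks its own persistence state. */
    static class TrackedRecord {
        private String title;
        private State state = State.NEW;

        State state() { return state; }

        String title() { return title; }

        /** Changing a field on a saved record marks it as modified. */
        void setTitle(String newTitle) {
            if (state == State.DELETED) {
                throw new IllegalStateException("record was deleted");
            }
            title = newTitle;
            if (state == State.CLEAN) {
                state = State.DIRTY;
            }
        }

        /** Called by the (imaginary) store after a successful write. */
        void markSaved() {
            if (state != State.DELETED) {
                state = State.CLEAN;
            }
        }

        void markDeleted() { state = State.DELETED; }

        /** Only new or modified records need to be written again. */
        boolean needsWrite() {
            return state == State.NEW || state == State.DIRTY;
        }
    }

    public static void main(String[] args) {
        TrackedRecord record = new TrackedRecord();
        System.out.println(record.state());   // freshly created
        record.setTitle("first");
        record.markSaved();
        System.out.println(record.state());   // after a save
        record.setTitle("second");
        System.out.println(record.state());   // after a change
        record.markDeleted();
        System.out.println(record.state());   // after deletion
    }
}
```

Tracking state this way lets a store skip writes for unmodified objects, which matters when a job touches many records but changes only a few.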

Apache Gora – Query API:

- Java API – org.apache.gora.query.*
- Core classes
  - Query
    - Constructed via DataStore
  - PartitionQuery
    - Divide results of Query into partitions
    - Run queries on data nodes
    - Generate Hadoop InputSplits
  - Result
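The idea of dividing one query into partitions can be sketched without Hadoop. In the example below a key-range query over a sorted in-memory map is split into smaller range queries that each cover at most a fixed number of keys and can be executed independently. RangeQuery, execute and partition are hypothetical names, and splitting by key count is an assumption made for this sketch; it is not how Gora's PartitionQuery chooses its boundaries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PartitionSketch {

    /** Hypothetical key-range query: startKey inclusive, endKey exclusive. */
    static class RangeQuery {
        final String startKey;
        final String endKey;

        RangeQuery(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Runs a range query against the sorted data and returns the matching values. */
    static List<String> execute(NavigableMap<String, String> data, RangeQuery q) {
        return new ArrayList<>(data.subMap(q.startKey, true, q.endKey, false).values());
    }

    /** Splits a query into consecutive sub-ranges of at most maxKeys keys each. */
    static List<RangeQuery> partition(NavigableMap<String, String> data,
                                      RangeQuery q, int maxKeys) {
        if (maxKeys < 1) {
            throw new IllegalArgumentException("maxKeys must be at least 1");
        }
        List<RangeQuery> parts = new ArrayList<>();
        String partStart = q.startKey;
        int count = 0;
        for (String key : data.subMap(q.startKey, true, q.endKey, false).keySet()) {
            if (count == maxKeys) {
                // The current partition is full: close it just before this key.
                parts.add(new RangeQuery(partStart, key));
                partStart = key;
                count = 0;
            }
            count++;
        }
        if (count > 0) {
            parts.add(new RangeQuery(partStart, q.endKey));
        }
        return parts;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> data = new TreeMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            data.put(key, "value-" + key);
        }
        RangeQuery all = new RangeQuery("a", "f");
        for (RangeQuery part : partition(data, all, 2)) {
            // Each partition could be handed to a different worker.
            System.out.println("[" + part.startKey + ", " + part.endKey + ") -> "
                    + execute(data, part));
        }
    }
}
```

Because the sub-ranges are disjoint and together cover the original range, running them separately and concatenating the results returns the same records as the unpartitioned query.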

Apache Gora – MapReduce API:

- Java API – org.apache.gora.mapreduce.*
- GoraMapper
- GoraReducer
- ALL
- Record
  - Counter
  - Reader
  - Writer
- Hadoop / Avro
  - Serialise
  - De-serialise
  - Persistent

Contact Us:

- Feel free to contact us at
  - www.semtech-solutions.co.nz
  - info@semtech-solutions.co.nz
- We offer IT project consultancy
- We are happy to hear about your problems
- You can just pay for those hours that you need to solve your problems
