Virtuoso Relational To RDF Mapping
rumito
Added: May 19, 2008

Presentation Transcript

Mapping Relational Databases to RDF with OpenLink Virtuoso
© 2008 OpenLink Software, All rights reserved.
Orri Erling - Lead Developer, Virtuoso Team

Who Wants to Map?
- Semantic Web Scalers
  - Expose whatever there is as RDF, the next guy will unify terms, make search and apps
- Data Warehouse Keepers
  - Data is spread out, has implicit semantics, complex schemas, heterogeneous sources, ambiguous terms but we must make it join and aggregate cleanly

Present State
- SPARQL to SQL exists but still, complex integrations are data warehouses
- We'd really like to map, but...
- Can it be otherwise?

Why RDF Data Warehouse?
Pros
- Even query performance across all data
- Possibility of forward-chaining inference
- Some SPARQL features may be better supported, e.g. unspecified predicates
Cons
- Keeping data up-to-date
- Complex set up, needs dedicated servers: you don't build them on a whim

Why Map?
- No copying, no timeliness issues
- RDBMS outperforms RDF for analytics workloads
- Agile reconfiguration without reloading data

Virtuoso
- Mapping of SPARQL to SQL against any existing schema - whether stored in Virtuoso or elsewhere
- Physical quad store
- Federated/local RDBMS

For Mapping to Deliver...
- Tackle any SQL analytics workload in SPARQL without extra cost
- Deal with arbitrary SQL schema
- Produce single SQL statements, optimizable by target RDBMS
- Have intelligence for cases where one RDF entity can come from many relational sources

The Cases of Integration
- Bring similar but heterogeneous schemas into a unified ontology - Union View
- Translate FKs of one schema to PKs in another - Distributed Join
- Hide differences in normalization - Views for hiding joins - Unit/Terminology conversions

Defining a Mapping
- Define URI formats and their subclass relations
- Define which key-column-value combinations make a triple
- Arbitrary SQL is allowed for mapping values and filtering
- A single RDF node can be a composite of many columns, e.g. multipart key
- Use SPARQL/SQL to:

The TPC-H Case
- The 22 queries as extended SPARQL
- Each generates a single SQL statement, executable by Virtuoso, Oracle, Others
- Next make several TPC-H databases on different servers and run the queries against the union
- http://demo.openlinksw.com/tpc-h/

Where Problems Begin
- In OpenLink Data Spaces, 6 Collaborative apps all mapped to SIOC:
- Trivially becomes a union of everything, 1000+ lines of SQL
- Intelligently (once per app) becomes a Union of :

    select * from <ods> where {?s ?p ?o . ?s has_comment ?c . ?c has_author <xxx> }

    select post.* from post, comment, user where c_post = p_id and c_author = u_id and u_name = f ('xxx')

What One Must Know
- Mapping for integration is not trivial
- Be careful when mapping multiple tables/columns to one class/property
- Make URI schemes which encode type and source, so that senseless joins are not attempted if types not specified in query
- Understand what the mapping logic can and cannot optimize
- Understand what SQL can and cannot optimize
- View resulting SQL for sanity check

SQL Extensions
- Mapping must work against any RDBMS/Schema, as is
- But there is Virtuoso SQL between the mapping and target RDBMS(s)
- Location- and latency-conscious distributed cost model
- Breakup for making a wide result set into a row per property
- Inverse functions

Use Cases
- OpenLink Data Spaces - Blog, Wiki, News, Social Network, Feed Aggregation, Tag Clouds, Bookmarks etc.
- OpenLink's own MIS - “total information awareness”: URI for any CRM Object, Account, Product, Support Case, Email etc.
- Musicbrainz
- phpBB, Drupal, MediaWiki, WordPress, Bugzilla, and others.

OpenLink Software
Thank You!
http://virtuoso.openlinksw.com
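
The "What One Must Know" slide recommends URI schemes that encode type and source, and the "SQL Extensions" slide lists inverse functions. The following is a minimal Python sketch of that idea, not Virtuoso's mapping syntax: the IRI layout `http://example.org/<source>/<type>/<key>`, the table names in `APP_TABLES`, and the function names are all hypothetical. It shows how decoding a constant IRI lets a mapper keep only the one union branch that can match, instead of one branch per application.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical IRI layout: http://example.org/<source>/<type>/<key>
IRI_PATTERN = re.compile(
    r"^http://example\.org/(?P<source>[a-z]+)/(?P<type>[a-z]+)/(?P<key>\d+)$"
)

# Hypothetical per-application user tables (one union branch each).
APP_TABLES = {"blog": "blog_user", "wiki": "wiki_user", "forum": "forum_user"}


class Decoded(NamedTuple):
    source: str
    type: str
    key: int


def make_iri(source: str, type_: str, key: int) -> str:
    """Forward function: relational key -> IRI that encodes source and type."""
    return f"http://example.org/{source}/{type_}/{key}"


def parse_iri(iri: str) -> Optional[Decoded]:
    """Inverse function: IRI -> (source, type, key), or None if it does not fit."""
    m = IRI_PATTERN.match(iri)
    if m is None:
        return None
    return Decoded(m["source"], m["type"], int(m["key"]))


def user_branches(iri: str) -> List[Tuple[str, Tuple[int, ...]]]:
    """Return the SQL branches that could possibly yield the given user IRI."""
    decoded = parse_iri(iri)
    if decoded is None or decoded.type != "user":
        return []  # the IRI cannot denote a user: no branch needs to run
    table = APP_TABLES.get(decoded.source)
    if table is None:
        return []  # unknown source: nothing to query
    # Exactly one branch survives, filtered on the decoded key.
    return [(f"SELECT u_name FROM {table} WHERE u_id = ?", (decoded.key,))]


if __name__ == "__main__":
    print(user_branches(make_iri("wiki", "user", 42)))
    print(user_branches(make_iri("wiki", "post", 7)))
```

Because the key is recovered from the IRI itself, the filter can be pushed down as a plain comparison on `u_id`, which is presumably what makes the generated SQL optimizable by the target RDBMS.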
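
The "Defining a Mapping" slide says that key-column-value combinations make a triple, and the "SQL Extensions" slide mentions a breakup that turns a wide result set into a row per property. The sketch below illustrates that transformation on the client side in Python with an in-memory SQLite table; in Virtuoso this would happen inside its SQL layer, which is not shown here. The `post` table layout, the predicate names, and the IRI format are assumptions made only for this example.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical wide table: one row per post, one column per property.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (p_id INTEGER PRIMARY KEY, p_title TEXT, p_author TEXT)"
)
conn.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(1, "Hello", "alice"), (2, "Mapping", None)],
)

# Hypothetical mapping: which column yields which predicate.
COLUMN_TO_PREDICATE = {"p_title": "title", "p_author": "author"}


def subject_iri(p_id: int) -> str:
    """Build the subject IRI from the primary key (assumed format)."""
    return f"http://example.org/blog/post/{p_id}"


def breakup(connection: sqlite3.Connection) -> List[Tuple[str, str, str]]:
    """Turn each wide row into one (subject, predicate, object) row per property."""
    cur = connection.execute(
        "SELECT p_id, p_title, p_author FROM post ORDER BY p_id"
    )
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = subject_iri(record["p_id"])
        for column, predicate in COLUMN_TO_PREDICATE.items():
            value = record[column]
            if value is not None:  # a NULL column produces no triple
                triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    for triple in breakup(conn):
        print(triple)
```

Each key-column-value combination becomes one triple, and NULL values are skipped, so the second post contributes only its title.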