mondrian, Arundel0. Added: June 20, 2007. Category: Education. License: All Rights Reserved.

Presentation Transcript

MONDRIAN: Annotating and querying databases through colors and blocks
Henrico Dolfing
Seminar Digital Information Curation

Outline:
- Introduction
- Colors and Blocks
- Color Algebra
- Mondrian System
- Discussion
- References

Introduction:
- Geerts, F., Kementsietsidis, A., Milano, D., 'MONDRIAN: Annotating and querying databases through colors and blocks', accepted for ICDE 2006
- Annotation-oriented data model for manipulating and querying both data and annotations
- MONDRIAN, a prototype implementation of the annotation mechanism

Motivation:
- Scientific databases
  - Huge amounts of data
  - Different formats (flat text, images, XML, ...)
- Challenges
  - Integrate, annotate and cross-reference such diverse collections of data
  - Maintain data provenance
- Pressing needs of biological databases

Use Case (1/2):
- GDB, a human genome database
- SwissProt, a protein database

Use Case (2/2):
- PIR, a protein sequence database
- SwissProt & PIR
- UniProt

Colors and Blocks (1/2):
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
A25218  120234  P08138
Block annotations: John, Mary, Peter, Mary, John, "John, Mary"

Colors and Blocks (2/2):
- Block = annotated group of attribute values
- Color = each annotation is represented by a color
- Block overlapping
- Inheritance
- Transitivity
- Color queries = queries on annotated databases, written in a 'Color Algebra'

Color Algebra (1/2):
- Projection
- Selection
- Cartesian product
- Block selection
- Block projections
- Merge
- Recoloring
- Renaming
- Union

Color Algebra (2/2):
- Definition: The color algebra consists of all expressions obtained by composing a finite number of the operators.
- Theorem: The set of operators in the color algebra is minimal.

Projection:

L-Type Block Projection:

U-Type Block Projection:
PID     GID
I78825  120231
A45770  120232
A01399  120233
A25218  120234

Combined Block Projection:
PID     GID
I78825  120231
A45770  120232
A01399  120233
A25218  120234

Query example:
Consider the original relation in our use case. Assume we want to find all the tuples that have a block annotated by Mary, or that concern the protein with sid P08138. Assume we are only interested in keeping the {gid, sid} attributes from these tuples.
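The query example above can be sketched in Python as a composition of block selection, selection, union and projection. This is a minimal illustration, not the MONDRIAN implementation: the function names (`block_select`, `select`, `union`, `project`) and the dict-based tuple representation are made up for this sketch. The attribute values come from the slides, but the placement of the blocks is hypothetical, since the slide images are not part of the transcript. The sketch also assumes that block selection keeps all blocks of a surviving tuple and that projection shrinks each block to the kept attributes.

```python
# Attribute values are taken from the slides. The block placement is
# HYPOTHETICAL: blocks were placed only so that rows 1-3 carry a block
# colored "Mary". A block is a pair (set of attribute names, color).
ROWS = [
    {"values": {"pid": "I78825", "gid": "120231", "sid": "P21359"},
     "blocks": [(frozenset({"pid"}), "John"),
                (frozenset({"gid", "sid"}), "Mary")]},
    {"values": {"pid": "A45770", "gid": "120232", "sid": "P35240"},
     "blocks": [(frozenset({"pid", "gid"}), "Peter"),
                (frozenset({"sid"}), "Mary")]},
    {"values": {"pid": "A01399", "gid": "120233", "sid": "P01138"},
     "blocks": [(frozenset({"gid", "sid"}), "John"),
                (frozenset({"gid", "sid"}), "Mary")]},  # overlapping blocks
    {"values": {"pid": "A25218", "gid": "120234", "sid": "P08138"},
     "blocks": [(frozenset({"pid"}), "John")]},
]


def block_select(rows, color):
    """Keep tuples that have at least one block annotated with `color`."""
    return [r for r in rows if any(c == color for _, c in r["blocks"])]


def select(rows, attr, value):
    """Ordinary selection on an attribute value; blocks are carried along."""
    return [r for r in rows if r["values"][attr] == value]


def union(left, right):
    """Union of tuples; a tuple present on both sides keeps the blocks of both."""
    merged = {}
    for r in left + right:
        key = tuple(sorted(r["values"].items()))
        entry = merged.setdefault(key, {"values": dict(r["values"]), "blocks": []})
        for block in r["blocks"]:
            if block not in entry["blocks"]:
                entry["blocks"].append(block)
    return list(merged.values())


def project(rows, attrs):
    """Keep only `attrs`; shrink blocks to the kept attributes, drop empty ones."""
    attrs = frozenset(attrs)
    out = []
    for r in rows:
        values = {k: v for k, v in r["values"].items() if k in attrs}
        blocks = [(a & attrs, c) for a, c in r["blocks"] if a & attrs]
        out.append({"values": values, "blocks": blocks})
    return out


# Tuples with a block annotated by Mary, or with sid P08138, keeping {gid, sid}.
result = project(
    union(block_select(ROWS, "Mary"), select(ROWS, "sid", "P08138")),
    {"gid", "sid"},
)

for row in result:
    print(row["values"], [(sorted(a), c) for a, c in row["blocks"]])
```

Keeping the blocks in a list next to the values, rather than inside them, mirrors the idea stated later in the slides that data is separated from the annotation representation.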
Block Selection:
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
A25218  120234  P08138
Block annotations: John, Mary, Peter, Mary, John, "John, Mary"

Block Selection:
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
Block annotations: Mary, Mary, "John, Mary"

Selection:
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
A25218  120234  P08138

Selection:
PID     GID     SID
A25218  120234  P08138

Union:
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
A25218  120234  P08138
Block annotations: Mary, Mary, "John, Mary"

Projection:
PID     GID     SID
I78825  120231  P21359
A45770  120232  P35240
A01399  120233  P01138
A25218  120234  P08138
Block annotations: Mary, Mary, "John, Mary"

Projection:
GID     SID
120231  P21359
120232  P35240
120233  P01138
120234  P08138
Block annotations: Mary, Mary, "John, Mary"

Cartesian Product:
PID     GID     GID’    SID’
I78825  120231  120231  P21359
A45770  120232  120232  P35240
A25218  120234  120234  P08138

Merge: Projecting out GID’
PID     GID     SID’
I78825  120231  P21359
A45770  120232  P35240
A25218  120234  P08138

Merge: Projecting out GID
PID     GID’    SID’
I78825  120231  P21359
A45770  120232  P35240
A25218  120234  P08138

Merge:
PID     GID’    SID’
I78825  120231  P21359
A45770  120232  P35240
A25218  120234  P08138

Mondrian System:
- Piet Mondria(a)n: Dutch painter whose paintings mainly consist of color blocks
- Victory Boogie Woogie (€ 40.000.000)

Desirable properties:
- No restructuring of the existing database schema
  - Only extra tables need to be added
- Minimum overhead in terms of
  - Space
  - Query execution time
- Annotations should be treated as first-class citizens of the database, i.e. it should be possible to query them

Current state of Mondrian System:
Diagram labels: text-based CA query, graphical CA query, equivalent CRA query, equivalent SQL query, MySQL relational DBMS, result

Relational Representation:
- Assume assoc(pid,bpid), assoc(gid,bgid) and assoc(sid,bsid)
- Data is separated from annotation representation

Current state:
Diagram labels: text-based CA query, graphical CA query, equivalent CRA query, equivalent SQL query, MySQL relational DBMS, result

Experimental Results:

Discussion:

Literature:
- [Geerts et al., 2005] Geerts, F., Kementsietsidis, A., and Milano, D., 'MONDRIAN: Annotating and querying databases through colors and blocks', accepted for ICDE 2006, 2005
- [Buneman et al., 2005] Buneman, P., Bose, R., Ecklund, D., 'Annotation in Scientific Data: a Scoping Report', 2005
- [Gray et al., 2002] Gray, J., Szalay, A.S., Thakar, A.R., Stoughton, C., van den Berg, J., 'Online Scientific Data Curation, Publication, and Archiving', Technical Report MSR-TR-2002-74, Microsoft Research, 2002