logging in or signing up Workshop WebDB 2000b Arley33 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 21 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 15, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Partial Query-Evaluation in Internet Query Engines: Partial Query-Evaluation in Internet Query Engines Jayavel Shanmugasundaram Kristin Tufte David DeWitt David Maier Jeffrey Naughton University of Wisconsin & Oregon Graduate InstituteOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionQuerying the WWW: The Present: Querying the WWW: The Present Who won the Nobel prize for Physics in 1999? Querying the WWW: The Present: Querying the WWW: The Present Want 1998 Red BMW No accidents 20% < avg. model price HTML File HTML File HTML File HTML File HTML File HTML File HTML FileQuerying the WWW: The Future?: Querying the WWW: The Future? Want 1998 Red BMWNo accidents 20% < avg. model priceInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red Used BMW Cars (carId, model, price, otherinfo)The Problem: The Problem Return results to users as soon as possible “Results so far” for queries with blocking operators Arbitrary blocking operators Not exists, Average, Nest … Blocking operators occurring anywhere in the query Potentially intermixed with non-blocking operatorsOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionWhat is a Partial Result of a Query?: What is a Partial Result of a Query? Let Full Result of Query Q on Inputs A and B be: Q(A, B) Then Partial Result of Query Q on Inputs A and B is: Q(PA, PB) PA A PB BMaximal Output Property: Maximal Output Property Produce “correct” results as soon as possible Why? If query is non-blocking Produces results soon If query is blocking Return “non-blocking parts” soon (e.g., outer join)Inside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Anytime Property: Anytime Property Blocking operators should be able to return the “result so far” at any time Why? User can request partial results at any timeInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Non-Monotonic Input/Output Property: Non-Monotonic Input/Output Property Operators should handle “changes”, not just additions to input Similarly, operators should produce “changes”, not just additions to output Both blocking and non-blocking operators Why? Partial results may represent “wrong” answers Need to be corrected laterInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Flexible Input Property: Flexible Input Property Should be able to process data from any input at any time Processes data as it becomes available Why? If query is non-blocking: Can return results soon If query is blocking Faster partial result response timeA Note on Partial Result Accuracy: A Note on Partial Result Accuracy Focus is on producing partial results Architecture is general enough to exploit existing techniques Online aggregation [Hellerstein et. al.] Nested aggregates [Tan et. al.] Accuracy for general blocking operators?Outline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionWhere do we start?: Where do we start? Use known flexible input, maximal output operator implementations Non-blocking: select, symmetric hash join, Xjoin Blocking: group-by, symmetric outer join Blocking operator implementations should satisfy anytime property All operator implementations should satisfy non-monotonic input/output propertyNon-Monotonic Input/Output: Non-Monotonic Input/Output Re-evaluation Approach: On partial result request, compute results “so far” Then forget all “potentially incorrect” inputs Differential Approach: On partial result request, compute results “so far” “Update” incorrect inputs for future result computationInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Re-evaluation Join: Re-evaluation Join (1, Z3, 10000) (Z3, 15000) (19, Z3, 20000) (5, 400i, 30000) (400i, 25000) (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) (3, 400i, 20000) Re-evaluation Join: Re-evaluation Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (3, 400i, 20000) (8, 400i, 20000) (8, 400i, 20000) (Z3, 15000) (Z3, 15000) (400i, 23333) (400i, 23333) Differential Join: Differential Join (1, Z3, 10000) (Z3, 15000) (19, Z3, 20000) (5, 400i, 30000) (400i, 25000) (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) (3, 400i, 20000) Differential Join: Differential Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) update (400i, 23333) Differential Join: Differential Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 23333) (3, 400i, 20000) (8, 400i, 20000) (8, 400i, 20000)Re-evaluation vs. Differential: Re-evaluation vs. Differential Re-evaluation Approach: Simple – just “forget” partial inputs Easier to extend (no changes to tuple structure) Unnecessary computation Differential Approach: Need to handle deletions/updates of inputs Changes to tuple structure Re-computes only what is necessaryOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionResponse Time: Response TimeOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionConclusion: Conclusion New properties for query engine operators Operator implementation alternatives Re-evaluation Differential Evaluation Partial results improve response time Re-evaluation approach is simpler Differential approach is more efficientFuture Work: Future Work General GUI Partial result accuracy for general blocking operators Changes at finer granularities Consistent partial resultsRelated Work: Related Work Online aggregation [Hellerstein et. al.] Nested aggregates [Tan et. al.] Online reordering [Raman et. al.] Symmetric hash join [Wilschut et. al.] Adaptive operators [Ives et. al.] XJoin [Urhan et. al.] You do not have the permission to view this presentation. In order to view it, please contact the author of the presentation.
Workshop WebDB 2000b Arley33 Download Post to : URL : Related Presentations : Share Add to Flag Embed Email Send to Blogs and Networks Add to Channel Uploaded from authorPOINTLite Insert YouTube videos in PowerPont slides with aS Desktop Copy embed code: (To copy code, click on the text box) Embed: URL: Thumbnail: WordPress Embed Customize Embed The presentation is successfully added In Your Favorites. Views: 21 Category: Entertainment License: All Rights Reserved Like it (0) Dislike it (0) Added: October 15, 2007 This Presentation is Public Favorites: 0 Presentation Description No description available. Comments Posting comment... Premium member Presentation Transcript Partial Query-Evaluation in Internet Query Engines: Partial Query-Evaluation in Internet Query Engines Jayavel Shanmugasundaram Kristin Tufte David DeWitt David Maier Jeffrey Naughton University of Wisconsin & Oregon Graduate InstituteOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionQuerying the WWW: The Present: Querying the WWW: The Present Who won the Nobel prize for Physics in 1999? Querying the WWW: The Present: Querying the WWW: The Present Want 1998 Red BMW No accidents 20% < avg. model price HTML File HTML File HTML File HTML File HTML File HTML File HTML FileQuerying the WWW: The Future?: Querying the WWW: The Future? Want 1998 Red BMWNo accidents 20% < avg. model priceInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red Used BMW Cars (carId, model, price, otherinfo)The Problem: The Problem Return results to users as soon as possible “Results so far” for queries with blocking operators Arbitrary blocking operators Not exists, Average, Nest … Blocking operators occurring anywhere in the query Potentially intermixed with non-blocking operatorsOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionWhat is a Partial Result of a Query?: What is a Partial Result of a Query? Let Full Result of Query Q on Inputs A and B be: Q(A, B) Then Partial Result of Query Q on Inputs A and B is: Q(PA, PB) PA A PB BMaximal Output Property: Maximal Output Property Produce “correct” results as soon as possible Why? If query is non-blocking Produces results soon If query is blocking Return “non-blocking parts” soon (e.g., outer join)Inside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Anytime Property: Anytime Property Blocking operators should be able to return the “result so far” at any time Why? User can request partial results at any timeInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Non-Monotonic Input/Output Property: Non-Monotonic Input/Output Property Operators should handle “changes”, not just additions to input Similarly, operators should produce “changes”, not just additions to output Both blocking and non-blocking operators Why? Partial results may represent “wrong” answers Need to be corrected laterInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Flexible Input Property: Flexible Input Property Should be able to process data from any input at any time Processes data as it becomes available Why? If query is non-blocking: Can return results soon If query is blocking Faster partial result response timeA Note on Partial Result Accuracy: A Note on Partial Result Accuracy Focus is on producing partial results Architecture is general enough to exploit existing techniques Online aggregation [Hellerstein et. al.] Nested aggregates [Tan et. al.] Accuracy for general blocking operators?Outline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionWhere do we start?: Where do we start? Use known flexible input, maximal output operator implementations Non-blocking: select, symmetric hash join, Xjoin Blocking: group-by, symmetric outer join Blocking operator implementations should satisfy anytime property All operator implementations should satisfy non-monotonic input/output propertyNon-Monotonic Input/Output: Non-Monotonic Input/Output Re-evaluation Approach: On partial result request, compute results “so far” Then forget all “potentially incorrect” inputs Differential Approach: On partial result request, compute results “so far” “Update” incorrect inputs for future result computationInside the Internet Query Engine: Inside the Internet Query Engine (carId, model, price, otherinfo) Red 1998 BMW Cars Accident Reports (carId) (carId, model, price, otherinfo)Re-evaluation Join: Re-evaluation Join (1, Z3, 10000) (Z3, 15000) (19, Z3, 20000) (5, 400i, 30000) (400i, 25000) (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) (3, 400i, 20000) Re-evaluation Join: Re-evaluation Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (3, 400i, 20000) (8, 400i, 20000) (8, 400i, 20000) (Z3, 15000) (Z3, 15000) (400i, 23333) (400i, 23333) Differential Join: Differential Join (1, Z3, 10000) (Z3, 15000) (19, Z3, 20000) (5, 400i, 30000) (400i, 25000) (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) (3, 400i, 20000) Differential Join: Differential Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 25000) (3, 400i, 20000) update (400i, 23333) Differential Join: Differential Join (1, Z3, 10000) (19, Z3, 20000) (5, 400i, 30000) (Z3, 15000) (400i, 23333) (3, 400i, 20000) (8, 400i, 20000) (8, 400i, 20000)Re-evaluation vs. Differential: Re-evaluation vs. Differential Re-evaluation Approach: Simple – just “forget” partial inputs Easier to extend (no changes to tuple structure) Unnecessary computation Differential Approach: Need to handle deletions/updates of inputs Changes to tuple structure Re-computes only what is necessaryOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionResponse Time: Response TimeOutline: Outline Motivation Desired Operator Properties Implementation Alternatives Performance Evaluation ConclusionConclusion: Conclusion New properties for query engine operators Operator implementation alternatives Re-evaluation Differential Evaluation Partial results improve response time Re-evaluation approach is simpler Differential approach is more efficientFuture Work: Future Work General GUI Partial result accuracy for general blocking operators Changes at finer granularities Consistent partial resultsRelated Work: Related Work Online aggregation [Hellerstein et. al.] Nested aggregates [Tan et. al.] Online reordering [Raman et. al.] Symmetric hash join [Wilschut et. al.] Adaptive operators [Ives et. al.] XJoin [Urhan et. al.]