6 Xquery

Uploaded from authorPOINT Lite
Download as
 PPT
Presentation Description 

No description available

By:
 (3 month(s) ago)  
nice presentation, can I get download it. Thx

Happy Thanksgiving
What's up on authorSTREAM?
Views: 161
Like it  ( Likes) Dislike it  ( Dislikes)
Added: December 05, 2007 This Presentation is Public 
Presentation Category : Entertainment All Rights Reserved
Presentation Transcript

XQuery – A Language for Querying XML Documents: XQuery – A Language for Querying XML Documents Werner Nutt


Requirements for an XML Query Language: Requirements for an XML Query Language David Maier, W3C XML Query Requirements: Closedness: output must be XML Composability: wherever a set of XML elements is required, a subquery is allowed as well Support for key operations: selection extraction, projection restructuring combination, join fusion of elements


Requirements for an XML Query Language (cntd.): Requirements for an XML Query Language (cntd.) Can benefit from a schema, but should also be applicable without Retains the order of nodes Formal semantics: structure of results should be derivable from query defines equivalence of queries Queries should be representable in XML documents can have embedded queries


How Does One Design a Query Language?: How Does One Design a Query Language? In most query languages, there are two aspects to a query: Retrieving data (e.g., from … where … in SQL) Creating output (e.g., select … in SQL) Retrieval consists of Pattern matching (e.g., from … ) Filtering (e.g., where … ) … although these cannot always be clearly distinguished


XQuery Principles: XQuery Principles Data Model identical with the XPath data model documents are ordered, labeled trees nodes have identity nodes can have simple or complex types (defined in XML Schema) Xquery can be used without schemas, but can be checked against DTDs and XML schemas XQuery is a functional language no statements evaluation of expressions


A Query over the Recipes Document: {for $r in doc("recipes.xml")//recipe return $r/title} returns Beef Parmesan with Garlic Angel Hair Pasta Ricotta Pie … A Query over the Recipes Document


Query Features: {for $r in doc("recipes.xml")//recipe return $r/title} Query Features doc(String) returns input document Sequence of results, one for each variable binding


Features: Summary: Features: Summary The result is a new XML document A query consists of parts that are returned as is ... and others that are evaluated (everything in {...} ) Calling the function doc(String) returns an input document XPath is used to retrieve nodes sets and values Iteration over node sets: let binds a variable to all nodes in a node set Variables can be used in XPath expressions return returns a sequence of results, one for each binding of a variable


XPath is a Fragement of XQuery: XPath is a Fragement of XQuery doc("recipes.xml")//recipe[1]/title returns Beef Parmesan with Garlic Angel Hair Pasta doc("recipes.xml")//recipe[position()Beef Parmesan with Garlic Angel Hair Pasta, Ricotta Pie, Linguine Pescadoro an element a list of elements


Beware: XPath Attributes: Beware: XPath Attributes doc("recipes.xml")//recipe[1]/ingredient[1] /@name → attribute name {"beef cube steak"} string(doc("recipes.xml")//recipe[1] /ingredient[1]/@name) → "beef cube steak" a constructor for an attribute node a value of type string


XPath Attributes (cntd.): XPath Attributes (cntd.) {string(doc("recipes.xml")//recipe[1] /ingredient[1]/@name)} → beef cube steak an element with string content


XPath Attributes (cntd.): XPath Attributes (cntd.) {doc("recipes.xml")//recipe[1] /ingredient[1]/@name} → an element with an attribute Note: The XML that we write down is only the surface structure of the data model that is underlying XQuery


XPath Attributes (cntd.): XPath Attributes (cntd.) Beef → Beef


Iteration with the For-Clause: Iteration with the For-Clause Syntax: for $var in xpath-expr Example: for $r in doc("recipes.xml")//recipe return string($r) The expression creates a list of bindings for a variable $var If $var occurs in an expression exp, then exp is evaluated for each binding For-clauses can be nested: for $r in doc("recipes.xml")//recipe for $v in doc("vegetables.xml")//vegetable return ...


Nested For-clauses: Example: Nested For-clauses: Example {for $r in doc("recipes.xml")//recipe return {for $i in $r//ingredient return {string($i/@name)} } } Returns my-recipes with titles as attributes and my-ingredients with names as text content


The Let Clause: The Let Clause Syntax: let $var := xpath-expr binds variable $var to a list of nodes, with the nodes in document order does not iterate over the list allows one to keep intermediate results for reuse (not possible in SQL) Example: let $ooreps := doc("recipes.xml")//recipe [.//ingredient/@name="olive oil"]


Let Clause: Example: Let Clause: Example {let $ooreps := doc("recipes.xml")//recipe [.//ingredient/@name="olive oil"] for $r in $ooreps return {$r/title/text()} {": "} {string($r/nutrition/@calories)} } Calories of recipes with olive oil Note the implicit string concatenation


Let Clause: Example (cntd.): Let Clause: Example (cntd.) The query returns: Beef Parmesan: 1167 Linguine Pescadoro: 532


The Where Clause: The Where Clause Syntax: where occurs before return clause similar to predicates in XPath comparisons on nodes: "=" for node equality ">" for document order Example: for $r in doc("recipes.xml")//recipe where $r//ingredient/@name="olive oil" return ...


Quantifiers: Quantifiers Syntax: some/every $var in satisfies $var is bound to all nodes in Test succeeds if is true for some/every binding Note: if is empty, then “some” is false and “all” is true


Quantifiers (Example): Quantifiers (Example) Recipes that have some compound ingredient Recipes where every ingredient is non-compound for $r in doc("recipes.xml")//recipe where some $i in $r/ingredient satisfies $i/ingredient Return $r/title for $r in doc("recipes.xml")//recipe where every $i in $r/ingredient satisfies not($i/ingredient) Return $r/title


Element Fusion: Element Fusion “To every recipe, add the attribute calories!” {let $rs := doc("recipes.xml")//recipe for $r in $rs return {$r/nutrition/@calories} {$r/title} }


Element Fusion (cntd.): Element Fusion (cntd.) The query result: Beef Parmesan with Garlic Angel Hair Pasta Ricotta Pie Linguine Pescadoro Zuppa Inglese Cailles en Sarcophages


Join: Join “Pair every ingredient with the recipes where it is used!” let $rs := doc("recipes.xml")//recipe for $i in $rs//ingredient for $r in $rs where $r//ingredient/@name=$i/@name return {$i/@name} {$r/title}


Join (cntd.): Join (cntd.) The query result: Beef Parmesan with Garlic Angel Hair Pasta , Beef Parmesan with Garlic Angel Hair Pasta , Beef Parmesan with Garlic Angel Hair Pasta ,


Restructuring: “For every ingredient, return all the recipes where it is used!” {let $rs := doc("recipes.xml")//recipe for $i in $rs//ingredient return {$i/@*} {$rs[.//ingredient/@name=$i/@name]/title} } Restructuring


Restructuring (cntd.): Restructuring (cntd.) The query result: Zuppa Inglese … Beef Parmesan with Garlic Angel Hair Pasta Linguine Pescadoro …


Eliminating Duplicates : Eliminating Duplicates The function distinct-values(Node Set) extracts the values of a sequence of nodes creates a duplicate free sequence of values Note the coercion: nodes are cast as values! Example: let $rs := doc("recipes.xml")//recipe return distinct-values($rs//ingredient/@name) yields xdt:untypedAtomic("beef cube steak"), xdt:untypedAtomic("onion, sliced into thin rings"), ...


Avoiding Multiple Results in a Join : Avoiding Multiple Results in a Join We want that every ingredient is listed only once: Eliminate duplicates using distinct-values! {let $rs := doc("recipes.xml")//recipe for $in in distinct-values( $rs//ingredient/@name) return {$rs[.//ingredient/@name=$in]/title} }


Avoiding Multiple Results (cntd.): Avoiding Multiple Results (cntd.) The query result: Beef Parmesan with Garlic Angel Hair Pasta Beef Parmesan with Garlic Angel Hair Pasta ... Linguine Pescadoro Cailles en Sarcophages ...


The Order By Clause: The Order By Clause Syntax: order by expr [ ascending | descending ] for $iname in doc("recipes.xml")//@name order by $iname descending return string($iname) yields "whole peppercorns", "whole baby clams", "white sugar", ...


The Order By Clause (cntd.): The Order By Clause (cntd.) The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is default) for $r in $rs order by number($r/nutrition/@calories) return $r/title Note: The query returns titles ... but the ordering is according to calories, which do not appear in the output Not possible in SQL!


FLWOR Expresssions (pronounced “flower”): FLWOR Expresssions (pronounced “flower”) We have now seen the main ingredients of XQuery: For and Let clauses, which can be mixed a Where clause imposing conditions an Order by clause, which determines the order of results a Return clause, which constructs the output. Combination these yields FLWOR expressions.


Conditionals: Conditionals if (expr) then expr else expr Example let $is := doc("recipes.xml")//ingredient for $i in $is[not(ingredient)] let $u := if (not($i/@unit)) then attribute {"unit"} {"pieces"} else () creates an attribute unit="pieces" if none exists and an empty nodelist otherwise


Conditionals (cntd.): We use the conditional to construct variants of ingredients: let $is := doc("recipes.xml")//ingredient for $i in $is[not(ingredient)] let $u := if (not($i/@unit)) then attribute {"unit"} {"pieces"} else () return {$i/@* | $u} Conditionals (cntd.)


Conditionals (cntd.): Conditionals (cntd.) The query result: , ... , …


Grouping and Aggregation: Grouping and Aggregation Aggregation functions count, sum, avg, min, max Example: The number of simple ingredients per recipe for $r in doc("recipes.xml")//recipe return {attribute {"title"} {$r/title/text()}} {count($r//ingredient[not(ingredient)])}


Grouping and Aggregation (cntd.): Grouping and Aggregation (cntd.) The query result: 11, 12, 15, 8, 30


Nested Aggregation: Nested Aggregation “The recipe with the maximal number of calories!” let $rs := doc("recipes.xml")//recipe let $maxCal := max($rs//@calories) for $r in $rs where $r//@calories = $maxCal return string($r/title) returns "Cailles en Sarcophages"


Running Queries with Galax: Running Queries with Galax Galax is an open-source implementation of XQuery (http://www.galaxquery.org/) The main developers have taken part in the definition of XQuery Galax has is installed on the Linux machines of the department. To use it, you have to adjust your paths If you run a c-shell, add to .cshrc setenv GALAXHOME /usr/local/galax setenv PATH /usr/local/galax/bin:${PATH} If you run a bash shell, add to your .profile export GALAXHOME /usr/local/galax export PATH=$PATH:$GALAXHOME/bin


Running Queries with Galax (cntd.): Running Queries with Galax (cntd.) Write your query in a file .xq Call galax-run .xq in your shell The answer will be returned in the shell … or an error message More info on Galax can be found in the manual on the Galax website


Exercises: Exercises Write queries that produce A list, containing for every recipe the recipe's title element and an element with the number of calories The same, ordered according to calories The same, alphabetically ordered according to title The same, ordered according to the fat content The same, with title as attribute and calories as content. A list, containing for every recipe the top level ingredients, dropping the lower level ingredients


Sample Solution: Sample Solution {for $r in doc("recipes.xml")//recipe return {attribute {"title"} {$r/title} {for $i in $r/ingredient return if (not($i/ingredient)) then $i else {$i/@*} } }


More Exercises: More Exercises The file pods98.xml contains a list of all the papers published at the database conference PODS’98. It follows the DTD:


More Exercises (cntd.): More Exercises (cntd.) This is the beginning of the document: 17. PODS 1998: Seattle, Washington Ronald Fagin Fuzzy Queries in Multimedia Database Systems 1 10 Frank Neven Jan Van den Bussche Expressiveness of Structured Document Query Languages Based on Attribute Grammars 11 17 …


More Exercises (cntd.): More Exercises (cntd.) Write queries in XQuery to produce: For each paper title the number of authors The average number of authors per paper For each author, the list of papers to which they contributed A list of authors ordered according to the number of papers they have at the conference and alphabetically if the have the same number of papers with author name, paper titles and the total number of pages their papers occupy in the proceedings