6 XQuery

Presentation Transcript

XQuery – A Language for Querying XML Documents: 

Werner Nutt

Requirements for an XML Query Language: 

David Maier, W3C XML Query Requirements:
Closedness: output must be XML
Composability: wherever a set of XML elements is required, a subquery is allowed as well
Support for key operations:
    selection
    extraction, projection
    restructuring
    combination, join
    fusion of elements

Requirements for an XML Query Language (cntd.): 

Can benefit from a schema, but should also be applicable without one
Retains the order of nodes
Formal semantics:
    structure of results should be derivable from the query
    defines equivalence of queries
Queries should be representable in XML, so that documents can have embedded queries

How Does One Design a Query Language?: 

In most query languages, there are two aspects to a query:
    Retrieving data (e.g., from … where … in SQL)
    Creating output (e.g., select … in SQL)
Retrieval consists of
    Pattern matching (e.g., from … )
    Filtering (e.g., where … )
… although these cannot always be clearly distinguished

XQuery Principles: 

Data model identical with the XPath data model
    documents are ordered, labeled trees
    nodes have identity
    nodes can have simple or complex types (defined in XML Schema)
XQuery can be used without schemas, but queries can be checked against DTDs and XML schemas
XQuery is a functional language
    no statements
    evaluation of expressions
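
To illustrate "no statements, only expressions", here is a minimal sketch that is not part of the original slides. The numbers are made up. It nests a conditional inside an iteration inside a function call, which works because each construct is itself an expression that yields a value.

```xquery
(: Every construct is an expression that yields a sequence of items, :)
(: so expressions can be nested freely inside one another.           :)
sum(
  for $n in (1, 2, 3)                       (: an iteration ...              :)
  return if ($n mod 2 = 1) then $n else 0   (: ... containing a conditional :)
)
(: evaluates to 4, i.e. 1 + 0 + 3 :)
```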

A Query over the Recipes Document: 

    <titles>
      {for $r in doc("recipes.xml")//recipe
       return $r/title}
    </titles>
returns
    <titles>
      <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
      <title>Ricotta Pie</title>
      …
    </titles>

Query Features: 

    <titles>
      {for $r in doc("recipes.xml")//recipe
       return $r/title}
    </titles>
doc(String) returns the input document
The return clause yields a sequence of results, one for each variable binding

Features: Summary: 

The result is a new XML document
A query consists of parts that are returned as is
... and others that are evaluated (everything in {...})
Calling the function doc(String) returns an input document
XPath is used to retrieve node sets and values
Iteration over node sets: for binds a variable to each node of a node set in turn
Variables can be used in XPath expressions
return returns a sequence of results, one for each binding of a variable
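
The distinction between parts returned as is and parts that are evaluated can be tried without any input document. The following is a minimal sketch; the element names report, literal, evaluated and square are invented for illustration.

```xquery
(: Text outside braces is copied as is; text inside braces is evaluated. :)
<report>
  <literal>1 + 2</literal>
  <evaluated>{1 + 2}</evaluated>
  {for $n in (1, 2, 3) return <square of="{$n}">{$n * $n}</square>}
</report>
```

The literal element keeps the text 1 + 2, the evaluated element contains 3, and the iteration contributes one square element per binding of $n, with contents 1, 4 and 9.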

XPath is a Fragment of XQuery: 

    doc("recipes.xml")//recipe[1]/title
returns an element:
    <title>Beef Parmesan with Garlic Angel Hair Pasta</title>

    doc("recipes.xml")//recipe[position()<=3]/title
returns a list of elements:
    <title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
    <title>Ricotta Pie</title>,
    <title>Linguine Pescadoro</title>

Beware: XPath Attributes: 

    doc("recipes.xml")//recipe[1]/ingredient[1]/@name
→ attribute name {"beef cube steak"}
  (a constructor for an attribute node)

    string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)
→ "beef cube steak"
  (a value of type string)

XPath Attributes (cntd.): 

    <first-ingredient>
      {string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}
    </first-ingredient>
→ <first-ingredient>beef cube steak</first-ingredient>
  (an element with string content)

XPath Attributes (cntd.): 

    <first-ingredient>
      {doc("recipes.xml")//recipe[1]/ingredient[1]/@name}
    </first-ingredient>
→ <first-ingredient name="beef cube steak"/>
  (an element with an attribute)
Note: The XML that we write down is only the surface structure of the data model underlying XQuery

XPath Attributes (cntd.): 

    <first-ingredient
        oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">
      Beef
    </first-ingredient>
→ <first-ingredient oldName="beef cube steak"> Beef </first-ingredient>

Iteration with the For-Clause: 

Syntax: for $var in xpath-expr
Example:
    for $r in doc("recipes.xml")//recipe
    return string($r)
The expression creates a list of bindings for the variable $var
If $var occurs in an expression exp, then exp is evaluated for each binding
For-clauses can be nested:
    for $r in doc("recipes.xml")//recipe
    for $v in doc("vegetables.xml")//vegetable
    return ...
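
Nested for-clauses evaluate the return expression once for every combination of bindings, with the outer variable changing most slowly. The sketch below assumes tiny inline stand-ins for recipes.xml and vegetables.xml, since the real files are not shown here. The names Soup, Salad, Leek and Kale are made up.

```xquery
(: Hypothetical inline data standing in for recipes.xml and vegetables.xml :)
let $recipes :=
  <recipes><recipe>Soup</recipe><recipe>Salad</recipe></recipes>
let $vegetables :=
  <vegetables><vegetable>Leek</vegetable><vegetable>Kale</vegetable></vegetables>
for $r in $recipes/recipe
for $v in $vegetables/vegetable
return concat(string($r), " + ", string($v))
(: yields "Soup + Leek", "Soup + Kale", "Salad + Leek", "Salad + Kale" :)
```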

Nested For-clauses: Example: 

    <my-recipes>
      {for $r in doc("recipes.xml")//recipe
       return
         <my-recipe title="{$r/title}">
           {for $i in $r//ingredient
            return
              <my-ingredient>
                {string($i/@name)}
              </my-ingredient>}
         </my-recipe>}
    </my-recipes>
Returns my-recipes with titles as attributes and my-ingredients with names as text content

The Let Clause: 

Syntax: let $var := xpath-expr
binds variable $var to a list of nodes, with the nodes in document order
does not iterate over the list
allows one to keep intermediate results for reuse (not possible in SQL)
Example:
    let $ooreps := doc("recipes.xml")//recipe
                     [.//ingredient/@name="olive oil"]
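
The difference between for and let shows up when counting what the variable is bound to. This is a minimal sketch with made-up inline data. The element names list, item, comparison, with-for, with-let and n are invented for illustration.

```xquery
(: Hypothetical inline data: a list with three items :)
let $data := <list><item>a</item><item>b</item><item>c</item></list>
return
  <comparison>
    <with-for>{for $i in $data/item return <n>{count($i)}</n>}</with-for>
    <with-let>{let $i := $data/item return <n>{count($i)}</n>}</with-let>
  </comparison>
```

With for, the return expression is evaluated three times and $i holds a single node each time, so with-for contains three n elements with content 1. With let, $i holds the whole list and the return expression is evaluated once, so with-let contains a single n element with content 3.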

Let Clause: Example: 

Calories of recipes with olive oil:
    <calory-content>
      {let $ooreps := doc("recipes.xml")//recipe
                        [.//ingredient/@name="olive oil"]
       for $r in $ooreps
       return
         <calories>
           {$r/title/text()}
           {": "}
           {string($r/nutrition/@calories)}
         </calories>}
    </calory-content>
Note the implicit string concatenation

Let Clause: Example (cntd.): 

The query returns:
    <calory-content>
      <calories>Beef Parmesan: 1167</calories>
      <calories>Linguine Pescadoro: 532</calories>
    </calory-content>

The Where Clause: 

Syntax: where <condition>
occurs before the return clause
similar to predicates in XPath
comparisons on nodes:
    "is" for node identity ("=" compares the values of nodes, not their identity)
    "<<" and ">>" for document order
Example:
    for $r in doc("recipes.xml")//recipe
    where $r//ingredient/@name="olive oil"
    return ...
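
The node comparisons can be tried on a small made-up fragment. The sketch below assumes XQuery 1.0 semantics, where "=" compares the atomized values of its operands, "is" tests node identity, and "<<" tests document order. The element names d and x are invented.

```xquery
(: Hypothetical inline data: two distinct elements with equal content :)
let $d := <d><x>1</x><x>1</x></d>
let $a := $d/x[1]
let $b := $d/x[2]
return (
  $a = $b,    (: true:  the values of the two nodes are equal  :)
  $a is $b,   (: false: they are two different nodes           :)
  $a is $a,   (: true:  a node is identical to itself          :)
  $a << $b    (: true:  $a precedes $b in document order       :)
)
```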

Quantifiers: 

Syntax: some/every $var in <node-set> satisfies <expr>
$var is bound in turn to each node in <node-set>
The test succeeds if <expr> is true for some/every binding
Note: if <node-set> is empty, then "some" is false and "every" is true
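
The behaviour on empty input can be checked directly. This is a minimal sketch over made-up number sequences rather than node sets; the quantifiers work the same way on any sequence.

```xquery
let $empty := ()
let $nums := (1, 2, 3)
return (
  (some $n in $empty satisfies $n > 0),    (: false: there is no binding at all  :)
  (every $n in $empty satisfies $n > 0),   (: true:  no binding violates the test :)
  (some $n in $nums satisfies $n > 2),     (: true:  3 satisfies the test         :)
  (every $n in $nums satisfies $n > 2)     (: false: 1 and 2 do not               :)
)
```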

Quantifiers (Example): 

Recipes that have some compound ingredient:
    for $r in doc("recipes.xml")//recipe
    where some $i in $r/ingredient satisfies $i/ingredient
    return $r/title
Recipes where every ingredient is non-compound:
    for $r in doc("recipes.xml")//recipe
    where every $i in $r/ingredient satisfies not($i/ingredient)
    return $r/title

Element Fusion: 

“To every recipe, add the attribute calories!”
    <result>
      {let $rs := doc("recipes.xml")//recipe
       for $r in $rs
       return
         <recipe>
           {$r/nutrition/@calories}
           {$r/title}
         </recipe>}
    </result>

Element Fusion (cntd.): 

The query result:
    <result>
      <recipe calories="1167">
        <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
      </recipe>
      <recipe calories="349"><title>Ricotta Pie</title></recipe>
      <recipe calories="532"><title>Linguine Pescadoro</title></recipe>
      <recipe calories="612"><title>Zuppa Inglese</title></recipe>
      <recipe calories="8892">
        <title>Cailles en Sarcophages</title>
      </recipe>
    </result>

Join: 

“Pair every ingredient with the recipes where it is used!”
    let $rs := doc("recipes.xml")//recipe
    for $i in $rs//ingredient
    for $r in $rs
    where $r//ingredient/@name=$i/@name
    return
      <pair>
        {$i/@name}
        {$r/title}
      </pair>

Join (cntd.): 

The query result:
    <pair name="beef cube steak">
      <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
    </pair>,
    <pair name="onion, sliced into thin rings">
      <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
    </pair>,
    <pair name="green bell pepper, sliced in rings">
      <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
    </pair>,

Restructuring: 

“For every ingredient, return all the recipes where it is used!”
    <result>
      {let $rs := doc("recipes.xml")//recipe
       for $i in $rs//ingredient
       return
         <ingredient>
           {$i/@*}
           {$rs[.//ingredient/@name=$i/@name]/title}
         </ingredient>}
    </result>

Restructuring (cntd.): 

The query result:
    <result>
      <ingredient amount="1" name="Alchermes liquor" unit="cup">
        <title>Zuppa Inglese</title>
      </ingredient>
      …
      <ingredient amount="2" name="olive oil" unit="tablespoon">
        <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
        <title>Linguine Pescadoro</title>
      </ingredient>
      …

Eliminating Duplicates : 

The function distinct-values(Node Set)
    extracts the values of a sequence of nodes
    creates a duplicate-free sequence of values
Note the coercion: nodes are cast as values!
Example:
    let $rs := doc("recipes.xml")//recipe
    return distinct-values($rs//ingredient/@name)
yields
    xdt:untypedAtomic("beef cube steak"),
    xdt:untypedAtomic("onion, sliced into thin rings"),
    ...
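
The effect of distinct-values can be seen by counting before and after. The sketch below uses a small made-up collection instead of recipes.xml. It compares counts rather than listing the values, because the order in which distinct-values returns them is left to the implementation.

```xquery
(: Hypothetical inline data: "salt" occurs in both recipes :)
let $rs :=
  <collection>
    <recipe><ingredient name="salt"/><ingredient name="olive oil"/></recipe>
    <recipe><ingredient name="salt"/><ingredient name="flour"/></recipe>
  </collection>
return (
  count($rs//ingredient/@name),                    (: 4 attribute nodes :)
  count(distinct-values($rs//ingredient/@name))    (: 3 distinct values :)
)
```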

Avoiding Multiple Results in a Join : 

We want every ingredient to be listed only once: eliminate duplicates using distinct-values!
    <result>
      {let $rs := doc("recipes.xml")//recipe
       for $in in distinct-values($rs//ingredient/@name)
       return
         <recipes with="{$in}">
           {$rs[.//ingredient/@name=$in]/title}
         </recipes>}
    </result>

Avoiding Multiple Results (cntd.): 

The query result:
    <result>
      <recipes with="beef cube steak">
        <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
      </recipes>
      <recipes with="onion, sliced into thin rings">
        <title>Beef Parmesan with Garlic Angel Hair Pasta</title>
      </recipes>
      ...
      <recipes with="salt">
        <title>Linguine Pescadoro</title>
        <title>Cailles en Sarcophages</title>
      </recipes>
      ...

The Order By Clause: 

Syntax: order by expr [ ascending | descending ]
    for $iname in doc("recipes.xml")//@name
    order by $iname descending
    return string($iname)
yields
    "whole peppercorns", "whole baby clams", "white sugar", ...

The Order By Clause (cntd.): 

The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is default)
    for $r in $rs
    order by number($r/nutrition/@calories)
    return $r/title
Note:
    The query returns titles ...
    but the ordering is according to calories, which do not appear in the output
    Not possible in SQL!
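
The difference between string and numeric ordering can be seen on a small made-up collection. The titles A, B, C and the element names as-strings and as-numbers are invented; the calorie values are taken from the recipe examples. The sketch assumes untyped data, that is, no schema is applied.

```xquery
(: Hypothetical inline data with untyped calories attributes :)
let $rs :=
  <collection>
    <recipe><title>A</title><nutrition calories="1167"/></recipe>
    <recipe><title>B</title><nutrition calories="349"/></recipe>
    <recipe><title>C</title><nutrition calories="8892"/></recipe>
  </collection>
return (
  <as-strings>{
    for $r in $rs/recipe
    order by $r/nutrition/@calories           (: compared as strings :)
    return string($r/title)
  }</as-strings>,
  <as-numbers>{
    for $r in $rs/recipe
    order by number($r/nutrition/@calories)   (: compared as numbers :)
    return string($r/title)
  }</as-numbers>
)
```

As strings, "1167" sorts before "349" and "349" before "8892", so the first element contains A B C. As numbers, 349 is smallest, so the second element contains B A C.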

FLWOR Expressions (pronounced “flower”): 

We have now seen the main ingredients of XQuery:
    For and Let clauses, which can be mixed
    a Where clause imposing conditions
    an Order by clause, which determines the order of results
    a Return clause, which constructs the output.
Combining these yields FLWOR expressions.

Conditionals: 

Syntax: if (expr) then expr else expr
Example:
    let $is := doc("recipes.xml")//ingredient
    for $i in $is[not(ingredient)]
    let $u := if (not($i/@unit))
              then attribute {"unit"} {"pieces"}
              else ()
creates an attribute unit="pieces" if none exists and an empty sequence otherwise

Conditionals (cntd.): 

We use the conditional to construct variants of ingredients:
    let $is := doc("recipes.xml")//ingredient
    for $i in $is[not(ingredient)]
    let $u := if (not($i/@unit))
              then attribute {"unit"} {"pieces"}
              else ()
    return
      <ingredient>
        {$i/@* | $u}
      </ingredient>

Conditionals (cntd.): 

The query result:
    <ingredient name="beef cube steak" amount="1.5" unit="pound"/>,
    ...
    <ingredient name="eggs" amount="12" unit="pieces"/>,
    …

Grouping and Aggregation: 

Aggregation functions: count, sum, avg, min, max
Example: The number of simple ingredients per recipe
    for $r in doc("recipes.xml")//recipe
    return
      <number>
        {attribute {"title"} {$r/title/text()}}
        {count($r//ingredient[not(ingredient)])}
      </number>

Grouping and Aggregation (cntd.): 

The query result:
    <number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,
    <number title="Ricotta Pie">12</number>,
    <number title="Linguine Pescadoro">15</number>,
    <number title="Zuppa Inglese">8</number>,
    <number title="Cailles en Sarcophages">30</number>
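
The five aggregation functions can be tried on a plain sequence of numbers. This is a minimal sketch; the sequence reuses four of the calorie values from the recipe examples as integer literals, so no input document is needed.

```xquery
let $cal := (1167, 349, 532, 612)
return (
  count($cal),   (: 4    :)
  sum($cal),     (: 2660 :)
  avg($cal),     (: 665  :)
  min($cal),     (: 349  :)
  max($cal)      (: 1167 :)
)
```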

Nested Aggregation: 

“The recipe with the maximal number of calories!”
    let $rs := doc("recipes.xml")//recipe
    let $maxCal := max($rs//@calories)
    for $r in $rs
    where $r//@calories = $maxCal
    return string($r/title)
returns
    "Cailles en Sarcophages"

Running Queries with Galax: 

Galax is an open-source implementation of XQuery (http://www.galaxquery.org/)
The main developers have taken part in the definition of XQuery
Galax is installed on the Linux machines of the department. To use it, you have to adjust your paths
If you run a C shell, add to .cshrc
    setenv GALAXHOME /usr/local/galax
    setenv PATH /usr/local/galax/bin:${PATH}
If you run a bash shell, add to your .profile
    export GALAXHOME=/usr/local/galax
    export PATH=$PATH:$GALAXHOME/bin

Running Queries with Galax (cntd.): 

Write your query in a file <filename>.xq
Call galax-run <filename>.xq in your shell
The answer will be returned in the shell … or an error message
More info on Galax can be found in the manual on the Galax website

Exercises: 

Write queries that produce
    A list, containing for every recipe the recipe's title element and an element with the number of calories
    The same, ordered according to calories
    The same, alphabetically ordered according to title
    The same, ordered according to the fat content
    The same, with title as attribute and calories as content
    A list, containing for every recipe the top level ingredients, dropping the lower level ingredients

Sample Solution: 

    <results>
      {for $r in doc("recipes.xml")//recipe
       return
         <recipe>
           {attribute {"title"} {$r/title}}
           {for $i in $r/ingredient
            return
              if (not($i/ingredient))
              then $i
              else <ingredient> {$i/@*} </ingredient>}
         </recipe>}
    </results>

More Exercises: 

The file pods98.xml contains a list of all the papers published at the database conference PODS’98. It follows the DTD:
    <!ELEMENT proceedings (name, contents)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT contents (article*)>
    <!ELEMENT article (author+, title, from, to)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT to (#PCDATA)>

More Exercises (cntd.): 

This is the beginning of the document:
    <proceedings><name>17. PODS 1998: Seattle, Washington</name>
    <contents>
      <article>
        <author>Ronald Fagin</author>
        <title>Fuzzy Queries in Multimedia Database Systems</title>
        <from> 1</from> <to>10</to>
      </article>
      <article>
        <author>Frank Neven</author>
        <author>Jan Van den Bussche</author>
        <title>Expressiveness of Structured Document Query Languages Based on Attribute Grammars</title>
        <from> 11</from> <to>17</to>
      </article>
      …

More Exercises (cntd.): 

Write queries in XQuery to produce:
    For each paper title, the number of authors
    The average number of authors per paper
    For each author, the list of papers to which they contributed
    A list of authors with author name, paper titles and the total number of pages their papers occupy in the proceedings, ordered according to the number of papers they have at the conference, and alphabetically if they have the same number of papers