
seml

Added: September 29, 2007 This presentation is Public
Presentation Transcript

Semantics for Valid XML Documents : Harold Boley, Dagstuhl Seminar 01021, Semantics in Databases, Jan. 7-12, 2001


Overview :
- Introduction of valid XML documents to establish a ‘grammar-typed’ syntax for Web data
- Survey of some practical (Web) aspects of three semantics:
  - Transformational semantics (incl. proof-theoretic ~)
  - Model-theoretic semantics
  - Metadata semantics
- Study of (Web) applicability and combination of the three semantics
- Systems using or implementing such semantics
- Running example: address-document processing


Address Example: HTML to XML :
HTML Markup: Xaver M. Linde, Wikingerufer 7, 10555 Berlin
XML Markup: Xaver M. Linde, Wikingerufer 7, 10555 Berlin
While not conveying any formal semantics, XML tags are chosen for content-structuring needs.
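A minimal sketch of the content-structuring idea, using Python's standard xml.etree.ElementTree. The tag names address, name, street, and town are an assumption taken from the DTD slide later in the talk; the helper build_address is hypothetical.

```python
import xml.etree.ElementTree as ET

def build_address(name, street, town):
    """Build an <address> element whose children carry the content structure.

    The tag names are assumed from the DTD slide, not from this slide.
    """
    address = ET.Element("address")
    for tag, text in (("name", name), ("street", street), ("town", town)):
        child = ET.SubElement(address, tag)
        child.text = text
    return address

address = build_address("Xaver M. Linde", "Wikingerufer 7", "10555 Berlin")
xml_text = ET.tostring(address, encoding="unicode")
print(xml_text)
```

The tags name the parts of the address for processing purposes, but nothing in the markup itself says what a street or a town means.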


Address Example: XML to XML :
XML stylesheets are usable to transform XML elements, e.g., for data interoperation:
XML Markup 1: Xaver M. Linde, Wikingerufer 7, 10555 Berlin
XML Markup 2: Xaver M. Linde, Wikingerufer 7, 10555 Berlin


Address Example: XML to XML :
XML stylesheets are usable to transform XML elements, e.g., for a kind of normalization:
XML Markup 1: Xaver M. Linde, Wikingerufer 7, 10555 Berlin
XML Markup 2: Xaver M. Linde, Wikingerufer 7, 10555 Berlin


Address Example: XML Queries :
XML queries can select subelements of XML elements.
XML Markup (element): Xaver M. Linde, Wikingerufer 7, 10555 Berlin
XML Query (XML-QL): WHERE Xaver M. Linde $s $t CONSTRUCT $s $t
Result (subelements): Wikingerufer 7, 10555 Berlin
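The selection performed by the query can be sketched in Python with the standard xml.etree.ElementTree. This is not XML-QL; it only mimics the WHERE/CONSTRUCT effect for one address. The tag names and the function select_street_and_town are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Address element with assumed tag names (taken from the DTD slide).
ADDRESS_XML = """
<address>
  <name>Xaver M. Linde</name>
  <street>Wikingerufer 7</street>
  <town>10555 Berlin</town>
</address>
"""

def select_street_and_town(xml_text, wanted_name):
    """Return (street, town) if the address name matches, else None.

    Plays the role of the WHERE pattern binding $s and $t.
    """
    address = ET.fromstring(xml_text)
    if address.findtext("name") != wanted_name:
        return None
    return address.findtext("street"), address.findtext("town")

result = select_street_and_town(ADDRESS_XML, "Xaver M. Linde")
print(result)
```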


Address Example: Prolog Queries :
Prolog queries can select substructures of Prolog structures.
Prolog Term (structure):
address( name("Xaver M. Linde"), street("Wikingerufer 7"), town("10555 Berlin") )
Prolog Query:
address( name("Xaver M. Linde"), street(S), town(T) )
Result (substructures):
S = "Wikingerufer 7"
T = "10555 Berlin"
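How such a query binds S and T can be sketched as one-way pattern matching over nested tuples. This is a simplification of Prolog unification (the stored term is assumed to be ground), and the names Var and match are hypothetical.

```python
class Var:
    """A named pattern variable, standing in for a Prolog variable such as S or T."""
    def __init__(self, name):
        self.name = name

def match(pattern, term, bindings=None):
    """Match a pattern against a ground term; return the bindings or None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, Var):
        # A variable matches anything, but must stay consistent if seen twice.
        if pattern.name in bindings and bindings[pattern.name] != term:
            return None
        bindings[pattern.name] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    # Constants (functor names and strings) must be equal.
    return bindings if pattern == term else None

term = ("address", ("name", "Xaver M. Linde"),
        ("street", "Wikingerufer 7"), ("town", "10555 Berlin"))
query = ("address", ("name", "Xaver M. Linde"),
         ("street", Var("S")), ("town", Var("T")))
answer = match(query, term)
print(answer)
```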


Address Example: The Element Tree :
Node-Labeled, (Left-to-Right-)Ordered Element Tree (diagram labels: tree, subtrees)


Address Example: Document Type Definition and Tree (1) :
Document Type Definition (DTD), in Extended Backus-Naur Form (EBNF):
address ::= name street town
name ::= PCDATA
street ::= PCDATA
town ::= PCDATA
Document Type Tree: address with children name, street, town, each with a PCDATA leaf
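The EBNF rules can be read as a content model and checked mechanically. The following is a rough sketch, not a real DTD validator: it only compares the ordered child tags of each element against the rules above and ignores attributes, whitespace, and mixed content. CONTENT_MODEL and is_valid are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Each element maps to the ordered child tags it must contain;
# an empty list stands for PCDATA-only content.
CONTENT_MODEL = {
    "address": ["name", "street", "town"],
    "name": [],
    "street": [],
    "town": [],
}

def is_valid(element):
    """Check an element tree against CONTENT_MODEL, top-down."""
    expected = CONTENT_MODEL.get(element.tag)
    if expected is None:
        return False  # tag not declared in the model
    children = list(element)
    if [child.tag for child in children] != expected:
        return False  # wrong children or wrong order
    return all(is_valid(child) for child in children)

valid_doc = ET.fromstring(
    "<address><name>Xaver M. Linde</name>"
    "<street>Wikingerufer 7</street><town>10555 Berlin</town></address>"
)
swapped_doc = ET.fromstring(
    "<address><street>Wikingerufer 7</street>"
    "<name>Xaver M. Linde</name><town>10555 Berlin</town></address>"
)
print(is_valid(valid_doc), is_valid(swapped_doc))
```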


Address Example: Document Type Definition and Tree (2) :
Document Type Definition (DTD) and Document Type Tree: address with children name and place; place with children street and town; name, street, and town each with a PCDATA leaf


Well-Formedness and Validity :
XML principles for a document being well-formed:
- Open and close all tags
- Empty tags end with />
- There is a unique root element
- Elements may not overlap
- Attribute values are quoted
- < and & are only used to start tags and entities
- Only the five predefined entity references are used
XML principle for a document being valid with respect to a DTD:
- Matches the type-like constraints listed in the DTD (or, can be generated from DTD as linearized CF grammar-derivation tree)
Checked by validators such as http://www.stg.brown.edu/service/xmlvalid/
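Several of the well-formedness principles can be observed with any non-validating XML parser. A small sketch using Python's standard xml.etree.ElementTree, which reports violations as ParseError; it checks well-formedness only, not validity against a DTD. The sample strings and the helper is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

samples = {
    "balanced": "<address><name>Xaver M. Linde</name></address>",
    "empty tag": "<address><place/></address>",
    "unclosed tag": "<address><name>Xaver M. Linde</address>",
    "overlapping elements": "<a><b></a></b>",
    "two roots": "<name>A</name><name>B</name>",
    "unquoted attribute": "<town zip=10555>Berlin</town>",
    "bare ampersand": "<name>Me & You</name>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```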


Practical Semantics Need: Web(-Page) Transformations, Models, and Metadata :
Up to now: XML with Document Type Definitions (DTDs) or XML Schemas as the syntactic basis
Practical need for Web semantics:
1) Getting meaning from XML Web pages through translation results
2) Modeling formal XML elements by constructing their extensions (finite or infinite sets)
3) Annotating arbitrary Web objects in RDF/XML for semantic retrieval


Practical Semantics Techniques: Web Transformations, Models, and Metadata :
Corresponding semantic techniques:
1) Transformational semantics translates XML into other XML or HTML documents via XSLT stylesheets (e.g. using the Cocoon engine)
2) Model-theoretic semantics explicates rule consequences by generating Herbrand models for XML knowledge bases of relations and functions
3) Metadata semantics in XML-based RDF (Resource Description Framework) and RDF Schema enables high-precision search engines for Berners-Lee’s "Semantic Web"


Address Document: Transformational Semantics via an XSLT Stylesheet :
XML addresses, before and after applying the XSLT template:
Me2XML, 96 Hyper Road, Boston
RDF4All, 2001 Broadway, New York
XML4You, 96 Hyper Road, Boston
Corresponding fact base:
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses
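The tree transformation from flat to nested addresses can be sketched in Python. This stands in for the XSLT template and is not XSLT itself; the wrapper element addresses, the flat tag layout, and the function nest_address are assumptions, with the nested shape (name plus place containing street and town) read off the fact base.

```python
import xml.etree.ElementTree as ET

# Flat input; the <addresses> wrapper is a hypothetical root element.
FLAT_XML = """
<addresses>
  <address><name>Me2XML</name><street>96 Hyper Road</street><town>Boston</town></address>
  <address><name>RDF4All</name><street>2001 Broadway</street><town>New York</town></address>
  <address><name>XML4You</name><street>96 Hyper Road</street><town>Boston</town></address>
</addresses>
"""

def nest_address(flat):
    """Rewrite a flat address into address(name, place(street, town))."""
    nested = ET.Element("address")
    ET.SubElement(nested, "name").text = flat.findtext("name")
    place = ET.SubElement(nested, "place")
    for tag in ("street", "town"):
        ET.SubElement(place, tag).text = flat.findtext(tag)
    return nested

nested_addresses = [
    nest_address(flat) for flat in ET.fromstring(FLAT_XML).findall("address")
]
first = ET.tostring(nested_addresses[0], encoding="unicode")
print(first)
```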


Address Document: XSLT Stylesheet Template as a Tree-Transforming Rule


Colocation Rule: Model-Theoretic Semantics via Consequence Generation :
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses
% start rule base for colocated (Horn rule)
colocated(name(N1),name(N2)) :-
    address(name(N1),place(P)),
    address(name(N2),place(P)),
    lexiless(N1,N2).
% end rule base for colocated
% start fact base for colocated
colocated( name( "Me2XML" ), name( "XML4You") ).
% end fact base for colocated
The Herbrand model of the rule and addresses is the set of the colocated and address ground facts.
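Consequence generation for this rule can be sketched in a few lines of Python. Two assumptions: a place is compared as a whole (street and town together), and lexiless is taken to mean lexicographic less-than on the names, which also keeps each pair from appearing twice. The names ADDRESSES and colocated_facts are hypothetical.

```python
# Address facts as (name, (street, town)) pairs, copied from the fact base.
ADDRESSES = [
    ("Me2XML", ("96 Hyper Road", "Boston")),
    ("RDF4All", ("2001 Broadway", "New York")),
    ("XML4You", ("96 Hyper Road", "Boston")),
]

def colocated_facts(addresses):
    """Derive all (N1, N2) pairs that share a place, with N1 before N2."""
    return {
        (n1, n2)
        for n1, p1 in addresses
        for n2, p2 in addresses
        if p1 == p2 and n1 < n2  # n1 < n2 plays the role of lexiless(N1, N2)
    }

derived = colocated_facts(ADDRESSES)

# The finite model: the given address facts plus the derived colocated facts.
model = {("address", name, place) for name, place in ADDRESSES}
model |= {("colocated", n1, n2) for n1, n2 in derived}
print(sorted(derived))
```

Because the rule is not recursive, a single pass over the address facts already yields every consequence.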


Linked Address Documents: Metadata Semantics via RDF Annotations :
http://addr.flat.com (flat):
Me2XML, 96 Hyper Road, Boston
RDF4All, 2001 Broadway, New York
. . .
http://addr.nest.com (nested):
Me2XML, 96 Hyper Road, Boston
. . .


Practical Semantics Combination: Metadata → Transformation → Model :
Consider the following problem of (inferential, XML) data mining with report generation for findings:
Generate the finite model containing all colocated facts derivable from given flat-address base facts, with inference rules available only for nested facts.
This problem can be divided into three subproblems:
1) Navigate metadata, starting from flat-address URL, for available nested-address version (alternatively, use a semantic search engine with Shape = nested)
2) If none available, transform flat-address facts into nested addresses via the URL’s XSLT stylesheet
3) Apply colocated rule to nested-address base facts to generate finite model of colocated facts


(1) Check Metadata via RDF Annotations :
http://addr.flat.com (flat):
Me2XML, 96 Hyper Road, Boston
RDF4All, 2001 Broadway, New York
. . .
Linked document: http://addr.nest.com
RDF properties shown: ConvertsTo, Shape
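The metadata check can be sketched with plain subject-property-value triples. The property names Shape and ConvertsTo and the two URLs come from the slide; the direction of ConvertsTo (from the flat document to the nested one) and the function find_version are assumptions, and no RDF library is used.

```python
# RDF-style annotations as (subject, property, value) triples.
# Assumption: ConvertsTo points from the flat document to its nested version.
TRIPLES = [
    ("http://addr.flat.com", "Shape", "flat"),
    ("http://addr.nest.com", "Shape", "nested"),
    ("http://addr.flat.com", "ConvertsTo", "http://addr.nest.com"),
]

def find_version(triples, start_url, wanted_shape):
    """Return a URL with the wanted Shape, reachable from start_url, or None."""
    shapes = {s: o for s, p, o in triples if p == "Shape"}
    if shapes.get(start_url) == wanted_shape:
        return start_url
    for subject, prop, value in triples:
        if (subject == start_url and prop == "ConvertsTo"
                and shapes.get(value) == wanted_shape):
            return value
    return None

nested_url = find_version(TRIPLES, "http://addr.flat.com", "nested")
print(nested_url)
```

If the lookup returns None, the next subproblem (transforming the flat addresses) applies.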


(2) Transform via the XSLT Stylesheet :
Me2XML, 96 Hyper Road, Boston
RDF4All, 2001 Broadway, New York
XML4You, 96 Hyper Road, Boston
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses


(3) Generate Model as Rule Consequences :
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses
% start rule base for colocated (Horn rule)
colocated(name(N1),name(N2)) :-
    address(name(N1),place(P)),
    address(name(N2),place(P)),
    lexiless(N1,N2).
% end rule base for colocated
Derived pair: Me2XML, XML4You
Data findings report: Me2XML and XML4You might be the same organization


Model-Theoretic Semantics :
- Practically usable only for finite models (such as in the address example)
- Still theoretically interesting to formalize semantics of XML-based inference systems (such as RFML, RuleML, DAML, or OIL)
- Even when finite, not practical for highly distributed and highly dynamic fact bases (such as the ever-changing geographic data scattered over the Web)
- Perhaps to be replaced/augmented by semantics characterizing new logic for the Web, which is open, uncertain, and paraconsistent


Transformational Semantics :
- Practically usable for all declarative programs, e.g. for normalization or interoperation (such as in the address example)
- XSLT stylesheet engines flourish, e.g. Cocoon, and will probably be built directly into most Web browsers
- XSLT with variables and parameter passing was recently shown to be relationally complete
- The emerging XML query algebra permits similar transformations, incl. certain functional programs
- The Rule Markup Initiative will provide a lattice of XML DTDs (Schemas) for RuleML subsets containing inference and/or transformation rules


Metadata Semantics :
- Practically usable for describing/localizing all possible (Web) objects, whose internals need not be accessible (unlike in the address example)
- RDF can be formalized logically and its expressive power may be generalized via logic programming (and hypergraphs): metadata combined with rules
- Metadata complemented by subsumption semantics for XML tags to better integrate XML and RDF
- RDF extensible by subClassOf/subPropertyOf vocabularies (cf. sorted logics), as in RDF Schema, and, further, by full ontologies, as in DAML or OIL


SubPropertyOf Example: An Illustrative Hierarchy of Properties :
Hierarchy diagram (node labels): Property, Surface, Body, Color, Shape, Texture, Composition, Density, Hardness, nested, flat, . . .
Property inheritance handled as, e.g., for description logic roles: RDF Schema, OIL and DAML


Conclusions :
- Identified and exemplified three complementary semantics for XML data: Transformations, Models, Metadata
- For data distributed in the Web, models are of limited use, while transformations and metadata are being widely applied
- Already the different usage of transformations and metadata would suggest: Further semantics will be needed for the Web, e.g.:
  - SQL semantics: Contributions to this Dagstuhl Seminar
  - “URI-deictic” logics: Berners-Lee’s “pointing as proving”