Towards a new generation of semantic web applications : Towards a new generation of semantic web applications Prof. Enrico Motta, PhD Knowledge Media Institute The Open University Milton Keynes, UK
The Semantic Web : The Semantic Web A large scale, heterogeneous collection of formal, machine processable, web accessible, ontology-based statements (semantic metadata) about web resources and other entities in the world, expressed in an XML-based syntax
The Semantic Web (pragmatic def.) : The Semantic Web (pragmatic def.) The collection of all statements expressed in one of the following formalisms: {OWL, RDF, DAML, DAML+OIL, RDF-A…}, which can be accessed on the web
Slide4 : Person Organization String Organization-Unit partOf hasAffiliation worksInOrgUnit hasJobTitle Ontology Metadata
Slide5 : Ontology Metadata UoD
Proposition #1 : Proposition #1 The SW today has already reached a level of scale good enough to make it a very useful source of knowledge to support intelligent applications
This is unprecedented in the history of AI
So, let's have a look at the semantic web as it is today…. : So, let's have a look at the semantic web as it is today….
Charting the web : Charting the web
Charting the web (2) : Charting the web (2)
Proposition #2 : Proposition #2 The SW may well provide a solution to one of the classic AI challenges: how to construct and manage large volumes of knowledge to construct truly intelligent problem solvers and address the brittleness of traditional KBS
Knowledge Representation Hypothesis : Knowledge Representation Hypothesis Any mechanically embodied intelligent process will be comprised of structural ingredients that
we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and
independent of such external semantic attribution, play a formal but causal and essential role in engendering the behaviour that manifests that knowledge
Brian Smith, 1982
Intelligence as a function of possessing domain knowledge : Intelligence as a function of possessing domain knowledge Intelligent Behaviour KA
Bottleneck
The Knowledge Acquisition Bottleneck : The Knowledge Acquisition Bottleneck Intelligent Behaviour KA
Bottleneck
SW as Enabler of Intelligent Behaviour : SW as Enabler of Intelligent Behaviour Intelligent Behaviour
KBS vs SW Systems : KBS vs SW Systems
Key Paradigm Shift : Key Paradigm Shift
Overall Goal : Overall Goal Our research programme is to contribute to the development of this large-scale web of data and develop a new generation of web applications able to exploit it to provide intelligent functionalities
So, how can we exploit this emerging, large scale semantic resource? : So, how can we exploit this emerging, large scale semantic resource? Some examples….
Ontology Matching : Ontology Matching
New paradigm: use of background knowledge : New paradigm: use of background knowledge (Diagram: terms A and B are anchored to A’ and B’ in background knowledge (external source), where A’ and B’ are connected by a relation R)
External Source = One Ontology : External Source = One Ontology Aleksovski et al. EKAW’06
Map (anchor) terms into concepts from a richly axiomatized domain ontology
Derive a mapping based on the relation of the anchor terms
Assumes that a suitable (rich, large) domain ontology (DO) is available.
External Source = Web : External Source = Web van Hage et al. ISWC’05
Rely on Google and an online dictionary in the food domain to extract semantic relations between candidate terms using IR techniques
(Diagram: A rel B, derived from Google + online dictionary via IR methods)
Precision increases significantly if domain-specific sources are used:
50% - Web;
75% - domain texts.
Does not rely on a rich domain ontology.
External Source = SW : External Source = SW Proposal:
rely on online ontologies (Semantic Web) to derive mappings
ontologies are dynamically discovered and combined
(Diagram: A rel B, derived from the Semantic Web)
Does not rely on any pre-selected knowledge sources.
M. Sabou, M. d’Aquin, E. Motta, "Using the Semantic Web as Background Knowledge in Ontology Mapping", Ontology Mapping Workshop, ISWC’06. Best Paper Award
Slide33 : How to combine online ontologies to derive mappings?
Strategy 1 - Definition : Strategy 1 - Definition Find ontologies that contain equivalent classes for A and B and use their relationship in the ontologies to derive the mapping.
(Diagram: A rel B, derived from the Semantic Web; pairs A1’ B1’, A2’ B2’, …, An’ Bn’ found in ontologies O1, O2, …, On)
For each ontology use these rules: …
These rules can be extended to take into account indirect relations between A’ and B’, e.g., between parents of A’ and B’:
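Strategy 1 can be sketched roughly as follows. The slide elides the actual derivation rules, so this sketch only carries over whatever relation each ontology asserts between the two classes. It also assumes that "equivalent class" means an exact name match, and the toy ontologies, class names and relation labels (Beef, Food, subClassOf, ...) are hypothetical illustrations, not data from the talk.

```python
from typing import Dict, List, Optional, Tuple

# A toy "ontology": maps an ordered pair of class names to the relation
# asserted between them. All names and labels here are illustrative only.
Ontology = Dict[Tuple[str, str], str]

# Used to read a relation in the opposite direction.
INVERSE = {
    "subClassOf": "superClassOf",
    "superClassOf": "subClassOf",
    "equivalent": "equivalent",
    "disjoint": "disjoint",
}


def relation_in(ontology: Ontology, a: str, b: str) -> Optional[str]:
    """Return the relation between a and b asserted in one ontology, if any."""
    if (a, b) in ontology:
        return ontology[(a, b)]
    if (b, a) in ontology:
        return INVERSE[ontology[(b, a)]]
    return None


def strategy1(a: str, b: str,
              ontologies: Dict[str, Ontology]) -> List[Tuple[str, str]]:
    """Collect (ontology name, relation) for every ontology relating a and b.

    Assumption: a class is "equivalent" to a term if the names match exactly.
    """
    found = []
    for name, onto in ontologies.items():
        rel = relation_in(onto, a, b)
        if rel is not None:
            found.append((name, rel))
    return found


# Hypothetical online ontologies O1..O3.
ontologies = {
    "O1": {("Beef", "Food"): "subClassOf"},
    "O2": {("Food", "Beef"): "superClassOf"},
    "O3": {("Wine", "Drink"): "subClassOf"},
}
print(strategy1("Beef", "Food", ontologies))
```

Each returned pair is one piece of evidence for a mapping. How conflicting evidence from different ontologies should be reconciled is left open here.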
Strategy 1- Examples : Strategy 1- Examples
Strategy 2 - Definition : Strategy 2 - Definition Principle: If no ontologies are found that contain the two terms then combine information from multiple ontologies to find a mapping.
(Diagram: A rel B, derived from the Semantic Web by chaining relations through an intermediate concept C across ontologies)
Details:
(1) Select all ontologies containing A’ equiv. with A
(2) For each ontology containing A’:
(a) if find relation between C and B.
(b) if find relation between C and B.
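One way to read Strategy 2 is as chaining relations through an intermediate concept C found in a different ontology. The conditions in steps (a) and (b) are lost in the slide text, so the sketch below assumes only the simplest case: A is a subclass of C in one ontology and C is a subclass of B in another, which gives A subclass of B by transitivity. Exact name matching and the toy data (Chicken, Poultry, Food, O1, O2) are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# A toy ontology: a set of (subclass, superclass) pairs. Illustrative only.
Ontology = Set[Tuple[str, str]]


def superclasses(onto: Ontology, term: str) -> List[str]:
    """Direct superclasses of term asserted in one ontology."""
    return sorted(sup for sub, sup in onto if sub == term)


def strategy2(a: str, b: str,
              ontologies: Dict[str, Ontology]) -> List[Tuple[str, str, str]]:
    """Find chains: a subClassOf c in one ontology, c subClassOf b in another.

    Returns (ontology containing a, intermediate concept c, ontology relating
    c to b). Each chain supports the mapping "a subClassOf b".
    """
    chains = []
    for name1, onto1 in ontologies.items():
        for c in superclasses(onto1, a):
            for name2, onto2 in ontologies.items():
                if name2 != name1 and (c, b) in onto2:
                    chains.append((name1, c, name2))
    return chains


# Hypothetical ontologies: neither one contains both Chicken and Food.
ontologies = {
    "O1": {("Chicken", "Poultry")},
    "O2": {("Poultry", "Food")},
}
print(strategy2("Chicken", "Food", ontologies))
```

The sketch handles only one hop and only subsumption. Other relation types, or longer chains, would need further rules that the slide does not spell out.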
Strategy 2 - Examples : Strategy 2 - Examples Vs. (midlevel-onto) (Tap) Ex1: Vs. Ex2: (r1) (pizza-to-go) (SUMO) (Same results for Duck, Goose, Turkey) (r1) Vs. Ex3: (pizza-to-go) (wine.owl) (r3)
Large Scale Evaluation : Large Scale Evaluation Matching AGROVOC (16k terms) and NALT (41k terms)
Evaluation: 1600 mappings, two teams
Average precision: 70% (comparable to best in class)
(derived from 180 different ontologies)
M. Sabou, M. d’Aquin, W.R. van Hage, E. Motta, "Improving Ontology Matching by Dynamically Exploring Online Knowledge".
Chart 2 : Chart 2
Proposition #3 : Proposition #3 Using the SW to dynamically provide background knowledge to tackle the AGROVOC/NALT mapping problem provides the first ever test case in which the SW, viewed as a large scale heterogeneous resource, has been successfully used to address a real-world problem
Next Generation Semantic Web Applications : Next Generation Semantic Web Applications NG SW Application
Able to exploit the SW at large
Dynamically retrieving the relevant semantic resources
Combining several, heterogeneous Ontologies
Contrast with 1st generation SW Applications : Contrast with 1st generation SW Applications Typically use a single ontology
Usually providing a homogeneous view over heterogeneous data sources.
Limited use of existing SW data
Typically closed to semantic resources
1st generation SW applications are far more similar to traditional KBS (closed semantic systems) than to 'real' SW applications (open semantic systems)
It is still early days.. : 1895 2007 It is still early days..
Current Gateway to the Semantic Web : Current Gateway to the Semantic Web
Limitations of Swoogle : Limitations of Swoogle Very limited quality control mechanisms
Many ontologies are duplicated
No quality information provided
Limited Query/Search mechanisms
Only keyword search; no distinction between types of elements
need for more powerful query methods (e.g., ability to pose formal queries; ability to distinguish between classes and instances, etc…)
Limited range of ontology ranking mechanisms
Swoogle only uses a 'popularity-based' one
No support for ontology modularization
A New Gateway to the Semantic Web : A New Gateway to the Semantic Web
Ontology Structuring Relations : Ontology Structuring Relations extends inconsistent-with
Ontology Structuring Relations : Ontology Structuring Relations extends Inconsistent-with inconsistent-with
Formal Queries and relation discovery… : Formal Queries and relation discovery…
Current state of Watson : Current state of Watson Initial version implemented
Demo version available online
See http://watson.kmi.open.ac.uk/
However still rather unstable…..
Stable version to be available within 4-6 weeks
Initial crawl of the SW has already produced interesting results….
Some initial figures… : Some initial figures… Lots of ontologies are in OWL Full (3x the number of OWL Lite)
… but most of the ontologies use only a very restricted sub-part of the expressivity of OWL and DAML, e.g.,
only 147 go beyond ALC
role transitivity is used in only 11 ontologies……..
Almost 20% of semantic resources appear to be duplicates
Next Generation Semantic Web Applications : Next Generation Semantic Web Applications PowerMagpie PowerAqua
Folksonomies : Folksonomies Tags are great to organize data!!! But they don’t help much when searching…
Finding tagged images : Finding tagged images
Slide55 : Finding tagged images – FLOWER
What if … : What if … folksonomies were semantically richer (Diagram: Rose, Tulip and Lilac linked to Flower)
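The idea of a semantically richer folksonomy can be illustrated with a small sketch: a search for "flower" also returns images tagged only with a kind of flower. The subclass table and the tagged images below are hypothetical, and the expansion shown (tag plus all its subclasses) is just one possible way to use such knowledge.

```python
from typing import Dict, List, Set

# Hypothetical subclass knowledge, e.g. as it might be drawn from an ontology.
SUBCLASSES: Dict[str, Set[str]] = {"flower": {"rose", "tulip", "lilac"}}

# Hypothetical tagged images: image id -> tags.
IMAGES: Dict[str, Set[str]] = {
    "img1": {"rose", "garden"},
    "img2": {"flower"},
    "img3": {"tulip", "holland"},
    "img4": {"car"},
}


def expand(tag: str) -> Set[str]:
    """Return the tag plus all of its (transitive) subclasses."""
    seen = {tag}
    stack = [tag]
    while stack:
        for child in SUBCLASSES.get(stack.pop(), set()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def search(tag: str, semantic: bool = False) -> List[str]:
    """Return ids of images carrying the tag (or, if semantic, any subclass)."""
    wanted = expand(tag) if semantic else {tag}
    return sorted(img for img, tags in IMAGES.items() if tags & wanted)


print(search("flower"))                 # plain keyword match
print(search("flower", semantic=True))  # match on the tag or any subclass
```

With plain tag matching only the image literally tagged "flower" is found. With expansion the rose and tulip images are found as well.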
Finding tagged images –FLOWER (II) : Finding tagged images – FLOWER (II)
Learning Relations Between Tags : Learning Relations Between Tags (Diagram: Tags → NLP/Clustering → Clusters, e.g. {camera, digital, photograph}, {damage, flooding, hurricane, katrina, Louisiana} → find and combine online ontologies → Ontology)
L. Specia, E. Motta, "Integrating Folksonomies with the Semantic Web", ESWC 2007.
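The clustering step can be sketched as follows. The slide does not say which clustering method is used, so this sketch swaps in a deliberately simple one: tags that co-occur at least `min_count` times are linked, and each connected group of linked tags forms a cluster. The function names, the threshold and the sample posts are all assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple


def cooccurrence(posts: List[Set[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often each pair of tags appears on the same post."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for tags in posts:
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


def clusters(posts: List[Set[str]], min_count: int = 2) -> List[Set[str]]:
    """Group tags linked by at least min_count co-occurrences.

    Clusters are the connected components of the thresholded tag graph.
    """
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for (a, b), n in cooccurrence(posts).items():
        if n >= min_count:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen: Set[str] = set()
    result: List[Set[str]] = []
    for tag in sorted(neighbours):
        if tag in seen:
            continue
        group: Set[str] = set()
        stack = [tag]
        while stack:
            t = stack.pop()
            if t not in group:
                group.add(t)
                stack.extend(neighbours[t] - group)
        seen |= group
        result.append(group)
    return result


# Hypothetical tagged posts.
posts = [
    {"camera", "digital"},
    {"camera", "digital", "photograph"},
    {"digital", "photograph"},
    {"hurricane", "katrina"},
    {"hurricane", "katrina", "flooding"},
    {"flooding", "katrina"},
    {"camera", "katrina"},
]
print(clusters(posts))
```

The clusters produced this way would then be the input to the next stage in the diagram, where online ontologies are searched for concepts and relations that connect the tags in each cluster.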
In More Detail… : In More Detail…
Examples : Examples
Examples : Examples
Examples : Examples
Key Research Tasks : Key Research Tasks Overall Infrastructure
crawling, storing, structuring, querying the SW
Ontology Selection
In the context of dynamically identifying the sources of knowledge relevant to the needs of a system
Ontology Mapping
When integrating information from different ontologies
When mapping query/specs to ontologies
Ontology Modularization
Find the sub-modules relevant to a system's query.
Semantic Markup Generation
From various types of sources
New task context : New task context Key point is that NG-SW applications require solutions in a new dynamic context (run-time rather than design-time)
Example: Ontology Mapping
Much current work focuses on design-time mapping of complete ontologies
Example: Ontology Selection
Current work focuses on user-mediated ontology selection
Example: Ontology Modularization
Current work by and large assumes that the user is in the loop
References : References Ontology Selection
Sabou, M., Lopez, V., Motta, E. (2006). "Ontology Selection for the Real Semantic Web: How to Cover the Queen’s Birthday Dinner?". Proceedings of EKAW 2006
Ontology Modularization
D'Aquin, M., Sabou, M., Motta, E. (2006). "Modularization: A key for the dynamic selection of relevant knowledge components". ISWC 2006 Workshop on Ontology Modularization
Watson
d’Aquin, M., Sabou, M., Dzbor, M., Baldassarre, C., Gridinoc, L., Angeletou, S. and Motta, E.: "WATSON: A Gateway for the Semantic Web". Poster Session at ESWC 2007
References (2) : References (2) Ontology Mapping
Lopez, V., Sabou, M., Motta, E. (2006). "Mapping the real semantic web on the fly". ISWC 2006
Sabou, M., D'Aquin, M., Motta, E. (2006). "Using the semantic web as background knowledge for ontology mapping". ISWC 2006 Workshop on Ontology Mapping.
Integration of folksonomies and SW
L. Specia, E. Motta, "Integrating Folksonomies with the Semantic Web", ESWC 2007.
'Vision' Papers : 'Vision' Papers Motta, E., Sabou, M. (2006). "Next Generation Semantic Web Applications". 1st Asian Semantic Web Conference, Beijing.
Motta, E., Sabou, M. (2006). "Language Technologies and the Evolution of the Semantic Web". LREC 2006, Genoa, Italy.
Motta, E. (2006). "Knowledge Publishing and Access on the Semantic Web: A Socio-Technological Analysis". IEEE Intelligent Systems, Vol.21, 3, (88-90).
Conclusions : Conclusions SW provides an unprecedented opportunity to build a new generation of intelligent systems, able to exploit large scale, heterogeneous KBs
This new class of systems is fundamentally different in many respects both from traditional KBS and even from early SW applications
The size of the SW is increasing steadily and the infrastructure is getting more and more robust. These developments should enable more and more new generation SW applications to emerge within 2-3 years