
Building semantic applications: 

Building semantic applications Paul Warren BT ACAI’05/SEKT’05 ADVANCED COURSE ON KNOWLEDGE TECHNOLOGIES

Introductions: myself, and SEKT: 

Introductions: myself, and SEKT Paul Warren Next Generation Web Research, BT SEKT SEmantic Knowledge Technologies machine learning for ontology creation HLT for metadata extraction managing and reasoning with ontologies Motivated by corporate knowledge needs looking forward to the Semantic Web


Motivation The need for semantics Acquiring and using semantics Integrating information SEKT Applications Ontology engineering in SEKT Applications Challenges


Motivation ‘Knowledge management’ Knowledge worker productivity is the biggest challenge facing organisations 40% of U.S. workforce are knowledge workers Peter Drucker Information integration Heterogeneous data sources across or within organisations sensor networks

The need for semantics: 

The need for semantics Corporate workers are overwhelmed with information: from intranets, emails, external newslines … but may still lack the information they require They need information identified: by semantics, not just keywords precise and complete by their interests and their task context defined semantically

Higher precision, greater recall: 

Higher precision, greater recall Precision Find me information about Washington the man, not the state or city Find me information about a company called X which operates in industry Y Recall Ask for information about ‘George W Bush’ and be given documents on ‘the President’ the need for semantics
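The two measures on this slide can be made concrete with a small worked example. This is a minimal sketch only: the document ids and the two result sets are invented for illustration and are not taken from the presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved and relevant document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: Washington the man, not the state or city.
relevant = {"d1", "d2", "d3", "d4"}                   # documents about the man
keyword_hits = {"d1", "d2", "d5", "d6", "d7", "d8"}   # keyword match also returns state/city
semantic_hits = {"d1", "d2", "d3"}                    # match on the disambiguated entity

keyword_scores = precision_recall(keyword_hits, relevant)
semantic_scores = precision_recall(semantic_hits, relevant)
```

In this invented case the keyword search returns many documents about the wrong Washington, which lowers precision, and misses documents that refer to the man in other words, which lowers recall.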

Precision in searching: 

Precision in searching the need for semantics

Interests and context: 

Interests and context Need information about Jaguar? interested in cars, the natural world, South America … with a context defined by current activities Not just about searching interest & context to share information … … and to push information to user … plus many integrated applications the need for semantics

Too much relevant information: 

Too much relevant information They may even have too much relevant information Need to: aggregate from disparate sources remove duplication present meaningfully classified summarised the need for semantics

In the right form: 

In the right form Depending on physical context mobile phone, PDA, blackberry With appropriate visualisation relation between documents & concepts And expressed in natural language where this aids understanding multilingual Integrated into the desktop applications Seamless proactive, not reactive the need for semantics

Visualisation knowledge: 

Visualisation knowledge Key: white - concepts; orange – projects; lighter shading - clusters of projects the need for semantics

The goal: 

The goal Finding and sharing knowledge through its semantics for improved precision and recall for the user’s interests and current context Presenting information visually in natural language Extracting information in a meaningful way, without duplication Displaying all relevant information from the document and the knowledgebase the need for semantics

Acquiring and using semantics: 

Acquiring and using semantics Some manually generated for high value applications e.g. life sciences Most (semi-)automatically generated machine learning / statistical techniques HLT: ontology-based information extraction from context


Context What is known about the author? use his interests to disambiguate What is attached to an object import an object, import its metadata Where is it? position in folder structure Provenance attachment from email acquiring and using semantics

Ontology modelling: 

Ontology modelling sells to operates in employee size acquiring and using semantics

Understanding ontologies: 

Understanding ontologies On the left is a hierarchical classification of companies. This distinguishes between private and public companies, and between E.U. and non-E.U. companies. Note that, unlike in a taxonomy, a class may have more than one superclass, so that ‘companies on the New York Stock Exchange’ is a subclass both of the class ‘non-E.U. companies’ and of the class ‘public companies’. The classes are made up of instances, in this case individual companies, which are not shown here. Instances of a class are, of course, also instances of its superclasses. So any instance of ‘companies on the London Stock Exchange’ (e.g. BT) is also an instance of ‘public companies’, ‘E.U. companies’ and ‘companies’. On the right is shown another part of the ontology, this time concerned with classifying industries. All classes in an ontology are related by a chain of superclasses to a class ‘Thing’, which contains all instances in the ontology.
As well as classes and instances, an ontology contains properties. Properties, shown by arrows in the diagram, are defined on a given class and are of two kinds. One kind of property relates the instances of the class to some literal value. An example of this is the property ‘employee size’, which could be used to describe how many employees a company has. The other kind of property relates instances of one defined class to instances of another, or the same, defined class. The property ‘operates in’ relates companies to industries, whilst ‘sells to’ relates companies to one another. The properties shown here apply to all the subclasses of ‘company’ (since instances of the subclasses are also instances of ‘company’); we could also have defined additional properties specific to any of the subclasses.
Ontologies have formally defined semantics. This means that computers can reason about the constructs in an ontology. Computer scientists, mathematicians and logicians have developed a great deal of formal theory to understand how to do this most effectively and efficiently. Recently this has resulted in the standardisation by the W3C of the ontology language OWL. OWL exists in a variety of species, which correspond to varying degrees of implementational and computational difficulty. acquiring and using semantics
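The rule that an instance of a class is also an instance of all its superclasses can be sketched in a few lines. The class identifiers below are hypothetical names chosen for the classes described on the slide. The plain dictionaries stand in for a real ontology store and are an assumption of this sketch, not how OWL tools represent the data.

```python
# Hypothetical fragment of the company ontology; a class may list several superclasses.
SUPERCLASSES = {
    "Thing": [],
    "Company": ["Thing"],
    "PublicCompany": ["Company"],
    "EUCompany": ["Company"],
    "NonEUCompany": ["Company"],
    "LSECompany": ["PublicCompany", "EUCompany"],
    "NYSECompany": ["PublicCompany", "NonEUCompany"],
}

# Each instance is asserted to belong to one direct class.
INSTANCE_OF = {"BT": "LSECompany"}


def ancestors(cls):
    """Return every superclass reachable from cls by following superclass links."""
    seen = set()
    stack = list(SUPERCLASSES.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, []))
    return seen


def classes_of(instance):
    """Return the direct class of an instance plus all inherited superclasses."""
    direct = INSTANCE_OF[instance]
    return {direct} | ancestors(direct)


bt_classes = classes_of("BT")
```

Here `bt_classes` contains the London Stock Exchange class together with the public, E.U., company and ‘Thing’ classes, which matches the BT example in the text.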


Metadata Describing documents, sub-documents, pages … author, creation date, topic(s), related to, … entities within documents classes: people, companies, roles … relations: CEO of … building a knowledgebase acquiring and using semantics

Accessing a knowledgebase: 

Accessing a knowledgebase acquiring and using semantics

The knowledgebase: 

The knowledgebase acquiring and using semantics

Ontology-based information extraction: 

Ontology-based information extraction Ryanair announced yesterday that it will make Shannon its next European base, expanding its route network to 14 in an investment worth around €180m. The airline says it will deliver 1.3 million passengers in the first year of the agreement, rising to two million by the fifth year. acquiring and using semantics
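As a rough illustration of what annotating this text against an ontology might involve, the sketch below does a plain gazetteer lookup: known instance labels are matched in the text and tagged with an ontology class. The gazetteer entries and class names are assumptions made for the example. The HLT pipeline used in SEKT is not described on the slide and is certainly richer than this.

```python
import re

# Hypothetical gazetteer: instance label -> ontology class.
GAZETTEER = {"Ryanair": "Airline", "Shannon": "Location"}


def annotate(text, gazetteer):
    """Return (offset, label, class) for each gazetteer label found in the text."""
    annotations = []
    for label, cls in gazetteer.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text):
            annotations.append((match.start(), match.group(), cls))
    return sorted(annotations)  # order by position in the text


sentence = ("Ryanair announced yesterday that it will make Shannon "
            "its next European base")
annotations = annotate(sentence, GAZETTEER)
labels = [(label, cls) for _, label, cls in annotations]
```

Each annotation could then be linked to the matching instance in the knowledgebase, or used to propose a new instance when none exists.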

Information integration: 

Information integration Motivated by Incompatible legacy systems Mergers and acquisitions Rapidly forming virtual organisations and supply chains Sensor networks Goal Merging information from heterogeneous unstructured (text) sources … with structured information

Mapping ontologies: 

Mapping ontologies Semi-automatic techniques based on similarities of name & structure or even sound (for = 4) e.g. PROMPT suite – plug-ins for Protégé Semantic mapping – set based equality (=) mismatch (⊥) more general (⊆) more specific (⊇) overlap (∩) information integration
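The set-based view of semantic mapping can be illustrated by comparing the instance sets of two concepts. This is a minimal sketch that assumes each concept is represented simply by the set of its instances. The function name and the sample sets are hypothetical.

```python
def semantic_relation(a, b):
    """Classify how concept a relates to concept b, given their instance sets."""
    a, b = set(a), set(b)
    if a == b:
        return "equality"
    if a >= b:
        return "more general"   # a contains every instance of b
    if a <= b:
        return "more specific"  # every instance of a is also in b
    if a & b:
        return "overlap"
    return "mismatch"


# Hypothetical instance sets taken from two different ontologies.
airlines = {"Ryanair", "Lufthansa"}
companies = {"Ryanair", "Lufthansa", "BT"}
telecoms = {"BT", "Vodafone"}
charities = {"Oxfam"}

relation_general = semantic_relation(companies, airlines)
relation_overlap = semantic_relation(companies, telecoms)
relation_mismatch = semantic_relation(airlines, charities)
```

In practice the instance sets are rarely available in full, so mapping tools estimate these relations from names, structure and other evidence, as the slide notes.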

Applications … in SEKT: 

Applications … in SEKT

Intelligent content management: 

Intelligent content management Currently Two major document databases million articles – abstracts plus some full text Text-based and some attribute-based querying: e.g. author, date information spaces defined by queries BT digital library SEKT applications

Improving and extending: 

Improving and extending Better precision and recall in searching, alerting, sharing Automatic document annotation extending the knowledgebase clicking through to the knowledgebase An extended document corpus focussed crawling from Web and intranet Automatic classification extending and improving manual approach Browsing related documents Driven by interests and context learned from user’s behaviour SEKT applications – intelligent content management

SEKT architecture: 

SEKT architecture SEKT applications creating & amending concepts, instances annotating & correcting annotations

Knowledge management: 

Knowledge management sharing and reusing knowledge across a global team … … building on Siemens knowledgemotion® SEKT applications

Improving knowledge sharing: 

Improving knowledge sharing Sharing: Elements presentations, lessons learned Solutions application module, graphical interface Project approaches methodologies, models Pre-packaged projects with direct sales impact SEKT applications – knowledge management

Intelligent decision support: 

Intelligent decision support SEKT applications a database of frequently asked questions – using semantic distance to identify questions and answers with justification drawn from comprehensive legal databases combining formal and informal knowledge

Semantic distance: 

Semantic distance Semantic distance is based on weighted path length between concepts Path length is based on navigation from one concept to another through any relation available Is-a Part-of Follows Actor … SEKT applications – intelligent decision support Source: iSOCO
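A weighted path length of this kind can be computed with a standard shortest-path search. The sketch below uses Dijkstra's algorithm over a small concept graph. The weights per relation type, the sample concepts and the choice to let relations be navigated in both directions are all assumptions made for illustration. The slide does not give the weighting actually used by iSOCO.

```python
import heapq

# Hypothetical weights: a lower weight means the relation keeps concepts closer.
RELATION_WEIGHT = {"is-a": 1.0, "part-of": 2.0, "follows": 3.0, "actor": 3.0}

# Hypothetical concept graph as (source, relation, target) triples.
TRIPLES = [
    ("Appeal", "is-a", "Proceeding"),
    ("Trial", "is-a", "Proceeding"),
    ("Hearing", "part-of", "Trial"),
    ("Appeal", "follows", "Trial"),
    ("Judge", "actor", "Hearing"),
]


def build_graph(triples):
    """Build an adjacency list; every relation can be navigated in both directions."""
    graph = {}
    for source, relation, target in triples:
        weight = RELATION_WEIGHT[relation]
        graph.setdefault(source, []).append((target, weight))
        graph.setdefault(target, []).append((source, weight))
    return graph


def semantic_distance(graph, start, goal):
    """Return the smallest total weight of any path from start to goal."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")  # no path between the two concepts


GRAPH = build_graph(TRIPLES)
appeal_to_trial = semantic_distance(GRAPH, "Appeal", "Trial")
judge_to_appeal = semantic_distance(GRAPH, "Judge", "Appeal")
```

With these weights the route from Appeal to Trial through their shared superclass (two is-a steps) is shorter than the direct ‘follows’ link, which shows how the choice of weights shapes which questions count as close.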

Better decisions: 

Better decisions Using Ontology of Professional Legal Knowledge developed with DILIGENT methodology Rulings a variety of legal databases Mapping between models of PLK and rulings SEKT applications – intelligent decision support

OPLK classes identified: 

OPLK classes identified SEKT applications – intelligent decision support


PROCEEDINGS Intuitive ontological subdomains SEKT applications – intelligent decision support

Using factorial analysis: 

Using factorial analysis SEKT applications – intelligent decision support

Ontological subdomains: 

Ontological subdomains SEKT applications – intelligent decision support

Architecture of Iuriservice: 

Architecture of Iuriservice SEKT applications – intelligent decision support

Ontology engineering in SEKT: 

Ontology engineering in SEKT PROTON – PROTo ONtology ~ 250 classes; ~ 100 properties domain independent compliance with popular standards good coverage of concrete entities people, organisations, numbers OWL Lite

Person class: 

Person class Subclass of Agent Superclass of Man and Woman hasPosition Person -> JobPosition hasProfession Person -> Profession hasRelative, isBossOf Person -> Person Ontology engineering in SEKT

Property and class hierarchies: 

Property and class hierarchies Ontology engineering in SEKT hasRelative Agent Group Organization Commercial Organization Charity Company Airline Bank Insurance Company Media Company

Profiles in the Digital Library: 

Profiles in the Digital Library Ontology engineering in SEKT


Topics UserProfile isCurrentlyInterestedIn Topic InspecRecord hasSubject Topic Topics are instances of the class Topic Compare ‘taxonomic’ approach Avoids classes as property values OWL Full Ontology engineering in SEKT

Classes as property values: 

Classes as property values Ontology engineering in SEKT Source: Representing Classes as Property Values, Natasha Noy, W3C


Diligent DIstributed Loosely-controlled and evolvInG Engineering of oNTologies Motivated by the need to develop shared ontologies for sharing knowledge Ex-post analysis in biology domain Based on Rhetorical Structure Theory seeks to explain the coherence of texts identifies relations: elaboration, evaluation, justification, contrast, alternative, example, counter example, background knowledge, motivation, summary, solutionhood, restatement, purpose, condition, preparation, circumstance, result, enablement, list … DILIGENT uses subset of these Ontology engineering in SEKT

Distributed and loosely controlled: 

Distributed and loosely controlled The steps build – domain experts, users … local adaptation – users analysis and revision – board local update – users Ontology engineering in SEKT

Diligent Wiki: 

Diligent Wiki Ontology engineering in SEKT

More applications: 

More applications Portals building on content management Knowledge discovery Business intelligence Inter-enterprise cooperation overcoming heterogeneity Semantic desktop Communication Collaboration Semantic Grid applications

Knowledge discovery: 

Knowledge discovery Extracting information from heterogeneous sources knowing your customer national security e.g. Semagix Sentiment analysis IBM’s WebFountain™ Intelliseek applications

Business intelligence: 

Business intelligence Text-driven business intelligence e.g. ClearForest Identifying trends and patterns Merging with structured data from databases applications

The semantic desktop: 

The semantic desktop Personal information management Desktop data as web resources Interoperable applications through common (RDF-based) data standards Items are first class objects Gnowsis – Haystack - Fenfire - applications

Extensible and interoperable: 

Extensible and interoperable Ontology and knowledgebase OWL text mining reasoning, ontology management and evolution context mapping app3, e.g. diary app1, e.g. diary app2, e.g. idea management applications

Keeping the context: 

Keeping the context When a file is emailed context is lost creation, classification … and more is lost when the received file is stored sender, email thread Use to create metadata to enhance, e.g. search applications
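One way to keep this context is to record it as metadata at the moment an attachment is saved. The sketch below uses Python's standard `email` package to copy the sender, subject and thread reference onto each attachment record. The record fields, the sample message and the use of `In-Reply-To` as a thread identifier are assumptions made for illustration.

```python
from email.message import EmailMessage


def header_text(msg, name):
    """Return a header as a plain string, or None when it is absent."""
    value = msg[name]
    return str(value) if value is not None else None


def attachment_metadata(msg):
    """Return one metadata record per attachment, carrying the email's context."""
    context = {
        "sender": header_text(msg, "From"),
        "subject": header_text(msg, "Subject"),
        # Use the parent message as the thread key, else this message's own id.
        "thread": header_text(msg, "In-Reply-To") or header_text(msg, "Message-ID"),
    }
    return [
        {"filename": part.get_filename(), **context}
        for part in msg.iter_attachments()
    ]


# Hypothetical message with one attachment.
msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["Subject"] = "Q3 report"
msg["Message-ID"] = "<m2@example.org>"
msg["In-Reply-To"] = "<m1@example.org>"
msg.set_content("Report attached.")
msg.add_attachment(b"quarterly figures", maintype="application",
                   subtype="octet-stream", filename="report.txt")

records = attachment_metadata(msg)
```

A desktop search tool could then index these records alongside the stored file, so that a later query by sender or thread still finds it.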


Communication applications Using information extraction to detect linkages between personal databases onto intranet or Web


Collaboration applications Plus using semantics: to find the right partners, e.g. in project set-up to create the right context for a conference agenda, minutes, documents

Semantic Grid: 

Semantic Grid applications Source: Definitions “flexible, secure coordinated resource sharing” (David de Roure) see also Wikipedia:

Grid services and resources: 

Grid services and resources Semantic description for, e.g.: resource discovery matchmaking negotiation composition monitoring Must be stateful – compare current web services applications – the semantic grid

Semantic grid - challenges: 

Semantic grid - challenges Automated virtual organisations their formation and management Service negotiation and contracts Security, trust and provenance Self organisation David de Roure University of Southampton applications – the semantic grid


State-of-the-art Text mining well developed Semagix, Intelliseek, ClearForest point solutions Standardisation currently mostly at XML level Little use yet of: context OWL reasoning applications


Challenges What do users really want? how not to overwhelm them? alerts, hyperlinks … Differentiate between users? novice, sophisticate varying at different times What kind of user interfaces? to make use of all the metadata

Bibliography - 1: 

Bibliography - 1
The semantic desktop
Sauermann, L., The Gnowsis Semantic Desktop for Information Integration, at the IOA Workshop of the ISWC2005 Conference
Decker, S., Frank, M., The Networked Semantic Desktop, in WWW2004 Workshop: Application Design, Development and Implementation Issues in the Semantic Web
Chirita, P.A. et al., Activity Based Metadata for Semantic Desktop Search, in The Semantic Web: Research and Applications, Springer, May/June 2005, pp. 439-454
The semantic grid
De Roure, D., Jennings, N., Shadbolt, N., The Semantic Grid: Past, Present and Future, in Proceedings of the IEEE, Vol. 93, No. 3, March 2005, pp. 669-681

Bibliography - 2: 

Bibliography - 2
Semantic annotation
Kiryakov, A., et al., Semantic Annotation, Indexing and Retrieval, Journal of Web Semantics, Vol. 2, December 2004, pp. 49-79
Information integration
Bouquet, P., Serafini, L., Zanobini, S., Semantic Coordination: A new approach and an application, in Proceedings of ISWC 2003
Giunchiglia, F., Shvaiko, P., Semantic Matching, in The Knowledge Engineering Review, 18(3):265-280, 2004

Bibliography - 3: 

Bibliography - 3
Ontology engineering
Noy, N., Representing Classes as Property Values on the Semantic Web, W3C Working Group, April 2005
Tempich, C., Pinto, S., Sure, Y., Staab, S., An Argumentation Ontology for DIstributed, Loosely-controlled and evolvInG Engineering of oNTologies (DILIGENT), ESWC2005, pp. 241-256
Legal case study
Benjamins, R., The Semantic Web: Legal Application, iSOCO, May 2005

Bibliography - 4: 

Bibliography - 4
General
Introducing Semantic Technologies and the Vision of the Semantic Web, Semantic Interoperability Community of Practice (US)
Evaluation and Market Report (WonderWeb project), Top Quadrant
