
GGFpart1

Added: November 20, 2007 This presentation is Public
Presentation Transcript

Ontologies and the Grid : Ontologies and the Grid
Professor Carole Goble, University of Manchester, UK, carole@cs.man.ac.uk
Professor Nigel Shadbolt, University of Southampton, UK, nrs@ecs.soton.ac.uk


Acknowledgements : Acknowledgements
We are grateful for material from the following collaborators:
- Alexander Maedche, Steffen Staab, University of Karlsruhe *
- Natasha Noy Friedman, Deborah McGuinness, Stanford
- Robert Meersman, Vrije University of Brussels
- Mike Uschold, Boeing Corp.
- Dieter Fensel, Vrije University of Amsterdam
- Terry Payne, Katia Sycara, CMU
- Asun Gomez-Perez, University of Madrid
- Judith Blake, The Jackson Laboratory
- The AKT team, University of Southampton
- Clive Embrey, Paul Smart, Epistemics Ltd
- Bertram Ludäscher, San Diego Super Computing Centre
- Alan Rector, Ian Horrocks, Chris Wroe, Angus Roberts, Sean Bechhofer, Norman Paton, Jeremy Rogers, University of Manchester
- The myGrid team
* Especially grateful to Alex and Steffen.


Roadmap : Roadmap
- What is an ontology?
- How should I represent them?
- What are they used for?
- Do I have any?
- How do I get one?
  - Methodologies & Communities
  - Ontology lifecycle management
- What are the issues?
- Where do I go for more information?
Tools threaded throughout


Part I What is an Ontology? : Part I What is an Ontology?
- Definitions
- Examples
- Issues


Ontology : Ontology
- Semantics: the meaning of meaning.
- Philosophical discipline: the branch of philosophy that deals with the nature and the organisation of reality.
- Science of Being (Aristotle, Metaphysics, IV, 1)
  - What is being?
  - What are the features common to all beings?


Slide6 : The art of ranking things in genera and species is of no small importance and very much assists our judgment as well as our memory. You know how much it matters in botany, not to mention animals and other substances, or again moral and notional entities as some call them. Order largely depends on it, and many good authors write in such a way that their whole account could be divided and subdivided according to a procedure related to genera and species. This helps one not merely to retain things, but also to find them. And those who have laid out all sorts of notions under certain headings or categories have done something very useful. Gottfried Wilhelm Leibniz, New Essays on Human Understanding


In computer science … : In computer science …
- An ontology is an explicit specification of a conceptualization [Gruber93]
- An ontology is a shared understanding of some domain of interest. [Uschold, Gruninger96]
- There are many definitions:
  - a formal specification (EXECUTABLE)
  - of a conceptualization of a domain (COMMUNITY)
  - of some part of the world that is of interest (APPLICATION)
- Defines:
  - A common vocabulary of terms
  - Some specification of the meaning of the terms
  - A shared understanding for people and machines


Why develop an ontology? : Why develop an ontology?
- To make domain assumptions explicit
  - Easier to change domain assumptions
  - Easier to understand and update legacy data
- To separate domain knowledge from operational knowledge
  - Re-use domain and operational knowledge separately
- A community reference for applications
- To share a consistent understanding of what information means.


Ontologies: made for sharing : Ontologies: made for sharing
- Interoperating resources, be it by people or systems, requires a consistent shared understanding of what the information contained means.
- “... people [and machines] can’t share knowledge if they don’t speak a common language” [Davenport]
- Disparate modeling paradigms, languages and software tools limit:
  => Interoperability
  => Knowledge sharing & reuse


Sharing info  Sharing meaning : Sharing info  Sharing meaning
- Metadata
  - Data describing the content and meaning of resources and services.
  - But everyone must speak the same language…
- Terminologies
  - Shared and common vocabularies
  - For search engines, agents, curators, authors and users
  - But everyone must mean the same thing…
- Ontologies
  - Shared and common understanding of a domain
  - Essential for search, exchange and discovery


Origin and History : Origin and History
- Humans require words (or at least symbols) to communicate efficiently.
- The mapping of words to things is only indirectly possible. We do it by creating concepts that refer to things.
- The relation between symbols and things has been described in the form of the meaning triangle.


Human and machine communication : Human and machine communication
[Figure: Human Agent 1 and Human Agent 2 (HA1, HA2) exchange a symbol, e.g. ‘‘JAGUAR“, via natural language and interpret it through internal models. Machine Agent 1 and Machine Agent 2 (MA1, MA2) exchange symbols via protocols and interpret them through formal models. All agents commit to an ontology description with formal semantics for a specific domain, e.g. animals. The meaning triangle relates Symbol, Concept and Things.] [Maedche et al., 2002]


Human and machine communication : Human and machine communication


An explicit description of a domain : An explicit description of a domain
- Concepts (class, set, type, predicate)
  - event, gene, gammaBurst, atrium, molecule, cat
- Properties of concepts and relationships between them (slot)
  - Taxonomy: generalisation ordering among concepts
    - isA, partOf, subProcess
  - Relationship, Role or Attribute:
    - functionOf, hasActivity, location, eats, size


Concepts : Concepts
- Primitive concepts: properties are necessary
  - A globular protein must have a hydrophobic core, but a protein with a hydrophobic core need not be a globular protein.
- Defined concepts: properties are necessary + sufficient
  - Eukaryotic cells must have a nucleus. Every cell that contains a nucleus must be eukaryotic.
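The difference between primitive and defined concepts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Thing` class, the feature strings and the function names are hypothetical and do not come from the slides or from any ontology language.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thing:
    """Hypothetical individual: asserted concept names plus observed features."""
    name: str
    asserted_types: frozenset = frozenset()
    features: frozenset = frozenset()

def is_globular_protein(thing):
    # Primitive concept: membership has to be asserted. The hydrophobic core
    # is only a necessary condition, so it can never be used to recognise one.
    return "GlobularProtein" in thing.asserted_types

def violates_globular_condition(thing):
    # A necessary condition can still be checked on asserted members.
    return is_globular_protein(thing) and "hydrophobic_core" not in thing.features

def is_eukaryotic_cell(thing):
    # Defined concept: the conditions are necessary and sufficient, so
    # membership can be inferred from them alone.
    return "Cell" in thing.asserted_types and "nucleus" in thing.features

# Usage: a protein with a hydrophobic core is not thereby globular,
# but a cell with a nucleus is recognised as eukaryotic.
membrane_protein = Thing("p1", frozenset({"Protein"}), frozenset({"hydrophobic_core"}))
yeast_cell = Thing("c1", frozenset({"Cell"}), frozenset({"nucleus"}))
print(is_globular_protein(membrane_protein), is_eukaryotic_cell(yeast_cell))
```

A reasoner over a description logic does this classification automatically; the sketch only shows why sufficiency is what makes automatic recognition possible.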


What is a concept? : What is a concept?
Different communities have different notions of what a concept means:
- Formal concept analysis (see http://www.math.tu-dresden.de/~ganter/fba.html) talks about formal concepts
- Description Logics (see http://dl.kr.org/) talk about concept labels
- ISO-704:2000 – Terminology Work (see http://www.iso.ch/)
Often the classical notion of a frame in AI or a class in OO modeling is seen as equivalent to a concept.


An explicit description of a domain : An explicit description of a domain
- Constraints or axioms on properties and concepts:
  - value: integer
  - domain: cat
  - cardinality: at most 1
  - range: 0 <= X <= 100
  - oligonucleotides < 20 base pairs
  - cows are larger than dogs
  - cats cannot eat only vegetation
  - cats and dogs are disjoint
- Values or concrete domains
  - integer, strings
  - 20, tryptophan-synthetase
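As a rough illustration, the property constraints listed above (value type, cardinality, range, disjointness) can be written as simple checks. The slide does not name the property being constrained, so the `size` property and both function names here are assumptions made for the example.

```python
def check_size(values):
    """Check a hypothetical 'size' property of a cat against the slide's constraints.

    values: list of values asserted for the property on one individual.
    Returns a list of human-readable violations (empty if all constraints hold).
    """
    problems = []
    if len(values) > 1:                      # cardinality: at most 1
        problems.append("cardinality: at most 1 value allowed")
    for v in values:
        if not isinstance(v, int):           # value: integer
            problems.append(f"value: {v!r} is not an integer")
        elif not 0 <= v <= 100:              # range: 0 <= X <= 100
            problems.append(f"range: {v} is outside 0..100")
    return problems

def check_disjoint(types, disjoint_pairs=(("cat", "dog"),)):
    """Report every disjoint pair of concepts that one individual belongs to."""
    return [f"disjointness: {a} and {b}"
            for a, b in disjoint_pairs
            if a in types and b in types]

# Usage
print(check_size([42]))
print(check_size([150, 3]))
print(check_disjoint({"cat", "dog"}))
```

Axioms such as "cows are larger than dogs" relate two concepts rather than one property, and would need a richer representation than these per-value checks.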


An explicit description of a domain : An explicit description of a domain
- Individuals or Instances
  - sulphur, trpA Gene, felix
- Nominals
  - Concepts that cannot have instances
  - Instances that are used in conceptual definitions
  - ItalianDog = Dog bornIn Italy
- An ontology = concepts + properties + axioms + values + nominals
- A knowledge base = ontology + instances
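The two equations on this slide, and the ItalianDog example of a nominal used inside a definition, might be pictured as plain data. This is a sketch only: the dictionary layout and the individuals `rex` and `fido` are invented for illustration.

```python
# Ontology: concepts + properties + nominals (axioms and values omitted for brevity).
ontology = {
    "concepts": {"Dog", "Country", "ItalianDog"},
    "properties": {"bornIn"},
    "nominals": {"Italy"},   # an individual that appears inside a definition
}

# Instances are not part of the ontology; adding them gives a knowledge base.
instances = {
    "rex": {"types": {"Dog"}, "bornIn": "Italy"},     # hypothetical individual
    "fido": {"types": {"Dog"}, "bornIn": "France"},   # hypothetical individual
}

knowledge_base = {"ontology": ontology, "instances": instances}

def is_italian_dog(name, kb):
    """ItalianDog = Dog bornIn Italy, evaluated against the knowledge base."""
    individual = kb["instances"][name]
    return "Dog" in individual["types"] and individual.get("bornIn") == "Italy"

# Usage
print([n for n in sorted(instances) if is_italian_dog(n, knowledge_base)])
```

The point of the split is that the same ontology can be shared by many knowledge bases, each with its own instances.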


Light and Heavy expressivity : Light and Heavy expressivity
A matter of rigour and representational expressivity
- Lightweight
  - Concepts, atomic types
  - Is-a hierarchy
  - Relationships between concepts
- Heavyweight
  - Metaclasses
  - Type constraints on relations
  - Cardinality constraints
  - Taxonomy of relations
  - Reified statements
  - Axioms
  - Semantic entailments
  - Expressiveness
  - Inference systems


So what is an ontology? : So what is an ontology?
[Figure: a spectrum of increasing expressivity: Catalog/ID; Terms/glossary; Thesauri; Informal Is-a; Formal Is-a; Formal instance; Frames (properties); Value restrictions; Disjointness, Inverse, partof; General Logical constraints. Examples placed along the spectrum: Gene Ontology, Mouse Anatomy, EcoCyc, PharmGKB, TAMBIS, Arom.] [Deborah McGuinness, Stanford]


A semantic continuum : A semantic continuum [Mike Uschold, Boeing Corp]
- Implicit: shared human consensus
- Informal (explicit): text descriptions
- Formal (for humans): semantics hardwired; used at runtime
- Formal (for machines): semantics processed and used at runtime
Examples: Pump: “a device for moving a gas or liquid from one place or container to another”; (pump has (superclasses (…))
Further to the right means:
- Less ambiguity
- More likely to have correct functionality
- Better inter-operation
- Less hardwiring
- More robust to change
- More difficult


EcoCyc : EcoCyc


Gene Ontology http://www.geneontology.org : Gene Ontology http://www.geneontology.org
- “a dynamic controlled vocabulary that can be applied to all eukaryotes”
- Built by the community for the community.
- Three organising principles: Molecular function, Biological process, Cellular component
- Isa and Part of taxonomy – but not good!
- ~10,000 concepts
- Lightweight ontology, poor semantic rigour.
  - OK when small and used for annotation.
  - Obstacle when large, evolving and used for mining.


Controlled vocabulary : Controlled vocabulary AGROVOC: Agricultural Vocabulary


Thesauri : Thesauri AAT: Art & Architecture Thesaurus


Thesauri & Classification : Thesauri & Classification
- UNSPSC: Product Classification
- A comprehensive list is provided at http://www.lub.lu.se/metadata/subject-help.html
- Thesauri act as a good starting point for developing an ontology


UMLS (Unified Medical Language System) http://umlsks.nlm.nih.gov/ : UMLS (Unified Medical Language System) http://umlsks.nlm.nih.gov/
- National Library of Medicine (NLM) database of medical terminology.
- Terms from several medical databases (MEDLINE, SNOMED International, Read Codes, etc.) are unified so that different terms are identified as the same medical concept.
- Metathesaurus provides the concordance of medical concepts: 730,000 concepts, 1.5 million concept names in different source vocabularies.
- Specialist lexicon provides word synonyms, derivations, lexical variants, and grammatical forms of words used in Metathesaurus terms: 130,000 entries.
- Semantic Network codifies the relationships (e.g. causality, "is a", etc.) among medical terms: 134 semantic types, 54 relationships.


KA2 Ontology : KA2 Ontology
- Ontology that models the knowledge acquisition community (its researchers, topics, products, etc.)
- Small, application-specific ontology:
  - 73 concepts
  - 124 relations
  - 50 rules
- Available at: http://www.aifb.uni-karlsruhe.de/WBS/broker/ka-onto.onto
- Application: Semantic Community Web Portals: http://ka2portal.aifb.uni-karlsruhe.de
- Successor ontology: SWRC/OntoWeb community ontology
[Staab et al., 00] [Decker et al, 98]


The KA ontology : The KA ontology


Web-KB project at CMU : Web-KB project at CMU http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/ [Craven et al, 98]


Meta-Ontologies : Meta-Ontologies
Meta-ontologies describe metadata about ontologies and their associated elements. Examples:
- Interoperability issues between two ontologies, e.g. Semantic Translation Ontology or RDFT Ontology
- Capturing changes to support ontology evolution, using an evolution ontology


Taxonomy remark 1 : Taxonomy remark 1
The world is not a tree, it’s a lattice.
[Figure: a lattice over the concepts animal, rodent, cow, cat, mouse, dog, domestic, vermin, wild, pet, working]


Taxonomy remark 2 : Taxonomy remark 2 What does the taxonomy mean? Concept A is a parent of concept B iff every instance of B is also an instance of A Superset/subset ICONCLASS
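Both taxonomy remarks can be sketched together: a lattice lets a concept have several parents, and "A is a parent of B" means the instances of B are a subset of the instances of A. The transcript does not preserve the edges of the slide's lattice, so the `PARENTS` edges and the individuals below are assumptions chosen for illustration.

```python
# Hypothetical parent edges; a concept may have more than one parent (lattice, not tree).
PARENTS = {
    "rodent": {"animal"},
    "mouse": {"rodent", "vermin"},
    "cat": {"animal", "pet"},
    "dog": {"animal", "pet", "working"},
    "pet": {"domestic"},
    "working": {"domestic"},
    "vermin": {"wild"},
}

def ancestors(concept):
    """All concepts reachable by following parent edges upwards."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def subsumes(a, b):
    """True if concept a is b itself or an ancestor of b."""
    return a == b or a in ancestors(b)

def extension(concept, asserted):
    """Individuals that are instances of concept, given individual -> most specific concept."""
    return {ind for ind, c in asserted.items() if subsumes(concept, c)}

# Usage: every instance of the child is an instance of the parent.
asserted = {"felix": "cat", "jerry": "mouse", "rex": "dog"}
print(extension("cat", asserted) <= extension("pet", asserted))
```

Storing only the most specific concept per individual and deriving the rest is what makes the superset/subset reading hold by construction.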


Classification trickiness : Classification trickiness "On those remote pages it is written that animals are divided into: a. those that belong to the Emperor b. embalmed ones c. those that are trained d. suckling pigs e. mermaids f. fabulous ones g. stray dogs h. those that are included in this classification i. those that tremble as if they were mad j. innumerable ones k. those drawn with a very fine camel's hair brush l. others m. those that have just broken a flower vase n. those that resemble flies from a distance" The Celestial Emporium of Benevolent Knowledge, Borges


Classification is task and culture specific : Classification is task and culture specific
Dyirbal classification of objects in the universe:
- Bayi: men, kangaroos, possums, bats, most snakes, most fishes, some birds, most insects, the moon, storms, rainbows, boomerangs, some spears, etc.
- Balan: women, anything connected with water or fire, bandicoots, dogs, platypus, echidna, some snakes, some fishes, most birds, fireflies, scorpions, crickets, the stars, shields, some spears, some trees, etc.
- Balam: all edible fruit and the plants that bear them, tubers, ferns, honey, cigarettes, wine, cake.
- Bala: parts of the body, meat, bees, wind, yamsticks, some spears, most trees, grass, mud, stones, noises, language, etc.


Ontology desiderata : Ontology desiderata
- Precision: formal, unambiguous, high fidelity
- Systematic: control, quality, clarity
- Explicitness: clarity, commitment, reuse
- Flexibility: expressivity, evolution


Ontology description space : Ontology description space
- Coverage: upper, domain general, domain specific
- Expressivity: taxonomy, relationships, axioms
- Knowledge representation languages and models: words, OO, frames, logics
- Inference mechanisms: classification, coherency


Coverage : Coverage [Guarino, 98]
- Top-level (upper) ontologies describe very general concepts like space, time, event, which are independent of a particular problem or domain. It seems reasonable to have unified top-level ontologies for large communities of users.
- Domain ontologies describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology.
- Task & problem-solving ontologies describe the vocabulary related to a generic task or activity by specializing the top-level ontologies.
- Application ontologies are the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.
- Grid service ontology


Specific ontologies : Specific ontologies
- Domain-oriented
  - Domain-specific: Medicine => cardiology => rhythm disorders; E. coli
  - Domain generalizations: components, organs, documents, gene function
- Task-oriented
  - Task specific: configuration design, instruction, planning, annotation analysis
  - Task generalisations: problem solving methods, e.g. UPML http://www.ibrow.org/


Upper Ontologies : Upper Ontologies
- Top Level ontologies: WordNet, EuroWordNet, CyC, SENSUS, Sowa Top Level, GUM, etc.
- A.k.a. core, generic or reference
- Common high level concepts: “Physical”, “Abstract”, “Structure”, “Substance”
- Useful for ontology re-use
- Important when generating or analysing natural language expressions


Example upper ontologies : Example upper ontologies Sowa’s upper ontology http://www.bestweb.net/~sowa/ontology


Example upper ontologies : Example upper ontologies Generalised Upper Model 2.0 http://www.darmstadt.gmd.de/publish/komet/gen-um/newUM.html


WordNet (Miller et al.) : WordNet (Miller et al.) http://www.cogsci.princeton.edu/~wn/


WordNet : WordNet


Problems with current lexicons : Problems with current lexicons
- In WordNet:
  - clear that news_item is-a item
  - Maybe acceptable that news_item is-a part
  - But what of news_item is-a relation !?
  - depends on context, role played…
- But: “role” and “context” knowledge is missing
- Also: some lexicographer’s bias is present


CYC (Lenat & Guha) : CYC (Lenat & Guha) © CYCORP, Inc. http://www.cyc.com/


DAML-S http://www.daml.org : DAML-S http://www.daml.org US DARPA Agent Markup Language – Services An upper ontology for Services


Multi-Classification & Multi-Perspective : Multi-Classification & Multi-Perspective
- Phrase-based classification
- ID = GO:0005469 (decommissioned concept): succinate (cytosol) to fumarate (mitochondrial) transporter
- It is a kind of transporter, but it should also be classified on the basis of its…
  - location in the mitochondrial membrane
  - orientation of the transporter
  - molecules transported
  - relationships to biological processes, e.g. metabolism
- Need to express these things and get the multi-axial classification sorted


Pre-enumeration vs Post-coordination : Pre-enumeration vs Post-coordination
- Pre-enumeration: an attempt to identify and organise all the concepts pre-hoc
  - Enumerating the noun phrases of the English language
  - Thesauri, object models
- Post-coordination: controlled combination of terms when needed
  - A vocabulary and a grammar.


The International Statistical Classification of Diseases and Related Health Problems, 10th revision : The International Statistical Classification of Diseases and Related Health Problems, 10th revision


The exploding bicycle : The exploding bicycle
- ICD-9 (E826): 8
- READ-2 (T30..): 81
- READ-3: 87
- ICD-10 (V10-19): 587
Examples:
- V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income
- W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
- X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities


Defusing the exploding bicycle: 500 codes in pieces : Defusing the exploding bicycle: 500 codes in pieces
- 10 things to hit: Pedestrian / cycle / motorbike / car / HGV / train / unpowered vehicle / a tree / other
- 5 roles for the injured: Driving / passenger / cyclist / getting in / other
- 5 activities when injured: resting / at work / sporting / at leisure / other
- 2 contexts: In traffic / not in traffic
Example: V12.24 Pedal cyclist injured in collision with two- or three-wheeled motor vehicle, unspecified pedal cyclist, nontraffic accident, while resting, sleeping, eating or engaging in other vital activities
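The arithmetic behind "500 codes in pieces" can be sketched as a post-coordinated vocabulary: a few short axes whose values are combined on demand instead of being pre-enumerated. The axis names and the `compose` function are assumptions for illustration. The transcript names only nine of the "10 things to hit", so this sketch yields 9 x 5 x 5 x 2 = 450 combinations rather than the slide's 10 x 5 x 5 x 2 = 500.

```python
from itertools import product

# Axis values as listed in the transcript (axis names are hypothetical).
AXES = {
    "hit": ["pedestrian", "cycle", "motorbike", "car", "HGV", "train",
            "unpowered vehicle", "tree", "other"],
    "role": ["driving", "passenger", "cyclist", "getting in", "other"],
    "activity": ["resting", "at work", "sporting", "at leisure", "other"],
    "context": ["in traffic", "not in traffic"],
}

def compose(hit, role, activity, context):
    """Build one accident description on demand, validating each axis value."""
    chosen = dict(zip(AXES, (hit, role, activity, context)))
    for axis, value in chosen.items():
        if value not in AXES[axis]:
            raise ValueError(f"{value!r} is not a valid value for axis {axis!r}")
    return chosen

# Pre-enumeration has to list every combination up front...
all_combinations = list(product(*AXES.values()))
# ...whereas post-coordination only maintains the terms on each axis.
vocabulary_size = sum(len(values) for values in AXES.values())

print(len(all_combinations), vocabulary_size)
print(compose("motorbike", "cyclist", "resting", "not in traffic"))
```

Adding one new value to an axis grows the maintained vocabulary by one term, while the number of expressible combinations grows by the product of the other axes.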


Goodbye to picking lists… : Goodbye to picking lists…
[Figure: a Cycling Accident described along four axes: What you hit, Your Role, Activity, Location]


Coordination: Conceptual Lego : Coordination: Conceptual Lego acute chronic ischaemic deletion bacterial polymorphism cell protein gene expression


Conceptual Lego : Conceptual Lego “SNPolymorphism of CFTRGene causing Defect in MembraneTransport of ChlorideIon causing Increase in Viscosity of Mucus in CysticFibrosis…” “Hand which is anatomically normal”


FAQ : FAQ
- What's the difference between a database schema and an ontology?
- Is there only one ontology?
- Is development one-off?
- Do I need to first get the ontology right before I use it?
- How do I represent an ontology?


Current ontology standardization initiatives : Current ontology standardization initiatives
- SUO (SUO consortium proposal) http://suo.ieee.org/
- Global WordNet Consortium
- ISO SC4
- eCommerce standards (UCEC, ebXML,…)
- Cultural repositories standards (Harmony, CIDOC)
- CEN/ISSS EC WG (MULECO)
- DAML (especially DAML-S) http://www.daml.org/
- W3C Web Ontology Working Group http://www.w3.org/2001/sw/WebOnt/
- Projects
  - OntoWeb http://www.ontoweb.org/
  - WonderWeb http://wonderweb.semanticweb.org/


Further Reading : Further Reading