MESMUSES methodology

MESMUSES methodology: 

Lessons learned and open issues…
Alain Michard
Florence, June 2003

MESMUSES broad vision: 

- Just like several other projects
- The Semantic Web (SW) is all about semantic interoperability
- Sharing machine-readable terminologies and classification schemes
- Science and culture are collective and international
- Semantic Web methodology should be highly relevant for managing and sharing scientific and cultural information

Some key S&T issues in the Project: 

- Model: is RDFS / OWL-Lite adequate?
- Schema authoring: method and tools needed!
- Metadata: where does it come from?
- Automatic indexing: experiments with a categorizer

The basic SW model: 

Type: printed text, monograph
Author(s): Zola, Émile (1840-1902)
Title(s): L'assommoir [printed text] / by Emile Zola
Edition: 50th ed.
Publication: Paris: G. Charpentier, 1878
Physical description: 111-569 p.
Record no.: FRBNF35963044

Real-world entities
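As an illustration of how such a catalogue record could be expressed in a subject-predicate-object model, here is a minimal sketch using plain Python tuples. The `http://example.org/` URIs, the identifier `zola-emile`, and the property names `name`, `birthYear` and `deathYear` are hypothetical and do not come from the slides. The choice of Dublin Core element names is also an assumption, since the slides do not say which vocabulary the project used.

```python
# A sketch only: the URI scheme and the property choices below are assumptions.
EX = "http://example.org/"                 # hypothetical namespace
DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core element set namespace

record = EX + "notice/FRBNF35963044"       # the catalogue record
zola = EX + "person/zola-emile"            # a real-world entity (hypothetical URI)

# Each statement is a (subject, predicate, object) triple.
triples = [
    (record, DC + "type", "printed text, monograph"),
    (record, DC + "title", "L'assommoir"),
    (record, DC + "creator", zola),
    (record, DC + "publisher", "G. Charpentier"),
    (record, DC + "date", "1878"),
    (zola, EX + "name", "Zola, Émile"),
    (zola, EX + "birthYear", "1840"),
    (zola, EX + "deathYear", "1902"),
]

def objects(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects(record, DC + "creator"))
```

Modelling the author as a resource of its own, rather than as a string on the record, lets several records point at the same real-world entity.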

Model and Schema Language: 

- Typed attributes are needed
  - XML-Schema types
  - Derived types (e.g. Celsius temperature, Gregorian date, etc.)
  - Enumerated types, thesauri
- Time-stamping
- Cardinality constraints
- Explicit transitivity of properties (e.g. geographic inclusion)
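To make the transitivity point concrete, the sketch below shows what declaring a property such as geographic inclusion transitive would let a system infer. It is a naive fixed-point computation for illustration, not the project's implementation. The property name `located_in` and the place names are hypothetical examples.

```python
def transitive_closure(pairs):
    """Return all (a, c) pairs implied by treating the relation as transitive."""
    closure = set(pairs)
    changed = True
    while changed:                       # repeat until no new pair is derived
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical assertions of a "located in" property.
located_in = {
    ("Florence", "Tuscany"),
    ("Tuscany", "Italy"),
    ("Italy", "Europe"),
}

inferred = transitive_closure(located_in)
print(("Florence", "Europe") in inferred)
```

Without an explicit transitivity declaration in the schema language, every application would have to hard-code this kind of rule for each property it concerns.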

Schema authoring issues (1): 

- Find the right level of abstraction
  - Is "Glucid" a class or an instance?
  - Or is it sometimes a class and sometimes an instance?
- Avoid the "KR" attitude and practices!
  - It's all about indexing resources with shared terminologies, not about representing human knowledge!

Schema authoring issues (2): 


Schema authoring issues (3): 


Schema authoring issues (4): 

- Authoring tools are badly needed
  - Graphical representation of the schema
  - Zooming on sub-graphs (hierarchies)
  - Versioning
  - Consider using a UML authoring environment?
- Established methodology and tutorials are needed

Creating Surrogates: 

- Data extraction and fusion from structured sources
  - R-DB, XML-DB, LDAP
- Updating
  - When?
  - Should not create duplicates!
- Detect cross-references
  - Authority lists
  - Thesauri
  - Lexical distance ???
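The slide leaves lexical distance as an open question. One common candidate is the Levenshtein edit distance, sketched below as a way to match an incoming name against an authority list before creating a new surrogate. The helper `match_authority`, the threshold `max_distance=2`, and the sample names are assumptions for illustration, not something the slides prescribe.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions from a to b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def match_authority(name, authority_list, max_distance=2):
    """Return the closest authority entry within max_distance, else None."""
    if not authority_list:
        return None
    best = min(authority_list,
               key=lambda entry: levenshtein(name.lower(), entry.lower()))
    if levenshtein(name.lower(), best.lower()) <= max_distance:
        return best
    return None

authorities = ["Zola, Émile", "Hugo, Victor"]   # hypothetical authority list
print(match_authority("Zola, Emile", authorities))
```

A match means the incoming record should be linked to the existing entry instead of creating a duplicate. A result of `None` suggests a new entry, possibly after human review. The threshold trades missed matches against false merges and would need tuning on real data.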

Automatic Categorization: 

- Automatic indexing
  - By extracting metadata from resources
  - By automatic categorization
- Define hierarchies of "concepts" inside the schema
- Seeding with representative documents
- Machine learning to create categorizers
- Pros: enriched search functionality
- Cons: hierarchies of categories are static
  - Adding a category may change the categorizers of the others
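The seeding idea can be sketched with a deliberately simple stand-in: a bag-of-words nearest-centroid classifier. This is not the categorizer the project experimented with, which the slides do not name. The category names and seed documents are invented for illustration.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term counts (whitespace tokenization, lower-cased)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def train(seeds):
    """Build one summed term vector per category from its seed documents."""
    centroids = {}
    for category, docs in seeds.items():
        total = Counter()
        for doc in docs:
            total.update(vectorize(doc))
        centroids[category] = total
    return centroids

def categorize(text, centroids):
    """Assign the category whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))

# Hypothetical seed documents for two categories.
seeds = {
    "chemistry": ["glucid molecule structure", "molecule reaction energy"],
    "painting": ["oil painting canvas", "fresco painting florence"],
}
doc = "energy metabolism of a glucid"
before = categorize(doc, train(seeds))

# Adding a category later changes how the same document is classified.
seeds["biology"] = ["glucid metabolism energy cell", "cell metabolism enzyme"]
after = categorize(doc, train(seeds))
print(before, after)
```

In this simplified model only the assignments shift when a category is added. With categorizers trained to discriminate each category from the others, the models for existing categories would typically have to be retrained as well, which is the drawback the slide points to.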

Bottom-line…: 

- RDFS schema authoring may be more difficult than E-R modelling
- Debates on syntactic features are irrelevant
- Should be grounded on real-world implementations and testbeds
- A new query language (e.g. RQL) is not high priority
- We have not addressed the "logical rules" layer
- Semantic Web vs. Community Webs