Metadata and Handling of Heterogeneity as Central Means for the Development of a European School Portal - The Project European Schools Treasury Browser – ETB
Presentation at the 7th Annual Meeting of the IuK Initiative, Trier, 11.-14.03.2001
Michael Kluck
Humboldt University Berlin, Computer Uses in Education (HUB)
Social Sciences Information Centre Bonn (IZ)

Introduction (I): 

The ETB project is embedded in the context of the European Schoolnet (EUN), www.eun.org.
The European Schoolnet is the new framework for co-operation between the European Ministries of Education on Information and Communication Technology in education.
EUN builds a European network of national and regional computer networks of repositories for schools.



Introduction (II): 

ETB works out the technological and structural prerequisites for this network of networks.
Building on a preceding project, ETB will implement the technical infrastructure and the content-based integration of the different services and of their cultural and linguistic contexts.
This presentation concentrates on the content integration of the participating networks and repositories.
The main user groups will be teachers and pupils.

Developing a Common Metadata Set: 

Context and general purpose:
  Get similarly structured information
  Facilitate targeted search
  Avoid a mismatch between the specific search and the unstructured universe of the Internet:
    Topic versus person (e.g. Ohm, Kierkegaard)
    Different domain-specific meanings (e.g. German Leistung: performance, achievement, or physical power; Disziplin: field of study or orderly conduct)
    Domain-specific meaning versus general meaning (e.g. German Lehre: teaching, apprenticeship, or doctrine; services)

ETB Metadata: 

Derived from the Dublin Core metadata elements and the EUN Metadata Element Set (developed in the preceding EUN project)
Deliberately minimal, but with obligation types:
  M = mandatory
  O = optional
Using RDF syntax
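As an illustration of Dublin Core elements in RDF syntax, the following minimal sketch serialises one record as RDF/XML with the Python standard library. The two namespace URIs are the standard RDF and Dublin Core 1.1 ones; the resource URL, the field values and the function name `record_to_rdf` are hypothetical. The ETB-specific elements (Audience, EUN User Level) are left out because the slides do not give a namespace for them.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for RDF and the Dublin Core 1.1 element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Make the serialiser emit the conventional "rdf:" and "dc:" prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def record_to_rdf(resource_url, fields):
    """Serialise a flat {dc element name: value} record as RDF/XML text."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in fields.items():
        child = ET.SubElement(description, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical example record (not taken from an ETB repository).
xml_text = record_to_rdf(
    "http://example.org/lesson/42",
    {"title": "Ohm's law for beginners", "creator": "A. Teacher", "language": "en"},
)
print(xml_text)
```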

ETB Metadata Elements (I): 

Title          M
Creator        M
Subject        O or M?!
Description    M
Publisher      O
Contributor    O
Date           O
Type           O

ETB Metadata Elements (II): 

Format             O
Identifier         M
Source             O
Language           M
Relation           O
Coverage           O
Rights Management  O
Audience           O
EUN User Level     O
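The obligation types listed on these two slides lend themselves to a simple completeness check. The sketch below is one possible way to encode them; the function names are hypothetical, and Subject is treated as optional purely as an assumption, since the slides leave its obligation open.

```python
# Obligation types as listed on the slides: "M" = mandatory, "O" = optional.
# ASSUMPTION: Subject is marked "O or M?!" in the source; it is treated as
# optional here only for the sake of the example.
OBLIGATIONS = {
    "Title": "M", "Creator": "M", "Subject": "O", "Description": "M",
    "Publisher": "O", "Contributor": "O", "Date": "O", "Type": "O",
    "Format": "O", "Identifier": "M", "Source": "O", "Language": "M",
    "Relation": "O", "Coverage": "O", "Rights Management": "O",
    "Audience": "O", "EUN User Level": "O",
}


def missing_mandatory(record):
    """Return the mandatory elements that are absent or empty in a record."""
    return [
        name
        for name, obligation in OBLIGATIONS.items()
        if obligation == "M" and not str(record.get(name, "")).strip()
    ]


def unknown_elements(record):
    """Return record keys that are not part of the element set."""
    return sorted(set(record) - set(OBLIGATIONS))


# Hypothetical record with two mandatory elements missing.
record = {
    "Title": "Water cycle worksheet",
    "Creator": "M. Rossi",
    "Language": "en",
    "Keywords": "water",
}
print(missing_mandatory(record))
print(unknown_elements(record))
```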

ETB Metadata Elements (III): 

Element: Subject
In addition to freely chosen keywords:
  ETB thesaurus terms
  Sound or video clip representing the content of an audio, audiovisual, visual or multimedia resource

ETB Metadata Elements (IV): 

Element: EUN User Level
School level or age group:
  Pre-school (education)
  Primary (education)
  AdultEducation
  Secondary (education)
  Vocational (education and training)
  HigherEducation
  Juvenile (material for children and adolescents in general)
  Adult (material for adults in general)

Producing Metadata: 

Direct entry by authors (adapting given rules/definitions or using an online template)
Generation by repositories during input
Extraction from existing un-coded data by defining extraction rules
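The third route, extraction by rules, could look roughly like the sketch below. The slides do not describe the actual ETB extraction rules, so the regular expressions, the sample page and the function name `extract_metadata` are all assumptions chosen for illustration.

```python
import re

# ASSUMPTION: three illustrative extraction rules, one pattern per element.
# Real rules would depend on how each repository lays out its entries.
EXTRACTION_RULES = {
    "Title": re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL),
    "Creator": re.compile(r"^Author:\s*(.+)$", re.IGNORECASE | re.MULTILINE),
    "Date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def extract_metadata(text):
    """Apply every rule to the raw text and collect the first match of each."""
    record = {}
    for element, pattern in EXTRACTION_RULES.items():
        match = pattern.search(text)
        if match:
            # Collapse line breaks and repeated spaces inside the captured value.
            record[element] = " ".join(match.group(1).split())
    return record


# Hypothetical un-coded resource.
page = """<html><head><title>Water cycle
 worksheet</title></head>
<body>
Author: Maria Rossi
Last updated 2001-02-15
</body></html>"""

extracted = extract_metadata(page)
print(extracted)
```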

Metadata Extraction and Mapping: 

For repositories with differing metadata structures, mapping schemes into the ETB Metadata Element Set will be set up.
For repositories without metadata schemes, metadata will be extracted from the entries wherever structured elements of the resources can be detected and an algorithm for converting them into metadata fields can be applied.
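A mapping scheme of this kind can be pictured as a per-repository lookup table from native field names to ETB elements. The following is a minimal sketch under that assumption; the repository names, their field names and the function `map_to_etb` are invented for the example.

```python
# ASSUMPTION: two hypothetical repositories with their own field names.
MAPPING_SCHEMES = {
    "repo_a": {
        "titel": "Title",
        "autor": "Creator",
        "sprache": "Language",
        "url": "Identifier",
    },
    "repo_b": {
        "name": "Title",
        "author": "Creator",
        "lang": "Language",
        "uri": "Identifier",
        "abstract": "Description",
    },
}


def map_to_etb(repository, native_record):
    """Translate a native record into ETB element names.

    Returns (mapped, unmapped) so that fields without a mapping rule stay
    visible instead of being dropped silently.
    """
    scheme = MAPPING_SCHEMES[repository]
    mapped, unmapped = {}, {}
    for field_name, value in native_record.items():
        target = scheme.get(field_name)
        if target is not None:
            mapped[target] = value
        else:
            unmapped[field_name] = value
    return mapped, unmapped


mapped, unmapped = map_to_etb(
    "repo_a",
    {"titel": "Der Wasserkreislauf", "autor": "K. Meier", "sprache": "de", "klasse": "5"},
)
print(mapped)
print(unmapped)
```

Returning the unmapped fields separately is a design choice of this sketch: it makes gaps in a mapping scheme easy to spot during set-up.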

Metadata Exchange via NNTP: 


Technical Goals of ETB: 

A new approach for a European network of repositories
Network based on “Publish” not “Pull”
Added value to users from a thesaurus
Retain full local editorial policy
High quality control tools
Wider outreach
Support of multilinguality

ETB Thesaurus (I): 

Search problems:
  Natural language problems: synonymy, homonymy, polysemy, phrases, compounds, spelling variations
  Lack of relevance control
  Multilinguality

ETB Thesaurus (II): 

Thesaurus benefits:
  Effective control of the indexing language (preferred terms, inter-language equivalence)
  Systematic display of descriptors (ease of navigation through the terminology)
  Indexing and searching by using post-coordination
  Following the recommendations of Dublin Core
  Basis for solving heterogeneity
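The first benefit, control of the indexing language, can be sketched as a lookup that leads every entry term to its preferred term (descriptor). The terms and the data layout below are illustrative assumptions and are not taken from the ETB thesaurus itself.

```python
# ASSUMPTION: a toy thesaurus with one concept per entry. "preferred" holds
# the descriptor per language, "non_preferred" the entry terms (synonyms,
# spelling variants) that should lead to that descriptor.
THESAURUS = [
    {
        "preferred": {"en": "pupil", "de": "Schüler"},
        "non_preferred": {"en": ["schoolchild", "school child"], "de": ["Schulkind"]},
    },
    {
        "preferred": {"en": "timetable", "de": "Stundenplan"},
        "non_preferred": {"en": ["class schedule"]},
    },
]


def build_entry_index(thesaurus):
    """Map (language, lower-cased term) to the concept it belongs to."""
    index = {}
    for concept in thesaurus:
        for lang, term in concept["preferred"].items():
            index[(lang, term.casefold())] = concept
        for lang, terms in concept.get("non_preferred", {}).items():
            for term in terms:
                index[(lang, term.casefold())] = concept
    return index


def descriptor(term, lang, index):
    """Return the preferred term for any entry term, or None if unknown."""
    concept = index.get((lang, term.casefold()))
    return None if concept is None else concept["preferred"][lang]


entry_index = build_entry_index(THESAURUS)
print(descriptor("Schoolchild", "en", entry_index))
print(descriptor("Schulkind", "de", entry_index))
```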

ETB Thesaurus (III): 

The content of the repositories in the EUN context (multimedia material, teaching material, school projects), with schools as the target area and teachers and pupils as the main target groups, requires a specific terminology.
Only a few repositories have developed their own terminology.

Handling Heterogeneity (I): 

Making use of existing content descriptions
Dealing with heterogeneity on the content level means:
  The same words or phrases may indicate different meanings in different environments (e.g. education, or class):
    Occurring anywhere in the full text of an Internet resource
    Being the code of a classification scheme assigned to a document
    Being an indexing term taken from a specific thesaurus

Handling Heterogeneity (II): 

Use of existing intellectual work done by the different repositories or resource authors: indexing or classifying documents, even with different schemes or terminologies
Use of existing terminologies or classification schemes for automatic processing of transfer relations

Handling Heterogeneity (III): 

Methods for solving heterogeneity problems:
  Intellectual building of cross-concordances between relevant terminologies and classification schemes and between different languages, and automatic (statistical) building of transfer components
  Developing transfer components between those terminologies and schemes, and between those and the words occurring in the full texts (co-occurrence analysis, fuzzy methods, neural networks etc.)
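Of the statistical methods named here, co-occurrence analysis is the simplest to sketch. The example below assumes a small set of documents that carry terms from two vocabularies and estimates, for each source term, how often a target term appears alongside it. The corpus, the 0.5 threshold and the function name `build_transfer` are assumptions; the slides do not specify how the ETB transfer components are computed.

```python
from collections import Counter, defaultdict


def build_transfer(doubly_indexed, min_probability=0.5):
    """Estimate P(target term | source term) from doubly indexed documents.

    doubly_indexed is a list of (source_terms, target_terms) pairs, one pair
    per document. Only relations at or above min_probability are kept.
    """
    source_counts = Counter()
    pair_counts = Counter()
    for source_terms, target_terms in doubly_indexed:
        for source in set(source_terms):
            source_counts[source] += 1
            for target in set(target_terms):
                pair_counts[(source, target)] += 1

    transfer = defaultdict(dict)
    for (source, target), count in pair_counts.items():
        probability = count / source_counts[source]
        if probability >= min_probability:
            transfer[source][target] = probability
    return dict(transfer)


# Hypothetical corpus: free-text word on the left, thesaurus terms on the right.
corpus = [
    (["class"], ["school class"]),
    (["class"], ["school class", "lesson"]),
    (["class"], ["social class"]),
    (["education"], ["teaching"]),
]

transfer = build_transfer(corpus)
print(transfer)
```

In this toy corpus the ambiguous word "class" is transferred to "school class" because that reading dominates the collection, which is the kind of environment-specific resolution the slide describes.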

Multilingual Access: 

Using the ETB thesaurus and heterogeneity handling:
  The ETB thesaurus allows indexing or searching in any covered language; results can automatically be retrieved in all other languages.
  Heterogeneity handling (intellectually or automatically processed) allows the use of any (language-specific) scheme: results can also be retrieved in other schemes or languages.
  Integration of results from the area of cross-language information retrieval and its evaluation (see CLEF = Cross-Language Evaluation Forum at www.clef-campaign.org)
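One way to picture thesaurus-based multilingual access is to index resources by language-independent concept rather than by word, so that a query in one language reaches resources indexed in another. The concepts, resource identifiers and function names in this sketch are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: two toy concepts with one descriptor per covered language.
CONCEPTS = {
    "c1": {"en": "pupil", "de": "Schüler", "fr": "élève"},
    "c2": {"en": "teacher", "de": "Lehrer", "fr": "enseignant"},
}

# Lookup from (language, lower-cased descriptor) to concept identifier.
TERM_TO_CONCEPT = {
    (lang, term.casefold()): concept_id
    for concept_id, labels in CONCEPTS.items()
    for lang, term in labels.items()
}

# Hypothetical resources: (identifier, indexing language, assigned descriptors).
RESOURCES = [
    ("doc-en-1", "en", ["pupil"]),
    ("doc-de-1", "de", ["Schüler", "Lehrer"]),
    ("doc-fr-1", "fr", ["enseignant"]),
]


def build_concept_index(resources):
    """Index resources by concept identifier instead of by word."""
    index = defaultdict(set)
    for resource_id, lang, descriptors in resources:
        for term in descriptors:
            concept_id = TERM_TO_CONCEPT.get((lang, term.casefold()))
            if concept_id is not None:
                index[concept_id].add(resource_id)
    return index


def search(term, lang, index):
    """Resolve the query term to a concept and return resources in any language."""
    concept_id = TERM_TO_CONCEPT.get((lang, term.casefold()))
    if concept_id is None:
        return []
    return sorted(index.get(concept_id, set()))


concept_index = build_concept_index(RESOURCES)
print(search("élève", "fr", concept_index))
```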


Conclusion: 

ETB is strongly integrated in an existing and rapidly developing application for practitioners (teachers and pupils), with good political support for ICT in education.
ETB is strongly integrated into top-level research on distributed networking, metadata, (cross-language) information retrieval, multilingual thesauri, and heterogeneity handling.

Thank you for your attention!: 

Further information:
  On the multilingual ETB thesaurus: http://www.en.eun.org/eun.org2/eun/en/etb/content_frame.cfm?lang=en&ov=3813
  On other aspects of the ETB Project (collection description, quality management, technical solutions): http://www.en.eun.org/eun.org2/eun/en/etb/sub_area_frame.cfm?sa=195&row=1
  Michael Kluck's publications: http://www.educat.hu-berlin.de/~kluck/kl-personal.html


