Dublin Core for Museums Day 1 : Dublin Core for Museums Day 1
CIMI
John Perkins jperkins@cimi.org
Slide2 : Overview for Thursday March 25 Introduction to Metadata
Introducing the Dublin Core
CIMI DC Guidelines - Dublin Core for Museums
Break
DC for museums continued...
Lunch
Practicalities of Implementing DC
Break
Introduction to MICI
Slide3 : What’s the Problem ? Need to serve a Web audience
Demand for content
Uncertain quality
Expectations for rapid easy access
Need to be visible on the Web
Two million web sites
Half a billion addressable pages
Many communities with the same problem
Slide4 : What’s the Problem ? Manage and organise interconnected data
Different types
Different repositories
Packages
Interoperate with other communities
Interoperate with other applications
Need a way to:
Express meanings in rich and complex data
Express the structure of our data
Encode the transfer of data
Slide5 : What’s the Solution ? Communities address their own needs
Do so in a way that works across communities
Standards based
Collaborative
What is a Community? : What is a Community? Based on a slide by Stu Weibel
Slide7 : Based on a slide by Stu Weibel Communities working together
Slide8 : Based on a slide by Stu Weibel Communities working together Metadata
What is Metadata? : What is Metadata? Meaningless jargon
or a fashionable term for what we’ve always done
or “a means of turning data into information”
and “data about data”
and the name of a film director (‘Luc Besson’)
and the title of a book (‘Lord of the Flies’).
What is Metadata? : What is Metadata? Metadata exists for almost anything
People
Places
Objects
Concepts
Databases
Web pages
What is Metadata? : What is Metadata? Metadata fulfils three main functions:
description of resource content
“What is it?”
description of resource form
“How is it constructed?”
description of issues behind resource use
“Can I afford it?”.
What is Metadata? : What is Metadata? Many structures have evolved at different levels, and to meet different requirements... MICI
For human communication we need... : For human communication we need...
Semantic Interoperability: standardisation of content (“Let’s talk English”)
Structural Interoperability: standardisation of form (“Here’s how to make a sentence”)
Syntactic Interoperability: standardisation of expression (“These are the rules of grammar”)
Examples: “cat milk sat drank mat”; “Cat sat on mat. Drank milk.”; “The cat sat on the mat. It drank some milk.”
Challenges : Challenges Many flavours of metadata
which one do I use?
Managing change
new varieties, and evolution of existing forms
Tension between functionality and simplicity, extensibility and interoperability
Opportunities
Introducing the Dublin Core : Introducing the Dublin Core An attempt to improve resource discovery on the Web
now adopted more broadly
Building an interdisciplinary consensus about a core element set for resource discovery
simple and intuitive
cross–disciplinary
international
flexible.
Introducing the Dublin Core : Introducing the Dublin Core 15 elements of descriptive metadata
All elements optional
All elements repeatable
The whole is extensible
offering a starting point for semantically richer descriptions
Interdisciplinary
libraries, museums, government, education...
International
available in 20 languages, with more on the way.
Introducing the Dublin Core : Introducing the Dublin Core Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
http://purl.org/dc/
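The "all elements optional, all elements repeatable" rule can be illustrated with a small data structure. This is only a sketch: the class name `DCRecord` and its methods are hypothetical and not part of any Dublin Core specification or tool; only the fifteen element names come from the slide.

```python
from collections import defaultdict

# The fifteen element names listed on the slide.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


class DCRecord:
    """Hypothetical container: every element is optional and repeatable."""

    def __init__(self):
        # An element that was never added has no values (optional);
        # each element maps to a list of values (repeatable).
        self._values = defaultdict(list)

    def add(self, element, value):
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        self._values[element].append(value)

    def get(self, element):
        # Return a copy so callers cannot alter the record by accident.
        return list(self._values.get(element, []))


record = DCRecord()
record.add("Title", "Mona Lisa")
record.add("Subject", "portrait")              # Subject used twice:
record.add("Subject", "Renaissance painting")  # elements are repeatable
```

Here `record.get("Rights")` is simply an empty list, because nothing obliges a record to carry every element.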
Extending DC (semantic refinement) : Extending DC (semantic refinement) Improve descriptive precision by adding sub–structure (subelements and schemes)
Greater precision = lesser interoperability
Should ‘dumb down’ gracefully
Diagram labels: Element qualifier, Value qualifier, Contact Info, Affiliation
Based on a slide by Stu Weibel
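One way to picture graceful "dumbing down" is a function that discards the qualifiers and keeps only the plain element and value. The sketch below is an assumption for illustration: the dictionary keys, the qualifier name `Created`, and the scheme names `ISO8601` and `ExampleGazetteer` are hypothetical, not taken from the slides.

```python
def dumb_down(statements):
    """Reduce qualified statements to simple (element, value) pairs.

    Each statement is a dict with the hypothetical keys 'element',
    'qualifier' (element qualifier), 'scheme' (value qualifier)
    and 'value'. The qualifiers are dropped; the value survives.
    """
    return [(s["element"], s["value"]) for s in statements]


statements = [
    {"element": "Date", "qualifier": "Created",
     "scheme": "ISO8601", "value": "1999-03-25"},
    {"element": "Coverage", "qualifier": None,
     "scheme": "ExampleGazetteer",
     "value": "North and Central America, United States, Washington"},
]

simple = dumb_down(statements)
```

A consumer that understands only the fifteen simple elements can still use `simple`; it loses precision but not the value itself.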
Extending DC (a modular approach) : Extending DC (a modular approach) Modular extensibility...
additional elements to support local needs
complementary packages of metadata
…but only if we get the building blocks right Based on a slide by Stu Weibel
Extending DC? : Extending DC? DC offers a semantic framework
through use of further substructure, meaning can often be clarified
“John” alone is ambiguous: John Inc.? John xyz? xyz John?
with substructure the intended reading is explicit: John Inc., John xyz, or xyz John.
Extending DC? : Extending DC? DC offers a semantic framework
Use of domain–specific schemes greatly increases precision
“Washington” alone is ambiguous: Washington State? Washington DC? Washington monument?
a scheme value removes the ambiguity: “North and Central America, United States, Washington”
Dublin Core in the physical world : Dublin Core originally designed with electronic resources in mind
Physical resources are fundamentally different
Issues of surrogacy become more important
Genre, Type, and Format models vary greatly
Difficult to remember what is being described, and which characteristics of the resource and its surrogates are ‘correct’.
Introducing Physical Objects : Aspects of the real world are key to much of what museums do
Physical objects have dimensions
23 x 46 cm
12 x 52 x 18 in
18.6 cm³
823 pages
Physical objects have a form
oil on canvas
Tadcaster limestone
stainless steel.
Introducing Physical Objects : Physical objects change over time
constructed between AD524 and 873
repaired in AD1270
incorporated into ornamental arch in AD1320
Physical objects move
cast in Beijing
used in Shanghai
taken to Hong Kong
on display in Macau.
Introducing Physical Objects : Physical objects are associated with people
written by William Shakespeare
acquired by Lord Elgin
decreed by the Emperor Hadrian
associated with Prince Charles Edward Stuart
Physical objects are contextualised
fired at the Battle of Trafalgar
carried on Apollo 11 from the moon
printed on the first printing press
salvaged from the Titanic.
Introducing Collections : Museum objects, whether original or surrogate, are normally part of a collection
Collections may be ‘real’...
the Sutton Hoo hoard
the Terracotta Warriors
...an aspect of the process by which objects enter the museum...
the Burrell Collection
Solomon Guggenheim’s art collection
…or simply practical
coins at the British Museum
the Tate Gallery’s collection of works by Da Vinci.
Introducing Surrogacy : Many of the resources we describe are, in reality, surrogates for something else
a photograph of King Tutankhamen’s death mask
a photograph of a statue of George Washington
a film of President Kennedy’s assassination
a sound recording of Neil Armstrong’s “One small step for man…” speech on the moon
a copy of the Mona Lisa
a model of the Great Wall of China
a reproduction of the Terracotta warriors.
Issues of Surrogacy : Many of the resources we describe are, in reality, surrogates for something else
we need to be clear whether we are describing the resource or its surrogate
the sculptor of a statue is often not the person who made its photographic surrogate
the model of the Forbidden City is unlikely to have been created at the same date as the Forbidden City itself
the format of a computer image of the Mona Lisa (image/jpeg?) is not the same as the format of the original painting (oil on canvas?).
Other Museum Issues : Museums need to describe real objects and surrogates in a similar manner
guidelines/standards therefore need to encompass both, despite their differences
Resource descriptions will often be drawn from existing collection management systems in the first instance, rather than created afresh
guidelines therefore need to respect existing practices within established systems
There is often no ‘right’ answer
so practices need to allow for approximate dates, multiple possible creators, etc.
Introducing the 1:1 Principle : The broader Dublin Core community is tackling some of the problems relevant to museums
Their work on the ‘1:1 Principle’ is especially useful in resolving museum issues over original versus surrogate and item versus collection:
each Dublin Core ‘record’ should describe only one resource, whether surrogate or original. Associated resources should be linked together by means of the Relation element in Dublin Core.
Introducing the 1:1 Principle : In a record describing a photo of the Mona Lisa on a web page, for example…
Leonardo da Vinci is not the creator of the image
The image was not created during the Renaissance
…but you might include these as Subject terms, and you could usefully provide a link to the record describing the real painting via Dublin Core’s Relation element
Equally, in describing the painting itself…
http://www.louvre.fr/…/monalisa.jpg is not the Identifier of the painting
but you might link to this image via Relation, just to show people what the painting looks like.
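The Mona Lisa example can be sketched as two separate records, one per resource, joined only through Relation. The `urn:example:` identifiers, the titles, and the helper `related` are hypothetical and exist only for this illustration.

```python
# One record per resource (1:1 Principle). Identifiers are made up.
painting = {
    "Identifier": ["urn:example:painting:mona-lisa"],
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Format": ["oil on canvas"],
    "Relation": ["urn:example:image:mona-lisa-photo"],
}

photo = {
    "Identifier": ["urn:example:image:mona-lisa-photo"],
    "Title": ["Photograph of the Mona Lisa"],
    # Leonardo appears as a Subject of the photo, not as its Creator.
    "Subject": ["Mona Lisa", "Leonardo da Vinci"],
    "Format": ["image/jpeg"],
    "Relation": ["urn:example:painting:mona-lisa"],
}


def related(record, records):
    """Return the records whose Identifier is named in record's Relation."""
    wanted = set(record.get("Relation", []))
    return [r for r in records if wanted & set(r.get("Identifier", []))]


linked = related(photo, [painting, photo])
```

Following Relation from the photo record reaches the painting record, so neither record has to borrow the other's Creator, Date, or Format.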
The primacy of ‘Type’ : In describing museum objects, it is often most useful to first decide what you are describing and why, rather than beginning with ‘who made it’ and ‘what is it called’, as is often the case with books
if you know you’re describing a surrogate of the Mona Lisa, then you know Leonardo da Vinci is not the Creator; whoever made the surrogate is
if you know you’re describing a collection of 20th century paintings, then you know that Picasso, Hockney et al are not the Creators; the collector is.
The primacy of ‘Type’ : if you know you’re describing the Sutton Hoo helmet, then the fact that it was added to a particular museum collection in 1939 perhaps doesn’t matter; that information is better placed in the collection record
if you know you’re describing a natural specimen, then perhaps it has no Creator; there may be a ‘creator’ associated with its identification or collection, though.
Dublin Core for Museums: Assumptions : In applying Dublin Core to museums, we are making certain basic assumptions, many of which were tested by CIMI
DC is appropriate for use in describing both physical and digital resources
DC is easy to learn and simple to use
Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records
the creation of a DC record to describe a museum object is cost–effective, and aids the discovery of resources more than simply allowing access to the underlying Collection Management system might.
Practicalities of Implementing Dublin Core : Practicalities of Implementing Dublin Core
Paul Miller UK Office for Library & Information Networking p.miller@ukoln.ac.uk
Thomas Hofmann Australian Museums On-Line thomash@amol.org.au
Overview : Overview Creation and Maintenance
Harvesting and Distribution
Retrieval
Implementation Models
Case Study
Dublin Core - Refresher : Dublin Core - Refresher 15 simple elements
Focus on Resource Discovery not Resource Description
One Dublin Core record per resource
Interoperable across communities
Can be easily populated from existing databases
Can be formatted in XML/RDF or HTML
When should I use Dublin Core? : When should I use Dublin Core? You have a rich standard, but need a simpler one
You want to disclose your data to other communities using commonly understood semantics
You want to provide unified access to databases with different underlying schemas
You need core description semantics and don’t feel compelled to invent them anew
Considerations : Considerations
Encoding Dublin Core : Encoding Dublin Core HTML
Unqualified
Easy
Qualified
Overloaded Content (HTML 3.2)
Additional Attribute (HTML 4)
RDF
Based on XML
Sophisticated
More complex
Encoding Dublin Core - Unqualified : Encoding Dublin Core - Unqualified
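The example for this slide did not survive in the transcript. As a stand-in, the sketch below shows one common convention for unqualified Dublin Core in HTML: one `meta` tag per value, with the element name prefixed by `DC.`. The function name and the sample values are assumptions, not the presenters' original example.

```python
from html import escape


def to_meta_tags(record):
    """Render a {element: [values]} dict as HTML meta tags, one per value."""
    lines = []
    for element, values in record.items():
        for value in values:
            # Escape quotes and angle brackets so the attribute stays valid.
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Illustrative values only.
example = {
    "Title": ["Dublin Core for Museums"],
    "Creator": ["John Perkins"],
    "Date": ["1999-03-25"],
}
html_head = to_meta_tags(example)
```

The resulting lines would be pasted into the `head` of the page being described.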
Encoding Dublin Core - Qualified (HTML 3.2) : Encoding Dublin Core - Qualified (HTML 3.2)
Encoding Dublin Core - Qualified (HTML 4) : Encoding Dublin Core - Qualified (HTML 4)
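The original examples for the two qualified encodings are missing from the transcript. The sketch below contrasts the two approaches named in the outline: HTML 3.2 has no attribute for a scheme, so the scheme is overloaded into `content`, while HTML 4 offers a separate `scheme` attribute on `meta`. The exact `(SCHEME=...)` spelling and the scheme name `ISO8601` are assumptions for illustration.

```python
from html import escape


def qualified_meta(element, value, scheme=None, html4=True):
    """Render one meta tag, carrying the scheme in one of two ways."""
    content = escape(value, quote=True)
    if scheme is None:
        return f'<meta name="DC.{element}" content="{content}">'
    scheme = escape(scheme, quote=True)
    if html4:
        # HTML 4: the scheme has its own attribute.
        return f'<meta name="DC.{element}" scheme="{scheme}" content="{content}">'
    # HTML 3.2: the scheme is packed into the content string itself.
    return f'<meta name="DC.{element}" content="(SCHEME={scheme}) {content}">'


html4_tag = qualified_meta("Date", "1999-03-25", scheme="ISO8601", html4=True)
html32_tag = qualified_meta("Date", "1999-03-25", scheme="ISO8601", html4=False)
```

The overloaded form forces every consumer to parse the content string, which is one reason the additional attribute is the cleaner option where HTML 4 is available.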
Encoding Dublin Core - Sub-Elements : Encoding Dublin Core - Sub-Elements
Encoding Dublin Core - RDF : Encoding Dublin Core - RDF ...
1999–03–25
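Only a date survives from the RDF example on this slide. The sketch below builds a comparable RDF/XML description with Python's standard library. The Dublin Core namespace URI used here (element set 1.1), the lower-case element names, and the sample title are assumptions; the original slide may have used different ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumption: DC element set 1.1


def to_rdf_xml(about, record):
    """Serialise a {element: [values]} dict as one rdf:Description."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, values in record.items():
        for value in values:
            # Repeated values become repeated child elements.
            child = ET.SubElement(description, f"{{{DC_NS}}}{element.lower()}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


rdf_xml = to_rdf_xml(
    "http://www.cimi.org/",
    {"Title": ["CIMI"], "Date": ["1999-03-25"]},  # illustrative values
)
```

Compared with the HTML encodings, the description lives outside the page it describes and names its resource explicitly through `rdf:about`.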
Example Tool: DC Dot : Example Tool: DC Dot http://www.ukoln.ac.uk/metadata/dcdot/
Semi-automated generation of Dublin Core
Cut and paste into document
Conversions to HTML, SOIF, XML, WHOIS++, USMARC, GILS
Example Tool: DC Dot : Example Tool: DC Dot Screenshot of http://www.ukoln.ac.uk/metadata/dc-dot/
Example Tool: DC Dot : Example Tool: DC Dot Screenshots of DC Dot output
Example Tool: Reggie : Example Tool: Reggie http://metadata.net
Generic creation tool for any metadata schema published to metadata.net
Currently supports: Dublin Core in 5 languages
Syntax: HTML META tags (V3.2 and 4.0), RDF
Example Tool: Reggie : Example Tool: Reggie Screenshot of Reggie
Example Tool: Site Generator : Example Tool: Site Generator http://www.dstc.edu.au/RDU/MetaWeb/
Tool which parses local web site and automatically creates Dublin Core metadata
Syntax: HTML
Java-based tool which requires JDK 1.1
Further Information - Creation and Maint. : Further Information - Creation and Maint. Metadata Creation Tools
General metadata page at UKOLN http://www.ukoln.ac.uk/metadata/software-tools/
MetaWeb http://www.dstc.edu.au/RDU/MetaWeb/
TagGen SE http://www.hisoftware.com/fact_sheetcc.htm
User Guides
Official User Guide for Simple Dublin Core http://purl.org/dc/core/documents/working_drafts/wd-guide-current.htm
CIMI Guide to Best Practice: Dublin Core
Harvesting / Distribution : Harvesting / Distribution Tools
Z39.50 Gateway
Metadata Harvester
Full-text Search Engine
Resources
Indexing, harvesting tools
http://www.searchenginewatch.com/
http://www.searchtools.com/
http://www.ukoln.ac.uk/metadata/software-tools/
http://www.dstc.edu.au/RDU/MetaWeb/
Z39.50
http://www.ilrt.bris.ac.uk/discovery/z3950/resources/
http://www.ukoln.ac.uk/dlis/z3950/resources/
Retrieval : Retrieval Tools
HTML - search forms
HTML - predefined queries
Z39.50 clients/ Java applets
Standalone applications
Interface design
Assist users:
- help them to understand what they are looking for
- give them an idea what terminologies you are using
- use commonly understood design language
Slide58 : Bringing it all together: Implementation Models
Implementation Models : Implementation Models Harvesting DC into a repository (database)
Distributed Database Search
Full-text indexing with metadata extraction
Implementation Models : Implementation Models Harvesting DC into a repository (database)
Diagram labels: HTML, XML, Other types; Dynamic document creation from database; Harvester; Repository; Query; retrieve resource
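The harvesting model can be sketched in a few lines: a harvester reads `DC.*` meta tags out of HTML pages, stores them in a repository keyed by URL, and queries run against the repository rather than the pages. Everything here is an illustrative assumption: the class and function names, the in-memory dictionary standing in for a database, and the `example.org` pages.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect <meta name="DC.xxx" content="..."> tags from one page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].capitalize()
            self.record.setdefault(element, []).append(content)


def harvest(pages):
    """Build a repository {url: record} from {url: html_text}."""
    repository = {}
    for url, html_text in pages.items():
        parser = DCMetaParser()
        parser.feed(html_text)
        parser.close()
        if parser.record:  # skip pages that carry no DC metadata
            repository[url] = parser.record
    return repository


def query(repository, element, term):
    """Return URLs whose values for the element contain the term."""
    term = term.lower()
    return sorted(
        url for url, record in repository.items()
        if any(term in value.lower() for value in record.get(element, []))
    )


pages = {
    "http://example.org/a.html":
        '<html><head><meta name="DC.Title" content="Sutton Hoo helmet">'
        '<meta name="DC.Type" content="physical object"></head></html>',
    "http://example.org/b.html":
        "<html><head><title>No metadata</title></head></html>",
}
repository = harvest(pages)
hits = query(repository, "Title", "sutton")
```

The query returns URLs, which matches the last step in the diagram: the user retrieves the resource itself from its original location.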
Implementation Models : Implementation Models Distributed Database Search
Diagram labels: Query; retrieve resource
Implementation Models : Implementation Models Full-text indexing with metadata extraction
Diagram labels: HTML, XML, Other types; Dynamic document creation from database; Indexer; Index DB; Query; retrieve resource
Questions before implementation : Questions before implementation Do I really need Dublin Core?
What is my budget?
What type of resources do I want to describe?
Which encoding format for which resource?
Do I have community support?
Can I provide creation tools?
Challenges of implementing Dublin Core : Challenges of implementing Dublin Core Intellectual
Education of information creators
Community consensus
Resistance against sharing information
Technical
Efficient tools
Infrastructure
Economical
Automatic generation vs. manual creation
Cost of training
Cost of tools
Dublin Core for the masses : Dublin Core for the masses Why Dublin Core hasn’t hit the consumer market yet
No killer application
Lack of standardisation
No support in public search engines
No support in mass market applications
Non transparent applications
Inefficient handling in HTML
Further Information : Further Information Projects
Official Dublin Core web site http://purl.oclc.org/dc/projects/index.htm
Mailing lists
Dublin Core Implementors workgroup mailing list http://www.mailbase.ac.uk/lists/dc-implementors/
Case Study AMOL (1) : Case Study AMOL (1) Gateway to Australian Museums and Galleries
Initial idea: One central access point for all Australian collections
Creation of AMOL standard record for object data due to lack of common standards
8 basic fields with a focus on resource discovery and easy deployment from within existing databases
Fields: Object Title, Object Name, Creator, Description, Item ID, KeySearchTerms, Date/DateRange, Associated Places
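To show how a record of this shape might be carried over into Dublin Core, here is a small crosswalk sketch. The mapping itself is an assumption made for illustration and is not AMOL's published mapping; only the eight field names come from the slide, and the sample record is invented.

```python
# Hypothetical crosswalk from the eight AMOL fields to DC elements.
AMOL_TO_DC = {
    "Object Title": "Title",
    "Object Name": "Type",
    "Creator": "Creator",
    "Description": "Description",
    "Item ID": "Identifier",
    "KeySearchTerms": "Subject",
    "Date/DateRange": "Date",
    "Associated Places": "Coverage",
}


def amol_to_dc(amol_record):
    """Convert one AMOL-style dict into a {DC element: [values]} dict."""
    dc_record = {}
    for field, value in amol_record.items():
        element = AMOL_TO_DC.get(field)
        if element is None or not value:
            continue  # unknown or empty fields are left out
        values = value if isinstance(value, list) else [value]
        dc_record.setdefault(element, []).extend(values)
    return dc_record


# Invented sample record.
sample = {
    "Object Title": "Gold nugget",
    "Object Name": "mineral specimen",
    "Item ID": "EX-0001",
    "KeySearchTerms": ["gold", "mining"],
    "Creator": "",
}
dc_sample = amol_to_dc(sample)
```

Empty fields are dropped rather than mapped, which fits the rule that every Dublin Core element is optional.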
Case Study AMOL (2) : Case Study AMOL (2) AMOL search/ system architecture - current system
Case Study AMOL (3) : Case Study AMOL (3) Lessons Learned
Data and technology related
Lack of consistent use of controlled vocabularies, quality of data recorded
Performance of indexing software, lack of metadata support in public search engines
High administration effort
Intellectual
Users have problems with “empty text box” approach
Limited information in record to see context with larger picture
General
Large institutions: bureaucratic machinery, complex collection systems designed without interoperability in mind
Small institutions: concerned about security issues, fear of larger institutions
Case Study AMOL (4) : Case Study AMOL (4) New perspectives
New resource types: Information about institutions, Images, Video, Audio, general HTML pages - goes beyond capabilities of standard AMOL record
Need to provide easier access for users
New cross community projects require interoperable metadata standards for cross domain searching
Strong move in Australia towards Dublin Core based metadata schemas driven by government
Strong move towards interpretation of objects through stories
Search Architecture and extended AMOL metadata standard
Case Study AMOL (5) : Case Study AMOL (5) NEW AMOL search/ system architecture
Case Study AMOL (6) : Case Study AMOL (6) Future Directions
Implementation of RDF for dynamically served databases and text style resources
Consensus of community: Metadata Forum
Further education of users: Metadata Workshops
Creation of multi-type metadata schema based on Dublin Core
Creation of mapping tools for easier database implementation
Case Study AMOL (7) : Case Study AMOL (7) Recommendations
Prepare good user guides
Run workshops and educate museum professionals
Get consensus from community
Plan with interoperability in mind
Evaluate tools and plan for future additions
Biggest problem still remaining:
what is the benefit to the individual institution, other than being interoperable for networked resources?
Slide79 : http://www.cimi.org/
For Machine Communication we need.. : For Machine Communication we need..
Semantic Interoperability: standardisation of content (“Let’s talk Resource Description”)
Structural Interoperability: standardisation of form (“Let’s use MICI”)
Syntactic Interoperability: standardisation of expression (“Here’s how to say it in HTML”)
Examples: “Creator, Publisher..,” “Field # 1 Element Name “”