Digital Libraries : Archaeology, Automation, ETDs, and Enhancements
Edward A. Fox (fox@vt.edu), Virginia Tech, USA
IADLC 2005: The International Advanced Digital Library Conference in Nagoya
August 25-26, 2005
Outline : Acknowledgements
Introduction: Life Cycle, Curric., 5S, Book
ETANA-DL, 5S Description
Theory and Automation
Education: CS, ETDs
Quality, Integration, and Automation
Selected Links, Discussion
Acknowledgements: Students : Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …
Acknowledgements: Faculty, Staff : Lillian Cassel, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Douglas Knight, Deborah Knox, Aaron Krowne, Alberto Laender, Gail McMillan, Claudia Medeiros, Manuel Perez, Naren Ramakrishnan, Layne Watson, …
Other Collaborators (Selected) : Brazil: FUA, UFMG, UNICAMP
Case Western Reserve University
Emory, Notre Dame, Oregon State
Germany: Univ. Oldenburg
Mexico: UDLA (Puebla), Monterrey
College of NJ, Hofstra, Penn State, Villanova
University of Arizona
University of Florida, Univ. of Illinois
University of Virginia
Acknowledgements - Mentors : JCR Licklider – undergrad advisor (1969-71)
Author in 1965 of 'Libraries of the Future'
Before, at ARPA, funded start of Internet
Michael Kessler – BS thesis advisor
Project TIP (technical information project)
Defined bibliographic coupling
Gerard Salton – graduate advisor (1978-83)
'Father of Information Retrieval'
Acknowledgements: Support : ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579; ITR-0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS
Information Life Cycle : [Figure: information life cycle diagram. Labels: Creating, Authoring, Modifying, Organizing, Indexing, Storing, Retrieving, Distributing, Networking, Retention / Mining, Accessing, Filtering, Using]
DL Curriculum Framework :
5S Layers : Societies, Scenarios, Spaces, Structures, Streams
5S Layers : Societies, Scenarios, Spaces, Structures, Streams; 5 Elements: Fire, Wood, Earth, Metal, Water
5Ss :
Informal 5S & DL Definitions : DLs are complex systems that
help satisfy info needs of users (societies)
provide info services (scenarios)
organize info in usable ways (structures)
present info in usable ways (spaces)
communicate info with users (streams)
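The informal definitions can be made concrete with a small data model. This is an illustrative sketch only: the class name `DigitalLibrarySketch`, its fields, and the sample values are assumptions made for this example, not the formal 5S definitions.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalLibrarySketch:
    """Toy container with one slot per 'S' (hypothetical names)."""
    societies: set = field(default_factory=set)      # whose info needs are served
    scenarios: dict = field(default_factory=dict)    # service name -> ordered steps
    structures: dict = field(default_factory=dict)   # how info is organized
    spaces: dict = field(default_factory=dict)       # how info is presented
    streams: list = field(default_factory=list)      # info communicated to users

    def summary(self):
        """Return the number of entries recorded under each S."""
        return {
            "societies": len(self.societies),
            "scenarios": len(self.scenarios),
            "structures": len(self.structures),
            "spaces": len(self.spaces),
            "streams": len(self.streams),
        }


# Made-up sample values, for illustration only.
dl = DigitalLibrarySketch()
dl.societies.update({"archaeologists", "general public"})
dl.scenarios["search"] = ["enter query", "rank records", "show results"]
dl.structures["site organization"] = ["region", "site", "partition", "locus"]
dl.spaces["browse interface"] = "2D"
dl.streams.append("textual report")
print(dl.summary())
```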
Hypotheses : A formal theory for DLs can be built based on 5S.
The formalization can serve as a basis for modeling and building high-quality DLs.
Research Questions : 1. Can we formally elaborate 5S?
2. How can we use 5S to formally describe digital libraries?
3. What are the fundamental relationships among the Ss and high-level DL concepts?
4. How can we allow digital librarians to easily express those relationships?
5. Which are the fundamental quality properties of a DL? Can we use the formalized DL framework to characterize those properties?
6. Where in the life cycle of digital libraries can key aspects of quality be measured and how?
Book Parts : Ch. 1. Introduction (Motivation, Synopsis)
Part 1 – The 'Ss'
Part 2 – Higher DL Constructs
Part 3 – Advanced Topics
Appendix
Book Parts and Chapters - 1 : Ch. 1. Introduction (Motivation, Synopsis)
Part 1 – The 'Ss'
Ch. 2: Streams
Ch. 3: Structures
Ch. 4: Spaces
Ch. 5: Scenarios
Ch. 6: Societies
Book Parts and Chapters - 2 : Part 2 – Higher DL Constructs
Ch. 7: Collections
Ch. 8: Catalogs
Ch. 9: Repositories and Archives
Ch. 10: Services
Ch. 11: Systems
Ch. 12: Case Studies
Book Parts and Chapters - 3 : Part 3 – Advanced Topics
Ch. 13: Quality
Ch. 14: Integration
Ch. 15: How to build a digital library
Ch. 16: Research Challenges, Future Perspectives
Appendix
A: Mathematical preliminaries
B: Formal Definitions: Ss
C: Formal Definitions: DL terms, Minimal DL
D: Formal Definitions: Archaeological DL
E: Glossary of terms, mappings
Initial ETANA-DL Member Locations : Virginia Tech, Mississippi State University, Vanderbilt University, Canadian University College, Walla Walla College, Andrews University, CWRU, Willamette University (map courtesy: www.enchantedlearning.com)
Lahav Website :
Megiddo Opening Screen :
Locus Screen: Pictures : View all
Area Screen :
ETANA-DL Approach : Applying and extending Digital Library (DL) techniques to solve key problems: making primary data available, data preservation, and interoperability
Modeling archaeological information systems using 5S to better understand the domain and design the system and the supporting services
Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks:
eliciting requirements
refining metamodel and union schema
modeling sites
mapping
harvesting
providing useful services
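The mapping step in the list above can be pictured as renaming site-specific fields to the fields of a shared union schema. This is a minimal sketch under stated assumptions: the function `map_to_union`, the field names in `SITE_A_MAP`, and the sample record are hypothetical and are not taken from ETANA-DL itself.

```python
def map_to_union(record, field_map):
    """Rename site-specific fields to union-schema fields.

    Fields with no mapping are returned separately so they can be
    reviewed when the union schema is refined.
    """
    mapped, unmapped = {}, {}
    for key, value in record.items():
        if key in field_map:
            mapped[field_map[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped


# Hypothetical local-to-union field mapping for one site.
SITE_A_MAP = {"sq": "square", "loc_no": "locus", "obj": "artifact_type"}

# Made-up record, for illustration only.
site_record = {"sq": "N40/W20", "loc_no": "86", "obj": "pottery", "note": "n/a"}
mapped, unmapped = map_to_union(site_record, SITE_A_MAP)
print(mapped)
print(unmapped)
```

Keeping the unmapped fields visible, rather than dropping them, is one way to support the iterative "refining metamodel and union schema" step.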
ETANA-DL Website :
Marking Items : Marking – writing notes for a specific user
Marked Items Display : Sender, Date, Object OAI ID, Sender Comments
Options: View Record, Add record to Items Of Interest, Re-mark item (Redirect), Unmark item (Remove item from list)
Discussions Page : Discussions about an object
View/Post messages, create new threads
Recommendations : Items recommended on the basis of similar interests
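The slide does not say how similarity of interests is computed. As one possible reading, the sketch below scores users by Jaccard overlap of their items of interest and recommends unseen items from similar users. The technique is a stand-in chosen for illustration, and the function names, user IDs, and item IDs are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, interests, top_n=3):
    """Recommend items held by users whose interests overlap the target's."""
    mine = interests[target]
    scores = {}
    for user, items in interests.items():
        if user == target:
            continue
        sim = jaccard(mine, items)
        for item in items - mine:          # only items the target lacks
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by item ID for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked if score > 0][:top_n]


# Made-up interest sets, for illustration only.
interests = {
    "u1": {"item:1", "item:2"},
    "u2": {"item:1", "item:2", "item:3"},
    "u3": {"item:9"},
}
print(recommend("u1", interests))
```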
ETANA-DL Multi-dimensional Browsing : 3 new sites, 2 new types of artifacts
ETANA-DL Visual Browsing Service : Visual browse by site
Visual Browsing Nimrin: Topographical Drawings : Full site; northwest quadrant; Square: N40/W20
Visual Browsing Nimrin: Square information : Square: N40/W20; Locus: 86; Loci layout
Visual Browsing Nimrin: Locus sheet :
Visual Browsing Bab edh-Dhra' Cemetery : Pottery # 25
Visual Browsing Bab edh-Dhra' Cemetery : Pottery # 25
ETANA Societies : Historic and pre-historic societies (being studied)
Archaeologists (in academic institutes, fieldwork settings, or local and national governmental bodies)
Project directors
Technical staff (consisting of photographers, technical illustrators, and their assistants)
Field staff (responsible for the actual work of excavation)
Camp staff (e.g., camp managers, registrars, tool stewards)
General public (e.g., educators, learners, citizens)
ETANA Societies : Social issues
Who owns the finds?
Where should they be preserved?
What nationality and ethnicity do they represent?
Who has publication rights?
What interactions took place between those at the site studied, and others? What theories are proposed by whom about this?
ETANA Scenarios : Life in the site in former times
Digital recording: the planning stage and the excavation stage
Planning stage: remote sensing, fieldwalking, field surveys, building surveys, consulting historical and other documentary sources, and managing the sites and monuments
Excavation
Detailed information is recorded, including for each layer of soil, and for features such as post holes, pits, and ditches.
Data about each artifact is recorded together with information about its exact find spot.
Numerous environmental and other samples are taken for laboratory analysis, and the location and purpose of each is carefully recorded.
Large numbers of photographs are taken, both general views of the progress of excavation and detailed shots showing the contexts of finds.
Organization and storage of material
Analysis and hypotheses generation and testing
Publications, museum displays
Information services for the general public
ETANA Spaces : Geographic distribution of found artifacts
Temporal dimension (as inferred by archaeologists)
Metric or vector spaces
used to support retrieval operations, and to calculate distance (and similarity)
used to browse / constrain searches spatially
3D models of the past, used to reconstruct and visualize archaeological ruins
2D interfaces for human-computer interaction
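To illustrate how vector and metric spaces support retrieval, the sketch below computes cosine similarity between sparse term-weight vectors and Euclidean distance between points. These are standard textbook measures used here as examples; the slide does not state which measures ETANA-DL uses, and the function names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Made-up query and record vectors, for illustration only.
query = {"pottery": 1.0, "bronze": 1.0}
record = {"pottery": 2.0, "age": 1.0}
print(cosine_similarity(query, record))
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))
```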
ETANA Structures : Site Organization
Region, site, partition, sub-partition, locus, …
Temporal orderings (ages, periods)
Taxonomies
for bones, seeds, building materials, …
Stratigraphic relationships
above, beneath, coexistent
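The stratigraphic relationships named here can be modeled as a small directed graph over loci. In this sketch, "above" is stored directly, "beneath" is derived as its inverse, and "coexistent" is stored as unordered pairs. The class `Stratigraphy` and its method names are hypothetical, and treating "above" as transitive is an assumption of the example.

```python
from collections import defaultdict


class Stratigraphy:
    """Toy store for above / beneath / coexistent relations between loci."""

    def __init__(self):
        self._directly_above = defaultdict(set)  # locus -> loci directly beneath it
        self._coexistent = set()                 # unordered pairs as frozensets

    def add_above(self, upper, lower):
        self._directly_above[upper].add(lower)

    def add_coexistent(self, a, b):
        self._coexistent.add(frozenset((a, b)))

    def is_above(self, upper, lower):
        """True if `upper` lies above `lower`, directly or via intermediate layers."""
        stack, seen = [upper], set()
        while stack:
            current = stack.pop()
            for below in self._directly_above.get(current, ()):
                if below == lower:
                    return True
                if below not in seen:
                    seen.add(below)
                    stack.append(below)
        return False

    def is_beneath(self, lower, upper):
        """'Beneath' is simply the inverse of 'above'."""
        return self.is_above(upper, lower)

    def is_coexistent(self, a, b):
        return frozenset((a, b)) in self._coexistent


# Made-up loci, for illustration only.
strat = Stratigraphy()
strat.add_above("locus-1", "locus-2")
strat.add_above("locus-2", "locus-3")
strat.add_coexistent("locus-3", "locus-4")
print(strat.is_above("locus-1", "locus-3"))
```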
ETANA Streams : successive photos and drawings of excavation sites, loci, unearthed artifacts
audio and video recordings of excavation activities and discussions
textual reports
3D models used to reconstruct and visualize archaeological ruins.
5S and DL formal definitions and compositions (April 2004 TOIS) :
The XML Log Format : [Figure: XML log format element tree. Element labels as shown: Log, SessionId, MachineInfo, Statement, Transaction, Timestamp, SessionInfo, RegisterInfo, Statement, Event, Timestamp, Action, Search, Browse, StoreSysInfo, Update, SearchBy, QueryString, Catalog, Collection, PresentationInfo, StatusInfo, Timeout]
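As a rough illustration of what a log entry built from these element names might look like, the sketch below assembles one search event with Python's standard `xml.etree.ElementTree`. The element names come from the slide, but the nesting shown is an assumption (the original tree layout did not survive extraction), and the function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def make_search_log_entry(session_id, timestamp, query):
    """Build one log entry for a search event (assumed nesting)."""
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "QueryString").text = query
    return log


# Made-up values, for illustration only.
entry = make_search_log_entry("s-001", "2005-08-25T09:00:00", "pottery")
print(ET.tostring(entry, encoding="unicode"))
```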
5S Modeling -> Systems :
Tools/Applications :
A Minimal DL in the 5S Framework : [Figure labels: Digital Object, Minimal DL, Metadata Catalog, Descriptive Metadata Specification, Structural Metadata Specification, Structured Stream]
A Minimal ArchDL in the 5S Framework :
Overview of 5SGraph : Workspace (instance model); Structured toolbox (metamodel)
Computing and Information Technology Interactive Digital Educational Library (CITIDEL) : Domain: computing / information technology
Genre: one-stop-shopping for teachers & learners: courseware (CSTC, JERIC), leading DLs (ACM, IEEE-CS, DB&LP, CiteSeer), PlanetMath.org, NCSTRL (technical reports), …
Submission & Collection: sub/partner collections
www.citidel.org
Digital library architecture for local and interoperable CITIDEL services :
CITIDEL -> NSDL : A collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL
National Science Digital Library
www.nsdl.org
(Next slides courtesy Lee Zia, NSF)
NSDL Program Tracks : Core Integration: coordinate a distributed alliance of resource collection and service providers; and ensure reliable and extensible access to and usability of the resulting network of learning environments and resources
Collections: aggregate and actively manage a subset of the digital library’s content within a coherent theme / specialty
Services: increase the impact, reach, efficiency, and value of the digital library in its fully operational form
Targeted (Applied) Research: have immediate impact on one or more of the other three tracks
Pathways: large efforts across broad ranges of areas or approaches or users
NSDL Information Architecture : Essentially as developed by the Technical Infrastructure Workgroup
[Figure labels: Usage Enhancement, Collection Building, User Interfaces, Core NSDL 'Bus']
Digital Libraries in Education : Analytical Survey, ed. Leonid Kalinichenko
© 2003, www.iite-unesco.org, info@iite.ru
Transforming the Way to Learn
DLs of Educational Resources & Services
Integrated/Virtual Learning Environment
Educational Metadata
Current DLEs: US (NSDL, DLESE, CITIDEL, NDLTD), Europe (Scholnet, Cyclades), UK (Distributed National Electronic Resource)
A Digital Library Case Study : Domain: graduate education, research
Genre: ETDs = electronic theses & dissertations
Submission: http://etd.vt.edu
Collection: http://www.theses.org
Project: Networked Digital Library of Theses & Dissertations (NDLTD)
http://www.ndltd.org
Student Gets Committee Signatures and Submits ETD :
Library Catalogs ETD, Access is Opened to the New Research : [Figure labels: WWW, NDLTD]
OCLC SRU Interface :
ETD Union Search Mirror Site in China (CALIS) : http://ndltd.calis.edu.cn – popular site!
Board of Directors : Suzie Allard (ETD 2004, U. Kentucky)
Denise A. D. Bedford (World Bank)
Julia C. Blixrud (ARL, SPARC)
José Luis Borbinha (Natl Lib Portugal)
Alex Byrne (ETD 2005, ADT: Australia)
Tony Cargnelutti (ETD 2005, Australia)
Vinod Chachra (VTLS)
Susan Copeland (RGU, UK)
Jude Edminster (Bowling Green St. U.)
Scott Eldredge (Treasurer, ETD 2002, BYU)
Edward A. Fox (Exec Director, Virginia Tech)
John H. Hagen (West Virginia U.)
Thomas B. Hickey (OCLC)
Christine Jewell (U. Waterloo, Canada)
Delphine Lewis (ProQuest)
Joan K. Lippincott (CNI)
Mike Looney (Adobe)
Gail McMillan (Secretary, Virginia Tech)
Joseph Moxley (ETD 2000, USF)
Eva Müller (U. Uppsala, Sweden)
Ana Pavani (PUC Rio, Brazil)
Axel Plathe (UNESCO, Paris)
Sharon Reeves (National Library Canada)
Peter Schirmbacher (ETD 2003, Humboldt)
Hussein Suleman (U. Cape Town, S. Africa)
Shalini R. Urs (U. Mysore, India)
Eric F. Van de Velde (ETD 2001, Caltech)
Selected Projects / Sponsors : Australia (ADT)
Brazil (BDT, IBICT)
Canada
Catalunya
Chile (Cybertesis)
Germany
India (Vidyanidhi)
Korea
OhioLINK: 79 colleges/univs
Portugal (National Library)
South Africa
UK (British Library, JISC, Edinburgh, …)
UNESCO (especially Latin America, Eastern Europe, Africa)
Venezuela
Why ETD? Short Answer : For Students:
Gain knowledge and skills for the Information Age
Richer communication (digital information, multimedia, …)
For Universities:
Easy way to enter the digital library field and benefit thereby
For the World:
Global digital library – large, useful, many services
General:
Save time and money
Increased visibility for all associated with research results
Describing Quality in Digital Libraries : What’s a 'good' digital library?
Central Concept: Quality!
Hypotheses of this work:
Formal theory can help to define 'what’s a good digital library' by:
New formalizations of quality indicators for DLs within our 5S framework
Contextualizing these measures within the Information Life Cycle
Quality and the Information Life Cycle :
Formal Definition of DL Integration : $DL_i = (R_i, DM_i, Serv_i, Soc_i)$, $1 \le i \le n$
$R_i$ is a network accessible repository
$DM_i$ is a set of metadata catalogs for all collections
$Serv_i$ is a set of services
$Soc_i$ is a society
UnionRep
UnionCat
UnionServices
UnionSociety
Formal Definition of DL Integration (Cont.) : DL integration problem definition:
Given n individual libraries, integrate the n DLs to create a UnionDL.
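The integration problem can be sketched in code by treating each DL as a 4-tuple of sets and forming the union DL componentwise. This is a deliberate simplification: plain set union is an assumption of the example, since real integration also needs schema mapping and harvesting, and the names `DL` and `integrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DL:
    """One digital library as (repository, catalogs, services, society)."""
    repository: frozenset   # identifiers of stored objects
    catalogs: frozenset     # metadata catalog entries
    services: frozenset     # service names
    society: frozenset      # user groups


def integrate(dls):
    """Combine n DLs into one union DL by componentwise set union."""
    return DL(
        repository=frozenset().union(*(d.repository for d in dls)),
        catalogs=frozenset().union(*(d.catalogs for d in dls)),
        services=frozenset().union(*(d.services for d in dls)),
        society=frozenset().union(*(d.society for d in dls)),
    )


# Made-up libraries, for illustration only.
dl1 = DL(frozenset({"obj-1"}), frozenset({"cat-1"}),
         frozenset({"search"}), frozenset({"archaeologists"}))
dl2 = DL(frozenset({"obj-2"}), frozenset({"cat-2"}),
         frozenset({"search", "browse"}), frozenset({"general public"}))
union_dl = integrate([dl1, dl2])
print(sorted(union_dl.services))
```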
Architecture of a Union DL : [Figure labels: Repository1, Catalog1, DL1, Repository2, Catalog2, DL2, Union Repository, Union Catalog, Union Service, Union DL, Searching Service, Browsing Service; Harvesting, Mapping, Searching, Browsing, Clustering, Visualization]
Example of Union Service: CitiViz :
Multidimensional Browsing: Percentages of Animal Bones Across Nimrin Cultural Phases :
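A percentage breakdown of this kind can be computed by grouping records along one dimension and counting along another. The sketch below does so for (phase, animal) pairs. The function name and the placeholder records are invented for illustration and are not Nimrin data.

```python
from collections import Counter, defaultdict


def percentages_by_phase(records):
    """Map each phase to {animal: percent of that phase's bone records}."""
    counts = defaultdict(Counter)
    for phase, animal in records:
        counts[phase][animal] += 1
    result = {}
    for phase, counter in counts.items():
        total = sum(counter.values())
        result[phase] = {animal: 100.0 * n / total for animal, n in counter.items()}
    return result


# Placeholder records with abstract labels, not real excavation data.
records = [
    ("phase-1", "taxon-x"),
    ("phase-1", "taxon-x"),
    ("phase-1", "taxon-y"),
    ("phase-2", "taxon-y"),
]
for phase, shares in percentages_by_phase(records).items():
    print(phase, shares)
```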
ETANA-DL Union Services Descriptions : Harvesting, Mapping, Searching, Browsing, …
[Figure labels: Inverted Files, Services DB, Index, Browse Service, Search Service, Browse DB, Other ETANA-DL Services, Web Interface, XOAI, Union Catalog]
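Harvesting into a union catalog is commonly done with OAI-PMH requests. Assuming that is the pattern intended here (the slide shows only an "XOAI" label), the sketch below builds a `ListRecords` request URL. The verb and argument names follow the OAI-PMH protocol; the function name and the base URL are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    When a resumption token is supplied it is sent alone with the verb,
    since OAI-PMH treats resumptionToken as an exclusive argument.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, for illustration only.
print(list_records_url("http://example.org/oai"))
```

A harvester would typically repeat the request with each returned resumption token until the repository stops issuing one.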
Selected Links - http://fox.cs.vt.edu : CITIDEL (computing education resources)
www.citidel.org
NCSTRL (computing technical reports)
www.ncstrl.org
NDLTD (electronic theses and dissertations worldwide)
www.ndltd.org and etdguide.org
NSDL (National Science Digital Library)
www.nsdl.org
OAI (Open Archives Initiative)
www.openarchives.org
Virginia Tech Digital Library Research Laboratory (DLRL, www.dlib.vt.edu)
(5S, AmericanSouth.Org, CSTC, DL-in-a-box, ENVISION, ETANA, MARIAN, NDLTD, NSDL, OAD, ODL, …)
Questions? Discussion? :
Thank You!