U.S. Government Use of the OAI-PMH: 

Michael L. Nelson
Old Dominion University
Norfolk, Virginia, USA
mln@cs.odu.edu
http://www.cs.odu.edu/~mln/
ISTEC / NSF Ibero-American Digital Library Joint Project Development Symposium
Campinas, Brazil - March 20, 2003

Acknowledgements: 

ODU: K. Maly, M. Zubair, J. Bollen, X. Liu
LANL: R. Luce, X. Liu
NASA: G. Roncaglia, J. Rocker
MAGiC (UK): Paul Needham

Outline: 

Review of data provider / service provider model
  including “aggregators”
Role of registration for repositories
NASA projects
OSTI demo project
Technical Report Interchange (TRI)
  NASA, DOE, DOD

Disclaimer: Scientific and Technical Information (STI): 

This talk will cover US Government focused / sponsored STI only
This talk will not cover American Memory
  a cultural history project from the Library of Congress (LoC)
  http://memory.loc.gov/
  the LoC played a significant role in the definition and early adoption of the OAI-PMH

Acronym Review: 

NASA: CASI (Center for AeroSpace Information) http://www.sti.nasa.gov/
Department of Energy: OSTI (Office of Scientific and Technical Information) http://www.osti.gov/
Department of Defense: DTIC (Defense Technical Information Center) http://www.dtic.mil/

LaRC = Langley Research Center
LANL = Los Alamos National Laboratory
Sandia = Sandia National Laboratory
AFRL = Air Force Research Laboratory

Data Providers / Service Providers: 

Data Providers / Service Providers

Aggregators: 

Diagram labels: data providers (repositories), aggregator, service providers (harvesters)
aggregators allow for:
  scalability for OAI-PMH
  load balancing
  community building
  discovery
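The aggregator role can be sketched in a few lines of Python: it harvests like a service provider and is harvestable like a data provider. This is only an illustration under assumptions; the class names, the `list_records` method (a stand-in for an OAI-PMH ListRecords response), and the record identifiers are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """A data provider: holds records keyed by a (hypothetical) identifier."""
    name: str
    records: dict = field(default_factory=dict)

    def list_records(self):
        # Stand-in for an OAI-PMH ListRecords response.
        return list(self.records.items())


@dataclass
class Aggregator(Repository):
    """Harvests from upstream repositories and is itself harvestable."""
    upstream: list = field(default_factory=list)

    def harvest(self):
        # Copy every upstream record into the aggregator's own store.
        for repo in self.upstream:
            for identifier, metadata in repo.list_records():
                self.records[identifier] = metadata


# Two data providers, one aggregator in front of them.
ltrs = Repository("LTRS", {"oai:ltrs:1": "Report A"})
naca = Repository("NACA", {"oai:naca:1": "Report B"})
aggregator = Aggregator("aggregator", upstream=[ltrs, naca])
aggregator.harvest()

# A service provider now needs to contact only the aggregator.
harvested_ids = sorted(identifier for identifier, _ in aggregator.list_records())
```

Because an aggregator exposes the same interface as a repository, aggregators can be stacked, which is one way to read the scalability and load-balancing points above.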

Aggregators: 

Frequently interchangeable terms:
  aggregators: likely to be community / institutionally focused
  caches: store a copy, less likely to be community-oriented
  proxies: less likely to store a copy, may gateway between OAI-PMH and other protocols
    Dienst / OAI Gateway; Harrison, Nelson, Zubair, JCDL 03
To learn more about aggregators, caches & proxies:
  http://www.openarchives.org/OAI/2.0/guidelines-aggregator.htm
  http://www.cs.odu.edu/~mln/jcdl02/

Example Aggregators: 

Arc - http://arc.cs.odu.edu/
  first described “hierarchical harvesting” in D-Lib Magazine, 7(4) 2001
  http://www.dlib.org/dlib/april01/liu/04liu.html
Celestial - http://celestial.eprints.org/
  among other services, it provides a history of harvests (successful vs. errors)
  http://celestial.eprints.org/cgi-bin/status

OAI-PMH 2.0 Registration: 

Data Providers: http://www.openarchives.org/Register/BrowseSites.pl
Service Providers: http://www.openarchives.org/service/listproviders.html
75 repositories registered
??? unregistered repositories
  unregistered because:
    testing / development
    not for public harvesting
    public, but “low-profile”
    never got around to it…
??? DP:SP ~= 5:1

Registration is Nice… …But Not Required: 

OAI-PMH is (becoming) the “http” for digital libraries
  there is no central registry of http servers
  remember the NCSA “What’s New” page? (ca. 1994)
There will never be “registration support” in OAI-PMH
  registries are a type of service provider, built on top of OAI-PMH
  registration will be an integral part of community building
    friends…

<friends>: 

A lightweight, optional, DP-centric method to communicate the existence of “others”
http://techreports.larc.nasa.gov/ltrs/oai2.0/?verb=Identify
..
<description>
  <friends ..namespace stuff..>
    <baseURL>http://naca.larc.nasa.gov/oai2.0</baseURL>
    <baseURL>http://ntrs.nasa.gov/oai2.0</baseURL>
    <baseURL>http://horus.riacs.edu/perl/oai/</baseURL>
    <baseURL>http://ston.jsc.nasa.gov/collections/TRS/oai/</baseURL>
  </friends>
</description>
..
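A harvester could discover these neighbors by reading the friends container out of an Identify response. The Python sketch below parses a trimmed, hand-written sample response rather than contacting the server; the sample document and the function name `friend_base_urls` are illustrative assumptions, and the namespace URIs are the OAI-PMH 2.0 and friends namespaces as I understand them from the OAI guidelines.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
FRIENDS_NS = "http://www.openarchives.org/OAI/2.0/friends/"

# Hand-written, trimmed stand-in for an Identify response (assumption).
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Sample repository</repositoryName>
    <description>
      <friends xmlns="http://www.openarchives.org/OAI/2.0/friends/">
        <baseURL>http://naca.larc.nasa.gov/oai2.0</baseURL>
        <baseURL>http://ntrs.nasa.gov/oai2.0</baseURL>
        <baseURL>http://horus.riacs.edu/perl/oai/</baseURL>
        <baseURL>http://ston.jsc.nasa.gov/collections/TRS/oai/</baseURL>
      </friends>
    </description>
  </Identify>
</OAI-PMH>"""


def friend_base_urls(identify_xml):
    """Return the baseURLs listed in the <friends> description, if any."""
    root = ET.fromstring(identify_xml)
    path = (
        f".//{{{OAI_NS}}}description"
        f"/{{{FRIENDS_NS}}}friends"
        f"/{{{FRIENDS_NS}}}baseURL"
    )
    return [element.text.strip() for element in root.findall(path)]


friends = friend_base_urls(SAMPLE_IDENTIFY)
```

Since the container is optional, a response without it simply yields an empty list, and the harvester carries on with the repositories it already knows.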

NASA <friends> example: 

NASA <friends> example

Langley Technical Report Server: 

publicly available
began as an anonymous ftp server in 1992; http access in 1993
model for other technical report servers at other NASA centers
details in NASA TM-109162
mostly LaTeX, MS Word, other systems
  some scanned reports
http://techreports.larc.nasa.gov/ltrs/
http://techreports.larc.nasa.gov/ltrs/oai2.0/

NACA Technical Report Server: 

publicly available
began in 1996
details in NASA TM-1999-209127
scanned reports from 1917-1958
  NACA = predecessor to NASA
contents mirrored with the MAGiC project
  a UK-based grey-literature preservation project
  OAI-PMH used to mirror contents
http://naca.larc.nasa.gov/
http://naca.larc.nasa.gov/oai2.0/
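The talk does not say how the mirror is kept current. One plausible approach is selective harvesting with the standard ListRecords arguments, sketched below in Python. The helper name `list_records_url`, the `from` date, and the resumption token value are illustrative assumptions; only the request arguments (`verb`, `metadataPrefix`, `from`, `resumptionToken`) come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: records with a datestamp on or after this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


NACA_BASE = "http://naca.larc.nasa.gov/oai2.0/"

# First request of an incremental harvest (the date is made up).
first_request = list_records_url(NACA_BASE, from_date="2003-03-01")

# Follow-up request for the next chunk (the token value is made up).
next_request = list_records_url(NACA_BASE, resumption_token="abc123")
```

A mirror would repeat the follow-up request until the response no longer carries a resumption token, then remember the harvest date for the next run.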

Slide16: 

NACA Report 1345 as seen through its native DL http://naca.larc.nasa.gov/

Slide17: 

NACA Report 1345 as seen through MAGiC http://www.magic.ac.uk/

Slide18: 

NACA Report 1345 as seen through Scirus (Elsevier) http://www.scirus.com/

Slide19: 

NACA Report 1345 as seen through my.OAI (FS Consulting) http://www.myoai.com/

NTRS OAI Architecture: 

Diagram labels: user; NTRS; LTRS, ATRS, GTRS, CASITRS, . . .
Diagram annotations:
  search for “cfd applications”
  local copy of metadata
  metadata harvested offline, through OAI interface
  each node independently maintained
  individual nodes can still support direct user interaction
  all searching, browsing, etc. performed on the metadata here
  content (reports) remain archived at the local sites
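The split between offline harvesting and local searching can be sketched as follows. This is a toy model under assumptions: the node contents, titles, and URLs are invented, and a simple "all terms in title" match stands in for whatever search engine NTRS actually uses.

```python
# Invented node contents; each record keeps a URL that points back to its node.
NODES = {
    "LTRS": [{"title": "CFD Applications in Wing Design",
              "url": "http://ltrs.example/r1.pdf"}],
    "GTRS": [{"title": "Turbine Blade Cooling",
              "url": "http://gtrs.example/r2.pdf"}],
    "ATRS": [{"title": "Applications of CFD to Rotorcraft",
              "url": "http://atrs.example/r3.pdf"}],
}


def harvest(nodes):
    """Offline step: copy metadata from every node into one local index."""
    index = []
    for node_name, records in nodes.items():
        for record in records:
            index.append({"node": node_name, **record})
    return index


def search(index, query):
    """Online step: match query terms against the local metadata only."""
    terms = query.lower().split()
    return [record for record in index
            if all(term in record["title"].lower() for term in terms)]


local_index = harvest(NODES)
hits = search(local_index, "cfd applications")
hit_nodes = [hit["node"] for hit in hits]
```

No node is contacted at query time; the user only follows a hit's URL back to the node when retrieving the report itself.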

NASA Technical Report Server: 

(nearly) publicly available
replacement for the current distributed searching version of NTRS
  MySQL
  Va Tech harvester
  modified “bucket”
  details in Nelson, Rocker, Harrison, Library Hi-Tech, 21(2) (March 2003)
a service provider & aggregator
  same OAI baseURL as used for interactive searching
http://ntrs.nasa.gov/

NASA Technical Report Server: 

advanced, fielded search
explicit query routing
10 NASA repositories
4 non-NASA repositories
  turned “off” by default
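Explicit query routing with default-off repositories might look like the Python sketch below. The repository table is a hypothetical subset: the two `external-*` names are placeholders for the non-NASA repositories, which the talk does not name, and `route_query` is an invented helper.

```python
# Hypothetical subset of the repository table: name -> searched by default?
REPOSITORIES = {
    "LTRS": True,
    "ATRS": True,
    "GTRS": True,
    "external-1": False,  # placeholder for a non-NASA repository
    "external-2": False,  # placeholder for a non-NASA repository
}


def route_query(selected=None):
    """Return the repositories a query should be sent to.

    With no explicit selection, only default-on repositories are searched.
    """
    if selected is None:
        return [name for name, default_on in REPOSITORIES.items() if default_on]
    unknown = [name for name in selected if name not in REPOSITORIES]
    if unknown:
        raise ValueError(f"unknown repositories: {unknown}")
    return list(selected)


default_targets = route_query()
explicit_targets = route_query(["LTRS", "external-1"])
```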

Slide23: 

non-NASA repositories > 0.5M records

NASA DLs in the Larger STI Realm: 

Diagram labels: NTRS, DOE, DOD, Universities, Publishers, International, . . .
NTRS could also be a data provider from the point of view of other DLs, allowing the harvesting of NASA report metadata.
NTRS could also harvest metadata from other DLs, and provide access to non-NASA content.
We hope to influence the direction of the science.gov effort to use OAI-PMH
this could be a fully connected graph

OSTI Energy Citations Database: 

OAI-PMH support just recently added (Feb 2003)
  not yet officially announced
20k records, 8k full-text
other OSTI collections planned
http://www.osti.gov/energycitations/

Technical Report Interchange: 

Goal: share technical reports between 4 US government labs without creating new digital libraries for users to learn!
  NASA Langley Research Center
  Air Force Research Laboratory
  Los Alamos National Laboratory (DOE)
  Sandia National Laboratory (DOE)
Solution: use cooperating OAI-PMH caches at each site to
  export local contents
  ingest remote contents
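The export / ingest pairing can be modeled with a small Python class. This is a sketch under assumptions, not the TRI implementation: the class `TriCache`, its methods, and the record identifiers are invented, and the choice to export only locally produced records (never re-exporting ingested ones) is my own simplification.

```python
class TriCache:
    """Toy model of one site's cache: exports local records, ingests remote ones."""

    def __init__(self, site):
        self.site = site
        self.local = {}    # records produced at this site
        self.remote = {}   # records ingested from other sites

    def add_local(self, identifier, title):
        self.local[identifier] = title

    def export(self):
        # Assumption: only local contents are exported, never ingested ones.
        return dict(self.local)

    def ingest(self, other):
        for identifier, title in other.export().items():
            self.remote[identifier] = (other.site, title)

    def all_titles(self):
        # What a user of this site's existing digital library would see.
        titles = list(self.local.values())
        titles.extend(title for _, title in self.remote.values())
        return sorted(titles)


sites = [TriCache(name) for name in ("LaRC", "AFRL", "LANL", "Sandia")]
for site in sites:
    site.add_local(f"oai:{site.site}:1", f"{site.site} report")

# Every site ingests from every other site.
for site in sites:
    for other in sites:
        if other is not site:
            site.ingest(other)
```

After the exchange each site holds all four reports in its own system, which is the point of the goal above: users keep the digital library they already know.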

TRI Production System - Status: 

Diagram labels: LaRC TRI System, LANL TRI System, Sandia TRI System, AFRL TRI System, ODU TRI System (Listener)
Diagram legend: Records coming in from other TRI systems; Records going out to other TRI systems; Proposed; In Production
Slide from M. Zubair, ODU

Mappings in TRI: 

Mappings in TRI Details in Liu, et al. ECDL 2002; the above table also taken from the same paper

A Single TRI Module: 

A Single TRI Module Slide from M. Zubair, ODU

The Future: Community Building: 

Ultimately, protocols and metadata formats are not what makes a difference
Rather, the critical mass afforded by a common set of utilities (cf. http, Dublin Core, XML)
The best current example:
  The Open Language Archives Community
  http://www.language-archives.org/
OAI-PMH provides the basis for communication between strangers, but allows even richer communication between friends

STI Communities: 

Government produced/sponsored STI
  http://ntrs.nasa.gov/
  http://www.osti.gov/energycitations/
  http://dlib.cs.odu.edu/tri/
Academia
  self-archiving vs. institutional archives
  http://www.soros.org/openaccess/
  http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm
Commercial publishers
  e.g. BioMed Central http://www.biomedcentral.com/