mla00

Throwing the Book at Digital Libraries: What’s around the corner?: 

Douglas W. Oard
College of Library and Information Services
University of Maryland

Outline: 

What is a “Digital Library?”
Foreign languages
Searching audio and video
Getting beyond topicality
Finding things that aren’t there?

What is a “Digital Library?”: 

A library with digital devices?
  OPACs, CDROMs, online search services, ...
A library with digital content?
  Programs, data files, digitized media, ...
Digital content organized like a library?
  Collection policy, cataloging, access, preservation

Advantages of Digital Objects: 

Perfect reproduction
Inexpensive and rapid distribution
Compact storage
Easily searched

Problems with Digital Objects: 

Display technology is often inadequate
Some traditional cues are missing
  Shiny new book, dog-eared pages, …
Inversion of acquisition and cataloging costs
Conversion of existing objects is expensive
Long-term access is not assured

Unscrambling the Acronym Soup: 

Create the technology
  NSF Digital Library Initiative I
Digitize the content
  NDL, NAIL, BLS, performing arts library, ...
Develop the process
  NSF Digital Library Initiative II
Build the systems
  Industry

Multilingual Information Access: 

Allow anyone to find information that is expressed in any language

Widely Spoken Languages : 

Source: http://www.g11n.com/faq.html

Slide9: 

Source: James Crawford, http://ourworld.compuserve.com/homepages/JWCRAWFORD/can-pop.htm

Web Content: 

Source: Network Wizards, Jan 99 Internet Domain Survey

European Union Web Sites: 

Source: European Commission, Evolution of the Internet and the World Wide Web in Europe

Motivations: 

Societal benefits
  Information exchange to improve understanding
Economic benefits
  Information to provide competitive advantage
Crisis response
  Language differences can produce costly delays

System Design: 

Diagram labels: Query Interface, Search Engine, Selection Interface; Query, Retrieved Set, Documents

Query Entry: 

Query in English: Swiss bank
[Search]

Query Refinement: 

Query in English: Swiss bank
Click on a box to remove a possible translation:
bank:
  Bankgebäude ( )
  bankverbindung (bank account, correspondent)
  bank (bench, settle)
  damm (causeway, dam, embankment)
  ufer (shore, strand, waterside)
  wall (parapet, rampart)
[Search] [Continue]
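The refinement step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system shown in the talk: the dictionary holds only the candidate translations listed on the slide, the function names `candidate_translations` and `refine` are hypothetical, and passing unknown terms such as "Swiss" through untranslated is an assumption.

```python
# Toy bilingual dictionary: only the candidates shown on the slide.
# Each value lists the English glosses displayed next to the candidate.
DICTIONARY = {
    "bank": {
        "Bankgebäude": [],
        "bankverbindung": ["bank account", "correspondent"],
        "bank": ["bench", "settle"],
        "damm": ["causeway", "dam", "embankment"],
        "ufer": ["shore", "strand", "waterside"],
        "wall": ["parapet", "rampart"],
    },
}


def candidate_translations(query, dictionary=DICTIONARY):
    """Map each query term to its candidate translations.

    Terms missing from the dictionary are kept as-is (an assumption).
    """
    return {
        term: list(dictionary.get(term.lower(), {term: []}))
        for term in query.split()
    }


def refine(candidates, removed):
    """Drop the translations the user clicked to remove."""
    return {
        term: [t for t in translations if t not in removed]
        for term, translations in candidates.items()
    }


candidates = candidate_translations("Swiss bank")
# Suppose the user removes the senses unrelated to finance.
kept = refine(candidates, removed={"damm", "ufer", "wall"})
print(kept)
```

Showing the English glosses next to each candidate is what lets a searcher who cannot read German still prune the translation set before the search runs.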

Browsing Search Results: 

Query in English: Swiss bank [Search]
English
  1 (0.72) Swiss Bankers Criticized, AP / June 14, 1997
  2 (0.48) Bank Director Resigns, AP / July 24, 1997
German (Swiss) (Bankgebäude, bankverbindung, bank)
  1 (0.91) U.S. Senator Warpathing, NZZ / June 14, 1997
  2 (0.57) [Bankensecret] Law Change, SDA / August 22, 1997
  3 (0.36) Banks Pressure Existent, NZZ / May 3, 1997

Some Important Issues: 

Understanding the full range of user needs
Design / test a variety of user interfaces
How will we use the documents we find?
Reducing development costs

Searching Audio and Video: 

Try out http://speechbot.research.compaq.com
Almost 2000 Internet-accessible Radio and Television Stations
Source: www.real.com, Feb 2000

Historical Audio Collections: 

30,000 hours in the Maryland Libraries
  National Association of Educational Broadcasters
  Arthur Godfrey Collection
Over 100,000 hours in the National Archives
With new material arriving at an increasing rate

System Design: 

Diagram labels: Query Formulation Detection and Ranking Delivery Selection Interactive Examination Index Feature Extraction

Slide22: 

HotBot Audio Search Results

BBN Radio News Retrieval: 


MIT “Speech Skimmer”: 


CMU Television News Retrieval: 


The Maryland VoiceGraph Project: 

Exploring rich queries
  Content-based, speaker-based, structure-based
Multiple cues to support selection
  Turn-taking, gender, query terms
Flexible examination
  Text transcript, audio skims

Johns Hopkins Summer Workshop: 

Speech to Speech Translation
Cross-Language Audio Browsing
Cross-Language Audio Search
Diagram labels: English Query, English Audio, Select, Examine, MEI

Summer Workshop Team: 

Taiwan Academy of Sciences (1)
Chinese University of Hong Kong (2)
Johns Hopkins University (1)
National Taiwan University (1)
Princeton University (1)
U.S. Government (2)
University of Maryland (3)

Recommender Systems: 

Exploit ratings from other users
  Personal recommendations, peer review, …
Reaches beyond topicality to:
  Accuracy, coherence, depth, novelty, style, …
Applies equally well to audio, video, …
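One common way to exploit ratings from other users is to predict a missing rating as a similarity-weighted average of the ratings given by people with similar taste. The sketch below is only an illustration of that idea, not the method behind any system named in the talk: the users, items, ratings, and the choice of cosine similarity are all hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings on a 1-5 scale: user -> {item: rating}.
RATINGS = {
    "ann":  {"doc1": 5, "doc2": 1, "doc3": 4},
    "bob":  {"doc1": 4, "doc2": 2, "doc3": 5, "doc4": 5},
    "cara": {"doc1": 1, "doc2": 5, "doc4": 1},
}


def similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(user, item, ratings=RATINGS):
    """Similarity-weighted average of other users' ratings for the item.

    Returns None when nobody else has rated the item.
    """
    weighted, total = 0.0, 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = similarity(ratings[user], their)
        weighted += weight * their[item]
        total += weight
    return weighted / total if total else None


prediction = predict("ann", "doc4")
print(round(prediction, 2))
```

Because the prediction uses only ratings, never the content of the item, the same code applies unchanged to audio or video. Cosine similarity on raw positive ratings never goes negative, so a variant that centers each user's ratings on their mean would be needed to exploit disagreement between users.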

Using Shared Ratings: 

Credit: Jon Herlocker, ASIS 99

Some Things We (Sort of) Know: 

Popularity provides a good starting point
People prefer to know who gave the ratings
Negative information can be useful
  “I hate everything my parents like”
People don’t like to provide ratings!

The Problem With Self-Interest: 

Chart labels: Number of Ratings (None to Lots); Value of ratings; Value to the User; Value to the Community; Cost

Sources of Implicit Ratings: 


Reading Times Predict Ratings: 


Some Limiting Factors: 

The cost of distributing the ratings
Privacy of the raters and those who use ratings
The desire to protect a competitive advantage

The Next Steps?: 

Multidocument summarization
Question answering
Text mining

For More Information: 

Cross-language retrieval
  http://www.clis.umd.edu/dlrg/clir
Speech-based retrieval
  http://www.clis.umd.edu/dlrg/speech
Recommender systems
  http://www.clis.umd.edu/dlrg/filter
My perspective
  http://www.glue.umd.edu/~oard