Computing - The Next 10 Years Infinite Memory and Bandwidth : Implications for Universal Access to Information: Computing - The Next 10 Years Infinite Memory and Bandwidth : Implications for Universal Access to Information Raj Reddy
Carnegie Mellon University
Pittsburgh, USA
April 6, 2001
Talk presented at Georgia Tech 10th Anniversary Convocation
Future Technology: Future Technology Computational power doubles every 18 months (Moore’s Law)
100-fold improvement every 10 years
Disk Densities double every 12 months
1000-fold improvement every 10 years
Optical bandwidth doubling every 9 months
10000-fold improvement every 10 years
Infinite Bandwidth and Memory before Computation
Cost decreasing, density increasing
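The ten-year figures on this slide follow directly from the doubling periods. A minimal sketch of the arithmetic, assuming steady exponential growth over the whole decade (the function name is illustrative, not from the talk):

```python
def growth_factor(doubling_months: float, years: float = 10.0) -> float:
    """Improvement factor after `years`, given one doubling every `doubling_months`."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    trends = [
        ("Computational power", 18),  # Moore's Law
        ("Disk density", 12),
        ("Optical bandwidth", 9),
    ]
    for name, months in trends:
        print(f"{name}: about {growth_factor(months):,.0f}-fold in 10 years")
```

The results land close to the rounded slide figures of 100-fold, 1000-fold and 10000-fold, which is why bandwidth and memory outrun computation.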
What does the future hold?: What does the future hold? We can see some glimpses of the future
Universities without walls,
Computers that never fail and self-healing software
Every home with giga PCs connected by gigabit networks
Access to all the published creative works of the world
anytime anywhere anyone
Emergence of the World Bank of, not money, but Knowledge
Systems, so-called geriatric robotics, that help the disabled lead normal lives, and
Systems that give the rest of us superhuman capabilities, like getting a month’s work done in a day
Universal Access to Information: Universal Access to Information Information at your fingertips
Access to all human knowledge:
Anyone
Anywhere
Anytime
All Human Knowledge Recorded Information: All Human Knowledge Recorded Information Books
Periodicals (journals, newspapers)
Music, opera, dance
Paintings, Sculptures and Monuments
Movies, video
Databases, software
Suppose all of this were on the Web
Examples from www.ulib.org: Examples from www.ulib.org Lecture: Michael Shamos on UL
Books: A Child’s History of England
Art: Greek Art
What is a book? What is a digital book?: What is a book? What is a digital book? Book | Digital book
Collection of static content | Collection of dynamic multimedia content
Linearly organised | Browsable, navigable
Selected by an author as related | Selected by the user as related
Occupying a single physical location | No physical existence
Physically bound between covers | Instantly transmittable
What is a Library?: What is a Library? Collection of items
Linearly organized (shelves)
Chosen by budget constraints
Occupying physical space
Cataloged for access
What is a Digital Library?: What is a Digital Library? Collection of digital items
(potentially huge)
Encompassing everything (someday)
Organized arbitrarily
Occupying no physical space
Fully content-searchable
Universal Library Implications: Universal Library Implications Elimination of time, space, cost constraints
Democratization of information
'Knowledge is power'
Hyperlinks to related information
Preservation and Dissemination of Knowledge
faster and wider
Backup preservation
Preservation of culture
Universal Library Implications: Universal Library Implications Research
Web of scholarly information, reviews
Teaching
Support for distance education
Academic publishing
Virtual museums
Interactivity
Universal Library Applications: Universal Library Applications Access to 'Born Digital' Information
The world produces a billion billion (10^18) bytes of information every year (Lyman and Varian)
90% is stored digitally
Digital museum
Digital tour guide
What’s in the Taj Mahal?
Universal Library Applications: Universal Library Applications Research assistant
What did Newton write about color?
What are Moslem views on race?
Teaching resource
'Act out' books in virtual reality
Real-time explanations
Business information
Data mining
We Can Store Everything: We Can Store Everything 1 book = 500 pp.
1MB uncompressed – 300KB compressed
10^8 to 3 x 10^8 books = ~10^14 bytes = 100 terabytes
Over 100 million computers on the Internet
At 1 GB each, >100 petabytes now
1 GB of disk costs ~$3
100 terabytes < $300 thousand to $1 million
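The storage estimate on this slide can be checked with a few lines of arithmetic. A minimal sketch using the slide's own figures (1 MB per uncompressed book, about $3 per GB of disk, 2001 prices); the function names are illustrative:

```python
GB = 10**9
TB = 10**12
PB = 10**15

BYTES_PER_BOOK = 10**6   # 1 MB uncompressed, per the slide
COST_PER_GB_USD = 3.0    # approximate 2001 disk price, per the slide


def library_size_bytes(num_books: int, bytes_per_book: int = BYTES_PER_BOOK) -> int:
    """Total bytes needed to hold `num_books` books."""
    return num_books * bytes_per_book


def disk_cost_usd(size_bytes: int, cost_per_gb: float = COST_PER_GB_USD) -> float:
    """Disk cost in dollars for `size_bytes` of storage."""
    return size_bytes / GB * cost_per_gb


if __name__ == "__main__":
    for books in (10**8, 3 * 10**8):
        size = library_size_bytes(books)
        print(f"{books:,} books: {size / TB:,.0f} TB, about ${disk_cost_usd(size):,.0f}")
    # Aggregate disk on the Internet: 100 million machines at 1 GB each
    print(f"Internet disk: {100 * 10**6 * GB / PB:,.0f} PB")
```

At the low end of the book count this gives 100 TB for about $300,000; at the high end, 300 TB for a little under $1 million, matching the range on the slide.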
Non-textual Material: Non-textual Material 1 Movie = 10 GB
1 petabyte = 100,000 movies
All the movies ever made!
Audio
1 petabyte = 3000 years of music
All music ever performed or recorded
Paintings and Photos @ 1 MB
1 petabyte = 1 billion paintings or photos
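The same arithmetic applies to the petabyte figures above. A minimal sketch; the audio bitrate is not given in the talk, so the code derives the rate implied by "3000 years per petabyte" rather than assuming one:

```python
PB = 10**15

MOVIE_BYTES = 10 * 10**9   # 10 GB per movie, per the slide
IMAGE_BYTES = 10**6        # 1 MB per painting or photo, per the slide
SECONDS_PER_YEAR = 365 * 24 * 3600


def items_per_petabyte(item_bytes: int) -> int:
    """How many items of a given size fit in one petabyte."""
    return PB // item_bytes


def implied_audio_kbps(years_per_petabyte: float) -> float:
    """Audio bitrate (kbit/s) implied by storing this many years in 1 PB."""
    bytes_per_second = PB / (years_per_petabyte * SECONDS_PER_YEAR)
    return bytes_per_second * 8 / 1000


if __name__ == "__main__":
    print(f"Movies per PB: {items_per_petabyte(MOVIE_BYTES):,}")
    print(f"Images per PB: {items_per_petabyte(IMAGE_BYTES):,}")
    print(f"Implied audio rate: {implied_audio_kbps(3000):.0f} kbit/s")
```

The implied audio rate comes out in the range of compressed audio, not uncompressed CD audio, so the 3000-year figure presumably assumes compression.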
Non-textual Material: Non-textual Material Gore’s Digital Earth
'A multi-resolution, three-dimensional representation of the planet, into which we can embed vast quantities of geo-referenced data.'
Area of Earth ≈ 1/2 peta m^2
1000 bytes/m^2 feasible
2 MB/m^2 not practical yet ⇒ 10^21 bytes = 1 zettabyte
{peta-, exa-, zetta-, yotta-}
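The Digital Earth totals follow from multiplying surface area by storage density. A minimal sketch, taking the slide's round figure of half a peta square metre for the Earth's area; the helper name is illustrative:

```python
EARTH_AREA_M2 = 5 * 10**14   # about 1/2 peta m^2, per the slide

# Largest prefix first, so the first match is the best fit.
PREFIXES = [
    ("yotta", 10**24),
    ("zetta", 10**21),
    ("exa", 10**18),
    ("peta", 10**15),
]


def scale_bytes(n_bytes: int):
    """Return (value, prefix) using the largest prefix that keeps value >= 1."""
    for name, scale in PREFIXES:
        if n_bytes >= scale:
            return (n_bytes / scale, name)
    return (float(n_bytes), "")


if __name__ == "__main__":
    for label, bytes_per_m2 in [("1000 bytes/m^2", 1000), ("2 MB/m^2", 2 * 10**6)]:
        value, prefix = scale_bytes(EARTH_AREA_M2 * bytes_per_m2)
        print(f"{label}: {value:g} {prefix}bytes")
```

At 1000 bytes per square metre the planet needs 500 petabytes (half an exabyte); at 2 MB per square metre it needs the full zettabyte quoted on the slide.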
Technological Challenges: Technological Challenges Input (scanning, digitizing, OCR)
Data representation
text, notations, images, web pages
Navigation and Search
Multilingual Issues
Output (voice, pictures, virtual reality)
Synthetic Documents
Universal Library Design: Universal Library Design Modular
Technology plug-ins (e.g. machine translation)
Distributed
Mirror sites
Multiple interfaces
Human (languages, cultures, literacy)
Machine
Universal Library Design: Universal Library Design Speech input/output
Pictorial output
Language support
Translation assistants
Summarization tools
Synthetic documents
Encyclopedia-on-demand
Input Issues: Input Issues Non-digital media
Conversion, scanning, correction
Triple keyboard, uncorrected OCR
Digital media
Formats, conversions, color representation
ASCII, HTML, SGML, XML, PDF, PS, TEX
JPEG, TIFF, GIF?
Input Issues: Input Issues Structured matter
Musical notation, Laban
Chemistry
3D Items
Resource allocation (what’s first?)
Duplication of effort (no registry)
Metadata: Metadata Data about an item not part of the item
Bibliographic
Format, medium, encoding, resolution
Provenance
Reliability, integrity
Permissions
Who generates metadata?
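One way to picture the metadata categories listed on this slide is as a record kept alongside each item. The following is a hypothetical sketch, not the Universal Library's actual schema: every field name and every example value other than the book title is an assumption made for illustration, and a SHA-256 digest stands in for the "integrity" entry.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemMetadata:
    # Bibliographic
    title: str
    author: str
    # Format, medium, encoding, resolution
    file_format: str
    medium: str
    encoding: Optional[str]
    resolution_dpi: Optional[int]
    # Provenance
    provenance: str
    # Reliability, integrity (here: a digest of the stored bytes)
    sha256: str
    # Permissions
    permissions: str
    # Who generated this record
    generated_by: str


def digest(content: bytes) -> str:
    """Hex SHA-256 digest of the item's bytes."""
    return hashlib.sha256(content).hexdigest()


def is_intact(content: bytes, meta: ItemMetadata) -> bool:
    """True if the content still matches the digest recorded in its metadata."""
    return digest(content) == meta.sha256


# Illustrative values only; the bytes are a stand-in for a scanned page.
content = b"placeholder page bytes"
record = ItemMetadata(
    title="A Child's History of England",
    author="Charles Dickens",
    file_format="TIFF",
    medium="scanned book",
    encoding=None,
    resolution_dpi=600,
    provenance="hypothetical scanning site",
    sha256=digest(content),
    permissions="public domain (assumed)",
    generated_by="scanning operator",
)
```

Keeping the digest in the metadata and not in the item itself matches the slide's definition: it is data about the item that is not part of the item.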
Navigation: Navigation Browsing, finding, searching, flying
Fractal view
Keys are granularity and connectivity
View whole collections or one glyph
Understanding structure of information Making Sense Of The World’s Knowledge
Searching Mathematics: Searching Mathematics
Multilingual Issues: Multilingual Issues Character sets
Representations
Íîäà ôèçè÷åñêè íàõîäèòñÿ â çäàíèè Èçâåñòèé
Нода физически находится в здании Известий ("The node is physically located in the Izvestia building")
Multilingual navigation
Translation assistance
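The pair of Russian lines on this slide shows what happens when a character set is misidentified. The talk does not name the encodings involved; the sketch below assumes the text was stored as Windows-1251 (a common Cyrillic encoding) and displayed as Latin-1, which reproduces the garbled line:

```python
original = "Нода физически находится в здании Известий"

# Store the Cyrillic text as Windows-1251 bytes, then misread them as Latin-1.
garbled = original.encode("cp1251").decode("latin-1")

# The damage is reversible as long as the bytes themselves were preserved.
repaired = garbled.encode("latin-1").decode("cp1251")

if __name__ == "__main__":
    print(garbled)
    print(repaired)
```

The bytes never change; only their interpretation does. This is why a digital library needs to record the encoding of every text it stores.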
Synthetic Documents: Synthetic Documents Documents derived automatically from retrieved information
Multilingual translation
Abstracts, summaries, glossaries
Encyclopedia-on-demand
Information Reliability: Information Reliability Existence ≠ validity
Universal Library Philosophy
Avoid value judgments
Provide information from which users (and programs) can assess validity
Source, reputation, recency, reviews, consistency
Scaling Problems: Scaling Problems Search services (e.g. Altavista) index >10^8 documents
Suppose there were 10^12?
How can a billion users access the same item at once?
Policy Challenges: Policy Challenges Use of copyrighted material
Economics (Who pays? Who gets?)
Privacy
Reliability of information
Change in the nature of teaching
Use Of © Content: Use Of © Content Philosophy: must pay for use
Authors, publishers will not suffer
Implied license
Automated permissions
Bulk licensing
Compulsory licensing
Owner CAN’T refuse; user MUST pay
Economics: Economics Flat-fee subscriptions (e.g. HBO)
Metered use (electric company)
Microcharge (Tobias 'clickl')
Free (paid by government)
Automated permissions
Use measured by technology
Operating Model: Operating Model Single portal for access to all information
Universal Library provides input, access, multilingual, output and synthesis tools
Universal Library will be a model scanning operation
Registry of digitized works
Operating Model: Operating Model Specialized collections curated by specialists, provided to Universal Library
Foreign collection performed in foreign countries
Universal Library will be mirrored in ~12 sites around the world
Universal Library Status: Universal Library Status >13,000 digital volumes
Art
Newspapers
Music, video
Portal to hundreds of other collections
Visit http://www.ulib.org
Projects: Projects Navigator
Academic electronic publishing
Electronic Union Catalog
Books out of copyright, books out of print
Software distribution
Conclusions and Recommendations: Conclusions and Recommendations Conclusions
Barely 10% of all public information is available on the Internet
Government needs to play a leadership role in developing digital libraries
Significant technical and operational challenges in migrating and maintaining holdings in digital form
Intellectual Property rights need to be addressed to facilitate the creation of, and access to, digital libraries
Recommendations
Support research: metadata, scalability, multiple languages, security, and usability
Create testbeds: million book project
Place all public governmental information online
Preserve IP rights of creators by creating tax incentives for public use of online copyrighted information