Project Gutenberg :Project Gutenberg Digital Library Case Study
by Kaylin Boehme
Introduction :Introduction http://www.gutenberg.org/
Intro, cont’d. :Intro, cont’d. Michael Hart, University of Illinois
$100 million worth of computer time on a Xerox Sigma V mainframe
“inspire the creation of eBooks and related technologies”
Over 250,000 eBooks available to download and use freely
Intro, cont’d. :Intro, cont’d. 100,000 eBooks downloaded daily
Top 100
Pride and Prejudice
Bram Stoker’s Dracula
Kama Sutra
Alice’s Adventures in Wonderland
War and Peace
Science, math, history
Goals and Scope :Goals and Scope Mission Statement: “to encourage the creation
and distribution of eBooks”
Goals and Scope, cont’d. :Goals and Scope, cont’d. Free of Charge
No cost
Not necessarily free to distribute or use how you please
Freedom of Use
Not necessarily free of cost
Can be used however one wishes
Goals and Scope, cont’d. :Goals and Scope, cont’d. Entirely run by volunteers
Non-profit organization
Principle of Minimal Regulation
Goals and Scope, cont’d. :Goals and Scope, cont’d. Replicator technology
Anything inputted to a computer can be reproduced indefinitely
Allows everyone on the network to have a copy of a digitized work
First computer virus
Selection Criteria :Selection Criteria “bang for the buck”
Selection Criteria, cont’d. :Selection Criteria, cont’d. 3 categories
Light literature (Alice’s Adventures in Wonderland)
Heavy literature (the Bible)
Reference (Roget’s Thesaurus)
Copyright :Copyright public domain and copyright permissions
Copyright, cont’d. :Copyright, cont’d. Project Gutenberg License
Readers’ rights
Restrictions on public domain and copyrighted books
Use of the Project Gutenberg trademark
Removing the license leaves simply a public domain work
Readers must contact the writer before distributing copyrighted materials
Metadata :Metadata “Project Gutenberg is not in the business of establishing standards.”
Metadata, cont’d. :Metadata, cont’d. Metadata stored separately as an RDF file
Fancy formatting (physical layout)
Start line of main text
Title and author
Location of title page and table of contents
Position of illustrations
Proper names to be capitalized
Metadata, cont’d. :Metadata, cont’d. Bibliographic record info
Converting from RDF to MARC: http://www.youtube.com/watch?v=1zHKIJ6D_dA
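A minimal sketch of the idea behind moving a record from RDF toward MARC. It assumes a small RDF/XML record that uses Dublin Core `dc:title` and `dc:creator` elements. The sample record, the function name `rdf_to_marc_like`, and the dictionary output are illustrative assumptions; this is not Project Gutenberg's actual catalog schema and not the MarcEdit process shown in the video. The only MARC facts relied on are that field 100 holds the main personal-name entry and field 245 holds the title statement.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record: the element layout is an assumption for
# illustration, not Project Gutenberg's actual catalog schema.
SAMPLE_RDF = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="urn:example:etext">
    <dc:title>Alice's Adventures in Wonderland</dc:title>
    <dc:creator>Carroll, Lewis</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def rdf_to_marc_like(rdf_text):
    """Return a dict keyed by MARC field tag for the first described work."""
    root = ET.fromstring(rdf_text)
    description = root.find("rdf:Description", NAMESPACES)
    if description is None:
        raise ValueError("no rdf:Description element found")
    title = description.findtext("dc:title", default="", namespaces=NAMESPACES)
    creator = description.findtext("dc:creator", default="", namespaces=NAMESPACES)
    # 100 = main entry (personal name), 245 = title statement.
    return {"100": creator.strip(), "245": title.strip()}


record = rdf_to_marc_like(SAMPLE_RDF)
print(record)
```

A real MARC record also carries indicators, subfield codes, and a leader, all of which this sketch leaves out.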
File Formats & Media Types :File Formats & Media Types Digitization for Project Gutenberg
File Formats & Media Types, cont’d. :File Formats & Media Types, cont’d. Plain vanilla ASCII text (ASCII art)
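A minimal sketch, assuming "plain vanilla ASCII" means every character has a code point of 127 or below, of how a text could be checked before submission. The function name `find_non_ascii` and the sample string are hypothetical and are not part of any Project Gutenberg tool.

```python
def find_non_ascii(text):
    """Return (line_number, character) pairs for every non-ASCII character."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for character in line:
            # ASCII covers code points 0 through 127 only.
            if ord(character) > 127:
                hits.append((line_number, character))
    return hits


sample = "Plain text line\nA caf\u00e9 in Paris\n"
print(find_non_ascii(sample))
```

An empty result means the text is already plain ASCII. Any pairs returned show which line holds a character that would need an ASCII substitute.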
File Formats & Media Types, cont’d. :File Formats & Media Types, cont’d. Open, editable formats
Other formats
HTML
Compressed/zipped
XML
File Formats & Media Types, cont’d. :File Formats & Media Types, cont’d. Media types
eBooks, eTexts
MIDI music, sounds
Images
Sheet music
Movies
Human genome
Interface Features :Interface Features Searching and Navigation
Interface Features, cont’d. :Interface Features, cont’d. Basic search
Advanced search
Browsing
Bookshelf by Topic
Search by category
Search by language
Search by author or title
Top 100 downloads
Read and bookmark from browser
Evaluations :Evaluations The good, the bad, and my own opinion
Evaluations, cont’d. :Evaluations, cont’d. Marie Lebert (positive)
“Nobody has done a better job of putting the world’s literature at everyone’s disposal… and to create a vast network of volunteers all over the world, without wasting people’s skills or energy.”
Evaluations, cont’d. :Evaluations, cont’d. Ronald P. Reck (negative)
“Lexical researchers can save considerable time and effort if community expectations change to reflect the need for accurate machine readable information depicting lexical resources.”
Evaluations, cont’d. :Evaluations, cont’d. My evaluation
Limited resources and manpower
3 million books downloaded monthly
Poor metadata standards are the result of a system run entirely by volunteers
Evaluations, cont’d. :Evaluations, cont’d. My evaluation, cont’d.
YouTube’s metadata and collection models are similar to Project Gutenberg’s
Bad ones get buried, good ones are promoted, no lack of resources
Citations (MLA) :Citations (MLA) “Company History.” YouTube. 2009. 7 November 2009. http://www.youtube.com/t/about
“Doing Multiple Metadata Translations in MarcEdit.” YouTube. 3 March 2009. 6 November 2009. http://www.youtube.com/watch?v=1zHKIJ6D_dA
Kestenbaum, David. “Spam Goes Literary.” NPR. 8 August 2006. 5 November 2009. http://www.npr.org/templates/story/story.php?storyId=5624749
Lebert, Marie. “Project Gutenberg, from 1971 to 2005.” Dossiers du NEF. 15 August 2005. 5 November 2009. http://www.etudes-francaises.net/dossiers/gutenberg_eng.htm#public%20domain
“Project Gutenberg Wiki.” Project Gutenberg. 2009. 5 November 2009. http://www.gutenberg.org/wiki/Main_Page
Individual pages consulted are noted in the slide notes where applicable.
Reck, Ronald P. “Metadata Cards for Describing Project Gutenberg Texts.” RReckTek. 4 October 2007. 5 November 2009. http://iama.rrecktek.com/rreck/Metadata_Cards_for_Describing_Project_Gutenberg_Texts.pdf