Presentation Transcript

The Internet Archive: 

116 Sheridan Avenue
The Presidio of San Francisco
San Francisco, California 94129
www.archive.org

Our Mission: 

Information accessible to anyone from anywhere.
Information distributed across the globe in a network of regional digital libraries.
Global Access to All Intellectual and Creative Works

Our Grand Vision: 

The Internet Archive Proposition
If you contribute open content…
And if you agree to sharing, use, and reuse…
We will provide you with…
Unlimited Storage
Unlimited Bandwidth
Forever
For Free

What is in the Archive?: 

Web Pages: 40 Billion Pages / 500 Terabytes / 47 Million Sites
Films & Videos: 8,000 Moving Images
Music & Spoken Word: 19,000 Concerts / 1,000 Lectures & Readings
Books & Texts: 27,000 Titles
Software: 10,000 Programs

Web Archive: 

Wayback Machine: Snapshot Every Two Months / 2.5 Billion Pages per Crawl / 15 Terabytes per month
Recall: Full Text Search / 11 Billion Pages
The Internet Library: Archived URLs Organized by Subject
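
The Wayback Machine figures on this slide can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 terabyte = 10^12 bytes) and that the 15 terabytes per month accumulate evenly over the two-month snapshot cycle. Neither assumption is stated in the slides.

```python
# Figures taken from the slide
PAGES_PER_CRAWL = 2.5e9      # 2.5 billion pages per crawl
TB_PER_MONTH = 15            # 15 terabytes per month
MONTHS_PER_SNAPSHOT = 2      # one snapshot every two months

# Assumption (not from the slides): decimal units, 1 TB = 10**12 bytes
BYTES_PER_TB = 10**12

# Data gathered in one two-month crawl
tb_per_crawl = TB_PER_MONTH * MONTHS_PER_SNAPSHOT

# Average stored size of one page, in bytes
avg_bytes_per_page = tb_per_crawl * BYTES_PER_TB / PAGES_PER_CRAWL

print(f"Data per crawl: {tb_per_crawl} TB")
print(f"Average stored size per page: {avg_bytes_per_page / 1000:.0f} kB")
```

Under these assumptions one crawl comes to 30 terabytes, or roughly 12 kilobytes per archived page.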

Moving Images Archive: 

Films: Prelinger Collection & Feature Films
Videos: Animation, News, Games, Lectures
Television: Technology, World Events, Interviews
Open Source Collection: Individual Contributions

Audio Archive: 

Music: Live Concerts, Net Labels, Classical
Spoken Word: Presidential Recordings, Lectures, Poetry
Radio Programs: News, Public Affairs, Politics
Open Source Collection: Individual Contributions

Books & Texts Archive: 

The Million Book Project: Partnered with Carnegie Mellon University
International Children’s Digital Library: 25 Languages / 45 Countries
Project Gutenberg: Freely Downloadable Public Domain Books
Internet Bookmobile: Egypt, India, Uganda, USA

Software Archive: 

Machinima: Animations Using Video Game Engines
Speed Runs: Record Breaking Game Play Movies
Software Electronic Press Kits: Background on Major Software Releases
Classic Software Preservation: Digital Game Archives

Internet Archive Site Map: 

MOVING IMAGES: Prelinger Film Archives, Computer Chronicles, SIGGRAPH Theater, Net Café, World at War, Open Source Movies, Feature Films, MSRI Math Lectures, Open Mind, Shaping San Francisco, Brick Films, Mosaic Middle East News, Guerrilla News Network, Game Videos, Machinima, Speed Runs, Videogame Previews, Software Videos, Skill Replays, Classic Software Preservation, Election 2004, Independent News, Media Arts, Youth Media, Listen Up, Youth Sounds, Chat the Planet

Internet Archive Site Map: 

THE WEB: Wayback Machine, Recall Full Text Search
TEXTS & BOOKS: Million Books Project, Children’s Library, Project Gutenberg, Arpanet, Open Source Books, Dance Manuals, Internet Bookmobile
AUDIO: Live Music Archive, Net Labels, Presidential Recordings, Democracy Now, Other Minds, Conference Proceedings, Naropa Audio Archives, Gender Talk, Open Source Audio, Blues & Country, Electronic & Experimental, Hip Hop & Rock, Indie & Jazz, Spoken Word

The Television Archive: 

20 Global Television Networks
24 hours a day, 7 days a week
4 Languages: English, Russian, Japanese, Arabic
20 terabytes per month
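
As a rough sanity check on these figures, the implied average recording bitrate per network can be estimated. This is only a sketch: the 30-day month, the decimal terabyte, and the even split of storage across all 20 networks are assumptions, not facts from the slides.

```python
# Figures taken from the slide
NETWORKS = 20                # 20 global television networks, recorded 24/7
TB_PER_MONTH = 20            # 20 terabytes per month in total

# Assumptions (not from the slides)
BYTES_PER_TB = 10**12        # decimal terabyte
DAYS_PER_MONTH = 30          # nominal month length
SECONDS_PER_DAY = 24 * 60 * 60

# Total seconds of television recorded per month across all networks
channel_seconds = NETWORKS * DAYS_PER_MONTH * SECONDS_PER_DAY

# Average storage rate per network, converted from bytes/s to megabits/s
bytes_per_second = TB_PER_MONTH * BYTES_PER_TB / channel_seconds
megabits_per_second = bytes_per_second * 8 / 1e6

print(f"Average bitrate per network: {megabits_per_second:.2f} Mbit/s")
```

With these assumptions the average works out to about 3 megabits per second per network.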

Internet Archive Process: 

Acquire Content
If Analog, Digitize and Encode
If Audio/Video, Create Derivatives
Create XML Metadata
Update Search Engine
Curate Individual Items
Create Backups
Enable Web Access
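
The process on this slide has two conditional steps, which a short sketch can make explicit. The `Item` class, its field names, and the `plan_steps` function are hypothetical names introduced for illustration; they do not describe the Archive's actual software.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical description of a contributed item."""
    title: str
    is_analog: bool = False
    is_audio_or_video: bool = False


def plan_steps(item: Item) -> list[str]:
    """Return the slide's process steps that apply to this item, in order."""
    steps = ["Acquire Content"]
    if item.is_analog:
        # Only analog material needs digitizing and encoding
        steps.append("Digitize and Encode")
    if item.is_audio_or_video:
        # Only audio/video items get derivative formats
        steps.append("Create Derivatives")
    steps += [
        "Create XML Metadata",
        "Update Search Engine",
        "Curate Individual Items",
        "Create Backups",
        "Enable Web Access",
    ]
    return steps


film = Item("16mm film reel", is_analog=True, is_audio_or_video=True)
text = Item("born-digital text")
for item in (film, text):
    print(item.title)
    for number, step in enumerate(plan_steps(item), start=1):
        print(f"  {number}. {step}")
```

In this sketch an analog film passes through all eight steps, while a born-digital text skips the two conditional ones.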

Technology – Data Acquisition: 

Web Pages: Heritrix Web Crawler (IA Developed)
Book and Text Scanning: Kirtas APT BookScan 1200
Film & Video Digitizing: Multiple Formats & High Capacity
Contribution Engine: Automatic Format Deriver

Technology - Storage: 

Petabox Scalable Data Repository
One Million Gigabytes
High Density / Low Power
Remote Management
Geographic Redundancy: San Francisco, Amsterdam, Alexandria, Asia (2005)

Technology - Access: 

10 million hits per day
60,000 unique visitors / day
135,000 files downloaded / day
1.5 gigabits/sec as of Q4 2004
XML-based search engine

Internet Archive Partners: 

National Libraries and Archives:
Library of Alexandria, Egypt
Canadian National Library
French National Library
National Archives, UK
Library of Congress, USA

Internet Archive Partners: 

Universities:
Ars Digita University - Computer Science
Carnegie Mellon University - Million Books Project
MIT - Open Courseware
Naropa University - Poetry
Northwestern University - SCOTUS
Rice University - Connexions
University of Maryland - Children’s Digital Library
University of Toronto - Canadiana Archive
University of Virginia - Miller Center Public Affairs

Internet Archive Partners: 

Specific Content Providers:
ACF Newsource - Radio Program Archives
EOGEO - NASA LandSat Project
United Nations - UN Environment Program
Link TV - Mosaic Middle Eastern News
MSRI - Math Sciences Research Institute
Tucows - Software Archives

Internet Archive in Numbers: 

Web Sites: 40 Billion Pages, 500 Terabytes
New Web Crawl Every 2 Months = + 30 TB
Collections (Video, Audio, Texts): 55,000 Unique Items, 120 Terabytes
Storage Costs: $1,500 / Terabyte
PetaBox = $1.5 Million
Site Activity = 10 million hits / day
60,000 Visitors & 135,000 Downloads / day
Typical Bandwidth Usage = 750 Megabits / second
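
Several of these numbers can be reproduced from figures given elsewhere in the presentation. The sketch below assumes decimal units (1,000 gigabytes per terabyte, 10^12 bytes per terabyte) and a bandwidth that stays at the typical 750 megabits per second for a full day; both are assumptions made for illustration.

```python
# Figures taken from the slides
COST_PER_TB_USD = 1_500          # storage cost per terabyte
PETABOX_GB = 1_000_000           # "One Million Gigabytes"
TYPICAL_MBIT_PER_S = 750         # typical bandwidth usage
ITEM_COUNTS = {                  # from "What is in the Archive?"
    "moving images": 8_000,
    "concerts": 19_000,
    "lectures & readings": 1_000,
    "book & text titles": 27_000,
}

# Assumptions (not from the slides): decimal units
GB_PER_TB = 1_000
BYTES_PER_TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

# PetaBox capacity in terabytes and its cost at the quoted price
petabox_tb = PETABOX_GB // GB_PER_TB
petabox_cost_usd = petabox_tb * COST_PER_TB_USD

# Video, audio, and text items combined
total_items = sum(ITEM_COUNTS.values())

# Data served per day if bandwidth holds at the typical rate
bytes_per_day = TYPICAL_MBIT_PER_S * 1e6 / 8 * SECONDS_PER_DAY
tb_served_per_day = bytes_per_day / BYTES_PER_TB

print(f"PetaBox: {petabox_tb:,} TB at ${petabox_cost_usd:,}")
print(f"Video, audio, and text items: {total_items:,}")
print(f"Served per day at 750 Mbit/s: {tb_served_per_day:.1f} TB")
```

The PetaBox cost and the 55,000 item count both agree with the slide; the daily transfer volume of about 8 terabytes is a derived estimate rather than a reported figure.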

Internet Archive Awards: 

Computerworld Smithsonian Laureate Award - 2000
PC World Best of the Web - 2002
Yahoo Internet Life Site of the Year - 2002
Digital Archives Annual Award - 2002
PC Magazine Top 100 Classic Sites - 2004

Internet Archive: 

Wayback Machine
Moving Images
Books & Texts
Music & Spoken Word
Classic Software
www.archive.org
stewart@archive.org