MyLifeBits: Attempting to realize the Memex Vision : MyLifeBits: Attempting to realize the Memex Vision Jim Gemmell & Roger Lueder
Gordon Bell
http://research.microsoft.com/barc/MediaPresence/MyLifeBits.aspx
Outline … MyLifeBits : Outline … MyLifeBits Background…fulfilling the Memex vision
Cyberizing everything
File to database transition
Use…beyond search
Long-term agenda and outlook
Memex Posited by Vannevar Bush in “As We May Think” The Atlantic Monthly, July 1945 : Memex Posited by Vannevar Bush in 'As We May Think' The Atlantic Monthly, July 1945 'A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility'
Supports: Annotations, links between documents, and 'trails' through the documents
'yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so that he can be profligate and enter material freely'
Sketch of memex : Sketch of memex
Bush’s camera on the head : Bush’s camera on the head
Capturing what you see : Capturing what you see
Memory Overload : Memory Overload As hard drives get bigger and cheaper, we're storing way too much.
By Jim Lewis
There's a famous allegory about a map of the world that grows in detail until every point in reality has its counterpoint on paper; the twist being that such a map is at once ideally accurate and entirely useless, since it's the same size as the thing it's meant to represent.
"The PC is going to be the place where you store the information and really the center of control" Billg 1/7/2001 : 'The PC is going to be the place where you store the information and really the center of control' Billg 1/7/2001 MyLifeBits is a project to 'cyberize' everything!
What? Recall of all articles, books, CDs, photos, video, communication (e.g. mail, phone), meetings, and web
Why? …'because we can'
Office: communicate, store, & work
Home & Media Center: ambiance & entertainment
Immortality for progeny. Memory aids
Goal: understand the 1 TByte PC for Longhorn: need, utility, cost, feasibility, and tools.
LifeLog: A potential research program : LifeLog: A potential research program LifeLog:
A (sub)system that captures, stores, and makes accessible the flow of one person’s experience in and interactions with the world
LifeLog Thrust:
Capture the 'story' of a human Living
Content
Ontology (format)
The End of the Line… Biographies, Sagas, Family Bibles, Home Movies, Photo Albums, Videos, Cave Paintings, Blogs, LifeLog
Knowledge worker scenarios : Knowledge worker scenarios Gordon: Researcher, consumer, computer system tester, nerd wanna-be, and average man
Melissa: middle manager
Patrick: Consultant
Nicholas: Analyst
Sondra: Office manager
The guinea pig : The guinea pig Gordon Bell is digitizing his life
Has now scanned virtually all:
Books written (and read when possible)
Personal documents (correspondence including memos and email, bills, legal documents, papers written, …)
Photos
Posters, paintings, photos of things (artifacts, …medals, plaques)
Home movies and videos
CD collection
And, of course, all PC files
Now recording: phone, radio, TV (movies), web pages… conversations and meetings to come
Paperless throughout 2002. 12' scanned, 12' discarded.
Only 30 GB!!!
I am data : I am data
Capture and encoding : Capture and encoding
Quindi conference capture : Quindi conference capture
I mean everything : I mean everything
Input: tools, time, and cost : Input: tools, time, and cost Scanners: HP Digital Sender, flat beds with ADF, 2-HP photo, faxing. (Duplex, color, feed-thru, etc.)
A good commercial scanner costs 2K-10K
Photos: $1 or 0.5-5 min. Large posters: ~ 1-5 hr. Artifacts: ~ 10 min. including photo
Scanning to TIF, PDF: <1 min/page or .10/page
OCR: for MODI or PDF: ~3-5 pages/min (old data)
OCR: to recreate an editable 'original' 10 min/page!
OCR (Volume paper files): 400 pages/hr. 7 ppm.
Books: scanned at CMU ($10 - 100/book) in 1997
Videos: tbd
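The per-page figures above lend themselves to a rough back-of-envelope estimate. The sketch below is illustrative only: the default rates are upper bounds read off the slide (under 1 min/page to scan, about 400 pages/hr for volume OCR), while the function name, the reading of '.10/page' as dollars, and the 10,000-page batch are assumptions made for the example.

```python
def estimate_scan_job(pages, scan_min_per_page=1.0, ocr_pages_per_hr=400,
                      cost_per_page=0.10):
    """Return (scan_hours, ocr_hours, cost) for a batch of pages.

    Defaults are upper bounds taken from the slide; cost_per_page assumes
    the '.10/page' figure is in dollars.
    """
    scan_hours = pages * scan_min_per_page / 60.0
    ocr_hours = pages / ocr_pages_per_hr
    cost = pages * cost_per_page
    return scan_hours, ocr_hours, cost


# Hypothetical batch of 10,000 pages of paper files.
scan_h, ocr_h, cost = estimate_scan_job(10_000)
print(f"scan: {scan_h:.0f} h, OCR: {ocr_h:.0f} h, cost: ${cost:,.0f}")
```

Even at these rates, scanning time rather than OCR time dominates, which fits the slide's emphasis on input tools.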
CyberAll Nov.1, 2001 : CyberAll Nov.1, 2001
Music: 6.9 GB, 1.8K files, 180 CDs
Working: 2.3 GB, 432 folders, 2.9K files
Archive: 5.1 GB, 477 folders, 18.7K files
Video: 2.6 GB, 10 hours, low res
My Books: 98 MB
Mail: .7 GB, 43K msgs
Total: 17.7 GB, 27.1K files & 42K .msg
[Charts: storage by size; files by number, by type: .xls, .jpg, .doc/html, .pdf, .ppt/ppt albums, .tif, .gif]
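The CyberAll figures can be cross-checked with a little arithmetic. The sketch below assumes the slide's numbers group into six categories (Music, Working, Archive, Video, My Books, Mail) and that 1 GB = 1000 MB; under that reading the categories sum to the 17.7 GB shown on the slide.

```python
# Sizes in GB as read off the CyberAll slide (98 MB written as 0.098 GB).
# The grouping of numbers into categories is an assumption.
sizes_gb = {
    "Music": 6.9,
    "Working": 2.3,
    "Archive": 5.1,
    "Video": 2.6,
    "My Books": 0.098,
    "Mail": 0.7,
}

total_gb = sum(sizes_gb.values())
print(f"total: {total_gb:.1f} GB")

# Share of the total per category, largest first.
for name, gb in sorted(sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<9} {gb:6.3f} GB  {100 * gb / total_gb:5.1f}%")
```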
gbell wag: 67 yr, 25Kday life : gbell wag: 67 yr, 25Kday life
MyLifeBits organization: time and space : MyLifeBits organization: time and space Timeline/
Context (space)
Personal
(some $s)
GB Co. (angel, etc.)
Professional
ACM, etc., …
@Microsoft.com,
New co’s.
Archival (time) Working
MyLifeBits: Some Lives(t) : MyLifeBits: Some Lives(t) Personal
Parents, children, grandkids
CGB himself
GKB
Close friends
GB $s
Personal incl. several legal structures
Properties: autos, real estate,
Investments & contracts
Past prof. companies/organiz’ns
DEC
Carnegie-Mellon U.
DEC, NSF, Encore, Ardent, Me Inc., CGB@ Microsoft
MLB
Clusters
Telepresence
WWW presence
Computer History Museum
BOD member
Fund-raising
CyberMuseum
Startups & boards
Bell-Mason Director
Diamond & Vanguard Brds.
Slide22 : [Figure: life timeline, 1900 to 2010 by decade. Rows: Where, Education, Work, ComputerMuseum, Books, Computers, Awards.]
Personal LifeLog Applications : Personal LifeLog Applications
[Chart axes: Application used by (Self, Others); Application controlled by (Self, Others)]
Applications: Conservator, Baby Book, Companion, Caretaker, Babysitter, Advisor, Mentor, Tutor, Autobiography, Photo Album, Personal Assistant, Diary/Journal, Biography, Financial Manager, Medical Manager, Executor, Obituary, Assistant for Elderly, Personal Proxy, Parole Officer, Pers Flight Recorder, Meeting Prep, Captain’s Log, Trustee
How LifeLog Fits : How LifeLog Fits
Sources
Physical: Cameras, Microphone, GPS, IMU, Biomedical, Others
Transactions: Email, Other Cyber, Phone, Vmail, Fax, Money, etc.
Media: TV, Radio, Hardcopy, Softcopy, Ref. Data, Others
Data Capture and Distillation
LifeLog: Representation & Abstraction (Ontology)
Access Modes
Applications
Autobiography: Search, Assist, Teach
Biography: Monitor, Analyze, Predict
Multibiography: Correlate, Statistical
Synbiography: Generate, Predict
MyLifeBits is: : MyLifeBits is: Memex and more (audio and video)
Universal store for all personal stuff
Guiding principles for the system:
Full text search & collections (> hierarchy)
Visualizations for search, display, insight
Annotations and links add value and are essential
Increase search ability and value of information.
So make many kinds and make them easy to create!
Stories are the ultimate annotation
Keep the links when you author: 'transclusion'
MLB database: size and content? : MLB database: size and content? Database features are essential: Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication.
Folders & files were the starting point >> imported into the database as sets, aka 'collections', that are identical to the folder structure
Outlook (msgs, attachments, calendar, contacts)
Web trails including voice message annotation
Journal (Outlook), trails: every document use & transaction
What about?
Money (transactions, payees, etc.)…is their lifelog/trail
Streets and trips to cross-index to all docs
Attributes for photos for retrieval? Location, time, settings
Presentations as a report or trail. Each slide an object!
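The file-to-database step described on this slide can be illustrated with a small sketch. This is not the MyLifeBits schema: the table names, columns, and the `import_tree` helper are hypothetical, and SQLite stands in for whatever database the project used. It only shows folders becoming collections that mirror the folder structure, with a separate membership table so an item can later sit in more than one collection.

```python
import os
import sqlite3


def import_tree(conn, root):
    """Mirror a folder tree as collections; each file becomes an item."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS collections(
            id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
        CREATE TABLE IF NOT EXISTS items(
            id INTEGER PRIMARY KEY, name TEXT, path TEXT, mtime REAL);
        CREATE TABLE IF NOT EXISTS membership(
            collection_id INTEGER, item_id INTEGER);
    """)
    root = os.path.abspath(root)  # normalize so parent lookups match
    ids = {}  # folder path -> collection id
    # os.walk is top-down, so a parent is always inserted before its children.
    for folder, _subdirs, files in os.walk(root):
        parent = ids.get(os.path.dirname(folder))  # None for the root
        cur = conn.execute(
            "INSERT INTO collections(name, parent_id) VALUES (?, ?)",
            (os.path.basename(folder) or folder, parent))
        ids[folder] = cur.lastrowid
        for fname in files:
            path = os.path.join(folder, fname)
            item = conn.execute(
                "INSERT INTO items(name, path, mtime) VALUES (?, ?, ?)",
                (fname, path, os.path.getmtime(path)))
            conn.execute("INSERT INTO membership VALUES (?, ?)",
                         (ids[folder], item.lastrowid))
    conn.commit()
```

Calling `import_tree(sqlite3.connect(":memory:"), some_folder)` yields one collection per folder. Keeping membership in its own table is the design choice that lets collections go beyond a strict hierarchy, since adding an item to a second collection is just another row.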
Searching: the most useful app? : Searching: the most useful app? Challenge: What questions for useful results?
Lots of ways to look at what you retrieve
Need for breaking the returns into segments
Searching for an indexer and search engine: index service, Enfish, dtSearch
Stuff I’ve Seen: MSR’s index & search… evolving in the right direction.
Productizing would remove the pressure for Longhorn
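To make "breaking the returns into segments" concrete, here is a toy full-text search that groups hits by item type. It is a sketch under stated assumptions, not any of the engines named on the slide: the inverted index is the simplest possible, and the sample documents, field names, and type labels are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in re.findall(r"\w+", doc["text"].lower()):
            index[word].add(doc_id)
    return index


def search(index, docs, query):
    """AND-search for all query words; return hits grouped by item type."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return {}
    hits = set.intersection(*(index.get(w, set()) for w in words))
    grouped = defaultdict(list)
    for doc_id in sorted(hits):
        grouped[docs[doc_id]["type"]].append(doc_id)
    return dict(grouped)


# Hypothetical sample items of different types.
docs = {
    1: {"type": "mail", "text": "Dim sum lunch on Friday?"},
    2: {"type": "photo", "text": "BARC dim sum intern farewell lunch"},
    3: {"type": "doc", "text": "Memex trip report"},
}
index = build_index(docs)
print(search(index, docs, "dim sum lunch"))
```

Grouping by type is only one possible segmentation; time intervals or collections would work the same way.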
Annotation like this… : Annotation like this… Voice
Annotation
Pivot to look at all of MLB(t) : Pivot to look at all of MLB(t) Call, contact, pivot by time to find web page
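The "call, contact, pivot by time" idea can be sketched as a time-window query: start from one item (a phone call) and pull back every other item, of any type, recorded close to it. The item fields, the one-hour window, and the example URLs below are assumptions for illustration.

```python
from datetime import datetime, timedelta


def pivot_by_time(items, anchor, window=timedelta(hours=1)):
    """Return items of any type within `window` of the anchor's time."""
    lo, hi = anchor["when"] - window, anchor["when"] + window
    nearby = [it for it in items
              if it is not anchor and lo <= it["when"] <= hi]
    return sorted(nearby, key=lambda it: it["when"])


# Hypothetical log: one call and two web page visits.
items = [
    {"type": "call", "who": "contact A",
     "when": datetime(2002, 5, 1, 10, 0)},
    {"type": "web", "url": "http://example.com/a",
     "when": datetime(2002, 5, 1, 10, 20)},
    {"type": "web", "url": "http://example.com/b",
     "when": datetime(2002, 5, 3, 9, 0)},
]
call = items[0]
for it in pivot_by_time(items, call):
    print(it["type"], it["when"])
```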
Find Brig, image, and look for 80 : Find Brig, image, and look for 80
Here are the photos : Here are the photos
Timeline view tells a story : Timeline view tells a story
Finding scatological works : Finding scatological works
Statistics of use : Statistics of use
Detail view : Detail view
Resource explorer Ancestor (collections), annotations, descendant & preview panes turned on : Resource explorer Ancestor (collections), annotations, descendant & preview panes turned on
Interface to xls : Interface to xls
Synchronized timelines with histogram guide : Synchronized timelines with histogram guide
Visualization : Visualization Browsing & searching. 'Get me what I want|need!'
Help the user find things among possible items versus
Waiting for an ideal system that can find 'what I want'
Publication: Conventional & web, presentations, etc.
Helps understand the nature of the content e.g. histogram of objects in time
Context: Links to help understand the relationship between objects. Provides more search handles.
Information density: what is it? What is its relationship to others?
Content important. Flash and form, less useful.
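The "histogram of objects in time" mentioned above is easy to prototype. The sketch below counts items per year and draws a text bar for each; the function name and sample dates are hypothetical, and it assumes a non-empty list of dates.

```python
from collections import Counter
from datetime import date


def histogram_by_year(dates):
    """Count items per year and render one text bar per year."""
    counts = Counter(d.year for d in dates)
    lines = []
    # Include empty years so gaps in the record stay visible.
    for year in range(min(counts), max(counts) + 1):
        lines.append(f"{year} {'#' * counts[year]} {counts[year]}")
    return counts, lines


# Hypothetical item dates.
sample = [date(1998, 1, 10), date(2000, 7, 7),
          date(2000, 8, 1), date(2001, 11, 1)]
counts, lines = histogram_by_year(sample)
print("\n".join(lines))
```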
Value of media depends on annotations : Value of media depends on annotations 'It's just bits until it is annotated'
System annotations provide base level of value : System annotations provide base level of value Date 7/7/2000
Tracking usage – even better : Tracking usage – even better Date 7/7/2000. Opened 30 times, emailed to 10 people (it's valued by the user!)
Get the user to say a little something is a big jump : Get the user to say a little something is a big jump Date 7/7/2000. Opened 30 times, emailed to 10 people. 'BARC dim sum intern farewell Lunch'
Getting the user to tell a story is the ultimate in media value : Getting the user to tell a story is the ultimate in media value A story is a 'layout' in time and space
Most valuable content (by selection, and by being well annotated)
Stories must include links to any media they use (for future navigation/search – 'transclusion').
Cf: MovieMaker; Creative Memories PhotoAlbums
Example story: 'Dapeng was an intern at BARC for the summer of 2000. We took him to lunch at our favorite Dim Sum place to say farewell. At table L-R: Dapeng, Gordon, Tom, Jim, Don, Vicky, Patrick, Jim'
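The annotation levels from the preceding slides (system date, usage tracking, a short user caption, then a story) can be pictured as one small data model. The sketch below is an assumption-laden illustration, not the MyLifeBits design: the class and field names are hypothetical. The point it shows is 'transclusion': a story stores links to media items rather than copies, so the links survive for later navigation and search.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaItem:
    item_id: str
    date: str                 # system annotation, e.g. capture date
    opened: int = 0           # usage tracking
    emailed_to: int = 0       # usage tracking
    caption: str = ""         # short user annotation


@dataclass
class Story:
    title: str
    text: str
    item_ids: List[str] = field(default_factory=list)  # links, not copies

    def resolve(self, store: Dict[str, MediaItem]) -> List[MediaItem]:
        """Follow the links back to the stored media items."""
        return [store[i] for i in self.item_ids]


# Values taken from the slides; the ids and story text are illustrative.
photo = MediaItem("photo-001", date="7/7/2000", opened=30, emailed_to=10,
                  caption="BARC dim sum intern farewell Lunch")
store = {photo.item_id: photo}
story = Story("Farewell lunch", "We took Dapeng to lunch to say farewell.",
              item_ids=[photo.item_id])
print([m.caption for m in story.resolve(store)])
```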
Value of media depends on annotations : Value of media depends on annotations Auto-annotate whenever possible e.g. GPS cameras
Make manual annotation as easy as possible. XP photo capture, voice, photos with voice, etc.
Support gang annotation
Make stories easy
'It's just bits until it is annotated'
Slide53 : [Diagram: media center wiring. Components: CD, VCR, Cassette, DVD, Plasma Panel, Media Center Computer, two set tops, Kbd, Mse, Wfr, Spkr, IR, Cable/Satellite, Ethernet, Camera, Mic, Receiver. Link labels: SVHS-wide, 5.1 digital, 5 speakers, stereo, comp. stereo, Video*]
Cables/links
Speaker 5+1
Plasma 2 or 3
Cable/Enet 2
IR 8
Stereo 4
5.1 digital 2
Comp./S-video 3
Plasma panel 1
Power 10
Kbd/mse 2
Monitor II (opt.) 4
Camera 2
Total 42 – 46
Things 18+remotes
*Video = composite or S-video
Media center 2 : Media center 2
Photos : Photos
Caneel Bay Vacation Jan. 1998 : Caneel Bay Vacation Jan. 1998 Gordon, Gwen, Brig, Pam, Fiona, Bob, Laura and Kolbe
MyLifeBits use scenarios : MyLifeBits use scenarios Acquire everything! (I mean everything!)
Professional personal use at work!
Home/personal: Provide ambiance & entertainment using Home Media Center
Enhancing content through photo and video albums
Events, places, trips, people, time intervals
---------- Database land and authoring ----------
How I spend my time or an interval of time. Recall a 'trail'… What was I thinking about?
Endless need for authoring & reporting tools
ISBQ: Interactive Story By Query
A person's (auto)biography as a web-hosted timeline
Personal/web/org. hosted collections & catalogs
The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail. : The Agenda for the Tbyte(s), Lifetime, PC: The killer app after office and mail. Guarantee that data will live forever! 'dear appy' problem
Cheap, easy, and data-rich (e.g. time, place) capture:
GPS and time everywhere
Paper capture has to be as easy as discard (scanner/shredder)
Personal meeting capture...
E-book…e-magazines & journals need to have critical mass!
Telephony and audio capture with indexing
Media Center compatible for entertainment (photos, video, TV, radio)
Content analysis (critical for photo & video!)
Information control: privacy, security, expunge/deniability,…
One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files. Is dbase intuitive?
Annotations/meta-information add ever-increasing value. Easy annotation aids search, and it becomes the content
The 'killer apps': Alzheimer's, immortality, surrogate memory?
GUIs to improve use (e.g. time to learn, use, retention)
The “dear appy” problem : The 'dear appy' problem Dear Appy,
How committed are you? Please come back to me, Lost and forgotten data
Who’s responsible?
media
platform, file, and databases
evolving standards and formats
evolving and/or disappearing apps
The Amnesia Control Problem : The Amnesia Control Problem Full sharing of bits that are mine
I created them, OK to copy and distribute
DRM: purchased for my own use
'OK to look at, but I only own half the bits'
Controlling forgetfulness
Private, do not 'demo'
Expunge forever... 'this never happened'
The Content Analysis Problem : The Content Analysis Problem 'Cliplets': Automatic segmentation of a pile of documents and video into individual documents and scenes.
Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
'Type' classification: articles, letters, memos, etc.
Ontology creation for collections
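The "minimal Dublin Core for each item" can be pictured as a six-field record. The sketch below uses exactly the fields named on the slide (date, creator, title, source, abstract, type); the class name, the `missing_fields` helper, and the sample values are hypothetical, and no automatic typing or segmentation is attempted.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ItemRecord:
    """Minimal per-item metadata, using the field names from the slide."""
    date: str = ""
    creator: str = ""
    title: str = ""
    source: str = ""
    abstract: str = ""
    type: str = ""   # e.g. article, letter, memo


def missing_fields(record: ItemRecord) -> List[str]:
    """List the fields that are still empty for this item."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Sample record built from a reference cited in this deck.
rec = ItemRecord(date="1945-07", creator="Vannevar Bush",
                 title="As We May Think", source="The Atlantic Monthly",
                 type="article")
print(missing_fields(rec))
```

A check like `missing_fields` gives a simple measure of how much content analysis is still needed per item.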
The End : The End