Added: June 18, 2007. Category: Education. License: All Rights Reserved.

Presentation Transcript

Entity Search: Are you searching for what you want?
  Kevin C. Chang
  Joint work with: Bin He, Zhen Zhang, Chengkai Li, Govind Kabra, Shui-Lung Chuang, Joe Kelley, Tao Cheng, Bill Davis, Mitesh Patel, Dave Killian

Let's start with the new universal greeting…:
  What have you been reading lately?
  What have you been searching lately?

From the MetaQuerier to WISDM: I am becoming superficial…:
  Kevin's 2 projects in the 4 quadrants.
  Diagram labels: Access, Structure, Deep Web, Surface Web.

First Question: Where is U. of Illinois? Can we search it?

What have you been searching lately?:
  The university and area of Kevin Chang?
  The email of Marc Snir?
  Customer service phone number of Amazon?
  What profs are doing databases at UIUC?
  The papers and presentations of ICDE 2007?
  Due date of SIGMOD 2007?
  Sale price of 'Canon PowerShot A400'?
  'Hamlet' books available at bookstores?

Challenge of the surface Web: Despite all the glorious search engines…: Are we searching for what we want?

Slide7: What you search is not what you want.
Function follows view: What is 'the Web'? Or: How do search engines view the Web?

They say: Web is a corpus of PAGES.

We take an entity view of the Web:

What is an 'entity'? Your target of information, or anything.:
  Phone number
  Email address
  PDF
  Image
  Person name
  Book title, author, …
  Price (of something)

From pages to entities: Traditional Search, Entity Search

Slide13: Demo. We build Ver. 0.1, to understand the promises and issues. Three scenarios:
  Academic: CS sites, DBLP homepages.
  ECommerce: Books, Cellphones.
  Yellowpage: Comprehensive corpus.

Special Thanks: Data from Stanford WebBase.

Example application: Question answering:
  Q: Who are DB profs at UIUC?
  WISDM query: #dtf-nnuw100(#entity(professor) #entity(university) #entity(research Database Systems, Data Mining, IR))
  Steps: Query Generation, Querying, Filtering & Validation
  A: Geneva Belford, Kevin C. Chang, AnHan Doan, Jiawei Han, Marianne Winslett, ChengXiang Zhai

Example application: Relation construction:
  Relation <prof, phone, email>:
    prof               phone          email
    David DeWitt       608-263-5489   dewitt@cs.wisc.edu
    Marianne Winslett  333-3536       winslett@cs.uiuc.edu
    …                  …              …
  WISDM tagging: #entity(prof)
  query: #tf-nnow50(#entity(professor) #tf-nnuw20(#entity(email) #entity(phone)))
  Steps: App-specific Entity Tagging, Querying, Relation Construction

Example application: Best-effort integration:
  Price of 'Hamlet'?
  WISDM query: #od50(#entity(title Hamlet) #entity(price))
  results: ranked list of (<title, price>, )
  Buy.com: $10.99, Amazon.com: $12.00 …
  Steps: Query Generation, Querying, Validation & Ranking

Slide18: How different is 'entity search'?
How to define such searches?

Why is Entity Search different…:
  Probabilistic entities vs. A page is for sure a page.
  Contextual patterns vs. Match a page by its content.
  Holistic aggregates vs. A page occurs only once.
  Associative results vs. We never search for pairs of pages.

Consider the entire process: Page Retrieval
  1. Input: pages.
  2. Criteria: content keywords.
  3. Scope: Each page itself.
  4. Output: one page per result.
  Figure label: Marc Snir

Entity search is thus different…: Entity Search
  1. Input: probabilistic entities.
  2. Criteria: contextual patterns.
  3. Scope: holistic aggregates.
  4. Output: associative results.

Slide22: What are technical challenges? Or, how to write (reviewer-friendly) papers?

Issue #1. EntityRank: How to rank entities?:
  Say, Jiawei Han with #email, #phone, #researcharea
  Entity matters: Is 'jhan@' an email? Is '2-3457' a phone?
  Context matters: Order, distance
  Frequency matters: How often is Jiawei Han – 'data mining'?
  Associativity matters: 'webmaster@cs.uiuc.edu' 'algorithm'
  Source matters: Where did you get this info from?

Issue #2: Query Processing: How to optimize?:
  Q: #tf-nnow50(#entity(professor[David DeWitt]) fax #entity(phone))
  Query plan diagram labels (as extracted): gphone, tf, #entity(professor), sprof='…', 'fax'-#entity(phone), nnow50
  (pre-materialized context index)

Conclusion: One step at a time towards …:
  surface, deep
  What You Search Is What You Want!

Slide26: Thank You!
  Chengkai Li, Zhen Zhang, ShuiLung Chuang, Tao Cheng, Govind Kabra
  And the warriors behind …
  Arpit Jain, Amit Behal, David Killian, Yuping Tseng, Hanna Zhong, Ngoc Bui, Sonia Jahid, Aniruddh Nath, Paul Yuan, Raj Sodhi, Quoc Le, Hemanta Maji, Sung-Eun Kim

Slide27: Thank You!
  Chengkai Li, Zhen Zhang, ShuiLung Chuang, Tao Cheng, Govind Kabra, Arpit Jain, Amit Behal, David Killian, Yuping Tseng, Hanna Zhong, Ngoc Bui, Sonia Jahid, Aniruddh Nath, Paul Yuan, Raj Sodhi, Quoc Le, Hemanta Maji, Sung-Eun Kim
  And the warriors behind …
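
The four traits listed on the "Entity search is thus different…" slide (probabilistic entities, contextual patterns, holistic aggregates, associative results) can be illustrated with a toy example. The sketch below is not WISDM's implementation and not the EntityRank formula, neither of which the slides spell out. The `Mention` structure, the `entity_search` function, the ordered-window reading of the context pattern, the linear proximity weight, and all positions and confidence values are assumptions made up for illustration; only the names and email addresses come from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Mention(NamedTuple):
    """One tagged entity occurrence (hypothetical structure, not WISDM's)."""
    page: str     # page identifier
    pos: int      # token offset within the page
    etype: str    # entity type, e.g. "professor" or "email"
    value: str    # extracted text
    conf: float   # extractor confidence in [0, 1]: entities are probabilistic


def entity_search(mentions, type_a, type_b, window):
    """Rank (a, b) value pairs where b follows a within `window` tokens.

    Assumed toy scoring: each co-occurrence contributes
    conf_a * conf_b * proximity, and contributions are summed over all
    pages, so a pair seen on many pages outranks a one-off match.
    """
    by_page = defaultdict(list)
    for m in mentions:
        by_page[m.page].append(m)

    scores = defaultdict(float)
    for page_mentions in by_page.values():
        for a in page_mentions:
            if a.etype != type_a:
                continue
            for b in page_mentions:
                if b.etype != type_b:
                    continue
                gap = b.pos - a.pos
                if 0 < gap <= window:  # contextual pattern: order and distance
                    proximity = 1.0 - (gap - 1) / window
                    scores[(a.value, b.value)] += a.conf * b.conf * proximity

    # Associative results: ranked tuples of entity values, not pages.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented toy corpus; positions and confidences are illustrative only.
mentions = [
    Mention("page1", 3, "professor", "Marianne Winslett", 0.90),
    Mention("page1", 5, "email", "winslett@cs.uiuc.edu", 0.95),
    Mention("page1", 40, "email", "webmaster@cs.uiuc.edu", 0.95),
    Mention("page2", 10, "professor", "Marianne Winslett", 0.80),
    Mention("page2", 12, "email", "winslett@cs.uiuc.edu", 0.90),
]

results = entity_search(mentions, "professor", "email", window=50)
for (prof, email), score in results:
    print(f"{score:.4f}  {prof}  {email}")
```

In this toy corpus the pair with winslett@cs.uiuc.edu ranks above the pair with webmaster@cs.uiuc.edu, because it appears close to the name on two pages while the webmaster address appears once, far from it. This mirrors the "Frequency matters" and "Associativity matters" points under Issue #1, under the assumed scoring.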