The Informative Role of WordNet in Open-Domain Question Answering
Marius Paşca and Sanda M. Harabagiu (NAACL 2001)
Presented by Shauna Eggers, CS 620, February 17, 2004

Introduction
- Information Extraction: not just for keywords anymore!
  - Massive document collections (databases, webpages) require more sophisticated search techniques than keyword matching
  - Need a way to focus and narrow the search and improve precision
- One solution: Open-Domain Q/A
  - Find answers to natural language questions from large document collections
  - Examples:
    “What city is the capital of the United Kingdom?”
    “Who is the first private citizen to fly in space?”
  - Text Retrieval Conferences (TREC) evaluate entered systems; they show that this sort of task can be performed with “satisfactory accuracy” (Voorhees, 2000)

Q/A: Previous Approach
- Captures the semantics of the question by recognizing
  - the expected answer type (i.e., its semantic category)
  - the relationship between the answer type and the question concepts/keywords
- The Q/A process:
  1. Question processing: extract concepts/keywords from the question
  2. Passage retrieval: identify passages of text relevant to the query
  3. Answer extraction: extract answer words from the passage
- Relies on standard IR and IE techniques
  - Proximity-based features
    - Answer often occurs in text near the question keywords
  - Named-entity recognizers
    - Categorize proper names into semantic types (persons, locations, organizations, etc.)
    - Map semantic types to question types (“How long”, “Who”, “What company”)

Problems
- NE assumes all answers are named entities
  - Oversimplifies the generative power of language!
  - What about: “What kind of flowers did Van Gogh paint?”
- Does not account well for morphological, lexical, and semantic alternations
  - Question terms may not exactly match answer terms; connections between alternations of Q and A terms are often not documented in a flat dictionary
  - Example: “When was Berlin’s Brandenburger Tor erected?” There is no guarantee that “erected” will match “built”
  - Recall suffers

WordNet to the rescue!
- WordNet can be used to inform all three steps of the Q/A process
  1. Answer-type recognition (Answer Type Taxonomy)
  2. Passage retrieval (“specificity” constraints)
  3. Answer extraction (recognition of keyword alternations)
- Using WN’s lexico-semantic info: examples
  - “What kind of flowers did Van Gogh paint?”
    - Answer-type recognition: need to know (a) that the answer is a kind of flower, and (b) the sense of the word flower
    - WordNet encodes 470 hyponyms of flower sense #1, flowers as plants
    - Nouns from retrieved passages can be searched against these hyponyms
  - “When was Berlin’s Brandenburger Tor erected?”
    - Semantic alternation: erect is a hyponym of sense #1 of build

Interactions between WN and Q/A
[Diagram labels: Expected Answer Type, Keyword Alternations, Question Processing, Document Processing, Answer Processing, Index, Passage Retrieval, Answer Extraction, Question, Documents, Answer(s), WordNet]

WN in Answer-type Recognition
- Answer Type Taxonomy
  - A taxonomy of answer types that incorporates WN information
  - Acts as an “ontological resource” that can be searched to identify a semantic category (representing the answer type)
  - Used to associate the found semantic categories with a named entity extractor
  - So, still using an NE, but not bound to proper nouns; have found a way to map NEs to more general semantic categories
- Developed on principles conceived for the Q/A environment (rather than as general ontological principles)
  - Principle 1: Different parts of speech specialize the same answer type
  - Principle 2: Selected word senses are considered
  - Principle 3: Completeness of the top hierarchy
  - Principle 4: Conceptual average of answer types
  - Principle 5: Correlating the Answer Type Taxonomy with NEs
  - Principle 6: Mining WordNet for additional knowledge

Answer Type Taxonomy (example)
[Figure: example of the Answer Type Taxonomy]

WN in Passage Retrieval
- Identify relevant passages from text
  - Extract keywords from the question, and
  - Pass them to the retrieval module
- “Specificity”: filtering question concepts/keywords
  - Focuses search, improves performance and precision
  - Question keywords can be omitted from the search if they are too general
  - Specificity is calculated by counting the hyponyms of a given keyword in WordNet
    - Count ignores proper names and same-headed concepts
    - Keyword is thrown out if the count is above a given threshold (currently 10)

WN in Answer Extraction
- If keywords alone cannot find an acceptable answer, look for alternations in WordNet!

Evaluation
- Paşca/Harabagiu approach measured against TREC-8 and TREC-9 test collections
- WN contributions to Answer Type Recognition
  - Count the number of questions for which acceptable answers were found; 3GB text collection, 893 questions

Evaluation (2)
- WN contributions to Passage Retrieval
  - Impact of keyword alternations
  - Impact of specificity knowledge

Conclusions
- Massive lexico-semantic information must be incorporated into the Q/A process
- Using such information encoded in WN improved system precision by 147% (qualitative analysis)
- Visions for the future:
  - Extend WN so that online resources like encyclopedias can link to WN concepts
    - Answer questions like: “Which classic rock group first performed live in Albuquerque?”
  - Further improve Q/A precision with WN extension projects
    - E.g., “finding keyword morphological alternations could benefit from derivational morphology, a project extension of WordNet” (Harabagiu et al., 1999)
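
The specificity filter described under "WN in Passage Retrieval" can be illustrated with a small Python sketch. This is not the authors' implementation: `TOY_HYPONYMS` is a hypothetical hand-built graph standing in for WordNet, a capitalized entry is assumed to be a proper name, and "same-headed" is assumed to mean a multi-word hyponym whose last word is the keyword itself. Only the threshold of 10 comes from the slides.

```python
from typing import Dict, List, Set

# Hypothetical toy hyponym graph (parent -> direct hyponyms); NOT real WordNet data.
TOY_HYPONYMS: Dict[str, List[str]] = {
    "animal": ["dog", "cat", "horse", "cow", "sheep", "goat", "pig", "bird", "fish"],
    "dog": ["terrier", "poodle", "hunting dog", "Lassie"],
    "bird": ["sparrow", "eagle"],
    "terrier": ["bull terrier", "Airedale"],
}


def all_hyponyms(word: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Collect every direct and indirect hyponym of `word`."""
    seen: Set[str] = set()
    stack = list(graph.get(word, []))
    while stack:
        hyponym = stack.pop()
        if hyponym in seen:
            continue
        seen.add(hyponym)
        stack.extend(graph.get(hyponym, []))
    return seen


def specificity_count(keyword: str, graph: Dict[str, List[str]]) -> int:
    """Count hyponyms, skipping proper names and same-headed concepts.

    Assumptions: a capitalized entry is a proper name; a same-headed
    concept is one whose last word equals the keyword.
    """
    return sum(
        1
        for hyponym in all_hyponyms(keyword, graph)
        if not hyponym[0].isupper() and hyponym.split()[-1] != keyword
    )


def filter_keywords(
    keywords: List[str], graph: Dict[str, List[str]], threshold: int = 10
) -> List[str]:
    """Drop keywords whose hyponym count is above the threshold (too general)."""
    return [k for k in keywords if specificity_count(k, graph) <= threshold]


kept = filter_keywords(["animal", "dog", "terrier"], TOY_HYPONYMS)
```

In this toy graph, "animal" has too many qualifying hyponyms and is dropped, while "dog" and "terrier" are specific enough to stay in the query. A keyword absent from the graph has a count of zero and is always kept.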
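
The keyword-alternation idea from the "erected"/"built" example can be sketched in the same spirit. The sketch below is only an illustration under stated assumptions: `TOY_HYPERNYM` and `TOY_LEMMAS` are hypothetical lookup tables standing in for WordNet relations and a morphological analyzer, and the sample passage is invented for the example.

```python
from typing import Dict, Set

# Hypothetical toy relations (hyponym -> hypernym); NOT real WordNet data.
TOY_HYPERNYM: Dict[str, str] = {"erect": "build", "raise": "build"}

# Hypothetical lemma table standing in for a real morphological analyzer.
TOY_LEMMAS: Dict[str, str] = {"erected": "erect", "built": "build", "raised": "raise"}


def lemma(token: str) -> str:
    """Lowercase, strip surrounding punctuation, and look up the base form."""
    cleaned = token.lower().strip(".,?!\"'")
    return TOY_LEMMAS.get(cleaned, cleaned)


def alternations(keyword: str) -> Set[str]:
    """Return the keyword's lemma plus its direct hypernym and direct hyponyms."""
    base = lemma(keyword)
    alts = {base}
    parent = TOY_HYPERNYM.get(base)
    if parent is not None:
        alts.add(parent)
    alts.update(child for child, par in TOY_HYPERNYM.items() if par == base)
    return alts


def passage_matches(keyword: str, passage: str) -> bool:
    """True if any passage token is an alternation of the keyword."""
    alts = alternations(keyword)
    return any(lemma(token) in alts for token in passage.split())


# Invented sample passage for illustration only.
passage = "The Brandenburger Tor was built in Berlin."
exact_hit = "erected" in passage.lower()
alternation_hit = passage_matches("erected", passage)
```

Here the exact string "erected" is absent from the passage, so plain keyword matching fails, while the alternation lookup links "erected" to "built" through the erect/build relation. A real system would also need word-sense selection, which this sketch leaves out.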