Presentation Transcript
Question Answering: Question Answering Zhikun Meng
Overview: Overview What is Question Answering?
Why Question Answering?
How Question Answering?
What is the current status of Question Answering?
How to evaluate Question Answering?
What is the future of Question Answering?
What is Question Answering?: What is Question Answering? What is a question answering system? Question answer clear
What is Question Answering?: What is Question Answering? Answer:
Question answering systems are designed to find answers to open-domain questions in a large collection of documents.
Why Question Answering?: Why Question Answering?
How Question Answering?: How Question Answering? General architecture
question -> Question Classification -> Information Retrieval -> Answer Extraction -> answer
e.g. What is Calvados?
/Q is /A, where: /Q = “(Calvados)”
Query = “Calvados is”
Text retrieval = “…Calvados is often used in cooking…Calvados is a dry apple brandy made in…”
/A is: a dry apple brandy
Answer: /Q is /A:
“Calvados” is “a dry apple brandy”
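The /Q is /A pattern can be tried out with a regular expression. This is only a minimal sketch: the function name `extract_definition` and the rule for where the answer phrase stops (at "made", "that", "which", or punctuation) are assumptions made for illustration, not something stated on the slides.

```python
import re

def extract_definition(term, text):
    """Return the phrase after '<term> is' when it opens with an article."""
    # /Q is /A: the /A slot is taken to be a noun phrase starting with
    # a/an/the. The stop words and punctuation below are an illustrative
    # heuristic (an assumption), chosen so the phrase ends before "made in".
    pattern = re.compile(
        re.escape(term)
        + r"\s+is\s+((?:a|an|the)\s+.+?)(?=\s+(?:made|that|which)\b|[.,;…]|$)",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    return match.group(1) if match else None

# Retrieved text as quoted on the slide.
retrieved = ("…Calvados is often used in cooking…"
             "Calvados is a dry apple brandy made in…")
answer = extract_definition("Calvados", retrieved)
print(f'"Calvados" is "{answer}"')
```

The first occurrence ("Calvados is often used in cooking") is skipped because nothing resembling a noun phrase follows "is"; only the second occurrence fills the /A slot.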
Question Classification: Question Classification e.g. “How much could you rent a Volkswagen bug for in 1966?”
1. Keyword preprocessing (split/spell check/normalize)
Volkswagen-Volkswagen;
“Rotary engine cars were made by what company?” - “What company were rotary engine cars made by?”
Question Classification: Question Classification 2. Construction of question representation
How much: Question stem
rent: Answer type term
1966: Date constraint
Volkswagen
bug
Question Classification: Question Classification 3. Derivation of answer type
“How much” + “rent” -> Money
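One simple way to picture this step is a lookup from (question stem, answer type term) to an answer type. The sketch below is an assumption about how such a table might look: only the "how much" + "rent" -> Money rule comes from the slide, and the other entries, the table name, and the function name are hypothetical.

```python
# Hypothetical rule table: (question stem, answer type term) -> answer type.
# Only ("how much", "rent") -> "Money" is taken from the slide; the rest
# are illustrative placeholders. A term of None means "any term".
ANSWER_TYPE_RULES = {
    ("how much", "rent"): "Money",
    ("how many", None): "Number",
    ("who", None): "Person",
    ("where", None): "Location",
    ("when", None): "Date",
}

def derive_answer_type(question):
    """Match the question stem and answer type term against the rule table."""
    words = question.lower().rstrip("?").split()
    text = " ".join(words)
    for (stem, term), answer_type in ANSWER_TYPE_RULES.items():
        if text.startswith(stem) and (term is None or term in words):
            return answer_type
    return "Unknown"

print(derive_answer_type("How much could you rent a Volkswagen bug for in 1966?"))
```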
Question Classification: Question Classification 4. Keyword selection
Volkswagen AND bug AND rent
Question Classification: Question Classification 5. Keyword expansion
rent-rented
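Keyword selection and expansion together produce the Boolean query sent to the retrieval engine. A minimal sketch, assuming expanded forms are combined with OR inside parentheses (that query syntax, and the names `VARIANTS` and `build_query`, are assumptions; the slides only show the plain AND query):

```python
# The only expansion pair given on the slide is rent -> rented.
VARIANTS = {"rent": ["rented"]}

def build_query(keywords, variants=None):
    """Join keywords with AND; OR together any expanded forms of a keyword."""
    variants = variants or {}
    clauses = []
    for keyword in keywords:
        forms = [keyword] + variants.get(keyword, [])
        if len(forms) == 1:
            clauses.append(keyword)
        else:
            clauses.append("(" + " OR ".join(forms) + ")")
    return " AND ".join(clauses)

keywords = ["Volkswagen", "bug", "rent"]
print(build_query(keywords))            # query before expansion
print(build_query(keywords, VARIANTS))  # query after expansion
```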
Information Retrieval: Information Retrieval 1. Retrieval of documents and passages:
query: Volkswagen AND bug AND rent
The retrieval engine returns the passages that contain all keywords (e.g., 60 passages from a collection of 1,000,000 documents).
Information Retrieval: Information Retrieval 2. Passage filtering
Date constraint: 1966. Of the 60 passages returned by the retrieval engine for Q013, two are retained after passage post-filtering.
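The two retrieval steps can be sketched as a Boolean AND match followed by a date filter. The three toy passages below are invented for illustration (the first reuses the answer phrasing from the Answer Extraction slides), and `retrieve` and `filter_by_date` are hypothetical names, not the API of any real retrieval engine.

```python
import re

def retrieve(passages, keywords):
    """Boolean AND retrieval: keep passages that contain every keyword."""
    hits = []
    for passage in passages:
        tokens = set(re.findall(r"\w+", passage.lower()))
        if all(keyword.lower() in tokens for keyword in keywords):
            hits.append(passage)
    return hits

def filter_by_date(passages, year):
    """Passage post-filtering: keep passages that mention the given year."""
    return [p for p in passages if re.search(rf"\b{year}\b", p)]

# Toy collection, made up for this example only.
passages = [
    "In 1966 you could rent a Volkswagen bug for $1 a day.",
    "You can rent a Volkswagen bug today for much more.",
    "The Volkswagen bug was popular in 1966.",
]
hits = retrieve(passages, ["Volkswagen", "bug", "rent"])
kept = filter_by_date(hits, 1966)
print(len(hits), len(kept))
```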
Answer Extraction: Answer Extraction 1. Identification of candidate answers
Answer type: Money
Identified candidates include $1 and USD 520.
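For the Money answer type, candidate identification could be as simple as a pattern over currency expressions. The pattern below is an assumption that covers just the two surface forms on the slide ("$1" and "USD 520"); a real system would likely rely on a named-entity recognizer. The second sample sentence is invented for illustration.

```python
import re

# Matches "$<amount>" or "USD <amount>", with optional thousands commas
# and an optional decimal part. Illustrative only.
MONEY = re.compile(r"(?:\$|USD\s?)\d[\d,]*(?:\.\d+)?")

def money_candidates(passage):
    """Return every money expression found in the passage, in order."""
    return MONEY.findall(passage)

print(money_candidates("rent a Volkswagen bug for $1 a day"))
print(money_candidates("It cost USD 520 in total."))
```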
Answer Extraction: Answer Extraction 2. Answer Ranking
score: $1
USD 520
Answer Extraction: Answer Extraction 3. Answer formulation
rent a Volkswagen bug for $1 a day
What is the current status of Question Answering?: What is the current status of Question Answering? Text REtrieval Conference (TREC)
http://trec.nist.gov/
Cross Language Evaluation Forum (CLEF)
http://clef.isti.cnr.it/
NII-NACSIS Test Collection for IR Systems Project (NTCIR)
http://research.nii.ac.jp/ntcir/index-en.html
TREC-8: TREC-8 In 1999, the 8th Text REtrieval Conference (TREC-8) introduced the first QA track.
TREC-8 tasks included:
Answer factoid questions by returning a text snippet that contains an answer to the question
Build a reusable QA test collection.
TREC-9: TREC-9 Compared with TREC-8, the biggest change in TREC-9 was the switch to “real” questions, rather than questions created especially for the track.
Absolute scores were lower than in TREC-8, yet the TREC-9 systems showed significantly improved QA technology.
TREC 2001: TREC 2001 The major change in the TREC 2001 QA track was its division into three separate tasks:
the main task
the list task
the context task
TREC 2002: TREC 2002 The TREC 2002 QA track contained two tasks, the main task and the list task.
Systems were required to return exact answers
Systems were limited to one response per question, not five
The ranking metric changed to the confidence-weighted score
TREC 2003: TREC 2003 The TREC 2003 question answering track contained two tasks:
the passages task (factoids)
the main task (factoids, lists, definitions)
The list question task drew significant participation
TREC 2004: TREC 2004 Factoid questions and list questions were no longer independent; instead, they were all related to given topics.
More participants took part in the list question and definition question tasks.
CLEF: CLEF In 2002, CLEF proposed its own Question Answering track, which focuses on European-language QA, especially cross-lingual QA (CLQA).
The CLEF 2003 QA track was divided into monolingual and bilingual tasks.
NTCIR: NTCIR In 2003, the Asian information retrieval conference NTCIR proposed its Question Answering track, which focuses on Asian-language QA.
How to evaluate Question Answering?: How to evaluate Question Answering? TREC
About 1,500 questions collected from TREC-8, TREC-9, and TREC 2001
Answers were extracted from a 3-Gbyte text collection containing about 1 million documents from sources such as the Los Angeles Times and Wall Street Journal. Each answer has at most 50 characters.
Answer accuracy was originally measured with the Mean Reciprocal Rank (MRR) metric used by NIST in the TREC QA evaluations. In TREC 2002, the ranking metric was changed to the confidence-weighted score.
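Both metrics are short to compute. The sketch below uses the usual definitions: MRR averages 1/rank of the first correct response per question (0 when no returned response is correct), and the confidence-weighted score sorts the single responses from most to least confident and averages, over positions i = 1..Q, the number correct in the first i positions divided by i. The function names and sample judgments are illustrative assumptions.

```python
def mean_reciprocal_rank(ranks):
    """ranks: per question, the 1-based rank of the first correct response,
    or None when no returned response was correct (scores 0)."""
    return sum(1.0 / rank for rank in ranks if rank) / len(ranks)

def confidence_weighted_score(judgments):
    """judgments: one boolean per question, ordered from the response the
    system is most confident in to the one it is least confident in."""
    correct_so_far = 0
    total = 0.0
    for i, is_correct in enumerate(judgments, start=1):
        correct_so_far += is_correct
        total += correct_so_far / i
    return total / len(judgments)

# Made-up judgments for four and three questions respectively.
print(mean_reciprocal_rank([1, 2, None, 5]))
print(confidence_weighted_score([True, False, True]))
```

Because early positions contribute to every later prefix, the confidence-weighted score rewards systems that place their correct answers at the confident end of the ordering.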
How to evaluate Question Answering?: How to evaluate Question Answering? CLEF
In CLEF 2002, a test set for future cross-lingual research, the DISEQuA (Dutch Italian Spanish English Questions and Answers) Corpus, was created.
What is the future of Question Answering?: What is the future of Question Answering? Factoid Questions
List Questions
Definition Questions
Non-English monolingual QA
Cross-lingual QA
Cross-lingual and cross-media QA
Summary: Summary QA definition
General QA system architecture
QA technologies
QA research milestones
Roadmap
Questions?: Questions?