HITIQA 2 tasks progress


HITIQA-2 Intelligence Analyst’s Assistant in High-Quality, Interactive Question Answering Research Progress Report: 

Research Progress Report
DIA Introduction and Demonstration
Washington, D.C., November 16, 2004

HITIQA Research Team: 

SUNY Albany:
  Prof. Tomek Strzalkowski, PI/PM
  Prof. Boris Yamrom, co-PI
  Ms. Sharon Small, Research Scientist
  Ms. Hilda Hardy, Research Scientist
  Mr. Sean Ryan, Research Assistant
  Graduate students
Rutgers:
  Prof. Paul Kantor, co-PI
  Prof. K.B. Ng
  Prof. Nina Wacholder
  Graduate students
Government COTR: Kelcy Allwein, DIA

HITIQA Research Objectives : 

HITIQA is an Analytical QA System
  “Scenario” QA: look for facts and events in context
  Not a factoid system, but factoids are complementary
  Semantics is central: data-driven, knowledge-based
QA is Dialogue with Information
  Analytical task: topic + context (time, recipient, purpose)
  Evolving analytical strategy: line of questions & actions
  Detect, follow, anticipate, negotiate strategy turns and shifts
HITIQA Approach
  Phase I: Create basic end-to-end capabilities + validate
  Phase II: Build up knowledge + sustain productive dialogue
  Phase III+: Augment the analytical process through active assistance

HITIQA Deployment: 

Test-bed and evaluation installations
  MITRE: AQUAINT test-bed
  RDEC/SAIC: 2 large servers, local/VPN access
  PNNL: Metrics Challenge Workshop
  Albany: local & on-line
Agency local installs
  CIA: local server and demo laptop
  ARDA: demo laptop (pending)
  DIA: local install (starting)
HITIQA on-line
  Currently accessible from unsecured internet locations
  Also within VPN (SAIC)
  Unlimited access over firewalls: in progress

Tryouts and Evaluations: 

On-site workshops with USNR and other analysts
  Two on-site workshops conducted in 2003 with 4 USNR analysts
  ARDA Metrics Challenge summer workshop: 2.5 weeks, 8 analysts
  Tampa exercise (24 analyst teams, two 1-hour sessions)
  Future workshops (Spring 2005): SAIC, Albany
On-line evaluations with USNR
  Monthly weekend drills for 6 months
  Longitudinal studies: extended drill scenarios
  Status: firewall problems, working around these
Testing facilities
  PNNL and MITRE installs
  SAIC large-scale installation
  AFRL Rome

HITIQA is Analytical QA: 

Factoid Questions
  “How many states are there in the U.S.?”
  Answer type can be determined by question form.
Factoid Answers
  Short: one word, phrase, etc.
  An answer is either right or wrong, or unknown
Analytical Questions
  “What is the relationship between Russia and Iraq?”
  Complex and exploratory
  Asked in larger context of a scenario
Analytical “Answers”
  Analyses and answer estimates
  Contain explanations, justifications, conditions
  Answer can be strong, weak, speculative, interim, …

An Analytic Scenario: 

Scenario #1: Nuclear Arms Relationship: Russia and Iraq

The Department of Defense has demanded a report on how Russia has influenced the nuclear arms program in Iraq. The department needs the summary by COB today. List the extent of the nuclear program in each country including funding, capabilities, quantity, etc. Your report should also include key figures in both the Russia and Iraq nuclear programs, any travels that these key figures have made to other countries in regards to a nuclear program, any weapons that have been used in the past by either country, any purchases or trades that have been made relevant to weapons of mass destruction (possibly oil trade, etc.), any ingredients and chemicals that have been used, any potential weapons that could be under development, other countries that are involved or have close ties to Russia or Iraq, possible locations of development sites, and possible companies or organizations that these countries work with for their nuclear arms program. Add any other information relating to the Russian and Iraqi Nuclear Arms Programs.

Iraq – Russia Scenario: 

Analyst: What is the history of the nuclear arms program between Russia and Iraq?
Analyst: Who has helped finance the nuclear arms program in Iraq?
Analyst: What type of nuclear weapons does Iraq possess?
Analyst: Russian relationship with Iraq
Analyst: What type of debt does exist between Iraq and Russia?

Scenario Structure: 

Scenario = analytical problem
  A series of questions asked by analyst
    What is the history of the nuclear arms program between Russia and Iraq?
    Who has helped finance the nuclear arms program in Iraq?
(Composite) Question
  A question posed by analyst + any follow-on by either HITIQA or analyst
    A: How has al-Qaida conducted its efforts to acquire weapons of mass destruction?
    H: We have this information referring to bin Laden but no mention of al-Qaida. Are you interested?

Exploring Answer Space via Dialogue: 


Scenario-level answer structure: 

[Diagram: answer space across the questions of one scenario]
Q0: What is sarin’s potency?
  Botulin 100K times more toxic than sarin
  Persists for 30 minutes in clothes
  Label: potency
Q1: sarin development?
  Develop(X, sarin)
Q2: nerve agents?
  Label: nerve agents
Q0: sarin’s impact on community?

Composite Question Structure : 

Q = Q0 + Q1 + Q2 + …
Components shown in the diagram:
  Original question posed by analyst
  Clarification/offer by HITIQA
  Visual panel action by analyst
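As a rough illustration of the composite question Q = Q0 + Q1 + Q2 + …, the Python sketch below models a question as an ordered list of turns. The class names (`Turn`, `CompositeQuestion`), the field names, and the turn-kind labels are assumptions made for this example; the slides do not describe HITIQA's internal representation.

```python
from dataclasses import dataclass, field
from typing import List

# Turn kinds named on the slide; the string labels themselves are hypothetical.
KINDS = {"question", "clarification", "visual_action"}


@dataclass
class Turn:
    actor: str  # "analyst" or "hitiqa"
    kind: str   # one of KINDS
    text: str


@dataclass
class CompositeQuestion:
    turns: List[Turn] = field(default_factory=list)

    def add(self, actor: str, kind: str, text: str) -> "CompositeQuestion":
        """Append one turn (Q1, Q2, ...) and return self so calls can be chained."""
        if kind not in KINDS:
            raise ValueError(f"unknown turn kind: {kind}")
        self.turns.append(Turn(actor, kind, text))
        return self

    @property
    def original(self) -> Turn:
        """Q0: the question the analyst posed first."""
        return self.turns[0]


# The al-Qaida exchange quoted in this transcript, as a two-turn composite question.
q = (
    CompositeQuestion()
    .add("analyst", "question",
         "How has al-Qaida conducted its efforts to acquire weapons of mass destruction?")
    .add("hitiqa", "clarification",
         "We have this information referring to bin Laden but no mention of al-Qaida. "
         "Are you interested?")
)
```

Keeping the turns in order preserves the dialogue history that a later stage could use to structure the answer.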

Representing Information: Events and Relationships: 

Events are basic information units in HITIQA
  Generic events
  Typed events
  Domain-grounded events
Represented internally as frames:
  Event type: e.g., transfer, attack, …
  Attributes: e.g., people, locations, …
  Roles: e.g., agent, target, destination, …
Frames are grouped into topics & “swarms”
  Attribute & keyword overlap → topical clusters
  Shared frame types & roles → event clusters
  Effect, affect, sequence, … → event “swarms”
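A frame with an event type, attributes, and roles can be pictured as a small record. The sketch below is one possible Python rendering; the class name `EventFrame` and its field layout are assumptions for illustration, not HITIQA's actual data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventFrame:
    event_type: str                                             # e.g. "transfer", "attack"
    attributes: Dict[str, List[str]] = field(default_factory=dict)  # e.g. people, locations
    roles: Dict[str, str] = field(default_factory=dict)         # e.g. agent, target


# Example built from a sentence in the deck's sample passage:
# "In 1981, Israel destroyed the Osirak nuclear reactor."
attack = EventFrame(
    event_type="attack",
    attributes={"locations": ["Osirak"], "dates": ["1981"]},
    roles={"agent": "Israel", "target": "Osirak nuclear reactor"},
)
```

Separating attributes from roles mirrors the slide: attributes record what is mentioned, while roles record who did what to whom.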

Event Frames: 

Source passage:
… Iraq possesses a few working centrifuges and the blueprints to build them. Iraq imported centrifuge materials from Nukem of the FRG and from other sources. One decade ago, Iraq imported 27 pounds of weapons-grade uranium from France, for Osirak nuclear research center. In 1981, Israel destroyed the Osirak nuclear reactor. In November 1990, the IAEA inspected Iraq and found all material accounted for. Peter Clausen, director of research at the Union of Concerned Scientists, said scientists are divided on whether one nuclear bomb can be made by Iraq from the 27 pounds of weapons-grade uranium. Marvin Miller, senior nuclear scientist at MIT in the US, said a crude Iraqi nuclear bomb couldn't fit on a missile, but could be carried in a large aircraft.

Processing steps: EXTRACT; ASSIGN ROLES & SPECIALIZE

Resulting frame:
  FRAME TYPE: TRANSFER / WMDTransfer
  TRANSFER TYPE (TOPIC): imported
  TRANSFER DEST (LOCATION): Iraq
  TRANSFER SOURCE (LOCATION): France
  TRANSFER OBJECT (WEAPON): uranium
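To make the extract-and-assign-roles step concrete, here is a minimal Python sketch that fills the TRANSFER frame from the uranium sentence using a single regular expression. The pattern and the function name `extract_transfer` are assumptions for illustration; the slides do not show HITIQA's actual extraction rules.

```python
import re
from typing import Dict, Optional

# Hypothetical seed rule: "<Dest> imported <object> from <Source>".
IMPORT_RULE = re.compile(
    r"(?P<dest>[A-Z]\w+) imported (?P<object>.+?) from (?P<source>[A-Z]\w+)"
)


def extract_transfer(sentence: str) -> Optional[Dict[str, str]]:
    """Return a TRANSFER frame for the first match, or None if the rule does not fire."""
    m = IMPORT_RULE.search(sentence)
    if m is None:
        return None
    return {
        "FRAME TYPE": "TRANSFER",
        "TRANSFER TYPE": "imported",
        "TRANSFER DEST": m.group("dest"),
        "TRANSFER SOURCE": m.group("source"),
        "TRANSFER OBJECT": m.group("object"),
    }


sentence = (
    "One decade ago, Iraq imported 27 pounds of weapons-grade uranium "
    "from France, for Osirak nuclear research center."
)
frame = extract_transfer(sentence)
```

This rule captures the whole object phrase ("27 pounds of weapons-grade uranium"), whereas the frame on the slide keeps only "uranium"; reducing the phrase to its head noun would be a further step that this sketch does not attempt.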

Generating a coherent dialogue : 

Exploit internal structure of answer space
  Guidance for dialogue and direction of exploration
  Facilitate hypothesis formation by analyst
Follow closely related events
  Shared types & roles → event clusters (imports of uranium)
  Attributes & text similarity → topical clusters (nerve agents)
Explore larger topics containing these events
  One event makes another event likely → swarming links
  If missile exports by North Korea are of interest, then the status of missile development in NK is likely to be relevant as well.
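The two kinds of cluster can be sketched in a few lines of Python. The example assumes frames are plain dictionaries with hypothetical keys `type`, `roles`, and `keywords`, and it uses Jaccard overlap with an arbitrary 0.5 threshold as a stand-in for whatever similarity measure HITIQA actually uses.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def event_clusters(frames: List[dict]) -> Dict[Tuple, List[int]]:
    """Group frame indices that share a frame type and the same set of role names."""
    clusters = defaultdict(list)
    for i, f in enumerate(frames):
        key = (f["type"], tuple(sorted(f["roles"])))
        clusters[key].append(i)
    return dict(clusters)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def topical_pairs(frames: List[dict], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return index pairs whose keyword sets overlap at or above the threshold."""
    return [
        (i, j)
        for i in range(len(frames))
        for j in range(i + 1, len(frames))
        if jaccard(frames[i]["keywords"], frames[j]["keywords"]) >= threshold
    ]


# Toy frames loosely based on the sample passage in this deck.
frames = [
    {"type": "transfer", "roles": {"source": "France", "dest": "Iraq"},
     "keywords": {"uranium", "import", "iraq"}},
    {"type": "transfer", "roles": {"source": "Nukem", "dest": "Iraq"},
     "keywords": {"centrifuge", "import", "iraq"}},
    {"type": "attack", "roles": {"agent": "Israel", "target": "Osirak"},
     "keywords": {"osirak", "reactor", "iraq"}},
]
```

Here the two import frames fall into one event cluster and also form the only topical pair. Swarming links, where one event makes another likely, would need a separate model and are not covered by this sketch.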

Multiple Views of Answer Space: 

[Diagram: the same set of frames (TRF, TRF, DEV, DEV, GEN) shown in different views]
  Topical cluster: e.g., nerve agents
  Event cluster: e.g., import of sarin
  Swarming links (links labeled “target”)

Knowledge Acquisition Process: 

Template development: frame mining
  PropBank/Verb bank/TimeBank, etc.
  Corpus mining of frequently occurring event types
  Roles and relationships
Develop frames for selected event types:
  Determine entities in the frame & role for each entity
  Determine seed rules for frame/role extraction
Classify modal and other relationships
  e.g., denials, threats, allegations, …

Knowledge Acquisition Process: 

Bootstrapping over text corpora
  Develop feature set (context elements)
  Bootstrapping exploits duality of lexical and pattern space
  Expand from seed rules to high-recall extraction
Frame acquisition from structured data
  Statistical structure-to-text alignment
  Use CNS/WMD database
  Extract seed rules from aligned corpora
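The duality between lexical items and patterns can be illustrated with a toy bootstrapping loop: known patterns find new entity pairs, and known pairs reveal new patterns. The Python sketch below is a generic, deliberately simplified version of that idea. The corpus format (pre-segmented triples), the placeholder entity names, and the function name `bootstrap` are all assumptions, and nothing here is taken from HITIQA's implementation.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (left entity, connecting text, right entity)


def bootstrap(corpus: List[Triple], seed_patterns: Set[str], max_rounds: int = 10):
    """Alternate between pattern space and lexical space until nothing new is found."""
    patterns = set(seed_patterns)
    pairs: Set[Tuple[str, str]] = set()
    for _ in range(max_rounds):
        # Pattern space -> lexical space: known patterns find entity pairs.
        found_pairs = {(x, y) for x, mid, y in corpus if mid in patterns}
        # Lexical space -> pattern space: those pairs reveal connecting patterns.
        found_patterns = {mid for x, mid, y in corpus if (x, y) in found_pairs}
        if found_pairs <= pairs and found_patterns <= patterns:
            break  # fixed point reached
        pairs |= found_pairs
        patterns |= found_patterns
    return patterns, pairs


# Invented toy corpus; A..F are placeholders, not real entities.
corpus = [
    ("A", "imported uranium from", "B"),
    ("A", "bought uranium from", "B"),
    ("C", "bought uranium from", "D"),
    ("E", "destroyed", "F"),
]
patterns, pairs = bootstrap(corpus, {"imported uranium from"})
```

Starting from one seed rule, the loop learns "bought uranium from" through the shared pair (A, B) and then uses it to find (C, D). A practical system would also score candidate patterns before accepting them, since unfiltered expansion tends to drift away from the target relation.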

Preliminary event frames: 

Preliminary event frames

Preliminary event frames, cont’d: 

Modal Attributes:
  Polarity: positive, negative, actual, probable, future, …
  Manner: say, claim, threaten, allege, advise, refute, …
  Source, if known
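One simple way to assign the manner attribute is a lookup of reporting verbs. The Python sketch below does this for the six manner values listed above; the table of surface forms and the function name `detect_manner` are assumptions for illustration, and a real system would more likely rely on lemmatization and syntactic context.

```python
import re
from typing import Optional

# Surface forms for the manner verbs listed on the slide.
# The table is an assumption made for this sketch.
MANNER_FORMS = {
    "say": {"say", "says", "said"},
    "claim": {"claim", "claims", "claimed"},
    "threaten": {"threaten", "threatens", "threatened"},
    "allege": {"allege", "alleges", "alleged"},
    "advise": {"advise", "advises", "advised"},
    "refute": {"refute", "refutes", "refuted"},
}


def detect_manner(sentence: str) -> Optional[str]:
    """Return the manner of the first reporting verb found, or None if there is none."""
    for token in re.findall(r"[a-z]+", sentence.lower()):
        for manner, forms in MANNER_FORMS.items():
            if token in forms:
                return manner
    return None


manner = detect_manner(
    "Marvin Miller, senior nuclear scientist at MIT in the US, said a crude "
    "Iraqi nuclear bomb couldn't fit on a missile."
)
```

A sentence with no reporting verb, such as "In 1981, Israel destroyed the Osirak nuclear reactor.", yields None, so the frame would carry no manner attribute.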

HITIQA Dialogue capabilities: 

System actions: offers, suggestions, clarifications
  Offering closely related events (time, space, etc.):
    Imports from Korea? Missile developments there?
  Suggesting similar events involving different players
    Sarin production in Iran? In Pakistan?
Analyst’s actions affecting answer space
  Direct: accept, partially accept, or reject offers
  Indirect: strategy changes, diversions
Support hypothesis formation
  Explicit:
    HITIQA suggestions and offers
    Analyst information saving actions
  Implicit:
    HITIQA: direction of dialogue and answer formation
    Analyst: deciding what question to ask next

Example (from a HITIQA drill): 

Analyst: What is the history of the nuclear arms program between Russia and Iraq?
HITIQA: Do you want to see material on cooperation on development of weapons between Iraq and Russia?
Analyst: Yes
…
Analyst: Who has helped finance the nuclear arms program in Iraq?
HITIQA: Would you be interested in information on weapons in Iraq?
Analyst: Yes
…
Analyst: What type of nuclear weapons does Iraq possess?
HITIQA: Would you be interested in information on dollars in Russia relative to your question?
Analyst: Yes
…
Analyst: What is the finance debt that Iraq owes Russia?

Answer generation: 

Current: frame-based summaries
  Frames in the answer space rendered into headlines
  Passages sorted by “threads”: target, time, location, etc.
Build more coherence into the answer
  Use elements of Discourse Structure Theory
  Applied to passages from multiple documents
  Passages are output to form a more logical flow
Maximize lucidity of the answer
  Use dialogue history to structure the answer → folders
  Compute rhetorical relations between answer elements
    justification, elaboration, evidence, contradiction, etc.
Answer Summaries
  Summarize answer passages using XDoX Summarizer
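Rendering a frame into a headline and ordering frames along a thread can both be sketched briefly in Python. The headline template, the dictionary keys, and the function names below are assumptions for illustration; the slides do not specify how HITIQA generates headlines or defines its threads.

```python
from typing import Dict, List


def headline(frame: Dict[str, str]) -> str:
    """Render a transfer frame as a one-line headline (the template is hypothetical)."""
    return f'{frame["dest"]} {frame["type"]} {frame["object"]} from {frame["source"]}'


def sort_by_thread(frames: List[Dict[str, str]], thread: str) -> List[Dict[str, str]]:
    """Order frames along one thread key; frames missing the key sort first."""
    return sorted(frames, key=lambda f: f.get(thread, ""))


# Two transfer frames based on the sample passage in this deck.
frames = [
    {"type": "imported", "dest": "Iraq", "source": "Nukem", "object": "centrifuge materials"},
    {"type": "imported", "dest": "Iraq", "source": "France", "object": "uranium"},
]
ordered = sort_by_thread(frames, "source")
headlines = [headline(f) for f in ordered]
```

Because Python's sorted() is stable, sorting first by one thread and then by another keeps the earlier ordering among ties, which is one straightforward way to combine threads such as location and time.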

Answer Structuring Options: 

Answer passages:
  We also believe that Bin Ladin was seeking to acquire or develop a nuclear device. Al-Qa'ida may be pursuing a radioactive dispersal device, what some call a dirty bomb.
  Israeli military intelligence sources reported that Bin Laden paid over 2 mil pounds sterling to a middle-man in Kazakhstan, who promised to deliver a dirty bomb to Bin Laden within two years.
  The Saudi-owned, London-based Arabic newspaper, Al-Hayat, declared that Bin Laden had obtained nuclear weapons.
  Osama bin Laden probably does not have a nuclear weapon, but likely has chemical or biological weapons, Defense Secretary Donald H. Rumsfeld said.

Frames:
  Frame Type: Transfer
    Type: acquire
    Source:
    Destin: Bin Laden, Al-Qaida
    Cargo: nuc dev., dirty bomb
  Frame Type: Transfer
    Type: deliver
    Source: mid-man in KZ
    Destin: Bin Laden
    Cargo: dirty bomb
  Frame Type: Transfer
    Type: obtain
    Source:
    Destin: Bin Laden
    Cargo: nuclear weapons
  Frame Type: ~Capable
    Type: possess
    Agent: Bin Laden
    Instr: nuclear weapons

Labels shown on the slide: However; more; effect; negation; Specifically; In fact

Integrated Visual/Language Interface: 

Visual navigational context for dialogue
  Visual representation for event frames, answer spaces, and links between answer spaces
  Multi-level views: scenario, question, frame
  Visual interactions integrated with QA process
Integrated Visual/QA interface
  Question/answer actions immediately reflected on visual
  Folders reflecting user/system dialogue focus
  Visual alerts for system updates

Integrated Visual interface : 

Integrated Visual interface

Visual Interface: frame view: 

Visual Interface: frame view

Other HITIQA work: 

Information aspects “orthogonal” to content
  Type of topic: e.g., political, scientific, military, …
  Type of content: e.g., historical, biographical, …
  Type of communication: e.g., human characteristics
  Promising preliminary results in automatic classification
Intelligence value of information
  Accuracy, reliability, significance, depth, etc.
  Detail level, bias, opinion/viewpoint, objectivity, …
  Most non-metadata features highly personalized
Scenario modeling
  Align GlassBox data with HITIQA activity logs
  Proposed Scenario Modeling Challenge Workshop
