014 fair isaac ABT presentation cg

Presentation Transcript

AQUAINT Applications of Artificial Brain Technology Robert Hecht-Nielsen : 

AQUAINT Applications of Artificial Brain Technology
Robert Hecht-Nielsen
ARDA AQUAINT Program Phase I 24-Month Workshop

Slide2: 

Abstract
Project Category: Cross-Cutting Technology
Project Goal: Apply recent neuroscience information processing discoveries to question answering and widely disseminate the resulting capabilities.
Key Idea 1: Determine the meaning content of a data body (text, sound, video, etc.) by identifying the precise meaning of each conceptual unit (word / character / word group / character group / variable element construction / sound bite / video snip / etc.) using (order / spatial position)-dependent long-range context.
Key Idea 2: Compare or map the meaning content of one data body with that of another by expanding each conceptual unit into a list of near-equivalents and comparing or mapping these lists.

Slide3: 

Theory of Cerebral Cortex: References
Hecht-Nielsen, R. (2003). A Theory of Cerebral Cortex. Technical Report #0301, Institute for Neural Computation, University of California, San Diego.
Download from: inc2.ucsd.edu

Slide4: 

Conclusions of the Cortical Theory
Mammalian cerebral cortex essentially has only one universal mechanism for knowledge acquisition, knowledge storage, and knowledge use.
Cortex uses sets of symbolic descriptive terms, each drawn from a separate lexicon, for describing the objects and actions of the mental world.
Each item of knowledge is a link from one symbol to another. These links are formed between pairs of symbols which meaningfully co-occur.
Given a specific collection of symbols (assumed facts), those symbols within a specified lexicon that receive links from all the assumed facts are termed an expectation.
The elements of the expectation include all reasonable conclusions that one could draw from the assumed facts, given the available knowledge.
Under some conditions, the ‘quality’ of each element of an expectation can be estimated.
In summary, the secret of the brain is that a large, but practically obtainable, collection of pairwise link knowledge, when properly used, can consistently achieve performance surprisingly close to that of a theoretical omniscient system with total knowledge of the information environment.
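
The definition of an expectation above can be illustrated with a minimal sketch. It assumes that knowledge is stored as a set of directed symbol-to-symbol links and that the expectation is the intersection of the link targets of all assumed facts, restricted to one lexicon. The function names and the toy link data are hypothetical and are not taken from the project.

```python
from collections import defaultdict

def build_links(cooccurring_pairs):
    """Store each item of knowledge as a directed link: source symbol -> target symbol."""
    links = defaultdict(set)
    for source, target in cooccurring_pairs:
        links[source].add(target)
    return links

def expectation(links, assumed_facts, lexicon):
    """Return the symbols in `lexicon` that receive a link from every assumed fact."""
    candidates = set(lexicon)
    for fact in assumed_facts:
        candidates &= links.get(fact, set())
    return candidates

# Toy, made-up link knowledge (an assumption for illustration only).
pairs = [
    ("stormed", "terrorists"), ("stormed", "guerrillas"), ("stormed", "police"),
    ("embassy", "terrorists"), ("embassy", "guerrillas"), ("embassy", "diplomats"),
]
links = build_links(pairs)
lexicon = {"terrorists", "guerrillas", "police", "diplomats"}

print(sorted(expectation(links, ["stormed", "embassy"], lexicon)))
# prints ['guerrillas', 'terrorists']
```

Only symbols linked from both "stormed" and "embassy" survive; "police" and "diplomats" each lack one of the two links. The sketch does not attempt the 'quality' estimate mentioned on the slide.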

Slide5: 

Phase I Accomplishments
Question Answering for Text
  Built and demonstrated a component to extract the precise meaning content of the constituent conceptual units in both English and Chinese sentences.
  Conceptual unit meaning content is expressed as a list of replacement units.
  Comparison of text body meaning content can then be carried out by attempting to match the meaning content of constituent conceptual units in both directions.
  With the question expressed in the form of a question answering sentence (with missing placeholders of known class), only text bodies very likely to contain relevant answers will strongly match.
  System only operates on text bodies retrieved by a search engine front end that is operated at high recall and low precision.
Ultra-High-Accuracy Speech Transcription
  Acoustic front end built and demonstrated.
  Sound lexicon developed and demonstrated.
  Mapping from sound symbol stream to word symbols begun.
This Presentation Will Focus on Text Question Answering

Slide6: 

The Problem of Question Answering
Question answering often goes beyond factual query answering. “How old is Tony Blair?” is a factual query. “Is there any evidence that terrorist groups are currently operating in or around Herat, Afghanistan?” is not.
Question answering inherently addresses multiple media: text in many languages, speech, imagery, video, radio, TV, FAX, etc.
User interface will usually be via a voice channel.
Express the question as a body of text in answer form: “The following terrorist groups: *, are currently operating in or around Herat, Afghanistan.”

Slide7: 

Question Answering, Step 1: Obtain Candidate Text Bodies to be Evaluated
  Run the search engine on a high-recall / low-precision setting.
Question Answering, Step 2: Evaluate Domain of Discourse of Each Candidate
  Discard those that are not relevant to the domain of the question.
Question Answering, Step 3: Use Context (both short and long range) to Determine Conceptual Unit Meaning Content
  Apply relevant knowledge to identify close replacement words.

Slide8: 

Context Based Word Expansion
Generate a list of synonyms given no context. This list contains multiple word senses, plus noise.
Prune the list based on the surrounding context.

Expand Each Word: 

Expand terrorists by our synonym knowledge base:
  terrorists, guerrillas, militants, rebels, terrorist, criminals, extremists, gunmen, fighters, forces, groups, terrorism, guerrilla, soldiers, suspected, activists, group, radicals, men, separatists, islamic, rebel, police, suspects,…
Prune the list based on antecedent support knowledge bases in the context of the sentence.
  Context: The terrorists stormed the embassy yesterday in the early morning.
  Pruned List: terrorists, guerrillas, militants, rebels, extremists, gunmen, fighters, forces, groups, soldiers
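
The expand-then-prune flow can be sketched in a few lines. This is an illustration only: the synonym strengths, the `supported_pairs` set standing in for the antecedent support knowledge bases, and the rule "a candidate must be supported by every context word" are all assumptions, not the project's actual data or criterion.

```python
def expand(word, synonym_kb, max_candidates=25):
    """Context-free expansion: the word itself plus its synonyms, strongest first."""
    ranked = sorted(synonym_kb.get(word, {}).items(), key=lambda kv: -kv[1])
    candidates = [word] + [syn for syn, _ in ranked if syn != word]
    return candidates[:max_candidates]

def prune(candidates, context, is_supported):
    """Keep only the candidates that the surrounding context supports."""
    return [c for c in candidates if is_supported(c, context)]

# Hypothetical strengths for a few of the words on the slide.
synonym_kb = {
    "terrorists": {"guerrillas": 0.9, "militants": 0.8, "suspected": 0.3, "police": 0.2},
}

# Hypothetical stand-in for the support knowledge: (context word, candidate) pairs.
supported_pairs = {
    ("stormed", "terrorists"), ("stormed", "guerrillas"),
    ("stormed", "militants"), ("stormed", "police"),
    ("embassy", "terrorists"), ("embassy", "guerrillas"), ("embassy", "militants"),
}

def is_supported(candidate, context):
    """Assumed rule: every context word must support the candidate."""
    return all((ctx, candidate) in supported_pairs for ctx in context)

expanded = expand("terrorists", synonym_kb)
pruned = prune(expanded, ["stormed", "embassy"], is_supported)
print(expanded)  # ['terrorists', 'guerrillas', 'militants', 'suspected', 'police']
print(pruned)    # ['terrorists', 'guerrillas', 'militants']
```

Passing the support test in as a function keeps the expansion step independent of whatever criterion is used for pruning.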

Pruning Using Accumulated Knowledge: 

Context: the terrorists stormed the embassy yesterday in the early morning
            the      terrorists  stormed  the      embassy  yesterday  in       the      early    morning
the         1.00000  0.14844     0.08555  0.08980  0.03905  0.00000    0.00000  0.00000  0.00000  0.00000
terrorists  0.14844  1.00000     0.00144  0.00002  0.00004  0.00000    0.00000  0.00000  0.00000  0.00000
stormed     0.08555  0.00144     1.00000  0.00006  0.00050  0.00000    0.00003  0.00000  0.00000  0.00000
the         0.08980  0.00002     0.00006  1.00000  0.20070  0.04424    0.07100  0.08206  0.00000  0.00000
embassy     0.03905  0.00004     0.00050  0.20070  1.00000  0.00002    0.00014  0.00004  0.00010  0.00000
yesterday   0.00000  0.00000     0.00000  0.04424  0.00002  1.00000    0.00029  0.00024  0.00009  0.00018
in          0.00000  0.00000     0.00003  0.07100  0.00014  0.00029    1.00000  0.11329  0.15407  0.03139
the         0.00000  0.00000     0.00000  0.08206  0.00004  0.00024    0.11329  1.00000  0.20136  0.07574
early       0.00000  0.00000     0.00000  0.00000  0.00010  0.00009    0.15407  0.20136  1.00000  0.03263
morning     0.00000  0.00000     0.00000  0.00000  0.00000  0.00018    0.03139  0.07574  0.03263  1.00000
Support for “guerrillas” in place of “terrorists”:
guerrillas  0.23772  1.00000     0.01391  0.00012  0.00007  0.00000    0.00000  0.00000  0.00000  0.00000
Accept!!
Support for “suspected” in place of “terrorists”:
suspected   0.04661  1.00000     0.00000  0.00004  0.00006  0.00000    0.00000  0.00000  0.00000  0.00000
Reject!!
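
The slide does not state the rule that turns a support row into Accept or Reject. One plausible rule, assumed here purely for illustration, is to sum a candidate's support toward the other sentence positions and accept it when that total reaches some fraction of the original word's total. The helper names and the 0.5 threshold are hypothetical; the three rows are copied from the slide.

```python
def context_support(row, self_index):
    """Sum a word's support values toward every other position in the sentence."""
    return sum(value for i, value in enumerate(row) if i != self_index)

def accept(candidate_row, original_row, self_index, min_ratio=0.5):
    """Assumed rule: accept if the candidate keeps at least `min_ratio` of the original support."""
    original = context_support(original_row, self_index)
    candidate = context_support(candidate_row, self_index)
    return candidate >= min_ratio * original

# Rows from the slide; index 1 is the sentence position of "terrorists".
terrorists = [0.14844, 1.0, 0.00144, 0.00002, 0.00004, 0.0, 0.0, 0.0, 0.0, 0.0]
guerrillas = [0.23772, 1.0, 0.01391, 0.00012, 0.00007, 0.0, 0.0, 0.0, 0.0, 0.0]
suspected = [0.04661, 1.0, 0.00000, 0.00004, 0.00006, 0.0, 0.0, 0.0, 0.0, 0.0]

print(accept(guerrillas, terrorists, self_index=1))  # True
print(accept(suspected, terrorists, self_index=1))   # False
```

With these numbers "guerrillas" has more total context support than "terrorists" itself, while "suspected" retains less than a third of it, which matches the Accept and Reject verdicts on the slide under the assumed threshold.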

Compare All Pairs of Pruned Lists: 

Sentence #1
     rebels      seized         headquarters  Tuesday
     militants   attacked       ambassador    Friday
     guerrillas  raided         capital       today
The  terrorists  stormed   the  embassy       yesterday.
Sentence #2
    Thursday        hotel           seized        militants
    Tuesday         apartment       attacked      rebels
    Wednesday       buildings       stormed       guerrillas
On  Monday     the  building   was  raided    by  gunmen.

Pruned List Pair Similarity Measure: 

Compare two sets of words and assign a numerical similarity.
Use the synonym database, which provides a ranked list of relative strengths.
embassy    1.0      capital    1.0
hotel      0.193    hotel      0.039
building   0.189    building   0.051
buildings  0.154    buildings  0.016
apartment  0.104    apartment  0.011

Similarity Measure: 

Compute an aggregate word group to word group similarity based on synonym strengths.
Aggregate all the word group similarities into a sentence similarity measure.
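
The slides do not give the aggregation formulas, so the following is only a sketch under stated assumptions: the similarity of two word groups is taken as the strongest synonym strength between any pair of their members (identical words count as 1.0), and the sentence similarity is the mean, over the word groups of the first sentence, of the best match found in the second. The word groups and the embassy/capital strengths come from the earlier slides; the function names are hypothetical.

```python
def strength(a, b, synonym_strength):
    """Symmetric lookup of synonym strength; identical words count as 1.0."""
    if a == b:
        return 1.0
    return max(synonym_strength.get((a, b), 0.0), synonym_strength.get((b, a), 0.0))

def group_similarity(group_a, group_b, synonym_strength):
    """Assumed aggregate: the strongest pairwise strength between the two groups."""
    return max(strength(a, b, synonym_strength) for a in group_a for b in group_b)

def sentence_similarity(groups_1, groups_2, synonym_strength):
    """Assumed aggregate: mean over groups of sentence 1 of their best match in sentence 2."""
    best = [max(group_similarity(g1, g2, synonym_strength) for g2 in groups_2)
            for g1 in groups_1]
    return sum(best) / len(best)

# Strengths from the slide for "embassy" and "capital".
synonym_strength = {
    ("embassy", "hotel"): 0.193, ("embassy", "building"): 0.189,
    ("embassy", "buildings"): 0.154, ("embassy", "apartment"): 0.104,
    ("capital", "hotel"): 0.039, ("capital", "building"): 0.051,
    ("capital", "buildings"): 0.016, ("capital", "apartment"): 0.011,
}

# Pruned lists for the two example sentences.
sentence_1 = [
    ["terrorists", "guerrillas", "militants", "rebels"],
    ["stormed", "raided", "attacked", "seized"],
    ["embassy", "capital", "ambassador", "headquarters"],
    ["yesterday", "today", "Friday", "Tuesday"],
]
sentence_2 = [
    ["Monday", "Wednesday", "Tuesday", "Thursday"],
    ["building", "buildings", "apartment", "hotel"],
    ["raided", "stormed", "attacked", "seized"],
    ["gunmen", "guerrillas", "rebels", "militants"],
]

print(group_similarity(sentence_1[2], sentence_2[1], synonym_strength))  # 0.193
print(round(sentence_similarity(sentence_1, sentence_2, synonym_strength), 3))  # 0.798
```

Three of the four word groups share at least one identical word, so they score 1.0; the place group matches only through the embassy/hotel strength of 0.193, which pulls the sentence score below 1.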

Text Analysis for Meaning Similarity: 

Text Analysis for Meaning Similarity

Chinese Experiments: 

Segmentation of characters into words
Synonymy
Phrase Completion

Chinese Experiments: 

Chinese words are not separated by white space:
在 已 验 收 的 技 术 成 果 中 , 由 成 都 山 地 灾 害 与 环 境 研 究 所 研 究 的 " 川 江 流 域 林 业 生 态 地 貌 及 林 业 生 态 分 区 " , 在 国 内 首 次 提 出 了 防 护 林 建 设 区 一 期 工 程 地 貌 生 态 分 区 原 则 , 等 级 和 定 量 数 据 , 为 长 江 防 护 林 的 区 域 规 划 提 供 了 重 要 科 学 依 据 .
Translation: In the field of technical research, a research institute in Cheng-Du province which focuses on disasters has a new project named “Tree planting to change the environment and prevent disasters”, which is the first time a group has mentioned that by use of relocating trees within different levels of the environment, this proposal offers important scientific evidence that such actions prevent erosion near the river.
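
The slides do not say how characters were grouped into words. As a stand-in, the sketch below uses forward maximum matching, a common dictionary-based baseline: at each position it takes the longest lexicon entry that matches and falls back to a single character otherwise. The function name, the tiny lexicon, and the test sentence are assumptions for illustration.

```python
def segment(text, lexicon, max_word_len=4):
    """Forward maximum matching: greedily take the longest lexicon word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest window first; a single character is always accepted.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in lexicon:
                words.append(piece)
                i += length
                break
    return words

# A tiny hypothetical lexicon of two-character words.
lexicon = {"中国", "政府", "进行", "发展", "工作"}

print(segment("中国政府进行发展工作", lexicon))
# prints ['中国', '政府', '进行', '发展', '工作']
```

Characters that belong to no lexicon entry come out as one-character words, so the quality of the result depends entirely on lexicon coverage.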

Chinese Character Frequency: 

Entry #  Frequency  Word       Meaning
0        26139327   的         ‘s (possessive adj.)
1        10554031   number123  an Arabic number
2        10342370   一         one
3        9034251    中         middle (used in “China”)
4        8615187    在         at (locality)
5        6966996    人         person (very general)
6        786185     國         country
7        776289     十         ten
8        755264     大         large
9        273933     有         have (verb, infinitive)
10       6018090    年         year

Chinese Segmentation and “Word” Frequency: 

Entry #  Frequency  Word          Meaning
0        90239      中 国         China
1        84042      台 灣         Taiwan
2        56841      美 國         America
3        43425      国 家         Country
4        41416      政 府         Government
5        40440      number123 年  Num + year
6        39098      进 行         Carries on
7        38662      发 展         Development
8        37132      美 国         America
9        30960      工 作         Work
10       30941      number123 日  Num + Day

Chinese Synonymy: 

Word Id = 5013: 經 濟
Word Id  Strength    Word         English Equivalent
5013     4237.62402  經 濟        Economy
6710     1311.82837  經 濟 發 展  Economic Development
12154    757.95203   經 濟 景 氣  Economic Boom
6419     715.68579   經 貿        Economics and Trade
5041     644.58606   投 資        Investment
11064    616.31641   經 濟 研 究  Research in Economy
5449     564.08844   農 業        Agriculture
5059     539.64380   地 區        Region
26111    532.93903   經 濟 結 構  Economic Structure
8291     529.57587   漁 業        Fishery

Chinese Phrase Completion Examples: 

PHRASE: 今 天 他 们 ? (Today, they ?)
Word Id  Strength     Word   English
4        14319.00000  在     Today, they are at
12       2334.00000   是     Today, they are
2        2075.00000   一     Today, they First/Expression
9        1876.00000   有     Today, they have
13       1765.00000   不     Today, they not (did not, had not, etc)
5039     1395.00000   表 示  Today, they expressed
8        1087.00000   大     Today, they Large/Expression
5006     1067.00000   进 行  Today, they carry on
5250     953.00000    提 出  Today, they proposed
192      883.00000    对     Today, they are right…
5195     876.00000    指 出  Today, they pointed out
203      760.00000    向     Today, they approach

Slide26: 

The Fair Isaac Team
Dr. Robert Means - Chief Technologist
Kate Mark - Project Coordinator
David Busby - Brain Software Architect
Dr. Syrus Nemat-Nasser - Researcher
Dr. Shailesh Kumar - Researcher
Rion Snow - Researcher
Adrian Fan - Researcher
Luke Barrington - Intern