WNPres

Presentation Transcript

WordNet: 

Question Answering Systems
March 17, 2003
John Hankins

Overview: 

WordNet Design
  “Five Papers on WordNet”
WordNet Interfaces
  C API, JWNL (Java)

WN Design: 

Lexical database for English
  129,509 word forms (~45% collocations)
  99,643 synsets
  391,885 relations
  Parts of speech: nouns, verbs, adjectives, adverbs
Psycholinguistics
  WN attempts to model human lexical memory
  Design based on psychological testing

Synsets: 

Synsets = synonym sets
Synsets are the fundamental component of WordNet
Each synset represents a concept
  {board, plank}
  {board, committee}
  {board, get on}

Polysemy & Word Forms: 

Synsets (word meanings) contain one or more word forms
Word forms that appear in multiple synsets are polysemous
Polysemous word forms are said to have multiple senses
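
The relation between synsets, word forms, and senses can be pictured with a small sketch. This is an illustration only, not WordNet's storage format: the three synsets are the "board" examples from the previous slide, and the names SYNSETS, sense_counts, and is_polysemous are hypothetical.

```python
from collections import Counter

# Toy data: each synset is a set of word forms standing for one concept.
SYNSETS = [
    {"board", "plank"},
    {"board", "committee"},
    {"board", "get on"},
]

def sense_counts(synsets):
    """Count how many synsets each word form appears in (its number of senses)."""
    counts = Counter()
    for synset in synsets:
        counts.update(synset)  # adds 1 for every word form in this synset
    return counts

def is_polysemous(word_form, synsets):
    """A word form is polysemous when it belongs to more than one synset."""
    return sense_counts(synsets)[word_form] > 1

print(sense_counts(SYNSETS)["board"])
print(is_polysemous("plank", SYNSETS))
```

In this toy data "board" has three senses, while "plank" has one and is therefore not polysemous.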

Familiarity: 

Familiarity can predict: speed of reading, speed of comprehension, ease of recall, probability of use
Familiarity based on word count:
  Hard to find reliable data
  Influenced by corpus
  Error prone

Polysemy, Familiarity & Zipf's Law:

Zipf's law: There is a constant k such that f * r = k
  In words: there is a predictable relation between the frequency of a word (f) and its rank (r)
Zipf's other law: The number of meanings of a word is related to its frequency of use
So, WN uses polysemy to indicate familiarity
  Example: horse 6, equine 1
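
A minimal sketch of both ideas follows. The word counts are made up so that they follow f = 1200 / r exactly (real corpora only approximate this), and the polysemy counts are the two values from the slide. The function names are hypothetical.

```python
def zipf_products(frequencies):
    """Return f * r for each frequency, giving rank 1 to the highest frequency."""
    ranked = sorted(frequencies, reverse=True)
    return [f * r for r, f in enumerate(ranked, start=1)]

# Made-up counts chosen to fit Zipf's law perfectly (k = 1200).
counts = [1200, 600, 400, 300, 240]
print(zipf_products(counts))

# Polysemy counts from the slide, used as a stand-in for familiarity.
POLYSEMY = {"horse": 6, "equine": 1}

def more_familiar(word_a, word_b):
    """Pick the word with more senses; ties go to the first argument."""
    return word_a if POLYSEMY[word_a] >= POLYSEMY[word_b] else word_b

print(more_familiar("equine", "horse"))
```

With these idealized counts every product f * r equals the same constant k, and the polysemy index ranks "horse" as more familiar than "equine".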

Nouns in WN: 

Typical noun definition: superordinate term plus distinguishing features
  Maple is a kind of tree of the genus Acer bearing winged seeds in pairs; north temperate zone
Superordinate (parent) terms are known as hypernyms (‘tree’ in our example)
Subordinate (child) terms are known as hyponyms

Nouns & Lexical Inheritance: 

Hypernym/hyponym hierarchy: lexical inheritance
Each child inherits the characteristics of its parent
Similar to inheritance systems in computer science

Lexical Inheritance Example: 

Hypernyms of maple (sense 2):
  => angiospermous tree, flowering tree
    => tree
      => woody plant, ligneous plant
        => vascular plant, tracheophyte
          => plant, flora, plant life
            => organism, being
              => living thing, animate thing
                => object, physical object
                  => entity, physical thing
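
The chain above can be walked with simple parent pointers. The sketch below is a toy model, not the WordNet API: it keeps only the first word form of each synset from the maple example, and the names HYPERNYM, hypernym_chain, and is_a are hypothetical.

```python
# Parent pointers copied from the maple example. Each key is the first word
# form of a synset; each value is the first word form of its hypernym.
HYPERNYM = {
    "maple": "angiospermous tree",
    "angiospermous tree": "tree",
    "tree": "woody plant",
    "woody plant": "vascular plant",
    "vascular plant": "plant",
    "plant": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}

def hypernym_chain(term):
    """Follow parent pointers upward and return the ancestors, nearest first."""
    chain = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        chain.append(term)
    return chain

def is_a(term, ancestor):
    """True when `ancestor` appears somewhere above `term` in the hierarchy."""
    return ancestor in hypernym_chain(term)

print(hypernym_chain("maple"))
print(is_a("maple", "plant"))
```

Lexical inheritance then amounts to looking up a property on each ancestor in turn, much like attribute lookup along a class hierarchy.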

Unique Beginners: 

25 unique beginners

noun.Tops file: 

Contains very general classifications

Noun Characteristics: 

Three types of characteristics that distinguish a noun:
  Attributes (modified by adjectives; not implemented)
  Parts (links to other nouns; implemented as the meronymy relationship)
  Functions (links to verbs; not implemented)

Meronymy: 

Meronyms represent parts of a concept
Meronyms of tree:
  HAS SUBSTANCE: sapwood
  HAS SUBSTANCE: heartwood, duramen
  HAS PART: stump, tree stump
  HAS PART: crown, capitulum, treetop
  HAS PART: limb, tree branch
  HAS PART: trunk, tree trunk, bole
  HAS PART: burl
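
Combined with lexical inheritance, a hyponym such as maple can be treated as having the parts of tree. The sketch below shows one way to collect meronyms along the hypernym chain. It is an assumption about how inheritance could be applied, not a description of WordNet's search code; the data is abbreviated from the slides and all names are hypothetical.

```python
# Abbreviated toy data: first word form of each synset only.
HYPERNYM = {"maple": "angiospermous tree", "angiospermous tree": "tree"}
MERONYMS = {
    "tree": {
        "HAS SUBSTANCE": ["sapwood", "heartwood"],
        "HAS PART": ["stump", "crown", "limb", "trunk", "burl"],
    },
}

def inherited_meronyms(term):
    """Collect the meronyms of a term and of every hypernym above it."""
    result = {}
    while term is not None:
        for relation, items in MERONYMS.get(term, {}).items():
            # setdefault creates a fresh list, so MERONYMS itself is not modified
            result.setdefault(relation, []).extend(items)
        term = HYPERNYM.get(term)  # None once the top of the toy chain is reached
    return result

print(inherited_meronyms("maple"))
```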

Other Noun Relations: 

Coordinate terms
  Words that have the same hypernym
Antonyms
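
Coordinate terms fall out of the hypernym pointers directly: they are the other children of the same parent. The sketch below uses a simplified, made-up hierarchy (oak, pine, and shrub are illustrative entries, and maple is attached straight to tree); the function name is hypothetical.

```python
# Simplified, illustrative hierarchy: child -> hypernym.
HYPERNYM = {
    "maple": "tree",
    "oak": "tree",
    "pine": "tree",
    "tree": "woody plant",
    "shrub": "woody plant",
}

def coordinate_terms(term):
    """Return the other terms that share this term's hypernym, sorted by name."""
    parent = HYPERNYM.get(term)
    if parent is None:
        return []  # no recorded hypernym, so no coordinates in this toy data
    return sorted(t for t, p in HYPERNYM.items() if p == parent and t != term)

print(coordinate_terms("maple"))
```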

Adjectives: 

Adjectives modify a noun
Two major categories: descriptive (big, interesting, possible) & relational (presidential, nuclear)
Plus a small set of reference-modifying adjectives (former, alleged), and chromatic color adjectives
Antonyms are key to the organization of adjectives

Descriptive Adjectives: 

Assign a value to an attribute of a noun
  The package is heavy
  WEIGHT(package) = heavy
The key organizing relationship for adjectives is antonymy
  Bipolar opposition: wet/dry, heavy/light

Adjectives & Antonymy: 

Important question: Should antonym pointers be between synsets or between word forms?
Consider these two synsets:
  {heavy, weighty, ponderous}
  {light, weightless, airy}
Heavy/light are antonyms, but what about ponderous/light?
Therefore, the antonym relationship holds between word forms, not synsets
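
A small sketch of that design choice: antonym pointers are stored per word form, while the synset only groups forms with a similar meaning. The data is the heavy/light example from the slide; the function names, and the idea of reaching an opposite through a synset-mate, are illustrative assumptions.

```python
# The two synsets from the slide.
SYNSETS = [
    frozenset({"heavy", "weighty", "ponderous"}),
    frozenset({"light", "weightless", "airy"}),
]

# Antonym pointers live on word forms, not on synsets.
ANTONYM = {"heavy": "light", "light": "heavy"}

def direct_antonym(word_form):
    """Return the antonym recorded for this exact word form, or None."""
    return ANTONYM.get(word_form)

def opposed_via_synset(word_form):
    """Return antonyms of any word form sharing a synset with `word_form`."""
    for synset in SYNSETS:
        if word_form in synset:
            return sorted(ANTONYM[w] for w in synset if w in ANTONYM)
    return []

print(direct_antonym("ponderous"))
print(opposed_via_synset("ponderous"))
```

Here "ponderous" has no antonym of its own, yet it is still conceptually opposed to "light" through its synset-mate "heavy".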

Adjective Relationships:

Adjective relations between an antonym pair

Relational Adjectives: 

Relational adjectives such as presidential are derived from nouns
“Pertains to” pointers link them to their related noun
Un- or non- prefix usually needed to represent antonyms
  presidential/unpresidential

Adverbs: 

Adverbs modify verbs
Similar to the antonym/similarity relationships of adjectives
Have links to ‘root adjectives’
  quickly: derived from adj quick (sense 1)

Verbs: 

Polysemous
  2.11 senses on average vs. 1.74 for nouns
  More flexible meanings
21,000 verb word forms in 8,400 synsets
Divided into 14 files based on semantic criteria (plus 1 file of state verbs such as suffice, belong, resemble)

Verb Semantics: 

Decompositional semantics represents verbs with irreducible meaning atoms (EVENT, STATE, ACTION, PATH, MANNER, PLACE, etc.)
Relational semantic analysis uses lexical items as the smallest unit of analysis
Verbs in WordNet are semantically defined in relation to other verbs

Entailment: 

Similar to the lexical inheritance of nouns, verbs are related via entailment
Definition: A proposition P entails a proposition Q if and only if there is no conceivable state of affairs that could make P true and Q false
In terms of verbs: Entailment holds when the sentence “Someone V1” logically entails the sentence “Someone V2”
Examples: limp entails walk, snore entails sleep
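
The definition can be made concrete by listing states of affairs and checking that none makes V1 true and V2 false. The states below are a hand-written toy list, not data from WordNet, and the name entails is hypothetical.

```python
# Each state of affairs is the set of verbs that are true of "someone" in it.
# This hand-written list stands in for "all conceivable states of affairs".
STATES = [
    set(),
    {"sleep"},
    {"sleep", "snore"},
    {"walk"},
    {"walk", "limp"},
]

def entails(v1, v2, states):
    """True when no listed state makes v1 true and v2 false.

    Note: a verb that occurs in no state entails everything (vacuously).
    """
    return all(v2 in state for state in states if v1 in state)

print(entails("limp", "walk", STATES))
print(entails("walk", "limp", STATES))
```

Entailment is one-directional here: limping entails walking, but the state in which someone walks without limping blocks the reverse.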

Entailment: 

Four kinds of entailment relationships

Entailment as Pointers: 

The entailment relationship is represented with hypernym/hyponym pointers
Example: hypernyms of limp:
  limp, hobble, hitch
    => walk
      => travel, go, move, locomote
Rarely more than 3 or 4 levels deep

Part 2: WordNet Interfaces: 

Programming interfaces to the WN database
  C (the original WN API)
  Java: JWNL
  Perl, PHP, MySQL, .NET (C#), Lisp, Prolog, Python
Browsable interfaces
  Command line
  GUI interfaces (X Windows, Win32)
  Web interfaces

Implementation Overview: 

Lexicographers' files
  Plain text
  Written by linguists
Grinder
Database files
Software that accesses the database
  Original WN C interface
  Similar interfaces in other languages

C Interface: 

Original WN interface
High- and low-level search functions, and morphological functions (Morphy)
  int wninit(void); /* initializes the db */

C Interface: Search Functions: 

findtheinfo and findtheinfo_ds are the primary high-level search functions
  char *findtheinfo(char *searchstr, int pos, int ptr_type, int sense_num);
  SynsetPtr findtheinfo_ds(char *searchstr, int pos, int ptr_type, int sense_num);
findtheinfo_ds returns a linked list data structure

C Interface: Search Type: 

int ptr_type specifies the search type
Full list: WNHOME/include/wnconsts.h
  ANTPTR: Antonyms
  HYPERPTR: Hypernyms
  HYPOPTR: Hyponyms
  …

C Interface: Synset Structure: 

Rules for reading ds: http://www.cogsci.princeton.edu/~wn/man1.7.1/wnsearch.3WN.html

C Interface: Morphy:

Morphy: The WordNet morphological processor
  int morphinit(void);
  char *morphstr(char *origstr, int pos);
Collocations: formed by joining words with underscores (e.g. look_up)
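
The underscore convention for collocations is easy to sketch. The helpers below only illustrate how a multi-word phrase maps to and from the joined form; they are hypothetical names and do not reproduce Morphy's actual processing.

```python
def to_collocation(phrase):
    """Join the words of a phrase with underscores, collapsing extra whitespace."""
    return "_".join(phrase.split())

def from_collocation(entry):
    """Turn an underscore-joined entry back into a space-separated phrase."""
    return entry.replace("_", " ")

print(to_collocation("look up"))
print(from_collocation("look_up"))
```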

JWNL: WN for Java: 

Java WordNet Library

Global WordNet: 

Global WordNet Association
Multi-lingual
Interlinking WordNets of different languages

Links: 

Princeton’s WordNet site: http://www.cogsci.princeton.edu/~wn/
“Five Papers on WordNet”: http://www.cogsci.princeton.edu/~wn/5papers.pdf
JWNL: http://sourceforge.net/projects/jwordnet/