Outline : Outline Linguistic Theories of semantic representation
Case Frames – Fillmore – FrameNet
Lexical Conceptual Structure – Jackendoff – LCS
Proto-Roles – Dowty – PropBank
English verb classes (diathesis alternations) – Levin – VerbNet
Manual Semantic Annotation
Automatic Semantic annotation
Parallel PropBanks and Event Relations
Thematic Proto-Roles and Argument Selection, David Dowty, Language 67: 547-619, 1991 : Thematic Proto-Roles and Argument Selection, David Dowty, Language 67: 547-619, 1991 Thanks to Michael Mulyar
Context: Thematic Roles : Context: Thematic Roles Thematic relations (Gruber 1965, Jackendoff 1972)
Traditional thematic roles types include:
Agent, Patient, Goal, Source, Theme, Experiencer, Instrument (p. 548).
“Argument-Indexing View”: thematic roles are objects at the syntax-semantics interface, determining a syntactic derivation or the linking relations.
Θ-Criterion (GB Theory): each NP of predicate in lexicon assigned unique θ-role (Chomsky 1981).
Problems with Thematic Role Types : Problems with Thematic Role Types Thematic role types used in many syntactic generalizations, e.g. involving empirical thematic role hierarchies. Are thematic roles syntactic universals (or e.g. constructionally defined)?
Relevance of role types to syntactic description needs motivation, e.g. in describing transitivity.
Thematic roles lack independent semantic motivation.
Apparent counter-examples to θ-criterion (Jackendoff 1987).
Encoding semantic features (Cruse 1973) may not be relevant to syntax.
Problems with Thematic Role Types : Problems with Thematic Role Types Fragmentation: Cruse (1973) subdivides Agent into four types.
Ambiguity: is Extent an adjunct or a core argument (Andrews 1985)?
Symmetric stative predicates: e.g. “This is similar to that” Distinct roles or not?
Searching for a Generalization: What is a Thematic Role?
Proto-Roles : Proto-Roles Event-dependent Proto-roles introduced
Prototypes based on shared entailments
Grammatical relations such as subject related to observed (empirical) classification of participants
Typology of grammatical relations
Proto-Agent
Proto-Patient
Proto-Agent : Proto-Agent Properties
Volitional involvement in event or state
Sentience (and/or perception)
Causing an event or change of state in another participant
Movement (relative to position of another participant)
(exists independently of event named)
*may be discourse pragmatic
Proto-Patient : Proto-Patient Properties:
Undergoes change of state
Incremental theme
Causally affected by another participant
Stationary relative to movement of another participant
(does not exist independently of the event, or at all)
*may be discourse pragmatic
Argument Selection Principle : Argument Selection Principle For 2 or 3 place predicates
Based on empirical count (total of entailments for each role).
Greatest number of Proto-Agent entailments → Subject; greatest number of Proto-Patient entailments → Direct Object.
Alternation predicted if number of entailments for each role similar (nondiscreteness).
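The counting step of the selection principle can be sketched as follows. This is a minimal illustration, not Dowty's formalization: the property names are shorthand labels chosen here, and the tie-break (on equal Proto-Agent counts, the argument with fewer Proto-Patient entailments becomes subject) is an assumption modelled on the psychological-predicate example.

```python
# Shorthand labels for the proto-role entailments (naming is an assumption).
PROTO_AGENT = {"volition", "sentience", "causation", "movement",
               "independent_existence"}
PROTO_PATIENT = {"change_of_state", "incremental_theme", "causally_affected",
                 "stationary", "dependent_existence"}


def agent_score(entailments):
    """Number of Proto-Agent entailments an argument carries."""
    return len(set(entailments) & PROTO_AGENT)


def patient_score(entailments):
    """Number of Proto-Patient entailments an argument carries."""
    return len(set(entailments) & PROTO_PATIENT)


def select_arguments(args):
    """args maps argument name -> set of entailments.

    Returns (subject, direct_object) for a 2- or 3-place predicate.
    """
    top = max(agent_score(e) for e in args.values())
    tied = [name for name, e in args.items() if agent_score(e) == top]
    # Assumed tie-break: fewer Proto-Patient entailments wins subjecthood.
    subject = min(tied, key=lambda name: patient_score(args[name]))
    rest = [name for name in args if name != subject]
    direct_object = max(rest, key=lambda name: patient_score(args[name]))
    return subject, direct_object


# "y frightens x": both arguments have one Proto-Agent entailment, but the
# experiencer also undergoes a change of state.
frighten = {
    "experiencer": {"sentience", "change_of_state"},
    "stimulus": {"causation"},
}
print(select_arguments(frighten))
```

When the counts tie on both dimensions, as with "like", this sketch simply returns the first candidate; that is the case where the principle predicts alternation rather than a fixed choice.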
Worked Example: Psychological Predicates : Worked Example: Psychological Predicates Examples:
Experiencer Subject    Stimulus Subject
x likes y              y pleases x
x fears y              y frightens x
Describes “almost the same” relation
Experiencer: sentient (P-Agent)
Stimulus: causes emotional reaction (P-Agent)
Number of proto-entailments same; but for stimulus subject verbs, experiencer also undergoes change of state (P-Patient) and is therefore lexicalized as the patient.
Symmetric Stative Predicates : Symmetric Stative Predicates Examples:
This one and that one rhyme / intersect / are similar.
This rhymes with / intersects with / is similar to that.
(cf. The drunk embraced the lamppost. / *The drunk and the lamppost embraced.)
Symmetric Predicates: Generalizing via Proto-Roles : Symmetric Predicates: Generalizing via Proto-Roles Conjoined predicate subject has Proto-Agent entailments which two-place predicate relation lacks (i.e. for object of two-place predicate).
Generalization entirely reducible to proto-roles.
Strong cognitive evidence for proto-roles: would be difficult to deduce lexically, but easy via knowledge of proto-roles.
Diathesis Alternations : Diathesis Alternations Alternations:
Spray / Load
Hit / Break
Non-alternating:
Swat / Dash
Fill / Cover
Spray / Load Alternation : Spray / Load Alternation Example:
Mary loaded the hay onto the truck.
Mary loaded the truck with hay.
Mary sprayed the paint onto the wall.
Mary sprayed the wall with paint.
Analyzed via proto-roles, not e.g. as a theme / location alternation.
Direct object analyzed as an Incremental Theme, i.e. either of two non-subject arguments qualifies as incremental theme. This accounts for alternating behavior.
Hit / Break Alternation : Hit / Break Alternation John hit the fence with a stick.
John hit the stick against a fence.
John broke the fence with a stick.
John broke the stick against the fence.
Radical change in meaning associated with break but not hit.
Explained via proto-roles (change of state for direct object with break class).
Swat doesn’t alternate… : Swat doesn’t alternate… swat the boy with a stick
*swat the stick at / against the boy
Fill / Cover : Fill / Cover Fill / Cover are non-alternating:
Bill filled the tank (with water).
*Bill filled water (into the tank).
Bill covered the ground (with a tarpaulin).
*Bill covered a tarpaulin (over the ground).
Only goal lexicalizes as incremental theme (direct object).
Conclusion : Conclusion Dowty argues for Proto-Roles based on linguistic and cognitive observations.
Objections: Are P-roles empirical (extending arguments about hit class)?
Proposition Bank: From Sentences to Propositions : Proposition Bank: From Sentences to Propositions . . . When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))
meet(Somebody1, Somebody2)
A TreeBanked phrase : A TreeBanked phrase a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
(Parse tree figure; node labels shown: NP, SBAR, WHNP-1 (that), S, NP-SBJ (*T*-1), VP (would), VP (give), NP, PP-LOC.)
The same phrase, PropBanked : The same phrase, PropBanked a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
REL: give
Arg0: *T*-1 (that)
Arg2: the US car maker
Arg1: an eventual 30% stake in the British company
The full sentence, PropBanked : The full sentence, PropBanked Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
expect:
Arg0: Analysts
REL: have been expecting
Arg1: a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company
give:
Arg0: *T*-1 (that)
REL: give
Arg2: the US car maker
Arg1: an eventual 30% stake in the British company
Frames File example: expect : Frames File example: expect Roles:
Arg0: expecter
Arg1: thing expected
Example: Transitive, active:
Portfolio managers expect further declines in
interest rates.
Arg0: Portfolio managers
REL: expect
Arg1: further declines in interest rates
Frames File example: give : Frames File example: give Roles:
Arg0: giver
Arg1: thing given
Arg2: entity given to
Example: double object
The executives gave the chefs a standing ovation.
Arg0: The executives
REL: gave
Arg2: the chefs
Arg1: a standing ovation
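A frames file can be pictured as a lookup from a frameset to its numbered roles, against which each annotated instance is checked. The sketch below is illustrative only: the frameset identifiers `expect.01` and `give.01` and the function name `gloss` are assumptions made here, and the real frames files are richer than a flat dictionary.

```python
# Roles copied from the two frames file examples; IDs are assumed.
FRAMESETS = {
    "expect.01": {"Arg0": "expecter", "Arg1": "thing expected"},
    "give.01": {"Arg0": "giver", "Arg1": "thing given",
                "Arg2": "entity given to"},
}


def gloss(frameset_id, instance):
    """Attach each role's description to the labelled spans of one instance.

    instance is a list of (label, text) pairs in sentence order.
    Raises ValueError for a numbered argument the frameset does not define.
    """
    roles = FRAMESETS[frameset_id]
    glossed = []
    for label, text in instance:
        if label == "REL":
            glossed.append((label, text, "predicate"))
        elif label in roles:
            glossed.append((label, text, roles[label]))
        else:
            raise ValueError(f"{label} is not a role of {frameset_id}")
    return glossed


double_object = [
    ("Arg0", "The executives"),
    ("REL", "gave"),
    ("Arg2", "the chefs"),
    ("Arg1", "a standing ovation"),
]
for row in gloss("give.01", double_object):
    print(row)
```

Keeping the role inventory per frameset, rather than per verb lemma, is what lets different senses of one verb carry different argument lists.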
Word Senses in PropBank : Word Senses in PropBank Orders to ignore word sense not feasible for 700+ verbs
Mary left the room
Mary left her daughter-in-law her pearls in her will
Frameset leave.01 "move away from":
Arg0: entity leaving
Arg1: place left
Frameset leave.02 "give":
Arg0: giver
Arg1: thing given
Arg2: beneficiary
How do these relate to traditional word senses in VerbNet and WordNet?
Annotation procedure : Annotation procedure PTB II - Extraction of all sentences with given verb
Create Frame File for that verb (Paul Kingsbury)
(3100+ lemmas, 4400 framesets, 118K predicates)
Over 300 created automatically via VerbNet
First pass: Automatic tagging (Joseph Rosenzweig)
http://www.cis.upenn.edu/~josephr/TIDES/index.html#lexicon
Second pass: Double blind hand correction (Paul Kingsbury)
Tagging tool highlights discrepancies (Scott Cotton)
Third pass: Solomonization (adjudication) (Betsy Klipple, Olga Babko-Malaya)
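The second and third passes amount to comparing two independent annotations and sending the differences to an adjudicator. The sketch below shows one way to do that comparison; it is not the actual tagging tool. The span encoding (token offsets, end exclusive), the function names, and the simple percent-agreement measure are all assumptions.

```python
def discrepancies(ann_a, ann_b):
    """Spans on which two annotators disagree.

    Each annotation maps a (start, end) token span to a role label.
    A span labelled by only one annotator shows up with None for the other.
    """
    spans = sorted(set(ann_a) | set(ann_b))
    return [(s, ann_a.get(s), ann_b.get(s))
            for s in spans if ann_a.get(s) != ann_b.get(s)]


def agreement(ann_a, ann_b):
    """Fraction of spans that received the same label from both annotators."""
    spans = set(ann_a) | set(ann_b)
    if not spans:
        return 1.0
    same = sum(1 for s in spans if ann_a.get(s) == ann_b.get(s))
    return same / len(spans)


# "Mary left her daughter-in-law her pearls in her will" (leave.02, "give")
annotator_a = {(0, 1): "Arg0", (2, 4): "Arg2", (4, 6): "Arg1", (6, 9): "ArgM-LOC"}
annotator_b = {(0, 1): "Arg0", (2, 4): "Arg2", (4, 6): "Arg1", (6, 9): "ArgM-MNR"}

for span, a, b in discrepancies(annotator_a, annotator_b):
    print(span, a, b)  # items for the adjudication pass
print(agreement(annotator_a, annotator_b))
```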
Semantic role labels: : Semantic role labels: Jan broke the LCD projector.
break(agent(Jan), patient(LCD-projector))
cause(agent(Jan), change-of-state(LCD-projector))
(broken(LCD-projector))
agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
patient(P) -> affected(P), change(P),…
Fillmore, 68; Jackendoff, 72; Dowty, 91
Trends in Argument Numbering : Trends in Argument Numbering Arg0 = agent
Arg1 = direct object / theme / patient
Arg2 = indirect object / benefactive / instrument / attribute / end state
Arg3 = start point / benefactive / instrument / attribute
Arg4 = end point
Per word vs frame level – more general?
Additional tags (arguments or adjuncts?) : Additional tags (arguments or adjuncts?) Variety of ArgM’s (Arg#>4):
TMP - when?
LOC - where at?
DIR - where to?
MNR - how?
PRP - why?
REC - himself, themselves, each other
PRD - this argument refers to or modifies another
ADV - others
Inflection : Inflection Verbs also marked for tense/aspect
Passive/Active
Perfect/Progressive
Third singular (is has does was)
Present/Past/Future
Infinitives/Participles/Gerunds/Finites
Modals and negations marked as ArgMs
Frames: Multiple Framesets : Frames: Multiple Framesets Framesets are not necessarily consistent between different senses of the same verb
Framesets are consistent between different verbs that share similar argument structures (like FrameNet)
Out of the 787 most frequent verbs:
1 Frameset – 521
2 Framesets – 169
3+ Framesets – 97 (includes light verbs)
Ergative/Unaccusative Verbs : Ergative/Unaccusative Verbs Roles (no ARG0 for unaccusative verbs)
Arg1 = Logical subject, patient, thing rising
Arg2 = EXT, amount risen
Arg3* = start point
Arg4 = end point
Sales rose 4% to $3.28 billion from $3.16 billion.
The Nasdaq composite index added 1.01 to 456.6 on paltry volume.
PropBank/FrameNet : PropBank/FrameNet
Buy
Arg0: buyer
Arg1: goods
Arg2: seller
Arg3: rate
Arg4: payment
Sell
Arg0: seller
Arg1: goods
Arg2: buyer
Arg3: rate
Arg4: payment
More generic, more neutral – maps readily to VN, TR
Rambow, et al, PMLB03
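Because the numbered arguments of buy and sell name the same participants in a different order, instances of either verb can be re-keyed by participant. The sketch below illustrates this with the two role lists above; the function name `by_role` and the example fillers are hypothetical.

```python
# Role inventories copied from the Buy and Sell lists.
ROLES = {
    "buy": {"Arg0": "buyer", "Arg1": "goods", "Arg2": "seller",
            "Arg3": "rate", "Arg4": "payment"},
    "sell": {"Arg0": "seller", "Arg1": "goods", "Arg2": "buyer",
             "Arg3": "rate", "Arg4": "payment"},
}


def by_role(verb, args):
    """Re-key an instance's numbered arguments by participant name."""
    return {ROLES[verb][label]: filler for label, filler in args.items()}


# Hypothetical fillers: the same transaction described with each verb.
bought = by_role("buy", {"Arg0": "Chuck", "Arg1": "a car", "Arg2": "Jerry"})
sold = by_role("sell", {"Arg0": "Jerry", "Arg1": "a car", "Arg2": "Chuck"})
print(bought == sold)
```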
Annotator accuracy – ITA 84% : Annotator accuracy – ITA 84%
Limitations to PropBank : Limitations to PropBank Args2-4 seriously overloaded, poor performance
VerbNet and FrameNet both provide more fine-grained role labels
WSJ too domain specific, too financial, need broader coverage genres for more general annotation
Additional Brown corpus annotation, also GALE data
FrameNet has selected instances from BNC
Levin – English Verb Classes and Alternations: A Preliminary Investigation, 1993. : Levin – English Verb Classes and Alternations: A Preliminary Investigation, 1993.
Levin classes (Levin, 1993) : Levin classes (Levin, 1993) Verb class hierarchy: 3100 verbs, 47 top level classes, 193 second and third level
Each class has a syntactic signature based on alternations.
John broke the jar. / The jar broke. / Jars break easily.
change-of-state
John cut the bread. / *The bread cut. / Bread cuts easily.
change-of-state, recognizable action, sharp instrument
John hit the wall. / *The wall hit. / *Walls hit easily.
contact, exertion of force
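A syntactic signature can be read as a row of grammaticality judgments, one per construction. The sketch below encodes the three example verbs that way. The column names ("transitive", "inchoative", "middle") are labels supplied here for the three sentence patterns, and the encoding as tuples is an assumption, not how Levin's classes are stored anywhere.

```python
# One column per sentence pattern in the examples:
#   transitive: "John broke the jar."
#   inchoative: "The jar broke."
#   middle:     "Jars break easily."
ALTERNATIONS = ("transitive", "inchoative", "middle")

# True = acceptable, False = starred in the examples.
JUDGMENTS = {
    "break": (True, True, True),
    "cut": (True, False, True),
    "hit": (True, False, False),
}


def signature(verb):
    """Signature of a verb as a pattern-name -> judgment mapping."""
    return dict(zip(ALTERNATIONS, JUDGMENTS[verb]))


def verbs_allowing(pattern):
    """All listed verbs that are acceptable in the given pattern."""
    column = ALTERNATIONS.index(pattern)
    return sorted(v for v, row in JUDGMENTS.items() if row[column])


def same_signature(verb_a, verb_b):
    """Two verbs can share a class only if their signatures match."""
    return JUDGMENTS[verb_a] == JUDGMENTS[verb_b]


print(signature("cut"))
print(verbs_allowing("middle"))
print(same_signature("cut", "hit"))
```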
Limitations to Levin Classes : Limitations to Levin Classes Coverage of only half of the verbs (types) in the Penn Treebank (1M words, WSJ)
Usually only one or two basic senses are covered for each verb
Confusing sets of alternations
Different classes have almost identical “syntactic signatures”, or worse, contradictory signatures
Dang, Kipper & Palmer, ACL98
Multiple class listings : Multiple class listings Homonymy or polysemy?
draw a picture, draw water from the well
Conflicting alternations?
Carry verbs disallow the Conative (*she carried at the ball), but include {push, pull, shove, kick, yank, tug},
which are also in the Push/Pull class, which does take the Conative (she kicked at the ball)
Intersective Levin Classes : Intersective Levin Classes (figure)
“at”: ¬CH-LOC
“across the room”: CH-LOC
“apart”: CH-STATE
Dang, Kipper & Palmer, ACL98
Intersective Levin Classes : Intersective Levin Classes More syntactically and semantically coherent
sets of syntactic patterns
explicit semantic components
relations between senses
VERBNET
verbs.colorado.edu/~mpalmer/verbnet
Dang, Kipper & Palmer, IJCAI00, Coling00
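An intersective class groups the verbs that share exactly the same set of class memberships. The sketch below shows that grouping on the Carry and Push/Pull example; the member sets contain only the verbs named in these slides, not the full Levin listings, and the function name is hypothetical.

```python
# Partial member sets, limited to the verbs mentioned in the slides.
CLASSES = {
    "carry": {"carry", "push", "pull", "shove", "kick", "yank", "tug"},
    "push/pull": {"push", "pull", "shove", "kick", "yank", "tug"},
}


def intersective_classes(classes):
    """Group verbs by the exact set of classes they are listed in."""
    groups = {}
    for verb in set().union(*classes.values()):
        key = frozenset(name for name, members in classes.items()
                        if verb in members)
        groups.setdefault(key, set()).add(verb)
    return groups


groups = intersective_classes(CLASSES)
for key, verbs in groups.items():
    print(sorted(key), sorted(verbs))
```

Verbs listed in both classes form their own group, so the Conative can be stated for that group without contradicting the behavior of the verbs listed under Carry alone.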
VerbNet – Karin Kipper : VerbNet – Karin Kipper
Class entries:
Capture generalizations about verb behavior
Organized hierarchically
Members have common semantic elements, semantic roles and syntactic frames
Verb entries:
Refer to a set of classes (different senses)
each class member linked to WN synset(s) (not all WN senses are covered)
Hand built resources vs. Real data : Hand built resources vs. Real data
VerbNet is based on linguistic theory: how useful is it?
How well does it correspond to syntactic variations found in naturally occurring text?
Mapping from PropBank to VerbNet : Mapping from PropBank to VerbNet Overlap with PropBank framesets
50,000 PropBank instances
< 50% VN entries, > 85% VN classes
Results
MATCH - 78.63%. (80.90% relaxed)
(VerbNet isn’t just linguistic theory!)
Benefits
Thematic role labels and semantic predicates
Can extend PropBank coverage with VerbNet classes
WordNet sense tags
Kingsbury & Kipper, NAACL03, Text Meaning Workshop
http://verbs.colorado.edu/~mpalmer/verbnet
Mapping PropBank/VerbNet : Mapping PropBank/VerbNet Extended VerbNet now covers 80% of PropBank tokens. Kipper, et al., LREC-04, LREC-06
(added Korhonen and Briscoe classes)
Semi-automatic mapping of PropBank instances to VerbNet classes and thematic roles, hand-corrected. (final cleanup stage)
VerbNet class tagging as automatic WSD
Run SRL, map Args to VerbNet roles
Can SemLink improve Generalization? : Can SemLink improve Generalization? Overloaded Arg2-Arg5
PB: verb-by-verb
VerbNet: same thematic roles across verbs
Example
Rudolph Agnew,…, was named [ARG2 {Predicate} a nonexecutive director of this British industrial conglomerate.]
….the latest results appear in today’s New England Journal of Medicine, a forum likely to bring new attention [ARG2 {Destination} to the problem.]
Use VerbNet as a bridge to merge PB and FN and expand the size and variety of the training data
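The two Arg2 examples can be read as entries in a verb-specific lookup from PropBank labels to VerbNet thematic roles. The sketch below holds just those two entries; the table layout and function names are assumptions and do not reflect the actual SemLink file format.

```python
# (verb, PropBank label) -> VerbNet thematic role, from the two examples.
PB_TO_VN = {
    ("name", "Arg2"): "Predicate",
    ("bring", "Arg2"): "Destination",
}


def verbnet_role(verb, label):
    """VerbNet role for a PropBank label on this verb, or None if unmapped."""
    return PB_TO_VN.get((verb, label))


def relabel(verb, labelled_spans):
    """Replace each mapped PropBank label with its VerbNet role."""
    return [(verbnet_role(verb, label) or label, text)
            for label, text in labelled_spans]


print(relabel("bring", [("Arg1", "new attention"), ("Arg2", "to the problem")]))
```

Unmapped labels are passed through unchanged, so partially covered verbs can still be processed.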
Automatic Labelling of Semantic Relations – Gold Standard, 77% : Automatic Labelling of Semantic Relations – Gold Standard, 77% Given a constituent to be labelled
Stochastic Model
Features:
Predicate, (verb)
Phrase Type, (NP or S-BAR)
Parse Tree Path
Position (Before/after predicate)
Voice (active/passive)
Head Word of constituent
Gildea & Jurafsky, CL02, Gildea & Palmer, ACL02
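The feature list can be made concrete on a toy parse of "Portfolio managers expect further declines in interest rates." The sketch below is an illustration under several assumptions: the `Node` class and flat NPs are simplifications, the path is read from the predicate up to the lowest common ancestor and down to the constituent, and voice and head word are supplied by the caller rather than computed.

```python
class Node:
    """Minimal parse-tree node with a parent pointer."""

    def __init__(self, label, children=(), word=None):
        self.label, self.word = label, word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """The node itself followed by its ancestors up to the root."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain


def leaves(node):
    """Leaf nodes under a node, left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def tree_path(pred, const):
    """Category labels from predicate up to the common ancestor, then down."""
    up, down = ancestors(pred), ancestors(const)
    common = next(n for n in up if n in down)
    rising = [n.label for n in up[: up.index(common) + 1]]
    falling = [n.label for n in reversed(down[: down.index(common)])]
    return "↑".join(rising) + "".join("↓" + label for label in falling)


def features(pred, const, voice, head_word):
    """Feature dictionary for one (predicate, constituent) pair."""
    order = leaves(ancestors(pred)[-1])
    before = order.index(leaves(const)[0]) < order.index(pred)
    return {
        "predicate": pred.word,
        "phrase_type": const.label,
        "path": tree_path(pred, const),
        "position": "before" if before else "after",
        "voice": voice,
        "head_word": head_word,
    }


subj = Node("NP", word="Portfolio managers")
verb = Node("VBP", word="expect")
obj = Node("NP", word="further declines in interest rates")
tree = Node("S", [subj, Node("VP", [verb, obj])])

print(features(verb, subj, "active", "managers"))
print(features(verb, obj, "active", "declines"))
```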
Additional Automatic Role Labelers : Additional Automatic Role Labelers Performance improved from 77% to 88%
Automatic parses, 81% F, Brown corpus, 68%
Same features plus
Named Entity tags
Head word POS
For unseen verbs – backoff to automatic verb clusters
SVM’s
Role or not role
For each likely role, for each Arg#, Arg# or not
No overlapping role labels allowed
Pradhan, et al., ICDM03, Surdeanu, et al., ACL03, Chen & Rambow, EMNLP03, Gildea & Hockenmaier, EMNLP03, Yi & Palmer, ICON04
CoNLL-04, 05 Shared Task
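The "no overlapping role labels" constraint needs some procedure for choosing among scored candidates. The sketch below uses a simple greedy rule (keep the highest-scoring candidate, drop anything that overlaps it); this is a stand-in chosen for illustration, not the method of any of the cited systems, and the scores are made up.

```python
def resolve_overlaps(candidates):
    """Keep a non-overlapping subset of labelled spans, best score first.

    candidates: list of (start, end, label, score), token offsets with
    end exclusive. Returns the kept candidates in sentence order.
    """
    chosen = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        start, end = cand[0], cand[1]
        # Keep the candidate only if it is disjoint from every kept span.
        if all(end <= s or start >= e for s, e, _, _ in chosen):
            chosen.append(cand)
    return sorted(chosen)


# Hypothetical classifier output for one predicate.
scored = [
    (0, 2, "Arg0", 0.9),
    (0, 1, "Arg1", 0.4),
    (3, 8, "Arg1", 0.8),
    (5, 8, "ArgM-LOC", 0.6),
]
print(resolve_overlaps(scored))
```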
Arg1 groupings; (Total count 59710) : Arg1 groupings; (Total count 59710)
Arg2 groupings; (Total count 11068) : Arg2 groupings; (Total count 11068)
Process : Process Retrain the SRL tagger
Original:
Arg[0-5,A,M]
ARG1 Grouping: (similar for Arg2)
Arg[0,2-5,A,M] Arg1-Group[1-6]
Evaluation on both WSJ and Brown
More Coarse-grained or Fine-grained?
more specific: data more coherent, but more sparse
more general: consistency across verbs even for new domains?
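The relabelling step before retraining can be sketched as a rewrite of the training labels: Arg1 becomes one of the Arg1-Group labels and every other label is left alone. The group assignments below are hypothetical placeholders, keying the groups by verb is an assumption, and leaving unlisted verbs as plain Arg1 is a choice made here for the sketch.

```python
def regroup(verb, label, groups, target="Arg1"):
    """Rewrite the target label as '<target>-Group<n>' for verbs with a group.

    groups maps a verb to its group number; other labels and unlisted
    verbs are returned unchanged.
    """
    if label != target:
        return label
    group = groups.get(verb)
    return label if group is None else f"{target}-Group{group}"


# Hypothetical group assignments, for illustration only.
ARG1_GROUPS = {"break": 1, "give": 2}

instance = [("Arg0", "The executives"), ("Arg2", "the chefs"),
            ("Arg1", "a standing ovation")]
print([(regroup("give", label, ARG1_GROUPS), text) for label, text in instance])
```

The same function covers the Arg2 experiment by passing `target="Arg2"` with its own group table.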
SRL Performance (WSJ/BROWN) : SRL Performance (WSJ/BROWN) Loper, Yi, Palmer, SIGSEM07