Computational Semantics
http://www.coli.uni-sb.de/cl/projects/milca/esslli
Day II: A Modular Architecture
Aljoscha Burchardt,
Alexander Koller,
Stephan Walter,
Universität des Saarlandes,
Saarbrücken, Germany
ESSLLI 2004, Nancy, France
Computing Semantic Representations
Yesterday:
λ-Calculus is a nice tool for systematic meaning construction.
We saw a first, sketchy implementation
Some things still to be done
Today:
Let’s fix the problems
Let’s build nice software
Yesterday: λ-Calculus
Semantic representations constructed along the syntax tree: How to get there?
By using functional application
λs help to guide arguments to the right place on β-reduction:
λx.love(x,mary)@john ⇒ love(john,mary)
Yesterday's disappointment
Our first idea for NPs with determiner didn't work out:
'A man' ~> ∃z.man(z)
'A man loves Mary' ~> * love(∃z.man(z),mary)
∃z.man(z) just isn't the meaning of 'a man'.
If anything, it translates the complete sentence
'There is a man'
But what was the idea after all?
Nothing!
Let's try again, systematically…
A solution
What we want is:
'A man loves Mary' ~> ∃z(man(z) ∧ love(z,mary))
What we have is:
'man' ~> λy.man(y)
'loves Mary' ~> λx.love(x,mary)
How about:
∃z(man(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
Remember: We can use variables for any kind of term.
So next:
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
λQ.∃z(λy.man(y)(z) ∧ Q(z)) @ λx.love(x,mary)
(λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y)) @ λx.love(x,mary)
λP(λQ.∃z(P(z) ∧ Q(z))) <~ 'A'
But…
'A man … loves Mary':
(λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y)) @ λx.love(x,mary)
λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
∃z(man(z) ∧ λx.love(x,mary)(z))
∃z(man(z) ∧ love(z,mary))   fine!
'John … loves Mary':
λx.love(x,mary) @ john   not systematic!
john @ λx.love(x,mary)   not reducible!
λP.P@john @ λx.love(x,mary)   better!
λx.love(x,mary)@john
love(john,mary)
So: 'John' ~> λP.P(john)
Transitive Verbs
What about transitive verbs (like 'love')?
'loves' ~> λyλx.love(x,y) ??? won't do:
'Mary' ~> λQ.Q(mary)
'loves Mary' ~> λyλx.love(x,y)@λQ.Q(mary)
λx.love(x,λQ.Q(mary))
How about something a little more complicated:
'loves' ~> λRλx(R@λy.love(x,y))
The only way to understand this is to see it in action...
'John loves Mary' again...
loves: λRλx(R@λy.love(x,y))
Mary: λP.P(mary)
loves Mary: λx(λy.love(x,y)(mary))
λx.love(x,mary)
John: λP.P(john)
John loves Mary: λx.love(x,mary)(john)
love(john,mary)
Summing up
nouns: 'man' ~> λx.man(x)
intransitive verbs: 'smoke' ~> λx.smoke(x)
determiner: 'a' ~> λP(λQ.∃z(P(z) ∧ Q(z)))
proper names: 'mary' ~> λP.P(mary)
transitive verbs: 'love' ~> λRλx(R@λy.love(x,y))
Today's first success
What we can do now (and could not do yesterday):
Complex NPs (with determiners)
Transitive verbs
… and all in the same way.
Key ideas:
Extra λs for NPs
Variables for predicates
Apply subject NP to VP
Yesterday's implementation
s(VP@NP) --> np(NP),vp(VP).
np(john) --> [john].
np(mary) --> [mary].
tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves], {vars2atoms(X),vars2atoms(Y)}.
iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
iv(lambda(X,snore(X))) --> [snores], {vars2atoms(X)}.
vp(TV@NP) --> tv(TV),np(NP).
vp(IV) --> iv(IV).
% This doesn't work!
np(exists(X,man(X))) --> [a,man], {vars2atoms(X)}.
Was this a good implementation?
A Nice Implementation
What is a nice implementation? It should be:
Scalable: If it works with five examples, upgrading to 5000 shouldn’t be a great problem (e.g. new constructions in the grammar, more words...)
Re-usable: Small changes in our ideas about the system shouldn’t lead to complex changes in the implementation (e.g. a new representation language)
Solution: Modularity
Think about your problem in terms of interacting conceptual components
Encapsulate these components into modules of your implementation, with clean and abstract pre-defined interfaces to each other
Extend or change modules to scale / adapt the implementation
Another look at yesterday's implementation
Okay, because it was small
Not modular at all: all linguistic functionality in one file, packed inside the DCG
E.g. scalability of the lexicon: Always have to write new rules, like:
tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit], {vars2atoms(X),vars2atoms(Y)}.
Changing parts for Adaptation? Change every single rule!
Let's modularize!
Semantic Construction: Conceptual Components
'John smokes' → Black Box → smoke(j)
Semantic Construction: Inside the Black Box
Two dimensions: Words (lexical) vs. Phrases (combinatorial); Syntax vs. Semantics
Components: DCG, combine-rules, lexicon-facts
DCG
The DCG-rules tell us what phrases are acceptable (mainly). Their basic structure is:
s(...) --> np(...), vp(...), {...}.
np(...) --> det(...), noun(...), {...}.
np(...) --> pn(...), {...}.
vp(...) --> tv(...), np(...), {...}.
vp(...) --> iv(...), {...}.
(The gaps will be filled later on)
combine-rules
The combine-rules encode the actual semantic construction process. That is, they glue representations together using @:
combine(s:(NP@VP),[np:NP,vp:VP]).
combine(np:(DET@N),[det:DET,n:N]).
combine(np:PN,[pn:PN]).
combine(vp:IV,[iv:IV]).
combine(vp:(TV@NP),[tv:TV,np:NP]).
Lexicon
The lexicon-facts hold the elementary information connected to words:
lexicon(noun,bird,[bird]).
lexicon(pn,anna,[anna]).
lexicon(iv,purr,[purrs]).
lexicon(tv,eat,[eats]).
Their slots contain:
syntactic category (tv)
constant / relation symbol, the 'core' semantics (eat)
the surface form of the word ([eats])
Example: lexicon(tv,eat,[eats]).
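
As a small illustration of how such facts can be queried, the sketch below loads the four lexicon-facts from the slide and looks them up by surface form and by category. The helper surface_forms/2 is a hypothetical name introduced here for illustration; it is not taken from the course code.

```prolog
% The four lexicon-facts from the slide: lexicon(Category, Symbol, SurfaceForm).
lexicon(noun, bird, [bird]).
lexicon(pn,   anna, [anna]).
lexicon(iv,   purr, [purrs]).
lexicon(tv,   eat,  [eats]).

% surface_forms(+Category, -Forms): hypothetical helper that collects all
% surface forms listed for one syntactic category.
surface_forms(Category, Forms) :-
    findall(Form, lexicon(Category, _, Form), Forms).

% Example queries:
% ?- lexicon(Cat, Sym, [eats]).   % from surface form to category and symbol
% ?- surface_forms(tv, Forms).    % all transitive verbs in the lexicon
```

Extending the lexicon then amounts to adding one more fact; no grammar rule has to change.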
Interfaces
Two dimensions: Words (lexical) vs. Phrases (combinatorial); Syntax vs. Semantics
Components: DCG, combine-rules, lexicon-facts
Interfaces between them: lexicon-calls, semantic macros, combine-calls
Interfaces in the DCG
Information is transported between the three components of our system by additional calls and variables in the DCG.
Lexical rules are now fully abstract. We have one for each category (iv, tv, n, ...). The DCG uses lexicon-calls and semantic macros like this:
iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
pn(PN) --> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.
The combinatorial rules use combine-calls like this:
vp(VP) --> iv(IV), {combine(vp:VP,[iv:IV])}.
s(S) --> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
Interfaces: How they work
iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
When this rule applies, the syntactic analysis component:
looks up the Word found in the string (e.g. 'smokes'), ...
... checks that its category is iv, ...
... and retrieves the relation symbol Sym to be used in the semantic construction.
lexicon(iv, smoke, [smokes])
So we have:
Word = [smokes]
Sym = smoke
Interfaces: How they work II
iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
Then, the semantic construction component:
takes Sym (Sym = smoke) ...
... and uses the semantic macro ivSem ...
... to transform it into a full semantic representation for an intransitive verb.
ivSem(Sym,IV)
ivSem(smoke,IV)
ivSem(smoke,lambda(X, smoke(X)))
The DCG-rule is now fully instantiated and looks like this:
iv(lambda(X, smoke(X))) -->
{lexicon(iv,smoke,[smokes]), ivSem(smoke, lambda(X, smoke(X)))}, [smokes].
What's inside a semantic macro?
Semantic macros simply specify how to make a valid semantic representation out of a naked symbol. The one we've just seen in action for the verb 'smokes' was:
ivSem(Sym,lambda(X,Formula)):- compose(Formula,Sym,[X]).
compose builds a first-order formula out of Sym and a new variable X:
Formula = smoke(X)
This is then embedded into a λ-abstraction over the same X:
lambda(X, smoke(X))
Another one, without compose:
pnSem(Sym,lambda(P,P@Sym)).
john ~> lambda(P,P@john)
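
The definition of compose is not shown on the slide. A minimal sketch, assuming it is a thin wrapper around Prolog's built-in =.. ("univ"), could look like the following. The operator declaration for @ is likewise an assumption, and nounSem and tvSem are hypothetical macro names that simply follow the representations listed under "Summing up".

```prolog
% Assumption: @ is an infix operator for functional application
% (priority and type chosen here for illustration).
:- op(950, yfx, @).

% compose(-Term, +Symbol, +Args): build the term Symbol(Args...) via =.. (univ).
% Assumed definition; the slide only describes what compose does.
compose(Term, Symbol, Args) :-
    Term =.. [Symbol|Args].

% The two macros shown on the slide.
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% Hypothetical macros for the remaining categories from "Summing up".
nounSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
tvSem(Sym, lambda(R, lambda(X, R@lambda(Y, Formula)))) :-
    compose(Formula, Sym, [X, Y]).

% Example queries (answers are given up to variable names):
% ?- ivSem(smoke, IV).   % IV = lambda(X, smoke(X))
% ?- tvSem(love, TV).    % TV = lambda(R, lambda(X, R@lambda(Y, love(X, Y))))
```

Because the macros are the only place where the shape of a representation is fixed, switching to a different representation language would only touch these few clauses.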
'John smokes'
Words (lexical):
lexicon(pn,john,[john]).
lexicon(iv,smoke,[smokes]).
pn(PN) --> …,[john]
Word = [john], Sym = john
pnSem(Sym,PN): PN = lambda(P,P@john)
iv(IV) --> …,[smokes]
Word = [smokes], Sym = smoke
ivSem(Sym,IV): IV = lambda(X,smoke(X))
Phrases (combinatorial):
np(NP) --> …,pn(PN)
NP = lambda(P,P@john)
vp(VP) --> …,iv(IV)
VP = lambda(X,smoke(X))
s(S) --> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
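
The flow for 'John smokes' can be tried out with a minimal runnable sketch that puts lexicon, macros, combine-rules and DCG into one file. Several parts are assumptions rather than the course code: the operator declaration for @, the definition of compose via =.., and the np and vp rules, which fill the elided parts of the slide in the simplest way that matches the combine-rules.

```prolog
% Assumption: @ is an infix operator for functional application.
:- op(950, yfx, @).

% Lexicon.
lexicon(pn, john,  [john]).
lexicon(iv, smoke, [smokes]).

% Semantic macros (compose/3 is an assumed definition based on =..).
compose(Term, Symbol, Args) :- Term =.. [Symbol|Args].
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% combine-rules.
combine(s:(NP@VP), [np:NP, vp:VP]).
combine(np:PN, [pn:PN]).
combine(vp:IV, [iv:IV]).

% DCG: combinatorial rules (np and vp are hypothetical completions).
s(S)   --> np(NP), vp(VP), {combine(s:S, [np:NP, vp:VP])}.
np(NP) --> pn(PN), {combine(np:NP, [pn:PN])}.
vp(VP) --> iv(IV), {combine(vp:VP, [iv:IV])}.

% DCG: lexical rules, one per category.
pn(PN) --> {lexicon(pn, Sym, Word), pnSem(Sym, PN)}, Word.
iv(IV) --> {lexicon(iv, Sym, Word), ivSem(Sym, IV)}, Word.

% Example query (answer up to variable names):
% ?- phrase(s(S), [john, smokes]).
% S = lambda(P, P@john)@lambda(X, smoke(X))
```

The result is still an unreduced application; turning it into smoke(john) is the job of β-conversion.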
A look at combine
combine(s:(NP@VP),[np:NP,vp:VP]).
S = NP@VP
NP = lambda(P,P@john)
VP = lambda(X,smoke(X))
So: S = lambda(P,P@john)@lambda(X,smoke(X))
That's almost all, folks…
betaConvert(lambda(P,P@john)@lambda(X,smoke(X)), Converted)
Converted = smoke(john)
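
The slides do not show how betaConvert is defined. A minimal sketch of one way to write it is given below, under two loudly stated assumptions: variables are atoms (as after vars2atoms), and all bound variables carry distinct names, so the naive substitution used here never has to rename anything. The helper substitute/4 and the operator declaration are introduced for this sketch only.

```prolog
% Assumption: @ is an infix operator for functional application.
:- op(950, yfx, @).

% betaConvert(+Term, -Result): reduce all redexes in Term.
% Sketch only: assumes variables are atoms with distinct names,
% so no renaming (alpha-conversion) is performed.
betaConvert(Term, Result) :-
    nonvar(Term), Term = Fun@Arg,
    betaConvert(Fun, Converted),
    nonvar(Converted), Converted = lambda(X, Body), !,
    substitute(X, Arg, Body, Reduced),      % Body[X := Arg]
    betaConvert(Reduced, Result).
betaConvert(Term, Result) :-
    compound(Term), !,                      % descend into all arguments
    Term =.. [F|Args],
    maplist(betaConvert, Args, NewArgs),
    Result =.. [F|NewArgs].
betaConvert(Term, Term).                    % atoms stay as they are

% substitute(+Var, +Arg, +Term, -Result): replace every occurrence of the
% atom Var in Term by Arg.
substitute(Var, Arg, Term, Arg) :-
    Term == Var, !.
substitute(Var, Arg, Term, Result) :-
    compound(Term), !,
    Term =.. [F|Args],
    maplist(substitute(Var, Arg), Args, NewArgs),
    Result =.. [F|NewArgs].
substitute(_, _, Term, Term).

% Example query:
% ?- betaConvert(lambda(p, p@john)@lambda(x, smoke(x)), Converted).
% Converted = smoke(john)
```

The same sketch also reduces the transitive case from earlier today: applying lambda(q,q@john) to the result of applying the 'love' entry to lambda(p,p@mary) yields love(john,mary).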
Little Cheats
A few 'special words' are dealt with in a somewhat different manner:
Determiners: ('every man')
No semantic Sym in the lexicon:
lexicon(det,_,[every],uni).
Semantic representation generated by the macro alone:
detSem(uni,lambda(P,lambda(Q,
forall(X,(P@X)>(Q@X))))).
Negation, same thing: ('does not walk')
No semantic Sym in the lexicon:
lexicon(mod,_,[does,not],neg).
Representation solely from macro:
modSem(neg,lambda(P,lambda(X,~(P@X)))).
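
To see how such an entry could be used, here is a small sketch for the determiner case. The lexicon fact and the macro are the ones from the slide; the lexical rule det//1 is a hypothetical rule modelled on the iv and pn rules shown earlier, and the operator declaration for @ is an assumption.

```prolog
% Assumption: @ is an infix operator for functional application.
:- op(950, yfx, @).

% From the slide: no semantic symbol, but a type tag in the fourth slot.
lexicon(det, _, [every], uni).

% From the slide: the macro alone produces the representation.
detSem(uni, lambda(P, lambda(Q, forall(X, (P@X) > (Q@X))))).

% Hypothetical lexical rule: the type tag, not a symbol, selects the macro clause.
det(Det) --> {lexicon(det, _, Word, Type), detSem(Type, Det)}, Word.

% Example query (answer up to variable names):
% ?- phrase(det(Det), [every]).
% Det = lambda(P, lambda(Q, forall(X, (P@X) > (Q@X))))
```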
The code that's online (http://www.coli.uni-sb.de/cl/projects/milca/esslli)
lexicon-facts have a fourth argument for any kind of additional information (e.g. fin/inf, gender):
lexicon(tv,eat,[eats],fin).
iv/tv have an additional argument for infinitive/finite (e.g. 'eat' vs. 'eats'):
iv(I,IV) --> {lexicon(iv,Sym,Word,I),…}, Word.
limited coordination, hence doubled categories (e.g. 'talks and walks'):
vp2(VP2) --> vp1(VP1A), coord(C), vp1(VP1B),
{combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
vp1(VP1) --> v2(fin,V2),
{combine(vp1:VP1,[v2:V2])}.
A demo
lambda :-
    readLine(Sentence),
    parse(Sentence,Formula),
    resetVars, vars2atoms(Formula),
    betaConvert(Formula,Converted),
    printRepresentations([Converted]).
Evaluation
Our new program has become much bigger, but it's…
Modular: everything's in its right place:
Syntax in englishGrammar.pl
Semantics (macros + combine) in lambda.pl
Lexicon in lexicon.pl
Scalable: E.g. extend the lexicon by adding facts to lexicon.pl
Re-usable: E.g. to change the semantic construction method (e.g. to CLLS on Thursday), change only lambda.pl and keep the rest
What we've done today
Complex NPs, PNs and TVs in λ-based semantic construction
A clean semantic construction framework in Prolog
Its instantiation for λ-based semantic construction
Ambiguity
Some sentences have more than one reading, i.e. more than one semantic representation.
Standard Example: 'Every man loves a woman':
Reading 1: the women may be different
∀x(man(x) → ∃y(woman(y) ∧ love(x,y)))
Reading 2: there is one particular woman
∃y(woman(y) ∧ ∀x(man(x) → love(x,y)))
What does our system do?
Excursion: lambda, variables and atoms
Question yesterday: Why don't we use Prolog variables for FO-variables?
Advantage (at first sight): β-reduction as unification:
betaReduce(lambda(X, F)@X,F).
Now: X = john, F = walk(X) ('John walks')
betaReduce(lambda(john,walk(john))@john, walk(john))
F = walk(john)
Nice, but…
Problem: Coordination
'John and Mary'
(λX.λY.λP((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
λP((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
λP(P(john) ∧ P(mary))
'John and Mary walk'
λP(P(john) ∧ P(mary)) @ λx.walk(x)
λx.walk(x)@john ∧ λx.walk(x)@mary
lambda(X,walk(X))@john & lambda(X,walk(X))@mary
β-reduction as unification:
X = john
X = mary
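
The clash can be reproduced directly. The sketch below uses the one-line betaReduce from the excursion; the wrapper predicates john_walks/1 and john_and_mary_walk/2 are hypothetical names added for illustration, and the operator declaration for @ is an assumption.

```prolog
% Assumption: @ is an infix operator for functional application.
:- op(950, yfx, @).

% β-reduction as unification, as on the excursion slide.
betaReduce(lambda(X, F)@X, F).

% 'John walks': a single application, so unification does the job.
john_walks(R) :-
    betaReduce(lambda(X, walk(X))@john, R).

% 'John and Mary walk': both conjuncts share the same VP term, so the one
% Prolog variable X would have to be bound to john and to mary at once.
john_and_mary_walk(R1, R2) :-
    VP = lambda(X, walk(X)),
    betaReduce(VP@john, R1),
    betaReduce(VP@mary, R2).

% Example queries:
% ?- john_walks(R).                % R = walk(john)
% ?- john_and_mary_walk(R1, R2).   % fails: X = john clashes with X = mary
```

Once the first conjunct has bound X to john, the shared term has become lambda(john,walk(john)), which can no longer be applied to mary.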